Currently, we always show the argument passed to dsymForUUID in the
corresponding progress update. Most of the time this is a UUID, but it
can also be an absolute path. The former is pretty uninformative and the
latter needlessly noisy.
This changes the progress update to print the UUID and the module name,
if both are available. Otherwise, we print the UUID or the module name
depending on which one is available.
We now also unconditionally pass the module file spec and architecture
to DownloadObjectAndSymbolFile, while previously this was conditional on
the file existing on-disk. This should be harmless:
- We already check that the file exists in DownloadObjectAndSymbolFile.
- It doesn't make sense to check the filesystem for the architecutre.
rdar://124643548
The commit introduces a new, generic, Alarm class. The class lets you to
schedule functions (callbacks) that will execute after a predefined
timeout. Once scheduled, you can cancel and reset a callback, given the
timeout hasn't expired yet.
The alarm class worker thread that sleeps until the next timeout
expires. When the thread wakes up, it checks for all the callbacks that
have expired and calls them in order. Because the callback is called
from the worker thread, the only guarantee is that a callback is called
no sooner than the timeout. A long running callback could potentially
block the worker threads and delay other callbacks from getting called.
I intentionally kept the implementation as simple as possible while
addressing the needs for the use case of coalescing progress events as
discussed in [1]. If we want to rely on this somewhere else, we can
reassess whether we need to address this class' limitations.
[1] https://discourse.llvm.org/t/rfc-improve-lldb-progress-reporting/75717/
Some languages may create artificial functions that have no real user
code, even though there is line table information for them. One such
case is with coroutine code that receives the CoroSplitter
transformation in LLVM IR. That code transformation creates many
different Functions, cloning one Instruction into many Instructions in
many different Functions and copying the associated debug locations.
It would be difficult to make that pass delete debug locations of cloned
instructions in a language agnostic way (is it even possible?), but LLDB
can ignore certain locations by querying its Language APIs and having it
decide based on, for example, mangling information.
Darwin AArch64 application processors are run with Top Byte Ignore mode
enabled so metadata may be stored in the top byte, it needs to be
ignored when reading/writing memory. David Spickett handled this already
in the base class Process::ReadMemory but ProcessMachCore overrides that
method (to avoid the memory cache) and did not pick up the same change.
I add a test case that creates a pointer with metadata in the top byte
and dereferences it with a live process and with a corefile.
rdar://123784501
by handling *all* errors in IRExecDiagnosticHandler. The function that
call this handles all unhandled errors with an `exit(1)`.
rdar://124459751
I don't really have a testcase for this, since the crash report I got
for this involved the Swift language plugin.
MSVC fails when there is ambiguity (multiple options) around implicit
type conversion operators.
Make ConstString's conversion operator to string_view explicit to avoid
ambiguity with one to StringRef and remove an unused local variable that
MSVC also fails on.
This is a one line fix for a Windows specific (I believe) build break.
The build failure looks like this:
`D:\a\_work\1\s\lldb\source\Symbol\Symtab.cpp(128): error C2440:
'<function-style-cast>': cannot convert from 'lldb_private::ConstString'
to 'llvm::StringRef'
D:\a\_work\1\s\lldb\source\Symbol\Symtab.cpp(128): note:
'llvm::StringRef::StringRef': ambiguous call to overloaded function
D:\a\_work\1\s\llvm\include\llvm/ADT/StringRef.h(840): note: could be
'llvm::StringRef::StringRef(llvm::StringRef &&)'
D:\a\_work\1\s\llvm\include\llvm/ADT/StringRef.h(104): note: or
'llvm::StringRef::StringRef(std::string_view)'
D:\a\_work\1\s\lldb\source\Symbol\Symtab.cpp(128): note: while trying to
match the argument list '(lldb_private::ConstString)'
D:\a\_work\1\s\lldb\source\Symbol\Symtab.cpp(128): error C2672:
'std::multimap<llvm::StringRef,const lldb_private::Symbol
*,std::less<llvm::StringRef>,std::allocator<std::pair<const
llvm::StringRef,const lldb_private::Symbol *>>>::emplace': no matching
overloaded function found
C:\Program Files\Microsoft Visual
Studio\2022\Enterprise\VC\Tools\MSVC\14.37.32822\include\map(557): note:
could be
'std::_Tree_iterator<std::_Tree_val<std::_Tree_simple_types<std::pair<const
llvm::StringRef,const lldb_private::Symbol *>>>>
std::multimap<llvm::StringRef,const lldb_private::Symbol
*,std::less<llvm::StringRef>,std::allocator<std::pair<const
llvm::StringRef,const lldb_private::Symbol *>>>::emplace(_Valty &&...)'
`
The StringRef constructor here is intended to take a ConstString object,
which I assume is implicitly converted to a std::string_view by
compilers other than Visual Studio's. To fix the VS build I made the
StringRef initialization more explicit, as you can see in the diff.
The ValueObjectConstResult classes that back expression result variables
play a complicated game with where the data for their values is stored.
They try to make it appear as though they are still tied to the memory
in the target into which their value was written when the expression is
run, but they also keep a copy in the Host which they use after the
value is made (expression results are "history values" so that's how we
make sure they have "the value at the time of the expression".)
However, that means that if you ask them to cast themselves to a value
bigger than their original size, they don't have a way to get more
memory for that purpose. The same thing is true of ValueObjects backed
by DataExtractors, the data extractors don't know how to get more data
than they were made with in general.
The only place where we actually ask ValueObjects to sample outside
their captured bounds is when you do ValueObject::Cast from one
structure type to a bigger structure type. In
https://reviews.llvm.org/D153657 I handled this by just disallowing
casts from one structure value to a larger one. My reasoning at the time
was that the use case for this was to support discriminator based C
inheritance schemes, and you can't directly cast values in C, only
pointers, so this was not a natural way to handle those types. It seemed
logical that since you would have had to start with pointers in the
implementation, that's how you would write your lldb introspection code
as well.
Famous last words...
Turns out there are some heavy users of the SB API's who were relying on
this working, and this is a behavior change, so this patch makes this
work in the cases where it used to work before, while still disallowing
the cases we don't know how to support.
Note that if you had done this Cast operation before with either
expression results or value objects from data extractors, lldb would not
have returned the correct results, so the cases this patch outlaws are
ones that actually produce invalid results. So nobody should be using
Cast in these cases, or if they were, this patch will point out the bug
they hadn't yet noticed.
symbol
This patch puts the default breakpoint on the
sanitizers_address_on_report symbol, and uses the old symbol as a backup
if the default case is not found
rdar://123911522
Walter Erquinigo added optional instruction annotations for x86
instructions in 2022 for the `thread trace dump instruction` command,
and code to DisassemblerLLVMC to add annotations for instructions that
change flow control, v. https://reviews.llvm.org/D128477
This was added as an option to `disassemble`, and the trace dump command
enables it by default, but several other instruction dumpers were
changed to display them by default as well. These are only implemented
for Intel instructions, so our disassembly on other targets ends up
looking like
```
(lldb) x/5i 0x1000086e4
0x1000086e4: 0xa9be6ffc unknown stp x28, x27, [sp, #-0x20]!
0x1000086e8: 0xa9017bfd unknown stp x29, x30, [sp, #0x10]
0x1000086ec: 0x910043fd unknown add x29, sp, #0x10
0x1000086f0: 0xd11843ff unknown sub sp, sp, #0x610
0x1000086f4: 0x910c63e8 unknown add x8, sp, #0x318
```
instead of `disassemble`'s output style of
```
lldb`main:
lldb[0x1000086e4] <+0>: stp x28, x27, [sp, #-0x20]!
lldb[0x1000086e8] <+4>: stp x29, x30, [sp, #0x10]
lldb[0x1000086ec] <+8>: add x29, sp, #0x10
lldb[0x1000086f0] <+12>: sub sp, sp, #0x610
lldb[0x1000086f4] <+16>: add x8, sp, #0x318
```
Adding symbolic annotations for assembly instructions is something I'm
interested in too, because we may have users investigating a crash or
apparent-incorrect behavior who must debug optimized assembly and they
may not be familiar with the ISA they're using, so short of flipping
through a many-thousand-page PDF to understand each instruction, they're
lost. They don't write assembly or work at that level, but to understand
a bug, they have to understand what the instructions are actually doing.
But the annotations that exist today don't move us forward much on that
front - I'd argue that the flow control instructions on Intel are not
hard to understand from their names, but that might just be my personal
bias. Much trickier instructions exist in any event.
Displaying this information by default for all targets when we only have
one class of instructions on one target is not a good default.
Also, in 2011 when Greg implemented the `memory read -f i` (aka `x/i`)
command
```
commit 5009f9d501
Author: Greg Clayton <gclayton@apple.com>
Date: Thu Oct 27 17:55:14 2011 +0000
[...]
eFormatInstruction will print out disassembly with bytes and it will use the
current target's architecture. The format character for this is "i" (which
used to be being used for the integer format, but the integer format also has
"d", so we gave the "i" format to disassembly), the long format is
"instruction".
```
he had DumpDataExtractor's DumpInstructions print the bytes of the
instruction -- that's the first field we see above for the `x/5i` after
the address -- and this is only useful for people who are debugging the
disassembler itself, I would argue. I don't want this displayed by
default either.
tl;dr this patch removes both fields from `memory read -f -i` and I
think this is the right call today. While I'm really interested in
instruction annotation, I don't think `x/i` is the right place to have
it enabled by default unless it's really compelling on at least some of
our major targets.
If the `m_editor_status` is `EditorStatus::Editing`, PrintAsync clears
the currently edited line. In some situations, the edited line is not
saved. After the stream flushes, PrintAsync tries to display the unsaved
line, causing the loss of the edited line.
The issue arose while I was debugging REPRLRun in
[Fuzzilli](https://github.com/googleprojectzero/fuzzilli). I started
LLDB and attempted to set a breakpoint in libreprl-posix.c. I entered
`breakpoint set -f lib` and used the "tab" key for command completion.
After completion, the edited line was flushed, leaving a blank line.
Change GetNumChildren()/CalculateNumChildren() methods return
llvm::Expected
This is an NFC change that does not yet add any error handling or change
any code to return any errors.
This is the second big change in the patch series started with
https://github.com/llvm/llvm-project/pull/83501
A follow-up PR will wire up error handling.
Change GetNumChildren()/CalculateNumChildren() methods return
llvm::Expected
This is an NFC change that does not yet add any error handling or change
any code to return any errors.
This is the second big change in the patch series started with
https://github.com/llvm/llvm-project/pull/83501
A follow-up PR will wire up error handling.
535da10842 introduced a check, when execve fails,
to see if we are allowed to trace programs at all.
Unfortunately because we like to call Status vars "error" and "status"
in various combinations, one got misnamed. This lead to lldb-server trying
to make an error value out of a success value when you did the following:
```
$ ./bin/lldb-server gdbserver 127.0.0.1:1234 -- is_not_a_file
Assertion failed: (Err && "Cannot create Expected<T> from Error success value."), function Expected...
```
This happened because the execve fails, but the check whether we can
trace says yes we can trace, but then we use the Status from the check
to create the return value. That Status is in fact a success value not
the failed Status we got from the execve attempt.
With the name corrected you now get:
```
$ ./bin/lldb-server gdbserver 127.0.0.1:1234 -- is_not_a_file
error: failed to launch 'is_not_a_file': execve failed: No such file or directory
```
Which is what we expect to see. This also fixes the test `TestGDBRemoteLaunch.py`
when asserts are enabled.
Currently, progress events reported by the ProgressManager and broadcast
to eBroadcastBitProgressCategory always specify they're complete. The
problem is that the ProgressManager reports kNonDeterministicTotal for
both the total and the completed number of (sub)events. Because the
values are the same, the event reports itself as complete.
This patch fixes the issue by reporting 0 as the completed value for the
start event and kNonDeterministicTotal for the end event.
In Swift's downstream lldb, there are a number of experimental properties. This change
extracts a getter function containing the common logic for getting a boolean valued
experimental property.
This also deletes `SetInjectLocalVariables` which isn't used anywhere.
When debugging LLDB itself, it can often be useful to know the mangled
name of the function where a breakpoint is set. Since the `--verbose`
setting of `break --list` is aimed at debugging LLDB, this patch makes
it so that the mangled name is also printed in that mode.
Note about testing: since mangling is not the same on Windows and Linux,
the test refrains from hardcoding mangled names.
DWP files don't usually have a GNU build ID built into them. When
searching for a .dwp file, don't require a UUID to be in the .dwp file.
The debug info search information was checking for a UUID in the .dwp
file when debug info search paths were being used. This is now fixed by
not specifying the UUID in the ModuleSpec being used for the .dwp file
search.
We got user reporting lldb crash while the debuggee is calling vfork()
concurrently from multiple threads.
The crash happens because the current implementation can only handle
single vfork, vforkdone protocol transaction.
This diff fixes the crash by lldb-server storing forked debuggee's <pid,
tid> pair in jstopinfo which will be decoded by lldb client to create
StopInfoVFork for follow parent/child policy. Each StopInfoVFork will
later have a corresponding vforkdone packet. So the patch also changes
the `m_vfork_in_progress` to be reference counting based.
Two new test cases are added which crash/assert without the changes in
this patch.
---------
Co-authored-by: jeffreytan81 <jeffreytan@fb.com>
[lldb] Add SBProcess methods for get/set/use address masks (#83095)
I'm reviving a patch from phabracator, https://reviews.llvm.org/D155905
which was approved but I wasn't thrilled with all the API I was adding
to SBProcess for all of the address mask types / memory regions. In this
update, I added enums to control type address mask type (code, data,
any) and address space specifiers (low, high, all) with defaulted
arguments for the most common case. I originally landed this via
https://github.com/llvm/llvm-project/pull/83095 but it failed on CIs
outside of arm64 Darwin so I had to debug it on more environments
and update the patch.
This patch is also fixing a bug in the "addressable bits to address
mask" calculation I added in AddressableBits::SetProcessMasks. If lldb
were told that 64 bits are valid for addressing, this method would
overflow the calculation and set an invalid mask. Added tests to check
this specific bug while I was adding these APIs.
This patch changes the value of "no mask set" from 0 to
LLDB_INVALID_ADDRESS_MASK, which is UINT64_MAX. A mask of all 1's
means "no bits are used for addressing" which is an impossible mask,
whereas a mask of 0 means "all bits are used for addressing" which
is possible.
I added a base class implementation of ABI::FixCodeAddress and
ABI::FixDataAddress that will apply the Process mask values if they
are set to a value other than LLDB_INVALID_ADDRESS_MASK.
I updated all the callers/users of the Mask methods which were
handling a value of 0 to mean invalid mask to use
LLDB_INVALID_ADDRESS_MASK.
I added code to the all AArch64 ABI Fix* methods to apply the
Highmem masks if they have been set. These will not be set on a
Linux environment, but in TestAddressMasks.py I test the highmem
masks feature for any AArch64 target, so all AArch64 ABI plugins
must handle it.
rdar://123530562
llvm-project/lldb/source/Core/Debugger.cpp:107:14:
error: no type named 'DefaultThreadPoolThreadPool' in namespace 'llvm'
static llvm::DefaultThreadPoolThreadPool *g_thread_pool = nullptr;
~~~~~~^
1 error generated.
The base class llvm::ThreadPoolInterface will be renamed
llvm::ThreadPool in a subsequent commit.
This is a breaking change: clients who use to create a ThreadPool must
now create a DefaultThreadPool instead.
Currently, calls to Host::SystemLog print to stderr on all host
platforms except Darwin. This severely limits its value on the command
line, where we don't want to overload the user with log messages. Switch
to using the syslog function on POSIX systems to send messages to the
system logger instead of stdout.
On Darwin systems this sends the log message to os_log, which matches
what we do today. Nevertheless I kept the current implementation that
uses os_log directly as it gives us more freedom.
I'm not sure if there's an equivalent on Windows, so I kept the existing
behavior of logging to stderr.
Currently, for x86 and x86_64 triples, "+sse" and "+sse2" are appended
to `Features` vector of `TargetOptions` unconditionally. This vector is
later reset in `TargetInfo::CreateTargetInfo` and filled using info from
`FeaturesAsWritten` vector, so previous modifications of the `Features`
vector have no effect. For x86_64 triple, we append "sse2"
unconditionally in `X86TargetInfo::initFeatureMap`, so despite the
`Features` vector reset, we still have the desired sse features enabled.
The corresponding code in `X86TargetInfo::initFeatureMap` is marked as
FIXME, so we should not probably rely on it and should set desired
features properly in `ClangExpressionParser`.
This patch changes the vector the features are appended to from
`Features` to `FeaturesAsWritten`. It's not reset later and is used to
compute resulting `Features` vector.
The help output for `thread backtrace` specifies that you can pass -1 to
`--count` to display all the frames.
```
-c <count> ( --count <count> )
How many frames to display (-1 for all)
```
However, that doesn't work:
```
(lldb) thread backtrace --count -1
error: invalid integer value for option 'c'
```
The problem is that we store the option value as an unsigned and the
code to parse the string correctly rejects it. There's two ways to fix
this:
1. Make `m_count` a signed value so that it accepts negative values and
appease the parser. The function that prints the frames takes an
unsigned so a negative value will just become a really large positive
value, which is what the current implementation relies on.
2. Keep `m_count` unsigned and instead use 0 the magic value to show all
frames. I don't really see a point in not showing any frames at all,
plus that's already broken (`error: error displaying backtrace for
thread: "0x0001"`).
This patch implements (2) and at the same time improve the error
reporting so that we print the invalid value when we cannot parse it.
rdar://123881767
This commit adds the functionality to broadcast events using the
`Debugger::eBroadcastProgressCategory`
bit (https://github.com/llvm/llvm-project/pull/81169) by keeping track
of these reports with the `ProgressManager`
class (https://github.com/llvm/llvm-project/pull/81319). The new bit is
used in such a way that it will only broadcast the initial and final
progress reports for specific progress categories that are managed by
the progress manager.
This commit also adds a new test to the progress report unit test that
checks that only the initial/final reports are broadcasted when using
the new bit.
This patch adds support to sort the symbol table by size. The command
already supports sorting and it already reports sizes. Sorting by size
helps diagnosing size issues.
rdar://123788375
This reverts commit 9a12b0a600.
TestAddressMasks fails its first test on lldb-x86_64-debian,
lldb-arm-ubuntu, lldb-aarch64-ubuntu bots. Reverting while
investigating.
I'm reviving a patch from phabracator, https://reviews.llvm.org/D155905
which was approved but I wasn't thrilled with all the API I was adding
to SBProcess for all of the address mask types / memory regions. In this
update, I added enums to control type address mask type (code, data,
any) and address space specifiers (low, high, all) with defaulted
arguments for the most common case.
This patch is also fixing a bug in the "addressable bits to address
mask" calculation I added in AddressableBits::SetProcessMasks. If lldb
were told that 64 bits are valid for addressing, this method would
overflow the calculation and set an invalid mask. Added tests to check
this specific bug while I was adding these APIs.
rdar://123530562
Layout information for a record gets stored in the `ClangASTImporter`
associated with the `DWARFASTParserClang` that originally parsed the
record. LLDB sometimes moves clang types from one AST to another (in the
reproducer the origin AST was a precompiled-header and the destination
was the AST backing the executable). When clang then asks LLDB to
`layoutRecordType`, it will do so with the help of the
`ClangASTImporter` the type is associated with. If the type's origin is
actually in a different LLDB module (and thus a different
`DWARFASTParserClang` was used to set its layout info), we won't find
the layout info in our local `ClangASTImporter`.
In the reproducer this meant we would drop the alignment info of the
origin type and misread a variable's contents with `frame var` and
`expr`.
There is logic in `ClangASTSource::layoutRecordType` to import an
origin's layout info. This patch re-uses that infrastructure to import
an origin's layout from one `ClangASTImporter` instance to another.
rdar://123274144
This patch moves the logic for copying the layout info of a
`RecordDecl`s origin into a target AST.
A follow-up patch re-uses the logic from within the `ClangASTImporter`,
so the natural choice was to move it there.
There was a think-o in a previous commit that made us only able to
define 1 line commands when using command script add interactively.
There was also no test for this feature, so I fixed the think-o and
added a test.
The help for the `-r` option to `image list` says:
-r[<width>] ( --ref-count=[<width>] )
Display the reference count if the module is still in the shared module
cache.
but that's not what it actually does. It unconditionally shows the
use_count for all Module shared pointers, regardless of whether they are
still in the shared module cache or whether they are just in the
ModuleCollection and other entities are keeping them alive. That seems
like a more useful behavior, but then it is also useful to know what's
in the shared cache, so I changed this to:
-r[<width>] ( --ref-count=[<width>] )
Display whether the module is still in the the shared module cache
(Y/N), and its shared pointer use_count.
So instead of just `{5}` you will see `{Y 5}` if it is in the shared
cache and `{N 5}` if not.
I didn't add tests for this because I'm not sure how much we want to fix
shared cache behavior in the testsuite.
https://github.com/modularml/mojo/issues/1796 discovered that if you try
to complete a space-only line in the REPL on Linux, LLDB crashes. I
suspect that editline doesn't behave the same way on linux and on
darwin, because I can't replicate this on darwin.
Adding a boundary check in the completion code prevents the crash from
happening.
Partly, there's just a lot of unnecessary boiler plate. It's also
possible to define combinations of arguments that make no sense (e.g.
eArgRepeatPlus followed by eArgRepeatPlain...) but these are never
checked since we just push_back directly into the argument definitions.
This commit is step 1 of this cleanup - do the obvious stuff. In it, all
the simple homogenous argument lists and the breakpoint/watchpoint
ID/Range types, are set with common functions. This is an NFC change, it
just centralizes boiler plate. There's no checking yet because you can't
get a single argument wrong.
The end goal is that all argument definition goes through functions and
m_arguments is hidden so that you can't define inconsistent argument
sets.
This updates the remaining SetOptionValue methods in
CommandObjectBreakpoint to use CreateOptionParsingError.
I found a few minor bugs that were fixed during this refactor (e.g.
using the wrong flag in an error message). That is one of the benefits
of centralizing error message creation.
I also found some option parsing code that is written incorrectly. I do
not make an attempt to update those here because this PR is primarily
about changing existing error handling code, not adding new error
handling code.