Files
llvm/lldb/source/Commands/CommandObjectDisassemble.h
Abdullah Mohammad Amin 8ec4db5e17 Stateful variable-location annotations in Disassembler::PrintInstructions() (follow-up to #147460) (#152887)
**Context**  
Follow-up to
[#147460](https://github.com/llvm/llvm-project/pull/147460), which added
the ability to surface register-resident variable locations.
This PR moves the annotation logic out of `Instruction::Dump()` and into
`Disassembler::PrintInstructions()`, and adds lightweight state tracking
so we only print changes at range starts and when variables go out of
scope.

---

## What this does

While iterating the instructions for a function, we maintain a “live
variable map” keyed by `lldb::user_id_t` (the `Variable`’s ID) to
remember each variable’s last emitted location string. For each
instruction:

- **New (or newly visible) variable** → print `name = <location>` once
at the start of its DWARF location range, cache it.
- **Location changed** (e.g., DWARF range switched to a different
register/const) → print the updated mapping.
- **Out of scope** (was tracked previously but not found for the current
PC) → print `name = <undef>` and drop it.

This produces **concise, stateful annotations** that highlight variable
lifetime transitions without spamming every line.

---

## Why in `PrintInstructions()`?

- Keeps `Instruction` stateless and avoids changing the
`Instruction::Dump()` virtual API.
- Makes it straightforward to diff state across instructions (`prev →
current`) inside the single driver loop.

---

## How it works (high-level)

1. For the current PC, get in-scope variables via
`StackFrame::GetInScopeVariableList(/*get_parent=*/true)`.
2. For each `Variable`, query
`DWARFExpressionList::GetExpressionEntryAtAddress(func_load_addr,
current_pc)` (added in #144238).
3. If the entry exists, call `DumpLocation(..., eDescriptionLevelBrief,
abi)` to get a short, ABI-aware location string (e.g., `DW_OP_reg3 RBX →
RBX`).
4. Compare against the last emitted location in the live map:
   - If not present → emit `name = <location>` and record it.
   - If different → emit updated mapping and record it.
5. After processing current in-scope variables, compute the set
difference vs. the previous map and emit `name = <undef>` for any that
disappeared.

Internally:
- We respect file↔load address translation already provided by
`DWARFExpressionList`.
- We reuse the ABI to map LLVM register numbers to arch register names.

---

## Example output (x86_64, simplified)

```
->  0x55c6f5f6a140 <+0>:  cmpl   $0x2, %edi                                                         ; argc = RDI, argv = RSI
    0x55c6f5f6a143 <+3>:  jl     0x55c6f5f6a176            ; <+54> at d_original_example.c:6:3
    0x55c6f5f6a145 <+5>:  pushq  %r15
    0x55c6f5f6a147 <+7>:  pushq  %r14
    0x55c6f5f6a149 <+9>:  pushq  %rbx
    0x55c6f5f6a14a <+10>: movq   %rsi, %rbx
    0x55c6f5f6a14d <+13>: movl   %edi, %r14d
    0x55c6f5f6a150 <+16>: movl   $0x1, %r15d                                                        ; argc = R14
    0x55c6f5f6a156 <+22>: nopw   %cs:(%rax,%rax)                                                    ; i = R15, argv = RBX
    0x55c6f5f6a160 <+32>: movq   (%rbx,%r15,8), %rdi
    0x55c6f5f6a164 <+36>: callq  0x55c6f5f6a030            ; symbol stub for: puts
    0x55c6f5f6a169 <+41>: incq   %r15
    0x55c6f5f6a16c <+44>: cmpq   %r15, %r14
    0x55c6f5f6a16f <+47>: jne    0x55c6f5f6a160            ; <+32> at d_original_example.c:5:10
    0x55c6f5f6a171 <+49>: popq   %rbx                                                               ; i = <undef>
    0x55c6f5f6a172 <+50>: popq   %r14                                                               ; argv = RSI
    0x55c6f5f6a174 <+52>: popq   %r15                                                               ; argc = RDI
    0x55c6f5f6a176 <+54>: xorl   %eax, %eax
    0x55c6f5f6a178 <+56>: retq  
```

Only transitions are shown: the start of a location, changes, and
end-of-lifetime.

---

## Scope & limitations (by design)

- Handles **simple locations** first (registers, const-in-register cases
surfaced by `DumpLocation`).
- **Memory/composite locations** are out of scope for this PR.
- Annotations appear **only at range boundaries** (start/change/end) to
minimize noise.
- Output is **target-independent**; register names come from the target
ABI.

## Implementation notes

- All annotation printing now happens in
`Disassembler::PrintInstructions()`.
- Uses `std::unordered_map<lldb::user_id_t, std::string>` as the live
map.
- No persistent state across calls; the map is rebuilt while walking
instruction by instruction.
- **No changes** to the `Instruction` interface.

---

## Requested feedback

- Placement and wording of the `<undef>` marker.
- Whether we should optionally gate this behind a setting (currently
always on when disassembling with an `ExecutionContext`).
- Preference for immediate inclusion of tests vs. follow-up patch.

---

Thanks for reviewing! Happy to adjust behavior/format based on feedback.

---------

Co-authored-by: Jonas Devlieghere <jonas@devlieghere.com>
Co-authored-by: Adrian Prantl <adrian.prantl@gmail.com>
2025-08-28 10:41:38 -07:00

113 lines
3.6 KiB
C++

//===-- CommandObjectDisassemble.h ------------------------------*- C++ -*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#ifndef LLDB_SOURCE_COMMANDS_COMMANDOBJECTDISASSEMBLE_H
#define LLDB_SOURCE_COMMANDS_COMMANDOBJECTDISASSEMBLE_H
#include "lldb/Interpreter/CommandObject.h"
#include "lldb/Interpreter/Options.h"
#include "lldb/Utility/ArchSpec.h"
namespace lldb_private {
// CommandObjectDisassemble
class CommandObjectDisassemble : public CommandObjectParsed {
public:
class CommandOptions : public Options {
public:
CommandOptions();
~CommandOptions() override;
Status SetOptionValue(uint32_t option_idx, llvm::StringRef option_arg,
ExecutionContext *execution_context) override;
void OptionParsingStarting(ExecutionContext *execution_context) override;
llvm::ArrayRef<OptionDefinition> GetDefinitions() override;
const char *GetPluginName() {
return (plugin_name.empty() ? nullptr : plugin_name.c_str());
}
const char *GetFlavorString() {
if (flavor_string.empty() || flavor_string == "default")
return nullptr;
return flavor_string.c_str();
}
const char *GetCPUString() {
if (cpu_string.empty() || cpu_string == "default")
return nullptr;
return cpu_string.c_str();
}
const char *GetFeaturesString() {
if (features_string.empty() || features_string == "default")
return nullptr;
return features_string.c_str();
}
Status OptionParsingFinished(ExecutionContext *execution_context) override;
bool show_mixed; // Show mixed source/assembly
bool show_bytes;
bool show_control_flow_kind;
uint32_t num_lines_context = 0;
uint32_t num_instructions = 0;
bool raw;
std::string func_name;
bool current_function = false;
lldb::addr_t start_addr = 0;
lldb::addr_t end_addr = 0;
bool at_pc = false;
bool frame_line = false;
std::string plugin_name;
std::string flavor_string;
std::string cpu_string;
std::string features_string;
ArchSpec arch;
bool some_location_specified = false; // If no location was specified, we'll
// select "at_pc". This should be set
// in SetOptionValue if anything the selects a location is set.
lldb::addr_t symbol_containing_addr = 0;
bool force = false;
bool enable_variable_annotations = false;
};
CommandObjectDisassemble(CommandInterpreter &interpreter);
~CommandObjectDisassemble() override;
Options *GetOptions() override { return &m_options; }
protected:
void DoExecute(Args &command, CommandReturnObject &result) override;
llvm::Expected<std::vector<AddressRange>>
GetRangesForSelectedMode(CommandReturnObject &result);
llvm::Expected<std::vector<AddressRange>> GetContainingAddressRanges();
llvm::Expected<std::vector<AddressRange>> GetCurrentFunctionRanges();
llvm::Expected<std::vector<AddressRange>> GetCurrentLineRanges();
llvm::Expected<std::vector<AddressRange>>
GetNameRanges(CommandReturnObject &result);
llvm::Expected<std::vector<AddressRange>> GetPCRanges();
llvm::Expected<std::vector<AddressRange>> GetStartEndAddressRanges();
llvm::Expected<std::vector<AddressRange>>
CheckRangeSize(std::vector<AddressRange> ranges, llvm::StringRef what);
CommandOptions m_options;
};
} // namespace lldb_private
#endif // LLDB_SOURCE_COMMANDS_COMMANDOBJECTDISASSEMBLE_H