1973 Commits

Author SHA1 Message Date
Maksim Panchenko
0fc791cd2c [BOLT] Fix comparison function for Linux ORC entries (#79921)
Fix ORC entry comparison function to cover a case with multiple
terminator entries matching at the same IP.
2024-01-29 17:45:40 -08:00
Maksim Panchenko
aa1968c2eb [BOLT] Add metadata pre-emit finalization interface (#79925)
Some metadata needs to be updated/finalized before the binary context is
emitted into the binary. Add the interface and use it for Linux ORC
update invocation.
2024-01-29 17:27:33 -08:00
Kazu Hirata
03cba44029 [BOLT] Use SmallString::operator std::string (NFC) 2024-01-27 09:32:21 -08:00
spupyrev
9058503d26 [BOLT] Deprecate hfsort+ in favor of cdsort (#72408)
A new function sorting algorithm (cdsort) in LLVM is an optimized
version of BOLT's hfsort+. To avoid code duplication and
simplify maintenance, this change gets rid of hfsort+.

Perf-wise this is likely a neutral change, though differences on
individual benchmarks are possible, since the generated function layout
has changed. I tested cdsort vs hfsort+ on a number of open-source and
prod binaries built in different modes and recorded a neutral average
perf difference, perhaps with more "green" counters.
2024-01-26 06:51:55 -08:00
Amir Ayupov
df7d2b2f90 [BOLT] Deduplicate equal offsets in BAT (#76905)
Encode BRANCHENTRY bits as bitmask for deduplicated entries.

Reduces BAT section size:
- large binary: to 11834216 bytes (0.31x original),
- medium binary: to 1565584 bytes (0.26x original),
- small binary: to 336 bytes (0.23x original).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-25 15:37:47 -08:00
Alexander Yermolovich
7d272722fb [BOLT][DWARF] Add option to specify DW_AT_comp_dir (#79395)
Added a --comp-dir-override option that overrides DW_AT_comp_dir in the
unit die. This allows llvm-bolt to be invoked from any directory and
still find .dwo files.
2024-01-25 15:00:52 -08:00
Amir Ayupov
e9309b27d7 [BOLT] Report input staleness (#79496)
It's beneficial to have uniform reporting in both `infer-stale-profile`
on and off cases, primarily for logging purposes.

Without this change, BOLT would report "input" staleness in
`infer-stale-profile=0` case (without matching), and "output" staleness
in `infer-stale-profile=1` case (after matching).

This change makes BOLT report "input" staleness in both cases. "Output"
staleness information is printed separately with "BOLT-INFO: inferred
profile..."
2024-01-25 14:15:13 -08:00
Alexander Yermolovich
bb6a485055 [BOLT] Fix updating DW_AT_stmt_list for DWARF5 TUs (#79374)
Changed so that we also update DW_AT_stmt_list for DWARF5 TUs. BOLT was
doing it for DWARF4, but it wasn't doing it for DWARF5.
2024-01-24 15:34:29 -08:00
Amir Ayupov
6735ce9d25 [BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)
Fix the bug where merge-fdata unconditionally outputs boltedcollection
line, regardless of whether input files have it set.

Test Plan:
Added bolt/test/X86/merge-fdata-nobat-mode.test which fails without this
fix.
2024-01-18 20:00:47 -08:00
Amir Ayupov
9fec33aadc Revert "[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)"
This reverts commit 82bc33ea3f.

Accidentally pushed unrelated changes.
2024-01-18 19:59:09 -08:00
Amir Ayupov
82bc33ea3f [BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)
Fix the bug where merge-fdata unconditionally outputs boltedcollection 
line, regardless of whether input files have it set.

Test Plan:
Added bolt/test/X86/merge-fdata-nobat-mode.test which fails without this
fix.
2024-01-18 19:44:16 -08:00
Amir Ayupov
8f1d94aaea [BOLT] Use continuous output addresses in delta encoding in BAT
Make output function addresses be delta-encoded wrt last offset in the
previous function. This reduces the deltas in function start addresses.

Test Plan:
Reduces BAT section size to:
- large binary: 12218860 bytes (0.32x original),
- medium binary: 1606580 bytes (0.27x original),
- small binary: 404 bytes (0.28x original).

Reviewers: rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/76904
2024-01-18 13:49:44 -08:00
Kazu Hirata
6da4a7a8e2 [BOLT] Use SmallString::operator std::string (NFC) 2024-01-15 21:59:06 -08:00
Amir Ayupov
dcba077146 [BOLT] Embed cold mapping info into function entry in BAT (#76903)
Reduces BAT section size:
- large binary: to 12283500 bytes (0.32x original size),
- medium binary: to 1616020 bytes (0.27x original size),
- small binary: to 404 bytes (0.28x original size).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-12 13:02:32 -08:00
spupyrev
0daf303e79 [BOLT] Fix double conversion in CacheMetrics (#75253)
The change (i) fixes an issue with double-int conversion in CacheMetrics
and (ii) removes command-line options for computing metrics (which aren't
modified anyway).
This change might break some tests verifying the exact output of
CacheMetrics.
2024-01-12 10:27:12 -08:00
Alexey Lapshin
35708b0754 [DWARFLinker][NFC] Rename libraries to match with directories name. (#77592)
It was noted that the new DWARFLinker libraries do not follow the naming
agreement:
https://github.com/llvm/llvm-project/pull/75925#issuecomment-1883301659
This patch renames the libraries to match the agreement.

Rename the LLVMDWARFLinkerBase library to LLVMDWARFLinker. Rename the
LLVMDWARFLinker library to LLVMDWARFLinkerClassic. Correct include
paths according to the new directory structure.
2024-01-12 15:36:44 +03:00
Amir Ayupov
8fb8ad66c9 [BOLT] Delta-encode function start addresses in BAT (#76902)
Further reduce the size of BAT section:
- large binary: to 12716312 bytes (0.33x original),
- medium binary: to 1649472 bytes (0.28x original),
- small binary: to 428 bytes (0.30x original).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-11 14:35:37 -08:00
Amir Ayupov
bbe07989d7 [BOLT] Delta-encode offsets in BAT (#76900)
This change further reduces the size of BAT:
- large binary: to 13073904 bytes (0.34x original),
- medium binary: to 1703116 bytes (0.29x original),
- small binary: to 436 bytes (0.30x original).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-11 14:29:46 -08:00
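
Note on bbe07989d7: the commit message names delta encoding but does not spell out which BAT fields are affected or how they are laid out. The following is a generic sketch of the technique only, assuming a sorted list of offsets; the function names `delta_encode` and `delta_decode` are hypothetical and are not BOLT APIs.

```python
from typing import List


def delta_encode(offsets: List[int]) -> List[int]:
    """Store each offset as the difference from the previous one.

    Assumption: the first offset is stored relative to 0.
    """
    deltas = []
    previous = 0
    for offset in offsets:
        deltas.append(offset - previous)
        previous = offset
    return deltas


def delta_decode(deltas: List[int]) -> List[int]:
    """Rebuild the original offsets by accumulating the deltas."""
    offsets = []
    current = 0
    for delta in deltas:
        current += delta
        offsets.append(current)
    return offsets
```

For monotonically increasing offsets the deltas are small non-negative numbers, which is what makes a variable-length integer encoding pay off.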
Amir Ayupov
565f40d66b [BOLT] Encode BAT using ULEB128 (#76899)
Reduces BAT section size, bytes:
- large binary: 38676872 -> 23262524 (0.60x),
- medium binary (trunk clang): 5938004 -> 3213504 (0.54x),
- small binary (X86/bolt-address-translation.test): 1436 -> 680 (0.47x).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-11 12:16:30 -08:00
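
Note on 565f40d66b: ULEB128 is the unsigned little-endian base-128 variable-length integer encoding also used by DWARF. The sketch below shows the encoding itself, not BOLT's writer; `encode_uleb128` and `decode_uleb128` are illustrative names chosen here.

```python
from typing import Tuple


def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer, 7 bits per byte, low bits first."""
    if value < 0:
        raise ValueError("ULEB128 encodes only non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)  # high bit clear: last byte
            return bytes(out)


def decode_uleb128(data: bytes, pos: int = 0) -> Tuple[int, int]:
    """Decode one value starting at pos; return (value, next position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
```

Values below 128 take a single byte, which is consistent with the size reductions the commit reports for a table dominated by small numbers.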
Amir Ayupov
a7cf0a1f7f [BOLT] Add BOLT Address Translation documentation (#76899)
Test Plan: Open the page in browser
2024-01-11 12:15:00 -08:00
Amir Ayupov
2bb511e277 [BOLT][NFC] Print BAT section size (#76897)
Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-11 11:04:04 -08:00
avl-llvm
2357e899cb [DWARFLinker][DWARFLinkerParallel][NFC] Refactor DWARFLinker&DWARFLinkerParallel to have a common library. Part 1. (#75925)
This patch creates the DWARFLinkerBase library, places DWARFLinker code into
DWARFLinker\Classic, places DWARFLinkerParallel into DWARFLinker\Parallel,
and updates BOLT to use the new library. This patch is NFC.
2024-01-09 11:32:08 +03:00
Min-Yih Hsu
23e03a85dc [BOLT] Update test case after #77253
PR #77253 removed the '@plt' suffix from callee symbols. Update
RISCV/relax.s accordingly.
2024-01-08 11:05:38 -08:00
Ben Langmuir
08c5f1fede [ORC] Add absoluteSymbolsLinkGraph to expose absolute symbols to platform (#77008)
Adds a function to create a LinkGraph of absolute symbols, and a
callback in dynamic library search generators to enable using it to
expose its symbols to the platform/orc runtime. This allows e.g. using
__orc_rt_run_program to run a precompiled function that was found via
dlsym. Ideally we would use this in llvm-jitlink's own search generator,
but it will require more work to align with the Process/Platform
JITDylib split, so not handled here.

As part of this change we need to handle LinkGraphs that only have
absolute symbols.
2024-01-05 15:32:29 -08:00
ShatianWang
1577483413 [BOLT] Don't split likely fallthrough in CDSplit (#76164)
This diff speeds up CDSplit by not considering any hot-warm splitting
point that could break a fall-through branch from a basic block to its
most likely successor.

Co-authored-by: spupyrev <spupyrev@fb.com>
2023-12-21 16:17:10 -05:00
Alexander Yermolovich
ad4cead67c [BOLT][DWARF][NFC] Initialize CloneUnitCtxMap with current partition size (#75876)
We would always allocate the maximum amount for the vector containing
DWARFUnitInfo. In real use cases, what ends up happening is that we
allocate a giant vector when processing one CU, or, for the thin-lto case,
multiple CUs. This led to a lot of memory overhead, and a 2x BOLT
processing slowdown for at least one service built with monolithic DWARF.

For binaries built with LTO with clang, all CUs that have cross
references will share an abbrev table and will be processed in one
batch. The rest of the CUs are processed in batches of
--cu-processing-batch-size, which defaults to 1.

For theoretical cases where cross-CU references are present but the CUs
do not share an abbrev table, the size of CloneUnitCtxMap is increased as
each CU is processed.
2023-12-20 16:12:52 -08:00
Jon Roelofs
d6f772074c fixup! fixup! [GlobalISel] Always direct-call IFuncs and Aliases (#74902)
Apparently some BOLT bots build with a pre-installed system clang, and others
use the just-built one. These two clangs now behave slightly differently when
it comes to ifunc codegen after https://github.com/llvm/llvm-project/pull/74902

Change the test to accept both patterns.
2023-12-15 12:48:11 -07:00
Jon Roelofs
3017adb37e fixup! [GlobalISel] Always direct-call IFuncs and Aliases (#74902)
The codegen change broke one of the BOLT tests.
2023-12-15 12:17:07 -07:00
Wang Yaduo
c532ba4edd [RISCV] Support printing immediate of RISCV MCInst in hexadecimal format (#74053)
Enable the llvm-objdump to disassemble the immediate of RISCV
instruction in hexadecimal format with --print-imm-hex flag.
2023-12-14 22:42:11 -08:00
Vitaly Buka
fc3adf74d3 Revert "[RISCV] Support printing immediate of RISCV MCInst in hexadecimal format" (#75561)
Reverts llvm/llvm-project#74053

Breaks https://lab.llvm.org/buildbot/#/builders/5/builds/39291

Co-authored-by: Wang Yaduo <wangyaduo@linux.alibaba.com>

Issue #75563
2023-12-14 22:05:47 -08:00
Wang Yaduo
3dde0d0256 [RISCV] Support printing immediate of RISCV MCInst in hexadecimal format (#74053)
Enable the llvm-objdump to disassemble the immediate of RISCV
instruction in hexadecimal format with --print-imm-hex flag.
2023-12-15 10:13:20 +08:00
Alexander Yermolovich
bf2b035e58 [BOLT][DWARF] Fix handling .debug_str_offsets for type units (#75522)
There was an assumption that TUs and CUs share a .debug_str_offsets
contribution. For ThinLTO builds this is not the case. Changed so that we
parse contributions for TUs also, and did some refactoring so that we
don't re-parse contributions that were not modified.
2023-12-14 17:27:21 -08:00
Kazu Hirata
ad8fd5b185 [BOLT] Use StringRef::{starts,ends}_with (NFC)
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.

I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
2023-12-13 23:34:49 -08:00
Rafael Auler
a26aa79a3b [BOLT] Fix some dwarf tests affected by 75095 (#75327)
PR 75095 introduced some changes to lld that broke some dwarf tests that
were being incorrectly linked as a PIE. Add flags to disable any PIC/PIE
compilation, so the linker can succeed and the tests can run as
intended.
2023-12-13 06:11:15 -08:00
Alexander Yermolovich
fb9a851224 [BOLT][DWARF] Fix handling of debug_str_offsets (#75100)
We were not setting the size field of .debug_str_offsets correctly. Fixed
it, and added a test.
2023-12-11 15:56:32 -08:00
Kazu Hirata
1cc5431285 [BOLT] Fix warnings
This patch fixes:

  bolt/lib/Core/BinaryFunctionProfile.cpp:222:10: error: variable
  'BBMergeSI' set but not used [-Werror,-Wunused-but-set-variable]

  bolt/lib/Passes/VeneerElimination.cpp:67:12: error: variable
  'VeneerCallers' set but not used [-Werror,-Wunused-but-set-variable]
2023-12-11 12:55:29 -08:00
Amir Ayupov
b039ccc684 [BOLT] Provide backwards compatibility for YAML profile with std::hash (#74253)
Provide backwards compatibility for YAML profile that uses `std::hash`:
xxh3 hash is the default for newly produced profile (sets `std-hash: false`),
whereas the profile that doesn't specify `std-hash` will be treated as
`std-hash: true`, preserving old behavior.
2023-12-11 12:27:32 -08:00
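
Note on b039ccc684: the compatibility rule described above amounts to a default of true when the key is absent. A minimal sketch, assuming the profile header has already been parsed into a plain dictionary; the function name `uses_std_hash` and the dictionary representation are assumptions, not BOLT's actual reader.

```python
def uses_std_hash(profile_header: dict) -> bool:
    """Return True if the profile's hashes were produced with std::hash.

    A missing `std-hash` key is treated as true (old profiles);
    newly produced profiles set it to false explicitly (xxh3).
    """
    return bool(profile_header.get("std-hash", True))
```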
sinan
fdb13cf531 [BOLT] Fix local out-of-range stub issue in LongJmp (#73918)
If a local stub is out-of-range, at LongJmp we will try to find another
local stub first. However, the original implementation does not work as
expected and leads to an infinite loop between replaceTargetWithStub
and fixBranches.

After this patch, we first convert the target of BB back to the target
of the local stub, and then look up other valid local stubs, and so
on.
2023-12-11 10:38:28 +08:00
Nathan Sidwell
9596676e65 [BOLT] Determine address size from binary (#74870)
Query the executable for address size.
2023-12-09 14:39:57 -05:00
Ho Cheung
fa5486e487 [BOLT] [Passes] Fix two compile warnings in BOLT (#73086)
Fix build issue on Windows.

issue:#73085

@maksfb PTAL thank you
2023-12-06 11:19:07 -08:00
sinan
b304873134 [BOLT] Fix a wrong compiler option in test (#74420)
-nopie is an option for OpenBSD, and other Linux distributions might
report an `unsupported option '-nopie' for target` error.
2023-12-06 17:16:48 +08:00
eleviant
f20af7372f [bolt] Support arm64 FP register spills (#73021)
At the moment llvm-bolt fails when analyzing jump tables on aarch64
when FP register spill/reload is used.
2023-12-05 20:32:58 +01:00
ShatianWang
296088bdf3 [BOLT][NFC] Remove unused code for CDSplit (#74136)
This diff removes JumpInfo related code that is no longer needed by
CDSplit from SplitFunctions.cpp.
2023-12-01 15:21:30 -05:00
Amir Ayupov
9584f58344 [BOLT][utils] Bump default time threshold to 2s in nfc-stat-parser 2023-12-01 09:57:48 -08:00
Amir Ayupov
76a9ea1321 [BOLT][utils] Remove heatmap mode detection from wrapper script
Heatmap mode has been moved to a separate tool. Drop the support in
llvm-bolt-wrapper.
2023-12-01 09:57:48 -08:00
ShatianWang
4483cf2d8b [BOLT] CDSplit main logic part 2/2 (#74032)
This diff implements the main splitting logic of CDSplit. CDSplit
processes functions in a binary in parallel. For each function BF, it
assumes that all other functions are hot-cold split. For each possible
hot-warm split point of BF, it computes its corresponding SplitScore,
and chooses the split point with the best SplitScore. The SplitScore of
each split point is computed in the following way: each call edge or
jump edge has an edge score that is proportional to its execution count,
and inversely proportional to its distance. The SplitScore of a split
point is a sum of edge scores over a fixed set of edges whose distance
can change due to hot-warm splitting BF. This set contains all cover
calls in the form of X->Y or Y->X given function order [... X ... BF ...
Y ...]; we refer to the sum of edge scores over the set of cover calls
as CoverCallScore. This set also contains all jump edges (branches)
within BF as well as all call edges originated from BF; we refer to the
sum of edge scores over this set of edges as LocalScore. CDSplit finds
the split index maximizing CoverCallScore + LocalScore.
2023-11-30 23:17:11 -05:00
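
Note on 4483cf2d8b: the selection rule above can be sketched as follows. The commit message only says that an edge score is proportional to execution count and inversely proportional to distance, so `count / distance` is the simplest formula that fits and is an assumption here. The names `edge_score`, `split_score` and `best_split_index`, and the modelling of each edge's distance as a function of the split index, are also hypothetical.

```python
from typing import Callable, Iterable, Sequence, Tuple

# An edge is (execution count, function mapping split index -> distance).
Edge = Tuple[int, Callable[[int], float]]


def edge_score(count: int, distance: float) -> float:
    """Assumed formula: proportional to count, inverse to distance."""
    if distance <= 0:
        raise ValueError("distance must be positive in this sketch")
    return count / distance


def split_score(edges: Sequence[Edge], split_index: int) -> float:
    """Sum edge scores over the edges affected by the split."""
    return sum(edge_score(count, dist(split_index)) for count, dist in edges)


def best_split_index(edges: Sequence[Edge], candidates: Iterable[int]) -> int:
    """Pick the candidate split index with the highest total score."""
    return max(candidates, key=lambda index: split_score(edges, index))
```

In this sketch a single edge list stands in for both the cover calls and the local edges, so the total corresponds to CoverCallScore + LocalScore.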
ShatianWang
56bbf8135e [BOLT] CDSplit main logic part 1/2 (#73895)
This diff defines and initializes auxiliary variables used by CDSplit
and implements two important helper functions. The first helper function
approximates the block level size increase if a function is hot-warm
split at a given split index (X86 specific). The second helper function
finds all calls in the form of X->Y or Y->X for each BF given function
order [... X ... BF ... Y ...]. These calls are referred to as "cover
calls". Their distance will decrease if BF's hot fragment size is
further reduced by hot-warm splitting. NFC.
2023-11-30 20:55:36 -05:00
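
Note on 56bbf8135e: the second helper described above can be illustrated with a small sketch. It assumes the function order is a list of unique names and calls are (caller, callee) pairs whose endpoints both appear in that order; `cover_calls` is a hypothetical name and this quadratic loop is not the implementation in SplitFunctions.cpp.

```python
from typing import Dict, List, Sequence, Tuple

Call = Tuple[str, str]  # (caller, callee)


def cover_calls(order: Sequence[str], calls: Sequence[Call]) -> Dict[str, List[Call]]:
    """For each function BF, collect calls X->Y or Y->X where X is placed
    before BF and Y after BF in the given function order."""
    position = {name: index for index, name in enumerate(order)}
    result: Dict[str, List[Call]] = {name: [] for name in order}
    for caller, callee in calls:
        low, high = sorted((position[caller], position[callee]))
        # Every function strictly between the two endpoints is "covered".
        for index in range(low + 1, high):
            result[order[index]].append((caller, callee))
    return result
```

Shrinking the hot fragment of any function strictly between the two endpoints brings the caller and callee closer together, which is why these calls matter for the split decision.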
Maksim Panchenko
4f3081296f [BOLT][NFC] Fix comment (#73983)
Fix off-by-one error in comment.
2023-11-30 14:31:38 -08:00
Alexander Yermolovich
52be47b890 [BOLT][DWARF] Add support to create path (#73884)
When option --dwarf-output-path is specified, if the path does not exist
BOLT will now create it. This is what also happens when
--plugin-opt=dwo_dir=<value> is specified to LLD.
2023-11-30 09:41:01 -08:00
ShatianWang
c43d0432ef [BOLT] Create .text.warm for 3-way splitting (#73863)
This commit explicitly adds a warm code section, .text.warm, when
-split-functions -split-strategy=cdsplit is used. This replaces the
previous approach of using .text.cold.0 as warm and .text.cold.1 as cold
in 3-way function splitting. NFC.
2023-11-29 22:42:36 -05:00