2010 Commits

Author SHA1 Message Date
Mehdi Amini
6594f428de Split the llvm::ThreadPool into an abstract base class and an implementation (#82094)
This decouples the public API used to enqueue tasks and wait for
completion from the actual implementation, and opens up the possibility
for clients to set their own thread pool implementation for the pool.

https://discourse.llvm.org/t/construct-threadpool-from-vector-of-existing-threads/76883
2024-03-02 19:10:50 -08:00
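The split described in this commit can be illustrated with a minimal sketch. The names `TaskPoolInterface`, `DeferredTaskPool`, and `sumSquares` are hypothetical and are not the LLVM API. The sketch only shows the general shape: an abstract class that exposes enqueue and wait, plus one interchangeable implementation behind it.

```cpp
#include <functional>
#include <queue>
#include <utility>

// Hypothetical interface: clients only see how to enqueue work and wait.
class TaskPoolInterface {
public:
  virtual ~TaskPoolInterface() = default;
  virtual void async(std::function<void()> Task) = 0;
  virtual void wait() = 0;
};

// One possible implementation (an assumption for illustration): tasks are
// queued and then run on the calling thread when wait() is called.
class DeferredTaskPool final : public TaskPoolInterface {
  std::queue<std::function<void()>> Pending;

public:
  void async(std::function<void()> Task) override {
    Pending.push(std::move(Task));
  }

  void wait() override {
    while (!Pending.empty()) {
      std::function<void()> Task = std::move(Pending.front());
      Pending.pop();
      Task();
    }
  }
};

// Client code depends on the interface only, so any implementation can be
// passed in. With a truly concurrent pool, Sum would need synchronization.
inline int sumSquares(TaskPoolInterface &Pool, int N) {
  int Sum = 0;
  for (int I = 1; I <= N; ++I)
    Pool.async([&Sum, I] { Sum += I * I; });
  Pool.wait();
  return Sum;
}
```

A caller could construct a `DeferredTaskPool` and pass it to `sumSquares`. Swapping in a different implementation would not require changes to the client function.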
Maksim Panchenko
d7d564b2fc [BOLT] Add BinaryFunction::registerBranch(). NFC (#83337)
Add an external interface to register a branch in a function that is in
the disassembled state. This allows custom modifications to the
disassembler. E.g., a pre-CFG pass can add an instruction and register a
branch that will later be used during the CFG construction.
2024-02-28 20:04:28 -08:00
Maksim Panchenko
3f2a9e5910 [BOLT] Sort TakenBranches immediately before use. NFCI (#83333)
Move code that sorts TakenBranches right before the branches are used.
We can populate TakenBranches in pre-CFG post-processing and hence have
to postpone the sorting to a later point in the processing pipeline.
Will add such a pass later. For now it's NFC.
2024-02-28 19:51:44 -08:00
Maksim Panchenko
7c206c7812 [BOLT] Refactor interface for instruction labels. NFCI (#83209)
To avoid accidentally setting the label twice for the same instruction,
which can lead to a "lost" label, introduce getOrSetInstLabel()
function. Rename existing functions to getInstLabel()/setInstLabel() to
make it explicit that they operate on instruction labels. Add an
assertion in setInstLabel() that the instruction did not have a prior
label set.
2024-02-27 18:44:28 -08:00
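The get/set/get-or-set pattern described in this commit might look roughly like the following sketch. The `Inst` struct and the use of `std::string` for labels are assumptions made for illustration. Only the three function names come from the commit message, and their real signatures in BOLT may differ.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for an instruction that may carry one label.
struct Inst {
  std::optional<std::string> Label;
};

// Returns the label if one was set, or an empty optional otherwise.
inline std::optional<std::string> getInstLabel(const Inst &I) {
  return I.Label;
}

// Sets the label; asserts that no prior label exists, so that an existing
// label cannot be silently overwritten ("lost").
inline void setInstLabel(Inst &I, std::string L) {
  assert(!I.Label && "instruction already has a label");
  I.Label = std::move(L);
}

// Returns the existing label, or installs Fresh and returns it.
inline const std::string &getOrSetInstLabel(Inst &I, const std::string &Fresh) {
  if (!I.Label)
    setInstLabel(I, Fresh);
  return *I.Label;
}
```

Calling `getOrSetInstLabel()` twice on the same instruction returns the first label both times, which is the behavior that avoids the "lost" label.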
Maksim Panchenko
0e84e2748b [BOLT] Move test under X86 target. NFCI (#83202)
instrument-wrong-target.s test requires X86 host. Move it under
runtime/X86.
2024-02-27 15:38:31 -08:00
Elvina Yakubova
b98e6a5ced [BOLT][AArch64] Skip BBs only instead of functions (#81989)
After [this](846eb76761) commit, we noticed that the size of the fdata
file decreased a lot. A better and more precise approach is to skip only
basic blocks with exclusive instructions instead of the whole function.
2024-02-27 19:19:47 +03:00
Alexander Yermolovich
6de5fcc746 [BOLT][DWARF] Add support for .debug_names (#81062)
DWARF5 spec supports the .debug_names acceleration table. This is the
formalized version of a combination of gdb-index/pubnames/types. Added
an implementation of it to BOLT. It supports both monolithic and split
DWARF, with and without Type Units. It does not include parent indices.
This will be in a follow-up PR. Unlike LLVM output, this will put all the
CUs and TUs into one Module.
2024-02-26 14:00:31 -08:00
Alexander Yermolovich
841a4168ad [BOLT] Fix runtime/instrument-wrong-target.s test (#82858)
Test was failing when only X86 was specified for LLVM_TARGETS_TO_BUILD.
Changed so that it will now report unsupported.

For "X86;AArch64" it still passes.
For "X86" reports UNSUPPORTED: BOLT :: runtime/instrument-wrong-target.s
(1 of 1)
2024-02-26 13:43:39 -08:00
Alexander Yermolovich
004c1972b4 [BOLT][DWARF][NFC] Expose DebugStrOffsetsWriter::clear (#82548)
Refactored code that clears data structures in DebugStrOffsetsWriter into
clear() function and made initialize() public. This is for
https://github.com/llvm/llvm-project/pull/81062.
2024-02-21 16:48:02 -08:00
Alexander Yermolovich
640e781dc8 [BOLT][DWARF][NFC] Use SkeletonCU in place of IsDWO check (#82540)
Changed isDWO to a function that checks the Skeleton CU that is passed in.
This is in preparation for
https://github.com/llvm/llvm-project/pull/81062.
2024-02-21 16:18:18 -08:00
Maksim Panchenko
5daf2001a1 [BOLT] Fix memory leak in BinarySection (#82520)
The change in #80950 exposed a memory leak in BinarySection. Let
BinarySection manage memory passed via updateContents() unless a valid
SectionID is set indicating that the contents are managed by JITLink.
2024-02-21 11:54:34 -08:00
Mehdi Amini
744616b3ae Rename ThreadPool::getThreadCount() to getMaxConcurrency() (NFC) (#82296)
This is addressing a long-time TODO to rename this misleading API. The
old one is preserved for now but marked deprecated.
2024-02-19 18:07:12 -08:00
Maksim Panchenko
0ce0171243 [BOLT][NFC] Switch logging in LinuxKernelRewriter (#82195)
Use journaling streams introduced in #81524 for LinuxKernelRewriter.
2024-02-19 03:24:04 +00:00
Maksim Panchenko
2646dccaa3 [BOLT] Add support for Linux kernel static calls table (#82072)
Static calls are calls that are getting patched during runtime. Hence,
for every such call the kernel runtime needs the location of the call or
jmp instruction that will be patched. Instruction locations together
with a corresponding key are stored in the static call site table. As
BOLT rewrites these instructions it needs to update the table.
2024-02-18 17:20:25 -08:00
Alexander Yermolovich
f81f7a5766 [BOLT][DWARF] Remove redundant code (#82118)
Removed some redundant code. Should be NFC change.
2024-02-17 12:37:07 -08:00
Maksim Panchenko
5a82daafc1 [BOLT][NFC] Remove redundant assertion (#82056)
processLKSections() used to be a member of RewriteInstance. Since now it
is part of the LinuxKernelRewriter, the assertion is no longer needed.
2024-02-16 15:37:54 -08:00
Maksim Panchenko
5a29887145 [BOLT] Add writing support for Linux kernel ORC (#80950)
Update ORC information based on the new code layout and emit
corresponding ORC sections for the Linux kernel.

We rewrite ORC sections in place, which puts a limit on the size of new
section contents. Since ORC info changes for the new code layout and the
number of ORC entries can become larger, we free up space in the tables
by removing redundant ORC terminators. As a result, we effectively emit
fewer entries and have to add duplicate terminators at the end to match
the original section sizes. Ideally, we need to update ORC boundaries to
reflect the reduced size and optimize runtime lookup, but we will need
relocations for this, and the benefits will be marginal, if any.
2024-02-16 14:25:59 -08:00
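The in-place size constraint described in this commit can be sketched as follows. This is not BOLT's implementation. The `OrcEntry` struct, the function name, and in particular the rule that a terminator directly following another terminator counts as redundant are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified ORC entry: an instruction pointer plus a flag
// marking the entry as a terminator.
struct OrcEntry {
  uint64_t IP;
  bool Terminator;
};

// Drops terminators that directly follow another terminator (assumed
// definition of "redundant"), then pads with duplicate terminators so the
// table keeps its original entry count. Inputs that still exceed
// OriginalCount after pruning are not handled in this sketch.
inline std::vector<OrcEntry> fitToOriginalCount(const std::vector<OrcEntry> &In,
                                                std::size_t OriginalCount) {
  std::vector<OrcEntry> Out;
  for (const OrcEntry &E : In) {
    if (E.Terminator && !Out.empty() && Out.back().Terminator)
      continue; // redundant terminator, frees one slot
    Out.push_back(E);
  }
  const OrcEntry Pad =
      Out.empty() ? OrcEntry{0, true} : OrcEntry{Out.back().IP, true};
  while (Out.size() < OriginalCount)
    Out.push_back(Pad);
  return Out;
}
```

The padding entries share the IP of the last real entry, so several terminators can end up matching at the same IP.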
Alexander Yermolovich
5ff8b30327 [BOLT][DWARF] Do not emit zero low_pc address arange (#81955)
According to the DWARF spec, zero entries indicate the end of an arange.
Changed so that BOLT does not emit a zero low_pc arange.
2024-02-16 11:23:28 -08:00
Amir Ayupov
340b1ab9dc [BOLT] Add missing include
Address the comment in
https://github.com/llvm/llvm-project/pull/76906#issuecomment-1947335336
2024-02-15 15:01:33 -08:00
Amir Ayupov
d2c9a19dd8 [BOLT][NFC] Pass BF/BB hashes to BAT
Test Plan: NFC

Reviewers: dcci, rafaelauler, maksfb, ayermolo

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/76906
2024-02-15 12:49:43 -08:00
Alexander Yermolovich
82ca752393 [BOLT][DWARF] Add test for DW_AT_ranges input without function output (#81794)
Added a test that relies on -fbasic-block-sections=all and --gc-sections
that exercises a code path that previously printed a warning.
2024-02-14 15:43:39 -08:00
Alexander Yermolovich
c9e8e91aca [BOLT][DWARF] Fix out of order rangelists/loclists (#81645)
GCC can generate rangelists/loclists that are out of order. Fixed so
that we don't assert, and instead generate a partially optimized list.
Through most code paths we do sort rnglists/loclists, but not for
loclists on the path where BOLT does not modify a function. Although it's
nice to have lists sorted, this implementation shouldn't rely on it.
This also fixes an issue where, if we partially captured a list, a
helper function would write out *end_of_list, so tools would not see the
rest of the addresses being written out.
2024-02-14 11:23:57 -08:00
Amir Ayupov
52cf07116b [BOLT][NFC] Log through JournalingStreams (#81524)
Make core BOLT functionality more friendly to being used as a
library instead of in our standalone driver llvm-bolt. To
accomplish this, we augment BinaryContext with journaling streams
that are to be used by most BOLT code whenever something needs to
be logged to the screen. Users of the library can decide if logs
should be printed to a file, no file or to the screen, as
before. To illustrate this, this patch adds a new option
`--log-file` that allows the user to redirect BOLT logging to a
file on disk or completely hide it by using
`--log-file=/dev/null`. Future BOLT code should now use
`BinaryContext::outs()` for printing important messages instead of
`llvm::outs()`. A new test log.test enforces this by verifying that
no strings are printed to the screen once the `--log-file` option is
used.

In previous patches we also added a new BOLTError class to report
common and fatal errors, so code shouldn't call exit(1) now. To
easily handle problems as before (by quitting with exit(1)),
callers can now use
`BinaryContext::logBOLTErrorsAndQuitOnFatal(Error)` whenever code
needs to deal with BOLT errors. To test this, we have fatal.s
that checks we are correctly quitting and printing a fatal error
to the screen.

Because this is a significant change by itself, not all code has
been ported yet. Code from Profiler libs (DataAggregator and friends)
still prints errors directly to the screen.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:53:53 -08:00
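The idea of routing log output through a stream owned by the context can be sketched with the standard library alone. The `Context` class, `setLogStream()`, `runPass()`, and `captureLog()` are hypothetical names. Only the notion of a context-level `outs()` accessor comes from the commit message, and the real `BinaryContext` interface is not reproduced here.

```cpp
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical context that owns the decision of where logs go.
class Context {
  std::ostream *Out = &std::cout; // default: print to the screen

public:
  void setLogStream(std::ostream &OS) { Out = &OS; }
  std::ostream &outs() { return *Out; }
};

// Library code logs through the context instead of a global stream.
inline void runPass(Context &Ctx) {
  Ctx.outs() << "INFO: pass finished\n";
}

// A library user redirects logging to an in-memory buffer.
inline std::string captureLog() {
  Context Ctx;
  std::ostringstream Buffer;
  Ctx.setLogStream(Buffer);
  runPass(Ctx);
  return Buffer.str();
}
```

Because `runPass()` never names a global stream, the same code can log to the screen, to a file, or to nothing, depending on what the caller installs.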
Amir Ayupov
13d60ce2f2 [BOLT][NFC] Propagate BOLTErrors from Core, RewriteInstance, and passes (2/2) (#81523)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we continue the migration
of libCore, libRewrite and libPasses to use the new BOLTError
class whenever a failure occurs.

Test Plan: NFC

Co-authored-by: Rafael Auler <rafaelauler@fb.com>
2024-02-12 14:51:15 -08:00
Amir Ayupov
fa7dd4919a [BOLT][NFC] Add BOLTError and return it from passes (1/2) (#81522)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we add a new class
BOLTError and auxiliary functions `createFatalBOLTError()` and
`createNonFatalBOLTError()` that allow BOLT code to bubble up the
problem to the caller by using the Error class as a return
type (or Expected). Also changes passes to use these.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:39:59 -08:00
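The shift from calling exit(1) inside a pass to returning an error to the caller can be sketched as below. This deliberately uses `std::optional` rather than LLVM's `Error` class, and every name in it (`PassError`, `PassResult`, `createFatalError`, `checkFunctionSize`, `runPipeline`) is hypothetical. It shows the control flow only, not the BOLTError API.

```cpp
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical error value carrying a severity and a message.
struct PassError {
  bool Fatal;
  std::string Message;
};

// An empty optional means success.
using PassResult = std::optional<PassError>;

inline PassError createFatalError(std::string Msg) {
  return PassError{true, std::move(Msg)};
}

inline PassError createNonFatalError(std::string Msg) {
  return PassError{false, std::move(Msg)};
}

// A pass reports the problem instead of terminating the process.
inline PassResult checkFunctionSize(unsigned Size, unsigned MaxSize) {
  if (Size > MaxSize)
    return createFatalError("function too large");
  return std::nullopt;
}

// The caller decides what to do; here a fatal error becomes exit code 1.
inline int runPipeline(const std::vector<unsigned> &Sizes, unsigned MaxSize) {
  for (unsigned S : Sizes)
    if (PassResult E = checkFunctionSize(S, MaxSize))
      if (E->Fatal)
        return 1;
  return 0;
}
```

The design point is that only the outermost caller turns an error into process termination, which keeps the pass code usable from a library.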
Amir Ayupov
a5f3d1a803 [BOLT][NFC] Return Error from BinaryFunctionPass::runOnFunctions (#81521)
As part of the effort to refactor old error handling code that
would directly call exit(1), in this patch we change the
interface to `BinaryFunctionPass` to return an Error on
`runOnFunctions()`. This gives passes the ability to report a
serious problem to the caller (RewriteInstance class), so the
caller may decide how to best handle the exceptional situation.

Co-authored-by: Rafael Auler <rafaelauler@fb.com>

Test Plan: NFC
2024-02-12 14:36:12 -08:00
Maksim Panchenko
7fe97f0420 [BOLT] Always run CheckLargeFunctions in non-relocation mode (#80922)
We run CheckLargeFunctions pass in non-relocation mode to prevent the
emission of functions that later could not be written to the output due
to their large size. The main reason behind the pass is to prevent the
emission of metadata for such functions since this metadata becomes
incorrect if the function is left unmodified.

Currently, the pass is enabled in non-relocation mode only when debug
info output is also enabled. As we emit increasingly more kinds of
metadata, e.g. for the Linux kernel, it becomes more challenging to
track metadata that needs to be fixed. Hence, I'm enabling the pass to
always run in non-relocation mode.
2024-02-08 14:21:49 -08:00
Job Noorman
e7c0e59bbc [BOLT] Fix crash for relocs in data sections against ABS symbols (#76026)
Fixes #75771
2024-02-07 07:53:02 +00:00
Maksim Panchenko
8ea7f1d20a [BOLT][NFCI] Keep instruction annotations (#80382)
We used to delete most instruction annotations before code emission. It
was done to release memory taken by annotations and to reduce overall
memory consumption. However, since the implementation of annotations has
moved to using existing instruction operands, the memory overhead
associated with them has reduced drastically. I measured that savings
are less than 0.5% on large binaries and processing time is just
slightly reduced if we keep them. Additionally, I plan to use
annotations in pre-emission passes for the Linux kernel rewriter.
2024-02-06 19:59:53 -08:00
Jon Roelofs
b98db441f0 [BOLT] Make ifunc test not statically-resolvable. NFC
This fixes a breakage caused by e976385415
2024-02-06 15:15:11 -08:00
Maksim Panchenko
8075f0db16 [BOLT] Use new contents when emitting sections with relocations (#80782)
We can use BinarySection::updateContents() to change section contents.
However, if we also add relocations for new contents, then the original
data (i.e. not updated) is going to be used. Fix that. A follow-up diff
will use the update interface and will include a test case.
2024-02-06 14:38:21 -08:00
Maksim Panchenko
082fe9a5dd [BOLT] Remove duplicate expression (#80380)
Reported by the cppcheck static analyzer in #80111.

Fixes #80111.
2024-02-01 19:05:11 -08:00
Maksim Panchenko
a693ae5306 [BOLT] Enable re-writing of Linux kernel binary (#80228)
Write modified Linux kernel binary to disk. The output is not supposed
to be functional at the moment, but it will allow for future patches to
test the output binary.
2024-02-01 12:11:26 -08:00
Maksim Panchenko
116e801a15 [BOLT] Adjust section sizes based on file offsets (#80226)
When we adjust section sizes while rewriting a binary, we should be
using section offsets and not addresses to determine if sections overlap.
NFC for existing binaries.
2024-02-01 12:08:41 -08:00
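A minimal sketch of an offset-based overlap check follows. The `Section` struct and `adjustedSize()` are hypothetical, and the sketch assumes that `Next` follows `Cur` in the file. It is not the code from this commit.

```cpp
#include <cstdint>

// Hypothetical section description: virtual address, file offset, and size.
struct Section {
  uint64_t Address;
  uint64_t Offset;
  uint64_t Size;
};

// Returns the size Cur may keep without running into Next in the file.
// The comparison uses file offsets; the addresses are ignored on purpose.
// Assumes Next.Offset >= Cur.Offset.
inline uint64_t adjustedSize(const Section &Cur, const Section &Next) {
  if (Cur.Offset + Cur.Size > Next.Offset)
    return Next.Offset - Cur.Offset;
  return Cur.Size;
}
```

Two sections can be far apart in the address space and still collide in the file, which is the case an address-based check would miss.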
Amir Ayupov
bed3608c22 [BOLT][NFC] Factor out RI::disassemblePLTInstruction (#80302)
2024-02-01 08:26:21 -08:00
Amir Ayupov
3c64b24ed3 [BOLT] Add extra staleness logging (#80225)
Report two extra metrics:
- # of stale functions with matching block count,
- # of stale blocks with matching instruction count.
2024-02-01 07:16:40 -08:00
Maksim Panchenko
2abcbbd96a [BOLT] Detect Linux kernel based on ELF program headers (#80086)
Check if program header addresses fall into the kernel space to detect a
Linux kernel binary on x86-64.

Delete opts::LinuxKernelMode and use BinaryContext::IsLinuxKernel
instead.
2024-01-30 18:04:29 -08:00
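The detection idea can be sketched as a range check over program header addresses. The `ProgramHeader` struct, the function name, and the threshold constant are assumptions for illustration. The constant reflects the conventional start of the x86-64 kernel text mapping, and the actual check in BOLT may use different bounds.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, reduced view of an ELF program header.
struct ProgramHeader {
  uint64_t VAddr;
};

// Assumed lower bound of the x86-64 kernel text mapping.
constexpr uint64_t KernelSpaceStart = 0xffffffff80000000ULL;

// Treats the binary as a Linux kernel if any segment is mapped at or above
// the assumed kernel-space boundary.
inline bool looksLikeLinuxKernel(const std::vector<ProgramHeader> &Headers) {
  for (const ProgramHeader &PH : Headers)
    if (PH.VAddr >= KernelSpaceStart)
      return true;
  return false;
}
```

A check like this removes the need for a user-supplied mode flag, because the decision comes from the binary itself.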
Maksim Panchenko
0fc791cd2c [BOLT] Fix comparison function for Linux ORC entries (#79921)
Fix ORC entry comparison function to cover a case with multiple
terminator entries matching at the same IP.
2024-01-29 17:45:40 -08:00
Maksim Panchenko
aa1968c2eb [BOLT] Add metadata pre-emit finalization interface (#79925)
Some metadata needs to be updated/finalized before the binary context is
emitted into the binary. Add the interface and use it for Linux ORC
update invocation.
2024-01-29 17:27:33 -08:00
Kazu Hirata
03cba44029 [BOLT] Use SmallString::operator std::string (NFC)
2024-01-27 09:32:21 -08:00
spupyrev
9058503d26 [BOLT] Deprecate hfsort+ in favor of cdsort (#72408)
A new function sorting algorithm (cdsort) in LLVM is an optimized
version of BOLT's hfsort+. In order to avoid code duplication and
simplify maintenance, we are getting rid of hfsort+.

Perf-wise this is likely a neutral change, though differences on
individual benchmarks are possible, since the generated function layout
has changed. I tested cdsort vs hfsort+ on a number of open-source and
prod binaries built in different modes and recorded an average neutral
perf difference, perhaps with more "green" counters.
2024-01-26 06:51:55 -08:00
Amir Ayupov
df7d2b2f90 [BOLT] Deduplicate equal offsets in BAT (#76905)
Encode BRANCHENTRY bits as bitmask for deduplicated entries.

Reduces BAT section size:
- large binary: to 11834216 bytes (0.31x original),
- medium binary: to 1565584 bytes (0.26x original),
- small binary: to 336 bytes (0.23x original).

Test Plan: Updated bolt/test/X86/bolt-address-translation.test
2024-01-25 15:37:47 -08:00
Alexander Yermolovich
7d272722fb [BOLT][DWARF] Add option to specify DW_AT_comp_dir (#79395)
Added a --comp-dir-override option that overrides DW_AT_comp_dir in the
unit DIE. This allows llvm-bolt to be invoked from any directory and
still find .dwo files.
2024-01-25 15:00:52 -08:00
Amir Ayupov
e9309b27d7 [BOLT] Report input staleness (#79496)
It's beneficial to have uniform reporting in both `infer-stale-profile`
on and off cases, primarily for logging purposes.

Without this change, BOLT would report "input" staleness in
`infer-stale-profile=0` case (without matching), and "output" staleness
in `infer-stale-profile=1` case (after matching).

This change makes BOLT report "input" staleness in both cases. "Output"
staleness information is printed separately with "BOLT-INFO: inferred
profile..."
2024-01-25 14:15:13 -08:00
Alexander Yermolovich
bb6a485055 [BOLT] Fix updating DW_AT_stmt_list for DWARF5 TUs (#79374)
Changed so that we also update DW_AT_stmt_list for DWARF5 TUs. BOLT was
doing it for DWARF4, but it wasn't doing it for DWARF5.
2024-01-24 15:34:29 -08:00
Amir Ayupov
6735ce9d25 [BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)
Fix the bug where merge-fdata unconditionally outputs boltedcollection
line, regardless of whether input files have it set.

Test Plan:
Added bolt/test/X86/merge-fdata-nobat-mode.test which fails without this
fix.
2024-01-18 20:00:47 -08:00
Amir Ayupov
9fec33aadc Revert "[BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)"
This reverts commit 82bc33ea3f.

Accidentally pushed unrelated changes.
2024-01-18 19:59:09 -08:00
Amir Ayupov
82bc33ea3f [BOLT] Fix unconditional output of boltedcollection in merge-fdata (#78653)
Fix the bug where merge-fdata unconditionally outputs boltedcollection 
line, regardless of whether input files have it set.

Test Plan:
Added bolt/test/X86/merge-fdata-nobat-mode.test which fails without this
fix.
2024-01-18 19:44:16 -08:00
Amir Ayupov
8f1d94aaea [BOLT] Use continuous output addresses in delta encoding in BAT
Make output function addresses delta-encoded with respect to the last
offset in the previous function. This reduces the deltas in function
start addresses.

Test Plan:
Reduces BAT section size to:
- large binary: 12218860 bytes (0.32x original),
- medium binary: 1606580 bytes (0.27x original),
- small binary: 404 bytes (0.28x original).

Reviewers: rafaelauler

Reviewed By: rafaelauler

Pull Request: https://github.com/llvm/llvm-project/pull/76904
2024-01-18 13:49:44 -08:00
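The effect of choosing the previous function's last offset as the delta base can be sketched as follows. The `FuncRange` struct and `encodeStarts()` are hypothetical, the input is assumed to be sorted by address and non-overlapping, and the sketch ignores the variable-length integer encoding that would make small deltas cheaper in the actual section.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record: a function's output start address and the address
// of the last offset recorded inside it.
struct FuncRange {
  uint64_t Start;
  uint64_t LastOffset;
};

// Encodes each start address as a delta from the last offset of the
// previous function (0 for the first one). Assumes sorted, non-overlapping
// input.
inline std::vector<uint64_t> encodeStarts(const std::vector<FuncRange> &Funcs) {
  std::vector<uint64_t> Deltas;
  uint64_t Prev = 0;
  for (const FuncRange &F : Funcs) {
    Deltas.push_back(F.Start - Prev);
    Prev = F.LastOffset;
  }
  return Deltas;
}
```

For two functions at 0x1000 and 0x1050, where the first one's last offset is 0x1040, the second delta is 0x10. A delta taken from the previous start address would be 0x50.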
Kazu Hirata
6da4a7a8e2 [BOLT] Use SmallString::operator std::string (NFC)
2024-01-15 21:59:06 -08:00