Commit Graph

859 Commits

Author SHA1 Message Date
Fangrui Song
dbf37e956a [ELF] Move InputFile storage from make<> to LinkerDriver::files 2024-11-16 23:50:35 -08:00
Fangrui Song
2991a4e209 [ELF] Replace functions bAlloc/saver/uniqueSaver with member access 2024-11-16 22:34:13 -08:00
Fangrui Song
483516fd83 [ELF] Remove unneeded Twine() 2024-11-16 20:32:44 -08:00
Fangrui Song
c1a6defd9f [ELF] Make RelType a struct type
otherwise operator<<(const ELFSyncStream &s, RelType type) applies to
non-reloc-type uint32_t, which can be confusing.
2024-11-16 20:26:34 -08:00
Fangrui Song
33ff9e43b4 [ELF] Move SharedFile::vernauxNum to Ctx 2024-11-16 17:00:51 -08:00
Fangrui Song
3b75a5c4c8 [ELF] Replace message(...) with Msg(ctx) 2024-11-16 15:34:42 -08:00
Fangrui Song
a626eb2a2f [ELF] Pass ctx to bAlloc/saver/uniqueSaver 2024-11-16 15:20:21 -08:00
Fangrui Song
38870fe124 [ELF] Remove unneeded toString(Error) when using ELFSyncStream 2024-11-16 13:22:06 -08:00
Fangrui Song
58a971f42f [ELF] Replace contex-less toString(x) with toStr(ctx, x)
so that we can remove the global `ctx` from toString implementations.
Rename lld::toString (to lld::elf::toStr) to simplify name lookup (we
have many llvm::toString and another lld::toString(const llvm::opt::Arg
&)).
2024-11-16 11:58:10 -08:00
Fangrui Song
942928f3df [ELF] Migrate away from global ctx 2024-11-14 23:04:18 -08:00
Fangrui Song
c13258ac49 [ELF] Replace log with Log(ctx) 2024-11-07 09:30:20 -08:00
Fangrui Song
9b058bb42d [ELF] Replace errorOrWarn(...) with Err 2024-11-06 22:33:51 -08:00
Fangrui Song
f8bae3af74 [ELF] Replace warn(...) with Warn 2024-11-06 22:19:31 -08:00
Fangrui Song
09c2c5e1e9 [ELF] Replace error(...) with ErrAlways or Err
Most are migrated to ErrAlways mechanically.
In the future we should change most to Err.
2024-11-06 22:04:52 -08:00
Fangrui Song
63c6fe4a0b [ELF] Replace fatal(...) with Fatal or Err 2024-11-06 21:17:26 -08:00
Fangrui Song
201d7607f8 [ELF] Add context-aware diagnostic functions (#112319)
The current diagnostic functions log/warn/error/fatal lack a context
argument and call the global `lld::errorHandler()`, which prevents
multiple lld instances in one process.

This patch introduces context-aware replacements:

* log => Log(ctx)
* warn => Warn(ctx)
* errorOrWarn => Err(ctx)
* error => ErrAlways(ctx)
* fatal => Fatal(ctx)

Example: `errorOrWarn(toString(f) + "xxx")` => `Err(ctx) << f << "xxx"`.
(`toString(f)` is shortened to `f` as a bonus and may access `ctx`
without accessing the global variable (see `Target.cpp`)).

`ctx.e = &context->e;` can be replaced with a non-global Errorhandler
when `ctx` becomes a local variable.

(For the ELF port, the long term goal is to eliminate `error`. Most can
be straightforwardly converted to `Err(ctx)`.)
2024-11-06 08:25:58 -08:00
Fangrui Song
fe8af49a1b [ELF] Pass Ctx & to Defined & CommonSymbol 2024-10-20 01:38:16 +00:00
Fangrui Song
dbd197118d [ELF] Pass Ctx & to Symbol 2024-10-11 23:34:43 -07:00
Fangrui Song
dd326b1225 [ELF] Pass Ctx & 2024-10-11 21:10:05 -07:00
Fangrui Song
6dd773b650 [ELF] Pass Ctx & 2024-10-11 20:15:02 -07:00
Fangrui Song
1c28f31133 [ELF] Pass Ctx & 2024-10-11 18:35:02 -07:00
Fangrui Song
b672071ba5 [ELF] Pass Ctx & to InputFile 2024-10-06 18:09:52 -07:00
Fangrui Song
f2b0133858 [ELF] Move static nextGroupId isInGroup to LinkerDriver 2024-10-06 17:38:35 -07:00
Fangrui Song
f1dccda1b5 [ELF] Pass Ctx & to Symbols 2024-10-06 17:05:43 -07:00
Fangrui Song
49865107d4 [ELF] Pass Ctx & to InputFiles 2024-10-06 11:27:24 -07:00
Fangrui Song
b3e0bd3d28 [ELF] Pass Ctx & to Arch/ 2024-10-06 00:31:51 -07:00
Fangrui Song
6d03a69034 [ELF] Pass Ctx & to Arch/ 2024-10-06 00:14:12 -07:00
Fangrui Song
c4c34f0474 [ELF] Pass Ctx & to InputFiles 2024-10-03 23:06:18 -07:00
Fangrui Song
079b8327ec [ELF] Pass Ctx & to InputFiles and SyntheticSections 2024-09-29 16:06:47 -07:00
Fangrui Song
df0864e761 [ELF] Move elf::symtab into Ctx
Remove the global variable `symtab` and add a member variable
(`std::unique_ptr<SymbolTable>`) to `Ctx` instead.

This is one step toward eliminating global states.

Pull Request: https://github.com/llvm/llvm-project/pull/109612
2024-09-23 10:33:43 -07:00
Fangrui Song
eba30b3370 [ELF] Replace config-> with ctx.arg. in [IS]*.cpp 2024-09-21 12:47:47 -07:00
Fangrui Song
e88b7ff016 [ELF] Move InStruct into Ctx. NFC
Ctx was introduced in March 2022 as a more suitable place for such
singletons.

llvm/Support/thread.h includes <thread>, which transitively includes
sstream in libc++ and uses ios_base::in, so we cannot use `#define in ctx.sec`.

`symtab, config, ctx` are now the only variables using
LLVM_LIBRARY_VISIBILITY.
2024-09-15 22:15:02 -07:00
Fangrui Song
1cd07526b4 [ELF] Rename unique_saver to uniqueSaver. NFC
and remove an unneeded FIXME.
2024-09-15 16:20:58 -07:00
Mingming Liu
09b231cb38 Re-apply "[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolution in ELF" (#107792)
Fix the use-after-free bug and re-apply
https://github.com/llvm/llvm-project/pull/106193
* Without the fix, the string referenced by `objSym.Name` could be
destroyed even if string saver keeps a copy of the referenced string.
This caused use-after-free.
* The fix ([latest
commit](9776ed44cf))
updates `objSym.Name` to reference (via `StringRef`) the string saver's
copy.

Test:
1. For `lld/test/ELF/lto/asmundef.ll`, its test failure is reproducible
with `-DLLVM_USE_SANITIZER=Address` and gone with the fix.
3. Run all tests by following
https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild#try-local-changes.
* Without the fix, `ELF/lto/asmundef.ll` aborted the multi-stage test at
`@@@BUILD_STEP stage2/asan_ubsan check@@@`, defined
[here](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh#L30)
* With the fix, the [multi-stage
test](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh)
pass stage2 {asan, ubsan, masan}. This is also the test used by
https://lab.llvm.org/buildbot/#/builders/169


**Original commit message**

`StringMap<T>` creates a [copy of the
string](d4c519e7b2/llvm/include/llvm/ADT/StringMapEntry.h (L55-L58))
for entry insertions and intentionally keep copies [since the
implementation optimizes string memory
usage](d4c519e7b2/llvm/include/llvm/ADT/StringMap.h (L124)).
On the other hand, linker keeps copies of symbol names [1] in
`lld::elf::parseFiles` [2] before invoking `compileBitcodeFiles` [3].

This change proposes to optimize away string copies inside
[LTO::GlobalResolutions](24e791b416/llvm/include/llvm/LTO/LTO.h (L409)),
which will make LTO indexing more memory efficient for ELF. There are
similar opportunities for other (COFF, wasm, MachO) formats.

The optimization takes place for lld (ELF) only. For the rest of use
cases (gold plugin, `llvm-lto2`, etc), LTO owns a string saver to keep
copies and use global resolution key for de-duplication.

Together with @kazutakahirata's work to make `ComputeCrossModuleImport`
more memory efficient, we see a ~20% peak memory usage reduction in a
binary where peak memory usage needs to go down. Thanks to the
optimization in
329ba523cc,
the max (as opposed to the sum) of `ComputeCrossModuleImport` or
`GlobalResolution` shows up in peak memory usage.
* Regarding correctness, the set of
[resolved](80c47ad3ae/llvm/lib/LTO/LTO.cpp (L739))
[per-module
symbols](80c47ad3ae/llvm/include/llvm/LTO/LTO.h (L188-L191))
is a subset of
[llvm::lto::InputFile::Symbols](80c47ad3ae/llvm/include/llvm/LTO/LTO.h (L120)).
And bitcode symbol parsing saves symbol name when iterating
`obj->symbols` in `BitcodeFile::parse` already. This change updates
`BitcodeFile::parseLazy` to keep copies of per-module undefined symbols.
* Presumably the undefined symbols in a LTO unit (copied in this patch
in linker unique saver) is a small set compared with the set of symbols
in global-resolution (copied before this patch), making this a
worthwhile trade-off. Benchmarking this change alone shows measurable
memory savings across various benchmarks.

[1] ELF
1cea5c2138/lld/ELF/InputFiles.cpp (L1748)
[2]
ef7b18a53c/lld/ELF/Driver.cpp (L2863)
[3]
ef7b18a53c/lld/ELF/Driver.cpp (L2995)
2024-09-09 11:16:58 -07:00
Mingming Liu
1cc4c87198 Revert "[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolution in ELF" (#107788)
Reverts llvm/llvm-project#106193 while investigating bot failures
https://lab.llvm.org/buildbot/#/builders/169/builds/2989/steps/9/logs/stdio
2024-09-08 16:45:59 -07:00
Mingming Liu
9ade4e2646 [NFCI][LTO][lld] Optimize away symbol copies within LTO global resolution in ELF (#106193)
`StringMap<T>` creates a [copy of the
string](d4c519e7b2/llvm/include/llvm/ADT/StringMapEntry.h (L55-L58))
for entry insertions and intentionally keep copies [since the
implementation optimizes string memory
usage](d4c519e7b2/llvm/include/llvm/ADT/StringMap.h (L124)).
On the other hand, linker keeps copies of symbol names [1] in
`lld::elf::parseFiles` [2] before invoking `compileBitcodeFiles` [3].

This change proposes to optimize away string copies inside
[LTO::GlobalResolutions](24e791b416/llvm/include/llvm/LTO/LTO.h (L409)),
which will make LTO indexing more memory efficient for ELF. There are
similar opportunities for other (COFF, wasm, MachO) formats.

The optimization takes place for lld (ELF) only. For the rest of use
cases (gold plugin, `llvm-lto2`, etc), LTO owns a string saver to keep
copies and use global resolution key for de-duplication.

Together with @kazutakahirata's work to make `ComputeCrossModuleImport`
more memory efficient, we see a ~20% peak memory usage reduction in a
binary where peak memory usage needs to go down. Thanks to the
optimization in
329ba523cc,
the max (as opposed to the sum) of `ComputeCrossModuleImport` or
`GlobalResolution` shows up in peak memory usage.
* Regarding correctness, the set of
[resolved](80c47ad3ae/llvm/lib/LTO/LTO.cpp (L739))
[per-module
symbols](80c47ad3ae/llvm/include/llvm/LTO/LTO.h (L188-L191))
is a subset of
[llvm::lto::InputFile::Symbols](80c47ad3ae/llvm/include/llvm/LTO/LTO.h (L120)).
And bitcode symbol parsing saves symbol name when iterating
`obj->symbols` in `BitcodeFile::parse` already. This change updates
`BitcodeFile::parseLazy` to keep copies of per-module undefined symbols.
* Presumably the undefined symbols in a LTO unit (copied in this patch
in linker unique saver) is a small set compared with the set of symbols
in global-resolution (copied before this patch), making this a
worthwhile trade-off. Benchmarking this change alone shows measurable
memory savings across various benchmarks.

[1] ELF
1cea5c2138/lld/ELF/InputFiles.cpp (L1748)
[2]
ef7b18a53c/lld/ELF/Driver.cpp (L2863)
[3]
ef7b18a53c/lld/ELF/Driver.cpp (L2995)
2024-09-08 14:52:03 -07:00
Mingming Liu
64498c5483 [LTO][ELF][lld] Use unique string saver in ELF bitcode symbol parsing (#106670)
lld ELF
[BitcodeFile](a527248a3c/lld/ELF/InputFiles.h (L328))
uses [string
saver](a527248a3c/lld/include/lld/Common/CommonLinkerContext.h (L57))
to keep copies of bitcode symbols. Symbol duplication is very common
when compiling application binaries.

This change proposes to introduce a UniqueStringSaver in lld context and
use it for bitcode symbol parsing. The implementation covers ELF only.
Similar opportunities should exist on other (COFF, MachO, wasm) formats.

For an internal production binary where lto indexing takes ~10GiB
originally, this changes optimizes away ~800MiB (~7.8%), measured by
https://github.com/google/pprof. Flame graph breaks down memory by usage
call stacks and agrees with this measurement.
2024-09-05 14:49:03 -07:00
Oliver Stannard
a1c6467bd9 [lld][ARM] Fix assertion when mixing ARM and Thumb objects (#101985)
Previously, we selected the Thumb2 PLT sequences if any input object is
marked as not supporting the ARM ISA, which then causes assertion
failures when calls from ARM code in other objects are seen. I think the
intention here was to only use Thumb PLTs when the target does not have
the ARM ISA available, signalled by no objects being marked as having it
available. To do that we need to track which ISAs we have seen as we
parse the build attributes, and defer the decision about PLTs until all
input objects have been parsed.

This bug was triggered by real code in picolibc, which have some
versions of string.h functions built with Thumb2-only build attributes,
so that they are compatible with v7-A, v7-R and v7-M.

Fixes #99008.
2024-08-07 10:20:26 +01:00
Fangrui Song
0af07c0787 [ELF] Support relocatable files using CREL with explicit addends
... using the temporary section type code 0x40000020
(`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the
code and break compatibility (Clang and lld of different versions are
not guaranteed to cooperate, unlike other features). CREL with implicit
addends are not supported.

---

Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and
update users to check `crels`.

(The decoding performance is critical and error checking is difficult.
Follow `skipLeb` and `R_*LEB128` handling, do not use
`llvm::decodeULEB128`, whichs compiles to a lot of code.)

A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass
`/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and
convert CREL to RELA (`relas` instead of `crels` will be used). Since
allocating a buffer increases, the conversion is only performed when
absolutely necessary.

---

Non-alloc SHT_CREL sections may be created in -r and --emit-relocs
links. SHT_CREL and SHT_RELA components need reencoding since
r_offset/r_symidx/r_type/r_addend may change. (r_type may change because
relocations referencing a symbol in a discarded section are converted to
`R_*_NONE`).

* SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`)
* SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section.
* SHT_REL components: print an error for now.

SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and
unsupported yet.

Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600

Pull Request: https://github.com/llvm/llvm-project/pull/98115
2024-08-01 10:22:03 -07:00
Fangrui Song
fd791f0fe5 [ELF] Move TarWriter into Ctx. NFC
Similar to e980f16d52.
2024-07-28 15:32:22 -07:00
Daniil Kovalev
65f9601fb1 [NFC][lld][ELF] Remove unused sec param of ObjFile<ELFT>::getRelocTarget (#96500) 2024-06-25 13:49:51 +03:00
Nikita Popov
49ae2dcf36 [PassManager] Remove some unnecessary includes (NFC) (#96175)
SmallPtrSet.h and TimeProfiler.h are unused. CommandLine.h is only
needed for the UseNewDbgInfoFormat declare, which can be moved to the
places that need it.
2024-06-20 17:41:35 +02:00
Paul Kirth
608fb463d2 [lld] Discard SHT_LLVM_LTO sections in relocatable links (#92825)
So long as ld -r links using bitcode always result in an ELF object, and
not a merged bitcode object, the output form a relocatable link using
FatLTO objects should not have a .llvm.lto section. Prior to this, using
the object code sections would cause the bitcode section in the output
of a relocatable link to be corrupted, by concatenating all the
.llvm.lto
sections together.

This patch discards SHT_LLVM_LTO sections when not using
--fat-lto-objects, so that the relocatable ELF output won't contain
inalid bitcode.
2024-06-07 17:56:35 -07:00
Fangrui Song
4d9020ca0b [ELF] Implement --force-group-allocation
GNU ld's relocatable linking behaviors:

* Sections with the `SHF_GROUP` flag are handled like sections matched
  by the `--unique=pattern` option. They are processed like orphan
  sections and ignored by input section descriptions.
* Section groups' (usually named `.group`) content is updated as the
  section indexes are updated. Section groups can be discarded with
  `/DISCARD/ : { *(.group) }`.

`-r --force-group-allocation` discards section groups and allows
sections with the `SHF_GROUP` flag to be matched like normal sections.
If two section group members are placed into the same output section,
their relocation sections (if present) are combined as well.
This behavior can be useful when -r output is used as a pseudo shared
object (e.g., FreeBSD's amd64 kernel modules, CHERIoT compartments).

This patch implements --force-group-allocation:

* Input SHT_GROUP sections are discarded.
* Input sections do not get the SHF_GROUP flag, so `addInputSec`
  will combine relocation sections if their relocated section group
  members are combined.

The default behavior is:

* Input SHT_GROUP sections are retained.
* Input SHF_GROUP sections can be matched (unlike GNU ld)
* Input SHF_GROUP sections keep the SHF_GROUP flag, so `addInputSec`
  will create different OutputDesc copies.

GNU ld provides the `FORCE_GROUP_ALLOCATION` command, which is not
implemented.

Pull Request: https://github.com/llvm/llvm-project/pull/94704
2024-06-07 14:19:06 -07:00
PiJoules
025394fa0d Reapply "[lld] Support thumb PLTs" (#93631) (#93644)
This reverts commit 7832769d32.

This was reverted prior due to a test failure on the windows builder. I
think this was because we didn't specify the triple and assumed windows.
The other tests use the full triple specifying linux, so we follow suite
here.

---

We are using PLTs for cortex-m33 which only supports thumb. More
specifically, this is for a very restricted use case. There's no MMU so
there's no sharing of virtual addresses between two processes, but this
is fine. The MCU is used for running [chre
nanoapps](https://android.googlesource.com/platform/system/chre/+/HEAD/doc/nanoapp_overview.md)
for android. Each nanoapp is a shared library (but effectively acts as
an executable containing a test suite) that is loaded and run on the MCU
one binary at a time and there's only one process running at a time, so
we ensure that the same text segment cannot be shared by two different
running executables. GNU LD supports thumb PLTs but we want to migrate
to a clang toolchain and use LLD, so thumb PLTs are needed.
2024-05-29 13:28:32 -07:00
Mehdi Amini
7832769d32 Revert "[lld] Support thumb PLTs" (#93631)
Reverts llvm/llvm-project#86223

windows pre-merge is broken.
2024-05-28 19:46:23 -06:00
PiJoules
760c2aa55f [lld] Support thumb PLTs (#86223)
We are using PLTs for cortex-m33 which only supports thumb. More
specifically, this is for a very restricted use case. There's no MMU so
there's no sharing of virtual addresses between two processes, but this
is fine. The MCU is used for running [chre
nanoapps](https://android.googlesource.com/platform/system/chre/+/HEAD/doc/nanoapp_overview.md)
for android. Each nanoapp is a shared library (but effectively acts as
an executable containing a test suite) that is loaded and run on the MCU
one binary at a time and there's only one process running at a time, so
we ensure that the same text segment cannot be shared by two different
running executables. GNU LD supports thumb PLTs but we want to migrate
to a clang toolchain and use LLD, so thumb PLTs are needed.
2024-05-28 15:37:03 -07:00
Daniil Kovalev
cca9115b1c [lld][AArch64][ELF][PAC] Support AUTH relocations and AUTH ELF marking (#72714)
This patch adds lld support for:

- Dynamic R_AARCH64_AUTH_* relocations (without including RELR compressed AUTH
relocations) as described here:
https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#auth-variant-dynamic-relocations

- .note.AARCH64-PAUTH-ABI-tag section as defined here
https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#elf-marking

Depends on #72713 and #85231

---------

Co-authored-by: Peter Collingbourne <peter@pcc.me.uk>
Co-authored-by: Fangrui Song <i@maskray.me>
2024-04-04 12:38:09 +03:00
Fangrui Song
df54f627fa [ELF] Enhance --no-allow-shlib-undefined for non-exported definitions
For a DSO with all DT_NEEDED entries accounted for, if it contains an
undefined non-weak symbol that shares a name with a non-exported
definition (hidden visibility or localized by a version script), and
there is no DSO definition, we should report an error.

#70769 implemented the error when we see `ref.so def-hidden.so`. This patch
implementes the error when we see `def-hidden.so ref.so`, matching GNU
ld.

Close #86777
2024-03-29 23:36:50 -07:00
Fangrui Song
2763353891 [Object,ELFType] Rename TargetEndianness to Endianness (#86604)
`TargetEndianness` is long and unwieldy. "Target" in the name is confusing. Rename it to "Endianness".

I cannot find noticeable out-of-tree users of `TargetEndianness`, but
keep `TargetEndianness` to make this patch safer. `TargetEndianness`
will be removed by a subsequent change.
2024-03-28 09:10:34 -07:00