
intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-02-06 06:31:50 +08:00

Author	SHA1	Message	Date
Fangrui Song	dbf37e956a	[ELF] Move InputFile storage from make<> to LinkerDriver::files	2024-11-16 23:50:35 -08:00
Fangrui Song	2991a4e209	[ELF] Replace functions bAlloc/saver/uniqueSaver with member access	2024-11-16 22:34:13 -08:00
Fangrui Song	483516fd83	[ELF] Remove unneeded Twine()	2024-11-16 20:32:44 -08:00
Fangrui Song	c1a6defd9f	[ELF] Make RelType a struct type. Otherwise, operator<<(const ELFSyncStream &s, RelType type) applies to non-reloc-type uint32_t, which can be confusing.	2024-11-16 20:26:34 -08:00
Fangrui Song	33ff9e43b4	[ELF] Move SharedFile::vernauxNum to Ctx	2024-11-16 17:00:51 -08:00
Fangrui Song	3b75a5c4c8	[ELF] Replace message(...) with Msg(ctx)	2024-11-16 15:34:42 -08:00
Fangrui Song	a626eb2a2f	[ELF] Pass ctx to bAlloc/saver/uniqueSaver	2024-11-16 15:20:21 -08:00
Fangrui Song	38870fe124	[ELF] Remove unneeded toString(Error) when using ELFSyncStream	2024-11-16 13:22:06 -08:00
Fangrui Song	58a971f42f	[ELF] Replace context-less toString(x) with toStr(ctx, x) so that we can remove the global `ctx` from toString implementations. Rename lld::toString (to lld::elf::toStr) to simplify name lookup (we have many llvm::toString and another lld::toString(const llvm::opt::Arg &)).	2024-11-16 11:58:10 -08:00
Fangrui Song	942928f3df	[ELF] Migrate away from global ctx	2024-11-14 23:04:18 -08:00
Fangrui Song	c13258ac49	[ELF] Replace log with Log(ctx)	2024-11-07 09:30:20 -08:00
Fangrui Song	9b058bb42d	[ELF] Replace errorOrWarn(...) with Err	2024-11-06 22:33:51 -08:00
Fangrui Song	f8bae3af74	[ELF] Replace warn(...) with Warn	2024-11-06 22:19:31 -08:00
Fangrui Song	09c2c5e1e9	[ELF] Replace error(...) with ErrAlways or Err. Most are migrated to ErrAlways mechanically. In the future we should change most to Err.	2024-11-06 22:04:52 -08:00
Fangrui Song	63c6fe4a0b	[ELF] Replace fatal(...) with Fatal or Err	2024-11-06 21:17:26 -08:00
Fangrui Song	201d7607f8	[ELF] Add context-aware diagnostic functions (#112319 ) The current diagnostic functions log/warn/error/fatal lack a context argument and call the global `lld::errorHandler()`, which prevents multiple lld instances in one process. This patch introduces context-aware replacements: * log => Log(ctx) * warn => Warn(ctx) * errorOrWarn => Err(ctx) * error => ErrAlways(ctx) * fatal => Fatal(ctx) Example: `errorOrWarn(toString(f) + "xxx")` => `Err(ctx) << f << "xxx"`. (`toString(f)` is shortened to `f` as a bonus and may access `ctx` without accessing the global variable (see `Target.cpp`)). `ctx.e = &context->e;` can be replaced with a non-global ErrorHandler when `ctx` becomes a local variable. (For the ELF port, the long-term goal is to eliminate `error`. Most can be straightforwardly converted to `Err(ctx)`.)	2024-11-06 08:25:58 -08:00
Fangrui Song	fe8af49a1b	[ELF] Pass Ctx & to Defined & CommonSymbol	2024-10-20 01:38:16 +00:00
Fangrui Song	dbd197118d	[ELF] Pass Ctx & to Symbol	2024-10-11 23:34:43 -07:00
Fangrui Song	dd326b1225	[ELF] Pass Ctx &	2024-10-11 21:10:05 -07:00
Fangrui Song	6dd773b650	[ELF] Pass Ctx &	2024-10-11 20:15:02 -07:00
Fangrui Song	1c28f31133	[ELF] Pass Ctx &	2024-10-11 18:35:02 -07:00
Fangrui Song	b672071ba5	[ELF] Pass Ctx & to InputFile	2024-10-06 18:09:52 -07:00
Fangrui Song	f2b0133858	[ELF] Move static nextGroupId isInGroup to LinkerDriver	2024-10-06 17:38:35 -07:00
Fangrui Song	f1dccda1b5	[ELF] Pass Ctx & to Symbols	2024-10-06 17:05:43 -07:00
Fangrui Song	49865107d4	[ELF] Pass Ctx & to InputFiles	2024-10-06 11:27:24 -07:00
Fangrui Song	b3e0bd3d28	[ELF] Pass Ctx & to Arch/	2024-10-06 00:31:51 -07:00
Fangrui Song	6d03a69034	[ELF] Pass Ctx & to Arch/	2024-10-06 00:14:12 -07:00
Fangrui Song	c4c34f0474	[ELF] Pass Ctx & to InputFiles	2024-10-03 23:06:18 -07:00
Fangrui Song	079b8327ec	[ELF] Pass Ctx & to InputFiles and SyntheticSections	2024-09-29 16:06:47 -07:00
Fangrui Song	df0864e761	[ELF] Move elf::symtab into Ctx. Remove the global variable `symtab` and add a member variable (`std::unique_ptr<SymbolTable>`) to `Ctx` instead. This is one step toward eliminating global states. Pull Request: https://github.com/llvm/llvm-project/pull/109612	2024-09-23 10:33:43 -07:00
Fangrui Song	eba30b3370	[ELF] Replace config-> with ctx.arg. in [IS]*.cpp	2024-09-21 12:47:47 -07:00
Fangrui Song	e88b7ff016	[ELF] Move InStruct into Ctx. NFC. Ctx was introduced in March 2022 as a more suitable place for such singletons. llvm/Support/thread.h includes <thread>, which transitively includes sstream in libc++ and uses ios_base::in, so we cannot use `#define in ctx.sec`. `symtab, config, ctx` are now the only variables using LLVM_LIBRARY_VISIBILITY.	2024-09-15 22:15:02 -07:00
Fangrui Song	1cd07526b4	[ELF] Rename unique_saver to uniqueSaver. NFC and remove an unneeded FIXME.	2024-09-15 16:20:58 -07:00
Mingming Liu	09b231cb38	Re-apply "[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolution in ELF" (#107792 ) Fix the use-after-free bug and re-apply https://github.com/llvm/llvm-project/pull/106193 * Without the fix, the string referenced by `objSym.Name` could be destroyed even if string saver keeps a copy of the referenced string. This caused use-after-free. * The fix ([latest commit](`9776ed44cf`)) updates `objSym.Name` to reference (via `StringRef`) the string saver's copy. Test: 1. For `lld/test/ELF/lto/asmundef.ll`, its test failure is reproducible with `-DLLVM_USE_SANITIZER=Address` and gone with the fix. 2. Run all tests by following https://github.com/google/sanitizers/wiki/SanitizerBotReproduceBuild#try-local-changes. * Without the fix, `ELF/lto/asmundef.ll` aborted the multi-stage test at `@@@BUILD_STEP stage2/asan_ubsan check@@@`, defined [here](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh#L30) * With the fix, the [multi-stage test](https://github.com/llvm/llvm-zorg/blob/main/zorg/buildbot/builders/sanitizers/buildbot_fast.sh) passes stage2 {asan, ubsan, msan}. This is also the test used by https://lab.llvm.org/buildbot/#/builders/169 Original commit message `StringMap<T>` creates a [copy of the string](`d4c519e7b2/llvm/include/llvm/ADT/StringMapEntry.h (L55-L58)`) for entry insertions and intentionally keep copies [since the implementation optimizes string memory usage](`d4c519e7b2/llvm/include/llvm/ADT/StringMap.h (L124)`). On the other hand, linker keeps copies of symbol names [1] in `lld::elf::parseFiles` [2] before invoking `compileBitcodeFiles` [3]. This change proposes to optimize away string copies inside [LTO::GlobalResolutions](`24e791b416/llvm/include/llvm/LTO/LTO.h (L409)`), which will make LTO indexing more memory efficient for ELF. There are similar opportunities for other (COFF, wasm, MachO) formats. The optimization takes place for lld (ELF) only. 
For the rest of use cases (gold plugin, `llvm-lto2`, etc), LTO owns a string saver to keep copies and use global resolution key for de-duplication. Together with @kazutakahirata's work to make `ComputeCrossModuleImport` more memory efficient, we see a ~20% peak memory usage reduction in a binary where peak memory usage needs to go down. Thanks to the optimization in `329ba523cc`, the max (as opposed to the sum) of `ComputeCrossModuleImport` or `GlobalResolution` shows up in peak memory usage. * Regarding correctness, the set of [resolved](`80c47ad3ae/llvm/lib/LTO/LTO.cpp (L739)`) [per-module symbols](`80c47ad3ae/llvm/include/llvm/LTO/LTO.h (L188-L191)`) is a subset of [llvm::lto::InputFile::Symbols](`80c47ad3ae/llvm/include/llvm/LTO/LTO.h (L120)`). And bitcode symbol parsing saves symbol name when iterating `obj->symbols` in `BitcodeFile::parse` already. This change updates `BitcodeFile::parseLazy` to keep copies of per-module undefined symbols. * Presumably the undefined symbols in a LTO unit (copied in this patch in linker unique saver) is a small set compared with the set of symbols in global-resolution (copied before this patch), making this a worthwhile trade-off. Benchmarking this change alone shows measurable memory savings across various benchmarks. [1] ELF `1cea5c2138/lld/ELF/InputFiles.cpp (L1748)` [2] `ef7b18a53c/lld/ELF/Driver.cpp (L2863)` [3] `ef7b18a53c/lld/ELF/Driver.cpp (L2995)`	2024-09-09 11:16:58 -07:00
Mingming Liu	1cc4c87198	Revert "[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolution in ELF" (#107788 ) Reverts llvm/llvm-project#106193 while investigating bot failures https://lab.llvm.org/buildbot/#/builders/169/builds/2989/steps/9/logs/stdio	2024-09-08 16:45:59 -07:00
Mingming Liu	9ade4e2646	[NFCI][LTO][lld] Optimize away symbol copies within LTO global resolution in ELF (#106193 ) `StringMap<T>` creates a [copy of the string](`d4c519e7b2/llvm/include/llvm/ADT/StringMapEntry.h (L55-L58)`) for entry insertions and intentionally keep copies [since the implementation optimizes string memory usage](`d4c519e7b2/llvm/include/llvm/ADT/StringMap.h (L124)`). On the other hand, linker keeps copies of symbol names [1] in `lld::elf::parseFiles` [2] before invoking `compileBitcodeFiles` [3]. This change proposes to optimize away string copies inside [LTO::GlobalResolutions](`24e791b416/llvm/include/llvm/LTO/LTO.h (L409)`), which will make LTO indexing more memory efficient for ELF. There are similar opportunities for other (COFF, wasm, MachO) formats. The optimization takes place for lld (ELF) only. For the rest of use cases (gold plugin, `llvm-lto2`, etc), LTO owns a string saver to keep copies and use global resolution key for de-duplication. Together with @kazutakahirata's work to make `ComputeCrossModuleImport` more memory efficient, we see a ~20% peak memory usage reduction in a binary where peak memory usage needs to go down. Thanks to the optimization in `329ba523cc`, the max (as opposed to the sum) of `ComputeCrossModuleImport` or `GlobalResolution` shows up in peak memory usage. * Regarding correctness, the set of [resolved](`80c47ad3ae/llvm/lib/LTO/LTO.cpp (L739)`) [per-module symbols](`80c47ad3ae/llvm/include/llvm/LTO/LTO.h (L188-L191)`) is a subset of [llvm::lto::InputFile::Symbols](`80c47ad3ae/llvm/include/llvm/LTO/LTO.h (L120)`). And bitcode symbol parsing saves symbol name when iterating `obj->symbols` in `BitcodeFile::parse` already. This change updates `BitcodeFile::parseLazy` to keep copies of per-module undefined symbols. 
* Presumably the undefined symbols in a LTO unit (copied in this patch in linker unique saver) is a small set compared with the set of symbols in global-resolution (copied before this patch), making this a worthwhile trade-off. Benchmarking this change alone shows measurable memory savings across various benchmarks. [1] ELF `1cea5c2138/lld/ELF/InputFiles.cpp (L1748)` [2] `ef7b18a53c/lld/ELF/Driver.cpp (L2863)` [3] `ef7b18a53c/lld/ELF/Driver.cpp (L2995)`	2024-09-08 14:52:03 -07:00
Mingming Liu	64498c5483	[LTO][ELF][lld] Use unique string saver in ELF bitcode symbol parsing (#106670 ) lld ELF [BitcodeFile](`a527248a3c/lld/ELF/InputFiles.h (L328)`) uses [string saver](`a527248a3c/lld/include/lld/Common/CommonLinkerContext.h (L57)`) to keep copies of bitcode symbols. Symbol duplication is very common when compiling application binaries. This change proposes to introduce a UniqueStringSaver in lld context and use it for bitcode symbol parsing. The implementation covers ELF only. Similar opportunities should exist on other (COFF, MachO, wasm) formats. For an internal production binary where lto indexing takes ~10GiB originally, this change optimizes away ~800MiB (~7.8%), measured by https://github.com/google/pprof. Flame graph breaks down memory by usage call stacks and agrees with this measurement.	2024-09-05 14:49:03 -07:00
Oliver Stannard	a1c6467bd9	[lld][ARM] Fix assertion when mixing ARM and Thumb objects (#101985 ) Previously, we selected the Thumb2 PLT sequences if any input object is marked as not supporting the ARM ISA, which then causes assertion failures when calls from ARM code in other objects are seen. I think the intention here was to only use Thumb PLTs when the target does not have the ARM ISA available, signalled by no objects being marked as having it available. To do that we need to track which ISAs we have seen as we parse the build attributes, and defer the decision about PLTs until all input objects have been parsed. This bug was triggered by real code in picolibc, which has some versions of string.h functions built with Thumb2-only build attributes, so that they are compatible with v7-A, v7-R and v7-M. Fixes #99008.	2024-08-07 10:20:26 +01:00
Fangrui Song	0af07c0787	[ELF] Support relocatable files using CREL with explicit addends ... using the temporary section type code 0x40000020 (`clang -c -Wa,--crel,--allow-experimental-crel`). LLVM will change the code and break compatibility (Clang and lld of different versions are not guaranteed to cooperate, unlike other features). CREL with implicit addends is not supported. --- Introduce `RelsOrRelas::crels` to iterate over SHT_CREL sections and update users to check `crels`. (The decoding performance is critical and error checking is difficult. Follow `skipLeb` and `R_LEB128` handling, do not use `llvm::decodeULEB128`, which compiles to a lot of code.) A few users (e.g. .eh_frame, LLDDwarfObj, s390x) require random access. Pass `/*supportsCrel=*/false` to `relsOrRelas` to allocate a buffer and convert CREL to RELA (`relas` instead of `crels` will be used). Since allocating a buffer increases memory usage, the conversion is only performed when absolutely necessary. --- Non-alloc SHT_CREL sections may be created in -r and --emit-relocs links. SHT_CREL and SHT_RELA components need reencoding since r_offset/r_symidx/r_type/r_addend may change. (r_type may change because relocations referencing a symbol in a discarded section are converted to `R_*_NONE`). * SHT_CREL components: decode with `RelsOrRelas` and re-encode (`OutputSection::finalizeNonAllocCrel`) * SHT_RELA components: convert to CREL (`relToCrel`). An output section can only have one relocation section. * SHT_REL components: print an error for now. SHT_REL to SHT_CREL conversion for -r/--emit-relocs is complex and unsupported yet. Link: https://discourse.llvm.org/t/rfc-crel-a-compact-relocation-format-for-elf/77600 Pull Request: https://github.com/llvm/llvm-project/pull/98115	2024-08-01 10:22:03 -07:00
Fangrui Song	fd791f0fe5	[ELF] Move TarWriter into Ctx. NFC. Similar to `e980f16d52`.	2024-07-28 15:32:22 -07:00
Daniil Kovalev	65f9601fb1	[NFC][lld][ELF] Remove unused `sec` param of `ObjFile<ELFT>::getRelocTarget` (#96500 )	2024-06-25 13:49:51 +03:00
Nikita Popov	49ae2dcf36	[PassManager] Remove some unnecessary includes (NFC) (#96175 ) SmallPtrSet.h and TimeProfiler.h are unused. CommandLine.h is only needed for the UseNewDbgInfoFormat declare, which can be moved to the places that need it.	2024-06-20 17:41:35 +02:00
Paul Kirth	608fb463d2	[lld] Discard SHT_LLVM_LTO sections in relocatable links (#92825 ) So long as ld -r links using bitcode always result in an ELF object, and not a merged bitcode object, the output from a relocatable link using FatLTO objects should not have a .llvm.lto section. Prior to this, using the object code sections would cause the bitcode section in the output of a relocatable link to be corrupted, by concatenating all the .llvm.lto sections together. This patch discards SHT_LLVM_LTO sections when not using --fat-lto-objects, so that the relocatable ELF output won't contain invalid bitcode.	2024-06-07 17:56:35 -07:00
Fangrui Song	4d9020ca0b	[ELF] Implement --force-group-allocation. GNU ld's relocatable linking behaviors: * Sections with the `SHF_GROUP` flag are handled like sections matched by the `--unique=pattern` option. They are processed like orphan sections and ignored by input section descriptions. * Section groups' (usually named `.group`) content is updated as the section indexes are updated. Section groups can be discarded with `/DISCARD/ : { *(.group) }`. `-r --force-group-allocation` discards section groups and allows sections with the `SHF_GROUP` flag to be matched like normal sections. If two section group members are placed into the same output section, their relocation sections (if present) are combined as well. This behavior can be useful when -r output is used as a pseudo shared object (e.g., FreeBSD's amd64 kernel modules, CHERIoT compartments). This patch implements --force-group-allocation: * Input SHT_GROUP sections are discarded. * Input sections do not get the SHF_GROUP flag, so `addInputSec` will combine relocation sections if their relocated section group members are combined. The default behavior is: * Input SHT_GROUP sections are retained. * Input SHF_GROUP sections can be matched (unlike GNU ld) * Input SHF_GROUP sections keep the SHF_GROUP flag, so `addInputSec` will create different OutputDesc copies. GNU ld provides the `FORCE_GROUP_ALLOCATION` command, which is not implemented. Pull Request: https://github.com/llvm/llvm-project/pull/94704	2024-06-07 14:19:06 -07:00
PiJoules	025394fa0d	Reapply "[lld] Support thumb PLTs" (#93631 ) (#93644 ) This reverts commit `7832769d32`. This was reverted prior due to a test failure on the windows builder. I think this was because we didn't specify the triple and assumed windows. The other tests use the full triple specifying linux, so we follow suit here. --- We are using PLTs for cortex-m33 which only supports thumb. More specifically, this is for a very restricted use case. There's no MMU so there's no sharing of virtual addresses between two processes, but this is fine. The MCU is used for running [chre nanoapps](https://android.googlesource.com/platform/system/chre/+/HEAD/doc/nanoapp_overview.md) for android. Each nanoapp is a shared library (but effectively acts as an executable containing a test suite) that is loaded and run on the MCU one binary at a time and there's only one process running at a time, so we ensure that the same text segment cannot be shared by two different running executables. GNU LD supports thumb PLTs but we want to migrate to a clang toolchain and use LLD, so thumb PLTs are needed.	2024-05-29 13:28:32 -07:00
Mehdi Amini	7832769d32	Revert "[lld] Support thumb PLTs" (#93631 ) Reverts llvm/llvm-project#86223 windows pre-merge is broken.	2024-05-28 19:46:23 -06:00
PiJoules	760c2aa55f	[lld] Support thumb PLTs (#86223 ) We are using PLTs for cortex-m33 which only supports thumb. More specifically, this is for a very restricted use case. There's no MMU so there's no sharing of virtual addresses between two processes, but this is fine. The MCU is used for running [chre nanoapps](https://android.googlesource.com/platform/system/chre/+/HEAD/doc/nanoapp_overview.md) for android. Each nanoapp is a shared library (but effectively acts as an executable containing a test suite) that is loaded and run on the MCU one binary at a time and there's only one process running at a time, so we ensure that the same text segment cannot be shared by two different running executables. GNU LD supports thumb PLTs but we want to migrate to a clang toolchain and use LLD, so thumb PLTs are needed.	2024-05-28 15:37:03 -07:00
Daniil Kovalev	cca9115b1c	[lld][AArch64][ELF][PAC] Support AUTH relocations and AUTH ELF marking (#72714 ) This patch adds lld support for: - Dynamic R_AARCH64_AUTH_* relocations (without including RELR compressed AUTH relocations) as described here: https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#auth-variant-dynamic-relocations - .note.AARCH64-PAUTH-ABI-tag section as defined here https://github.com/ARM-software/abi-aa/blob/main/pauthabielf64/pauthabielf64.rst#elf-marking Depends on #72713 and #85231 --------- Co-authored-by: Peter Collingbourne <peter@pcc.me.uk> Co-authored-by: Fangrui Song <i@maskray.me>	2024-04-04 12:38:09 +03:00
Fangrui Song	df54f627fa	[ELF] Enhance --no-allow-shlib-undefined for non-exported definitions. For a DSO with all DT_NEEDED entries accounted for, if it contains an undefined non-weak symbol that shares a name with a non-exported definition (hidden visibility or localized by a version script), and there is no DSO definition, we should report an error. #70769 implemented the error when we see `ref.so def-hidden.so`. This patch implements the error when we see `def-hidden.so ref.so`, matching GNU ld. Close #86777	2024-03-29 23:36:50 -07:00
Fangrui Song	2763353891	[Object,ELFType] Rename TargetEndianness to Endianness (#86604 ) `TargetEndianness` is long and unwieldy. "Target" in the name is confusing. Rename it to "Endianness". I cannot find noticeable out-of-tree users of `TargetEndianness`, but keep `TargetEndianness` to make this patch safer. `TargetEndianness` will be removed by a subsequent change.	2024-03-28 09:10:34 -07:00
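
Commit 201d7607f8 in the log above describes replacing global diagnostic calls with stream-style, context-aware ones such as `Err(ctx) << f << "xxx"`. The following is a minimal C++ sketch of that pattern, not lld's actual implementation: `Ctx`, `InputFile`, `SyncStream`, `DiagLevel` and the `Warn`/`Err` helpers here are simplified, hypothetical stand-ins. It only illustrates how a temporary stream object can buffer a message and deliver it to a per-context sink when it is destroyed, so two contexts in one process do not share error state.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>

// Hypothetical stand-in for lld's Ctx: each link owns its diagnostic state.
struct Ctx {
  std::ostringstream out;  // sink for this context's diagnostics
  unsigned errorCount = 0; // errors reported through this context only
};

// Hypothetical input-file type; streaming it replaces an explicit toString(f).
struct InputFile {
  std::string name;
};
inline std::ostream &operator<<(std::ostream &os, const InputFile &f) {
  return os << f.name;
}

enum class DiagLevel { Warn, Err };

// Buffers one message and flushes it to the owning context on destruction.
class SyncStream {
  Ctx &ctx;
  DiagLevel level;
  std::ostringstream buf;
  bool active = true; // false once moved from, so only one object flushes

public:
  SyncStream(Ctx &c, DiagLevel l) : ctx(c), level(l) {}
  SyncStream(SyncStream &&o)
      : ctx(o.ctx), level(o.level), buf(std::move(o.buf)), active(o.active) {
    o.active = false;
  }
  ~SyncStream() {
    if (!active)
      return;
    ctx.out << (level == DiagLevel::Err ? "error: " : "warning: ")
            << buf.str() << '\n';
    if (level == DiagLevel::Err)
      ++ctx.errorCount;
  }
  template <typename T> SyncStream &operator<<(const T &v) {
    buf << v;
    return *this;
  }
};

inline SyncStream Warn(Ctx &ctx) { return SyncStream(ctx, DiagLevel::Warn); }
inline SyncStream Err(Ctx &ctx) { return SyncStream(ctx, DiagLevel::Err); }

// Usage: two independent contexts in one process keep separate error state.
inline void runDiagExample() {
  Ctx first, second;
  InputFile f{"a.o"};
  Err(first) << f << ": xxx"; // flushed at the end of this statement
  Warn(second) << "careful";
  assert(first.errorCount == 1);
  assert(second.errorCount == 0);
}
```

Because the message is flushed in the destructor of the temporary, the call site needs no explicit "emit" step, and passing `ctx` explicitly is what removes the dependency on a process-wide error handler.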


859 Commits
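
Commits 64498c5483 and 09b231cb38 in the log above rely on a unique string saver: symbol names are copied once into linker-owned storage, duplicates share that copy, and references must point at the saved copy rather than at a string that may be destroyed. The sketch below illustrates the idea in standalone C++17. `UniqueSaver` is a hypothetical miniature, not LLVM's `UniqueStringSaver`, and it uses `std::unordered_set` only because element addresses in that container stay valid when it grows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical miniature of a unique string saver.
class UniqueSaver {
  // Node-based container: references to stored strings survive rehashing.
  std::unordered_set<std::string> pool;

public:
  // Returns a view of the pool's copy; equal strings share one copy.
  std::string_view save(std::string_view s) {
    return *pool.emplace(s).first;
  }
  std::size_t size() const { return pool.size(); }
};

// Usage: the returned views outlive the string they were saved from.
inline void runSaverExample() {
  UniqueSaver saver;
  std::string_view a, b;
  {
    std::string tmp = "memcpy";
    a = saver.save(tmp);      // copy made inside the pool
    b = saver.save("memcpy"); // duplicate: reuses the same copy
  } // tmp is destroyed here; a and b still refer to the pool's copy
  assert(a == "memcpy");
  assert(a.data() == b.data());
  assert(saver.size() == 1);
}
```

Holding a view of `tmp` itself instead of the value returned by `save` would dangle once `tmp` goes out of scope, which is the kind of use-after-free the re-applied commit describes fixing.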