intel/llvm - llvm - Gitea: Git with a cup of tea

intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-22 07:01:03 +08:00

Author	SHA1	Message	Date
Daniel Bertalan	6afbda7130	[lld-macho] Fix duplicate GOT entries for personality functions (#95054 ) As stated in `UnwindInfoSectionImpl::prepareRelocations`'s comments, the unwind info uses section+addend relocations for personality functions defined in the same file as the function itself. As personality functions are always accessed via the GOT, we need to resolve those to a symbol. Previously, we did this by keeping a map which resolves these to symbols, creating a synthetic symbol if we didn't find it in the map. This approach has an issue: if we process the object file containing the personality function before any external uses, the entry in the map remains unpopulated, so we create a synthetic symbol and a corresponding GOT entry. If we encounter a relocation to it in a later file which requires GOT (such as in `__eh_frame`), we add that symbol to the GOT, too, effectively creating two entries which point to the same piece of code. This commit fixes that by searching the personality function's section for a symbol at that offset which already has a GOT entry, and only creating a synthetic symbol if there is none. As all non-unwind sections are already processed by this point, it ensures no duplication. This should only really affect our tests (and make them clearer), as personality functions are usually defined in platform runtime libraries. Or even if they are local, they are likely not in the first object file to be linked.	2024-06-11 21:51:28 +02:00
alx32	2a3a79ce4c	[lld-macho][NFC] Preserve original symbol isec, unwindEntry and size (#88357 ) Currently, when moving symbols from one `InputSection` to another (like in ICF) we directly update the symbol's `isec`, `unwindEntry` and `size`. By doing this we lose the original information. This information will be needed in a future change. Since when moving symbols we always set the symbol's `wasCoalesced` and `isec-> replacement`, we can just use this info to conditionally get the information we need at access time.	2024-04-18 11:42:22 -07:00
Fangrui Song	fb2a971c01	[Support] Change MapVector's default template parameter to SmallVector<, 0> SmallVector<, 0> is often a better replacement for std::vector : both the object size and the code size are smaller. (SmallMapVector uses SmallVector as well, but it is not common.) clang size decreases by 0.0226%. instructions:u decreases 0.037% when compiling a sqlite3 amalgram. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D156016	2023-07-24 22:04:03 -07:00
Vy Nguyen	e60b30d5e3	Reland "D144999 [MC][MachO]Only emits compact-unwind format for "canonical" personality symbols. For the rest, use DWARFs." Reasons for rolling forward: - the crash reported from Chromium was fixed in D151824 (not related to this patch at all) - since D152824 was committed, it should now be safe to roll this forward. New change: - add an additional _ in name check This reverts commit `4980eead4d`.	2023-06-07 10:03:50 -04:00
Vincent Lee	ed59b8a11c	[lld-macho] Remove partially supported 32-bit ARM arch We never really supported 32-bit ARM arch entirely, and partial support was added for very specific features. Regardless, it fails to even link the most basic applications that at this point, it might be better to move this arch as unsupported. Given that Apple will be moving towards arm64 long term, I don't see any reason for anyone to invest time in supporting this either, and for those who still need it should use apple's ld64 linker. Fixes #62691 Reviewed By: #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D150544	2023-05-20 13:06:03 -07:00
Nico Weber	4980eead4d	Revert "[RFC][MC][MachO]Only emits compact-unwind format for "canonical" personality symbols. For the rest, use DWARFs." This reverts commit `09aaf53a05`. Causes toolchain asserts building libc++ for x86_64, see https://reviews.llvm.org/D144999#4356215	2023-05-19 09:40:54 -04:00
Vy Nguyen	09aaf53a05	[RFC][MC][MachO]Only emits compact-unwind format for "canonical" personality symbols. For the rest, use DWARFs. Details: https://github.com/rust-lang/rust/issues/102754 The MachO format uses 2 bits to encode these personality funtions, with 0 reserved for "no-personality". This means we can only have up to 3 personality. There are already three popular personalities: __gxx_personality_v0, __gcc_personality_v0, and __objc_personality_v0. As a result, any system that needs custom-personality will run into a problem. This patch implemented jyknight's proposal to simply force DWARFs for all non-canonical personality functions. Differential Revision: https://reviews.llvm.org/D144999	2023-05-18 13:27:47 -04:00
Jez Ng	c4d9df9f78	[lld-macho][nfc] Clean up a bunch of clang-tidy issues	2023-04-05 07:50:28 -04:00
Jez Ng	f7bc79c1c7	[lld-macho] Check if DWARF offset is too large for compact unwind For functions that use DWARF encodings, their compact unwind entry will contain a hint about the offset of their DWARF entry from the start of the `__eh_frame` section. The encoding only has 3 bytes to encode this hint. Previously, I neglected to check for overflow (and didn't realize that the value was merely a hint without needing to be exact.) So for large `__eh_frame` sections, the hint would overflow and cause the compact unwind MODE flag to be corrupted, leading to uncaught exceptions at runtime. This diff fixes things by encoding zero as the hint for offsets that are too large. The unwinder will start a linear search at the hint location for the matching CFI record. The only requirement is that the hint points to a valid CFI record start, and the start of the section is always the start of a CFI record (in well-formed programs). I'm not adding a test for this because generating the test inputs takes a bit too much time. However, I have been testing locally with this lit file, which takes about 15s to run on my machine: ``` # RUN: rm -rf %t; mkdir %t # RUN: llvm-mc -filetype=obj -triple=x86_64-apple-macos11.0 %s -o %t/test.o # RUN: %lld -dylib -lSystem %t/test.o -o %t/test .subsections_via_symbols .text .p2align 2 _f: .cfi_startproc .rept 0x7fffff .cfi_escape 0x2e, 0x10 .endr ret .cfi_endproc _g: .cfi_startproc .cfi_escape 0x2e, 0x10 ret .cfi_endproc ``` Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D147505	2023-04-04 09:30:22 -04:00
Jez Ng	453102a028	[lld-macho][re-land] Warn on method name collisions from category definitions This implements ld64's checks for duplicate method names in categories & classes. In addition, this sets us up for implementing Obj-C category merging. This diff handles the most of the parsing work; what's left is rewriting those category / class structures. Numbers for chromium_framework: base diff difference (95% CI) sys_time 2.182 ± 0.027 2.200 ± 0.047 [ -0.2% .. +1.8%] user_time 6.451 ± 0.034 6.479 ± 0.062 [ -0.0% .. +0.9%] wall_time 6.841 ± 0.048 6.885 ± 0.105 [ -0.1% .. +1.4%] samples 33 22 Fixes https://github.com/llvm/llvm-project/issues/54912. Issues seen with the previous land will be fixed in the next commit. Reviewed By: #lld-macho, thevinster, oontvoo Differential Revision: https://reviews.llvm.org/D142916	2023-03-30 14:33:42 -04:00
Jez Ng	ecad968f4a	Revert "[lld-macho] Warn on method name collisions from category definitions" This reverts commit `ef122753db`. Apparently it is causing some crashes: https://reviews.llvm.org/D142916#4178869	2023-03-08 15:57:24 -08:00
Jez Ng	ef122753db	[lld-macho] Warn on method name collisions from category definitions This implements ld64's checks for duplicate method names in categories & classes. In addition, this sets us up for implementing Obj-C category merging. This diff handles the most of the parsing work; what's left is rewriting those category / class structures. Numbers for chromium_framework: base diff difference (95% CI) sys_time 2.182 ± 0.027 2.200 ± 0.047 [ -0.2% .. +1.8%] user_time 6.451 ± 0.034 6.479 ± 0.062 [ -0.0% .. +0.9%] wall_time 6.841 ± 0.048 6.885 ± 0.105 [ -0.1% .. +1.4%] samples 33 22 Fixes https://github.com/llvm/llvm-project/issues/54912. Reviewed By: #lld-macho, thevinster, oontvoo Differential Revision: https://reviews.llvm.org/D142916	2023-03-07 14:48:56 -08:00
Kazu Hirata	55e2cd1609	Use llvm::count{lr}_{zero,one} (NFC)	2023-01-28 12:41:20 -08:00
Jez Ng	7f60ed12ef	[reland][lld-macho] Private label aliases to weak symbols should not retain section data This reverts commit `a650f2ec7a`. The crashes it was causing will be fixed by the stacked diff {D140606}.	2022-12-23 14:51:18 -05:00
Jez Ng	a650f2ec7a	Revert "[lld-macho] Private label aliases to weak symbols should not retain section data" This reverts commit `6736bce6db`. It's causing Swift-related crashes in e.g. https://bugs.chromium.org/p/chromium/issues/detail?id=1400716 and elsewhere.	2022-12-15 18:43:00 -05:00
Jez Ng	04b1dad717	[lld-macho] Canonicalize LSDA pointers This was causing an uncaught exception issue in one of our programs. The issue was fairly subtle / rare as it required two identical LSDAs that were referenced by a pair of non-identical compact unwind encodings. Reviewed By: #lld-macho, smeenai Differential Revision: https://reviews.llvm.org/D139269	2022-12-05 16:18:15 -05:00
Jez Ng	6736bce6db	[lld-macho] Private label aliases to weak symbols should not retain section data If we have two files with the same weak symbol like so: ``` ltmp0: _weak: <contents> ``` and ``` ltmp1: _weak: <contents> ``` Linking them together should leave only one copy of `<contents>`, not two. Previously, we would keep around both copies because of the private-label `ltmp<N>` symbols (i.e. symbols that start with `l`) -- we would not coalesce those, so we would treat them as retaining the contents. This matters for more than just size -- we are depending upon this behavior internally for emitting a certain file format. This file format's header is repeated in each object file, but we want it to appear just once in our output. Why can't we not emit those aliases to `_weak`, or reference the `ltmp<N>` symbols instead of `_weak`? Well, MC actually adds `ltmp<N>` symbols as part of the assembly-to-binary translation step. So any codegen at the clang level can't access them. All that said... this solution is actually kind of hacky. Here, we avoid creating the private-label symbols at parse time. This is acceptable since we never emit those symbols in our output. However, in ld64, any aliasing temporary symbols (ignored or otherwise) won't retain coalesced data. But implementing this is harder -- we would have to create those symbols first (so we can emit their names later), but we would have to ensure the linker correctly shuffles them around when their aliasees get coalesced. Additionally, ld64 treats these temporary symbols as functionally equivalent to the weak symbols themselves -- that is, it will emit weak binds when those non-weak temporary aliases are referenced. We have imitated this behavior for private-label symbols, but implementing it for local aliases in general seems substantially more difficult. I'm not sure if any programs actually depend on this behavior though, so maybe it's a moot point. Finally, ld64 does all this regardless of whether `.subsections_via_symbols` is specified. We don't. But again, given how rare the lack of that directive is (I've only seen it from hand-written assembly inputs), I don't think we need to worry about it. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D139069	2022-12-01 12:01:32 -05:00
Vy Nguyen	65226d3f1f	[lld-macho] Fix bug in CUE folding that resulted in wrong unwind table. PR/59070 Differential Revision: https://reviews.llvm.org/D138320	2022-11-22 10:39:51 -05:00
Fangrui Song	640d9b3296	[lld] Fix duplicate word typos. NFC Based on lld/ part of D137338 but reflowed comments.	2022-11-08 17:28:04 -08:00
Vy Nguyen	fc7a71890d	[lld-macho][nfc] Clean up includes - remove unused/duplicate includes - reformatting/whitespaces Differential Revision: https://reviews.llvm.org/D136266	2022-10-19 13:56:24 -04:00
Vy Nguyen	a6d6734a41	[lld-macho][nfc] define command UNWIND_MODE_MASK for convenience and rewrite mode-mask checking logic for clarity The previous form is currently "harmless" and happened to work but may not in the future: Consider the struct: (for x86-64, but same issue can be said for the ARM/64 families): ``` UNWIND_X86_64_MODE_MASK = 0x0F000000, UNWIND_X86_64_MODE_RBP_FRAME = 0x01000000, UNWIND_X86_64_MODE_STACK_IMMD = 0x02000000, UNWIND_X86_64_MODE_STACK_IND = 0x03000000, UNWIND_X86_64_MODE_DWARF = 0x04000000, ``` Previously, we were doing: `(encoding & MODE_DWARF) == MODE_DWARF` As soon as a new `UNWIND_X86_64_MODE_FOO = 0x05000000` is defined, then the check above would always return true for encoding=MODE_FOO (because `(0b0101 & 0b0100) == 0b0100` ) Differential Revision: https://reviews.llvm.org/D135359	2022-10-14 15:16:40 -04:00
Jez Ng	7b45dfc681	[lld-macho] Canonicalize personality pointers in EH frames We already do this for personality pointers referenced from compact unwind entries; this patch extends that behavior to personalities referenced via EH frames as well. This reduces the number of distinct personalities we need in the final binary, and helps us avoid hitting the "too many personalities" error. I renamed `UnwindInfoSection::prepareRelocations()` to simply `prepare` since we now do some non-reloc-specific stuff within. Fixes #58277. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D135728	2022-10-11 23:50:46 -04:00
Kazu Hirata	32aa35b504	Drop empty string literals from static_assert (NFC) Identified with modernize-unary-static-assert.	2022-09-03 11:17:47 -07:00
Shoaib Meenai	56bd3185cd	[MachO] Don't fold compact unwind entries with LSDA Folding them will cause the unwinder to compute the incorrect function start address for the folded entries, which in turn will cause the personality function to interpret the LSDA incorrectly and break exception handling. You can verify the end-to-end flow by creating a simple C++ file: ``` void h(); int main() { h(); } ``` and then linking this file against the liblsda.dylib produced by the test case added here. Before this change, running the resulting program would result in a program termination with an uncaught exception. Afterwards, it works correctly. Reviewed By: #lld-macho, thevinster Differential Revision: https://reviews.llvm.org/D132845	2022-08-30 12:34:19 +05:00
Shoaib Meenai	f0c00942a5	[MachO] Remove stale comments https://reviews.llvm.org/D93267 implemented handling more than 127 compact unwind encodings, and https://reviews.llvm.org/D123435 and https://reviews.llvm.org/D124561 implemented stripping redundant __eh_frame entries.	2022-08-29 00:10:33 +05:00
Martin Storsjö	59c6f418fa	[LLD] [MachO] Fix GCC build warnings This fixes the following warnings produced by GCC 9: ../tools/lld/MachO/Arch/ARM64.cpp: In member function ‘void {anonymous}::OptimizationHintContext::applyAdrpLdr(const lld::macho::OptimizationHint&)’: ../tools/lld/MachO/Arch/ARM64.cpp:448:18: warning: comparison of integer expressions of different signedness: ‘int64_t’ {aka ‘long int’} and ‘uint64_t’ {aka ‘long unsigned int’} [-Wsign-compare] 448 \| if (ldr.offset != (rel1->referentVA & 0xfff)) \| ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../tools/lld/MachO/UnwindInfoSection.cpp: In function ‘bool canFoldEncoding(compact_unwind_encoding_t)’: ../tools/lld/MachO/UnwindInfoSection.cpp:404:44: warning: comparison between ‘enum<unnamed>’ and ‘enum<unnamed>’ [-Wenum-compare] 404 \| static_assert(UNWIND_X86_64_MODE_MASK == UNWIND_X86_MODE_MASK, ""); \| ^~~~~~~~~~~~~~~~~~~~ ../tools/lld/MachO/UnwindInfoSection.cpp:405:49: warning: comparison between ‘enum<unnamed>’ and ‘enum<unnamed>’ [-Wenum-compare] 405 \| static_assert(UNWIND_X86_64_MODE_STACK_IND == UNWIND_X86_MODE_STACK_IND, ""); \| ^~~~~~~~~~~~~~~~~~~~~~~~~ Differential Revision: https://reviews.llvm.org/D130970	2022-08-03 00:14:39 +03:00
Jez Ng	241f62d8d3	[lld-macho] Fix assertion when two symbols at same addr have unwind info If there are multiple symbols at the same address, our unwind info implementation assumes that we always register unwind entries to a single canonical symbol. This assumption was violated by the `registerEhFrame` code. Fixes #56570. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D130208	2022-07-21 09:44:49 -04:00
David Spickett	79942d32a6	[lld-macho] Fix compact unwind output for 32 bit builds This test was failing on our 32 bit build bot: https://lab.llvm.org/buildbot/#/builders/178/builds/2463 This happened because in UnwindInfoSectionImpl::finalize a decision is made whether to write out regular or compressed unwind info. One check in this does: ``` if (cuPtr->functionAddress >= functionAddressMax) { break; ``` Where cuPtr->functionAddress was uint64_t and functionAddressMax was uintptr_t, which is 4 bytes on a 32 bit system. Using uint64_t for functionAddressMax fixes this problem. Presumably because at only 4 bytes, the max is much lower than we expect. We're targetting 64 bit though so the size of the max should match the size of the addresses. Reviewed By: #lld-macho, int3 Differential Revision: https://reviews.llvm.org/D129363	2022-07-11 08:21:03 +00:00
Nico Weber	7effcbda49	Rename parallelForEachN to just parallelFor Patch created by running: rg -l parallelForEachN \| xargs sed -i '' -c 's/parallelForEachN/parallelFor/' No behavior change. Differential Revision: https://reviews.llvm.org/D128140	2022-06-19 17:49:00 -04:00
Daniel Bertalan	f2e92cf60e	[lld-macho] Print the name of functions containing undefined references The error used to look like this: ld64.lld: error: undefined symbol: _foo >>> referenced by /path/to/bar.o Now it displays the name of the function that contains the undefined reference as well: ld64.lld: error: undefined symbol: _foo >>> referenced by /path/to/bar.o:(symbol _baz+0x4) Differential Revision: https://reviews.llvm.org/D127696	2022-06-14 09:41:28 -04:00
Jez Ng	e183bf8e15	[lld-macho][reland] Initial support for EH Frames This reverts commit `942f4e3a7c`. The additional change required to avoid the assertion errors seen previously is: --- a/lld/MachO/ICF.cpp +++ b/lld/MachO/ICF.cpp @@ -443,7 +443,9 @@ void macho::foldIdenticalSections() { /relocVA=/0); isec->data = copy; } - } else { + } else if (!isEhFrameSection(isec)) { + // EH frames are gathered as hashables from unwindEntry above; give a + // unique ID to everything else. isec->icfEqClass[0] = ++icfUniqueID; } } Differential Revision: https://reviews.llvm.org/D123435	2022-06-13 07:45:16 -04:00
Douglas Yung	942f4e3a7c	Revert "[lld-macho] Initial support for EH Frames" This reverts commit `826be330af`. This was causing a test failure on build bots: - https://lab.llvm.org/buildbot/#/builders/36/builds/21770 - https://lab.llvm.org/buildbot/#/builders/58/builds/23913	2022-06-09 05:25:43 -07:00
Jez Ng	826be330af	[lld-macho] Initial support for EH Frames == Background == `llvm-mc` generates unwind info in both compact unwind and DWARF formats. LLD already handles the compact unwind format; this diff gets us close to handling the DWARF format properly. == Caveats == It's not quite done yet, but I figure it's worth getting this reviewed and landed first as it's shaping up to be a fairly large code change. Known limitations of the current code: * Only works for x86_64, for which `llvm-mc` emits "abs-ified" relocations as described in `618def651b`. `llvm-mc` emits regular relocations for ARM EH frames, which we do not yet handle correctly. Since the feature is not ready for real use yet, I've gated it behind a flag that only gets toggled on during test suite runs. With most of the new code disabled, we see just a hint of perf regression, so I don't think it'd be remiss to land this as-is: base diff difference (95% CI) sys_time 1.926 ± 0.168 1.979 ± 0.117 [ -1.2% .. +6.6%] user_time 3.590 ± 0.033 3.606 ± 0.028 [ +0.0% .. +0.9%] wall_time 7.104 ± 0.184 7.179 ± 0.151 [ -0.2% .. +2.3%] samples 30 31 == Design == Like compact unwind entries, EH frames are also represented as regular ConcatInputSections that get pointed to via `Defined::unwindEntry`. This allows them to be handled generically by e.g. the MarkLive and ICF code. (But note that unlike compact unwind subsections, EH frame subsections do end up in the final binary.) In order to make EH frames "look like" a regular ConcatInputSection, some processing is required. First, we need to split the `__eh_frame` section along EH frame boundaries rather than along symbol boundaries. We do this by decoding the length field of each EH frame. Second, the abs-ified relocations need to be turned into regular Relocs. == Next Steps == In order to support EH frames on ARM targets, we will either have to teach LLD how to handle EH frames with explicit relocs, or we can try to make `llvm-mc` emit abs-ified relocs for ARM as well. I'm hoping to do the latter as I think it will make the LLD implementation both simpler and faster to execute. == Misc == The `obj-file-with-stabs.s` test had to be updated as the previous version would trip assertion errors in the code. It appears that in our attempt to produce a minimal YAML test input, we created a file with invalid EH frame data. I've fixed this by re-generating the YAML and not doing any hand-pruning of it. Reviewed By: #lld-macho, Roger Differential Revision: https://reviews.llvm.org/D123435	2022-06-08 23:40:52 -04:00
Alex Brachet	190b0f42cf	[lld-macho] Stop crash when emitting personalities with -dead_strip The <internal> symbol was tripping an assertion in getVA() because it was not marked as used. Per the comment above that symbols creation, dead stripping has already occurred so marking this symbol as used is accurate. Fixes https://github.com/llvm/llvm-project/issues/55565 Differential revision: https://reviews.llvm.org/D126072	2022-05-20 21:40:47 +00:00
Jez Ng	2a6669060f	[lld-macho][nfc] De-templatize UnwindInfoSection Follow-on to {D123276}. Now that we work with an internal representation of compact unwind entries, we no longer need to template our UnwindInfoSectionImpl code based on the pointer size of the target architecture. I've still kept the split between `UnwindInfoSectionImpl` and `UnwindInfoSection`. I'd introduced that split in order to do type erasure, but I think it's still useful to have in order to keep `UnwindInfoSection`'s definition in the header file clean. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D123277	2022-04-13 16:19:22 -04:00
Jez Ng	1cff723ff5	[lld-macho][nfc] Use includeInSymtab for all symtab-skipping logic {D123302} got me looking deeper at `includeInSymtab`. I thought it was a little odd that there were excluded (live) symbols for which `includeInSymtab` was false; we shouldn't have so many different ways to exclude a symbol. As such, this diff makes the `L`-prefixed-symbol exclusion code use `includeInSymtab` too. (Note that as part of our support for `__eh_frame`, we will also be excluding all `__eh_frame` symbols from the symtab in a future diff.) Another thing I noticed is that the `emitStabs` code never has to deal with excluded symbols because `SymtabSection::finalize()` already filters them out. As such, I've updated the comments and asserts from {D123302} to reflect this. Reviewed By: #lld-macho, thakis Differential Revision: https://reviews.llvm.org/D123433	2022-04-11 15:45:46 -04:00
Jez Ng	82dcf30636	[lld-macho] Use fewer indirections in UnwindInfo implementation The previous implementation of UnwindInfoSection materialized all the compact unwind entries & applied their relocations, then parsed the resulting data to generate the final unwind info. This design had some unfortunate conseqeuences: since relocations can only be applied after their referents have had addresses assigned, operations that need to happen before address assignment must contort themselves. (See {D113582} and observe how this diff greatly simplifies it.) Moreover, it made synthesizing new compact unwind entries awkward. Handling PR50956 will require us to do this synthesis, and is the main motivation behind this diff. Previously, instead of generating a new CompactUnwindEntry directly, we would have had to generate a ConcatInputSection with a number of `Reloc`s that would then get "flattened" into a CompactUnwindEntry. This diff introduces an internal representation of `CompactUnwindEntry` (the former `CompactUnwindEntry` has been renamed to `CompactUnwindLayout`). The new CompactUnwindEntry stores references to its personality symbol and LSDA section directly, without the use of `Reloc` structs. In addition to being easier to work with, this diff also allows us to handle unwind info whose personality symbols are located in sections placed after the `__unwind_info`. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D123276	2022-04-08 23:49:07 -04:00
Jez Ng	06f863ac5e	[lld-macho] Include address offsets in error messages This makes it easier to pinpoint the source of the problem. TODO: Have more relocation error messages make use of this functionality. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D118798	2022-02-07 21:06:18 -05:00
Jez Ng	3e951808d5	[lld-macho][nfc] Comments and style fixes Added some comments (particularly around finalize() and finalizeContents()) as well as doing some rephrasing / grammar fixes for existing comments. Also did some minor style fixups, such as by putting methods together in a class definition and having fields of similar types next to each other. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D118714	2022-02-01 13:45:59 -05:00
Fangrui Song	0aae2bf373	[lld-macho] Add --start-lib --end-lib In ld.lld, when an ObjFile/BitcodeFile is read in --start-lib state, the file is given archive semantics. --end-lib closes the previous --start-lib. A build system can use this feature as an alternative to archives. This patch ports the feature to lld-macho. --start-lib and --end-lib are positional, unlike usual ld64 options. I think the slight drawback does not matter as (a) reusing option names make build systems convenient (b) `--start-lib a.o b.o --end-lib` conveys more information than an alternative design: `-objlib a.o -objlib b.o` because --start-lib makes it clear which objects are in the same conceptual archive. This provides flexibility (c) `-objlib`/`-filelist` interaction may be weird. Close https://github.com/llvm/llvm-project/issues/52931 Reviewed By: #lld-macho, Jez Ng, oontvoo Differential Revision: https://reviews.llvm.org/D116913	2022-01-19 10:14:49 -08:00
Fangrui Song	97a5dccb7d	[lld-macho] Rename LazySymbol to LazyArchive. NFC D116913 will add LazyObject. Rename LazySymbol to LazyArchive to avoid confusion and mirror ELF. Reviewed By: #lld-macho, Jez Ng Differential Revision: https://reviews.llvm.org/D116914	2022-01-11 16:49:06 -08:00
Vy Nguyen	944071eca2	[lld-macho] Don't replace local personality symbol with LazySymbol Follup-up to D107533, where we replaced local syms with non-local. It doesn't make sense to replace local symbol with lazy. Differential Revision: https://reviews.llvm.org/D110040	2021-11-22 14:09:54 -05:00
Greg McGary	9cc489a4b2	[lld-macho][nfc] Factor-out NFC changes from main __eh_frame diff In order to keep signal:noise high for the `__eh_frame` diff, I have teased-out the NFC changes and put them here. Differential Revision: https://reviews.llvm.org/D114017	2021-11-17 15:16:44 -07:00
Jez Ng	9d0b237c51	[lld-macho] Fix symbol relocs handling for LSDAs Similar to D113702, but for the LSDAs. Clang seems to emit all LSDA relocs as section relocs, but ld -r can turn those relocs into symbol ones. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D113721	2021-11-12 16:02:49 -05:00
Jez Ng	d9b6f7e312	[lld-macho] Teach ICF to dedup functions with identical unwind info Dedup'ing unwind info is tricky because each CUE contains a different function address, if ICF operated naively and compared the entire contents of each CUE, entries with identical unwind info but belonging to different functions would never be considered identical. To work around this problem, we slice away the function address before performing ICF. We rely on `relocateCompactUnwind()` to correctly handle these truncated input sections. Here are the numbers before and after D109944, D109945, and this diff were applied, as tested on my 3.2 GHz 16-Core Intel Xeon W: Without any optimizations: base diff difference (95% CI) sys_time 0.849 ± 0.015 0.896 ± 0.012 [ +4.8% .. +6.2%] user_time 3.357 ± 0.030 3.512 ± 0.023 [ +4.3% .. +5.0%] wall_time 3.944 ± 0.039 4.032 ± 0.031 [ +1.8% .. +2.6%] samples 40 38 With `-dead_strip`: base diff difference (95% CI) sys_time 0.847 ± 0.010 0.896 ± 0.012 [ +5.2% .. +6.5%] user_time 3.377 ± 0.014 3.532 ± 0.015 [ +4.4% .. +4.8%] wall_time 3.962 ± 0.024 4.060 ± 0.030 [ +2.1% .. +2.8%] samples 47 30 With `-dead_strip` and `--icf=all`: base diff difference (95% CI) sys_time 0.935 ± 0.013 0.957 ± 0.018 [ +1.5% .. +3.2%] user_time 3.472 ± 0.022 6.531 ± 0.046 [ +87.6% .. +88.7%] wall_time 4.080 ± 0.040 5.329 ± 0.060 [ +30.0% .. +31.2%] samples 37 30 Unsurprisingly, ICF is now a lot slower, likely due to the much larger number of input sections it needs to process. But the rest of the linker only suffers a mild slowdown. Note that the compact-unwind-bad-reloc.s test was expanded because we now handle the relocation for CUE's function address in a separate code path from the rest of the CUE relocations. The extended test covers both code paths. Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D109946	2021-11-12 16:02:49 -05:00
Jez Ng	a2404f11c7	[lld-macho] Support renaming of LSDA section Previously, our unwind info finalization logic assumed that the LSDA section referenced by `__compact_unwind` was already finalized before `__TEXT,__unwind_info` itself. However, that assumption could be broken by the use of `-rename_section` -- it could be (and is) used to move `__gcc_except_tab` it into a different segment later in the file. (__TEXT is always the first non-zerofill segment, so any rename basically guarantees that the section will be ordered after `__unwind_info`.) To handle this case, we compare LSDA relocations instead of their final values in `UnwindInfoSection::finalize()`, and we actually relocate those LSDAs in `UnwindInfoSection::writeTo()`. In order to do this, we need an easy way to track which Symbol a given CUE corresponds to. My solution was to change our `cuPtrVector` into a vector of indices, with each index used for both the symbols vector (`symbolsVec`) as well as the CUE vector (`cuVector`). This change seems perf neutral. Numbers for linking chromium_framework on my 16 core Mac Pro: base diff difference (95% CI) sys_time 1.248 ± 0.025 1.245 ± 0.026 [ -1.3% .. +0.8%] user_time 3.588 ± 0.045 3.587 ± 0.037 [ -0.6% .. +0.5%] wall_time 4.605 ± 0.069 4.595 ± 0.069 [ -1.0% .. +0.5%] samples 42 26 Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D113582	2021-11-10 19:31:54 -05:00
Vy Nguyen	3f35dd06a5	[lld-macho][nfc][cleanup] Fix a few code style lints and clang-tidy findings - Use .empty() instead of `size() == 0` when possible. - Use const-ref to avoid copying Differential Revision: https://reviews.llvm.org/D112978	2021-11-02 11:26:15 -04:00
Jez Ng	a9353dbe51	[lld-macho] Simplify the handling of "no unwind info" functions This diff does away with `addEntriesForFunctionsWithoutUnwindInfo()`, because `addSymbol()` can now determine which functions need those entries. While overhauling UnwindInfoSection, I also parallelized the relocation of the contents of the CUEs. This somewhat offsets the time regression from creating one InputSection per CUE (which was done in D109944). Reviewed By: #lld-macho, oontvoo Differential Revision: https://reviews.llvm.org/D109945	2021-10-26 16:04:16 -04:00
Jez Ng	002eda7056	[lld-macho] Associate compact unwind entries with function symbols Compact unwind entries (CUEs) contain pointers to their respective function symbols. However, during the link process, it's far more useful to have pointers from the function symbol to the CUE than vice versa. This diff adds that pointer in the form of `Defined::compactUnwind`. In particular, when doing dead-stripping, we want to mark CUEs live when their function symbol is live; and when doing ICF, we want to dedup sections iff the symbols in that section have identical CUEs. In both cases, we want to be able to locate the symbols within a given section, as well as locate the CUEs belonging to those symbols. So this diff also adds `InputSection::symbols`. The ultimate goal of this refactor is to have ICF support dedup'ing functions with unwind info, but that will be handled in subsequent diffs. This diff focuses on simplifying `-dead_strip` -- `findFunctionsWithUnwindInfo` is no longer necessary, and `Defined::isLive()` is now a lot simpler. Moreover, UnwindInfoSection no longer has to check for dead CUEs -- we simply avoid adding them in the first place. Additionally, we now support stripping of dead LSDAs, which follows quite naturally since `markLive()` can now reach them via the CUEs. Reviewed By: #lld-macho, gkm Differential Revision: https://reviews.llvm.org/D109944	2021-10-26 16:04:15 -04:00
Vy Nguyen	b428c3e8c1	[lld-macho] Ignore local personality symbols if non-local with the same name exisst, to avoid "too many personalities" error. Sometimes people intentionally re-define a dylib personlity symbol as a local defined symbol as a workaround to a ld -r bug. As a result, we could see "too many personalities" to encode. This patch tries to handle this case by ignoring the local symbols entirely. Differential Revision: https://reviews.llvm.org/D107533	2021-09-17 12:59:42 -04:00

1 2 3

103 Commits