As stated in `UnwindInfoSectionImpl::prepareRelocations`'s comments, the
unwind info uses section+addend relocations for personality functions
defined in the same file as the function itself. As personality
functions are always accessed via the GOT, we need to resolve those to a
symbol. Previously, we did this by keeping a map which resolves these to
symbols, creating a synthetic symbol if we didn't find it in the map.
This approach has an issue: if we process the object file containing the
personality function before any external uses, the entry in the map
remains unpopulated, so we create a synthetic symbol and a corresponding
GOT entry. If we encounter a relocation to it in a later file which
requires GOT (such as in `__eh_frame`), we add that symbol to the GOT,
too, effectively creating two entries which point to the same piece of
code.
This commit fixes that by searching the personality function's section
for a symbol at that offset which already has a GOT entry, and only
creating a synthetic symbol if there is none. As all non-unwind sections
are already processed by this point, it ensures no duplication.
This should only really affect our tests (and make them clearer), as
personality functions are usually defined in platform runtime libraries.
Or even if they are local, they are likely not in the first object file
to be linked.
Currently, when moving symbols from one `InputSection` to another (like
in ICF) we directly update the symbol's `isec`, `unwindEntry` and
`size`. By doing this we lose the original information. This information
will be needed in a future change. Since when moving symbols we always
set the symbol's `wasCoalesced` and `isec-> replacement`, we can just
use this info to conditionally get the information we need at access
time.
SmallVector<*, 0> is often a better replacement for std::vector :
both the object size and the code size are smaller.
(SmallMapVector uses SmallVector as well, but it is not common.)
clang size decreases by 0.0226%.
instructions:u decreases 0.037% when compiling a sqlite3 amalgram.
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D156016
Reasons for rolling forward:
- the crash reported from Chromium was fixed in D151824 (not related to this patch at all)
- since D152824 was committed, it should now be safe to roll this forward.
New change:
- add an additional _ in name check
This reverts commit 4980eead4d.
We never really supported 32-bit ARM arch entirely, and partial support was added for
very specific features. Regardless, it fails to even link the most basic applications that at
this point, it might be better to move this arch as unsupported. Given that Apple will be
moving towards arm64 long term, I don't see any reason for anyone to invest time in
supporting this either, and for those who still need it should use apple's ld64 linker.
Fixes#62691
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D150544
Details: https://github.com/rust-lang/rust/issues/102754
The MachO format uses 2 bits to encode these personality funtions, with 0 reserved for "no-personality".
This means we can only have up to 3 personality. There are already three popular personalities: __gxx_personality_v0, __gcc_personality_v0, and __objc_personality_v0.
As a result, any system that needs custom-personality will run into a problem.
This patch implemented jyknight's proposal to simply force DWARFs for all non-canonical personality functions.
Differential Revision: https://reviews.llvm.org/D144999
For functions that use DWARF encodings, their compact unwind entry will
contain a hint about the offset of their DWARF entry from the start of
the `__eh_frame` section. The encoding only has 3 bytes to encode this
hint.
Previously, I neglected to check for overflow (and didn't realize that
the value was merely a hint without needing to be exact.) So for large
`__eh_frame` sections, the hint would overflow and cause the compact
unwind MODE flag to be corrupted, leading to uncaught exceptions at
runtime.
This diff fixes things by encoding zero as the hint for offsets that are
too large. The unwinder will start a linear search at the hint location
for the matching CFI record. The only requirement is that the hint
points to a valid CFI record start, and the start of the section is
always the start of a CFI record (in well-formed programs).
I'm not adding a test for this because generating the test inputs takes
a bit too much time. However, I have been testing locally with this lit
file, which takes about 15s to run on my machine:
```
# RUN: rm -rf %t; mkdir %t
# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-macos11.0 %s -o %t/test.o
# RUN: %lld -dylib -lSystem %t/test.o -o %t/test
.subsections_via_symbols
.text
.p2align 2
_f:
.cfi_startproc
.rept 0x7fffff
.cfi_escape 0x2e, 0x10
.endr
ret
.cfi_endproc
_g:
.cfi_startproc
.cfi_escape 0x2e, 0x10
ret
.cfi_endproc
```
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D147505
This implements ld64's checks for duplicate method names in categories &
classes.
In addition, this sets us up for implementing Obj-C category merging.
This diff handles the most of the parsing work; what's left is rewriting
those category / class structures.
Numbers for chromium_framework:
base diff difference (95% CI)
sys_time 2.182 ± 0.027 2.200 ± 0.047 [ -0.2% .. +1.8%]
user_time 6.451 ± 0.034 6.479 ± 0.062 [ -0.0% .. +0.9%]
wall_time 6.841 ± 0.048 6.885 ± 0.105 [ -0.1% .. +1.4%]
samples 33 22
Fixes https://github.com/llvm/llvm-project/issues/54912.
Issues seen with the previous land will be fixed in the next commit.
Reviewed By: #lld-macho, thevinster, oontvoo
Differential Revision: https://reviews.llvm.org/D142916
This implements ld64's checks for duplicate method names in categories &
classes.
In addition, this sets us up for implementing Obj-C category merging.
This diff handles the most of the parsing work; what's left is rewriting
those category / class structures.
Numbers for chromium_framework:
base diff difference (95% CI)
sys_time 2.182 ± 0.027 2.200 ± 0.047 [ -0.2% .. +1.8%]
user_time 6.451 ± 0.034 6.479 ± 0.062 [ -0.0% .. +0.9%]
wall_time 6.841 ± 0.048 6.885 ± 0.105 [ -0.1% .. +1.4%]
samples 33 22
Fixes https://github.com/llvm/llvm-project/issues/54912.
Reviewed By: #lld-macho, thevinster, oontvoo
Differential Revision: https://reviews.llvm.org/D142916
This was causing an uncaught exception issue in one of our programs. The
issue was fairly subtle / rare as it required two identical LSDAs that were
referenced by a pair of non-identical compact unwind encodings.
Reviewed By: #lld-macho, smeenai
Differential Revision: https://reviews.llvm.org/D139269
If we have two files with the same weak symbol like so:
```
ltmp0:
_weak:
<contents>
```
and
```
ltmp1:
_weak:
<contents>
```
Linking them together should leave only one copy of `<contents>`, not
two. Previously, we would keep around both copies because of the
private-label `ltmp<N>` symbols (i.e. symbols that start with `l`) -- we
would not coalesce those, so we would treat them as retaining the
contents.
This matters for more than just size -- we are depending upon this
behavior internally for emitting a certain file format. This file
format's header is repeated in each object file, but we want it to
appear just once in our output.
Why can't we not emit those aliases to `_weak`, or reference the
`ltmp<N>` symbols instead of `_weak`? Well, MC actually adds `ltmp<N>`
symbols as part of the assembly-to-binary translation step. So any
codegen at the clang level can't access them.
All that said... this solution is actually kind of hacky. Here, we avoid
creating the private-label symbols at parse time. This is acceptable
since we never emit those symbols in our output. However, in ld64, any
aliasing temporary symbols (ignored or otherwise) won't retain coalesced
data. But implementing this is harder -- we would have to create those
symbols first (so we can emit their names later), but we would have to
ensure the linker correctly shuffles them around when their aliasees get
coalesced.
Additionally, ld64 treats these temporary symbols as functionally
equivalent to the weak symbols themselves -- that is, it will emit weak
binds when those non-weak temporary aliases are referenced. We have
imitated this behavior for private-label symbols, but implementing it for
local aliases in general seems substantially more difficult. I'm not
sure if any programs actually depend on this behavior though, so maybe
it's a moot point.
Finally, ld64 does all this regardless of whether
`.subsections_via_symbols` is specified. We don't. But again, given how
rare the lack of that directive is (I've only seen it from hand-written
assembly inputs), I don't think we need to worry about it.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D139069
The previous form is currently "harmless" and happened to work but may not in the future:
Consider the struct: (for x86-64, but same issue can be said for the ARM/64 families):
```
UNWIND_X86_64_MODE_MASK = 0x0F000000,
UNWIND_X86_64_MODE_RBP_FRAME = 0x01000000,
UNWIND_X86_64_MODE_STACK_IMMD = 0x02000000,
UNWIND_X86_64_MODE_STACK_IND = 0x03000000,
UNWIND_X86_64_MODE_DWARF = 0x04000000,
```
Previously, we were doing: `(encoding & MODE_DWARF) == MODE_DWARF`
As soon as a new `UNWIND_X86_64_MODE_FOO = 0x05000000` is defined, then the check above would always return true for encoding=MODE_FOO (because `(0b0101 & 0b0100) == 0b0100` )
Differential Revision: https://reviews.llvm.org/D135359
We already do this for personality pointers referenced from compact
unwind entries; this patch extends that behavior to personalities
referenced via EH frames as well.
This reduces the number of distinct personalities we need in the final
binary, and helps us avoid hitting the "too many personalities" error.
I renamed `UnwindInfoSection::prepareRelocations()` to simply `prepare`
since we now do some non-reloc-specific stuff within.
Fixes#58277.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D135728
Folding them will cause the unwinder to compute the incorrect function
start address for the folded entries, which in turn will cause the
personality function to interpret the LSDA incorrectly and break
exception handling.
You can verify the end-to-end flow by creating a simple C++ file:
```
void h();
int main() { h(); }
```
and then linking this file against the liblsda.dylib produced by the
test case added here. Before this change, running the resulting program
would result in a program termination with an uncaught exception.
Afterwards, it works correctly.
Reviewed By: #lld-macho, thevinster
Differential Revision: https://reviews.llvm.org/D132845
This fixes the following warnings produced by GCC 9:
../tools/lld/MachO/Arch/ARM64.cpp: In member function ‘void {anonymous}::OptimizationHintContext::applyAdrpLdr(const lld::macho::OptimizationHint&)’:
../tools/lld/MachO/Arch/ARM64.cpp:448:18: warning: comparison of integer expressions of different signedness: ‘int64_t’ {aka ‘long int’} and ‘uint64_t’ {aka ‘long unsigned int’} [-Wsign-compare]
448 | if (ldr.offset != (rel1->referentVA & 0xfff))
| ~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
../tools/lld/MachO/UnwindInfoSection.cpp: In function ‘bool canFoldEncoding(compact_unwind_encoding_t)’:
../tools/lld/MachO/UnwindInfoSection.cpp:404:44: warning: comparison between ‘enum<unnamed>’ and ‘enum<unnamed>’ [-Wenum-compare]
404 | static_assert(UNWIND_X86_64_MODE_MASK == UNWIND_X86_MODE_MASK, "");
| ^~~~~~~~~~~~~~~~~~~~
../tools/lld/MachO/UnwindInfoSection.cpp:405:49: warning: comparison between ‘enum<unnamed>’ and ‘enum<unnamed>’ [-Wenum-compare]
405 | static_assert(UNWIND_X86_64_MODE_STACK_IND == UNWIND_X86_MODE_STACK_IND, "");
| ^~~~~~~~~~~~~~~~~~~~~~~~~
Differential Revision: https://reviews.llvm.org/D130970
If there are multiple symbols at the same address, our unwind info
implementation assumes that we always register unwind entries to a
single canonical symbol.
This assumption was violated by the `registerEhFrame` code.
Fixes#56570.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D130208
This test was failing on our 32 bit build bot:
https://lab.llvm.org/buildbot/#/builders/178/builds/2463
This happened because in UnwindInfoSectionImpl::finalize
a decision is made whether to write out regular or compressed
unwind info.
One check in this does:
```
if (cuPtr->functionAddress >= functionAddressMax) {
break;
```
Where cuPtr->functionAddress was uint64_t and functionAddressMax
was uintptr_t, which is 4 bytes on a 32 bit system.
Using uint64_t for functionAddressMax fixes this problem.
Presumably because at only 4 bytes, the max is much lower than
we expect. We're targetting 64 bit though so the size of the max
should match the size of the addresses.
Reviewed By: #lld-macho, int3
Differential Revision: https://reviews.llvm.org/D129363
Patch created by running:
rg -l parallelForEachN | xargs sed -i '' -c 's/parallelForEachN/parallelFor/'
No behavior change.
Differential Revision: https://reviews.llvm.org/D128140
The error used to look like this:
ld64.lld: error: undefined symbol: _foo
>>> referenced by /path/to/bar.o
Now it displays the name of the function that contains the undefined
reference as well:
ld64.lld: error: undefined symbol: _foo
>>> referenced by /path/to/bar.o:(symbol _baz+0x4)
Differential Revision: https://reviews.llvm.org/D127696
This reverts commit 942f4e3a7c.
The additional change required to avoid the assertion errors seen
previously is:
--- a/lld/MachO/ICF.cpp
+++ b/lld/MachO/ICF.cpp
@@ -443,7 +443,9 @@ void macho::foldIdenticalSections() {
/*relocVA=*/0);
isec->data = copy;
}
- } else {
+ } else if (!isEhFrameSection(isec)) {
+ // EH frames are gathered as hashables from unwindEntry above; give a
+ // unique ID to everything else.
isec->icfEqClass[0] = ++icfUniqueID;
}
}
Differential Revision: https://reviews.llvm.org/D123435
== Background ==
`llvm-mc` generates unwind info in both compact unwind and DWARF
formats. LLD already handles the compact unwind format; this diff gets
us close to handling the DWARF format properly.
== Caveats ==
It's not quite done yet, but I figure it's worth getting this reviewed
and landed first as it's shaping up to be a fairly large code change.
**Known limitations of the current code:**
* Only works for x86_64, for which `llvm-mc` emits "abs-ified"
relocations as described in 618def651b.
`llvm-mc` emits regular relocations for ARM EH frames, which we do not
yet handle correctly.
Since the feature is not ready for real use yet, I've gated it behind a
flag that only gets toggled on during test suite runs. With most of the
new code disabled, we see just a hint of perf regression, so I don't
think it'd be remiss to land this as-is:
base diff difference (95% CI)
sys_time 1.926 ± 0.168 1.979 ± 0.117 [ -1.2% .. +6.6%]
user_time 3.590 ± 0.033 3.606 ± 0.028 [ +0.0% .. +0.9%]
wall_time 7.104 ± 0.184 7.179 ± 0.151 [ -0.2% .. +2.3%]
samples 30 31
== Design ==
Like compact unwind entries, EH frames are also represented as regular
ConcatInputSections that get pointed to via `Defined::unwindEntry`. This
allows them to be handled generically by e.g. the MarkLive and ICF
code. (But note that unlike compact unwind subsections, EH frame
subsections do end up in the final binary.)
In order to make EH frames "look like" a regular ConcatInputSection,
some processing is required. First, we need to split the `__eh_frame`
section along EH frame boundaries rather than along symbol boundaries.
We do this by decoding the length field of each EH frame. Second, the
abs-ified relocations need to be turned into regular Relocs.
== Next Steps ==
In order to support EH frames on ARM targets, we will either have to
teach LLD how to handle EH frames with explicit relocs, or we can try to
make `llvm-mc` emit abs-ified relocs for ARM as well. I'm hoping to do
the latter as I think it will make the LLD implementation both simpler
and faster to execute.
== Misc ==
The `obj-file-with-stabs.s` test had to be updated as the previous
version would trip assertion errors in the code. It appears that in our
attempt to produce a minimal YAML test input, we created a file with
invalid EH frame data. I've fixed this by re-generating the YAML and not
doing any hand-pruning of it.
Reviewed By: #lld-macho, Roger
Differential Revision: https://reviews.llvm.org/D123435
The <internal> symbol was tripping an assertion in getVA() because it
was not marked as used. Per the comment above that symbols creation,
dead stripping has already occurred so marking this symbol as used is
accurate.
Fixes https://github.com/llvm/llvm-project/issues/55565
Differential revision: https://reviews.llvm.org/D126072
Follow-on to {D123276}. Now that we work with an internal
representation of compact unwind entries, we no longer need to template
our UnwindInfoSectionImpl code based on the pointer size of the target
architecture.
I've still kept the split between `UnwindInfoSectionImpl` and
`UnwindInfoSection`. I'd introduced that split in order to do type
erasure, but I think it's still useful to have in order to keep
`UnwindInfoSection`'s definition in the header file clean.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D123277
{D123302} got me looking deeper at `includeInSymtab`. I thought it was a
little odd that there were excluded (live) symbols for which
`includeInSymtab` was false; we shouldn't have so many different ways to
exclude a symbol. As such, this diff makes the `L`-prefixed-symbol
exclusion code use `includeInSymtab` too. (Note that as part of our
support for `__eh_frame`, we will also be excluding all `__eh_frame`
symbols from the symtab in a future diff.)
Another thing I noticed is that the `emitStabs` code never has to deal
with excluded symbols because `SymtabSection::finalize()` already
filters them out. As such, I've updated the comments and asserts from
{D123302} to reflect this.
Reviewed By: #lld-macho, thakis
Differential Revision: https://reviews.llvm.org/D123433
The previous implementation of UnwindInfoSection materialized
all the compact unwind entries & applied their relocations, then parsed
the resulting data to generate the final unwind info. This design had
some unfortunate conseqeuences: since relocations can only be applied
after their referents have had addresses assigned, operations that need
to happen before address assignment must contort themselves. (See
{D113582} and observe how this diff greatly simplifies it.)
Moreover, it made synthesizing new compact unwind entries awkward.
Handling PR50956 will require us to do this synthesis, and is the main
motivation behind this diff.
Previously, instead of generating a new CompactUnwindEntry directly, we
would have had to generate a ConcatInputSection with a number of
`Reloc`s that would then get "flattened" into a CompactUnwindEntry.
This diff introduces an internal representation of `CompactUnwindEntry`
(the former `CompactUnwindEntry` has been renamed to
`CompactUnwindLayout`). The new CompactUnwindEntry stores references to
its personality symbol and LSDA section directly, without the use of
`Reloc` structs.
In addition to being easier to work with, this diff also allows us to
handle unwind info whose personality symbols are located in sections
placed after the `__unwind_info`.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D123276
This makes it easier to pinpoint the source of the problem.
TODO: Have more relocation error messages make use of this
functionality.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D118798
Added some comments (particularly around finalize() and
finalizeContents()) as well as doing some rephrasing / grammar fixes for
existing comments.
Also did some minor style fixups, such as by putting methods together in
a class definition and having fields of similar types next to each
other.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D118714
In ld.lld, when an ObjFile/BitcodeFile is read in --start-lib state, the file is
given archive semantics. --end-lib closes the previous --start-lib. A build
system can use this feature as an alternative to archives. This patch ports
the feature to lld-macho.
--start-lib and --end-lib are positional, unlike usual ld64 options.
I think the slight drawback does not matter as (a) reusing option names
make build systems convenient (b) `--start-lib a.o b.o --end-lib` conveys more
information than an alternative design: `-objlib a.o -objlib b.o` because
--start-lib makes it clear which objects are in the same conceptual archive.
This provides flexibility (c) `-objlib`/`-filelist` interaction may be weird.
Close https://github.com/llvm/llvm-project/issues/52931
Reviewed By: #lld-macho, Jez Ng, oontvoo
Differential Revision: https://reviews.llvm.org/D116913
D116913 will add LazyObject. Rename LazySymbol to LazyArchive to avoid confusion
and mirror ELF.
Reviewed By: #lld-macho, Jez Ng
Differential Revision: https://reviews.llvm.org/D116914
Follup-up to D107533, where we replaced local syms with non-local.
It doesn't make sense to replace local symbol with lazy.
Differential Revision: https://reviews.llvm.org/D110040
In order to keep signal:noise high for the `__eh_frame` diff, I have teased-out the NFC changes and put them here.
Differential Revision: https://reviews.llvm.org/D114017
Similar to D113702, but for the LSDAs. Clang seems to emit all LSDA
relocs as section relocs, but ld -r can turn those relocs into symbol
ones.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D113721
Dedup'ing unwind info is tricky because each CUE contains a different
function address, if ICF operated naively and compared the entire
contents of each CUE, entries with identical unwind info but belonging
to different functions would never be considered identical. To work
around this problem, we slice away the function address before
performing ICF. We rely on `relocateCompactUnwind()` to correctly handle
these truncated input sections.
Here are the numbers before and after D109944, D109945, and this diff
were applied, as tested on my 3.2 GHz 16-Core Intel Xeon W:
Without any optimizations:
base diff difference (95% CI)
sys_time 0.849 ± 0.015 0.896 ± 0.012 [ +4.8% .. +6.2%]
user_time 3.357 ± 0.030 3.512 ± 0.023 [ +4.3% .. +5.0%]
wall_time 3.944 ± 0.039 4.032 ± 0.031 [ +1.8% .. +2.6%]
samples 40 38
With `-dead_strip`:
base diff difference (95% CI)
sys_time 0.847 ± 0.010 0.896 ± 0.012 [ +5.2% .. +6.5%]
user_time 3.377 ± 0.014 3.532 ± 0.015 [ +4.4% .. +4.8%]
wall_time 3.962 ± 0.024 4.060 ± 0.030 [ +2.1% .. +2.8%]
samples 47 30
With `-dead_strip` and `--icf=all`:
base diff difference (95% CI)
sys_time 0.935 ± 0.013 0.957 ± 0.018 [ +1.5% .. +3.2%]
user_time 3.472 ± 0.022 6.531 ± 0.046 [ +87.6% .. +88.7%]
wall_time 4.080 ± 0.040 5.329 ± 0.060 [ +30.0% .. +31.2%]
samples 37 30
Unsurprisingly, ICF is now a lot slower, likely due to the much larger
number of input sections it needs to process. But the rest of the
linker only suffers a mild slowdown.
Note that the compact-unwind-bad-reloc.s test was expanded because we
now handle the relocation for CUE's function address in a separate code
path from the rest of the CUE relocations. The extended test covers both
code paths.
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D109946
Previously, our unwind info finalization logic assumed that the LSDA
section referenced by `__compact_unwind` was already finalized before
`__TEXT,__unwind_info` itself. However, that assumption could be broken
by the use of `-rename_section` -- it could be (and is) used to move
`__gcc_except_tab` it into a different segment later in the file.
(__TEXT is always the first non-zerofill segment, so any rename
basically guarantees that the section will be ordered after
`__unwind_info`.)
To handle this case, we compare LSDA relocations instead of their final
values in `UnwindInfoSection::finalize()`, and we actually relocate
those LSDAs in `UnwindInfoSection::writeTo()`. In order to do this, we
need an easy way to track which Symbol a given CUE corresponds to. My
solution was to change our `cuPtrVector` into a vector of indices, with
each index used for both the symbols vector (`symbolsVec`) as well as
the CUE vector (`cuVector`).
This change seems perf neutral. Numbers for linking chromium_framework
on my 16 core Mac Pro:
base diff difference (95% CI)
sys_time 1.248 ± 0.025 1.245 ± 0.026 [ -1.3% .. +0.8%]
user_time 3.588 ± 0.045 3.587 ± 0.037 [ -0.6% .. +0.5%]
wall_time 4.605 ± 0.069 4.595 ± 0.069 [ -1.0% .. +0.5%]
samples 42 26
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D113582
This diff does away with `addEntriesForFunctionsWithoutUnwindInfo()`,
because `addSymbol()` can now determine which functions need those
entries.
While overhauling UnwindInfoSection, I also parallelized the relocation
of the contents of the CUEs. This somewhat offsets the time regression
from creating one InputSection per CUE (which was done in D109944).
Reviewed By: #lld-macho, oontvoo
Differential Revision: https://reviews.llvm.org/D109945
Compact unwind entries (CUEs) contain pointers to their respective
function symbols. However, during the link process, it's far more useful
to have pointers from the function symbol to the CUE than vice versa.
This diff adds that pointer in the form of `Defined::compactUnwind`.
In particular, when doing dead-stripping, we want to mark CUEs live when
their function symbol is live; and when doing ICF, we want to dedup
sections iff the symbols in that section have identical CUEs. In both
cases, we want to be able to locate the symbols within a given section,
as well as locate the CUEs belonging to those symbols. So this diff also
adds `InputSection::symbols`.
The ultimate goal of this refactor is to have ICF support dedup'ing
functions with unwind info, but that will be handled in subsequent
diffs. This diff focuses on simplifying `-dead_strip` --
`findFunctionsWithUnwindInfo` is no longer necessary, and
`Defined::isLive()` is now a lot simpler. Moreover, UnwindInfoSection no
longer has to check for dead CUEs -- we simply avoid adding them in the
first place.
Additionally, we now support stripping of dead LSDAs, which follows
quite naturally since `markLive()` can now reach them via the CUEs.
Reviewed By: #lld-macho, gkm
Differential Revision: https://reviews.llvm.org/D109944
Sometimes people intentionally re-define a dylib personlity symbol as a local defined symbol as a workaround to a ld -r bug.
As a result, we could see "too many personalities" to encode. This patch tries to handle this case by ignoring the local symbols entirely.
Differential Revision: https://reviews.llvm.org/D107533