Commit Graph

7796 Commits

Author SHA1 Message Date
Pengying Xu
6c705d1136 [lld][elf] Skip BP ordering input sections with null data (#149265) 2025-07-18 08:01:16 -07:00
Fangrui Song
3cb0c7f45b MC: Rework .reloc directive and fix the offset when it evaluates to a constant
* Fix `.reloc constant` to mean section_symbol+constant instead of
  .+constant . The initial .reloc support from MIPS incorrectly
  interpreted the offset.
* Delay the evaluation of the offset expression until after
  MCAssembler::layout, deleting a lot of code working with MCFragment.
* Delete many FIXMEs from https://reviews.llvm.org/D79625
* Some lld/ELF/Arch/LoongArch.cpp relaxation tests rely on .reloc .,
  R_LARCH_ALIGN generating ALIGN relocations at specific locations.
  Sort the relocations.
2025-07-17 00:36:11 -07:00
Brian Cain
d2bcc51a5a [LLD] Merge .hexagon.attributes sections (#148098)
Merge the attributes of object files being linked together. The
`.hexagon.attributes` section can be used by loaders and analysis tools.
This is similar to the .riscv.attributes, introduced in
8a900f2438 /
https://reviews.llvm.org/D138550.
2025-07-14 22:36:05 -05:00
Parth
923a3cc160 [LLD] Fix crash on parsing ':ALIGN' in linker script (#146723)
The linker was crashing due to stack overflow when parsing ':ALIGN' in
an output section description. This commit fixes the linker script
parser so that the crash does not happen.

The root cause of the stack overflow is how we parse expressions
(readExpr) in linker script and the behavior of ScriptLexer::expect(...)
utility. ScriptLexer::expect does not do anything if errors have already
been encountered during linker script parsing. In particular, it never
increments the current token position in the script file, even if the
current token is the same as the expected token. This causes an infinite
call cycle on parsing an expression such as '(4096)' when an error has
already been encountered.

readExpr() calls readPrimary()
readPrimary() calls readParenExpr()

readParenExpr():

  expect("("); // no-op, current token still points to '('
  Expression *E = readExpr(); // The cycle continues...
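
The cycle described above can be reproduced with a toy parser. This is a minimal sketch, not lld's code: `ToyParser`, `kDepthCap`, and the `bailOnError` guard are hypothetical names, and the guard is only one plausible way to break the cycle, not necessarily the fix this commit applies. A depth cap stands in for the real stack overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr int kDepthCap = 1000; // stand-in for running out of stack

struct ToyParser {
  std::vector<std::string> tokens;
  std::size_t pos = 0;
  int errorCount = 0;       // non-zero once any error has been reported
  bool bailOnError = false; // hypothetical guard, not necessarily lld's fix
  int depth = 0;
  bool hitDepthCap = false;

  // Mirrors the described behavior: after an error, expect() does nothing,
  // so the current token is never consumed.
  void expect(const std::string &tok) {
    if (errorCount)
      return;
    if (pos < tokens.size() && tokens[pos] == tok)
      ++pos;
    else
      ++errorCount;
  }

  long readExpr() { return readPrimary(); }

  long readPrimary() {
    if (pos >= tokens.size()) {
      ++errorCount;
      return 0;
    }
    if (tokens[pos] == "(")
      return readParenExpr();
    return std::stol(tokens[pos++]); // numeric literal
  }

  long readParenExpr() {
    if (bailOnError && errorCount)
      return 0; // stop descending once parsing has already failed
    if (++depth > kDepthCap) { // a real parser would overflow the stack here
      hitDepthCap = true;
      --depth;
      return 0;
    }
    expect("("); // no-op after an error: the token stays "("
    long value = readExpr();
    expect(")");
    --depth;
    return value;
  }
};

// Parses "(4096)" with or without a previously recorded error.
ToyParser parseParenExpr(bool priorError, bool bailOnError) {
  ToyParser p;
  p.tokens = {"(", "4096", ")"};
  p.errorCount = priorError ? 1 : 0;
  p.bailOnError = bailOnError;
  p.readExpr();
  assert(p.depth == 0 && "every call frame should have unwound");
  return p;
}
```

With a prior error and no guard, the toy parser keeps re-entering `readParenExpr()` on the same `(` token until the cap is reached; with the guard, or with no prior error, it returns normally.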

Closes #146722

Signed-off-by: Parth Arora <partaror@qti.qualcomm.com>
2025-07-06 10:22:50 -07:00
bd1976bris
3b4e79398d [DTLTO][LLD][ELF] Add support for Integrated Distributed ThinLTO (#142757)
This patch introduces support for Integrated Distributed ThinLTO (DTLTO)
in ELF LLD.

DTLTO enables the distribution of ThinLTO backend compilations via
external distribution systems, such as Incredibuild, during the
traditional link step: https://llvm.org/docs/DTLTO.html.

It is expected that users will invoke DTLTO through the compiler driver
(e.g., Clang) rather than calling LLD directly. A Clang-side interface
for DTLTO will be added in a follow-up patch.

Note: Bitcode members of archives (thin or non-thin) are not currently
supported. This will be addressed in a future change. As a consequence
of this lack of support, this patch is not sufficient to allow for
self-hosting an LLVM build with DTLTO. Theoretically,
--start-lib/--end-lib could be used instead of archives in a self-host
build. However, it's unclear how --start-lib/--end-lib can be easily
used with the LLVM build system.

Testing:
- ELF LLD `lit` test coverage has been added, using a mock distributor
  to avoid requiring Clang.
- Cross-project `lit` tests cover integration with Clang.

For the design discussion of the DTLTO feature, see: #126654.
2025-07-02 16:12:27 +01:00
Zhaoxin Yang
2c1900860c [lld][LoongArch] Support TLSDESC GD/LD to IE/LE (#123715)
Support TLSDESC to initial-exec or local-exec optimizations. Introduce a
new hook RE_LOONGARCH_RELAX_TLS_GD_TO_IE_PAGE_PC and use the existing
R_RELAX_TLS_GD_TO_IE_ABS to support TLSDESC => IE, and use the existing
R_RELAX_TLS_GD_TO_LE to support TLSDESC => LE.
    
In normal or medium code model, there are two forms of code sequences:
* pcalau12i  $a0, %desc_pc_hi20(sym_desc)
* addi.d     $a0, $a0, %desc_pc_lo12(sym_desc)
* ld.d       $ra, $a0, %desc_ld(sym_desc)
* jirl       $ra, $ra, %desc_call(sym_desc)
------
* pcaddi     $a0, %desc_pcrel_20(sym_desc)
* ld.d       $ra, $a0, %desc_ld(sym_desc)
* jirl       $ra, $ra, %desc_call(sym_desc)
    
Convert to IE:
* pcalau12i $a0, %ie_pc_hi20(sym_ie)
* ld.[wd]   $a0, $a0, %ie_pc_lo12(sym_ie)

Convert to LE:
* lu12i.w $a0, %le_hi20(sym_le) # le_hi20 != 0, otherwise NOP
* ori $a0, src, %le_lo12(sym_le) # le_hi20 != 0, src = $a0, otherwise src = $zero

For simplicity, both tlsdescToIe and tlsdescToLe always convert the
preceding instructions to NOPs, because both forms of the code sequence
(corresponding to the relocation combinations
R_LARCH_TLS_DESC_PC_HI20+R_LARCH_TLS_DESC_PC_LO12 and
R_LARCH_TLS_DESC_PCREL20_S2) are processed in the same way.

TODO: When relaxation is enabled, redundant NOPs can be removed. This
will be implemented in a future patch.

Note: No form of the TLSDESC code sequence should appear interleaved in
the normal, medium or extreme code model: compilers do not generate such
code and lld does not support it. This is thanks to the guard in
PostRASchedulerList.cpp in llvm.
```
Calls are not scheduling boundaries before register allocation,
but post-ra we don't gain anything by scheduling across calls
since we don't need to worry about register pressure.
```
2025-07-02 16:09:51 +08:00
Mingjie Xu
6323541a2a [LLD][ELF] Skip non-SHF_ALLOC sections when checking max VA and max VA difference in relaxOnce() (#145863)
For non-SHF_ALLOC sections, sh_addr is set to 0.
Skip sections without SHF_ALLOC flag, so `minVA` will not be set to 0
with non-SHF_ALLOC sections, and the size of non-SHF_ALLOC sections will
not contribute to `maxVA`.
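
A minimal sketch of the filtering described above, assuming a simplified section record. `ToySection` and `allocatedVaRange` are hypothetical names for illustration and do not reflect the actual code in relaxOnce(); only the `SHF_ALLOC` value (0x2) comes from the ELF specification.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <limits>
#include <utility>
#include <vector>

constexpr std::uint64_t kShfAlloc = 0x2; // value of SHF_ALLOC in the ELF spec

struct ToySection {
  std::uint64_t flags;
  std::uint64_t addr; // sh_addr; 0 for non-SHF_ALLOC sections
  std::uint64_t size;
};

// Returns {minVA, maxVA} computed over allocatable sections only.
std::pair<std::uint64_t, std::uint64_t>
allocatedVaRange(const std::vector<ToySection> &sections) {
  std::uint64_t minVA = std::numeric_limits<std::uint64_t>::max();
  std::uint64_t maxVA = 0;
  for (const ToySection &sec : sections) {
    if (!(sec.flags & kShfAlloc))
      continue; // address 0 and the size of such sections must not count
    assert(sec.addr + sec.size >= sec.addr && "address range wraps around");
    minVA = std::min(minVA, sec.addr);
    maxVA = std::max(maxVA, sec.addr + sec.size);
  }
  return {minVA, maxVA};
}
```

Without the `continue`, a large non-allocated section (for example debug info) would pull `minVA` down to 0 and could inflate `maxVA`, making the address span look far larger than it is.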
2025-07-01 09:02:06 +08:00
Peter Collingbourne
494a74882b Reapply "ELF: Add branch-to-branch optimization."
Fixed assertion failure when reading .eh_frame sections, and added
.eh_frame sections to tests.

This reverts commit 1e95349dbe.

Original commit message follows:

When code calls a function which then immediately tail calls another
function there is no need to go via the intermediate function. By
branching directly to the target function we reduce the program's working
set for a slight increase in runtime performance.
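
The idea can be sketched as a lookup that redirects a branch past functions that do nothing but tail call. This is an illustration only: `TailCallMap` and `resolveBranchTarget` are hypothetical, the real optimization works on relocations and instruction bytes rather than names, and following more than one hop is an assumption of this sketch.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Maps a function to the function it immediately tail calls.
using TailCallMap = std::unordered_map<std::string, std::string>;

// Returns the function a branch to `target` can be redirected to.
std::string resolveBranchTarget(const TailCallMap &tailCalls,
                                std::string target) {
  assert(!target.empty() && "expected a symbol name");
  std::unordered_set<std::string> seen;
  seen.insert(target);
  for (;;) {
    auto it = tailCalls.find(target);
    if (it == tailCalls.end())
      return target; // not a pure tail-call stub: branch here
    if (!seen.insert(it->second).second)
      return target; // cycle of stubs: stop instead of looping forever
    target = it->second;
  }
}
```

For a CFI jump table entry that only branches to the real function, a call to the entry resolves directly to the real function.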

Normally it is relatively uncommon to have functions that just tail call
another function, but with LLVM control flow integrity we have jump tables
that replace the function itself as the canonical address. As a result,
when a function address is taken and called directly, for example after
a compiler optimization resolves the indirect call, or if code built
without control flow integrity calls the function, the call will go via
the jump table.

The impact of this optimization was measured using a large internal
Google benchmark. The results were as follows:

CFI enabled:  +0.1% ± 0.05% queries per second
CFI disabled: +0.01% queries per second [not statistically significant]

The optimization is enabled by default at -O2 but may also be enabled
or disabled individually with --{,no-}branch-to-branch.

This optimization is implemented for AArch64 and X86_64 only.

lld's runtime performance (real execution time) after adding this
optimization was measured using firefox-x64 from lld-speed-test [1]
with ldflags "-O2 -S" on an Apple M2 Ultra. The results are as follows:

```
    N           Min           Max        Median           Avg        Stddev
x 512     1.2264546     1.3481076     1.2970261     1.2965788   0.018620888
+ 512     1.2561196     1.3839965     1.3214632     1.3209327   0.019443971
Difference at 95.0% confidence
        0.0243538 +/- 0.00233202
        1.87831% +/- 0.179859%
        (Student's t, pooled s = 0.0190369)
```

[1] https://discourse.llvm.org/t/improving-the-reproducibility-of-linker-benchmarking/86057

Reviewers: zmodem, MaskRay

Reviewed By: MaskRay

Pull Request: https://github.com/llvm/llvm-project/pull/145579
2025-06-24 22:16:18 -07:00
Hans Wennborg
1e95349dbe Revert "ELF: Add branch-to-branch optimization."
This caused assertion failures in applyBranchToBranchOpt():

  llvm/include/llvm/Support/Casting.h:578:
  decltype(auto) llvm::cast(From*)
  [with To = lld::elf::InputSection; From = lld::elf::InputSectionBase]:
  Assertion `isa<To>(Val) && "cast<Ty>() argument of incompatible type!"' failed.

See comment on the PR (https://github.com/llvm/llvm-project/pull/138366)

This reverts commit 491b82a5ec.

This also reverts the follow-up "[lld] Use llvm::partition_point (NFC) (#145209)"

This reverts commit 2ac293f5ac.
2025-06-23 13:26:02 +02:00
Kazu Hirata
2ac293f5ac [lld] Use llvm::partition_point (NFC) (#145209) 2025-06-22 06:30:10 -07:00
Peter Collingbourne
491b82a5ec ELF: Add branch-to-branch optimization.
When code calls a function which then immediately tail calls another
function there is no need to go via the intermediate function. By
branching directly to the target function we reduce the program's working
set for a slight increase in runtime performance.

Normally it is relatively uncommon to have functions that just tail call
another function, but with LLVM control flow integrity we have jump tables
that replace the function itself as the canonical address. As a result,
when a function address is taken and called directly, for example after
a compiler optimization resolves the indirect call, or if code built
without control flow integrity calls the function, the call will go via
the jump table.

The impact of this optimization was measured using a large internal
Google benchmark. The results were as follows:

CFI enabled:  +0.1% ± 0.05% queries per second
CFI disabled: +0.01% queries per second [not statistically significant]

The optimization is enabled by default at -O2 but may also be enabled
or disabled individually with --{,no-}branch-to-branch.

This optimization is implemented for AArch64 and X86_64 only.

lld's runtime performance (real execution time) after adding this
optimization was measured using firefox-x64 from lld-speed-test [1]
with ldflags "-O2 -S" on an Apple M2 Ultra. The results are as follows:

```
    N           Min           Max        Median           Avg        Stddev
x 512     1.2264546     1.3481076     1.2970261     1.2965788   0.018620888
+ 512     1.2561196     1.3839965     1.3214632     1.3209327   0.019443971
Difference at 95.0% confidence
	0.0243538 +/- 0.00233202
	1.87831% +/- 0.179859%
	(Student's t, pooled s = 0.0190369)
```

[1] https://discourse.llvm.org/t/improving-the-reproducibility-of-linker-benchmarking/86057

Pull Request: https://github.com/llvm/llvm-project/pull/138366
2025-06-20 13:16:24 -07:00
Peter Smith
eb0f1dc00e [LLD][ELF] Include offset when adding Thunk symbols (#144995)
Include the offset of a thunk in the ThunkSection when adding symbols.

At Thunk creation time the offset is set to 0 as we don't know where in
the ThunkSection the Thunk will end up. The symbol values are updated by
the setOffset() call in assignOffsets().

When we transform a thunk from a short to a long, we sometimes add a
mapping symbol. At this point the offset of the thunk is non zero and we
need to account for that when defining the symbol, as the setOffset()
call subtracts the offset before adding the new one back in.
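
The arithmetic can be illustrated with a small model. `ToyThunk` is a hypothetical stand-in, not lld's Thunk class; it only shows why a symbol added after the first setOffset() call must already include the current offset.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

struct ToyThunk {
  std::uint64_t offset = 0; // offset of the thunk within its ThunkSection
  std::vector<std::uint64_t> symbolValues;

  // Define a symbol at `valueWithinThunk` bytes from the thunk start.
  // Including `offset` here is the point of the fix.
  void addSymbol(std::uint64_t valueWithinThunk) {
    symbolValues.push_back(offset + valueWithinThunk);
  }

  // Rebase all symbols: subtract the old offset, then add the new one.
  void setOffset(std::uint64_t newOffset) {
    for (std::uint64_t &value : symbolValues) {
      assert(value >= offset && "symbol was added without the thunk offset");
      value = value - offset + newOffset;
    }
    offset = newOffset;
  }
};
```

If `addSymbol` ignored `offset`, a mapping symbol added while the thunk sits at offset 16 would be rebased from the wrong starting point and land 16 bytes too low after the next `setOffset` call.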

To test this, a second thunk that is converted to a long thunk was added
to aarch64-thunk-bit-multipass. This second thunk is given a non-zero
offset from the start of the ThunkSection so that, without accounting for
the offset, we can observe the mapping symbol being put in the wrong place.

fixes: https://github.com/llvm/llvm-project/issues/142326
2025-06-20 10:11:42 +01:00
Ming-Yi Lai
9adde28df7 [LLD][ELF][RISCV][Zicfilp][Zicfiss] Support -z zicfilp= and -z zicfiss= to force enable/disable features (#143114)
+ If `-z zicfilp=implicit` or option not specified, the output would
have the ZICFILP feature enabled/disabled based on input objects
+ If `-z zicfilp=<never|unlabeled|func-sig>`, the output would have
ZICFILP feature forced <off|on to the "unlabeled" scheme|on to the
"func-sig" scheme>
+ If `-z zicfiss=implicit` or option not specified, the output would
have the ZICFISS feature enabled/disabled based on input objects
+ If `-z zicfiss=<never|always>`, the output would have the ZICFISS
feature forced <off|on>
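
The option semantics listed above can be sketched as two small resolver functions. The enum and function names are hypothetical and the error handling (throwing on an unknown value) is an assumption; only the accepted option values come from the commit message.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

enum class ZicfilpMode { Off, Unlabeled, FuncSig };

// `fromInputs` is what the input objects would imply on their own.
ZicfilpMode resolveZicfilp(const std::string &optionValue,
                           ZicfilpMode fromInputs) {
  if (optionValue.empty() || optionValue == "implicit")
    return fromInputs; // option absent or implicit: follow the inputs
  if (optionValue == "never")
    return ZicfilpMode::Off;
  if (optionValue == "unlabeled")
    return ZicfilpMode::Unlabeled;
  if (optionValue == "func-sig")
    return ZicfilpMode::FuncSig;
  throw std::invalid_argument("unknown -z zicfilp= value: " + optionValue);
}

bool resolveZicfiss(const std::string &optionValue, bool fromInputs) {
  if (optionValue.empty() || optionValue == "implicit")
    return fromInputs;
  if (optionValue == "never")
    return false;
  if (optionValue == "always")
    return true;
  throw std::invalid_argument("unknown -z zicfiss= value: " + optionValue);
}
```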
2025-06-16 11:18:41 +08:00
Kazu Hirata
d78eec864c [lld] Use range-based for loops (NFC) (#144251) 2025-06-15 10:32:45 -07:00
SivanShani-Arm
5762491e2a [lld] Refactor storage of PAuth ABI core info (#141920)
Previously, the AArch64 PAuth ABI core values were stored as an
ArrayRef<uint8_t>, introducing unnecessary indirection.

This patch replaces the ArrayRef with two explicit uint64_t fields:
aarch64PauthAbiPlatform and aarch64PauthAbiVersion. This simplifies the
representation and improves readability.

No functional change intended, aside from improved error messages.
2025-06-13 11:02:33 +01:00
Fangrui Song
07dad4ecba [ELF] Implement -z dynamic-undefined-weak
The behavior of an undefined weak reference is implementation defined.
For static -no-pie linking, dynamic relocations are generally avoided (except
IRELATIVE). -shared linking generally emits dynamic relocations.

Dynamic -no-pie linking and -pie allow flexibility. Changes adjust the
behavior for better consistency and simpler internal representation,
e.g. https://reviews.llvm.org/D63003 https://reviews.llvm.org/D105164
(generalized to undefined non-weak in
2fcaa00d1e).

GNU ld introduced -z [no]dynamic-undefined-weak option to fine-tune the
behavior. (The option is not very effective with -no-pie, e.g. on
x86-64, `ld.bfd a.o s.so -z dynamic-undefined-weak` generates
R_X86_64_NONE relocations instead of GLOB_DAT/JUMP_SLOT)

This patch implements -z [no]dynamic-undefined-weak option.
The effects are summarized as follows:

* Static -no-pie: no-op
* Dynamic -no-pie: nodynamic-undefined-weak suppresses GLOB_DAT/JUMP_SLOT
* Static -pie: dynamic-undefined-weak generates ABS/GLOB_DAT/JUMP_SLOT.
  https://discourse.llvm.org/t/lld-weak-undefined-symbols-in-vdso-only/86749
* Dynamic -pie: nodynamic-undefined-weak suppresses ABS/GLOB_DAT/JUMP_SLOT

The -pie behavior likely stays stable while -no-pie (`!ctx.arg.isPic` in
`isStaticLinkTimeConstant`) behavior will likely change in the future.
The current default value of ctx.arg.zDynamicUndefined is selected to
prevent behavior changes.

Pull Request: https://github.com/llvm/llvm-project/pull/143831
2025-06-12 19:50:41 -07:00
Arthur Eubanks
46085d8f83 [lld/ELF][x86-64] Place large executable sections at the edges of binary (#70358)
So that when mixing small and large text, large text stays out of the
way of the rest of the binary.

Place large RX sections at the beginning rather than at the end so that
with `--no-rosegment`, the large text and rodata share a single PT_LOAD
segment. Place large RWX sections at the end to keep writable and
readonly sections separate.

Clang started emitting the large section flag for `.ltext` sections in
#73037.
2025-06-12 11:41:16 -07:00
Fangrui Song
2fcaa00d1e [ELF] -z undefs: handle relocations referencing undefined non-weak like undefined weak
* Merge the special case into isStaticLinkTimeConstant
* Generalize isUndefWeak to isUndefined. undefined non-weak is an error
  case. We choose to be general, which also brings us in line with GNU ld.
2025-06-11 20:37:15 -07:00
Kazu Hirata
c1d21f4434 [lld] Use std::tie to implement comparison operators (NFC) (#143726)
std::tie facilitates lexicographical comparisons through std::tuple's
built-in operator< and operator>.
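
A short illustration of the pattern, using a hypothetical `ToyEntry` type rather than any type touched by this commit:

```cpp
#include <cassert>
#include <string>
#include <tuple>

struct ToyEntry {
  int priority;
  int order;
  std::string name;
};

// std::tie builds tuples of references, so no members are copied; the
// tuple comparison is lexicographic: priority, then order, then name.
inline bool operator<(const ToyEntry &a, const ToyEntry &b) {
  return std::tie(a.priority, a.order, a.name) <
         std::tie(b.priority, b.order, b.name);
}

inline bool operator==(const ToyEntry &a, const ToyEntry &b) {
  return std::tie(a.priority, a.order, a.name) ==
         std::tie(b.priority, b.order, b.name);
}
```

This replaces hand-written chains of `if (a.x != b.x) return a.x < b.x;` and keeps the field order in one place per operator.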
2025-06-11 12:50:19 -07:00
Alexander Ziaee
44a7ecd1d7 [doc] Use ISO nomenclature for 1024 byte units (#133148)
Increase specificity by using the correct unit sizes. KBytes is an
abbreviation for kB, 1000 bytes, and the hardware industry as well as
several operating systems have now switched to using 1000 byte kBs.

If this change is acceptable, sometimes GitHub mangles merges to use the
original email of the account. $dayjob asks contributions have my work
email. Thanks!
2025-06-11 13:27:23 +02:00
Fangrui Song
8957e64a20 [ELF,RISCV] Fix oscillation due to call relaxation
The new test (derived from riscv32 openssl/test/cmp_msg_test.c) revealed
oscillation in two R_RISCV_CALL_PLT jumps:

- First jump (~2^11 bytes away): alternated between 4 and 8 bytes.
- Second jump (~2^20 bytes away): alternated between 2 and 8 bytes.

The issue is not related to alignment. In 2019, GNU ld addressed a
similar problem by reducing the relaxation allowance for cross-section
relaxation (https://sourceware.org/bugzilla/show_bug.cgi?id=25181).
This approach would result in a suboptimal layout for the tight range
tested by riscv-relax-call.s.

This patch stabilizes the process by preventing `remove` increment after
a few passes, similar to integrated assembler's fragment relaxation.
(For the Android bit reproduce, `pass < 2` leads to non-optimal layout
while `pass < 3` and `pass < 4` output is identical.)

Fix https://github.com/llvm/llvm-project/issues/113838
Possibly fix https://github.com/llvm/llvm-project/issues/123248 (inputs
are bitcode, subject to ever-changing code generation, not reproducible)

Pull Request: https://github.com/llvm/llvm-project/pull/142899
2025-06-10 09:28:39 -07:00
Jorge Gorbe Moya
d099d953ef [lld] Add missing includes. (#143453)
Some inline methods in these headers require a complete type but the
corresponding include was missing.
2025-06-09 15:52:37 -07:00
Kazu Hirata
0f5a78516a [lld] Use llvm::has_single_bit (NFC) (#143393) 2025-06-09 12:46:07 -07:00
Kazu Hirata
9ce8dde54c [lld] Use std::none_of (NFC) (#143318) 2025-06-08 16:18:23 -07:00
Ming-Yi Lai
1728405b13 [LLD][ELF][RISCV][Zicfilp] Handle .note.gnu.property sections for Zicfilp/Zicfiss features (#127193)
+ When all relocatable files contain a `.note.gnu.property` section
(with `NT_GNU_PROPERTY_TYPE_0` notes) which contains a
`GNU_PROPERTY_RISCV_FEATURE_1_AND` property in which the
`GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_UNLABELED`/`GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_FUNC_SIG`/`GNU_PROPERTY_RISCV_FEATURE_1_CFI_SS`
bit is set:
  + The output file will contain a `.note.gnu.property` section with the
bit set
  + A `PT_GNU_PROPERTY` program header is created to encompass the
`.note.gnu.property` section
+ If `-z zicfilp-unlabeled-report=[warning|error]`/`-z
zicfilp-func-sig-report=[warning|error]`/`-z
zicfiss-report=[warning|error]` is specified, the linker will report a
warning or error for any relocatable file lacking the feature bit

RISC-V Zicfilp/Zicfiss features indicate their adoptions as bits in the
`.note.gnu.property` section of ELF files. This patch enables LLD to
process the information correctly by parsing, checking and merging the
bits from all input ELF files and writing the merged result to the
output ELF file.

These feature bits are encoded as a mask in each input ELF file and are
intended to be "and"-ed together to check that all input files support a
particular feature.
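
A minimal sketch of the "and"-merge and of the per-file report, assuming a simplified input record. `ToyInput`, both function names, and the two bit constants are hypothetical; the bit positions are placeholders for illustration and are not taken from the psABI.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Placeholder bit positions for illustration only.
constexpr std::uint32_t kToyLpUnlabeled = 1u << 0;
constexpr std::uint32_t kToyShadowStack = 1u << 1;

struct ToyInput {
  std::string name;
  std::uint32_t featureBits; // 0 if the file has no property note
};

// A bit survives only if every input file has it set.
std::uint32_t mergeFeatureBits(const std::vector<ToyInput> &inputs) {
  if (inputs.empty())
    return 0;
  std::uint32_t merged = ~std::uint32_t{0};
  for (const ToyInput &in : inputs)
    merged &= in.featureBits;
  return merged;
}

// Files a "-report" style option would name for the given bit.
std::vector<std::string> filesLacking(const std::vector<ToyInput> &inputs,
                                      std::uint32_t bit) {
  std::vector<std::string> names;
  for (const ToyInput &in : inputs)
    if (!(in.featureBits & bit))
      names.push_back(in.name);
  return names;
}
```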

For RISC-V Zicfilp features, there are 2 conflicting bits allocated:
`GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_UNLABELED` and
`GNU_PROPERTY_RISCV_FEATURE_1_CFI_LP_FUNC_SIG`. They represent the
adoption of the forward edge protection of control-flow integrity with
the "unlabeled" or "func-sig" policy. Since these 2 policies conflict
with each other, these 2 bits also conflict with each other. This patch
adds the `-z zicfilp-unlabeled-report=[none|warning|error]` and `-z
zicfilp-func-sig-report=[none|warning|error]` commandline options to
make LLD report files that do not have the expected bits toggled on.

For RISC-V Zicfiss feature, there's only one bit allocated:
`GNU_PROPERTY_RISCV_FEATURE_1_CFI_SS`. This bit indicates that the ELF
file supports Zicfiss-based shadow stack. This patch adds the `-z
zicfiss-report=[none|warning|error]` commandline option to make LLD
report files that do not have the expected bit toggled on.

The adoption of the `.note.gnu.property` section for RISC-V targets can
be found in the psABI PR
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/417>
(`CFI_LP_UNLABELED` and `CFI_SS`) and PR
<https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/434>
(`CFI_LP_FUNC_SIG`).
2025-06-06 10:32:17 +08:00
Jessica Clarke
6be4670dfb [ELF] Consistently use gotEntrySize for GOT entries (#142064)
d62413452f ("[lld][X86] Restore gotEntrySize.") (re-)introduced
gotEntrySize and used it for various GOT calculations, but was not
exhaustive (nor consistent; Symbol::getGotOffset was modified but not
GotSection::finalizeContents, so we undercompute the size, on top of
computing the wrong offsets for TLS), and since then even more uses have
been added that use wordsize instead of gotEntrySize (presumably due to
looking at the existing incorrect ones).

This doesn't really matter upstream, as the only architecture where the
two differ is X32, and from looking at the code it's not properly
supported (e.g. TLS relaxation assumes LP64 sequences), but downstream
in CHERI LLVM it does matter, as CHERI's pointers are more than just an
integer address (a machine word).

Note this ignores the special MipsGotSection: on MIPS, wordsize and
gotEntrySize are the same, and CHERI-MIPS is no longer supported
downstream (even when it was, we didn't reuse that implementation).
2025-06-04 01:30:47 +01:00
SharonXSharon
79cc728b77 [lld][macho] Strip .__uniq. and .llvm. hashes in -order_file (#140670)
```
/// Symbols can be appended with "(.__uniq.xxxx)?.llvm.yyyy" where "xxxx" and
/// "yyyy" are numbers that could change between builds. We need to use the root
/// symbol name before this suffix so these symbols can be matched with profiles
/// which may have different suffixes.
```
Just like what we are doing in BP,
https://github.com/llvm/llvm-project/blob/main/lld/MachO/BPSectionOrderer.cpp#L127

the patch removes the suffixes when parsing the order file and getting
the symbol priority to have a better symbol match.
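
The suffix stripping can be sketched with plain string operations. This is an illustration under the naming pattern quoted above, not the code in BPSectionOrderer.cpp or the order-file parser; `dropFromLast` and `rootSymbolName` are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

// Removes everything from the last occurrence of `marker` onwards.
std::string dropFromLast(std::string name, const std::string &marker) {
  std::string::size_type pos = name.rfind(marker);
  if (pos != std::string::npos)
    name.erase(pos);
  return name;
}

// Strips an optional ".llvm.yyyy" suffix, then an optional ".__uniq.xxxx".
std::string rootSymbolName(const std::string &name) {
  return dropFromLast(dropFromLast(name, ".llvm."), ".__uniq.");
}
```

An order-file entry and a symbol in the object file then match whenever their root names are equal, even if the numeric suffixes differ between builds.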

---------

Co-authored-by: Sharon Xu <sharonxu@fb.com>
Co-authored-by: Ellis Hoag <ellis.sparky.hoag@gmail.com>
2025-06-03 10:12:36 -07:00
Fangrui Song
5859863bab [ELF] Postpone ASSERT error
assignAddresses is executed more than once. When an ASSERT expression
evaluates to zero, we should only report an error for the last
assignAddresses. Make a change similar to #66854 and #96361.
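
A minimal sketch of the idea, assuming the layout loop knows which pass is the last one. The function name and the way the final pass is detected are hypothetical; how lld tracks this is not described in the message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// valuePerPass[i] is the value an ASSERT operand takes on layout pass i.
// The modeled script assertion is ASSERT(value <= limit, "...").
std::vector<std::string>
checkAssertAcrossPasses(const std::vector<std::uint64_t> &valuePerPass,
                        std::uint64_t limit) {
  std::vector<std::string> errors;
  for (std::size_t pass = 0; pass < valuePerPass.size(); ++pass) {
    bool isFinalPass = pass + 1 == valuePerPass.size();
    bool holds = valuePerPass[pass] <= limit;
    // Intermediate passes may see addresses that later change, so a
    // failure is only reported for the final layout.
    if (!holds && isFinalPass)
      errors.push_back("ASSERT failed in final layout");
  }
  return errors;
}
```

An assertion that fails on an early pass but holds once addresses settle produces no error, while one that fails on the final pass is still reported exactly once.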

This change might help https://github.com/ClangBuiltLinux/linux/issues/2094
2025-05-28 20:56:13 -07:00
Kazu Hirata
19f00c0570 [lld] Remove unused includes (NFC) (#141421) 2025-05-25 10:55:39 -07:00
Fangrui Song
0edc8b59ab [ELF] Error if a section address is smaller than image base
When using `-no-pie` without a `SECTIONS` command, the linker uses the
target's default image base. If `-Ttext=` or `--section-start` specifies
an output section address below this base, the result is likely
unintended.

- With `--no-rosegment`, the PT_LOAD segment covering the ELF header cannot include `.text` if `.text`'s address is too low, causing an `error: output file too large`.
- With default `--rosegment`:
  - If a read-only section (e.g., `.rodata`) exists, a similar `error: output file too large` occurs.
  - Without read-only sections, the PT_LOAD segment covering the ELF header and program headers includes no sections, which is unusual and likely undesired. This also causes non-ascending PT_LOAD `p_vaddr` values related to the PT_LOAD that overlaps with PT_PHDR (#138584).

To prevent these issues, report an error if a section address is below
the image base and suggest `--image-base`. This check also applies when
`--image-base` is explicitly set but is skipped when a `SECTIONS`
command is used.
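
The check can be sketched as follows. `ToyOutputSection` and `sectionsBelowImageBase` are hypothetical names, and the sketch returns the offending section names rather than reproducing lld's diagnostic text.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct ToyOutputSection {
  std::string name;
  std::uint64_t addr;
};

// Returns the names of sections placed below the image base. The check
// is skipped entirely when the script has a SECTIONS command.
std::vector<std::string>
sectionsBelowImageBase(const std::vector<ToyOutputSection> &sections,
                       std::uint64_t imageBase, bool hasSectionsCommand) {
  std::vector<std::string> offenders;
  if (hasSectionsCommand)
    return offenders;
  for (const ToyOutputSection &sec : sections)
    if (sec.addr < imageBase)
      offenders.push_back(sec.name);
  return offenders;
}
```

For example, with an image base of 0x200000 and `-Ttext=0x1000`, `.text` would be flagged, and passing a lower `--image-base` would clear the error.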

Pull Request: https://github.com/llvm/llvm-project/pull/140187
2025-05-21 09:19:47 -07:00
Kazu Hirata
575f66cf5e [lld] Drop const from a return type (NFC) (#140667) 2025-05-19 21:37:48 -07:00
Kazu Hirata
91a7085faf [lld] Use llvm::stable_sort (NFC) (#140488) 2025-05-19 06:20:00 -07:00
Fangrui Song
369c409348 Support,lld: Rename misnamed F_no_mmap to F_mmap
`F_no_mmap` introduced by https://reviews.llvm.org/D69294 is misnamed.
It ought to be `F_mmap`.

When the output is a regular file or does not exist,
`--no-mmap-output-file` is the default. Relands #134787 by fixing the
lld option default. Note: changing the default to --mmap-output-file
would likely fail on llvm-clang-x86_64-sie-win
(https://lab.llvm.org/buildbot/#/builders/46/builds/14847)

Pull Request: https://github.com/llvm/llvm-project/pull/139836
2025-05-14 21:00:49 -07:00
Hans Wennborg
fd3fecfc09 Revert "[lld] Merge equivalent symbols found during ICF (#134342)"
The change would also merge *non-equivalent* symbols under some circumstances,
see comment with a reproducer on the PR.

> Fixes a correctness issue for AArch64 when ADRP and LDR instructions are
> outlined in separate sections and sections are fed to ICF for
> deduplication.
>
> See test case (based on
> https://github.com/llvm/llvm-project/issues/129122) for details. All
> rodata.* sections are folded into a single section with ICF. This leads
> to all f2_* function sections getting folded into one (as their
> relocation target symbols g* belong to .rodata.g* sections that have
> already been folded into one). Since relocations still refer original g*
> symbols, we end up creating duplicate GOT entry for all such symbols.
> This PR addresses that by tracking such folded symbols and create one
> GOT entry for all such symbols.
>
> Fixes https://github.com/llvm/llvm-project/issues/129122
>
> Co-authored by: @jyknight

This reverts commit 8389d6fad7.
2025-05-13 10:57:46 +02:00
Fangrui Song
bdcabc4862 [ELF] writeTrapInstr: Don't decrease p_memsz
When the last PT_LOAD segment is executable and includes BSS sections,
its p_memsz may exceed the aligned p_filesz. This change ensures p_memsz
is not reduced in such cases (e.g. --omagic).
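
A simplified sketch of the size computation, assuming the padded size is the file size rounded up to the page size. It ignores `p_offset`, and the function names are hypothetical rather than taken from writeTrapInstr.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

std::uint64_t alignToPow2(std::uint64_t value, std::uint64_t align) {
  assert(align != 0 && (align & (align - 1)) == 0 &&
         "alignment must be a power of two");
  return (value + align - 1) & ~(align - 1);
}

// New p_memsz of the last executable PT_LOAD after its tail is padded.
// Taking the maximum keeps a p_memsz that already covers BSS intact.
std::uint64_t memSizeAfterTrapFill(std::uint64_t memsz, std::uint64_t filesz,
                                   std::uint64_t pageSize) {
  return std::max(memsz, alignToPow2(filesz, pageSize));
}
```

With `memsz = 0x3000`, `filesz = 0x800` and a 0x1000 page, assigning the aligned file size directly would shrink the segment to 0x1000 and cut off the BSS; the maximum keeps 0x3000.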

In addition, disable this behavior when a SECTIONS command is specified.

Refined behavior introduced in https://reviews.llvm.org/D37369 (2017).

The -z separate-loadable-segments --omagic test adds coverage for the
option combination, even if it might be impractical.

Pull Request: https://github.com/llvm/llvm-project/pull/139207
2025-05-10 10:04:01 +08:00
David Sankel
652ab98008 [lld][NFC] Fix minor typo in docs (#138898) 2025-05-08 19:12:58 +01:00
Kazu Hirata
d3e792cec1 [lld] Remove unused local variables (NFC) (#138470) 2025-05-04 14:14:34 -07:00
Pranav Kant
46838e1a09 [lld] NFC. Rename function to better reflect its implementation (#136625) 2025-04-30 18:17:35 -07:00
Pranav Kant
8389d6fad7 [lld] Merge equivalent symbols found during ICF (#134342)
Fixes a correctness issue for AArch64 when ADRP and LDR instructions are
outlined in separate sections and sections are fed to ICF for
deduplication.

See test case (based on
https://github.com/llvm/llvm-project/issues/129122) for details. All
rodata.* sections are folded into a single section with ICF. This leads
to all f2_* function sections getting folded into one (as their
relocation target symbols g* belong to .rodata.g* sections that have
already been folded into one). Since relocations still refer to the original g*
symbols, we end up creating duplicate GOT entries for all such symbols.
This PR addresses that by tracking such folded symbols and creating one
GOT entry for all such symbols.

Fixes https://github.com/llvm/llvm-project/issues/129122

Co-authored by: @jyknight
2025-04-21 15:17:03 -07:00
Kazu Hirata
f347a06591 [lld] Use llvm::unique (NFC) (#136453) 2025-04-19 13:35:51 -07:00
Daniil Kovalev
8fdebff69d [PAC][ThinLTO] Fix auth key for GOT entries of function symbols (#131467)
Symtab is first filled with the data from the bitcode file, and all
undefined symbols except TLS ones are `STT_NOTYPE`. Since auth key for a
signed GOT entry depends on the symbol type being `STT_FUNC` or not, we
need to update the symtab after the bitcode is compiled to an ELF object
and update symbol types for function symbols. This patch implements the
described behavior.
2025-04-19 17:40:50 +03:00
Csanád Hajdú
10f75b8ef7 [LLD][ELF][AArch64] Mark .plt and .iplt with PURECODE flag (#134798)
Mark the synthetic sections `.plt` and `.iplt` with the
`SHF_AARCH64_PURECODE` section flag, allowing them to be placed in an
executable-only segment.
2025-04-17 16:26:11 +02:00
Zhaoxin Yang
8a351f1f2e [lld][LoongArch] Support relaxation during IE to LE conversion (#123702)
Complements https://github.com/llvm/llvm-project/pull/123680. When
relaxation is enabled, remove redundant NOPs.
2025-04-11 16:55:59 +08:00
Peter Collingbourne
f53eb88d25 ELF: Remove lock from MTE global relocation handling code.
This lock is unnecessary because we can add the relocations to
shards and let them be sorted later.

Reviewers: smithp35, fmayer, MaskRay

Reviewed By: MaskRay

Pull Request: https://github.com/llvm/llvm-project/pull/135123
2025-04-10 10:38:59 -07:00
Douglas
156e2532ed Revert "Rename F_no_mmap to F_mmap" (#134924)
Reverts llvm/llvm-project#134787

Causes the LIT test `lld\test\ELF\link-open-file.test` to fail on the
`llvm-clang-x86_64-sie-win` Build Bot. First instance of the failure
observed in: https://lab.llvm.org/buildbot/#/builders/46/builds/14847
2025-04-08 13:20:42 -07:00
Dmitry Chestnykh
d6c8e8908d Rename F_no_mmap to F_mmap (#134787)
The `F_no_mmap` flag was introduced by
6814232429
2025-04-08 19:22:03 +03:00
Csanád Hajdú
2c1bdd4a08 [LLD][ELF] Allow merging XO and RX sections, and add --[no-]xosegment flag (#132412)
Following from the discussion in #132224, this seems like the best
approach to deal with a mix of XO and RX output sections in the same
binary. This change will also simplify the implementation of the
PURECODE section flag for AArch64.

To control this behaviour, the `--[no-]xosegment` flag is added to LLD
(similarly to `--[no-]rosegment`), which determines whether to allow
merging XO and RX sections in the same segment. The default value is
`--no-xosegment`, which is a breaking change compared to the previous
behaviour.

Release notes are also added, since this will be a breaking change.
2025-04-08 08:47:51 +02:00
Zhaoxin Yang
bd84d66700 [lld][LoongArch] Convert TLS IE to LE in the normal or medium code model (#123680)
Original code sequence:
* pcalau12i $a0, %ie_pc_hi20(sym)
* ld.d      $a0, $a0, %ie_pc_lo12(sym)

The converted code sequence is as follows:
* lu12i.w   $a0, %le_hi20(sym)       # le_hi20 != 0, otherwise NOP
* ori       $a0, src, %le_lo12(sym)  # le_hi20 != 0, src = $a0,
                                     # otherwise,    src = $zero

TODO: When relaxation is enabled, redundant NOPs can be removed. This
will be implemented in a future patch.

Note: In the normal or medium code model, the original code sequence
with relocations may be interleaved, because the converted code sequence
calculates the absolute offset. However, in the extreme code model, the
first four instructions with relocations must appear consecutively so
that the code model can be identified.
2025-04-07 19:58:48 +08:00
Daniel Thornburgh
e84b57dfbf [LLD][ELF] Support OVERLAY NOCROSSREFS (#133807)
This allows NOCROSSREFS to be specified in OVERLAY linker script
descriptions. This is a particularly useful part of the OVERLAY syntax,
since it's very rarely possible for one overlay section to sensibly
reference another.

Closes #128790
2025-04-02 09:25:18 -07:00
Peter Smith
e47d3a3088 [LLD][AArch64] Increase alignment of AArch64AbsLongThunk to 8 (#133738)
This permits an AArch64AbsLongThunk to be used in an environment where
unaligned accesses are disabled.

The AArch64AbsLongThunk does a load of an 8-byte address. When unaligned
accesses are disabled this address must be 8-byte aligned.

The vast majority of AArch64 systems will have unaligned accesses
enabled in userspace. However, after a reset, before the MMU has been
enabled, all memory accesses are to "device" memory, which requires
aligned accesses. In systems with multi-stage boot loaders a thunk may
be required to a later stage before the MMU has been enabled.

As we only want to increase the alignment when the ldr is used we delay
the increase in thunk alignment until we know we are going to write an
ldr. We also need to account for the ThunkSection alignment increase
when this happens.

In some of the test updates, particularly those whose CHECK lines are
shared with position-independent thunks, it was easier to ensure that the
thunks started at an 8-byte aligned address in all cases.
2025-04-01 09:49:27 +01:00