intel/llvm - llvm - Gitea: Git with a cup of tea

intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-20 01:58:44 +08:00

Author	SHA1	Message	Date
Simi Pallipurath	f146763e07	Revert "Revert "[lld][Arm] Big Endian - Byte invariant support."" This reverts commit `d8851384c6`. Reason: Applied the fix for the Asan buildbot failures.	2023-06-22 16:10:18 +01:00
Mitch Phillips	cd116e0460	Revert "Revert "Revert "[LLD][ELF] Cortex-M Security Extensions (CMSE) Support""" This reverts commit `9246df7049`. Reason: This patch broke the UBSan buildbots. See more information in the original phabricator review: https://reviews.llvm.org/D139092	2023-06-22 14:33:57 +02:00
Amilendra Kodithuwakku	9246df7049	Revert "Revert "[LLD][ELF] Cortex-M Security Extensions (CMSE) Support"" This reverts commit `a685ddf1d1`. This relands Arm CMSE support (D139092) and fixes the GCC build bot errors.	2023-06-21 22:27:13 +01:00
Amilendra Kodithuwakku	a685ddf1d1	Revert "[LLD][ELF] Cortex-M Security Extensions (CMSE) Support" This reverts commit `c4fea39056`. I am reverting this for now until I figure out how to fix the build bot errors and warnings. Errors: llvm-project/lld/ELF/Arch/ARM.cpp:1300:29: error: expected primary-expression before ‘>’ token osec->writeHeaderTo<ELFT>(++sHdrs); Warnings: llvm-project/lld/ELF/Arch/ARM.cpp:1306:31: warning: left operand of comma operator has no effect [-Wunused-value]	2023-06-21 16:13:44 +01:00
Amilendra Kodithuwakku	c4fea39056	[LLD][ELF] Cortex-M Security Extensions (CMSE) Support This commit provides linker support for Cortex-M Security Extensions (CMSE). The specification for this feature can be found in ARM v8-M Security Extensions: Requirements on Development Tools. The linker synthesizes a security gateway veneer in a special section; `.gnu.sgstubs`, when it finds non-local symbols `__acle_se_<entry>` and `<entry>`, defined relative to the same text section and having the same address. The address of `<entry>` is retargeted to the starting address of the linker-synthesized security gateway veneer in section `.gnu.sgstubs`. In summary, the linker translates input: ``` .text entry: __acle_se_entry: [entry_code] ``` into: ``` .section .gnu.sgstubs entry: SG B.W __acle_se_entry .text __acle_se_entry: [entry_code] ``` If addresses of `__acle_se_<entry>` and `<entry>` are not equal, the linker considers that `<entry>` already defines a secure gateway veneer so does not synthesize one. If `--out-implib=<out.lib>` is specified, the linker writes the list of secure gateway veneers into a CMSE import library `<out.lib>`. The CMSE import library will have 3 sections: `.symtab`, `.strtab`, `.shstrtab`. For every secure gateway veneer <entry> at address `<addr>`, `.symtab` contains a `SHN_ABS` symbol `<entry>` with value `<addr>`. If `--in-implib=<in.lib>` is specified, the linker reads the existing CMSE import library `<in.lib>` and preserves the entry function addresses in the resulting executable and new import library. Reviewed By: MaskRay, peter.smith Differential Revision: https://reviews.llvm.org/D139092	2023-06-21 14:47:34 +01:00
Simi Pallipurath	d8851384c6	Revert "[lld][Arm] Big Endian - Byte invariant support." This reverts commit `8cf8956897`.	2023-06-20 17:27:44 +01:00
Simi Pallipurath	8cf8956897	[lld][Arm] Big Endian - Byte invariant support. Arm has BE8 big endian configuration called a byte-invariant(every byte has the same address on little and big-endian systems). When in BE8 mode: 1. Instructions are big-endian in relocatable objects but little-endian in executables and shared objects. 2. Data is big-endian. 3. The data encoding of the ELF file is ELFDATA2MSB. To support BE8 without an ABI break for relocatable objects,the linker takes on the responsibility of changing the endianness of instructions. At a high level the only difference between BE32 and BE8 in the linker is that for BE8: 1. The linker sets the flag EF_ARM_BE8 in the ELF header. 2. The linker endian reverses the instructions, but not data. This patch adds BE8 big endian support for Arm. To endian reverse the instructions we'll need access to the mapping symbols. Code sections can contain a mix of Arm, Thumb and literal data. We need to endian reverse Arm instructions as words, Thumb instructions as half-words and ignore literal data.The only way to find these transitions precisely is by using mapping symbols. The instruction reversal will need to take place after relocation. For Arm BE8 code sections (Section has SHF_EXECINSTR flag ) we inserted a step after relocation to endian reverse the instructions. The implementation strategy i have used here is to write all sections BE32 including SyntheticSections then endian reverse all code in InputSections via mapping symbols. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D150870	2023-06-20 14:08:21 +01:00
Andreu Carminati	e4118a7ac0	[ELF] Fix early overflow check in finalizeAddressDependentContent LLD terminates with errors when it detects overflows in the finalizeAddressDependentContent calculation. Although, sometimes, those errors are not really errors, but an intermediate result of an ongoing address calculation. If we continue the fixed-point algorithm we can converge to the correct result. This patch * Removes the verification inside the fixed point algorithm. * Calls checkMemoryRegions at the end. Reviewed By: peter.smith, MaskRay Differential Revision: https://reviews.llvm.org/D152170	2023-06-14 15:26:31 -07:00
Fangrui Song	698ac4aba5	[ELF] Add PT_RISCV_ATTRIBUTES program header Close https://github.com/llvm/llvm-project/issues/63084 Unlike AArch32, RISC-V defines PT_RISCV_ATTRIBUTES to include the SHT_RISCV_ATTRIBUTES section. There is no real-world use case yet. We place PT_RISCV_ATTRIBUTES after PT_GNU_STACK, similar to PT_ARM_EXIDX. GNU ld places PT_RISCV_ATTRIBUTES earlier, but the placement should not matter. Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/71 Reviewed By: asb Differential Revision: https://reviews.llvm.org/D152065	2023-06-06 13:06:21 -07:00
Fangrui Song	8d85c96e0e	[lld] StringRef::{starts,ends}with => {starts,ends}_with. NFC The latter form is now preferred to be similar to C++20 starts_with. This replacement also removes one function call when startswith is not inlined.	2023-06-05 14:36:19 -07:00
Fangrui Song	8aea109504	[ELF] x86-64: place .lrodata, .lbss, and .ldata away from code sections The x86-64 medium code model utilizes large data sections, namely .lrodata, .lbss, and .ldata (along with some variants of .ldata). There is a proposal to extend the use of large data sections to the large code model as well[1]. This patch aims to place large data sections away from code sections in order to alleviate relocation overflow pressure caused by code sections referencing regular data sections. ``` .lrodata .rodata .text # if --ro-segment, MAXPAGESIZE alignment RELRO # MAXPAGESIZE alignment .data # MAXPAGESIZE alignment .bss .ldata # MAXPAGESIZE alignment .lbss ``` In comparison to GNU ld, which places .lbss, .lrodata, and .ldata after .bss, we place .lrodata above .rodata to minimize the number of permission transitions in the memory image. While GNU ld places .lbss after .bss, the subsequent sections don't reuse the file offset bytes of BSS. Our approach is to place .ldata and .lbss after .bss and create a PT_LOAD segment for .bss to large data section transition in the absence of SECTIONS commands. assignFileOffsets ensures we insert an alignment instead of allocating space for BSS, and therefore we don't waste more than MAXPAGESIZE bytes. We have a missing optimization to prevent all waste, but implementing it would introduce complexity and likely be error-prone. GNU ld's layout introduces 2 more MAXPAGESIZE alignments while ours introduces just one. [1]: https://groups.google.com/g/x86-64-abi/c/jnQdJeabxiU "Large data sections for the large code model" With help from Arthur Eubanks. Co-authored-by: James Y Knight <jyknight@google.com> Reviewed By: aeubanks, tkoeppe Differential Revision: https://reviews.llvm.org/D150510	2023-05-25 07:35:38 -07:00
Petr Hosek	811cbfc262	[lld][ELF] Implement –print-memory-usage This option was introduced in GNU ld in https://sourceware.org/legacy-ml/binutils/2015-06/msg00086.html and is often used in embedded development. This change implements this option in LLD matching the GNU ld output verbatim. Differential Revision: https://reviews.llvm.org/D150644	2023-05-25 07:14:18 +00:00
Fangrui Song	2473b1af08	[ELF] Simplify getSectionRank and rewrite comments Replace some RF_ flags with integer literals. Rewrite the isWrite/isExec block to make the code block order reflect the section order. Rewrite some imprecise comments. This is NFC, if we don't count invalid cases such as non-writable TLS and non-writable RELRO.	2023-05-12 23:58:39 -07:00
Fangrui Song	a2648bc4ea	[ELF] Remove remnant ranks for PPC64 ELFv1 special sections	2023-05-12 23:21:14 -07:00
Peter Smith	7a2000ac53	[LLD][ARM] Handle .ARM.exidx sections at non-zero output sec offset Embedded systems that do not use an ELF loader locate the .ARM.exidx exception table via linker defined __exidx_start and __exidx_end rather than use the PT_ARM_EXIDX program header. This means that some linker scripts such as the picolibc C library's linker script, do not have the .ARM.exidx sections at offset 0 in the OutputSection. For example: .except_unordered : { . = ALIGN(8); PROVIDE(__exidx_start = .); (.ARM.exidx) PROVIDE(__exidx_end = .); } >flash AT>flash :text This is within the specification of Arm exception tables, and is handled correctly by ld.bfd. This patch has 2 parts. The first updates the writing of the data of the .ARM.exidx SyntheticSection to account for a non-zero OutputSection offset. The second part makes the PT_ARM_EXIDX program header generation a special case so that it covers only the SyntheticSection and not the parent OutputSection. While not strictly necessary for programs locating the exception tables via the symbols it may cause ELF utilities that locate the exception tables via the PT_ARM_EXIDX program header to fail. This does not seem to be the case for GNU and LLVM readelf which seems to look for the SHT_ARM_EXIDX section. Differential Revision: https://reviews.llvm.org/D148033	2023-04-14 10:09:46 +01:00
Craig Topper	85444794cd	[lld][RISCV] Implement GP relaxation for R_RISCV_HI20/R_RISCV_LO12_I/R_RISCV_LO12_S. This implements support for relaxing these relocations to use the GP register to compute addresses of globals in the .sdata and .sbss sections. This feature is off by default and must be enabled by passing --relax-gp to the linker. The GP register might not always be the "global pointer". It can be used for other purposes. See discussion here https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/371 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D143673	2023-04-13 10:52:15 -07:00
Peter Smith	7ef47dd54e	[LLD] Increase thunk pass limit In issue 61250 https://github.com/llvm/llvm-project/issues/61250 there is an example of a program that takes 17 passes to converge, which is 2 more than the current limit of 15. Analysis of the program shows a particular section that is made up of many roughly thunk sized chunks of code ending in a call to a symbol that needs a thunk. Due to the positioning of the section, at each pass a subset of the calls go out of range of their original thunk, needing a new one created, which then pushes more thunks out of range. This process eventually stops after 17 passes. This patch is the simplest fix for the problem, which is to increase the pass limit. I've chosen to double it which should comfortably account for future cases like this, while only taking a few more seconds to reach the limit in case of non-convergence. As discussed in the issue, there could be some additional work done to limit thunk reuse, this would potentially increase the number of thunks in the program but should speed up convergence. Differential Revision: https://reviews.llvm.org/D145786	2023-03-13 10:04:39 +00:00
Fangrui Song	07d0a4fb53	[ELF][RISCV] Make .sdata and .sbss closer GNU ld's internal linker scripts for RISC-V place .sdata and .sbss close. This makes GP relaxation more profitable. While here, when .sbss is present, set `__bss_start` to the start of .sbss instead of .bss, to match GNU ld. Note: GNU ld's internal linker scripts have symbol assignments and input section descriptions which are not relevant for modern systems. We only add things that make sense. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D145118	2023-03-07 10:37:04 -08:00
Peter Collingbourne	82c2fcffc2	ELF: Respect MEMORY command when specified without a SECTIONS command. We were previously ignoring the MEMORY command unless SECTIONS was also specified. Fix it. Differential Revision: https://reviews.llvm.org/D145132	2023-03-01 22:40:32 -08:00
Kazu Hirata	55e2cd1609	Use llvm::count{lr}_{zero,one} (NFC)	2023-01-28 12:41:20 -08:00
serge-sans-paille	984b800a03	Move from llvm::makeArrayRef to ArrayRef deduction guides - last part This is a follow-up to https://reviews.llvm.org/D140896, split into several parts as it touches a lot of files. Differential Revision: https://reviews.llvm.org/D141298	2023-01-10 11:47:43 +01:00
Gregory Alfonso	d22f050e15	Remove redundant .c_str() and .get() calls Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D139485	2022-12-18 00:33:53 +00:00
Fangrui Song	d98c172712	[ELF] Fix TimeTraceScope for "Finalize .eh_frame"	2022-12-03 18:00:51 +00:00
Guillaume Chatelet	08e2a76381	[lld][NFC] rename ELF alignment into addralign	2022-12-01 16:20:12 +00:00
Fangrui Song	910204cfbd	[ELF] createSyntheticSections: simplify config->relocatable. NFC We can add .riscv.attributes synthetic section here in the future.	2022-11-22 20:09:15 -08:00
Fangrui Song	8610cb0460	[ELF] -r: don't define _TLS_MODULE_BASE_ _TLS_MODULE_BASE_ is supposed to be defined by the final link. Defining it in a relocatable link may render the final link value incorrect. GNU ld i386/x86-64 have the same issue: https://sourceware.org/bugzilla/show_bug.cgi?id=29820	2022-11-22 12:59:45 -08:00
Fangrui Song	d9ef5574d4	[ELF] -r: don't define __global_pointer$ This symbol is supposed to be defined by the final executable link. The new behavor matches GNU ld.	2022-11-22 12:37:51 -08:00
Fangrui Song	9015e41f0f	[ELF] addRelIpltSymbols: make it explicit some passes are for non-relocatable links. NFC and prepare for __global_pointer$ and _TLS_MODULE_BASE_ fix.	2022-11-22 11:38:57 -08:00
Fangrui Song	2bf5d86422	[ELF] Change rawData to content() and data() to contentMaybeDecompress() Clarify data() which may trigger decompression and make it feasible to refactor the member variable rawData.	2022-11-20 22:43:22 +00:00
Kazu Hirata	d42357007d	[lld] Use llvm::reverse (NFC)	2022-11-06 08:39:41 -08:00
Fangrui Song	14f996dca8	[ELF] Move inputSections/ehInputSections into Ctx. NFC	2022-10-16 00:49:48 -07:00
Fangrui Song	1837333dac	[ELF] --check-sections: allow address 0xffffffff for ELFCLASS32 Fix https://github.com/llvm/llvm-project/issues/58101	2022-10-01 15:37:07 -07:00
Fangrui Song	9c626d4a0d	[ELF] Remove symtab indirection. NFC Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr indirection.	2022-10-01 14:46:49 -07:00
Fangrui Song	34fa860048	[ELF] Remove ctx indirection. NFC Add LLVM_LIBRARY_VISIBILITY to remove unneeded GOT and unique_ptr indirection. We can move other global variables into ctx without indirection concern. In the long term we may consider passing Ctx as a parameter to various functions and eliminate global state as much as possible and then remove `Ctx::reset`.	2022-10-01 12:06:33 -07:00
Nico Weber	cd7ffa2e52	lld: Include name of output file in "failed to write output" diag Differential Revision: https://reviews.llvm.org/D133110	2022-09-14 14:57:47 -04:00
Fangrui Song	12607f57da	[ELF] Cache compute_thread_count. NFC	2022-09-12 19:09:08 -07:00
Fangrui Song	e6aebff674	[ELF] Parallelize relocation scanning * Change `Symbol::flags` to a `std::atomic<uint16_t>` * Add `llvm::parallel::threadIndex` as a thread-local non-negative integer * Add `relocsVec` to part.relaDyn and part.relrDyn so that relative relocations can be added without a mutex * Arbitrarily change -z nocombreloc to move relative relocations to the end. Disable parallelism for deterministic output. MIPS and PPC64 use global states for relocation scanning. Keep serial scanning. Speed-up with mimalloc and --threads=8 on an Intel Skylake machine: * clang (Release): 1.27x as fast * clang (Debug): 1.06x as fast * chrome (default): 1.05x as fast * scylladb (default): 1.04x as fast Speed-up with glibc malloc and --threads=16 on a ThunderX2 (AArch64): * clang (Release): 1.31x as fast * scylladb (default): 1.06x as fast Reviewed By: andrewng Differential Revision: https://reviews.llvm.org/D133003	2022-09-12 12:56:35 -07:00
Fangrui Song	94ca041905	[ELF] Move scanRelocations into Relocations.cpp. NFC	2022-09-04 21:31:18 -07:00
Fangrui Song	3b4d800911	[ELF] Parallelize writes of different OutputSections We currently process one OutputSection at a time and for each OutputSection write contained input sections in parallel. This strategy does not leverage multi-threading well. Instead, parallelize writes of different OutputSections. The default TaskSize for parallelFor often leads to inferior sharding. We prepare the task in the caller instead. * Move llvm::parallel::detail::TaskGroup to llvm::parallel::TaskGroup * Add llvm::parallel::TaskGroup::execute. * Change writeSections to declare TaskGroup and pass it to writeTo. Speed-up with --threads=8: * clang -DCMAKE_BUILD_TYPE=Release: 1.11x as fast * clang -DCMAKE_BUILD_TYPE=Debug: 1.10x as fast * chrome -DCMAKE_BUILD_TYPE=Release: 1.04x as fast * scylladb build/release: 1.09x as fast On M1, many benchmarks are a small fraction of a percentage faster. Mozilla showed the largest difference with the patch being about 1.03x as fast. Differential Revision: https://reviews.llvm.org/D131247	2022-08-24 09:40:03 -07:00
Sam Clegg	2cd4cd9a32	[lld][ELF] Rename SymbolTable::symbols() to SymbolTable::getSymbols(). NFC This change renames this method match its original name and the name used in the wasm linker. Back in `d8f8abbd4a` the ELF SymbolTable method `getSymbols()` was replaced with `forEachSymbol`. Then in `a2fc964417` `forEachSymbol` was replaced with a `llvm::iterator_range`. Then in `e9262edf0d` we came full circle and the `llvm::iterator_range` was replaced with a `symbols()` accessor that was identical the original `getSymbols()`. `getSymbols` also matches the name used elsewhere in the ELF linker as well as in both COFF and wasm backend (e.g. `InputFiles.h` and `SyntheticSections.h`) Differential Revision: https://reviews.llvm.org/D130787	2022-08-19 14:56:08 -07:00
Alex Brachet	dbd04b853b	[ELF] Support --package-metadata This was recently introduced in GNU linkers and it makes sense for ld.lld to have the same support. This implementation omits checking if the input string is valid json to reduce size bloat. Differential Revision: https://reviews.llvm.org/D131439	2022-08-08 21:31:58 +00:00
Fangrui Song	c09d323599	[ELF] Move EhInputSection out of inputSections. NFC inputSections temporarily contains EhInputSection objects mainly for combineEhSections. Place EhInputSection objects into a new vector ehInputSections instead of inputSections.	2022-07-31 11:58:08 -07:00
Fangrui Song	0a28cfdff5	[ELF] Simplify getRankProximity. NFC	2022-07-30 16:32:42 -07:00
Fangrui Song	2e2d5304f0	[ELF] Move combineEhSections from Writer to SyntheticSections. NFC This not only places the function in the right place, but also allows inlining addSection.	2022-07-29 00:47:30 -07:00
Fangrui Song	c72973608d	[ELF] Combine EhInputSection removal and MergeInputSection removal. NFC	2022-07-29 00:39:57 -07:00
Fangrui Song	8d4b11b4f1	[ELF] Remove redundant isa<InputSection>(sec). NFC combineEhSections has been called to remove EhInputSection.	2022-07-29 00:30:52 -07:00
Fangrui Song	85cfd91723	[ELF] Optimize some non-constant alignTo with alignToPowerOf2. NFC My x86-64 lld executable is 2KiB smaller. .eh_frame writing gets faster as there were lots of divisions.	2022-07-24 11:20:49 -07:00
Fangrui Song	51b9e099d5	[ELF] Reword --no-allow-shlib-undefined diagnostic Use a format more similar to unresolved references from regular object files. It's probably easier to read for people who are less familiar with the linker diagnostics. Reviewed By: ikudrin Differential Revision: https://reviews.llvm.org/D129790	2022-07-15 01:29:58 -07:00
YongKang Zhu	2324c2e3c3	[LLD] Two tweaks to symbol ordering scheme When `--symbol-ordering-file` is specified, the linker today will always put hot contributions in the middle of cold ones when targeting RISC machine, so to minimize the chances that branch thunks need be generated for hot code calling into cold code. This is not necessary when user specifies an ordering of read-only data (vs. function) symbols, or when output section is small such that no branch thunk would ever be required. The latter is common for mobile apps. For example, among all the native ARM64 libraries in Facebook Instagram App for Android, 80% of them have text section smaller than 64KB and the largest text section seen is less than 8MB, well below the distance that a BRANCH26 can reach. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D128382	2022-07-12 11:34:17 -07:00
Fangrui Song	6611d58f5b	[ELF] Relax R_RISCV_ALIGN Alternative to D125036. Implement R_RISCV_ALIGN relaxation so that we can handle -mrelax object files (i.e. -mno-relax is no longer needed) and creates a framework for future relaxation. `relaxAux` is placed in a union with InputSectionBase::jumpInstrMod, storing auxiliary information for relaxation. In the first pass, `relaxAux` is allocated. The main data structure is `relocDeltas`: when referencing `relocations[i]`, the actual offset is `r_offset - (i ? relocDeltas[i-1] : 0)`. `relaxOnce` performs one relaxation pass. It computes `relocDeltas` for all text section. Then, adjust st_value/st_size for symbols relative to this section based on `SymbolAnchor`. `bytesDropped` is set so that `assignAddresses` knows that the size has changed. Run `relaxOnce` in the `finalizeAddressDependentContent` loop to wait for convergence of text sections and other address dependent sections (e.g. SHT_RELR). Note: extrating `relaxOnce` into a separate loop works for many cases but has issues in some linker script edge cases. After convergence, compute section contents: shrink the NOP sequence of each R_RISCV_ALIGN as appropriate. Instead of deleting bytes, we run a sequence of memcpy on the content delimitered by relocation locations. For R_RISCV_ALIGN let the next memcpy skip the desired number of bytes. Section content computation is parallelizable, but let's ensure the implementation is mature before optimizations. Technically we can save a copy if we interleave some code with `OutputSection::writeTo`, but let's not pollute the generic code (we don't have templated relocation resolving, so using conditions can impose overhead to non-RISCV.) Tested: `make ARCH=riscv CROSS_COMPILE=riscv64-linux-gnu- LLVM=1 defconfig all` built Linux kernel using -mrelax is bootable. FreeBSD RISCV64 system using -mrelax is bootable. bash/curl/firefox/libevent/vim/tmux using -mrelax works. Differential Revision: https://reviews.llvm.org/D127581	2022-07-07 10:16:09 -07:00

1 2 3 4 5 ...

1753 Commits