intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-24 08:30:34 +08:00

Author	SHA1	Message	Date
Maksim Panchenko	53bd88c7fe	[BOLT] Refactor reading of debug line info Summary: Match BinaryFunction to a DWARFUnit based on the unit's address ranges skipping the parsing of DIEs. (cherry picked from FBD24269325)	2020-10-12 21:04:42 -07:00
Maksim Panchenko	9f15b9f3c2	[BOLT] Fix debug line info in lite relocation mode Summary: Emit line info for functions that were not emitted in relocation mode. (cherry picked from FBD24267650)	2020-10-12 20:16:59 -07:00
Maksim Panchenko	247b4181a3	[BOLT] Emit symbol size for functions Summary: On targets that support it, emit size of the emitted function symbol. At the moment there's no use for the size except that it is visible in a temporary .o file symbol table. (cherry picked from FBD24246177)	2020-10-12 13:02:50 -07:00
Maksim Panchenko	0465d952cc	[BOLT] Refactor PatchEntries pass Summary: Use injected functions with fixed addresses to patch original function entries. (cherry picked from FBD24133890)	2020-10-09 16:06:27 -07:00
Maksim Panchenko	db4642d0a6	[BOLT] Support -hot-text in lite mode Summary: Update special symbol references in functions that are not emitted. (cherry picked from FBD22120995)	2020-06-18 11:10:41 -07:00
Maksim Panchenko	0ce0bce9e7	[BOLT] Support for lite mode with relocations Summary: Add '-lite' support for relocations for improved processing time, memory consumption, and more resilient processing of binaries with embedded assembly code. In lite relocation mode, BOLT will skip full processing of functions without a profile. It will run scanExternalRefs() on such functions to discover external references and to create internal relocations to update references to optimized functions. Note that we could have relied on the compiler/linker to provide relocations for function references. However, there's no assurance that all such references are reported. E.g., the compiler can resolve inter-procedural references internally, leaving no relocations for the linker. The scan process takes under 10 seconds per 100MB of code on modern hardware. It's a reasonable overhead to live with considering the flexibility it provides. If BOLT fails to scan or disassemble a function, e.g., due to a data object embedded in code, or an unsupported instruction, it enables a patching mode to guarantee that the failed function will call optimized/moved versions of functions. The patching happens at original function entry points. The '-skip=<func1,func2,...>' option can now be used to skip processing of arbitrary functions in the relocation mode. With '-use-old-text' or '-strict' we require all functions to be processed. As such, these options are incompatible with the '-lite' option, and the '-skip' option will only disable optimizations of listed functions, not their disassembly and emission. (cherry picked from FBD22040717)	2020-06-15 00:15:47 -07:00
Xun Li	9bd7161529	Adding automatic huge page support Summary: This patch enables automated hugify for Bolt. When running Bolt against a binary with -hugify specified, Bolt will inject a call to a runtime library function at the entry of the binary. The runtime library calls madvise to map the hot code region into a 2M huge page. We support both new kernels with THP support and old kernels. For kernels with THP support we simply make a madvise call, while for old kernels, we first copy the code out, remap the memory with huge pages, and then copy the code back. With this change, we no longer need to manually call into hugify_self and precompile it with --hot-text. Instead, we could simply combine the --hugify option with existing optimizations, and at runtime it will automatically move hot code into 2M pages. Some details around the changes made: 1. Add a command line option to support --hugify. --hugify will automatically turn on --hot-text to get the proper hot code symbols. However, running with both --hugify and --hot-text is not allowed, since --hot-text is used on binaries that have a precompiled call to hugify_self, which contradicts the purpose of --hugify. 2. Moved the common utility functions out of instr.cpp to common.h, which will also be used by hugify.cpp. Added a few new system call definitions. 3. Added a new class that inherits RuntimeLibrary, and implemented the necessary emit and link logic for hugify. 4. Added a simple test for hugify. (cherry picked from FBD21384529)	2020-05-02 11:14:38 -07:00
Xun Li	00892a5fd0	Refactor runtime library Summary: As we are adding more types of runtime libraries, it would be better to move the runtime library out of RewriteInstance so that it could grow separately. This also requires splitting the current implementation of Instrumentation.cpp into two separate pieces, one as a normal pass, one as the runtime library. The Instrumentation pass would pass over the generated data to the runtime library, which will use it to emit the binary and perform linking. This patch does the following: 1. Turn Instrumentation class into an optimization pass. Register the pass in the pass manager instead of in RewriteInstance. 2. Split all the data that is generated by Instrumentation and needed by the runtime library into a separate data structure called InstrumentationSummary. At the creation of the Instrumentation pass, we create an instance of such a data structure, which will be moved over to the runtime at the end of the pass. 3. Added a runtime library member to BinaryContext. Set the member at the end of the Instrumentation pass. 4. In BinaryEmitter, make BinaryContext also emit the runtime library binary. 5. Created a base class RuntimeLibrary, that defines the interface of a runtime library, along with a few common helper functions. 6. Created InstrumentationRuntimeLibrary which inherits from RuntimeLibrary, that does all the work (mostly copied over) for emit and linking. 7. Added a new directory called RuntimeLibs, and put all the runtime library related files into it. (cherry picked from FBD21694762)	2020-05-21 14:28:47 -07:00
Maksim Panchenko	689447bf10	[BOLT] Change .debug_line emission for non-simple functions Summary: We use a special routine to emit line info for functions that we do not overwrite. The resulting DWARF was not quite efficient as we were advancing addresses using larger-than-needed opcodes. Since there were only a few functions that we didn't emit/overwrite, it was not a big issue. However, in lite mode the majority of functions are not overwritten and as a result, the inefficiency in debug line encoding got exposed and binaries were getting larger-than-expected .debug_line sections. Fix it by using more conventional line table opcodes for address advancing. (cherry picked from FBD21423074)	2020-05-05 23:56:50 -07:00
Maksim Panchenko	04c5d4fcab	[BOLT] Introduce isIgnored() function attribute Summary: Whenever a function is not meant for processing, e.g. when the user requests to optimize only a subset of functions, mark the function as ignored. Use this attribute instead of opts::shouldProcess(). (cherry picked from FBD21374806)	2020-05-03 13:54:45 -07:00
Maksim Panchenko	5296b6d12a	[BOLT] Change symbol handling for secondary function entries Summary: Some functions could be called at an address inside their function body. Typically, these functions are written in assembly as C/C++ does not have a multi-entry function concept. The addresses inside a function body that could be referenced from outside are called secondary entry points. In BOLT we support processing functions with secondary/multiple entry points. We used to mark basic blocks representing those entry points with a special flag. There was only one problem - each basic block has exactly one MCSymbol associated with it, and for the most efficient processing we prefer that symbol to be local/temporary. However, in certain scenarios, e.g. when running in non-relocation mode, we need the entry symbol to be global/non-temporary. We could create global symbols for secondary points ahead of time when the entry point is marked in the symbol table. But not all such entries are properly marked. This means that potentially we could discover an entry point only after disassembling the code that references it, and it could happen after a local label was already created at the same location together with all its references. Replacing the local symbol and updating the references turned out to be an error-prone process. This diff takes a different approach. All basic blocks are created with permanently local symbols. Whenever there's a need to add a secondary entry point, we create an extra global symbol or use an existing one at that location. Containing BinaryFunction maps a local symbol of a basic block to the global symbol representing a secondary entry point. This way we can tell if the basic block is a secondary entry point, and we emit both symbols for all secondary entry points. Since secondary entry points are quite rare, the overhead of this approach is minimal. 
Note that the same location could be referenced via local symbol from inside a function and via global entry point symbol from outside. This is true for both primary and secondary entry points. (cherry picked from FBD21150193)	2020-04-19 22:29:54 -07:00
Maksim Panchenko	23edb3ed9c	[BOLT] Option to control .text alignment Summary: Add option `-align-text=<n>` to control .text alignment within a segment. Set to page size by default. (cherry picked from FBD21120063)	2020-04-19 15:02:50 -07:00
Maksim Panchenko	1f3e351a9c	[BOLT] Refactor code and data emission code Summary: Consolidate code and data emission code in ELF-independent BinaryEmitter. The high-level interface includes only two functions emitBinaryContext() and emitFunctionBody() used by RewriteInstance and BinaryContext respectively. (cherry picked from FBD20332901)	2020-03-06 15:06:37 -08:00
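
The approach described in commit 5296b6d12a (basic blocks keep permanently local symbols, and the containing function maps a block's local symbol to the global symbol of a secondary entry point) can be illustrated with a minimal sketch. This is not BOLT's code: the class name `FunctionSketch`, its methods, and the use of plain strings in place of MCSymbol objects are all assumptions made for illustration.

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a function object; NOT BOLT's actual BinaryFunction.
// Symbols are modeled as plain strings instead of MCSymbol pointers.
class FunctionSketch {
public:
  // Register a secondary entry point at the block labeled LocalLabel.
  // If a global symbol is already recorded for that block, it is reused
  // and GlobalName is ignored; otherwise GlobalName is recorded.
  const std::string &addSecondaryEntryPoint(const std::string &LocalLabel,
                                            const std::string &GlobalName) {
    auto Result = SecondaryEntryPoints.try_emplace(LocalLabel, GlobalName);
    return Result.first->second;
  }

  // A block is a secondary entry point iff its local label has a mapping.
  bool isSecondaryEntryPoint(const std::string &LocalLabel) const {
    return SecondaryEntryPoints.find(LocalLabel) != SecondaryEntryPoints.end();
  }

  // Return the global symbol for a block, if it is a secondary entry point.
  std::optional<std::string>
  getGlobalSymbol(const std::string &LocalLabel) const {
    auto It = SecondaryEntryPoints.find(LocalLabel);
    if (It == SecondaryEntryPoints.end())
      return std::nullopt;
    return It->second;
  }

private:
  // Local block label -> global entry symbol.
  std::unordered_map<std::string, std::string> SecondaryEntryPoints;
};
```

Because the block's local label never changes, references created before the entry point was discovered stay valid; only the side table grows.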

13 Commits
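
The THP path described in commit 9bd7161529 (a single madvise call over the hot code region, using 2M pages) can be sketched roughly as follows. This is an assumption-laden illustration, not the BOLT runtime library: the names `HugeRange`, `alignToHugePages`, and `adviseHugePages` are hypothetical, and rounding the hot range outward to 2 MiB boundaries is an assumed policy that the commit message does not specify. The copy-out/remap/copy-back path for older kernels is not shown.

```cpp
#include <cstddef>
#include <cstdint>
#if defined(__linux__)
#include <sys/mman.h>
#endif

// Hypothetical helper types and names; not taken from BOLT.
struct HugeRange {
  std::uintptr_t Begin;
  std::uintptr_t End;
};

constexpr std::uintptr_t HugePageSize = 2 * 1024 * 1024; // 2 MiB

// Round [HotBegin, HotEnd) outward to 2 MiB boundaries (assumed policy).
inline HugeRange alignToHugePages(std::uintptr_t HotBegin,
                                  std::uintptr_t HotEnd) {
  const std::uintptr_t Mask = HugePageSize - 1;
  return HugeRange{HotBegin & ~Mask, (HotEnd + Mask) & ~Mask};
}

// Ask the kernel to back the range with transparent huge pages.
// Returns 0 on success and -1 on failure or when MADV_HUGEPAGE is unavailable.
inline int adviseHugePages(const HugeRange &R) {
#if defined(__linux__) && defined(MADV_HUGEPAGE)
  return madvise(reinterpret_cast<void *>(R.Begin),
                 static_cast<std::size_t>(R.End - R.Begin), MADV_HUGEPAGE);
#else
  (void)R;
  return -1;
#endif
}
```

In practice the begin and end addresses would come from the hot-text symbols that `--hot-text` provides; how the real runtime obtains and rounds them is not stated in the log above.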