In the absence of the instrumentation-file-append-pid option, the
global allocator uses shared pages for allocation. However, since it is a
global variable, it gets COW'd after fork if instrumentation-sleep-time
is used, or if a process forks by itself. As a result, it hands out the same
pages to every process, which causes hash table corruption. Thus, if we
want shared pages, we need to put the allocator itself in a shared page,
which this commit does in __bolt_instr_setup.
I also added a couple of assertions to sanity-check the hash table.
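A minimal sketch of the idea (illustrative names only; the real change lives in __bolt_instr_setup): create the allocator state inside a MAP_SHARED anonymous mapping, so fork() children keep mutating the same instance instead of a COW'd private copy.
```
#include <new>
#include <sys/mman.h>

// Stand-in for the allocator whose state must be shared across forks.
struct Allocator {
  char *Ptr = nullptr;
  char *End = nullptr;
};

static Allocator *createSharedAllocator() {
  // MAP_SHARED keeps the object on pages that fork() does not privatize via
  // COW, so parent and children operate on the same allocator state.
  void *Mem = mmap(nullptr, sizeof(Allocator), PROT_READ | PROT_WRITE,
                   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (Mem == MAP_FAILED)
    return nullptr;
  return new (Mem) Allocator(); // placement-new into the shared mapping
}
```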
Reviewed By: rafauler, Amir
Differential Revision: https://reviews.llvm.org/D153771
The point of the append-pid option is to record separate profiles for
separate forks, which is impossible when the counters are the same for
every process. It leads to every file containing a sum of all profiles,
and the GlobalWriteProfileMutex, located in shared memory, prevents some
processes from dumping their data at all.
Reviewed By: rafauler, Amir
Differential Revision: https://reviews.llvm.org/D153771
In the very rare case that the mmap call fails, we'll at least get a message
instead of a segfault.
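Illustrative sketch of the shape of the check only (the actual runtime reports the error through its own facilities):
```
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <sys/mman.h>

// Report a failed mmap instead of returning a bogus pointer that would
// segfault on first use.
void *allocatePages(std::size_t Size) {
  void *Ret = mmap(nullptr, Size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (Ret == MAP_FAILED) {
    std::fprintf(stderr, "instrumentation runtime: mmap failed\n");
    std::abort();
  }
  return Ret;
}
```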
Reviewed By: rafauler, Amir
Differential Revision: https://reviews.llvm.org/D154056
Branch relaxation requires 2 additional SGPRs for AMDGPU to handle the
case when an indirect branch target is too far away. The register
scavenger may not find available registers, which causes a “did not find
scavenging index” assert in assignRegToScavengingIndex.
In this patch, we estimate before register allocation whether an
indirect branch is likely to be needed, and reserve 2 SGPRs if the
branch distance is found to be above a threshold. The distance threshold
is an approximation as the exact code size and branch distance are
unknown prior to register allocation.
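A rough sketch of the heuristic, with invented helper names (the exact code-size logic lives in target code and can only approximate before RA):
```
#include "llvm/CodeGen/MachineFunction.h"
using namespace llvm;

// Hypothetical per-instruction size estimate; exact sizes are unknown
// before register allocation, so this can only approximate.
unsigned estimateInstrSize(const MachineInstr &MI);

// Reserve the two SGPRs only when the estimated code size suggests a
// branch may be out of range.
bool mayNeedBranchRelaxation(const MachineFunction &MF,
                             unsigned DistanceThreshold) {
  unsigned EstimatedSize = 0;
  for (const MachineBasicBlock &MBB : MF)
    for (const MachineInstr &MI : MBB)
      EstimatedSize += estimateInstrSize(MI);
  return EstimatedSize > DistanceThreshold;
}
```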
Patch by Corbin Robeck. Thanks!
Differential Revision: https://reviews.llvm.org/D149775
__builtin_assume can sometimes worsen code generation. For now, the
guideline seems to be to avoid adding assumptions without a clear
optimization intent. Since _LIBCPP_ASSERT is very general, we can't
have a clear optimization intent at this level, which makes
__builtin_assume the wrong tool for the job -- at least until
__builtin_assume is changed.
See https://discourse.llvm.org/t/llvm-assume-blocks-optimization/71609
for a discussion of this.
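Paraphrased for illustration (not the exact libc++ macro definitions), the effective change when assertions are disabled:
```
// Before: a disabled assertion still fed the condition to the optimizer
// as an assumption.
#define _LIBCPP_ASSERT_OLD(expr, msg) __builtin_assume(expr)
// After: a disabled assertion simply discards the condition.
#define _LIBCPP_ASSERT_NEW(expr, msg) ((void)0)
```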
Differential Revision: https://reviews.llvm.org/D153968
This patch simply enables the `div`, `ldiv`, and `lldiv` functions on
the GPU. This should be straightforward enough.
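For example (ordinary C code that, with this patch, also builds for GPU targets):
```
#include <stdlib.h>

// ldiv computes quotient and remainder in one call; the div/ldiv/lldiv
// family is now usable in code built for the GPU.
long quot_plus_rem(long a, long b) {
  ldiv_t r = ldiv(a, b);
  return r.quot + r.rem;
}
```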
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D154143
With the helpers in place to judge AAs [1], we can remove the hand-rolled
initialization checking code. This exposed a minor oversight in
AAMemoryLocation, where we did not check the IR before giving up on
a declaration.
[1] d33bca840a
Name mangling may be invoked for an interface procedure contained in
a block in a context that does not have access to the block ID mapping.
Procedures can't be defined inside a block, so name mangling doesn't
need a block map. Relax an assert to account for this.
```
block
  interface
    subroutine ss(n) bind(c)
      integer :: n
    end subroutine
  end interface
  call ss(5)
end block
end
```
When writing a pcm, we serialize diagnostic mappings in order to
accurately reproduce the diagnostic environment inside any headers from
that module. However, the diagnostic state mapping table contains
entries for every diagnostic ID ever accessed, while we only want to
serialize the ones that are actually modified from their default value.
Further, we need to serialize them in a deterministic order.
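An illustrative sketch of the fix, with invented names (the real code works on clang's diagnostic state tables): keep only the mappings that differ from the default, then sort by diagnostic ID so the pcm output is deterministic.
```
#include <algorithm>
#include <utility>
#include <vector>

using DiagID = unsigned;
struct Mapping { int Severity; };

std::vector<std::pair<DiagID, Mapping>>
collectModifiedMappings(const std::vector<std::pair<DiagID, Mapping>> &Table,
                        const Mapping &Default) {
  std::vector<std::pair<DiagID, Mapping>> Out;
  for (const auto &Entry : Table)
    if (Entry.second.Severity != Default.Severity) // skip untouched defaults
      Out.push_back(Entry);
  // Sort by diagnostic ID for a deterministic serialization order.
  std::sort(Out.begin(), Out.end(),
            [](const auto &A, const auto &B) { return A.first < B.first; });
  return Out;
}
```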
rdar://111477511
Differential Revision: https://reviews.llvm.org/D154016
With relative vtables, the caller jumps directly to the PLT entries in the shared object;
therefore, a landing pad is needed for these entries.
Reproducer:
main.cpp
```
#include "v.hpp"
int main() {
  A* a = new B();
  a->do_something2();
  return 0;
}
```
v.hpp
```
struct A {
  virtual void do_something() = 0;
  virtual void do_something2();
};
struct B : public A {
  void do_something() override;
  void do_something2() override;
};
```
v.cpp
```
#include "v.hpp"
void A::do_something2() { }
void B::do_something() { }
void B::do_something2() { }
```
```
CC="clang++ --target=aarch64-unknown-linux-gnu -fuse-ld=lld -mbranch-protection=bti"
F=-fexperimental-relative-c++-abi-vtables
${=CC} $F -shared v.cpp -o v.so -z force-bti
${=CC} $F main.cpp -L./ v.so -Wl,-rpath=. -z force-bti
qemu-aarch64-static -L /usr/aarch64-linux-gnu -cpu max ./a.out
```
For v.so, the regular vtable entry is relocated by an R_AARCH64_ABS64 relocation referencing _ZN1B13do_something2Ev.
```
_ZTV1B:
  .xword _ZN1B13do_something2Ev
```
Using a relative vtable entry in a DSO has the downside of creating many PLT entries and making their addresses escape.
The relative vtable entry references a PLT entry _ZN1B13do_something2Ev@plt.
```
.L_ZTV1A.local:
  .word (_ZN1A13do_something2Ev@PLT-.L_ZTV1A.local)-8
```
Fixes: #63580
Reviewed By: peter.smith, MaskRay
Differential Revision: https://reviews.llvm.org/D153264
In tryLock, the Precedence value is now set using the fast time function,
which should speed up tryLock calls slightly.
This is okay even though the value is used as a kind of random value in
getTSDAndLockSlow: the fast time call still sets enough bits to avoid
getting the same TSD on every call.
Reviewed By: Chia-hungDuan
Differential Revision: https://reviews.llvm.org/D154039
This commit implements layout_left in support of C++23 mdspan
(https://wg21.link/p0009). layout_left is a layout mapping policy
whose index mapping corresponds to the memory layout of Fortran arrays.
Thus the leftmost index has stride-1 access, and the rightmost index
is associated with the largest stride.
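For illustration (requires a C++23 compiler with mdspan support), the strides one gets from layout_left on a 2x3 extent:
```
#include <cassert>
#include <cstddef>
#include <mdspan>

int main() {
  int data[6] = {0, 1, 2, 3, 4, 5};
  // layout_left: the leftmost index is the stride-1 (fastest) one, exactly
  // like a Fortran array; the rightmost index carries the largest stride.
  std::mdspan<int, std::extents<std::size_t, 2, 3>, std::layout_left> m(data);
  assert((&m[1, 0] == &m[0, 0] + 1)); // leftmost index: stride 1
  assert((&m[0, 1] == &m[0, 0] + 2)); // rightmost index: stride 2 (= left extent)
}
```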
Co-authored-by: Damien L-G <dalg24@gmail.com>
Differential Revision: https://reviews.llvm.org/D153783
This is a partial cleanup to centralize the initialization and update
decisions for AAs, lifting the burden and boilerplate from users and
making it harder to accidentally perform unsound deductions.
The two static helpers show how we can lift the decisions to generate an
AA into the Attributor, avoiding trivial AAs that just cost us compile
time and maintenance code (to check for pre-conditions).
The constraint system can now simplify non-equality predicates by
checking the validity of the related inequalities and equalities.
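A small illustration of the kind of simplification this enables:
```
// Once `a < b` is a known fact in the constraint system, the non-equality
// below is implied by it, so the select folds to `a`.
int pick(int a, int b) {
  if (a < b)
    return (a != b) ? a : b; // `a != b` can now be simplified to true
  return b;
}
```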
Differential Revision: https://reviews.llvm.org/D152684
ROCm used to install components under individual directories,
e.g. HIP installed to /opt/rocm/hip and rocblas installed to
/opt/rocm/rocblas. ROCm has transitioned to a flat directory
structure where all components are installed to /opt/rocm.
HIP_PATH and --hip-path are supposed to point to /opt/rocm, since
clang detects the HIP version via /opt/rocm/share/hip/version.
However, some existing HIP apps still use HIP_PATH=/opt/rocm/hip.
To avoid regressions, clang will also try to detect share/hip/version
under the parent directory of HIP_PATH or --hip-path; for example, with
HIP_PATH=/opt/rocm/hip, clang also checks /opt/rocm/share/hip/version.
This way, detection works for both the new and the old HIP_PATH.
Reviewed by: Artem Belevich
Differential Revision: https://reviews.llvm.org/D154077
Fixes: SWDEV-407757
Adds support for the startDebugging reverse DAP request. The new request can be used to launch child processes from lldb scripts; for example, it would be straightforward to configure a debug configuration for a server and a client, allowing you to launch both processes with a single debug configuration.
Reviewed By: wallace, ivanhernandez13
Differential Revision: https://reviews.llvm.org/D153447
We are in the process of migrating to a much improved surface syntax for the Sparse Tensor Encoding Attribute (STEA).
You can see a preview of this in the StableHLO RFC at
https://github.com/openxla/stablehlo/blob/main/rfcs/20230210-sparsity.md
**This design is courtesy of Wren Romano.**
This initial revision
(1) Introduces the first version of a new parser written by Wren Romano
(2) Introduces a simple "migration plan" using NEW_SYNTAX on the STEA, which will allow us to test the new parser with new examples, as well as migrate existing examples over without the need to rewrite them all
This first "drop" merely provides the entry points to parse the new syntax. The parser is still under active development. For example, we need to address the "lookahead" issue when parsing the lvl spec (viz. do we see l0 = d0 or a direct d0). Another larger task is to actually implement "affine" parsing (since the MLIR affine parser is not accessible in other parts of the tree).
EXAMPLE:
Currently, CSR looks like
```
#CSR = #sparse_tensor.encoding<{
  lvlTypes = ["dense","compressed"],
  dimToLvl = affine_map<(i,j) -> (i,j)>
}>
```
but you can "force" the new parser with
```
#CSR = #sparse_tensor.encoding<{
  NEW_SYNTAX =
    (d0, d1) -> (l0 = d0 : dense, l1 = d1 : compressed)
}>
```
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D153997
The RPC calls all have delays associated with them. Currently the `exit`
function does an async send and immediately exits the GPU. This can have
the effect that the RPC server never sees the exit call and we continue.
This patch changes that to first sync with the server before continuing
to perform its exit. There is still a hazard here, where the kernel can
complete before the RPC call reads back its response, but that is an
ordinary multi-threading hazard. This change ensures that the server *will*
always exit some time after the GPU exits.
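Hedged pseudocode of the new ordering (stub types; the real code uses LLVM libc's rpc::Client with an EXIT opcode):
```
#include <cstdint>

// Minimal stand-ins, just to show the ordering change.
struct Port {
  void send(std::uint64_t Status); // enqueue the exit request to the server
  std::uint64_t recv();            // block until the server responds
  void close();
};
Port openExitPort();
[[noreturn]] void haltDevice();

[[noreturn]] void gpuExit(std::uint64_t Status) {
  Port P = openExitPort();
  P.send(Status); // before: async send, then halt immediately
  P.recv();       // after: wait for the server's acknowledgement first
  P.close();
  haltDevice();   // only now stop the kernel
}
```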
Reviewed By: JonChesterfield
Differential Revision: https://reviews.llvm.org/D154112
Some LLVM developers have reported test failures regarding the low
address at which the memory is mapped into the virtual address space,
as it causes problems with some default configurations of
vm.mmap_min_addr. This patch sets the address to 2^20 (1048576)
to alleviate the issue, as most distros seem to use a default value of
65536.
We tested stale profile matching on several of Meta's internal services, and all results were positive. For instance, in one service that refreshes its profile every one or two weeks, it consistently gave a 1~2% performance improvement. We also observed an instance where a trivial refactoring caused a 2% regression, and the matching successfully recovered the whole regression. Therefore, we'd like to turn it on by default for CSSPGO.
Reviewed By: hoy, wenlei
Differential Revision: https://reviews.llvm.org/D154027
This patch adds 'unordered' attribute handling to the HLFIR elemental
builders and fixes the attribute handling in lowering and transformations.
Depends on D154031, D154032
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D154035
This patch just disables inlining of ordered hlfir.elemental operations.
Proving the safety of inlining is left for future development.
Depends on D154032
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D154034
This patch extends LoopVectorize to handle the vectorization of interleaved
memory accesses with scalable vectors when a mask is required and/or
predicated tail folding is enabled.
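An illustrative example of the kind of loop this enables (assuming a scalable-vector target such as SVE with tail folding enabled):
```
// The stride-2 accesses form an interleave group, and with tail folding
// the vector body is masked, which scalable vectors can now handle.
void scale_odd_into_even(float *a, const float *b, int n) {
  for (int i = 0; i < n; ++i)
    a[2 * i] = 3.0f * b[2 * i + 1];
}
```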
Differential Revision: https://reviews.llvm.org/D152258
Building on D149889, this patch updates SLP to pass the vector type as
the AccessTy to getGEPCost.
This should have the effect of GEPs being costed more often, instead
of being treated as foldable into the address mode and thus free, since
some architectures, notably RISC-V, do not have offset+reg addressing
modes for vector memory accesses.
Note that in SLP, GEPs are costed in two places: getPointersChainCost
and GetGEPCostDiff.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D153570
By default, `hlfir.elemental` and `hlfir.elemental_addr` must process
the elements in order. The `unordered` attribute may be set
if it is safe to process the elements out of order.
This patch just adds parsing support for the new attribute.
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D154032
This patch sets `unordered` `fir.do_loop` attribute during lowering
of elemental subroutine calls to HLFIR, when it is safe to do so.
Proper handling of `hlfir.elemental` will be done in a separate patch.
Reviewed By: jeanPerier, tblah
Differential Revision: https://reviews.llvm.org/D154031
A previous patch by @arsenm adjusted these to find the `amdgpu-arch`
tool correctly if we do an `LLVM_ENABLE_PROJECTS` build. This patch
applies the same change to the `nvptx-arch` tool to keep it consistent.
Reviewed By: tianshilei1992
Differential Revision: https://reviews.llvm.org/D154107
The `printReindented` function searches for Unix-style line endings (`\n`), but strings may have Windows-style line endings (`\r\n`). Prior to this change, generated document sections could have extra indentation, which some markdown renderers interpret as code blocks rather than paragraphs.
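A simplified sketch of the idea (not the actual helper): split on `\n` but also trim an optional trailing `\r`, so both ending styles reindent correctly.
```
#include <cstddef>
#include <string_view>

// Return the next line of `text`, consuming it, treating "\r\n" and "\n"
// uniformly.
std::string_view nextLine(std::string_view &text) {
  std::size_t pos = text.find('\n');
  std::string_view line = text.substr(0, pos);
  text.remove_prefix(pos == std::string_view::npos ? text.size() : pos + 1);
  if (!line.empty() && line.back() == '\r')
    line.remove_suffix(1); // drop the '\r' from a Windows-style ending
  return line;
}
```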
Differential Revision: https://reviews.llvm.org/D153591