
intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-23 07:58:23 +08:00

Author	SHA1	Message	Date
agozillon	3723449955	[OpenMP] Allocatable explicit member mapping fortran offloading tests (#113555 ) This PR is one in a series of 3 that aim to add support for explicit member mapping of allocatable components in derived types within OpenMP+Fortran for Flang. This PR provides all of the runtime tests that are currently upstreamable, unfortunately some of the other tests would require linking of the fortran runtime for offload which we currently do not do. But regardless, this is plenty to ensure that the mapping is working in most cases.	2024-11-16 12:22:33 +01:00
Louis Dionne	0fd6f684b9	[libc++] Adjust workflow file for building the libc++ docker image (#116366 )	2024-11-16 12:05:12 +01:00
David Green	100376a2fa	[AArch64] Add a test for phis of different types. NFC	2024-11-16 10:40:06 +00:00
Serge Pavlov	f97f96492d	[GlobalISel][ARM] Legalize reset_fpmode (#115859 ) Implement lowering intrinsic `reset_fpmode` in Global Selector for ARM target.	2024-11-16 17:21:33 +07:00
Sergei Barannikov	b69f646c46	[AArch64] Remove unused SDNodes (NFC) (#116236 ) The corresponding enum members were only used by `EmitMOPS`, which immediately translated them to machine opcodes. Just pass the machine opcodes instead.	2024-11-16 13:14:42 +03:00
Jay Foad	89cb0eefcb	[AMDGPU] Move GCNPreRAOptimizations after MachineScheduler (#116211 ) This is in preparation for adding a new optimization to the pass that cares about the order of instructions. The existing optimization does not care, so this just causes minor codegen differences.	2024-11-16 09:40:46 +00:00
Martin Storsjö	dc3156d8e6	[OpenMP] Don't hardcode _WIN32_WINNT for MinGW targets (#115708 ) Instead respect what the toolchain default is (or what the user sets via CMAKE_CXX_FLAGS). This fixes builds with libcxx, with mingw toolchains targeting msvcrt.dll, after 5d8be4c036aa5ce4a94f1f37a9155d5c877e23db; after that commit, the libcxx public headers reference symbols such as iswspace_l, which are unavailable when targeting msvcrt.dll on older versions of Windows (it's only available in msvcrt.dll since Windows Vista).	2024-11-16 11:23:15 +02:00
Kunwar Grover	db115ba3ef	[mlir][Linalg] Fix non-matmul linalg structured ops (#116412 ) `3ad0148020` broke linalg structured ops other than MatmulOp. The patch: - Changes the printer to hide additional attributes, which weren't hidden before: "indexing_maps". - Changes the build of every linalg structured op to have an indexing map for matmul. These changes combined hide the problem until you print the operation in its generic form. Reproducer: ```mlir func.func public @bug(%arg0 : tensor<5x10x20xf32>, %arg1 : tensor<5x20x40xf32>, %arg3 : tensor<5x10x40xf32>) -> tensor<5x10x40xf32> { %out = linalg.batch_matmul ins(%arg0, %arg1 : tensor<5x10x20xf32>, tensor<5x20x40xf32>) outs(%arg3 : tensor<5x10x40xf32>) -> tensor<5x10x40xf32> func.return %out : tensor<5x10x40xf32> } ``` Prints fine, with `mlir-opt <file>`, but if you do `mlir-opt --mlir-print-op-generic <file>`: ``` #map = affine_map<(d0, d1, d2) -> (d0, d2)> #map1 = affine_map<(d0, d1, d2) -> (d2, d1)> #map2 = affine_map<(d0, d1, d2) -> (d0, d1)> #map3 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d3)> #map4 = affine_map<(d0, d1, d2, d3) -> (d0, d3, d2)> #map5 = affine_map<(d0, d1, d2, d3) -> (d0, d1, d2)> "builtin.module"() ({ "func.func"() <{function_type = (tensor<5x10x20xf32>, tensor<5x20x40xf32>, tensor<5x10x40xf32>) -> tensor<5x10x40xf32>, sym_name = "bug", sym_visibility = "public"}> ({ ^bb0(%arg0: tensor<5x10x20xf32>, %arg1: tensor<5x20x40xf32>, %arg2: tensor<5x10x40xf32>): %0 = "linalg.batch_matmul"(%arg0, %arg1, %arg2) <{operandSegmentSizes = array<i32: 2, 1>}> ({ ^bb0(%arg3: f32, %arg4: f32, %arg5: f32): %1 = "arith.mulf"(%arg3, %arg4) <{fastmath = #arith.fastmath<none>}> : (f32, f32) -> f32 %2 = "arith.addf"(%arg5, %1) <{fastmath = #arith.fastmath<none>}> : (f32, f32) -> f32 "linalg.yield"(%2) : (f32) -> () }) {indexing_maps = [#map, #map1, #map2], linalg.memoized_indexing_maps = [#map3, #map4, #map5]} : (tensor<5x10x20xf32>, tensor<5x20x40xf32>, tensor<5x10x40xf32>) -> tensor<5x10x40xf32> 
"func.return"(%0) : (tensor<5x10x40xf32>) -> () }) : () -> () }) : () -> () ``` The batch_matmul operation's builder now always inserts an indexing_map which is unrelated to the operation itself. This was caught when a transformation from one LinalgStructuredOp to another tried to pass its attributes to the other op's builder and there were multiple indexing_map attributes in the result. This patch fixes this by specializing the builders for MatmulOp with indexing map information.	2024-11-16 08:13:10 +00:00
Thorsten Schütt	2906fcadb8	[GlobalISel] Combine G_MERGE_VALUES of x and zero (#116283 ) into zext x LegalizerHelper has two padding strategies: undef or zero. see LegalizerHelper:273 see LegalizerHelper:315 This PR is about zero sugar and Coke Zero. ; CHECK-NEXT: [[MV2:%[0-9]+]]:_(s64) = G_MERGE_VALUES %a(s32), [[C]](s32) Please continue padding merge values. // %bits_8_15:(s8) = G_CONSTANT i8 0 // %0:(s16) = G_MERGE_VALUES %bits_0_7:(s8), %bits_8_15:(s8) %bits_8_15 is defined by zero. For optimization, we pick zext. // %0:_(s16) = G_ZEXT %bits_0_7:(s8) The upper bits of %0 are zero and the lower bits come from %bits_0_7.	2024-11-16 08:00:21 +01:00
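The equivalence this combine relies on can be checked on plain integers. The following is a minimal sketch, not LLVM code: `mergeValues`, `zeroExtend`, and `mergeWithZeroMatchesZext` are hypothetical names that model the s8/s16 case from the commit message, assuming the first merge operand supplies the low bits.

```cpp
#include <cstdint>

// Models G_MERGE_VALUES of two s8 pieces into one s16: the first operand
// supplies bits 0-7, the second supplies bits 8-15.
uint16_t mergeValues(uint8_t lo, uint8_t hi) {
  return static_cast<uint16_t>(lo | (static_cast<unsigned>(hi) << 8));
}

// Models G_ZEXT from s8 to s16: the upper bits become zero.
uint16_t zeroExtend(uint8_t lo) { return static_cast<uint16_t>(lo); }

// Checks exhaustively that merging with a zero high part equals zext.
bool mergeWithZeroMatchesZext() {
  for (unsigned v = 0; v <= 0xFF; ++v) {
    uint8_t lo = static_cast<uint8_t>(v);
    if (mergeValues(lo, 0) != zeroExtend(lo))
      return false;
  }
  return true;
}
```

With a nonzero high part the two differ, which is why the combine only applies when the upper operand is a known zero constant.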
Julian Schmidt	ec0a27f658	Revert "Reland: [clang][test] add testing for the AST matcher reference" (#116477 ) Reverts llvm/llvm-project#112168	2024-11-16 07:34:20 +01:00
Valentin Clement	42be165dde	Reland '[flang][cuda] Specialize entry point for scalar to desc data transfer'	2024-11-15 19:13:55 -08:00
Matthias Springer	309c890921	[llvm] `APFloat`: Add helpers to query NaN/inf semantics (#116315 ) `APFloat` changes extracted from #116176 as per reviewer comments.	2024-11-16 11:48:05 +09:00
Valentin Clement (バレンタインクレメン)	70b9440c88	Revert "[flang][cuda] Specialize entry point for scalar to desc data transfer" (#116458 ) Reverts llvm/llvm-project#116457	2024-11-15 17:44:48 -08:00
Valentin Clement (バレンタインクレメン)	43cb424a54	[flang][cuda] Specialize entry point for scalar to desc data transfer (#116457 ) The runtime Assign function is not meant to initialize an array from a scalar. For that we need to use DoAssignFromSource. Update the data transfer from scalar to descriptor to use a new entry point that uses this function underneath.	2024-11-15 17:41:23 -08:00
Kyungwoo Lee	ab27253ad3	[CGData][lld-macho] Merge CG Data by LLD (#112674 ) LLD now processes raw CG data for stable functions, similar to how it handles raw CG data for the outliner's hash tree. This data is encoded in the custom section (`__llvm_merge`) within object files. LLD merges this information into the indexed CG data file specified by the `-codegen-data-generate-path={path}` option. For the linker that does not support this feature, we could use `llvm-cgdata` tool -- https://github.com/llvm/llvm-project/blob/main/llvm/docs/CommandGuide/llvm-cgdata.rst. Depends on #115750. This is a patch for https://discourse.llvm.org/t/rfc-global-function-merging/82608.	2024-11-15 17:24:35 -08:00
Craig Topper	6a0905d11e	[RISCV][GISel] Add isel patterns for i16 load/store (#116293 ) In order to support f16 load/store we need to make load/stores with s16 register type legal. If regbank selection doesn't pick the FPR bank, we'll be left with a GPR load or store which we don't have isel patterns for from SelectionDAG. In order to add the patterns we need to make i16 a legal type for the GPR register class. Tests are currently disabling the legality check because I haven't updated the legalizer yet.	2024-11-15 17:23:46 -08:00
Craig Topper	131d73ed34	[RegAlloc] Remove redundant prints of LiveInterval weight. (#116451 ) LiveInterval::print has included the weight since early 2018. We don't need to print again after we print the interval.	2024-11-15 16:43:30 -08:00
vporpo	1be9827754	[SandboxVec][BottomUpVec] Implement packing of vectors (#116447 ) Up until now we could only support packing of scalar elements. This patch fixes this by implementing packing of vector elements, by generating extractelement and insertelement instruction pairs.	2024-11-15 16:12:22 -08:00
Kazu Hirata	0d38f64e7d	[memprof] Remove MemProf format Version 0 (#116442 ) This patch removes MemProf format Version 0 now that version 2 and 3 seem to be working well. I'm not touching version 1 for now because some tests still rely on version 1. Note that Version 0 is identical to Version 1 except that the MemProf section of the indexed format has a MemProf version field.	2024-11-15 15:37:00 -08:00
Kazu Hirata	57ed628fb3	[memprof] Speed up caller-callee pair extraction (Part 2) (#116441 ) This patch further speeds up the extraction of caller-callee pairs from the profile. Recall that we reconstruct a call stack by traversing the radix tree from one of its leaf nodes toward a root. The implication is that when we decode many different call stacks, we end up visiting nodes near the root(s) repeatedly. That in turn adds many duplicates to our data structure: DenseMap<uint64_t, SmallVector<CallEdgeTy, 0>> Calls; only to be deduplicated later with sort+unique for each vector. This patch makes the extraction process more efficient by keeping track of indices of the radix tree array we've visited so far and terminating traversal as soon as we encounter an element previously visited. Note that even with this improvement, we still add at least one caller-callee pair to the data structure above for each call stack because we do need to add a caller-callee pair for the leaf node with the callee GUID being 0. Without this patch, it takes 4 seconds to extract caller-callee pairs from a large MemProf profile. This patch shortens that down to 900ms.	2024-11-15 15:33:23 -08:00
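The early-termination idea can be sketched on a simplified tree. This is an illustration under stated assumptions, not the MemProf implementation: the parent-index `Node` layout, `Extraction`, and `extractEdges` are hypothetical, and the real radix tree array is encoded differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical simplified node: a frame id plus the index of the caller's
// node (-1 at a root).
struct Node {
  uint64_t frame;
  int parent;
};

using Edge = std::pair<uint64_t, uint64_t>; // (caller, callee)

struct Extraction {
  std::set<Edge> edges;
  std::size_t nodesVisited; // number of first-time node visits
};

// Walks each call stack from its leaf toward the root and stops as soon as
// it reaches a node that an earlier walk has already processed.
Extraction extractEdges(const std::vector<Node> &tree,
                        const std::vector<int> &leaves) {
  std::vector<bool> visited(tree.size(), false);
  Extraction result{{}, 0};
  for (int leaf : leaves) {
    assert(leaf >= 0 && static_cast<std::size_t>(leaf) < tree.size());
    // Every call stack still records its leaf with callee id 0.
    result.edges.insert({tree[leaf].frame, 0});
    for (int cur = leaf; cur != -1 && !visited[cur]; cur = tree[cur].parent) {
      visited[cur] = true;
      ++result.nodesVisited;
      int parent = tree[cur].parent;
      if (parent != -1)
        result.edges.insert({tree[parent].frame, tree[cur].frame});
    }
  }
  return result;
}
```

Stopping at a visited node loses nothing: the edge from that node to its caller, and every edge above it, was recorded the first time the node was reached.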
Valentin Clement (バレンタインクレメン)	b1fa9d154b	[flang][cuda] Correctly embox logical constant (#116445 )	2024-11-15 15:29:41 -08:00
Vitaly Buka	64c455077a	[docs][asan][lsan] Drop list of supported architectures (#116302 ) Full list is quite long, and quality of implementation can vary. Drop the lists to avoid confusion like https://github.com/rust-lang/rust/pull/123617#issuecomment-2471695102 We don't maintain these for other sanitizers.	2024-11-15 15:15:50 -08:00
Kyungwoo Lee	816c975ea7	Fix crash from [CGData] Global Merge Functions (#112671 ) (#116241 ) Module summary index is optional for this pass, and we shouldn't run it, but import it as necessary.	2024-11-15 14:57:17 -08:00
vporpo	3be3b33e57	[SandboxVec][BottomUpVec] Implement pack of scalars (#115549 ) This patch implements packing of scalar operands when the vectorizer decides to stop vectorizing. Packing is implemented with a sequence of InsertElement instructions. Packing vectors requires different instructions so it's implemented in a follow-up patch.	2024-11-15 14:45:17 -08:00
Valentin Clement (バレンタインクレメン)	012fad975e	[flang][cuda] Materialize the box in memory when dst is emboxed (#116320 ) Similar to #116289 but for the dst.	2024-11-15 14:31:36 -08:00
Valentin Clement (バレンタインクレメン)	e8469f1577	[flang][cuda] Add support for character type in cuf.alloc and cuf.data_transfer (#116277 ) Add support for character type in bytes computation	2024-11-15 14:31:21 -08:00
Shilei Tian	4b50ec43d0	[Clang] Avoid Using `byval` for `ndrange_t` when emitting `__enqueue_kernel_basic` (#116435 ) AMDGPU disabled the use of `byval` for struct argument passing in commit `d77c620`. However, when emitting `__enqueue_kernel_basic`, Clang still adds the `byval` attribute by default. Emitting the `byval` attribute by default in this context doesn’t seem like a good idea, as argument-passing conventions are highly target-dependent, and assumptions here could lead to issues. This PR removes the addition of the `byval` attribute, aligning the behavior with other `__enqueue_kernel_*` functions.	2024-11-15 16:54:29 -05:00
Jon Roelofs	34ebfabc34	[llvm][ARM] Restore the default to -mstrict-align on Apple firmwares (#115546 ) This is a partial revert of `e314622f20` rdar://139237593	2024-11-15 13:54:21 -08:00
Ognyan Mirev	9204eba912	Remove device override for operator new when the C++ standard >= 26 (#114056 ) Related to https://github.com/llvm/llvm-project/issues/114048	2024-11-15 13:53:24 -08:00
Kazu Hirata	ec353b7418	[memprof] Use llvm::function_ref instead of std::function (#116306 ) We've seen bugs where we lost track of error states stored in the functor because we passed the functor by value (that is, std::function) as opposed to reference (llvm::function_ref). This patch fixes a couple of places we pass functors by value. While we are at it, this patch adds curly braces around a "for" loop spanning multiple lines.	2024-11-15 13:03:24 -08:00
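The class of bug described here can be reproduced with the standard library alone. In this sketch, `IntCallbackRef` is a hypothetical, minimal non-owning wrapper standing in for `llvm::function_ref`; it is not LLVM's implementation, and the other names are illustrative as well.

```cpp
#include <functional>

// A functor that remembers whether it ever saw a failing value.
struct RecordingCallback {
  bool sawError = false;
  void operator()(int value) {
    if (value < 0)
      sawError = true;
  }
};

// std::function owns a copy of the functor, so state changes land in the copy.
void visitByValue(std::function<void(int)> callback) {
  callback(1);
  callback(-1);
}

// Hypothetical non-owning reference to a callable taking an int. It must not
// outlive the callable it points to.
class IntCallbackRef {
  void *callable;
  void (*thunk)(void *, int);

public:
  template <typename F>
  IntCallbackRef(F &f)
      : callable(&f),
        thunk([](void *c, int v) { (*static_cast<F *>(c))(v); }) {}
  void operator()(int v) const { thunk(callable, v); }
};

// Calls go through to the caller's object, so its state is updated.
void visitByRef(IntCallbackRef callback) {
  callback(1);
  callback(-1);
}

bool errorVisibleAfterByValue() {
  RecordingCallback cb;
  visitByValue(cb);
  return cb.sawError; // the error was recorded only in the copy
}

bool errorVisibleAfterByRef() {
  RecordingCallback cb;
  visitByRef(cb);
  return cb.sawError;
}
```

The by-value path silently drops the error flag, which matches the failure mode the commit describes.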
Florian Hahn	3734e4c0c4	[MergedLoadStore] Preserve common metadata when sinking stores. (#116382 ) When sinking a store, preserve common metadata present on stores on both sides of the diamond. PR: https://github.com/llvm/llvm-project/pull/116382	2024-11-15 20:52:02 +00:00
Ramkumar Ramachandra	94eebf721a	InstSimplify: support floating-point equivalences (#115152 ) Since `cd16b07` (IR: introduce CmpInst::isEquivalence), there is now an isEquivalence routine in CmpInst that we can use to determine equivalence in simplifySelectWithICmpEq. Implement this, extending the code from integer-equalities to integer and floating-point equivalences.	2024-11-15 20:06:11 +00:00
Craig Topper	92f3f27106	[RISCV][GISel] Remove -disable-gisel-legality-check from most RVV tests. NFC	2024-11-15 12:04:55 -08:00
Janek van Oirschot	9a5e5e28ec	[AMDGPU] Newly added test modified for recent SGPR use change (#116427 ) Mistimed rebase for #112251 which added new tests which did not consider the changes introduced in #112403 yet	2024-11-15 14:51:58 -05:00
Petr Hosek	1e492285f3	[Fuchsia] Include runtimes for armv8.1m.main-none-eabi (#116420 ) These are needed by some of our users.	2024-11-15 11:32:15 -08:00
Craig Topper	47a0e24a3b	[GISel][RISCV] Add G_SMIN/SMAX/UMIN/UMAX to GISelKnownBits::computeNumSignBits. (#116321 )	2024-11-15 11:23:15 -08:00
Janek van Oirschot	bd9145c8c2	Reapply [AMDGPU] Avoid resource propagation for recursion through multiple functions (#112251 ) I was wrong last patch. I viewed the `Visited` set purely as a possible recursion deterrent where functions calling a callee multiple times are handled elsewhere. This doesn't consider cases where a function is called multiple times by different callers still part of the same call graph. New test shows the aforementioned case. Reapplies #111004, fixes #115562.	2024-11-15 18:40:05 +00:00
Peter Smith	098b0d18ad	[LLD][AArch64] Detach Landing Pad creation from Thunk creation (#116402 ) Move Landing Pad Creation to a new function that checks each thunk every pass to see if it needs a landing pad. This permits a thunk to be created without needing a landing pad, but later needing one due to drifting out of direct branch range and requiring an indirect branch. We record all the Thunks created so far in a new vector rather than trying to iterate over the DenseMap as we need a deterministic order of adding LandingPadThunks due to the short branch fall through. We cannot use normalizeExistingThunk() either as that only iterates through live thunks. Fixes: https://crbug.com/377438309 Original PR: https://github.com/llvm/llvm-project/pull/108989 Sending without a new test case to fix existing test. A new regression test will come in a separate PR as coming up with a small enough reproducer for this case is non-trivial.	2024-11-15 18:18:18 +00:00
lialan	ef92aba52a	[MLIR] Fix VectorEmulateNarrowType constant op mask bug (#116064 ) This commit adds support for handling mask constants generated by the `arith.constant` op in the `VectorEmulateNarrowType` pattern. Previously, this pattern would not match due to the lack of mask constant handling in `getCompressedMaskOp`. The changes include: 1. Updating `getCompressedMaskOp` to recognize and handle `arith.constant` ops as mask value sources. 2. Handling cases where the mask is not aligned with the emulated load width. The compressed mask is adjusted to account for the offset. Limitations: - The arith.constant op can only have 1-dimensional constant values. Resolves: #115742 Signed-off-by: Alan Li <me@alanli.org>	2024-11-15 10:06:40 -08:00
Krzysztof Parzyszek	0398cb4592	[flang][OpenMP][OpenACC] Use iterator_range in check-directive-struct… (#115872 ) …ure, NFC The OpenMP code is already using iterator_range, lift it to the shared header file.	2024-11-15 11:54:58 -06:00
Aaron Ballman	3130691a60	[C23] Move WG14 N2754 to the TS 18661 section This paper is about the quantum exponent of NAN, which only applies if we support decimal floating-point types from the TS. That is why the status changed from Unknown to No.	2024-11-15 12:52:18 -05:00
Krzysztof Drewniak	f2e42d9324	[mlir][IntRangeInference] Handle ceildivsi(INT_MIN, x > 1) as expected (#116284 ) Fixes #115293 While the definition of ceildivsi is integer division, rounding up, most implementations will use `-(-a / b)` for dividing `a ceildiv b` with `a` negative and `b` positive. Mathematically, and for most integers, these two definitions are equivalent. However, with `a == INT_MIN`, the initial negation is a no-op, which means that, while dividing and rounding up would give a negative result, `-((- INT_MIN) / b)` is `-(INT_MIN / b)`, which is positive. This commit adds a special case to ceilDivSI inference to handle this case and bring it in line with the operational instead of the mathematical semantics of ceiling division.	2024-11-15 11:43:05 -06:00
Fangrui Song	d82422f69c	[ELF] Remove errorOrWarn	2024-11-15 09:37:38 -08:00
Sergei Barannikov	032014ef10	[PowerPC] Add `SDNPMemOperand` to some nodes (#115580 ) Nodes created with `getMemIntrinsicNode` have memory operands. In order for operands to be propagated to machine instructions, the nodes should have `SDNPMemOperand` property. Similar to `3c8c385a`.	2024-11-15 20:36:56 +03:00
Eric Astor	e9e8f59dd4	[clang] Instantiate attributes on LabelDecls (#115924 ) Start propagating attributes on (e.g.) labels inside of templated functions to their instances.	2024-11-15 12:33:20 -05:00
Cyndy Ishida	2d48489cc3	[Clang][Darwin] Introduce `SubFrameworks` as a SDK default location (#115048 ) * Have clang always append & pass System/Library/SubFrameworks when determining default sdk search paths. * Teach clang-installapi to traverse there for framework input. * Teach llvm-readtapi that the library files (TBD or binary) in there should be considered private. resolves: rdar://137457006	2024-11-15 09:27:08 -08:00
Stephen Tozer	2188a56a75	[DebugInfo][SimplifyCFG] Fully propagate merged invoke DILocations (#114235 ) Currently when we merge invokes as part of SimplifyCFG we apply a merge of the invoke DILocations to the merged invoke. We also insert an unconditional branch to the merged invoke at the positions previously occupied by the original invokes; as this branch is part of the substitution for the invoke it has replaced, we should propagate the original invoke DebugLoc to it.	2024-11-15 17:20:55 +00:00
Simon Pilgrim	92cc805193	[IR] Add ICmpInst::isCommutative and FCmpInst::isCommutative static wrappers (#116398 ) Add static variants that can be used with the Predicate enum directly.	2024-11-15 17:13:43 +00:00
Anchu Rajendran S	e67e09a77e	[Flang][OpenMP][Sema] Adding parsing and semantic support for scan directive. (#102792 )	2024-11-15 09:10:36 -08:00
Joseph Huber	fd5fcfb1e6	[Clang] Add 'gpuintrin.h' to the release notes (#116410 )	2024-11-15 11:08:06 -06:00


518420 Commits