intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-23 07:58:23 +08:00

Author	SHA1	Message	Date
Daniil Kovalev	72fb8ae541	[lld][test][PAC] Do not rely on concrete offsets in LTO tests (#143358 ) When changing codegen (e.g. in #130809), offsets in binaries produced by LTO tests might change. We do not need to match concrete offset values, it's enough to ensure that hex values in particular places are identical. --------- Co-authored-by: Anatoly Trosinenko <atrosinenko@accesssoftek.com>	2025-06-17 15:29:37 +00:00
Matt Arsenault	c80282d333	AMDGPU: Directly select minimumnum/maximumnum with ieee_mode=0 (#141903 ) The hardware min/max follow the IR rules with IEEE mode disabled, so we can avoid canonicalizing the inputs. We lose the quieting of a signaling NaN if both inputs are NaNs, but we only require that with strictfp.	2025-06-18 00:27:41 +09:00
Steven Perron	1410e69b64	[SPIRV] Allow __spirv_SpecConstant in Vulkan shaders (#143543 ) There is a builtin __spirv_SpecConstant that the SPIR-V backend expands into a specialization constant. However, it is currently only enabled for OpenCL shaders, and not for graphics shaders. We want to use it for specialization constants coming from HLSL, so we are enabling it for graphics shaders as well. Implements https://github.com/llvm/wg-hlsl/pull/287 Fixes https://github.com/llvm/llvm-project/issues/142991	2025-06-17 11:26:47 -04:00
Jeremy Morse	14286244f1	Follow up to `9eb0020555`, squelch unused variable warning It turns out that this now-deleted debug-intrinsic code was the only use of CI.	2025-06-17 16:24:12 +01:00
Andrew Rogers	02b78ff9c6	[llvm] include Compiler.h in a few headers where it was missed (#144464 ) Add missing `#include "llvm/Support/Compiler.h"` in a few LLVM headers that use the `LLVM_ABI` macro.	2025-06-17 08:21:24 -07:00
Jeffrey Byrnes	c9a87a50ae	[SLPVectorizer] Use accurate cost for external users of resize shuffles (#137419 ) When implementing the vectorization, we potentially need to add shuffles for external users. In such cases, we may be shuffling a smaller vector into a larger vector. When this happens `ResizeToVF` will just build a poison padded identity vector. Then, to build the final shuffle, we just use the `SK_InsertSubvector` mask. This is possibly clearer by looking at the included test in SLPVectorizer/AMDGPU/external-shuffle.ll. In the exit block we have a bunch of shuffles to glue the vectorized tree to match the `InsertElement` users. `TMP25` holds the result of resizing the v2i16 vectorized sequence to match the `InsertElement` size v16i16. Then `TMP26` is the final shuffle which replaces the `InsertElement` sequence. This is just an insertsubvector. However, when calculating the cost for these shuffles, we aren't modelling this correctly. `ResizeToVF` will indicate to `performExtractsShuffleAction` that we cannot use the original mask due to the resize shuffle. The consequence is that the cost calculation uses a different shuffle mask than what is ultimately used. Going back to the included test, we can consider again `TMP26`. Clearly we can see the shuffle uses a mask {0, 1, 2, 3, 16, 17, poison ..}. However, we will currently calculate the cost with a mask {0, 1, 2, 3, 20, 21, ...}: we have replaced 16 and 17 with 20 and 21 (Index + Vector Size). Queries like BasicTTImpl::improveShuffleKindFromMask will not recognize this as an `SK_InsertSubvector` mask, and targets which have reduced costs for `SK_InsertSubvector` will not accurately calculate the cost.	2025-06-17 08:14:05 -07:00
Jeremy Morse	9eb0020555	[DebugInfo][RemoveDIs] Remove a swathe of debug-intrinsic code (#144389 ) Seeing how we can't generate any debug intrinsics any more: delete a variety of codepaths where they're handled. For the most part these are plain deletions, in others I've tweaked comments to remain coherent, or added a type to (what was) type-generic-lambdas. This isn't all the DbgInfoIntrinsic call sites but it's most of the simple scenarios. Co-authored-by: Nikita Popov <github@npopov.com>	2025-06-17 15:55:14 +01:00
Karlo Basioli	7ec103a984	Port #143108 to bazel (#144538 )	2025-06-17 15:52:33 +01:00
Davide Grohmann	549bc55cc3	[mlir][spirv] Fix int type declaration duplication when serializing (#143108 ) At the MLIR level, unsigned integers and signless integers are different types. Indeed, when looking up the two types in the type definition cache they do not match. Hence, translating a SPIR-V module that contains both unsigned and signless integers produces output that contains the same type declaration twice (something like OpTypeInt 32 0), which is not permitted in SPIR-V, and such generated modules fail validation. This patch solves the problem by mapping unsigned integer types to signless integer types before looking them up in the type definition cache. --------- Signed-off-by: Davide Grohmann <davide.grohmann@arm.com>	2025-06-17 10:35:14 -04:00
Kajetan Puchalski	4cfe0d7f4c	[flang][OpenMP] Support using copyprivate with fir.boxchar arguments (#144092 ) Implement the lowering for passing a fir.boxchar argument to the copyprivate clause. Resolves https://github.com/llvm/llvm-project/issues/142123. --------- Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>	2025-06-17 15:32:23 +01:00
Simon Pilgrim	0fb198e132	[X86] Remove combineShuffleOfConcatUndef fold (#144524 ) We can now let a mixture of combineConcatVectorOps and target shuffle combining handle this instead of creating ISD::CONCAT_VECTORS nodes and hoping they will merge properly. In the horizontal-sum.ll test changes we were creating a ISD::CONCAT_VECTORS node that was being split shortly after, but not before causing issues with HADD folding due to additional uses.	2025-06-17 15:30:49 +01:00
Florian Hahn	8f79754225	[SCEV] Better preserve wrapping info in SimplifyICmpOperands for UGE. (#144404 ) Update SimplifyICmpOperands to only try subtracting 1 from RHS first, if RHS is an op we can fold the subtract directly into. Otherwise try adding to LHS first, as we can preserve NUW flags. This improves results in a few cases, including the modified test case from berkeley-abc and new code to be added in https://github.com/llvm/llvm-project/pull/128061. Note that there are more cases where the results can be improved by better ordering here which I'll try to investigate as follow-up. PR: https://github.com/llvm/llvm-project/pull/144404	2025-06-17 15:30:08 +01:00
Michael Buch	0a7b0c844c	[lldb][Expression] Remove IR pointer checker (#144483 ) Currently when jitting expressions, LLDB scans the IR instructions of the `$__lldb_expr` and will insert a call to a utility function for each load/store instruction. The purpose of the utility function is to dereference the load/store operand. If that operand was an invalid pointer the utility function would trap and LLDB asks the IR checker whether it was responsible for the trap, in which case it prints out an error message saying the expression dereferenced an invalid pointer. This is a lot of setup for not much gain. In fact, creating/running this utility expression shows up as ~2% of the expression evaluation time (though we cache them for subsequent expressions). And the error message we get out of it is arguably less useful than if we hadn't instrumented the IR. It was also untested. Before: ``` (lldb) expr int a = returns_invalid_ptr() error: Execution was interrupted, reason: Attempted to dereference an invalid pointer.. The process has been returned to the state before expression evaluation. ``` After: ``` (lldb) expr int a = returns_invalid_ptr() error: Expression execution was interrupted: EXC_BAD_ACCESS (code=1, address=0x5). The process has been returned to the state before expression evaluation. ``` This patch removes this IR checker.	2025-06-17 15:24:26 +01:00
Richard Howell	35f6d91720	[lld] check cache in loadDylib before real_path (#143595 )	2025-06-17 07:18:50 -07:00
Han-Kuan Chen	414710c753	[SLP] Fix isCommutative to check uses of the original instruction instead of the converted instruction. (#143094 )	2025-06-17 22:03:14 +08:00
AZero13	dc72b91ffe	[AArch64] Report icmp as free if it can be folded into ands (#143286 ) Since changing the backend to fold x >= 1 / x < 1 -> x > 0 / x <= 0 and x <= -1 / x > -1 -> x < 0 / x >= 0, this should be reflected in the cost.	2025-06-17 14:59:38 +01:00
Benjamin Kramer	de3339063a	[bazel] Port `b4e39e4ff9`	2025-06-17 15:44:30 +02:00
William Moses	917bc90967	[MLIR][LLVMIR] Mark Funcop as affinescope (#144456 ) All functions are conceptually an affine scope.	2025-06-17 06:41:15 -07:00
Amir Ayupov	9fed480f18	[BOLT] Explicitly check for returns when extending call continuation profile (#143295 ) Call continuation logic relies on assumptions about fall-through origin: - the branch is external to the function, - fall-through start is at the beginning of the block, - the block is not an entry point or a landing pad. Leverage trace information to explicitly check whether the origin is a return instruction, and defer to checks above only in case of DSO-external branch source. This covers both regular and BAT cases, addressing call continuation fall-through undercounting in the latter mode, which improves BAT profile quality metrics. For example, for one large binary: - CFG discontinuity 21.83% -> 0.00%, - CFG flow imbalance 10.77%/100.00% -> 3.40%/13.82% (weighted/worst), - CG flow imbalance 8.49% -> 8.49%. Depends on #143289. Test Plan: updated callcont-fallthru.s	2025-06-17 06:28:27 -07:00
Rahul Joshi	816ab1af0d	[NFCI][TableGen][DecoderEmitter] Cull Op handling when possible (#142974 ) TryDecode/CheckPredicate/SoftFail MCD ops are not used by many targets. Track the set of opcodes that were emitted and emit code for handling TryDecode/CheckPredicate/SoftFail ops when decoding only if they were emitted. This is purely eliminating dead code in the generated `decodeInstruction` function. This results in the following reduction in the size of the Disassembler .so files with an x86_64 release build on Linux: ``` Target Old Size New Size % reduction build/lib/libLLVMAArch64Disassembler.so.21.0git 256656 256656 0.00 build/lib/libLLVMAMDGPUDisassembler.so.21.0git 813000 808168 0.59 build/lib/libLLVMARCDisassembler.so.21.0git 44816 43536 2.86 build/lib/libLLVMARMDisassembler.so.21.0git 281744 278808 1.04 build/lib/libLLVMAVRDisassembler.so.21.0git 36040 34496 4.28 build/lib/libLLVMBPFDisassembler.so.21.0git 26248 23168 11.73 build/lib/libLLVMCSKYDisassembler.so.21.0git 55960 53632 4.16 build/lib/libLLVMHexagonDisassembler.so.21.0git 115952 113416 2.19 build/lib/libLLVMLanaiDisassembler.so.21.0git 24360 21008 13.76 build/lib/libLLVMLoongArchDisassembler.so.21.0git 58584 56168 4.12 build/lib/libLLVMM68kDisassembler.so.21.0git 57264 53880 5.91 build/lib/libLLVMMSP430Disassembler.so.21.0git 28896 28440 1.58 build/lib/libLLVMMipsDisassembler.so.21.0git 123128 120568 2.08 build/lib/libLLVMPowerPCDisassembler.so.21.0git 80656 78096 3.17 build/lib/libLLVMRISCVDisassembler.so.21.0git 154080 150200 2.52 build/lib/libLLVMSparcDisassembler.so.21.0git 42040 39568 5.88 build/lib/libLLVMSystemZDisassembler.so.21.0git 97056 94552 2.58 build/lib/libLLVMVEDisassembler.so.21.0git 83944 81352 3.09 build/lib/libLLVMWebAssemblyDisassembler.so.21.0git 25280 25280 0.00 build/lib/libLLVMX86Disassembler.so.21.0git 2920624 2920624 0.00 build/lib/libLLVMXCoreDisassembler.so.21.0git 48320 44288 8.34 build/lib/libLLVMXtensaDisassembler.so.21.0git 42248 35840 15.17 ```	2025-06-17 06:21:21 -07:00
Vincent	977d8a4bcd	[clang][Sema] Fixed Compound Literal is not Constant Expression (#143852 ) Added a check for a compound literal hiding inside a function. fixes #87867	2025-06-17 09:20:41 -04:00
Nikita Popov	76ea1db174	[PowerPC] Split test into assembly and MIR variants (NFC) So that both can be generated.	2025-06-17 15:16:24 +02:00
Nikita Popov	3451cd5d20	[PowerPC] Regenerate MIR test checks (NFC)	2025-06-17 15:04:16 +02:00
Sirraide	b4e39e4ff9	[LLVM] [Support] Query the terminal width using `ioctl()` (#143514 ) On Unix systems, we were trying to determine the terminal width using the `COLUMNS` environment variable. Unfortunately, `COLUMNS` is not exported by all shells and thus not available on some systems. We were previously using `ioctl()` for this; fall back to doing so if `COLUMNS` does not exist or does not store a positive integer. This essentially reverts `a3eb3d3d92` and parts of https://reviews.llvm.org/D61326. For more information, see #139499. Fixes #139499.	2025-06-17 15:03:37 +02:00
Matt Arsenault	b91936aeff	AMDGPU: Combine nnan fminimum/fmaximum to fminnum_ieee/fmaxnum_ieee (#142217 ) This improves codegen for gfx950, where fminimum/fmaximum are legal through fminimum3/fmaximum3, so may have an additional encoding cost.	2025-06-17 21:55:57 +09:00
Krzysztof Parzyszek	5f841a6284	[flang][OpenMP] Set _OPENMP macro for version 6.0 (#144410 )	2025-06-17 07:41:20 -05:00
Aaron Ballman	3377b56338	Revert "[clang] Add managarm support" (#144514 ) Reverts llvm/llvm-project#139271 There are multiple failing build bots: https://lab.llvm.org/buildbot/#/builders/10/builds/7482 https://lab.llvm.org/buildbot/#/builders/11/builds/17473	2025-06-17 08:39:15 -04:00
Juan Manuel Martinez Caamaño	cb011d3199	[CUDA][HIP] Add a __device__ version of std::__glibcxx_assert_fail() (#136133 ) libstdc++ 15 uses the non-constexpr function std::__glibcxx_assert_fail() to trigger compilation errors when the __glibcxx_assert(cond) macro is used in a constant-evaluated context. Compilation fails when using code from libstdc++ (such as std::array) in device code, since these assertions invoke a non-constexpr host function from device code. This patch proposes a CUDA wrapper header "bits/c++config.h" which adds a __device__ version of std::__glibcxx_assert_fail(). Solves SWDEV-518041	2025-06-17 14:32:05 +02:00
Gaëtan Bossu	087d83e0c6	[SLP] vectorizeStores: Name things a bit more clearly (NFC) (#144511 ) I believe the new variable names better convey their purpose. However, I also believe that function is more complex than it needs to be, and this tiny patch should be seen as a first step towards (maybe) further refactoring. The previous names were very generic (Size, Sz, Cnt, StartIdx). This made it easy to get confused given that the vectorizeStores() function is already complex enough. My hope would be to eventually have a function concise enough to clearly see what are the different strategies being attempted to vectorise a group of related store instructions.	2025-06-17 13:20:52 +01:00
Denzel-Brian Budii	12611a7fc7	[mlir] Improve mlir-query by adding matcher combinators (#141423 ) Whereas backward-slice matching provides support to limit traversal by specifying the desired depth level, this pull request introduces support for limiting traversal with a nested matcher (adding forward-slice also). It also adds support for variadic operators, including `anyOf` and `allOf`. Rather than simply stopping traversal when an operation named foo is encountered, one can now define a matcher that specifies different exit conditions. Variadic support implementation within mlir-query is very similar to clang-query.	2025-06-17 14:07:20 +02:00
Simon Pilgrim	9700930bd9	[X86] detectZextAbsDiff - convert to SDPatternMatch matching. NFC. (#144498 ) Match the entire ABS(SUB(ZEXT(vXi8),ZEXT(vXi8))) pattern and simplify the logic in combineBasicSADPattern accordingly	2025-06-17 13:06:10 +01:00
Oleksandr "Alex" Zinenko	875b36a874	[mlir] fix MemRefToLLVM lowering of atomic operations (#139045 ) We have been confusingly, and arguably incorrectly, lowering `m**imumf` atomic RMW operations in the MemRef dialect to `fm**` atomic RMW operations in the LLVM dialect, which have different NaN-propagation semantics: `m**imumf` propagates NaNs from either operand whereas `fm**`, which lowers to the `fm**num` intrinsic, returns the non-NaN operand. This also contradicts the lowering of `arith.m**imumf` and `arith.m**numf` operations. Change the lowering to match the terminology in arith. Add tests for these lowerings. Keep a debug message in case of surprising behavior downstream (the code may be producing more NaNs now).	2025-06-17 13:40:57 +02:00
Simon Pilgrim	990d2540bf	[X86] isAddSubOrSubAdd - convert to SDPatternMatch matching. NFC. (#144486 )	2025-06-17 12:12:46 +01:00
Arseniy Zaostrovnykh	2d336e7c5e	[analyzer] Avoid contradicting assumption in tainted div-by-0 error node (#144491 ) This patch corrects the state of the error node generated by the core.DivideZero checker when it detects potential division by zero involving a tainted denominator. The checker split in `91ac5ed10a` started to introduce a conflicting assumption about the denominator into the error node: Node with the Bug Report "Division by a tainted value, possibly zero" has an assumption "denominator != 0". This has been done as a shortcut to continue analysis with the correct assumption after the division - if we proceed, we can only assume the denominator was not zero. However, this assumption is introduced one-node too soon, leading to a self-contradictory error node. In this patch, I make the error node with assumption of zero denominator fatal, but allow analysis to continue on the second half of the state split with the assumption of non-zero denominator. --- CPP-6376	2025-06-17 13:07:44 +02:00
Nikita Popov	49c6235d1f	[PowerPC] Regenerate MIR test checks (NFC)	2025-06-17 12:52:00 +02:00
Timm Baeder	576ced56d7	[clang][bytecode] Simplify Block::replacePointer() (#144490 ) Try to do less work here instead of a full remove + add.	2025-06-17 12:43:39 +02:00
Timm Baeder	ce96fdde54	[clang][bytecode] Keep the last chunk in InterpStack::clear() (#144487 ) We call clear when checking for potential constant expressions, but that used to free all the chunks. Keep the last one so we don't have to re-allocate it.	2025-06-17 12:38:02 +02:00
Antonio Frighetto	d3f13a0732	[GVN] MemorySSA for GVN: embed the memory state in symbolic expressions (#123218 ) While migrating towards MemorySSA, account for the memory state modeled by MemorySSA by hashing it, when computing the symbolic expressions for the memory operations. Likewise, when phi-translating while walking the CFG for PRE possibilities, see if the value number of an operand may be refined with one of the values from the incoming edges of the MemoryPhi associated with the current phi. Co-authored-by: Momchil Velikov <momchil.velikov@arm.com>	2025-06-17 12:30:47 +02:00
Simon Pilgrim	71f72f4d5d	[DAG] Move foldMaskedMerge before visitAND. NFC. Reduces diff in #144342	2025-06-17 11:21:56 +01:00
Paul Walker	465e3ce9f1	[LLVM][CodeGen] Lower ConstantInt vectors like shufflevector base splats. (#144395 ) ConstantInt vectors utilise DAG.getConstant() when constructing the initial DAG. This can have the effect of legalising the constant before the DAG combiner is run, significantly altering the generated code. To mitigate this (hopefully as a temporary measure) we instead try to construct the DAG in the same way as shufflevector based splats.	2025-06-17 11:09:22 +01:00
Mary Kassayova	c377ce1216	[AArch64][VecLib] Add libmvec support for AArch64 targets (#143696 ) This patch adds support for the `libmvec` vector library on AArch64 targets. Currently, all `libmvec` functions in GLIBC version 2.40 are supported. The full list of math functions enabled can be found [here](`96abd59bf2/sysdeps/aarch64/fpu/Versions`) (up to GLIBC 2.40). Previously, `libmvec` was only supported on x86_64 targets. Attempts to use it on AArch64 resulted in the following error from Clang: `unsupported option 'libmvec' for target 'aarch64'`.	2025-06-17 11:07:43 +01:00
Momchil Velikov	7eda8274fe	[MLIR] Integration tests for lowering vector.contract to SVE FEAT_I8MM (#140573 )	2025-06-17 11:03:14 +01:00
Ying Yi	6f29837659	Reland: "[Frontend][PCH]-Add support for ignoring PCH options (-ignore-pch). (#142409 )" (#143614 ) Visual Studio has an argument to ignore all PCH related switches. clang-cl also supports the option /Y-. Having the same option in clang would be helpful. This commit is to add support for ignoring PCH options (-ignore-pch). The commit includes: 1. Implement -ignore-pch as a Driver option. 2. Add a Driver test and a PCH test. 3. Add a section on -ignore-pch to the user manual. 4. Add a release note for the new option '-ignore-pch'. The change since the original landing: 1. preprocessing-only mode doesn't imply that -include-pch is disabled. Co-authored-by: Matheus Izvekov <mizvekov@gmail.com>	2025-06-17 10:54:22 +01:00
Donát Nagy	4c8f434409	[analyzer] Conversion to CheckerFamily: NullabilityChecker (#143735 ) This commit converts NullabilityChecker to the new checker family framework that was introduced in the recent commit `6833076a5d`. This commit removes the dummy checker `nullability.NullabilityBase` because it was hidden from the users and didn't have any useful role except for helping the registration of the checker parts in the old ad-hoc system (which is replaced by the new standardized framework). Except for the removal of this dummy checker, no functional changes intended.	2025-06-17 11:51:09 +02:00
Matt Arsenault	00709c306d	AArch64: Fix hardcoding calling convention of sincos_stret (NFC) (#144336 )	2025-06-17 18:44:32 +09:00
Tom Eccles	aa01e8e9cf	[mlir][OpenMP] Fix broken insertion point for charbox with omp task (#143112 ) Fixes #142365	2025-06-17 10:42:42 +01:00
Simon Pilgrim	277b2b6da7	[X86] combineCastedMaskArithmetic - convert to SDPatternMatch matching. NFC. (#144472 )	2025-06-17 10:39:54 +01:00
Karlo Basioli	dfd00edbab	Fix for #144391 not fully addressed by #144484 (#144488 )	2025-06-17 10:37:18 +01:00
Kareem Ergawy	97e17e1595	Revert "[flang] Enable delayed localization by default for `do concurrent` (#144074 )" (#144476 ) This reverts commit `b5dbf8210a`. Reverting again due to gfortran failure: https://lab.llvm.org/buildbot/#/builders/17/builds/8868	2025-06-17 11:34:05 +02:00
Jesse Huang	e5ad7f4556	[RISCV] Move RISCVIndirectBranchTracking before Branch Relaxation (#139993 ) The `RISCVIndirectBranchTracking` pass inserts `lpad` instructions and could change the basic block alignment, so it should not run after branch relaxation, as the adjusted offset could exceed the branch range.	2025-06-17 17:21:24 +08:00
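
Illustration for `b4e39e4ff9` (terminal width): the lookup order described in that commit message can be sketched in Python as below. This is not the LLVM implementation; the function name `terminal_width`, its parameters, and the choice to return 0 on failure are assumptions made for the example. It tries `COLUMNS` first and falls back to `ioctl()` with `TIOCGWINSZ` when the variable is missing or is not a positive integer. Unix only.

```python
import fcntl
import os
import struct
import termios


def terminal_width(fd=1, environ=None):
    """Return the terminal width in columns, or 0 if it cannot be determined.

    Hypothetical helper: name, parameters and the 0 fallback are illustrative.
    """
    env = os.environ if environ is None else environ
    # First choice: a COLUMNS value that parses as a positive integer.
    try:
        columns = int(env.get("COLUMNS", ""))
        if columns > 0:
            return columns
    except ValueError:
        pass
    # Fallback: ask the terminal driver for the window size.
    try:
        packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
        _rows, cols, _xpixels, _ypixels = struct.unpack("HHHH", packed)
        return cols
    except OSError:
        # fd is not a terminal (for example a pipe or a regular file).
        return 0
```

Calling `terminal_width()` with no arguments consults the real environment and standard output; passing `environ` and `fd` explicitly makes the two code paths easy to exercise separately.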

541275 Commits
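
Illustration for `875b36a874` (NaN propagation): the two maximum flavors that commit message contrasts differ only in how they treat NaN operands. The sketch below models that difference on Python floats; the function names `maximum` and `maxnum` are chosen for the example and are not MLIR or LLVM APIs, and signed-zero ordering is deliberately ignored.

```python
import math


def maximum(a, b):
    """NaN-propagating maximum: the result is NaN if either operand is NaN."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    return max(a, b)


def maxnum(a, b):
    """NaN-ignoring maximum: if one operand is NaN, return the other one."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return max(a, b)
```

With one NaN operand, `maximum(1.0, math.nan)` yields NaN while `maxnum(1.0, math.nan)` yields `1.0`, which is why switching a lowering from the second flavor to the first can produce more NaNs downstream.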