intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-22 07:01:03 +08:00

Author	SHA1	Message	Date
Charles Zablit	6751b3a549	Revert "[lit] cleanup unused imports" (#144054 ) Reverts llvm/llvm-project#143930 as it causes build failures: https://github.com/llvm/llvm-project/pull/143930#issuecomment-2969115461	2025-06-13 08:16:09 -07:00
Kazu Hirata	bcfbba12e6	[llvm] Compare std::optional<T> to values directly (NFC) (#143913 ) This patch transforms: X && *X == Y to: X == Y where X is of std::optional<T>, and Y is of T or similar.	2025-06-13 08:11:20 -07:00
Darren Wihandi	9e62298652	[mlir][spirv] Fix FuncOpVectorUnroll to process placeholder values in all blocks (#142339) `FuncOpVectorUnroll` contains logic that replaces function arguments with placeholder values. These replacements also involve changing all instructions in the function that use the arguments to use these placeholders. These placeholder values will later be changed back to use the function arguments (either new or original if already legal). The current implementation, however, performs this second replacement (from placeholder values back to new/legal arguments) only in the first block of instructions and not in all of the blocks. This may leave some instructions using these placeholder values (which for already legal arguments are just zeroattr values that will get DCE'd) instead of the arguments, which is incorrect. Closes #132158.	2025-06-13 11:06:31 -04:00
Orlando Cazalet-Hyams	ebd7f7539b	[KeyInstr][NFC] Fix incorrect atomGroup/rank uint size in computeKeyInstructions	2025-06-13 16:04:03 +01:00
nicebert	cf6ae065a0	[OpenMP] Remove declaration and usage of __AMDGCN_WAVEFRONT_SIZE (#143761 ) Removes usage of __AMDGCN_WAVEFRONT_SIZE as compile time constant. --------- Co-authored-by: Shilei Tian <i@tianshilei.me>	2025-06-13 10:46:36 -04:00
Devon Loehr	9670e09d0e	Enable unique-object-duplication warning for windows (#143537 ) Followup to #125526. This expands the logic of the unique-object-duplication warning so that it also works for windows code. For the most part, the logic is unchanged, merely substituting "has no import/export annotation" in place of "has hidden visibility". However, there are some small inconsistencies between the two; namely, visibility is propagated through nested classes, while import/export annotations aren't. This PR: 1. Updates the logic for the warning to account for the differences between posix and windows 2. Changes the warning message and documentation appropriately 3. Updates the tests to cover windows, and adds new test cases for the places where behavior differs. This PR was tested by building chromium (cross compiling linux->windows) with the changes in place. After accounting for the differences in semantics, no new warnings were discovered.	2025-06-13 10:29:42 -04:00
David Spickett	82911f188b	[lldb][test] Skip ReadAfterClose JSON Transport tests on Windows These were failing on our Windows on Arm bot, or more precisely, not even completing. This is because Microsoft's C runtime does extra parameter validation. So when we called _read with an invalid fd, it called an invalid parameter handler instead of returning an error. https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/read?view=msvc-170 https://learn.microsoft.com/en-us/cpp/c-runtime-library/parameter-validation?view=msvc-170 (lldb) run Process 8440 launched: 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\unittests\Host\HostTests.exe' (aarch64) Process 8440 stopped * thread #1, stop reason = Exception 0xc0000409 encountered at address 0x7ffb7453564c frame #0: 0x00007ffb7453564c ucrtbase.dll`_get_thread_local_invalid_parameter_handler + 652 ucrtbase.dll`_get_thread_local_invalid_parameter_handler: -> 0x7ffb7453564c <+652>: brk #0xf003 ucrtbase.dll`_invalid_parameter_noinfo: 0x7ffb74535650 <+0>: b 0x7ffb745354d8 ; _get_thread_local_invalid_parameter_handler + 280 0x7ffb74535654 <+4>: nop 0x7ffb74535658 <+8>: nop You can override this handler but I'm assuming that this reading after close isn't a crucial feature, so disabling the tests seems like the way to go. If it is crucial, we can check the fd before we use it. Tests added by https://github.com/llvm/llvm-project/pull/143946.	2025-06-13 14:26:06 +00:00
Ross Brunton	e6a3579653	[Offload] Replace device info queue with a tree (#144050 ) Previously, device info was returned as a queue with each element having a "Level" field indicating its nesting level. This replaces this queue with a more traditional tree-like structure. This should not result in a change to the output of `llvm-offload-device-info`.	2025-06-13 09:22:47 -05:00
Darren Wihandi	0a0960dac6	[mlir][spirv] Add bfloat16 support (#141458 ) Adds bf16 support to SPIRV by using the `SPV_KHR_bfloat16` extension. Only a few operations are supported, including loading from and storing to memory, conversion to/from other types, cooperative matrix operations (including coop matrix arithmetic ops) and dot product support. This PR adds the type definition and implements the basic cast operations. Arithmetic/coop matrix ops will be added in a separate PR.	2025-06-13 10:14:45 -04:00
Fabian Ritter	8b11de7068	[AMDGPU][SDAG] Initial support for ISD::PTRADD (#141725 ) Enable generation of PTRADD SelectionDAG nodes for pointer arithmetic for SI, for now behind an internal CLI option. Also add basic patterns to match these nodes. Optimizations will come in follow-up PRs. Basic tests for SDAG codegen with PTRADD are in test/CodeGen/AMDGPU/ptradd-sdag.ll Only affects 64-bit address spaces for now, since the immediate use case only affects the flat address space. For SWDEV-516125.	2025-06-13 15:59:58 +02:00
Yash Solanki	a361a3dc7a	[llvm][InstCombine] Fold select to cmp for weak and inverted inequalities (#143445 )	2025-06-13 21:53:34 +08:00
Simon Pilgrim	6f999a5d99	[x86] vector-pcmp.ll - regenerate VPTERNLOGD asm comment	2025-06-13 14:52:24 +01:00
Michael Buch	c3ec9e3f65	[lldb][DWARF] Don't try to compute address range information of forward declarations (#144059 ) This fixes the error reported in https://github.com/llvm/llvm-project/pull/144037. When computing the aranges table of a CU, LLDB would currently visit all `DW_TAG_subprogram` DIEs and check their `DW_AT_low_pc`/`DW_AT_high_pc`/`DW_AT_ranges` attributes. If those don't exist it would error out and spam the console. Some subprograms (particularly forward declarations) don't have low/high pc attributes, so it's not really an "error". See DWARFv5 spec section `3.3.3 Subroutine and Entry Point Locations`: ``` A subroutine entry may have either a DW_AT_low_pc and DW_AT_high_pc pair of attributes or a DW_AT_ranges attribute whose values encode the contiguous or non-contiguous address ranges, respectively, of the machine instructions generated for the subroutine (see Section 2.17 on page 51). ... A subroutine entry representing a subroutine declaration that is not also a definition does not have code address or range attributes. ``` We should just ignore those DIEs.	2025-06-13 14:40:27 +01:00
zhijian lin	ea73fc5f07	[PowerPC] fixed mtvsrbmi.ll test case error caused by run the update_llc_test_checks.py (#144075) Fixes the mtvsrbmi.ll test case error caused by running update_llc_test_checks.py.	2025-06-13 09:38:54 -04:00
Kareem Ergawy	7e0bb2b0b9	[flang][fir] Extend locality specs lowering to support `init` and `dealloc` regions (#144027) Extending `fir.do_concurrent` to `fir.do_loop ... unordered` lowering by adding support for lowering/inlining non-empty `init` and `dealloc` regions. Resolves https://github.com/llvm/llvm-project/issues/143897 (actually handles the todo).	2025-06-13 15:21:23 +02:00
zhijian lin	9c2e0bd59c	[PowerPC][NFC] Pre-commit test case for checking whether `mtvsrbmi` power10 instruction not used (#143956 ) Verify whether the generated assembly for the following function includes the mtvsrbmi instruction. vector unsigned char v00FF() { vector unsigned char x = { 0xFF, 0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0 }; return x; }	2025-06-13 09:19:10 -04:00
Tom Eccles	6ca31ad720	[flang][OpenMP] improve semantic check for invalid goto (#144040 ) Fixes #143229	2025-06-13 14:17:39 +01:00
Tom Eccles	4a47634a00	[flang][OpenMP] Support substrings and complex part refs for DEPEND (#143907 ) Fixes #142404 The parser can't tell the difference between array indexing and a substring: that has to be done in semantics once we have types. Substrings can only be in the form string([lower]:[higher]) not string(index) or string(lower:higher:step). I added semantic checks to catch this for the DEPEND clause. This patch also adds lowering for correct substrings and for complex part references.	2025-06-13 14:16:58 +01:00
zhijian lin	85a9f2e148	[PowerPC] enable AtomicExpandImpl::expandAtomicCmpXchg for powerpc (#142395 ) In PowerPC, the AtomicCmpXchgInst is lowered to ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS. However, this node does not handle the weak attribute of AtomicCmpXchgInst. As a result, when compiling C++ atomic_compare_exchange_weak_explicit, the generated assembly includes a "reservation lost" loop — i.e., it branches back and retries if the stwcx. (store-conditional) fails. This differs from GCC’s codegen, which does not include that loop for weak compare-exchange. Since PowerPC uses LL/SC-style atomic instructions, the patch enables AtomicExpandImpl::expandAtomicCmpXchg for PowerPC. With this, the weak attribute is properly respected, and the "reservation lost" loop is removed for weak operations. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-06-13 09:14:48 -04:00
Ryan Buchner	a59e4acd75	[RISCV] Lower SELECT's with one constant more efficiently using Zicond (#143581 ) See #143580 for MR with the test commit. Performs the following transformations: (select c, c1, t) -> (add (czero_nez t - c1, c), c1) (select c, t, c1) -> (add (czero_eqz t - c1, c), c1) @mgudim	2025-06-13 08:57:46 -04:00
Nikita Popov	4f8187c0dc	[TSan] Regenerate test checks (NFC)	2025-06-13 14:54:24 +02:00
Simon Pilgrim	d7ddd46116	[X86] Add start/end debug messages for the X86CompressEVEXPass and X86PadShortFunctionPass (#144056 )	2025-06-13 13:47:45 +01:00
Shilei Tian	9a237f35ef	[AMDGPU][AsmParser] Support true16 register suffix for valid register range (#143997 )	2025-06-13 08:39:00 -04:00
Martin Wehking	fbea0fc5c7	Add Macro for CSSC Feature (#143148 ) Add a new __ARM_FEATURE_CSSC macro that can be utilized during the preprocessing stage. __ARM_FEATURE_CSSC is defined to 1 if there is hardware support for CSSC. Implements the ACLE change: https://github.com/ARM-software/acle/pull/394	2025-06-13 13:33:46 +01:00
Stephen Tozer	cc365331af	[DLCov] Origin-Tracking: Add config options (#143590 ) This patch is part of a series that adds origin-tracking to the debugify source location coverage checks, allowing us to report symbolized stack traces of the point where missing source locations appear. This patch adds the configuration options needed to enable this feature, in the form of a new CMake option that enables a flag in `llvm-config.h`; this is not an entirely new CMake flag, but a new option, `COVERAGE_AND_ORIGIN`, for the existing flag `LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING`. This patch contains documentation, but no actual implementation for the flag itself.	2025-06-13 12:54:30 +01:00
Simon Pilgrim	f1036d844e	[X86] X86InstrInfo::commuteInstructionImpl - remove (V)BLENDPD/S commutation to (V)MOVSD/S optsize handling (#144051 ) Just commute with (V)BLENDPD/S like all other BLEND instructions This is now handled more generally by the X86FixupInstTuningPass (OptSize fold occurs even without a scheduler model). First step towards #142972	2025-06-13 12:49:22 +01:00
Michael Buch	41b37f0555	[lldb] CommandObjectMemoryFind: Improve expression evaluation error messages (#144036) We now bubble up the expression evaluation diagnostics to the user and also distinguish between "expression failed to parse/run" versus other ways in which expressions didn't complete (e.g., setup errors, etc.). Before: ``` (lldb) memory find -e "" 0x16fdfedc0 0x16fdfede0 error: expression evaluation failed. pass a string instead (lldb) memory find -e "invalid" 0x16fdfedc0 0x16fdfede0 error: expression evaluation failed. pass a string instead ``` After: ``` (lldb) memory find -e "" 0x16fdfedc0 0x16fdfede0 error: Expression evaluation failed: error: No result returned from expression. Exit status: 1 (lldb) memory find -e "invalid" 0x16fdfedc0 0x16fdfede0 error: Expression evaluation failed: error: <user expression 0>:1:1: use of undeclared identifier 'invalid' 1 | invalid | ^~~~~~~ ```	2025-06-13 12:43:27 +01:00
Ilia Kuklin	4236423ee8	[LLDB] Add bit extraction to DIL (#141422 )	2025-06-13 16:31:25 +05:00
Aaron Ballman	30725efe67	Fix build after removing delayed typo expression This addresses issues found by: https://lab.llvm.org/buildbot/#/builders/64/builds/4220 https://lab.llvm.org/buildbot/#/builders/51/builds/17890	2025-06-13 07:12:41 -04:00
Abhina Sree	be9994b092	[SystemZ][z/OS] Refactor AutoConvert more (#143955 ) This patch removes the C++ disablezOSAutoConversion,enablezOSAutoConversion declarations and also updates Path.inc to use the common function.	2025-06-13 07:00:36 -04:00
Diana Picus	a5cbd2ab0b	Revert "[AMDGPU] Skip register uses in AMDGPUResourceUsageAnalysis (#… (#144039 ) …133242)" This reverts commit `130080fab1` because it causes issues in testcases similar to coalescer_remat.ll [1], i.e. when we use a VGPR tuple but only write to its lower parts. The high VGPRs would then not be included in the vgpr_count, and accessing them would be an out of bounds violation. [1] https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/AMDGPU/coalescer_remat.ll	2025-06-13 12:48:24 +02:00
Aaron Ballman	9eef4d1c5f	Remove delayed typo expressions (#143423 ) This removes the delayed typo correction functionality from Clang (regular typo correction still remains) due to fragility of the solution. An RFC was posted here: https://discourse.llvm.org/t/rfc-removing-support-for-delayed-typo-correction/86631 and while that RFC was asking for folks to consider stepping up to be maintainers, and we did have a few new contributors show some interest, experiments show that it's likely worth it to remove this functionality entirely and focus efforts on improving regular typo correction. This removal fixes ~20 open issues (quite possibly more), improves compile time performance by roughly .3-.4% (https://llvm-compile-time-tracker.com/?config=Overview&stat=instructions%3Au&remote=AaronBallman&sortBy=date), and does not appear to regress diagnostic behavior in a way we wouldn't find acceptable. Fixes #142457 Fixes #139913 Fixes #138850 Fixes #137867 Fixes #137860 Fixes #107840 Fixes #93308 Fixes #69470 Fixes #59391 Fixes #58172 Fixes #46215 Fixes #45915 Fixes #45891 Fixes #44490 Fixes #36703 Fixes #32903 Fixes #23312 Fixes #69874	2025-06-13 06:45:40 -04:00
David Sherwood	541e5118ce	[LV] Use getFixedValue instead of getKnownMinValue when appropriate (#143526 ) There are many places in VPlan and LoopVectorize where we use getKnownMinValue to discover the number of elements in a vector. Where we expect the vector to have a fixed length, I have used the stronger getFixedValue call. I believe this is clearer and adds extra protection in the form of an assert in getFixedValue that the vector is not scalable. While looking at VPFirstOrderRecurrencePHIRecipe::computeCost I also took the liberty of simplifying the code. In theory I believe this patch should be NFC, but I'm reluctant to add that to the title in case we're just missing tests for some of the VPlan changes. I built and ran the LLVM test suite when targeting neoverse-v1 and it seemed ok.	2025-06-13 11:43:50 +01:00
Nikita Popov	2019553a0b	[InstCombine] Extract EmitGEPOffsets() helper (NFC) Extract a reusable helper for emitting a sum of multiple GEP offsets.	2025-06-13 12:34:18 +02:00
Nikita Popov	6fc8ec720e	[InstCombine] Restore splat gep support in OptimizePointerDifference() (#143906 ) When looking for the common base pointer, support the case where the type changes because the GEP goes from pointer to vector of pointers. This was supported prior to #142958.	2025-06-13 12:29:50 +02:00
Simon Pilgrim	e2c27fd66a	[X86] X86FixupInstTuning - hoist OptSize flag. NFC. Allow reuse in a future patch.	2025-06-13 11:11:01 +01:00
Simon Pilgrim	058602372e	[X86] X86FixupInstTuning - fold BLENDPS -> MOVSD (#144029 ) Reduces codesize - make use of free PS<->PD domain transfers (like we do in many other places) and replace a suitable BLENDPS mask with MOVSD if OptSize or the scheduler prefers it	2025-06-13 11:05:57 +01:00
SivanShani-Arm	5762491e2a	[lld] Refactor storage of PAuth ABI core info (#141920 ) Previously, the AArch64 PAuth ABI core values were stored as an ArrayRef<uint8_t>, introducing unnecessary indirection. This patch replaces the ArrayRef with two explicit uint64_t fields: aarch64PauthAbiPlatform and aarch64PauthAbiVersion. This simplifies the representation and improves readability. No functional change intended, aside from improved error messages.	2025-06-13 11:02:33 +01:00
David Spickett	06c7835670	[lldb][test] Disable TestMultipleDebuggers again I did manage to turn a crash into a non-zero return code, but on the very first build it managed to time out. I thought I had the appetite to tweak timeouts but on second thought, I don't want yet another test to look out for. The test is not wrong, but on heavily loaded machines it's always going to be inherently unstable.	2025-06-13 09:12:01 +00:00
Tim Gymnich	67c590004d	[mlir][AMDGPU] Add scaled floating point conversion ops (#141554) Implements `ScaledExtPackedOp` and `PackedScaledTruncOp`.	2025-06-13 11:09:11 +02:00
Simone Pellegrini	4b59b7b946	[mlir][Linalg] Fix fusing of indexed linalg consumer with different axes (#140892 ) When fusing two `linalg.genericOp`, where the producer has index semantics, invalid `affine.apply` ops can be generated where the number of indices do not match the number of loops in the fused genericOp. This patch fixes the issue by directly using the number of loops from the generated fused op.	2025-06-13 10:03:09 +01:00
David Sherwood	2d49bc01cf	[LV][NFC] Tidy up check-prof-info.ll test (#143884 )	2025-06-13 10:02:27 +01:00
Sirui Mu	8ba62fdb3d	[CIR] Function calls with aggregate arguments and return values (#143377) This patch updates the cir.call operation to allow function calls with aggregate arguments and return values. It seems that C++ class support is still at a minimum now. I tried to make a call to a C++ function with an argument of aggregate type but it failed because the initialization of C++ class / struct is NYI. I also tried to inline this part of support into this patch, but the mixed patch quickly blows up in size and becomes unsuitable for review. Thus, tests for calling functions with aggregate arguments are added only for C for now.	2025-06-13 16:47:56 +08:00
David Spickett	addd98f7a5	[lldb][test] Don't call SBDebugger::Terminate if TestMultipleDebuggers times out (#143732 ) Fixes #101162 This test did this: * SBDebugger::Initialize * Spawn a bunch of threads that do: * SBDebugger::Create * some work * SBDebugger::Destroy * Wait on those threads to finish then call SBDebugger::Terminate and exit, or - * Reach a time limit before all the threads finish, call SBDebugger::Terminate and exit. The problem was that in the timeout case, calling SBDebugger::Terminate destroys data being used by threads that are still running. I expect this test was expecting said threads to be so broken they were probably stuck, but when the machine is just heavily loaded, one of them might read that data before the whole program exits. This means what should have been a timeout becomes a crash. Sometimes. Which explains why we saw both timeouts and various signals on the AArch64 Linux bot. It depends on the timings. So I'm changing it not to call SBDebugger::Terminate in the timeout case. We will have to tweak the timeout value based on what happens on the buildbot, but we will know it's machine load not an lldb bug. Also use _exit instead of exit, to skip more cleanup that might cause a crash.	2025-06-13 09:31:57 +01:00
Orlando Cazalet-Hyams	0cf333878d	[NFC] Pack MDNodeKeyImpl<DILocation> from 40 to 32 bytes (#143891 )	2025-06-13 09:26:08 +01:00
Sander de Smalen	d4826cd324	[AArch64] Observe Z-reg inline asm clobbers without SVE (#143742) Inline asm that clobbers any of the z-registers when not in streaming mode should still observe that the lower 128 bits of those registers are clobbered.	2025-06-13 09:07:09 +01:00
LU-JOHN	c4caf00bfb	[AMDGPU] Convert more 64-bit lshr to 32-bit if shift amt>=32 (#138204 ) Convert vector 64-bit lshr to 32-bit if shift amt is known to be >= 32. Also convert scalar 64-bit lshr to 32-bit if shift amt is variable but known to be >=32. --------- Signed-off-by: John Lu <John.Lu@amd.com>	2025-06-13 17:03:06 +09:00
Shamshura Egor	02b6ed0bf1	[Clang] Added explanation why `is_constructible` evaluated to false. (#143309) Added an explanation of why `is_constructible` evaluated to false. Also fixed a problem with `ExtractTypeTraitFromExpression`: for `std::is_xxx_v<>` with a variadic pack, it tries to get the template argument but fails in the expression `Arg.getAsType()` because `Arg.getKind()` is `TemplateArgument::ArgKind::Pack` rather than `TemplateArgument::ArgKind::Type`.	2025-06-13 09:53:15 +02:00
Simon Pilgrim	cd3d234868	[X86] X86FixupInstTuning - extend BLENDPD/S -> MOVSD/S handling to SSE variant (#143961 )	2025-06-13 08:52:48 +01:00
Longsheng Mou	02f1f6967a	[mlir][linalg] Add pure tensor check for `winogradConv2DHelper` (#142299 ) This PR adds pure tensor semantics check for `winogradConv2DHelper` to prevent a crash. Fixes #141566.	2025-06-13 15:49:54 +08:00
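
The `std::optional` rewrite described in bcfbba12e6 above (`X && *X == Y` to `X == Y`) can be illustrated with a small sketch. This is not code from the patch; the function names `equalsVerbose`, `equalsDirect`, and `checkOptionalCompare` are hypothetical and exist only for illustration. It relies on the standard `operator==` between `std::optional<T>` and a value, which yields `false` for an empty optional.

```cpp
#include <cassert>
#include <optional>

// Older spelling: check that a value is present, then dereference and compare.
bool equalsVerbose(const std::optional<int> &X, int Y) { return X && *X == Y; }

// Direct spelling: std::optional's operator== against a plain value performs
// both steps; an empty optional never compares equal to a value.
bool equalsDirect(const std::optional<int> &X, int Y) { return X == Y; }

// Hypothetical self-check that the two spellings agree on the cases that matter.
void checkOptionalCompare() {
  const std::optional<int> Empty;
  const std::optional<int> Seven = 7;
  assert(equalsVerbose(Empty, 7) == equalsDirect(Empty, 7));
  assert(equalsVerbose(Seven, 7) == equalsDirect(Seven, 7));
  assert(equalsVerbose(Seven, 8) == equalsDirect(Seven, 8));
}
```

The two forms agree for empty, matching, and non-matching optionals, which is consistent with the commit being marked NFC.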

540908 Commits
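
The two select transformations listed in a59e4acd75 (the RISCV Zicond commit in the table above) can be checked arithmetically with a small model. This is a sketch, not the lowering code: `czeroEqz` and `czeroNez` are hypothetical helpers that model the `czero.eqz` and `czero.nez` instructions (result is zero when the condition is zero or nonzero respectively, otherwise the value operand), and 64-bit unsigned arithmetic stands in for register wraparound.

```cpp
#include <cassert>
#include <cstdint>

// Model of czero.eqz: yields 0 when Cond == 0, otherwise Val.
uint64_t czeroEqz(uint64_t Val, uint64_t Cond) { return Cond == 0 ? 0 : Val; }

// Model of czero.nez: yields 0 when Cond != 0, otherwise Val.
uint64_t czeroNez(uint64_t Val, uint64_t Cond) { return Cond != 0 ? 0 : Val; }

// (select c, c1, t) -> (add (czero_nez t - c1, c), c1)
// C1 is the constant chosen when C is nonzero; T is the other operand.
uint64_t selectConstTrue(uint64_t C, uint64_t C1, uint64_t T) {
  return czeroNez(T - C1, C) + C1;
}

// (select c, t, c1) -> (add (czero_eqz t - c1, c), c1)
// T is chosen when C is nonzero; C1 is the constant chosen when C is zero.
uint64_t selectConstFalse(uint64_t C, uint64_t T, uint64_t C1) {
  return czeroEqz(T - C1, C) + C1;
}
```

In both forms the difference `t - c1` is either kept or zeroed by the condition, and adding `c1` back yields `t` or `c1`. Unsigned wraparound makes this hold even when `t` is smaller than `c1`.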