Commit Graph

540908 Commits

Author SHA1 Message Date
Charles Zablit
6751b3a549 Revert "[lit] cleanup unused imports" (#144054)
Reverts llvm/llvm-project#143930 as it causes build failures:
https://github.com/llvm/llvm-project/pull/143930#issuecomment-2969115461
2025-06-13 08:16:09 -07:00
Kazu Hirata
bcfbba12e6 [llvm] Compare std::optional<T> to values directly (NFC) (#143913)
This patch transforms:

  X && *X == Y

to:

  X == Y

where X is of std::optional<T>, and Y is of T or similar.
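
A minimal sketch of why the two forms agree, assuming C++17 `std::optional`; the helper names `equalsVerbose` and `equalsDirect` are hypothetical and do not appear in the patch:

```cpp
#include <optional>

// Original form: explicit engagement check followed by a dereference.
template <typename T, typename U>
bool equalsVerbose(const std::optional<T> &X, const U &Y) {
  return X && *X == Y;
}

// Rewritten form: std::optional's operator== against a value already
// returns false when the optional is empty.
template <typename T, typename U>
bool equalsDirect(const std::optional<T> &X, const U &Y) {
  return X == Y;
}
```

Both helpers should return the same result for engaged and empty optionals alike, which is what makes the rewrite NFC.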
2025-06-13 08:11:20 -07:00
Darren Wihandi
9e62298652 [mlir][spirv] Fix FuncOpVectorUnroll to process placeholder values in all blocks (#142339)
`FuncOpVectorUnroll` contains logic that replaces function arguments by
placeholder values. These replacements also involve changing all
instructions in the function that use the arguments to use these
placeholders. These placeholder values will later be changed back to use
the function arguments (either new or original if already legal).

The current implementation, however, only performs the second
replacement (placeholder values back to new/legal arguments) in the
first block of instructions and not in all of the blocks.
This may leave some instructions to use these placeholder values (which
for already legal arguments are just zeroattr values that will get
DCE'd) instead of the arguments, which is incorrect.

Closes #132158.
2025-06-13 11:06:31 -04:00
Orlando Cazalet-Hyams
ebd7f7539b [KeyInstr][NFC] Fix incorrect atomGroup/rank uint size in computeKeyInstructions 2025-06-13 16:04:03 +01:00
nicebert
cf6ae065a0 [OpenMP] Remove declaration and usage of __AMDGCN_WAVEFRONT_SIZE (#143761)
Removes usage of __AMDGCN_WAVEFRONT_SIZE as compile time constant.

---------

Co-authored-by: Shilei Tian <i@tianshilei.me>
2025-06-13 10:46:36 -04:00
Devon Loehr
9670e09d0e Enable unique-object-duplication warning for windows (#143537)
Follow-up to #125526. This expands the logic of the
unique-object-duplication warning so that it also works for Windows
code.

For the most part, the logic is unchanged, merely substituting "has no
import/export annotation" in place of "has hidden visibility". However,
there are some small inconsistencies between the two; namely, visibility
is propagated through nested classes, while import/export annotations
aren't.

This PR:
1. Updates the logic for the warning to account for the differences
between POSIX and Windows
2. Changes the warning message and documentation appropriately
3. Updates the tests to cover Windows, and adds new test cases for the
places where behavior differs.

This PR was tested by building chromium (cross compiling linux->windows)
with the changes in place. After accounting for the differences in
semantics, no new warnings were discovered.
2025-06-13 10:29:42 -04:00
David Spickett
82911f188b [lldb][test] Skip ReadAfterClose JSON Transport tests on Windows
These were failing on our Windows on Arm bot, or more precisely,
not even completing.

This is because Microsoft's C runtime does extra parameter validation.
So when we called _read with an invalid fd, it called an invalid
parameter handler instead of returning an error.

https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/read?view=msvc-170
https://learn.microsoft.com/en-us/cpp/c-runtime-library/parameter-validation?view=msvc-170

(lldb) run
Process 8440 launched: 'C:\Users\tcwg\llvm-worker\lldb-aarch64-windows\build\tools\lldb\unittests\Host\HostTests.exe' (aarch64)
Process 8440 stopped
* thread #1, stop reason = Exception 0xc0000409 encountered at address 0x7ffb7453564c
    frame #0: 0x00007ffb7453564c ucrtbase.dll`_get_thread_local_invalid_parameter_handler + 652
ucrtbase.dll`_get_thread_local_invalid_parameter_handler:
->  0x7ffb7453564c <+652>: brk    #0xf003

ucrtbase.dll`_invalid_parameter_noinfo:
    0x7ffb74535650 <+0>:   b      0x7ffb745354d8 ; _get_thread_local_invalid_parameter_handler + 280
    0x7ffb74535654 <+4>:   nop
    0x7ffb74535658 <+8>:   nop

You can override this handler but I'm assuming that this reading
after close isn't a crucial feature, so disabling the tests seems
like the way to go.

If it is crucial, we can check the fd before we use it.

Tests added by https://github.com/llvm/llvm-project/pull/143946.
2025-06-13 14:26:06 +00:00
Ross Brunton
e6a3579653 [Offload] Replace device info queue with a tree (#144050)
Previously, device info was returned as a queue with each element having
a "Level" field indicating its nesting level. This replaces this queue
with a more traditional tree-like structure.

This should not result in a change to the output of
`llvm-offload-device-info`.
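
The change of representation can be sketched roughly as follows. This is an illustration only: `FlatEntry`, `InfoNode`, and `buildTree` are hypothetical names, not the types used in the Offload code, and the sketch assumes the flat list is in pre-order with top-level entries at level 0.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical flat entry: nesting is encoded only in the Level field.
struct FlatEntry {
  std::string Key;
  std::size_t Level;
};

// Hypothetical tree node: nesting is encoded by ownership of children.
struct InfoNode {
  std::string Key;
  std::vector<std::unique_ptr<InfoNode>> Children;
};

// Converts a pre-order list with explicit levels into a tree under a
// synthetic root. Level 0 entries become direct children of the root.
std::unique_ptr<InfoNode> buildTree(const std::vector<FlatEntry> &Flat) {
  auto Root = std::make_unique<InfoNode>();
  // Stack[D] is the most recent node at depth D; Stack[0] is the root.
  std::vector<InfoNode *> Stack{Root.get()};
  for (const FlatEntry &E : Flat) {
    // Clamp so an entry is at most one level deeper than its predecessor.
    std::size_t Depth = std::min<std::size_t>(E.Level + 1, Stack.size());
    Stack.resize(Depth);
    auto Node = std::make_unique<InfoNode>();
    Node->Key = E.Key;
    InfoNode *Raw = Node.get();
    Stack.back()->Children.push_back(std::move(Node));
    Stack.push_back(Raw);
  }
  return Root;
}
```

Printing such a tree depth-first visits the entries in the same order as the flat list, which is consistent with the tool's output staying unchanged.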
2025-06-13 09:22:47 -05:00
Darren Wihandi
0a0960dac6 [mlir][spirv] Add bfloat16 support (#141458)
Adds bf16 support to SPIRV by using the `SPV_KHR_bfloat16` extension.
Only a few operations are supported, including loading from and storing
to memory, conversion to/from other types, cooperative matrix operations
(including coop matrix arithmetic ops) and dot product support.

This PR adds the type definition and implements the basic cast
operations. Arithmetic/coop matrix ops will be added in a separate PR.
2025-06-13 10:14:45 -04:00
Fabian Ritter
8b11de7068 [AMDGPU][SDAG] Initial support for ISD::PTRADD (#141725)
Enable generation of PTRADD SelectionDAG nodes for pointer arithmetic for SI,
for now behind an internal CLI option. Also add basic patterns to match these
nodes. Optimizations will come in follow-up PRs. Basic tests for SDAG codegen
with PTRADD are in test/CodeGen/AMDGPU/ptradd-sdag.ll

Only affects 64-bit address spaces for now, since the immediate use case only
affects the flat address space.

For SWDEV-516125.
2025-06-13 15:59:58 +02:00
Yash Solanki
a361a3dc7a [llvm][InstCombine] Fold select to cmp for weak and inverted inequalities (#143445) 2025-06-13 21:53:34 +08:00
Simon Pilgrim
6f999a5d99 [x86] vector-pcmp.ll - regenerate VPTERNLOGD asm comment 2025-06-13 14:52:24 +01:00
Michael Buch
c3ec9e3f65 [lldb][DWARF] Don't try to compute address range information of forward declarations (#144059)
This fixes the error reported in
https://github.com/llvm/llvm-project/pull/144037.

When computing the aranges table of a CU, LLDB would currently visit all
`DW_TAG_subprogram` DIEs and check their
`DW_AT_low_pc`/`DW_AT_high_pc`/`DW_AT_ranges` attributes. If those don't
exist it would error out and spam the console. Some subprograms
(particularly forward declarations) don't have low/high pc attributes,
so it's not really an "error". See DWARFv5 spec section `3.3.3
Subroutine and Entry Point Locations`:
```
A subroutine entry may have either a DW_AT_low_pc and DW_AT_high_pc
pair of attributes or a DW_AT_ranges attribute whose values encode the
contiguous or non-contiguous address ranges, respectively, of the machine
instructions generated for the subroutine (see Section 2.17 on page 51).
...
A subroutine entry representing a subroutine declaration that is not also a
definition does not have code address or range attributes.
```

We should just ignore those DIEs.
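
A minimal sketch of the "ignore instead of error" behaviour, using hypothetical stand-in types (`SubprogramDie`, `collectRanges`) rather than LLDB's DWARF classes, and modelling `DW_AT_high_pc` as an absolute address for simplicity:

```cpp
#include <cstdint>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DW_TAG_subprogram DIE; not an LLDB type.
struct SubprogramDie {
  std::optional<std::uint64_t> LowPC;  // models DW_AT_low_pc
  std::optional<std::uint64_t> HighPC; // models DW_AT_high_pc (absolute)
};

using AddressRange = std::pair<std::uint64_t, std::uint64_t>;

// Collects [low, high) ranges. DIEs without address attributes, such as
// forward declarations, are skipped silently rather than reported.
std::vector<AddressRange>
collectRanges(const std::vector<SubprogramDie> &Dies) {
  std::vector<AddressRange> Ranges;
  for (const SubprogramDie &Die : Dies) {
    if (!Die.LowPC || !Die.HighPC)
      continue; // declaration only: no code, so nothing to record
    Ranges.emplace_back(*Die.LowPC, *Die.HighPC);
  }
  return Ranges;
}
```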
2025-06-13 14:40:27 +01:00
zhijian lin
ea73fc5f07 [PowerPC] fixed mtvsrbmi.ll test case error caused by run the update_llc_test_checks.py (#144075)
Fixes the mtvsrbmi.ll test case error that was caused by running
update_llc_test_checks.py.
2025-06-13 09:38:54 -04:00
Kareem Ergawy
7e0bb2b0b9 [flang][fir] Extend locality specs lowering to support init and dealloc regions (#144027)
Extending `fir.do_concurrent` to `fir.do_loop ... unordered` lowering by
adding support for lowering/inlining non-empty `init` and `dealloc`
regions.

Resolves https://github.com/llvm/llvm-project/issues/143897 (actually
handles the todo).
2025-06-13 15:21:23 +02:00
zhijian lin
9c2e0bd59c [PowerPC][NFC] Pre-commit test case for checking whether mtvsrbmi power10 instruction not used (#143956)
Verify whether the generated assembly for the following function
includes the mtvsrbmi instruction.
vector unsigned char v00FF()
{
  vector unsigned char x = { 0xFF, 0,0,0, 0,0,0,0, 0,0,0,0, 0,0,0,0 };
  return x;
}
2025-06-13 09:19:10 -04:00
Tom Eccles
6ca31ad720 [flang][OpenMP] improve semantic check for invalid goto (#144040)
Fixes #143229
2025-06-13 14:17:39 +01:00
Tom Eccles
4a47634a00 [flang][OpenMP] Support substrings and complex part refs for DEPEND (#143907)
Fixes #142404

The parser can't tell the difference between array indexing and a
substring: that has to be done in semantics once we have types.
Substrings can only be in the form string([lower]:[higher]) not
string(index) or string(lower:higher:step). I added semantic checks to
catch this for the DEPEND clause.

This patch also adds lowering for correct substrings and for complex
part references.
2025-06-13 14:16:58 +01:00
zhijian lin
85a9f2e148 [PowerPC] enable AtomicExpandImpl::expandAtomicCmpXchg for powerpc (#142395)
In PowerPC, the AtomicCmpXchgInst is lowered to
ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS. However, this node does not handle
the weak attribute of AtomicCmpXchgInst. As a result, when compiling C++
atomic_compare_exchange_weak_explicit, the generated assembly includes a
"reservation lost" loop — i.e., it branches back and retries if the
stwcx. (store-conditional) fails. This differs from GCC’s codegen, which
does not include that loop for weak compare-exchange.

Since PowerPC uses LL/SC-style atomic instructions, the patch enables
AtomicExpandImpl::expandAtomicCmpXchg for PowerPC. With this, the weak
attribute is properly respected, and the "reservation lost" loop is
removed for weak operations.
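
For context, a source-level sketch of how weak compare-exchange is normally used; `fetchAndDouble` is a hypothetical helper, not code from the patch. The caller already supplies the retry loop, so a second "reservation lost" loop emitted by the backend is redundant for weak operations:

```cpp
#include <atomic>

// Atomically doubles Value and returns the previous contents.
// compare_exchange_weak may fail spuriously on LL/SC targets, so the
// retry loop lives here, in the caller.
int fetchAndDouble(std::atomic<int> &Value) {
  int Old = Value.load(std::memory_order_relaxed);
  while (!Value.compare_exchange_weak(Old, Old * 2,
                                      std::memory_order_acq_rel,
                                      std::memory_order_relaxed)) {
    // On failure, Old has been refreshed with the current value; retry.
  }
  return Old;
}
```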

---------

Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-06-13 09:14:48 -04:00
Ryan Buchner
a59e4acd75 [RISCV] Lower SELECT's with one constant more efficiently using Zicond (#143581)
See #143580 for MR with the test commit.

Performs the following transformations:
(select c, c1, t) -> (add (czero_nez t - c1, c), c1)
(select c, t, c1) -> (add (czero_eqz t - c1, c), c1)
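
The two rewrites can be checked with a small scalar model. This is a sketch: the function names are hypothetical, and `czeroEqz`/`czeroNez` only model the Zicond instructions on 64-bit values with wrapping arithmetic.

```cpp
#include <cstdint>

// Models czero.eqz: yields 0 when Cond is zero, otherwise Value.
std::uint64_t czeroEqz(std::uint64_t Value, std::uint64_t Cond) {
  return Cond == 0 ? 0 : Value;
}

// Models czero.nez: yields 0 when Cond is nonzero, otherwise Value.
std::uint64_t czeroNez(std::uint64_t Value, std::uint64_t Cond) {
  return Cond != 0 ? 0 : Value;
}

// (select c, c1, t) -> (add (czero_nez t - c1, c), c1)
std::uint64_t selectConstIfTrue(std::uint64_t C, std::uint64_t C1,
                                std::uint64_t T) {
  return czeroNez(T - C1, C) + C1;
}

// (select c, t, c1) -> (add (czero_eqz t - c1, c), c1)
std::uint64_t selectConstIfFalse(std::uint64_t C, std::uint64_t T,
                                 std::uint64_t C1) {
  return czeroEqz(T - C1, C) + C1;
}
```

In each case the conditional-zero either cancels the `t - c1` term, leaving `c1`, or keeps it, so that adding `c1` back yields `t`; unsigned wraparound keeps this exact even when `t < c1`.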


@mgudim
2025-06-13 08:57:46 -04:00
Nikita Popov
4f8187c0dc [TSan] Regenerate test checks (NFC) 2025-06-13 14:54:24 +02:00
Simon Pilgrim
d7ddd46116 [X86] Add start/end debug messages for the X86CompressEVEXPass and X86PadShortFunctionPass (#144056) 2025-06-13 13:47:45 +01:00
Shilei Tian
9a237f35ef [AMDGPU][AsmParser] Support true16 register suffix for valid register range (#143997) 2025-06-13 08:39:00 -04:00
Martin Wehking
fbea0fc5c7 Add Macro for CSSC Feature (#143148)
Add a new __ARM_FEATURE_CSSC macro that can be utilized during the
preprocessing stage.

__ARM_FEATURE_CSSC is defined to 1 if there is hardware support for
CSSC.

Implements the ACLE change:
https://github.com/ARM-software/acle/pull/394
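
A minimal sketch of how user code might test the macro; the function name `hasCsscAtCompileTime` is hypothetical:

```cpp
// True only when the compiler defines __ARM_FEATURE_CSSC to a nonzero
// value for the current target; false on every other configuration.
constexpr bool hasCsscAtCompileTime() {
#if defined(__ARM_FEATURE_CSSC) && __ARM_FEATURE_CSSC
  return true;
#else
  return false;
#endif
}
```

Code can then branch on the result with `if constexpr` or guard CSSC-specific paths directly with the same `#if`.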
2025-06-13 13:33:46 +01:00
Stephen Tozer
cc365331af [DLCov] Origin-Tracking: Add config options (#143590)
This patch is part of a series that adds origin-tracking to the debugify
source location coverage checks, allowing us to report symbolized stack
traces of the point where missing source locations appear.

This patch adds the configuration options needed to enable this feature,
in the form of a new CMake option that enables a flag in
`llvm-config.h`; this is not an entirely new CMake flag, but a new
option, `COVERAGE_AND_ORIGIN`, for the existing flag
`LLVM_ENABLE_DEBUGLOC_COVERAGE_TRACKING`. This patch contains
documentation, but no actual implementation for the flag itself.
2025-06-13 12:54:30 +01:00
Simon Pilgrim
f1036d844e [X86] X86InstrInfo::commuteInstructionImpl - remove (V)BLENDPD/S commutation to (V)MOVSD/S optsize handling (#144051)
Just commute with (V)BLENDPD/S like all other BLEND instructions

This is now handled more generally by the X86FixupInstTuningPass (OptSize fold occurs even without a scheduler model).

First step towards #142972
2025-06-13 12:49:22 +01:00
Michael Buch
41b37f0555 [lldb] CommandObjectMemoryFind: Improve expression evaluation error messages (#144036)
We now bubble up the expression evaluation diagnostics to the user and
also distinguish between "expression failed to parse/run" versus other
ways in which expressions didn't complete (e.g., setup errors, etc.).

Before:
```
(lldb) memory find -e "" 0x16fdfedc0 0x16fdfede0
error: expression evaluation failed. pass a string instead
(lldb) memory find -e "invalid" 0x16fdfedc0 0x16fdfede0
error: expression evaluation failed. pass a string instead
```

After:
```
(lldb) memory find -e "" 0x16fdfedc0 0x16fdfede0
error: Expression evaluation failed:
error: No result returned from expression. Exit status: 1
(lldb) memory find -e "invalid" 0x16fdfedc0 0x16fdfede0
error: Expression evaluation failed:
error: <user expression 0>:1:1: use of undeclared identifier 'invalid'
    1 | invalid
      | ^~~~~~~
```
2025-06-13 12:43:27 +01:00
Ilia Kuklin
4236423ee8 [LLDB] Add bit extraction to DIL (#141422) 2025-06-13 16:31:25 +05:00
Aaron Ballman
30725efe67 Fix build after removing delayed typo expression
This addresses issues found by:
  https://lab.llvm.org/buildbot/#/builders/64/builds/4220
  https://lab.llvm.org/buildbot/#/builders/51/builds/17890
2025-06-13 07:12:41 -04:00
Abhina Sree
be9994b092 [SystemZ][z/OS] Refactor AutoConvert more (#143955)
This patch removes the C++ disablezOSAutoConversion and
enablezOSAutoConversion declarations and also updates Path.inc to use
the common function.
2025-06-13 07:00:36 -04:00
Diana Picus
a5cbd2ab0b Revert "[AMDGPU] Skip register uses in AMDGPUResourceUsageAnalysis (#… (#144039)
…133242)"

This reverts commit 130080fab1 because it
causes issues in testcases similar to coalescer_remat.ll [1], i.e. when
we use a VGPR tuple but only write to its lower parts. The high VGPRs
would then not be included in the vgpr_count, and accessing them would
be an out of bounds violation.

[1]
https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/AMDGPU/coalescer_remat.ll
2025-06-13 12:48:24 +02:00
Aaron Ballman
9eef4d1c5f Remove delayed typo expressions (#143423)
This removes the delayed typo correction functionality from Clang
(regular typo correction still remains) due to fragility of the
solution.

An RFC was posted here:
https://discourse.llvm.org/t/rfc-removing-support-for-delayed-typo-correction/86631
and while that RFC was asking for folks to consider stepping up to be
maintainers, and we did have a few new contributors show some interest,
experiments show that it's likely worth it to remove this functionality
entirely and focus efforts on improving regular typo correction.

This removal fixes ~20 open issues (quite possibly more), improves
compile time performance by roughly 0.3-0.4%
(https://llvm-compile-time-tracker.com/?config=Overview&stat=instructions%3Au&remote=AaronBallman&sortBy=date),
and does not appear to regress diagnostic behavior in a way we wouldn't
find acceptable.

Fixes #142457
Fixes #139913
Fixes #138850
Fixes #137867
Fixes #137860
Fixes #107840
Fixes #93308
Fixes #69470
Fixes #59391
Fixes #58172
Fixes #46215
Fixes #45915
Fixes #45891
Fixes #44490
Fixes #36703
Fixes #32903
Fixes #23312
Fixes #69874
2025-06-13 06:45:40 -04:00
David Sherwood
541e5118ce [LV] Use getFixedValue instead of getKnownMinValue when appropriate (#143526)
There are many places in VPlan and LoopVectorize where we use
getKnownMinValue to discover the number of elements in a vector. Where
we expect the vector to have a fixed length, I have used the stronger
getFixedValue call. I believe this is clearer and adds extra protection
in the form of an assert in getFixedValue that the vector is not
scalable.

While looking at VPFirstOrderRecurrencePHIRecipe::computeCost I also
took the liberty of simplifying the code.

In theory I believe this patch should be NFC, but I'm reluctant to add
that to the title in case we're just missing tests for some of the VPlan
changes. I built and ran the LLVM test suite when targeting neoverse-v1
and it seemed ok.
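
The difference between the two accessors can be illustrated with a simplified model. `ElementCountModel` is a hypothetical stand-in written for this sketch, not the LLVM class, which has a wider interface:

```cpp
#include <cassert>

// Simplified stand-in for an element count that is either fixed (N) or
// scalable (N x vscale).
class ElementCountModel {
  unsigned MinValue;
  bool Scalable;

  ElementCountModel(unsigned MinValue, bool Scalable)
      : MinValue(MinValue), Scalable(Scalable) {}

public:
  static ElementCountModel getFixed(unsigned N) { return {N, false}; }
  static ElementCountModel getScalable(unsigned N) { return {N, true}; }

  bool isScalable() const { return Scalable; }

  // Always legal: for scalable counts this is only a lower bound.
  unsigned getKnownMinValue() const { return MinValue; }

  // Only legal for fixed counts; asserts otherwise, which is the extra
  // protection the commit message refers to.
  unsigned getFixedValue() const {
    assert(!Scalable && "fixed value requested for a scalable count");
    return MinValue;
  }
};
```

For a fixed count both accessors return the same number, so switching to `getFixedValue` changes nothing at run time unless a scalable vector unexpectedly reaches that code.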
2025-06-13 11:43:50 +01:00
Nikita Popov
2019553a0b [InstCombine] Extract EmitGEPOffsets() helper (NFC)
Extract a reusable helper for emitting a sum of multiple GEP
offsets.
2025-06-13 12:34:18 +02:00
Nikita Popov
6fc8ec720e [InstCombine] Restore splat gep support in OptimizePointerDifference() (#143906)
When looking for the common base pointer, support the case where the
type changes because the GEP goes from pointer to vector of pointers.
This was supported prior to #142958.
2025-06-13 12:29:50 +02:00
Simon Pilgrim
e2c27fd66a [X86] X86FixupInstTuning - hoist OptSize flag. NFC.
Allow reuse in a future patch.
2025-06-13 11:11:01 +01:00
Simon Pilgrim
058602372e [X86] X86FixupInstTuning - fold BLENDPS -> MOVSD (#144029)
Reduces codesize - make use of free PS<->PD domain transfers (like we do in many other places) and replace a suitable BLENDPS mask with MOVSD if OptSize or the scheduler prefers it
2025-06-13 11:05:57 +01:00
SivanShani-Arm
5762491e2a [lld] Refactor storage of PAuth ABI core info (#141920)
Previously, the AArch64 PAuth ABI core values were stored as an
ArrayRef<uint8_t>, introducing unnecessary indirection.

This patch replaces the ArrayRef with two explicit uint64_t fields:
aarch64PauthAbiPlatform and aarch64PauthAbiVersion. This simplifies the
representation and improves readability.

No functional change intended, aside from improved error messages.
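
A rough sketch of the new representation. Only the two field names come from the commit message; the struct, the helpers, and the assumed 16-byte little-endian layout (platform first, version second) are illustrative assumptions, not lld code:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical holder for the two explicit fields.
struct PauthAbiCoreInfo {
  std::uint64_t aarch64PauthAbiPlatform = 0;
  std::uint64_t aarch64PauthAbiVersion = 0;
};

// Reads eight bytes as a little-endian 64-bit value.
std::uint64_t readLE64(const std::uint8_t *Bytes) {
  std::uint64_t Value = 0;
  for (std::size_t I = 0; I < 8; ++I)
    Value |= static_cast<std::uint64_t>(Bytes[I]) << (8 * I);
  return Value;
}

// Decodes the raw bytes once, so later code compares two integers
// instead of re-interpreting a byte array. Returns false on bad size.
bool decodePauthAbiCoreInfo(const std::uint8_t *Data, std::size_t Size,
                            PauthAbiCoreInfo &Out) {
  if (Size != 16)
    return false;
  Out.aarch64PauthAbiPlatform = readLE64(Data);
  Out.aarch64PauthAbiVersion = readLE64(Data + 8);
  return true;
}
```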
2025-06-13 11:02:33 +01:00
David Spickett
06c7835670 [lldb][test] Disable TestMultipleDebuggers again
I did manage to turn a crash into a non-zero return code,
but on the very first build it managed to time out.

I thought I had the appetite to tweak timeouts but
on second thought, I don't want yet another test to look
out for.

The test is not wrong, but on heavily loaded machines
it's always going to be inherently unstable.
2025-06-13 09:12:01 +00:00
Tim Gymnich
67c590004d [mlir][AMDGPU] Add scaled floating point conversion ops (#141554)
Implement `ScaledExtPackedOp` and `PackedScaledTruncOp`.
2025-06-13 11:09:11 +02:00
Simone Pellegrini
4b59b7b946 [mlir][Linalg] Fix fusing of indexed linalg consumer with different axes (#140892)
When fusing two `linalg.genericOp`, where the producer has index
semantics, invalid `affine.apply` ops can be generated where the number
of indices does not match the number of loops in the fused genericOp.

This patch fixes the issue by directly using the number of loops from
the generated fused op.
2025-06-13 10:03:09 +01:00
David Sherwood
2d49bc01cf [LV][NFC] Tidy up check-prof-info.ll test (#143884) 2025-06-13 10:02:27 +01:00
Sirui Mu
8ba62fdb3d [CIR] Function calls with aggregate arguments and return values (#143377)
This patch updates cir.call operation and allows function calls with
aggregate arguments and return values.

It seems that C++ class support is still at a minimum now. I tried to
make a call to a C++ function with an argument of aggregate type but it
failed because the initialization of C++ class / struct is NYI. I also
tried to inline this part of support into this patch, but the mixed
patch quickly blows up in size and becomes unsuitable for review. Thus,
tests for calling functions with aggregate arguments are added only for
C for now.
2025-06-13 16:47:56 +08:00
David Spickett
addd98f7a5 [lldb][test] Don't call SBDebugger::Terminate if TestMultipleDebuggers times out (#143732)
Fixes #101162

This test did this:
* SBDebugger::Initialize
* Spawn a bunch of threads that do:
  * SBDebugger::Create
  * some work
  * SBDebugger::Destroy
* Wait on those threads to finish then call SBDebugger::Terminate and
exit, or -
* Reach a time limit before all the threads finish, call
SBDebugger::Terminate and exit.

The problem was that in the timeout case, calling SBDebugger::Terminate
destroys data being used by threads that are still running. I expect
this test was expecting said threads to be so broken they were probably
stuck, but when the machine is just heavily loaded, one of them might
read that data before the whole program exits.

This means what should have been a timeout becomes a crash. Sometimes.
Which explains why we saw both timeouts and various signals on the
AArch64 Linux bot. It depends on the timings.

So I'm changing it not to call SBDebugger::Terminate in the timeout
case. We will have to tweak the timeout value based on what happens on
the buildbot, but we will know it's machine load not an lldb bug.

Also use _exit instead of exit, to skip more cleanup that might cause a
crash.
2025-06-13 09:31:57 +01:00
Orlando Cazalet-Hyams
0cf333878d [NFC] Pack MDNodeKeyImpl<DILocation> from 40 to 32 bytes (#143891) 2025-06-13 09:26:08 +01:00
Sander de Smalen
d4826cd324 [AArch64] Observe Z-reg inline asm clobbers without SVE (#143742)
Inline asm that clobbers any of the Z-registers when not in streaming
mode should still observe that the lower 128 bits of those registers
are clobbered.
2025-06-13 09:07:09 +01:00
LU-JOHN
c4caf00bfb [AMDGPU] Convert more 64-bit lshr to 32-bit if shift amt>=32 (#138204)
Convert vector 64-bit lshr to 32-bit if shift amt is known to be >= 32.
Also convert scalar 64-bit lshr to 32-bit if shift amt is variable but
known to be >=32.
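
The underlying identity can be sketched in scalar C++; `lshr64ViaHighHalf` is a hypothetical name and the precondition on the shift amount is assumed rather than checked:

```cpp
#include <cstdint>

// Computes X >> Amt for 32 <= Amt <= 63 using one 32-bit shift.
// Precondition (assumed, not checked): Amt is in [32, 63].
std::uint64_t lshr64ViaHighHalf(std::uint64_t X, unsigned Amt) {
  // With a 64-bit value held in a register pair, this is just the
  // high register and needs no shift instruction.
  std::uint32_t Hi = static_cast<std::uint32_t>(X >> 32);
  // Every bit of the low half is shifted out, so only Hi matters.
  std::uint32_t Lo = Hi >> (Amt - 32);
  // The high half of the result is always zero.
  return Lo;
}
```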

---------

Signed-off-by: John Lu <John.Lu@amd.com>
2025-06-13 17:03:06 +09:00
Shamshura Egor
02b6ed0bf1 [Clang] Added explanation why is_constructible evaluated to false. (#143309)
Added an explanation of why `is_constructible` evaluated to false. Also
fixed a problem with `ExtractTypeTraitFromExpression`: for
`std::is_xxx_v<>` with a variadic pack, it tries to get the template
argument but fails in the expression `Arg.getAsType()` because
`Arg.getKind()` is `TemplateArgument::ArgKind::Pack` rather than
`TemplateArgument::ArgKind::Type`.
2025-06-13 09:53:15 +02:00
Simon Pilgrim
cd3d234868 [X86] X86FixupInstTuning - extend BLENDPD/S -> MOVSD/S handling to SSE variant (#143961) 2025-06-13 08:52:48 +01:00
Longsheng Mou
02f1f6967a [mlir][linalg] Add pure tensor check for winogradConv2DHelper (#142299)
This PR adds pure tensor semantics check for `winogradConv2DHelper` to
prevent a crash. Fixes #141566.
2025-06-13 15:49:54 +08:00