Commit Graph

558642 Commits

Author SHA1 Message Date
Schrodinger ZHU Yifan
2dd77050d4 [libc] add cpu feature flags for SVE/SVE2/MOPS (#166884)
Add in SVE/SVE2/MOPS features for aarch64 cpus. These features may be
interesting for future memory/math routines.

SVE/SVE2 are now being accepted in more implementations:

```
❯ echo | clang-21 -dM -E - -march=native | grep -i ARM_FEAT
#define __ARM_FEATURE_ATOMICS 1
#define __ARM_FEATURE_BF16 1
#define __ARM_FEATURE_BF16_SCALAR_ARITHMETIC 1
#define __ARM_FEATURE_BF16_VECTOR_ARITHMETIC 1
#define __ARM_FEATURE_BTI 1
#define __ARM_FEATURE_CLZ 1
#define __ARM_FEATURE_COMPLEX 1
#define __ARM_FEATURE_CRC32 1
#define __ARM_FEATURE_DIRECTED_ROUNDING 1
#define __ARM_FEATURE_DIV 1
#define __ARM_FEATURE_DOTPROD 1
#define __ARM_FEATURE_FMA 1
#define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1
#define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1
#define __ARM_FEATURE_FRINT 1
#define __ARM_FEATURE_IDIV 1
#define __ARM_FEATURE_JCVT 1
#define __ARM_FEATURE_LDREX 0xF
#define __ARM_FEATURE_MATMUL_INT8 1
#define __ARM_FEATURE_NUMERIC_MAXMIN 1
#define __ARM_FEATURE_PAUTH 1
#define __ARM_FEATURE_QRDMX 1
#define __ARM_FEATURE_RCPC 1
#define __ARM_FEATURE_SVE 1
#define __ARM_FEATURE_SVE2 1
#define __ARM_FEATURE_SVE_BF16 1
#define __ARM_FEATURE_SVE_MATMUL_INT8 1
#define __ARM_FEATURE_SVE_VECTOR_OPERATORS 2
#define __ARM_FEATURE_UNALIGNED 1
```
MOPS is another set of extensions for string operations, but may not be
generally available for now:
```
❯ echo | clang-21 -dM -E - -march=armv9.2a+mops | grep -i MOPS
#define __ARM_FEATURE_MOPS 1
```
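As a rough illustration of how such macros can be consumed, the sketch below turns them into compile-time booleans. The namespace, constant names, and `memcpyVariant()` are hypothetical and are not the names used in the libc patch; only the `__ARM_FEATURE_*` macros come from the compiler output above.

```cpp
// Hypothetical feature flags; only the __ARM_FEATURE_* macros are real.
namespace cpu_features_sketch {

#if defined(__ARM_FEATURE_SVE)
constexpr bool kHasSve = true;
#else
constexpr bool kHasSve = false;
#endif

#if defined(__ARM_FEATURE_SVE2)
constexpr bool kHasSve2 = true;
#else
constexpr bool kHasSve2 = false;
#endif

#if defined(__ARM_FEATURE_MOPS)
constexpr bool kHasMops = true;
#else
constexpr bool kHasMops = false;
#endif

// Pick a code path at compile time; the returned names are placeholders.
constexpr const char *memcpyVariant() {
  return kHasMops ? "mops" : (kHasSve ? "sve" : "generic");
}

} // namespace cpu_features_sketch
```

On a target without any of these extensions all three flags are false and the placeholder selector falls back to the generic path.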
2025-11-07 13:58:54 -05:00
Aiden Grossman
c6969e578a [Github][Bazel] Add Workflow to Run Bazel Build (#165071)
This patch adds a job to the bazel checks workflow to run the bazel
build/test. This patch only tests a couple of projects just to get things
going. The plan is to expand to more projects eventually and set up a GCS
bucket for caching so jobs complete quickly by using cached artifacts.

This should add minimal load to the CI given the low frequency of bazel
PRs, especially once GCS-based caching is enabled, since bazel makes
effective use of caching. Google is also sponsoring the Linux Premerge
CI and is interested in having premerge bazel builds, which is why it
makes sense to do premerge testing of this alternative build system
using those resources.
2025-11-07 10:41:26 -08:00
Valentin Clement (バレンタイン クレメン)
b4d7d3f745 [mlir][NVVM] Add nvvm.membar operation (#166698)
Add nvvm.membar operation with level as defined in
https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-membar

This will be used to replace direct intrinsic calls in CUDA Fortran for
`threadfence()`, `threadfence_block()` and `threadfence_system()`
currently lowered here:
e700f15702/flang/lib/Optimizer/Builder/CUDAIntrinsicCall.cpp (L1310)

The NVVM membar intrinsics are also used in CUDA C/C++
(49f55f4991/clang/lib/Headers/__clang_cuda_device_functions.h (L528))
2025-11-07 10:39:01 -08:00
Dominik Adamski
67198d1997 [libc] Fix wrapper headers for at_quick_exit on GLIBC for C++11 (#166960)
Eliminate compilation error related to missing exception specification
'noexcept(true)' for at_quick_exit function in C++11.
2025-11-07 19:36:42 +01:00
hanbeom
50ba89a22e [VectorCombine] support mismatching extract/insert indices for foldInsExtFNeg (#126408)
insertelt DestVec, (fneg (extractelt SrcVec, Index)), Index 
-> shuffle DestVec, (shuffle (fneg SrcVec), poison, SrcMask), Mask

Previously, the above transform was only possible if the extract and
insert indices were the same; this patch makes the transform possible
even if the two indices differ.

Proof: https://alive2.llvm.org/ce/z/aDfdyG
Fixes: https://github.com/llvm/llvm-project/issues/125675
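
The equivalence can be sketched on plain arrays. This is a simplified model, not the VectorCombine code: it assumes both vectors have the same length and collapses the two shuffles from the commit message into a single two-input shuffle. All function names here are hypothetical.

```cpp
#include <cstddef>
#include <vector>

using Vec = std::vector<float>;

// Reference semantics: insertelt Dest, (fneg (extractelt Src, ExtIdx)), InsIdx
Vec insertNegatedExtract(Vec dest, const Vec &src, std::size_t extIdx,
                         std::size_t insIdx) {
  dest[insIdx] = -src[extIdx];
  return dest;
}

// Two-input shuffle: a mask value m < a.size() selects a[m],
// otherwise it selects b[m - a.size()].
Vec shuffle(const Vec &a, const Vec &b, const std::vector<std::size_t> &mask) {
  Vec out;
  out.reserve(mask.size());
  for (std::size_t m : mask)
    out.push_back(m < a.size() ? a[m] : b[m - a.size()]);
  return out;
}

// Shuffle form: negate the whole source, then pick a single lane from it.
Vec insertNegatedExtractViaShuffle(const Vec &dest, const Vec &src,
                                   std::size_t extIdx, std::size_t insIdx) {
  Vec negated;
  negated.reserve(src.size());
  for (float x : src)
    negated.push_back(-x);

  // Identity mask over dest, except the insert lane reads from negated src.
  std::vector<std::size_t> mask(dest.size());
  for (std::size_t i = 0; i < mask.size(); ++i)
    mask[i] = i;
  mask[insIdx] = dest.size() + extIdx;
  return shuffle(dest, negated, mask);
}
```

With different extract and insert indices the only change is which source lane the mask entry for the insert position points at, which is why the same shuffle form still applies.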
2025-11-07 18:35:40 +00:00
LU-JOHN
b78f6fca38 [AMDGPU][NFC] Pre-commit shlN_add test results with sdag (#166636)
Pre-commit shlN_add test results with sdag.

---------

Signed-off-by: John Lu <John.Lu@amd.com>
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
2025-11-07 12:35:26 -06:00
Steven Wu
ebb61a5bea [CAS] Add llvm-cas tools to inspect on-disk LLVMCAS (#166481)
Add a command-line tool `llvm-cas` to inspect the OnDisk CAS for
debugging purposes. It can be used to look up or update the ObjectStore or
put/get cache entries from the ActionCache, together with other debugging
capabilities.
2025-11-07 10:32:55 -08:00
Nicolai Hähnle
917d815d4e AMDGPU: Preliminary documentation for named barriers (#165502)
2025-11-07 18:10:59 +00:00
Chinmay Deshpande
4637bf0c76 [NFC][AMDGPU][GISel] Precommit GlobalISel specific tests for call instruction (#165898)
2025-11-07 10:10:17 -08:00
Ryotaro Kasuga
9e341b36ed [DA] Properly pass outermost loop to monotonicity checker (#166928)
This patch fixes the unexpected result in the monotonicity check for
`@step_is_variant` in `monotonicity-no-wrap-flags.ll`. Currently, the
SCEV is considered non-monotonic if it contains an expression which is
neither loop-invariant nor an affine addrec. In `@step_is_variant`, the
`offset_i` satisfies this condition, but `offset_i + j` was classified
as monotonic.

The root cause is that a non-outermost loop was passed to the monotonicity
checker instead of the outermost one. This patch ensures that the
correct outermost loop is passed.
2025-11-07 18:08:04 +00:00
Jonas Devlieghere
cce1055e48 [lldb] Correctly detach (rather than kill) when connecting with gdb-remote (#166869)
We weren't setting `m_should_detach` when going through the
`DoConnectRemote` code path. This meant that when you attached to
a remote process with `gdb-remote <port>` and used Ctrl+D, it would kill
the process instead of detaching from it.

rdar://156111423
2025-11-07 18:07:38 +00:00
Matthias Springer
3740368529 [mlir][arith] Fix arith.select lowering after #166513 (#166692)
#166513 broke the lowering of `arith.select` with unsupported FP4 types.
For this op, it is fine to convert to `i4`.
2025-11-07 09:59:55 -08:00
Alex Langford
9cca883dd0 Revert "[NFCI][lldb][test] Avoid unnecessary GNU extension for assembly call" (#166970)
Reverts llvm/llvm-project#166769

Darwin platforms prefix symbols with `_`; other platforms don't
necessarily.
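
For illustration only, one way code can account for the platform prefix is the `__USER_LABEL_PREFIX__` macro that GCC and Clang predefine. This sketch is not what the reverted patch did; the helper macros and `assemblySymbolFor()` are hypothetical names, and the empty fallback for other compilers is an assumption.

```cpp
#include <string>

// Two-step stringification so the macro is expanded before being quoted.
#define SKETCH_STR_IMPL(x) #x
#define SKETCH_STR(x) SKETCH_STR_IMPL(x)

// GCC and Clang predefine __USER_LABEL_PREFIX__ ("_" on Darwin, empty on
// most ELF targets). The empty fallback below is an assumption.
#ifdef __USER_LABEL_PREFIX__
#define SKETCH_LABEL_PREFIX SKETCH_STR(__USER_LABEL_PREFIX__)
#else
#define SKETCH_LABEL_PREFIX ""
#endif

// Returns the assembly-level name for a C symbol on the current target.
std::string assemblySymbolFor(const std::string &cName) {
  return std::string(SKETCH_LABEL_PREFIX) + cName;
}
```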
2025-11-07 17:38:45 +00:00
Tarun Prabhu
03d8184d65 [flang][NFC] Strip trailing whitespace from tests (1 of N)
Only the Fortran source files in flang/test have been modified. The
other files in the directory will be cleaned up in subsequent commits.
2025-11-07 10:29:33 -07:00
Simon Pilgrim
626cbf70f1 [X86] isGuaranteedNotToBeUndefOrPoison - add simple target shuffles with known test coverage (#161553)
Add a number of simple target shuffles (fixed shuffle mask or simple
immediate control) to
isGuaranteedNotToBeUndefOrPoison/canCreateUndefOrPoisonForTargetNode
that have known test coverage and obviously don't introduce
undef/poison.

These were found by adding an assert for unhandled target shuffles and
running over CodeGen/X86 - providing explicit test coverage is
incredibly difficult as ISD::VECTOR_SHUFFLE nodes will typically handle
freeze nodes before we lower to these target shuffle nodes.
2025-11-07 17:22:50 +00:00
Kazu Hirata
a3b5b4bd79 [clang] Proofread *.rst (#166897)
This patch is limited to single-word replacements to fix spelling
and/or grammar to ease the review process.  Punctuation and markdown
fixes are specifically excluded.
2025-11-07 08:57:18 -08:00
Michael Liao
f55b393ea0 [clang][CIR] Fix build. NFC
- 'getStmtExprResult' is removed after d9c7c76. Use the original one to
  get the compound stmt's expr result.
2025-11-07 11:52:26 -05:00
Tom Murray
9857791c44 [bazel] Add mlir/utils/generate-test-checks.py to bazel overlay (#160693)
2025-11-07 10:50:07 -06:00
A. Jiang
f090dd15a1 [libc++][test] Fix-up tests for is_clock(_v) (#166888)
This fixes incompleteness and inconsistency for test files added in
adc7932461, by
- renaming
`libcxx/test/std/time/time.traits.is.clock/trait.is.clock.compile.pass.cpp`
to `libcxx/test/std/time/time.traits/is.clock.compile.pass.cpp`,
- renaming
`libcxx/test/libcxx/time/time.traits.is.clock/trait.is.clock.compile.verify.cpp`
to `libcxx/test/libcxx/time/time.traits/is.clock.verify.cpp` , and
- adding comments clarifying what is being tested.
2025-11-08 00:45:22 +08:00
Peter Klausler
1baf7dbed2 [flang][runtime] Allow some list-directed child output to advance (#166847)
List-directed child input is allowed to advance to new records in some
circumstances, and list-directed output should be as well so that e.g.
NAMELIST output via a defined WRITE(FORMATTED) generic doesn't get
truncated by FORT_FMT_RECL.

Fixes https://github.com/llvm/llvm-project/issues/166804.
2025-11-07 08:42:04 -08:00
Peter Klausler
3d0ae1e78a [flang] Improve warning text (#166407)
When an overflow or other floating-point exception occurs at compilation
time while folding a conversion of a math library call to a smaller
type, don't confuse the user by mentioning the conversion; just note
that the exception occurred while folding the intrinsic function.
2025-11-07 08:41:34 -08:00
Peter Klausler
b3b4ea18ac [flang] Explicit interface externals are constant expressions (#166181)
... but the constant expression test didn't allow for them, so they
weren't working in initializers.
2025-11-07 08:41:05 -08:00
Steven Wu
093f947202 [CAS] Fix wrong usage of llvm::sort() in UnifiedOnDiskCache (#166963)
Fix the compare function in getAllDBDirs(). The compare function passed to
sort must be a strict less-than operator.
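
A minimal sketch of the requirement, using `std::sort` rather than `llvm::sort` so it stays self-contained. The `DBDirSketch` struct and its fields are hypothetical stand-ins, not the types used in UnifiedOnDiskCache.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for an on-disk database directory entry.
struct DBDirSketch {
  std::string name;
  unsigned order;
};

std::vector<DBDirSketch> sortByOrder(std::vector<DBDirSketch> dirs) {
  // The comparator must be a strict weak ordering: comp(a, a) is false.
  // Writing "<=" here would report equal elements as "less than" each
  // other, which is undefined behavior for std::sort.
  std::sort(dirs.begin(), dirs.end(),
            [](const DBDirSketch &a, const DBDirSketch &b) {
              return a.order < b.order;
            });
  return dirs;
}
```

Equal keys are the case that exposes the bug, so an input with duplicate `order` values is the useful one to try.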
2025-11-07 16:36:41 +00:00
agozillon
a7c0e78fa1 [Flang][OpenMP] Unify MapInfoFinalization's BoxChar handling with other Box types (#165954)
Currently we handle BoxChars separately and a little differently from the
other BoxTypes; however, realistically they can be handled the same way, and
should be, to simplify the pass as much as we can.
2025-11-07 17:18:56 +01:00
Kazu Hirata
80a5332839 [mlir] Remove redundant declarations (NFC) (#166896)
In C++17, static constexpr members are implicitly inline, so they no
longer require an out-of-line definition.

The comments for these variables are also present in:

  mlir/include/mlir/Dialect/Bufferization/IR/BufferizationBase.td

Identified with readability-redundant-declaration.
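
The language rule behind this cleanup can be sketched as follows, assuming the code is compiled as C++17 or later. The struct and member names are hypothetical.

```cpp
struct OptionsSketch {
  // Declaration and, since C++17, also the (implicitly inline) definition.
  static constexpr int DefaultAlignment = 64;
};

// Before C++17, an odr-use such as taking the address below required this
// out-of-line definition. In C++17 it is redundant:
//   constexpr int OptionsSketch::DefaultAlignment;

const int *addressOfDefaultAlignment() {
  return &OptionsSketch::DefaultAlignment; // odr-use of the member
}
```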
2025-11-07 07:58:48 -08:00
Kazu Hirata
de4d953246 [Demangle] Remove redundant declarations (NFC) (#166895)
In C++17, static constexpr members are implicitly inline, so they no
longer require an out-of-line definition.

Identified with readability-redundant-declaration.
2025-11-07 07:58:40 -08:00
Kazu Hirata
563ea29932 [clang-tools-extra] Remove redundant declarations (NFC) (#166894)
In C++17, static constexpr members are implicitly inline, so they no
longer require an out-of-line definition.

Identified with readability-redundant-declaration.
2025-11-07 07:58:32 -08:00
Kazu Hirata
bddab8359e [BOLT] Remove redundant declarations (NFC) (#166893)
In C++17, static constexpr members are implicitly inline, so they no
longer require an out-of-line definition.

Identified with readability-redundant-declaration.
2025-11-07 07:58:24 -08:00
Damian Heaton
70f4b596cf Add llvm.vector.partial.reduce.fadd intrinsic (#159776)
With this intrinsic, and supporting SelectionDAG nodes, we can better
make use of instructions such as AArch64's `FDOT`.
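
A scalar sketch of what a partial reduction computes: a wide input is folded into a narrower accumulator. The lane mapping used here (`i % AccLanes`) is an assumption chosen for illustration; it is one possible grouping, not a statement of how the intrinsic or `FDOT` pairs lanes. The function name is hypothetical.

```cpp
#include <array>
#include <cstddef>

// Fold a wide input vector into a narrower accumulator.
// Input lane i is added to accumulator lane i % AccLanes (an assumed mapping).
template <std::size_t AccLanes, std::size_t InLanes>
std::array<float, AccLanes>
partialReduceFAdd(std::array<float, AccLanes> acc,
                  const std::array<float, InLanes> &in) {
  static_assert(InLanes % AccLanes == 0,
                "input width must be a multiple of the accumulator width");
  for (std::size_t i = 0; i < InLanes; ++i)
    acc[i % AccLanes] += in[i];
  return acc;
}
```

Whatever the grouping, the sum over all result lanes equals the sum of the accumulator plus the sum of the input (up to floating-point rounding).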
2025-11-07 15:36:54 +00:00
RolandF77
411ea8e9dd [PowerPC] Lowering support for EVL type VP_LOAD/VP_STORE (#165910)
Map EVL type VP_LOAD/VP_STORE for fixed length vectors to PPC load/store
with length.
2025-11-07 10:27:46 -05:00
LU-JOHN
67d0f181f4 [AMDGPU] Delete redundant s_or_b32 (#165261)
Transform sequences like:

```
s_cselect_b64 s[12:13], -1, 0
s_or_b32 s6, s12, s13
```

where s6 is dead, to:

`s_cselect_b64 s[12:13], -1, 0`
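
Why the `s_or_b32` is redundant can be sketched with a small model. This reflects my understanding of the scalar instruction semantics (`s_cselect_b64` reads SCC without changing it, `s_or_b32` sets SCC to "result is non-zero"), which the commit message does not spell out. The struct and function names are hypothetical.

```cpp
#include <cstdint>

struct ScalarStateSketch {
  bool scc;             // scalar condition code
  std::uint64_t s12_13; // register pair s[12:13]
};

// s_cselect_b64 s[12:13], -1, 0: all ones if SCC is set, else zero.
// SCC itself is left unchanged.
ScalarStateSketch cselectB64(bool scc) {
  return {scc, scc ? ~std::uint64_t{0} : std::uint64_t{0}};
}

// s_or_b32 s6, s12, s13: OR the two halves and set SCC to (result != 0).
// The value written to s6 is ignored here because s6 is dead.
ScalarStateSketch orB32Halves(ScalarStateSketch st) {
  std::uint32_t lo = static_cast<std::uint32_t>(st.s12_13);
  std::uint32_t hi = static_cast<std::uint32_t>(st.s12_13 >> 32);
  st.scc = (lo | hi) != 0;
  return st;
}
```

In this model SCC after the `s_or_b32` always equals SCC before it, so with s6 dead the instruction has no observable effect.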

---------

Signed-off-by: John Lu <John.Lu@amd.com>
2025-11-07 09:27:20 -06:00
Jonathan Thackray
7377ac037d [AArch64][llvm] Add support for Neon vmmlaq_{f16,f32}_mf8_fpm intrinsics (#165431)
Add support for the following new AArch64 Neon intrinsics:
```
float16x8_t vmmlaq_f16_mf8_fpm(float16x8_t, mfloat8x16_t, mfloat8x16_t, fpm_t);
float32x4_t vmmlaq_f32_mf8_fpm(float32x4_t, mfloat8x16_t, mfloat8x16_t, fpm_t);
```
2025-11-07 15:24:13 +00:00
Björn Schäpers
bcb1b773f6 [clang-format] Add option to separate comment alignment between ... (#165033)
normal lines and PP directives.

Handling PP directives differently can be desired, like in #161848.
Changing the default is not an option, since there are tests for exactly
the current behaviour.
2025-11-07 15:12:30 +00:00
Benjamin Maxwell
21aa788ae0 [AArch64][CostModel] Replace undef with poison in sve-arith-fp.ll (NFC) (#166930)
`undef` values are now deprecated (see
https://llvm.org/docs/UndefinedBehavior.html#undef-values). Updating
this file to avoid triggering the `undef` deprecation warning on future
changes.
2025-11-07 15:04:21 +00:00
Jonathan Thackray
9a8781b86f [AArch64][llvm] Add support for new vcvt* intrinsics (#163572)
Add support for these new vcvt* intrinsics:

```
  int64_t  vcvts_s64_f32(float32_t);
  uint64_t vcvts_u64_f32(float32_t);
  int32_t  vcvtd_s32_f64(float64_t);
  uint32_t vcvtd_u32_f64(float64_t);

  int64_t  vcvtns_s64_f32(float32_t);
  uint64_t vcvtns_u64_f32(float32_t);
  int32_t  vcvtnd_s32_f64(float64_t);
  uint32_t vcvtnd_u32_f64(float64_t);

  int64_t  vcvtms_s64_f32(float32_t);
  uint64_t vcvtms_u64_f32(float32_t);
  int32_t  vcvtmd_s32_f64(float64_t);
  uint32_t vcvtmd_u32_f64(float64_t);

  int64_t  vcvtps_s64_f32(float32_t);
  uint64_t vcvtps_u64_f32(float32_t);
  int32_t  vcvtpd_s32_f64(float64_t);
  uint32_t vcvtpd_u32_f64(float64_t);

  int64_t  vcvtas_s64_f32(float32_t);
  uint64_t vcvtas_u64_f32(float32_t);
  int32_t  vcvtad_s32_f64(float64_t);
  uint32_t vcvtad_u32_f64(float64_t);
```
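
The intrinsic families differ in rounding mode. The portable sketch below models the float to `int64_t` variants with standard library calls; it does not call the real intrinsics, ignores NaN and out-of-range handling, and assumes the default floating-point rounding mode (round to nearest even) for `std::nearbyint`. The `model_` function names are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

// vcvt*: round toward zero.
std::int64_t model_vcvts_s64_f32(float x) {
  return static_cast<std::int64_t>(std::trunc(x));
}
// vcvtn*: round to nearest, ties to even (relies on the default FP mode).
std::int64_t model_vcvtns_s64_f32(float x) {
  return static_cast<std::int64_t>(std::nearbyint(x));
}
// vcvtm*: round toward minus infinity.
std::int64_t model_vcvtms_s64_f32(float x) {
  return static_cast<std::int64_t>(std::floor(x));
}
// vcvtp*: round toward plus infinity.
std::int64_t model_vcvtps_s64_f32(float x) {
  return static_cast<std::int64_t>(std::ceil(x));
}
// vcvta*: round to nearest, ties away from zero.
std::int64_t model_vcvtas_s64_f32(float x) {
  return static_cast<std::int64_t>(std::round(x));
}
```

Halfway values such as `2.5f` and `-2.5f` are the inputs on which the five modes disagree.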
2025-11-07 14:56:29 +00:00
Florian Hahn
ac047f2bd2 [InstCombine] Add test for sinking with dereferenceable assumes.
Add tests showing sinking and dropping dereferenceable assumes prevents
vectorization.
2025-11-07 14:41:17 +00:00
Paul Walker
050339b94a [Clang] Fix comment typo in BuiltinTargetFeatures.h
2025-11-07 14:40:23 +00:00
Mehdi Amini
037fd30562 Revert "[NVGPU] Fix nvdsl examples" (#166943)
Reverts llvm/llvm-project#156830

This broke the bots.
2025-11-07 15:36:44 +01:00
KaiWeng
d9c7c76269 Revert "Ignore trailing NullStmts in StmtExprs for GCC compatibility." (#166036)
This reverts commit b1e511bf5a.

https://github.com/llvm/llvm-project/issues/160243
Reverting because the GCC C front end is incorrect.

---------

Co-authored-by: Jim Lin <jim@andestech.com>
2025-11-07 09:30:53 -05:00
Rolf Morel
d78e0ded52 [MLIR][Transform][Python] Sync derived classes and their wrappers (#166871)
Updates the derived Op-classes for the main transform ops to have all
the arguments, etc, from the auto-generated classes. Additionally
updates and adds missing snake_case wrappers for the derived classes
which shadow the snake_case wrappers of the auto-generated classes,
which were hitherto exposed alongside the derived classes.
2025-11-07 14:04:53 +00:00
Florian Hahn
3ee2f07e17 [VPlan] Support multiple F(Max|Min)Num reductions. (#161735)
Generalize handleMaxMinNumReductions to handle any number of
F(Max|Min)Num reductions by collecting a vector of reductions to
convert.

We then add NaN checks for all of them, followed by adjusting the branch
controlling the vector loop region, and updating the resume phis.

Addresses a TODO from https://github.com/llvm/llvm-project/pull/148239

PR: https://github.com/llvm/llvm-project/pull/161735
2025-11-07 13:59:06 +00:00
lonely eagle
281e3844f6 [mlir] Use LDBG to replace LLVM_DEBUG in IntegerRelation.cpp (NFC) (#166772)
2025-11-07 21:51:05 +08:00
nerix
311d115ed8 [LLDB] Run MSVC STL string(-view) tests with PDB (#166833)
PDB doesn't include the typedefs for types, so all types use their full
name. For `std::string` and friends, this means they show up as
`std::basic_string<char, std::char_traits<char>, std::allocator<char>>`.

This PR updates the `std::{,w,u8,u16,u32}string(_view)` tests to account
for this and runs them with PDB.
2025-11-07 14:16:44 +01:00
Twice
7ac6a95a11 [MLIR][Pygments] Refine the pygments MLIR lexer (#166406)
Recently, the MLIR website added API documentation for the Python
bindings generated via Sphinx
([https://mlir.llvm.org/python-bindings/](https://mlir.llvm.org/python-bindings/)).
In
[https://github.com/llvm/mlir-www/pull/245](https://github.com/llvm/mlir-www/pull/245),
I introduced the Pygments lexer from the MLIR repository to enable
syntax highlighting for MLIR code blocks in these API docs.

However, since the existing Pygments lexer was fairly minimal, it didn’t
fully handle all aspects of the MLIR syntax, leading to imperfect
highlighting in some cases. In this PR, I used ChatGPT to rewrite the
lexer by combining it with the TextMate grammar for MLIR
([https://github.com/llvm/llvm-project/blob/main/mlir/utils/textmate/mlir.json](https://github.com/llvm/llvm-project/blob/main/mlir/utils/textmate/mlir.json)).
After some manual adjustments, the results look good—so I’m submitting
this to improve the syntax highlighting experience in the Python
bindings API documentation.
2025-11-07 21:12:28 +08:00
hev
cdc3cb2054 [LoongArch] Add isSafeToMove hook to prevent unsafe instruction motion (#163725)
This patch introduces a new virtual method
`TargetInstrInfo::isSafeToMove()` to allow backends to control whether a
machine instruction can be safely moved by optimization passes.

The `BranchFolder` pass now respects this hook when hoisting common
code. By default, all instructions are considered safe to move.

For LoongArch, `isSafeToMove()` is overridden to prevent
relocation-related instruction sequences (e.g. PC-relative addressing
and calls) from being broken by instruction motion. Correspondingly,
`isSchedulingBoundary()` is updated to reuse this logic for consistency.
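
The hook pattern described above can be sketched with stand-in classes. None of these types or signatures are LLVM's; `InstrSketch`, `TargetInfoSketch`, `RelocAwareTargetSketch`, and `countHoistable()` are hypothetical and only show the shape of a default-permissive virtual hook with a target override.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for a machine instruction.
struct InstrSketch {
  bool partOfRelocationSequence;
};

// Stand-in for the generic target interface.
class TargetInfoSketch {
public:
  virtual ~TargetInfoSketch() = default;
  // Default: every instruction is considered safe to move.
  virtual bool isSafeToMove(const InstrSketch &) const { return true; }
};

// Stand-in for a target that must keep relocation sequences intact.
class RelocAwareTargetSketch : public TargetInfoSketch {
public:
  bool isSafeToMove(const InstrSketch &mi) const override {
    return !mi.partOfRelocationSequence;
  }
};

// A pass consults the hook before hoisting each candidate instruction.
std::size_t countHoistable(const TargetInfoSketch &target,
                           const std::vector<InstrSketch> &candidates) {
  std::size_t n = 0;
  for (const InstrSketch &mi : candidates)
    if (target.isSafeToMove(mi))
      ++n;
  return n;
}
```

Targets that do not override the hook keep the previous behavior, which is the reason for making the default return true.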

Fixes #163681
2025-11-07 21:01:53 +08:00
Simon Pilgrim
3719c438dc [X86] Add some initial add i64 test coverage for #142308 (#166929)
Pulled from the abandoned #144066 patch
2025-11-07 12:28:58 +00:00
Krzysztof Parzyszek
3c81587f6a [OpenMP] Add definitions for DECLARE_INDUCTION and related clauses (#166235)
Add definitions for DECLARE_INDUCTION, COLLECTOR, and INDUCTOR to
OMP.td.
2025-11-07 06:13:55 -06:00
Roberto Turrado Camblor
c2fe1d94ee [X86][Clang] VectorExprEvaluator::VisitCallExpr / InterpretBuiltin - add AVX512 KTEST/KORTEST intrinsics to be used in constexpr (#166103)
Add AVX512 KTEST/KORTEST intrinsics to be used in constexpr.

Fixes #162051
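
What constant evaluation of these operations looks like can be sketched with portable `constexpr` models for 16-bit masks. These follow my reading of the documented semantics and do not use the real intrinsics, so they build without AVX512 support; the `model` function names are hypothetical.

```cpp
#include <cstdint>

// KORTEST "zero" result: true when (a | b) has no bits set.
constexpr bool modelKortestZ(std::uint16_t a, std::uint16_t b) {
  return static_cast<std::uint16_t>(a | b) == 0;
}
// KORTEST "carry" result: true when (a | b) has all 16 bits set.
constexpr bool modelKortestC(std::uint16_t a, std::uint16_t b) {
  return static_cast<std::uint16_t>(a | b) == 0xFFFF;
}
// KTEST "zero" result: true when (a & b) has no bits set.
constexpr bool modelKtestZ(std::uint16_t a, std::uint16_t b) {
  return static_cast<std::uint16_t>(a & b) == 0;
}
// KTEST "carry" result: true when (~a & b) has no bits set.
constexpr bool modelKtestC(std::uint16_t a, std::uint16_t b) {
  return static_cast<std::uint16_t>(~a & b) == 0;
}

// Evaluated entirely at compile time.
static_assert(modelKortestZ(0, 0), "no bits set");
static_assert(modelKortestC(0xFF00, 0x00FF), "all bits set");
static_assert(modelKtestZ(0xF0F0, 0x0F0F), "disjoint masks");
static_assert(modelKtestC(0xFFFF, 0x1234), "b is covered by a");
```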
2025-11-07 11:26:47 +00:00
Karlo Basioli
d07a4fe12a [bazel][mlir] Fix transform_xegpu_ext.py test for bazel (#166924)
2025-11-07 12:18:43 +01:00
Giacomo Castiglioni
299df7ed25 [NVGPU] Fix nvdsl examples (#156830)
This PR aims to fix the nvdsl examples, which got a bit out of sync
because they are not tested in the CI.

The fixed bugs were related to the following PRs:
- move to nanobind #118583
- split gpu module initialization #135478
2025-11-07 16:23:08 +05:30