Commit Graph

562683 Commits

Author SHA1 Message Date
Nikolas Klauser
57aab63417 [libc++] Fix std::for_each(associative-container) not using std::invoke and projections (#171984)
#164405 added specializations of `for_each` that didn't do the ranges
call shenanigans, but instead just did what the classic algorithms have
to do. This updates the calls to work for the ranges overloads as well.
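
To illustrate why `std::invoke` matters for the ranges overloads: a projection (or the callable itself) may be a pointer to member, which cannot be called with plain function-call syntax. The sketch below is a hand-written stand-in, not libc++'s implementation; `for_each_projected` and `sum_mapped_values` are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical helper, not libc++ code: applies `func` to `proj(element)`
// for every element, routing both calls through std::invoke so that
// pointers to members work as projections or callables.
template <class Range, class Func, class Proj>
Func for_each_projected(Range&& range, Func func, Proj proj) {
  for (auto&& element : range)
    std::invoke(func, std::invoke(proj, element));
  return func;
}

// Sums the mapped values of a std::map by projecting each key/value pair
// onto its `second` member.
inline int sum_mapped_values(const std::map<std::string, int>& m) {
  using value_type = std::map<std::string, int>::value_type;
  int total = 0;
  for_each_projected(m, [&total](int v) { total += v; }, &value_type::second);
  return total;
}
```

Calling `proj(element)` directly would not compile for `&value_type::second`; going through `std::invoke` is what makes the member-pointer projection usable.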
2025-12-15 12:05:59 +01:00
Srinivasa Ravi
7d0865122e [clang][NVPTX] Add support for mixed-precision FP arithmetic (#168359)
This change adds support for mixed precision floating point 
arithmetic for `f16` and `bf16` where the following patterns:
```
%fh = fpext half %h to float
%resfh = fp-operation(%fh, ...)
...
%fb = fpext bfloat %b to float
%resfb = fp-operation(%fb, ...)

where the fp-operation can be any of:
- fadd
- fsub
- llvm.fma.f32
- llvm.nvvm.add(/fma).*
```
are lowered to the corresponding mixed precision instructions which 
combine the conversion and operation into one instruction from 
`sm_100` onwards.

This also adds the following intrinsics to complete support for 
all variants of the floating point `add/fma` operations in order 
to support the corresponding mixed-precision instructions:
- `llvm.nvvm.add.(rn/rz/rm/rp){.ftz}.sat.f`
- `llvm.nvvm.fma.(rn/rz/rm/rp){.ftz}.sat.f`

We lower `fneg` followed by one of the above addition
intrinsics to the corresponding `sub` instruction.

Tests are added in `fp-arith-sat.ll`, `fp-fold-sub.ll`, and `builtins-nvptx.c`
for the newly added intrinsics and builtins, and in `mixed-precision-fp.ll`
for the mixed precision instructions.

PTX spec reference for mixed precision instructions:
https://docs.nvidia.com/cuda/parallel-thread-execution/#mixed-precision-floating-point-instructions
2025-12-15 16:28:23 +05:30
Ramkumar Ramachandra
0636225b93 [VPlan] Directly unroll VectorPointerRecipe (#168886)
In an effort to get rid of VPUnrollPartAccessor and directly unroll
recipes, start by directly unrolling VectorPointerRecipe, allowing for
VPlan-based simplifications and simplification of the corresponding
execute.
2025-12-15 10:54:06 +00:00
Ivan Butygin
b3ec8be22b [mlir][gpu] Expose some utility functions from gpu-to-binary infra (#172205)
For people who do not want to use a single monolithic pass.
2025-12-15 13:39:19 +03:00
Andrei Topala
4e95718a2a [libc++] Remove unused __parent_pointer alias from __tree and map (#172185)
The `__parent_pointer` type alias was marked to be removed in d163ab3323.
At that time, <map> still had uses of `__parent_pointer` as a local
variable type in operator[] and at().

Those uses were removed in 4a2dd31f16, which refactored `__find_equal` to
return a pair instead of using an out parameter.

However, the typedef in <map> and the alias in __tree were left behind.

This patch removes the unused typedef from <map> and the
`__parent_pointer` alias from __tree.
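
As a rough illustration of the "return a pair instead of an out parameter" style mentioned above, the sketch below uses a deliberately simplified binary search tree. `Node`, `find_equal`, and `find_or_insert` are hypothetical and do not mirror libc++'s `__tree` internals.

```cpp
#include <cassert>
#include <utility>

// Simplified stand-in for a tree node; libc++'s __tree nodes differ.
struct Node {
  int key;
  Node* left = nullptr;
  Node* right = nullptr;
};

// Hypothetical lookup in the "return a pair" style: yields the parent of the
// slot together with a reference to the slot itself, so the caller needs no
// out parameter and no dedicated parent-pointer variable.
inline std::pair<Node*, Node*&> find_equal(Node*& root, int key) {
  Node* parent = nullptr;
  Node** slot = &root;
  while (*slot != nullptr && (*slot)->key != key) {
    parent = *slot;
    slot = key < parent->key ? &parent->left : &parent->right;
  }
  return {parent, *slot};
}

// Inserts `key` if it is absent and returns the node holding it.
// Nodes are never freed here, purely to keep the sketch short.
inline Node* find_or_insert(Node*& root, int key) {
  auto result = find_equal(root, key);
  if (result.second == nullptr)
    result.second = new Node{key};
  return result.second;
}
```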

Signed-off-by: Krechals <topala.andrei@gmail.com>
2025-12-15 11:32:03 +01:00
Ahmed Nour
ed79fd714f [Clang][x86]: allow PCLMULQDQ intrinsics to be used in constexpr (#169214)
Resolves #168741
2025-12-15 10:27:17 +00:00
Petar Avramovic
f024026a21 AMDGPU/GlobalISel: Regbanklegalize for G_CONCAT_VECTORS (#171471)
RegBankLegalize using trivial mapping helper, assigns same reg bank
to all operands, vgpr or sgpr.
Uncovers multiple codegen and regbank combiner regressions related to
looking through sgpr to vgpr copies.
Skip regbankselect-concat-vector.mir since agprs are not yet supported.
2025-12-15 10:37:40 +01:00
Ingo Müller
f3e508ceec [mlir:bazel] Fix missing dependency introduced in #171727. (#172267)
That PR added an include to `LLVMOps.td` without adding a target
providing that file. Curiously, this does not break the official builds
but it *does* break my bazel build.

Signed-off-by: Ingo Müller <ingomueller@google.com>
2025-12-15 09:33:47 +00:00
Michael Buch
90783f5c4a [lldb][AppleObjCDeclVendor] Fix format specifiers when printing log (#172263)
This was causing a crash when enabling the expression log:
```
4   LLDB                          	       0x1376d68d0 llvm::formatv_object_base::parseFormatString(llvm::StringRef, unsigned long, bool) + 532
5   LLDB                          	       0x13776d838 llvm::formatv_object_base::format(llvm::raw_ostream&) const + 84
6   LLDB                          	       0x13776d7d4 llvm::raw_ostream::operator<<(llvm::formatv_object_base const&) + 36
7   LLDB                          	       0x1375f4980 lldb_private::Log::Format(llvm::StringRef, llvm::StringRef, llvm::formatv_object_base const&) + 164
8   LLDB                          	       0x12f7b39f0 lldb_private::AppleObjCExternalASTSource::CompleteType(clang::TagDecl*) + 416
9   LLDB                          	       0x12fa038dc lldb_private::ClangASTSource::FindExternalLexicalDecls(clang::DeclContext const*, llvm::function_ref<bool (clang::Decl::Kind)>, llvm::SmallVectorImpl<clang::Decl*>&) + 1132
10  LLDB                          	       0x135d94838 clang::ExternalASTSource::FindExternalLexicalDecls(clang::DeclContext const*, llvm::SmallVectorImpl<clang::Decl*>&) + 92
11  LLDB                          	       0x135d94690 clang::DeclContext::LoadLexicalDeclsFromExternalStorage() const + 204
12  LLDB                          	       0x135d95ca0 clang::DeclContext::buildLookup() + 308
13  LLDB                          	       0x135d964b8 clang::DeclContext::lookupImpl(clang::DeclarationName, clang::DeclContext const*) const + 824
14  LLDB                          	       0x135d96168 clang::DeclContext::lookup(clang::DeclarationName) const + 124
15  LLDB                          	       0x134f093d4 clang::Sema::CheckImplicitSpecialMemberDeclaration(clang::Scope*, clang::FunctionDecl*) + 128
16  LLDB                          	       0x134efb488 clang::Sema::DeclareImplicitDestructor(clang::CXXRecordDecl*) + 932
17  LLDB                          	       0x1352ddf24 clang::Sema::LookupSpecialMember(clang::CXXRecordDecl*, clang::CXXSpecialMemberKind, bool, bool, bool, bool, bool)::$_0::operator()() const + 36
```
2025-12-15 09:28:19 +00:00
Fabrice de Gans
96881c1226 llvm: Export IndexedCodeGenDataLazyLoading (#169563)
This is needed so the llvm-cgdata tool properly builds with
`LLVM_BUILD_LLVM_DYLIB` so LLVM can be built as a DLL on Windows.

This effort is tracked in #109483.
2025-12-15 04:25:30 -05:00
Duncan Ogilvie
5785b4a4fb Add .gitignore file in .cache/clangd/index (#170003)
This solves a common issue where users have to manually add the
`.cache/clangd/index/` folder to their `.gitignore`. I got this idea
from [ruff](https://github.com/astral-sh/ruff), which creates
`.ruff_cache/.gitignore` and it would greatly improve the user
experience for everyone without requiring per-computer configurations
and without any significant cost.
2025-12-15 10:18:46 +01:00
Younan Zhang
a5bfe8e5c3 [Clang] Recompute the value category when rebuilding SubstNonTypeTemplateParmExpr (#172251)
In concept checking, we need to transform SubstNTTPExpr when evaluating
constraints.

The value category is initially computed during parameter mapping,
possibly with a dependent expression. However during instantiation, it
wasn't recomputed, and the stale category is propagated into parent
expressions. So we may end up with an 'out-of-thin-air' reference type,
which breaks the evaluation.

We now call BuildSubstNonTypeTemplateParmExpr in TreeTransform, in which
the value category is recomputed.

The issue was brought by both 078e99e and the concept normalization
patch, which are not released yet, so no release note.

Fixes https://github.com/llvm/llvm-project/issues/170856
2025-12-15 17:16:58 +08:00
Benjamin Maxwell
17f29c22ab [AArch64] Support lowering smaller than legal LOOP_DEP_MASKs to whilewr/rw (#171982)
This adds support for lowering smaller-than-legal masks such as:

```
<vscale x 8 x i1> @llvm.loop.dependence.war.mask.nxv8i1(ptr %a, ptr %b, i64 1)
```

To a whilewr + unpack. It also slightly simplifies the lowering.
2025-12-15 09:12:58 +00:00
Nikita Popov
80b900e91c [InstSimplify] Support ptrtoaddr in simplifyICmpInst() (#171985)
This is basically the same change as #162653, but for InstSimplify
instead of ConstantFolding.

It folds `icmp (ptrtoaddr x, ptrtoaddr y)` to `icmp (x, y)` and `icmp
(ptrtoaddr x, C)` to `icmp (x, inttoptr C)`.

The fold is restricted to the case where the result type is the address
type, as icmp only compares the address bits. As in the other PR, I think
in practice all the folds are also going to work if the ptrtoint result
type is larger than the address size, but it's unclear how to justify
this in general.
2025-12-15 09:06:28 +00:00
A. Jiang
37c7f695dc [libc++][char_traits] Applied [[nodiscard]] (#172244)
`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/char.traits
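
A minimal sketch of what the attribute buys, assuming a hypothetical free function `text_length` rather than any actual `char_traits` member:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical function standing in for a char_traits-style query.
// Its only effect is the returned value, so ignoring that value is almost
// certainly a mistake; [[nodiscard]] lets the compiler warn about it.
[[nodiscard]] constexpr std::size_t text_length(const char* s) {
  std::size_t n = 0;
  while (s[n] != '\0')
    ++n;
  return n;
}

// text_length("abc");                 // would typically produce a warning
// std::size_t n = text_length("abc"); // fine: the result is used
```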
2025-12-15 16:38:01 +08:00
Timm Baeder
db557bee1e [clang][bytecode][NFC] Add Block::getBlockDesc<T>() (#172218)
Which returns the block-level descriptor. This way we don't have to do
the reinterpret_cast dance everywhere.
2025-12-15 09:21:41 +01:00
Nikita Popov
ce1b04720a [SelectOptimize] Respect optnone (#170858)
Add the missing skipFunction() call so that optnone attributes and
opt-bisect-limit are respected.
2025-12-15 09:21:02 +01:00
Juan Manuel Martinez Caamaño
c13bf9eb26 Reapply "[AMDGPU][SDAG] Add missing cases for SI_INDIRECT_SRC/DST (#170323)" (#171838)
A buildbot failed for the original patch.

https://github.com/llvm/llvm-project/pull/171835 addresses the issue
raised by the buildbot.
After the fix is merged, the original patch is reapplied without any
change.
2025-12-15 09:05:00 +01:00
David Green
e309272467 [AArch64][ARM] Regenerate llvm-mca tests. NFC
2025-12-15 07:28:21 +00:00
David Green
1e9e38983c [AArch64] Add a performBICiCombine function.
This moves the code out of PerformDAGCombine directly, changing the return
to return SDValue(N, 0) to match other uses of SimplifyDemandedBits.
2025-12-15 07:23:31 +00:00
Hristo Hristov
6ff3df87d1 [libc++][unordered_set] Applied [[nodiscard]] (#170435)
[[nodiscard]] should be applied to functions where discarding the return
value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/unord.set
2025-12-15 09:17:35 +02:00
Hristo Hristov
e22ff9b3d9 [libc++][unordered_multiset] Applied [[nodiscard]] (#171664)
`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/unord.multiset
2025-12-15 09:08:33 +02:00
Hristo Hristov
a5b7c42ab2 [libc++][unordered_multimap] Applied [[nodiscard]] (#171659)
`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/unord.multimap
2025-12-15 15:05:57 +08:00
Craig Topper
ffaa6f23fd [RISCV] Custom legalize i32 saddo/ssubo on RV64 to return a sign extended value for the data result. (#172112)
This is consistent with how we handle regular ADD/SUB and helps with
computeNumSignBits optimizations.
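
One way to picture a sign-extended data result is the scalar model below. It is an assumption-laden sketch of a 32-bit signed add-with-overflow carried out in a 64-bit register, not the actual RISC-V lowering; `SaddoResult` and `saddo_i32_in_64bit_register` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: the data value as it would sit in a 64-bit
// register, plus the overflow flag.
struct SaddoResult {
  std::int64_t value;
  bool overflow;
};

// Models a signed 32-bit add-with-overflow on a 64-bit machine.
inline SaddoResult saddo_i32_in_64bit_register(std::int32_t a, std::int32_t b) {
  // The full-width sum of two sign-extended 32-bit inputs cannot overflow.
  const std::int64_t wide = static_cast<std::int64_t>(a) + b;
  // Keep the low 32 bits and sign-extend them back to 64 bits. The
  // unsigned-to-signed narrowing is modular on mainstream compilers
  // (and guaranteed to be from C++20 on).
  const std::int64_t narrow =
      static_cast<std::int32_t>(static_cast<std::uint32_t>(wide));
  // Overflow happened exactly when the 32-bit result differs from the sum.
  return {narrow, wide != narrow};
}
```

Because `value` is already sign-extended, every bit above bit 31 is a copy of the sign bit, which is the kind of fact a sign-bit analysis can exploit.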

Fixes #172089
2025-12-14 22:33:01 -08:00
Craig Topper
7fa062ad58 [RISCV] Add BFloat16 to mangleRISCVFixedRVVVectorType. (#172095)
2025-12-14 22:32:03 -08:00
Craig Topper
c878cf4580 [SelectionDAG] Consistently use doxygen comments in the NodeType enum. NFC (#172178)
2025-12-14 22:31:47 -08:00
Lang Hames
61908c5957 [orc-rt] Prevent RTTIExtends from being used for errors. (#172250)
Custom error types (ErrorInfoBase subclasses) should use ErrorExtends as
of 8f51da369e. Adding a static_assert allows us to enforce that at
compile-time.
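
The enforcement idea can be sketched as follows. All class definitions here are hypothetical stand-ins that only borrow the names from the message above; the real orc-rt types are defined differently.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-ins for the runtime's base classes.
struct ErrorInfoBase {
  virtual ~ErrorInfoBase() = default;
};
struct RTTIRoot {
  virtual ~RTTIRoot() = default;
};

// CRTP-style helper that refuses, at compile time, to extend error types.
template <class ThisT, class ParentT>
struct RTTIExtends : ParentT {
  static_assert(!std::is_base_of<ErrorInfoBase, ParentT>::value,
                "error types should use ErrorExtends instead");
};

// Accepted: the parent is not an error type.
struct Widget : RTTIExtends<Widget, RTTIRoot> {};

// Rejected at compile time if uncommented:
// struct MyError : RTTIExtends<MyError, ErrorInfoBase> {};
```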
2025-12-15 17:21:25 +11:00
Henrich Lauko
5a581acb29 [CIR] Rename allEnumCasesCovered to all_enum_cases_covered (#172153)
Use the conventional snake_case for MLIR assembly and align with
operation documentation that already mentions snake_cased attribute.
2025-12-15 07:13:41 +01:00
Kevin Sala Penades
35315a84b4 [offload] Fix CUDA args size by subtracting tail padding (#172249)
This commit makes the cuLaunchKernel call pass the total arguments size without tail padding.
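
To make "size without tail padding" concrete, here is a small sketch with a hypothetical argument layout. How the offload plugin actually computes the value is not shown in the message, so `KernelArgs` and `args_size_without_tail_padding` are illustrative assumptions only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical kernel-argument layout. On typical 64-bit targets the
// trailing `int` is followed by tail padding, because the struct as a
// whole is aligned like its pointer member.
struct KernelArgs {
  void* buffer;
  int count;
};

// Size of the arguments up to the end of the last member, that is,
// sizeof(KernelArgs) minus any tail padding.
constexpr std::size_t args_size_without_tail_padding() {
  return offsetof(KernelArgs, count) + sizeof(int);
}
```

On a common 64-bit ABI this yields 12 while `sizeof(KernelArgs)` is 16; the difference is the tail padding.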
2025-12-14 21:57:25 -08:00
David Green
35b23172c5 [AArch64] Support USDOT in performAddDotCombine (#171864)
This function does
// ADD(UDOT(zero, x, y), A) -->  UDOT(A, x, y)

Which can equally apply to USDOT too now that we have a node for it.
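
The identity behind this combine can be checked with a scalar model of a single 32-bit lane. `usdot_lane` is a hypothetical helper that ignores the vector structure of the real instruction.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Scalar model of one 32-bit lane of a mixed-sign dot product: accumulates
// four unsigned-by-signed 8-bit products into `acc`.
inline std::int32_t usdot_lane(std::int32_t acc,
                               const std::array<std::uint8_t, 4>& x,
                               const std::array<std::int8_t, 4>& y) {
  for (std::size_t i = 0; i < 4; ++i)
    acc += static_cast<std::int32_t>(x[i]) * static_cast<std::int32_t>(y[i]);
  return acc;
}
```

Since the accumulator is simply added to the sum of products, `usdot_lane(0, x, y) + a` equals `usdot_lane(a, x, y)` for any `a` that does not overflow, which is the rewrite the combine performs.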
2025-12-15 05:46:43 +00:00
Philip Ginsbach-Chen
1d821b0c6b [AArch64] use isTRNMask to calculate shuffle costs (#171524)
This builds on #169858 to fix the divergence in codegen
(https://godbolt.org/z/a9az3h6oq) between two very similar
functions initially observed in #137447 (represented in the diff by test
cases `@transpose_splat_constants` and `@transpose_constants_splat`):
```
int8x16_t f(int8_t x)
{
  return (int8x16_t) { x, 0, x, 1, x, 2, x, 3,
                       x, 4, x, 5, x, 6, x, 7 };
}

int8x16_t g(int8_t x)
{
  return (int8x16_t) { 0, x, 1, x, 2, x, 3, x,
                       4, x, 5, x, 6, x, 7, x };
}
```

The PR uses an additional `isTRNMask` call in
`AArch64TTIImpl::getShuffleCost` to ensure that we treat shuffle masks
as transpose masks even if `isTransposeMask` fails to recognise them
(meaning that `Kind == TTI::SK_Transpose` cannot be relied upon).

Follow-up work could consider modifying `isTransposeMask`, but that
would also impact other backends than AArch64.
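
For orientation, the shape of a transpose (TRN) shuffle mask can be sketched as below. This is a simplified, hypothetical checker, not LLVM's `isTRNMask`, which may accept cases that are not modelled here.

```cpp
#include <cassert>
#include <vector>

// Simplified check: returns true if `mask` selects the lanes of a TRN1
// (which == 0) or TRN2 (which == 1) transpose of two input vectors that
// each have mask.size() lanes. Indices >= mask.size() refer to the second
// input. Negative entries are treated as undefined lanes that match anything.
inline bool is_trn_mask(const std::vector<int>& mask, int which) {
  const int n = static_cast<int>(mask.size());
  if (n == 0 || n % 2 != 0)
    return false;
  for (int i = 0; i < n; i += 2) {
    // Even result lanes come from the first input...
    if (mask[i] >= 0 && mask[i] != i + which)
      return false;
    // ...and odd result lanes from the same position in the second input.
    if (mask[i + 1] >= 0 && mask[i + 1] != i + n + which)
      return false;
  }
  return true;
}
```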
2025-12-15 05:34:30 +00:00
Lang Hames
8f51da369e [orc-rt] Add Error / Exception interop. (#172247)
The ORC runtime needs to work in diverse codebases, both with and
without C++ exceptions enabled (e.g. most LLVM projects compile with
exceptions turned off, but regular C++ codebases will typically have
them turned on). This introduces a tension in the ORC runtime: If a C++
exception is thrown (e.g. by a client-supplied callback) it can't be
ignored, but orc_rt::Error values will assert if not handled prior to
destruction. That makes the following pattern fundamentally unsafe in
the ORC runtime:

```
if (auto Err = orc_rt_operation(...)) {
  log("failure, bailing out"); // <- may throw if exceptions enabled
  // Exception unwinds stack before Error is handled, triggers Error-not-checked
  // assertion here.
  return Err;
}
```

We can resolve this tension by preventing any exceptions from unwinding
through ORC runtime stack frames. We can do this while preserving
exception *values* by catching all exceptions (using `catch (...)`) and
capturing their values as a std::exception_ptr into an Error.

This patch adds APIs to simplify conversion between C++ exceptions and
Errors. These APIs are available only when the ORC runtime is configured
with ORC_RT_ENABLE_EXCEPTIONS=On (the default).

- `ExceptionError` wraps a std::exception_ptr.

- `runCapturingExceptions` takes a T() callback and converts any
exceptions thrown by the body into Errors. If T is Expected or Error
already then runCapturingExceptions returns the same type. If T is void
then runCapturingExceptions returns an Error (returning Error::success()
if no exception is thrown). If T is any other type then
runCapturingExceptions returns an Expected<T>.

- A new Error::throwOnFailure method is added that converts failing
values into thrown exceptions according to the following rules:
1. If the Error is of type ExceptionError then std::rethrow_exception is
called on the contained std::exception_ptr to rethrow the original
exception value.
2. If the Error is of any other type then std::unique_ptr<T> is thrown
where T is the dynamic type of the Error.

These rules allow exceptions to be propagated through the ORC runtime as
Errors, and for ORC runtime errors to be converted to exceptions by
clients.
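
The capture-and-rethrow idea can be sketched in a few lines. The types below are hypothetical and heavily simplified: `CapturedError` is not `orc_rt::Error`, and only the `void()` callback case described above is modelled.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for an error value that wraps a std::exception_ptr.
// The real error type is richer (for example, it asserts when destroyed
// unchecked); none of that is modelled here.
class CapturedError {
public:
  CapturedError() = default;
  explicit CapturedError(std::exception_ptr e) : eptr_(std::move(e)) {}

  // True when an exception was captured.
  explicit operator bool() const { return static_cast<bool>(eptr_); }

  // Rethrows the original exception value, if there is one.
  void throw_on_failure() const {
    if (eptr_)
      std::rethrow_exception(eptr_);
  }

private:
  std::exception_ptr eptr_;
};

// Runs a void() callback and converts anything it throws into a
// CapturedError, so no exception unwinds through the caller's frames.
template <class Fn>
CapturedError run_capturing_exceptions(Fn&& fn) {
  try {
    std::forward<Fn>(fn)();
    return CapturedError();
  } catch (...) {
    return CapturedError(std::current_exception());
  }
}
```

A client that does use exceptions can later call `throw_on_failure()` to get the original exception object back, type and message intact.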
2025-12-15 16:10:46 +11:00
Brandon Wu
c24f66e33b [llvm][RISCV] Add bf16 vfabs and vfneg intrinsics for zvfbfa. (#172130)
These are pseudoinstruction aliases for vfsgnjx and vfsgnjn.

Co-authored-by: Craig Topper <craig.topper@sifive.com>
2025-12-15 13:04:21 +08:00
Hristo Hristov
9a03a30706 [libc++][unordered_map] Applied [[nodiscard]] (#170423)
[[nodiscard]] should be applied to functions where discarding the return
value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/unord.map
2025-12-15 05:56:19 +02:00
Shenghang Tsai
7ac01771c3 [mlir][ExecutionEngine] Remove stderr printing when propagating errors (#171997)
2025-12-14 19:46:13 -08:00
Lang Hames
00b92e3d81 [orc-rt] Add config.h.in (missing from 7ccf968d0b).
This file was accidentally left out of commit 7ccf968d0b.
2025-12-15 14:07:47 +11:00
Hristo Hristov
59fb3bc3e7 [libc++][pair] Applied [[nodiscard]] (#171999)
`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/pairs
2025-12-15 10:42:48 +08:00
Hristo Hristov
b6d940d9bc [libc++][multimap] Applied [[nodiscard]] (#171644)
`[[nodiscard]]` should be applied to functions where discarding the
return value is most likely a correctness issue.

- https://libcxx.llvm.org/CodingGuidelines.html
- https://wg21.link/multimap
2025-12-15 10:37:32 +08:00
Valentin Clement (バレンタイン クレメン)
b9d1432213 [flang-rt][device] Use snprintf result for length (#172239)
The buffer might not be null terminated on the device and result in 1
byte invalid read when trying to get the length.
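
The general technique, taking the length from `snprintf`'s return value rather than scanning the buffer afterwards, can be sketched as follows. `format_int` is a hypothetical helper and is not taken from flang-rt.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdio>

// Formats `value` into `buffer` and returns the number of characters
// stored, derived from snprintf's return value instead of a later strlen()
// scan that would depend on the terminator being present.
inline std::size_t format_int(char* buffer, std::size_t capacity, int value) {
  const int written = std::snprintf(buffer, capacity, "%d", value);
  if (written < 0 || capacity == 0)
    return 0; // encoding error, or no room at all
  // snprintf reports the length the full output would have had, so clamp
  // it to what actually fits alongside the terminator.
  return std::min(static_cast<std::size_t>(written), capacity - 1);
}
```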
2025-12-15 01:08:30 +00:00
Maksim Panchenko
adaca1348e [BOLT] Introduce getOutputBinaryFunctions(). NFCI (#172174)
To gain better control over the functions that go into the output file
and their order, introduce `BinaryContext::getOutputBinaryFunctions()`.

The new API returns a modifiable list of functions in output order.

This list is filled by a new `PopulateOutputFunctions` pass and includes
emittable functions from the input file, plus functions added by BOLT
(injected functions).

The new functionality allows input functions to be freely intermixed with
injected ones in the output, which will be used in new PRs.

The new function replaces `BinaryContext::getSortedFunctions()`, but
unlike its predecessor, it includes injected functions in the returned
list.
2025-12-14 16:29:01 -08:00
Lang Hames
ca81d7c2db [orc-rt] Ensure EH/RTTI=On overrides LLVM opts, applies to unit tests. (#172155)
When -DORC_RT_ENABLE_EXCEPTIONS=On and -DORC_RT_ENABLE_RTTI=On are
passed we need to ensure that the resulting compiler flags (e.g.
-fexceptions, -frtti for clang/GCC) are appended so that we override any
inherited options (e.g. -fno-exceptions, -fno-rtti) from LLVM.

Updates unit tests to ensure that these compiler options are applied to
them too.
2025-12-15 11:06:27 +11:00
Victor Chernyakin
21a25f44af [clang-tidy] Suggest std::views::reverse instead of std::ranges::reverse_view in modernize-use-ranges (#172199)
`std::views::FOO` should in almost all cases be preferred over
`std::ranges::FOO_view`. For a detailed explanation of why that is, see
https://brevzin.github.io/c++/2023/03/14/prefer-views-meow/. The TLDR is
that it's shorter to spell (which is obvious) and can in certain cases
be more efficient (which is less obvious; see the article if curious).
2025-12-14 16:02:36 -08:00
Haojian Wu
ecfdf8cb05 [bazel] One more fix for f785ca0d72
2025-12-14 23:44:23 +01:00
Raul Tambre
14c69497b3 Partially revert "[NFCI][lldb][test][asm] Enable AT&T syntax explicitly (#166770)" (#172233)
Flag changes reverted as those require the X86 target to be enabled.  
Don't have time to test fixes as I need to go to sleep so will revert for now.

Reverts: 423919d31f
2025-12-15 00:26:54 +02:00
Rolf Morel
f12fcf030c [MLIR][Transform][Python] transform.foreach wrapper and .owner OpViews (#172228)
Friendlier wrapper for transform.foreach.

To facilitate that friendliness, makes it so that OpResult.owner returns
the relevant OpView instead of Operation. For good measure, also changes
Value.owner to return OpView instead of Operation, thereby ensuring
consistency. That is, makes it so that all op-returning .owner
accessors return OpView (and thereby give access to all goodies
available on registered OpViews.)

Reland of #171544 due to fixup for integration test.
2025-12-14 22:10:31 +00:00
Raul Tambre
423919d31f [NFCI][lldb][test][asm] Enable AT&T syntax explicitly (#166770)
Implementation files using the Intel syntax typically explicitly specify it.
Do the same for the few files using AT&T syntax.

This enables building LLVM with `-mllvm -x86-asm-syntax=intel` in one's Clang config files
(i.e. a global preference for Intel syntax).
2025-12-14 23:54:25 +02:00
Haojian Wu
bebc28a0ac [bazel] Port for f785ca0d72
2025-12-14 22:36:46 +01:00
Mehdi Amini
b9fe6532a7 Revert "[MLIR][Transform][Python] transform.foreach wrapper and .owner OpViews" (#172225)
Reverts llvm/llvm-project#171544 ; bots are broken.
2025-12-14 21:27:02 +00:00
Florian Hahn
bcbbe2c2bc [VPlan] Pass backedge value directly to FOR and reduction phis (NFC).
Pass backedge values directly to VPFirstOrderRecurrencePHIRecipe and
VPReductionPHIRecipe, as they must be provided and available.

Split off from https://github.com/llvm/llvm-project/pull/168291.
2025-12-14 20:59:22 +00:00
Rolf Morel
4cdec92827 [MLIR][Transform][Python] transform.foreach wrapper and .owner OpViews (#171544)
Friendlier wrapper for `transform.foreach`.

To facilitate that friendliness, makes it so that `OpResult.owner`
returns the relevant `OpView` instead of `Operation`. For good measure,
also changes `Value.owner` to return `OpView` instead of `Operation`,
thereby ensuring consistency. That is, makes it so that all
op-returning `.owner` accessors return `OpView` (and thereby give access
to all goodies available on registered `OpView`s.)
2025-12-14 20:44:15 +00:00