
intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-13 02:38:07 +08:00

Author	SHA1	Message	Date
Nathan Corbyn	2f9bf3f292	[GlobalISel](NFC) Refactor construction of LLTs in `LegalizerHelper` (#170664 ) I spotted a number of places where we're duplicating logic provided by the `LLT` class inline in `LegalizerHelper`. This PR tidies up these spots.	2025-12-15 12:26:27 +00:00
int-zjt	72f3995363	[CodeExtractor] Optimize PHI incoming value removal using removeIncomingValueIf() (NFC) (#171956 )	2025-12-15 20:00:54 +08:00
int-zjt	c9c46a0820	[CloneFunction] Optimize PHI incoming value removal using reverse iteration (NFC) (#171955 )	2025-12-15 20:00:25 +08:00
David Spickett	9f176e30e6	[libcxx][docs] Fix bootstrapping build configure command (#172015 ) If I take the command from the page and add my triple like so: $ cmake -G Ninja -S llvm -B build \ -DCMAKE_BUILD_TYPE=RelWithDebInfo \ -DLLVM_ENABLE_PROJECTS="clang" \ # Configure -DLLVM_ENABLE_RUNTIMES="libcxx;libcxxabi;libunwind;compiler-rt" \ -DLLVM_RUNTIME_TARGETS="aarch64-unknown-linux-gnu" CMake Warning: Ignoring extra path from command line: " " <...> -- Build files have been written to: /home/david.spickett/llvm-project/build -bash: -DLLVM_ENABLE_RUNTIMES=libcxx;libcxxabi;libunwind;compiler-rt: command not found As the comment is after the backslash, it's considered part of the next line. This comments out the ENABLE_RUNTIMES line and makes the RUNTIME_TARGETS line look like another command. To fix this, put the comment before the configure command. I also moved the other inline comments (which are fine) closer to the text since they don't have to line up with the configure one anymore.	2025-12-15 11:55:03 +00:00
Nashe Mncube	b225907804	[AArch64]Enable aggressive interleaving for A320 (#169825 ) This patch makes use of aggressive interleaving options for the A320 subtarget. This is done by adding a new local parameter to the AArch64Subtarget class. With this enabled we see an aggregate uplift of 0.7% on internal benchmark suites with up to 51% uplift on individual benchmark workloads.	2025-12-15 11:45:06 +00:00
David Spickett	10767aad89	[llvm][examples] Disable some JIT examples when threading is disabled (#172282 ) This fixes an error on our Armv8 bot: ``` <...>/RemoteJITUtils.cpp:132:24: error: use of undeclared identifier 'DynamicThreadPoolTaskDispatcher' 132 \| std::make_unique<DynamicThreadPoolTaskDispatcher>(std::nullopt), \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ``` These examples require LLVM_ENABLE_THREADS to be ON, and cannot run otherwise. As a comment says elsewhere: ``` // Out of process mode using SimpleRemoteEPC depends on threads. ```	2025-12-15 11:40:27 +00:00
Jay Foad	515c3bdda0	[AMDGPU] Stop handling soft waitcnts in pseudoToMCOpcode. NFC. (#172278 ) Since #87539 all soft waitcnts should have been promoted by SIInsertWaitcnts.	2025-12-15 11:33:55 +00:00
Nikolas Klauser	57aab63417	[libc++] Fix std::for_each(associative-container) not using std::invoke and projections (#171984 ) #164405 added specializations of `for_each` that didn't do the ranges call shenanigans, but instead just did what the classic algorithms have to do. This updates the calls to work for the ranges overloads as well.	2025-12-15 12:05:59 +01:00
Srinivasa Ravi	7d0865122e	[clang][NVPTX] Add support for mixed-precision FP arithmetic (#168359 ) This change adds support for mixed precision floating point arithmetic for `f16` and `bf16` where the following patterns: ``` %fh = fpext half %h to float %resfh = fp-operation(%fh, ...) ... %fb = fpext bfloat %b to float %resfb = fp-operation(%fb, ...) where the fp-operation can be any of: - fadd - fsub - llvm.fma.f32 - llvm.nvvm.add(/fma).* ``` are lowered to the corresponding mixed precision instructions which combine the conversion and operation into one instruction from `sm_100` onwards. This also adds the following intrinsics to complete support for all variants of the floating point `add/fma` operations in order to support the corresponding mixed-precision instructions: - `llvm.nvvm.add.(rn/rz/rm/rp){.ftz}.sat.f` - `llvm.nvvm.fma.(rn/rz/rm/rp){.ftz}.sat.f` We lower `fneg` followed by one of the above addition intrinsics to the corresponding `sub` instruction. Tests are added in `fp-arith-sat.ll` , `fp-fold-sub.ll`, and `bultins-nvptx.c` for the newly added intrinsics and builtins, and in `mixed-precision-fp.ll` for the mixed precision instructions. PTX spec reference for mixed precision instructions: https://docs.nvidia.com/cuda/parallel-thread-execution/#mixed-precision-floating-point-instructions	2025-12-15 16:28:23 +05:30
Ramkumar Ramachandra	0636225b93	[VPlan] Directly unroll VectorPointerRecipe (#168886 ) In an effort to get rid of VPUnrollPartAccessor and directly unroll recipes, start by directly unrolling VectorPointerRecipe, allowing for VPlan-based simplifications and simplification of the corresponding execute.	2025-12-15 10:54:06 +00:00
Ivan Butygin	b3ec8be22b	[mlir][gpu] Expose some utility functions from `gpu-to-binary` infra (#172205 ) For people who do not want to use a single monolithic pass.	2025-12-15 13:39:19 +03:00
Andrei Topala	4e95718a2a	[libc++] Remove unused __parent_pointer alias from __tree and map (#172185 ) The `__parent_pointer` type alias was marked to be removed in `d163ab3323`. At that time, <map> still had uses of `__parent_pointer` as a local variable type in operator[] and at(). Those uses were removed in `4a2dd31f16`, which refactored `__find_equal` to return a pair instead of using an out parameter. However, the typedef in <map> and the alias in __tree were left behind. This patch removes the unused typedef from <map> and the `__parent_pointer` alias from __tree. Signed-off-by: Krechals <topala.andrei@gmail.com>	2025-12-15 11:32:03 +01:00
Ahmed Nour	ed79fd714f	[Clang][x86]: allow PCLMULQDQ intrinsics to be used in constexpr (#169214 ) Resolves #168741	2025-12-15 10:27:17 +00:00
Petar Avramovic	f024026a21	AMDGPU/GlobalISel: Regbanklegalize for G_CONCAT_VECTORS (#171471 ) RegBankLegalize using trivial mapping helper, assigns same reg bank to all operands, vgpr or sgpr. Uncovers multiple codegen and regbank combiner regressions related to looking through sgpr to vgpr copies. Skip regbankselect-concat-vector.mir since agprs are not yet supported.	2025-12-15 10:37:40 +01:00
Ingo Müller	f3e508ceec	[mlir:bazel] Fix missing dependency introduced in #171727 . (#172267 ) That PR added an include to `LLVMOps.td` without adding a target providing that file. Curiously, this does not break the official builds but it does break my bazel build. Signed-off-by: Ingo Müller <ingomueller@google.com>	2025-12-15 09:33:47 +00:00
Michael Buch	90783f5c4a	[lldb][AppleObjCDeclVendor] Fix format specifiers when printing log (#172263 ) This was causing a crash when enabling the expression log: ``` 4 LLDB 0x1376d68d0 llvm::formatv_object_base::parseFormatString(llvm::StringRef, unsigned long, bool) + 532 5 LLDB 0x13776d838 llvm::formatv_object_base::format(llvm::raw_ostream&) const + 84 6 LLDB 0x13776d7d4 llvm::raw_ostream::operator<<(llvm::formatv_object_base const&) + 36 7 LLDB 0x1375f4980 lldb_private::Log::Format(llvm::StringRef, llvm::StringRef, llvm::formatv_object_base const&) + 164 8 LLDB 0x12f7b39f0 lldb_private::AppleObjCExternalASTSource::CompleteType(clang::TagDecl) + 416 9 LLDB 0x12fa038dc lldb_private::ClangASTSource::FindExternalLexicalDecls(clang::DeclContext const, llvm::function_ref<bool (clang::Decl::Kind)>, llvm::SmallVectorImpl<clang::Decl>&) + 1132 10 LLDB 0x135d94838 clang::ExternalASTSource::FindExternalLexicalDecls(clang::DeclContext const, llvm::SmallVectorImpl<clang::Decl>&) + 92 11 LLDB 0x135d94690 clang::DeclContext::LoadLexicalDeclsFromExternalStorage() const + 204 12 LLDB 0x135d95ca0 clang::DeclContext::buildLookup() + 308 13 LLDB 0x135d964b8 clang::DeclContext::lookupImpl(clang::DeclarationName, clang::DeclContext const) const + 824 14 LLDB 0x135d96168 clang::DeclContext::lookup(clang::DeclarationName) const + 124 15 LLDB 0x134f093d4 clang::Sema::CheckImplicitSpecialMemberDeclaration(clang::Scope, clang::FunctionDecl) + 128 16 LLDB 0x134efb488 clang::Sema::DeclareImplicitDestructor(clang::CXXRecordDecl) + 932 17 LLDB 0x1352ddf24 clang::Sema::LookupSpecialMember(clang::CXXRecordDecl, clang::CXXSpecialMemberKind, bool, bool, bool, bool, bool)::$_0::operator()() const + 36 ```	2025-12-15 09:28:19 +00:00
Fabrice de Gans	96881c1226	llvm: Export IndexedCodeGenDataLazyLoading (#169563 ) This is needed so the llvm-cgdata tool properly builds with `LLVM_BUILD_LLVM_DYLIB` so LLVM can be built as a DLL on Windows. This effort is tracked in #109483.	2025-12-15 04:25:30 -05:00
Duncan Ogilvie	5785b4a4fb	Add .gitignore file in .cache/clangd/index (#170003 ) This solves a common issue where users have to manually add the `.cache/clangd/index/` folder to their `.gitignore`. I got this idea from [ruff](https://github.com/astral-sh/ruff), which creates `.ruff_cache/.gitignore` and it would greatly improve the user experience for everyone without requiring per-computer configurations and without any significant cost.	2025-12-15 10:18:46 +01:00
Younan Zhang	a5bfe8e5c3	[Clang] Recompute the value category when rebuilding SubstNonTypeTemplateParmExpr (#172251 ) In concept checking, we need to transform SubstNTTPExpr when evaluating constraints. The value category is initially computed during parameter mapping, possibly with a dependent expression. However during instantiation, it wasn't recomputed, and the stale category is propagated into parent expressions. So we may end up with an 'out-of-thin-air' reference type, which breaks the evaluation. We now call BuildSubstNonTypeTemplateParmExpr in TreeTransform, in which the value category is recomputed. The issue was brought by both `078e99e` and the concept normalization patch, which are not released yet, so no release note. Fixes https://github.com/llvm/llvm-project/issues/170856	2025-12-15 17:16:58 +08:00
Benjamin Maxwell	17f29c22ab	[AArch64] Support lowering smaller than legal LOOP_DEP_MASKs to whilewr/rw (#171982 ) This adds support for lowering smaller-than-legal masks such as: ``` <vscale x 8 x i1> @llvm.loop.dependence.war.mask.nxv8i1(ptr %a, ptr %b, i64 1) ``` To a whilewr + unpack. It also slightly simplifies the lowering.	2025-12-15 09:12:58 +00:00
Nikita Popov	80b900e91c	[InstSimplify] Support ptrtoaddr in simplifyICmpInst() (#171985 ) This is basically the same change as #162653, but for InstSimplify instead of ConstantFolding. It folds `icmp (ptrtoaddr x, ptrtoaddr y)` to `icmp (x, y)` and `icmp (ptrtoaddr x, C)` to `icmp (x, inttoptr C)`. The fold is restricted to the case where the result type is the address type, as icmp only compares the icmp bits. As in the other PR, I think in practice all the folds are also going to work if the ptrtoint result type is larger than the address size, but it's unclear how to justify this in general.	2025-12-15 09:06:28 +00:00
A. Jiang	37c7f695dc	[libc++][char_traits] Applied `[[nodiscard]]` (#172244 ) `[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.html - https://wg21.link/char.traits	2025-12-15 16:38:01 +08:00
Timm Baeder	db557bee1e	[clang][bytecode][NFC] Add Block::getBlockDesc<T>() (#172218 ) Which returns the block-level descriptor. This way we don't have to do the reinterpret_cast dance everywhere.	2025-12-15 09:21:41 +01:00
Nikita Popov	ce1b04720a	[SelectOptimize] Respect optnone (#170858 ) Add the missing skipFunction() call so that optnone attributes and opt-bisect-limit are respected.	2025-12-15 09:21:02 +01:00
Juan Manuel Martinez Caamaño	c13bf9eb26	Reapply "[AMDGPU][SDAG] Add missing cases for SI_INDIRECT_SRC/DST (#170323 ) (#171838 ) A buildbot failed for the original patch. https://github.com/llvm/llvm-project/pull/171835 addresses the issue raised by the buildbot. After the fix is merged, the original patch is reapplied without any change.	2025-12-15 09:05:00 +01:00
David Green	e309272467	[AArch64][ARM] Regenerate llvm-mca tests. NFC	2025-12-15 07:28:21 +00:00
David Green	1e9e38983c	[AArch64] Add a performBICiCombine function. This moves the code out of PerformDAGCombine directly, changing the return to return SDValue(N, 0) to match other uses of SimplifyDemandedBits.	2025-12-15 07:23:31 +00:00
Hristo Hristov	6ff3df87d1	[libc++][unordered_set] Applied `[[nodiscard]]` (#170435 ) [[nodiscard]] should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.html - https://wg21.link/unord.set	2025-12-15 09:17:35 +02:00
Hristo Hristov	e22ff9b3d9	[libc++][unordered_multiset] Applied `[[nodiscard]]` (#171664 ) `[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.htm - https://wg21.link/unord.multiset	2025-12-15 09:08:33 +02:00
Hristo Hristov	a5b7c42ab2	[libc++][unordered_multimap] Applied `[[nodiscard]]` (#171659 ) `[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.htm - https://wg21.link/unord.multimap	2025-12-15 15:05:57 +08:00
Craig Topper	ffaa6f23fd	[RISCV] Custom legalize i32 saddo/ssubo on RV64 to return a sign extended value for the data result. (#172112 ) This is consistent with how we handle regular ADD/SUB and helps with computeNumSignBits optimizations. Fixes #172089	2025-12-14 22:33:01 -08:00
Craig Topper	7fa062ad58	[RISCV] Add BFloat16 to mangleRISCVFixedRVVVectorType. (#172095 )	2025-12-14 22:32:03 -08:00
Craig Topper	c878cf4580	[SelectionDAG] Consistently use doxygen comments in the NodeType enum. NFC (#172178 )	2025-12-14 22:31:47 -08:00
Lang Hames	61908c5957	[orc-rt] Prevent RTTIExtends from being used for errors. (#172250 ) Custom error types (ErrorInfoBase subclasses) should use ErrorExtends as of `8f51da369e`. Adding a static_assert allows us to enforce that at compile-time.	2025-12-15 17:21:25 +11:00
Henrich Lauko	5a581acb29	[CIR] Rename allEnumCasesCovered to all_enum_cases_covered (#172153 ) Use the conventional snake_case for MLIR assembly and align with operation documentation that already mentions snake_cased attribute.	2025-12-15 07:13:41 +01:00
Kevin Sala Penades	35315a84b4	[offload] Fix CUDA args size by subtracting tail padding (#172249 ) This commit makes the cuLaunchKernel call to pass the total arguments size without tail padding.	2025-12-14 21:57:25 -08:00
David Green	35b23172c5	[AArch64] Support USDOT in performAddDotCombine (#171864 ) This function does // ADD(UDOT(zero, x, y), A) --> UDOT(A, x, y) Which can equally apply to USDOT too now that we have a node for it.	2025-12-15 05:46:43 +00:00
Philip Ginsbach-Chen	1d821b0c6b	[AArch64] use `isTRNMask` to calculate shuffle costs (#171524 ) This builds on #169858 to fix the divergence in codegen (https://godbolt.org/z/a9az3h6oq) between two very similar functions initially observed in #137447 (represented in the diff by test cases `@transpose_splat_constants` and `@transpose_constants_splat`: ``` int8x16_t f(int8_t x) { return (int8x16_t) { x, 0, x, 1, x, 2, x, 3, x, 4, x, 5, x, 6, x, 7 }; } int8x16_t g(int8_t x) { return (int8x16_t) { 0, x, 1, x, 2, x, 3, x, 4, x, 5, x, 6, x, 7, x }; } ``` The PR uses an additional `isTRNMask` call in `AArch64TTIImpl::getShuffleCost` to ensure that we treat shuffle masks as transpose masks even if `isTransposeMask` fails to recognise them (meaning that `Kind == TTI::SK_Transpose` cannot be relied upon). Follow-up work could consider modifying `isTransposeMask`, but that would also impact other backends than AArch64.	2025-12-15 05:34:30 +00:00
Lang Hames	8f51da369e	[orc-rt] Add Error / Exception interop. (#172247 ) The ORC runtime needs to work in diverse codebases, both with and without C++ exceptions enabled (e.g. most LLVM projects compile with exceptions turned off, but regular C++ codebases will typically have them turned on). This introduces a tension in the ORC runtime: If a C++ exception is thrown (e.g. by a client-supplied callback) it can't be ignored, but orc_rt::Error values will assert if not handled prior to destruction. That makes the following pattern fundamentally unsafe in the ORC runtime: ``` if (auto Err = orc_rt_operation(...)) { log("failure, bailing out"); // <- may throw if exceptions enabled // Exception unwinds stack before Error is handled, triggers Error-not-checked // assertion here. return Err; } ``` We can resolve this tension by preventing any exceptions from unwinding through ORC runtime stack frames. We can do this while preserving exception values by catching all exceptions (using `catch (...)`) and capturing their values as a std::exception_ptr into an Error. This patch adds APIs to simplify conversion between C++ exceptions and Errors. These APIs are available only when enabled when the ORC runtime is configured with ORC_RT_ENABLE_EXCEPTIONS=On (the default). - `ExceptionError` wraps a std::exception_ptr. - `runCapturingExceptions` takes a T() callback and converts any exceptions thrown by the body into Errors. If T is Expected or Error already then runCapturingExceptions returns the same type. If T is void then runCapturingExceptions returns an Error (returning Error::success() if no exception is thrown). If T is any other type then runCapturingExceptions returns an Expected<T>. - A new Error::throwOnFailure method is added that converts failing values into thrown exceptions according to the following rules: 1. If the Error is of type ExceptionError then std::rethrow_exception is called on the contained std::exception_ptr to rethrow the original exception value. 2. If the Error is of any other type then std::unique_ptr<T> is thrown where T is the dynamic type of the Error. These rules allow exceptions to be propagated through the ORC runtime as Errors, and for ORC runtime errors to be converted to exceptions by clients.	2025-12-15 16:10:46 +11:00
Brandon Wu	c24f66e33b	[llvm][RISCV] Add bf16 vfabs and vfneg intrinsics for zvfbfa. (#172130 ) These are pseudoinstruction aliases for vfsgnjx and vfsgnjn. Co-authored-by: Craig Topper <craig.topper@sifive.com>	2025-12-15 13:04:21 +08:00
Hristo Hristov	9a03a30706	[libc++][unordered_map] Applied `[[nodiscard]]` (#170423 ) [[nodiscard]] should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.html - https://wg21.link/unord.map	2025-12-15 05:56:19 +02:00
Shenghang Tsai	7ac01771c3	[mlir][ExecutionEngine] Remove stderr printing when propagating errors (#171997 )	2025-12-14 19:46:13 -08:00
Lang Hames	00b92e3d81	[orc-rt] Add config.h.in (missing from `7ccf968d0b`). This file was accidentally left out of commit `7ccf968d0b`.	2025-12-15 14:07:47 +11:00
Hristo Hristov	59fb3bc3e7	[libc++][pair] Applied `[[nodiscard]]` (#171999 ) `[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.html - https://wg21.link/pairs	2025-12-15 10:42:48 +08:00
Hristo Hristov	b6d940d9bc	[libc++][multimap] Applied `[[nodiscard]]` (#171644 ) `[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.html - https://wg21.link/multimap	2025-12-15 10:37:32 +08:00
Valentin Clement (バレンタインクレメン)	b9d1432213	[flang-rt][device] Use snprintf result for length (#172239 ) The buffer might not be null terminated on the device and result in 1 byte invalid read when trying to get the length.	2025-12-15 01:08:30 +00:00
Maksim Panchenko	adaca1348e	[BOLT] Introduce getOutputBinaryFunctions(). NFCI (#172174 ) To gain better control over the functions that go into the output file and their order, introduce `BinaryContext::getOutputBinaryFunctions()`. The new API returns a modifiable list of functions in output order. This list is filled by a new `PopulateOutputFunctions` pass and includes emittable functions from the input file, plus functions added by BOLT (injected functions). The new functionality allows to freely intermix input functions with injected ones in the output, which will be used in new PRs. The new function replaces `BinaryContext::getSortedFunctions()`, but unlike its predecessor, it includes injected functions in the returned list.	2025-12-14 16:29:01 -08:00
Lang Hames	ca81d7c2db	[orc-rt] Ensure EH/RTTI=On overrides LLVM opts, applies to unit tests. (#172155 ) When -DORC_RT_ENABLE_EXCEPTIONS=On and -DORC_RT_ENABLE_RTTI=On are passed we need to ensure that the resulting compiler flags (e.g. -fexceptions, -frtti for clang/GCC) are appended so that we override any inherited options (e.g. -fno-exceptions, -fno-rtti) from LLVM. Updates unit tests to ensure that these compiler options are applied to them too.	2025-12-15 11:06:27 +11:00
Victor Chernyakin	21a25f44af	[clang-tidy] Suggest `std::views::reverse` instead of `std::ranges::reverse_view` in `modernize-use-ranges` (#172199 ) `std::views::FOO` should in almost all cases be preferred over `std::ranges::FOO_view`. For a detailed explanation of why that is, see https://brevzin.github.io/c++/2023/03/14/prefer-views-meow/. The TLDR is that it's shorter to spell (which is obvious) and can in certain cases be more efficient (which is less obvious; see the article if curious).	2025-12-14 16:02:36 -08:00
Haojian Wu	ecfdf8cb05	[bazel] One more fix for `f785ca0d72`	2025-12-14 23:44:23 +01:00


562690 Commits