intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-26 03:56:16 +08:00

Author	SHA1	Message	Date
Schrodinger ZHU Yifan	2dd77050d4	[libc] add cpu feature flags for SVE/SVE2/MOPS (#166884 ) Add in SVE/SVE2/MOPS features for aarch64 cpus. These features may be interesting for future memory/math routines. SVE/SVE2 are now being accepted in more implementations: ``` ❯ echo | clang-21 -dM -E - -march=native | grep -i ARM_FEAT #define __ARM_FEATURE_ATOMICS 1 #define __ARM_FEATURE_BF16 1 #define __ARM_FEATURE_BF16_SCALAR_ARITHMETIC 1 #define __ARM_FEATURE_BF16_VECTOR_ARITHMETIC 1 #define __ARM_FEATURE_BTI 1 #define __ARM_FEATURE_CLZ 1 #define __ARM_FEATURE_COMPLEX 1 #define __ARM_FEATURE_CRC32 1 #define __ARM_FEATURE_DIRECTED_ROUNDING 1 #define __ARM_FEATURE_DIV 1 #define __ARM_FEATURE_DOTPROD 1 #define __ARM_FEATURE_FMA 1 #define __ARM_FEATURE_FP16_SCALAR_ARITHMETIC 1 #define __ARM_FEATURE_FP16_VECTOR_ARITHMETIC 1 #define __ARM_FEATURE_FRINT 1 #define __ARM_FEATURE_IDIV 1 #define __ARM_FEATURE_JCVT 1 #define __ARM_FEATURE_LDREX 0xF #define __ARM_FEATURE_MATMUL_INT8 1 #define __ARM_FEATURE_NUMERIC_MAXMIN 1 #define __ARM_FEATURE_PAUTH 1 #define __ARM_FEATURE_QRDMX 1 #define __ARM_FEATURE_RCPC 1 #define __ARM_FEATURE_SVE 1 #define __ARM_FEATURE_SVE2 1 #define __ARM_FEATURE_SVE_BF16 1 #define __ARM_FEATURE_SVE_MATMUL_INT8 1 #define __ARM_FEATURE_SVE_VECTOR_OPERATORS 2 #define __ARM_FEATURE_UNALIGNED 1 ``` MOPS is another set of extensions for string operations, but may not be generally available for now: ``` ❯ echo | clang-21 -dM -E - -march=armv9.2a+mops | grep -i MOPS #define __ARM_FEATURE_MOPS 1 ```	2025-11-07 13:58:54 -05:00
Aiden Grossman	c6969e578a	[Github][Bazel] Add Workflow to Run Bazel Build (#165071 ) This patch adds a job to the bazel checks workflow to run the bazel build/test. This patch only tests a couple projects just to get things going. The plan is to expand to more projects eventually and set up a GCS bucket for caching so jobs complete quickly by using cached artifacts. This should add minimal load to the CI given the low frequency of bazel PRs, and especially when we enable GCS based caching due to bazel's effective use of caching. Google is also sponsoring the Linux Premerge CI and is interested in having premerge bazel builds which is why it makes sense to do premerge testing of this alternative build system using those resources.	2025-11-07 10:41:26 -08:00
Valentin Clement (バレンタインクレメン)	b4d7d3f745	[mlir][NVVM] Add nvvm.membar operation (#166698 ) Add nvvm.membar operation with level as defined in https://docs.nvidia.com/cuda/parallel-thread-execution/#parallel-synchronization-and-communication-instructions-membar This will be used to replace direct intrinsic call in CUDA Fortran for `threadfence()`, `threadfence_block()` and `threadfence_system()` currently lowered here: `e700f15702/flang/lib/Optimizer/Builder/CUDAIntrinsicCall.cpp (L1310)` The nvvm membar intrinsics are also used in CUDA C/C++ (`49f55f4991/clang/lib/Headers/__clang_cuda_device_functions.h (L528)`)	2025-11-07 10:39:01 -08:00
Dominik Adamski	67198d1997	[libc] Fix wrapper headers for `at_quick_exit` on GLIBC for C++11 (#166960 ) Eliminate compilation error related to missing exception specification 'noexcept(true)' for at_quick_exit function in C++11.	2025-11-07 19:36:42 +01:00
hanbeom	50ba89a22e	[VectorCombine] support mismatching extract/insert indices for foldInsExtFNeg (#126408 ) insertelt DestVec, (fneg (extractelt SrcVec, Index)), Index -> shuffle DestVec, (shuffle (fneg SrcVec), poison, SrcMask), Mask Previously, the above transform was only possible if the Extract/Insert Index was the same; this patch makes the above transform possible even if the two indexes are different. Proof: https://alive2.llvm.org/ce/z/aDfdyG Fixes: https://github.com/llvm/llvm-project/issues/125675	2025-11-07 18:35:40 +00:00
LU-JOHN	b78f6fca38	[AMDGPU][NFC] Pre-commit shlN_add test results with sdag (#166636 ) Pre-commit shlN_add test results with sdag. --------- Signed-off-by: John Lu <John.Lu@amd.com> Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-11-07 12:35:26 -06:00
Steven Wu	ebb61a5bea	[CAS] Add llvm-cas tools to inspect on-disk LLVMCAS (#166481 ) Add a command-line tool `llvm-cas` to inspect the OnDisk CAS for debugging purposes. It can be used to look up/update ObjectStore or put/get cache entries from ActionCache, together with other debugging capabilities.	2025-11-07 10:32:55 -08:00
Nicolai Hähnle	917d815d4e	AMDGPU: Preliminary documentation for named barriers (#165502 )	2025-11-07 18:10:59 +00:00
Chinmay Deshpande	4637bf0c76	[NFC][AMDGPU][GISel] Precommit GlobalISel specific tests for call instruction (#165898 )	2025-11-07 10:10:17 -08:00
Ryotaro Kasuga	9e341b36ed	[DA] Properly pass outermost loop to monotonicity checker (#166928 ) This patch fixes the unexpected result in monotonicity check for `@step_is_variant` in `monotonicity-no-wrap-flags.ll`. Currently, the SCEV is considered non-monotonic if it contains an expression which is neither loop-invariant nor an affine addrec. In `@step_is_variant`, the `offset_i` satisfies this condition, but `offset_i + j` was classified as monotonic. The root cause is that a non-outermost loop was passed to monotonicity checker instead of the outermost one. This patch ensures that the correct outermost loop is passed.	2025-11-07 18:08:04 +00:00
Jonas Devlieghere	cce1055e48	[lldb] Correctly detach (rather than kill) when connecting with gdb-remote (#166869 ) We weren't setting `m_should_detach` when going through the `DoConnectRemote` code path. This meant that when you attached to a remote process with `gdb-remote <port>` and used Ctrl+D, it would kill the process instead of detaching from it. rdar://156111423	2025-11-07 18:07:38 +00:00
Matthias Springer	3740368529	[mlir][arith] Fix `arith.select` lowering after #166513 (#166692 ) #166513 broke the lowering of `arith.select` with unsupported FP4 types. For this op, it is fine to convert to `i4`.	2025-11-07 09:59:55 -08:00
Alex Langford	9cca883dd0	Revert "[NFCI][lldb][test] Avoid unnecessary GNU extension for assembly call" (#166970 ) Reverts llvm/llvm-project#166769 Darwin platforms prefix symbols with `_`, other platforms don't necessarily.	2025-11-07 17:38:45 +00:00
Tarun Prabhu	03d8184d65	[flang][NFC] Strip trailing whitespace from tests (1 of N) Only the fortran source files in flang/test have been modified. The other files in the directory will be cleaned up in subsequent commits	2025-11-07 10:29:33 -07:00
Simon Pilgrim	626cbf70f1	[X86] isGuaranteedNotToBeUndefOrPoison - add simple target shuffles with known test coverage (#161553 ) Add a number of simple target shuffles (fixed shuffle mask or simple immediate control) to isGuaranteedNotToBeUndefOrPoison/canCreateUndefOrPoisonForTargetNode that have known test coverage and obviously don't introduce undef/poison. These were found by adding an assert for unhandled target shuffles and running over CodeGen/X86 - providing explicit test coverage is incredibly difficult as ISD::VECTOR_SHUFFLE nodes will typically handle freeze nodes before we lower to these target shuffle nodes.	2025-11-07 17:22:50 +00:00
Kazu Hirata	a3b5b4bd79	[clang] Proofread *.rst (#166897 ) This patch is limited to single-word replacements to fix spelling and/or grammar to ease the review process. Punctuation and markdown fixes are specifically excluded.	2025-11-07 08:57:18 -08:00
Michael Liao	f55b393ea0	[clang][CIR] Fix build. NFC - 'getStmtExprResult' is removed after `d9c7c76`. Use the original one to get the compound stmt's expr result.	2025-11-07 11:52:26 -05:00
Tom Murray	9857791c44	[bazel] Add mlir/utils/generate-test-checks.py to bazel overlay (#160693 )	2025-11-07 10:50:07 -06:00
A. Jiang	f090dd15a1	[libc++][test] Fix-up tests for `is_clock(_v)` (#166888 ) This fixes incompleteness and inconsistency for test files added in `adc7932461`, by - renaming `libcxx/test/std/time/time.traits.is.clock/trait.is.clock.compile.pass.cpp` to `libcxx/test/std/time/time.traits/is.clock.compile.pass.cpp`, - renaming `libcxx/test/libcxx/time/time.traits.is.clock/trait.is.clock.compile.verify.cpp` to `libcxx/test/libcxx/time/time.traits/is.clock.verify.cpp`, and - adding comments clarifying what is being tested.	2025-11-08 00:45:22 +08:00
Peter Klausler	1baf7dbed2	[flang][runtime] Allow some list-directed child output to advance (#166847 ) List-directed child input is allowed to advance to new records in some circumstances, and list-directed output should be as well so that e.g. NAMELIST output via a defined WRITE(FORMATTED) generic doesn't get truncated by FORT_FMT_RECL. Fixes https://github.com/llvm/llvm-project/issues/166804.	2025-11-07 08:42:04 -08:00
Peter Klausler	3d0ae1e78a	[flang] Improve warning text (#166407 ) When an overflow or other floating-point exception occurs at compilation time while folding a conversion of a math library call to a smaller type, don't confuse the user by mentioning the conversion; just note that the exception was noted while folding the intrinsic function.	2025-11-07 08:41:34 -08:00
Peter Klausler	b3b4ea18ac	[flang] Explicit interface externals are constant expressions (#166181 ) ... but the constant expression test didn't allow for them, so they weren't working in initializers.	2025-11-07 08:41:05 -08:00
Steven Wu	093f947202	[CAS] Fix wrong usage of `llvm::sort()` in UnifiedOnDiskCache (#166963 ) Fix compare function in getAllDBDirs(). The compare function in sort should be strictly less than operator.	2025-11-07 16:36:41 +00:00
agozillon	a7c0e78fa1	[Flang][OpenMP] Unify MapInfoFinalization's BoxChar handling with other Box types (#165954 ) Currently we handle BoxChars separately and a little differently from the other BoxTypes; realistically, they can be handled the same way, and should be, to simplify the pass as much as we can.	2025-11-07 17:18:56 +01:00
Kazu Hirata	80a5332839	[mlir] Remove redundant declarations (NFC) (#166896 ) In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. The comments for these variables are also present in: mlir/include/mlir/Dialect/Bufferization/IR/BufferizationBase.td Identified with readability-redundant-declaration.	2025-11-07 07:58:48 -08:00
Kazu Hirata	de4d953246	[Demangle] Remove redundant declarations (NFC) (#166895 ) In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. Identified with readability-redundant-declaration.	2025-11-07 07:58:40 -08:00
Kazu Hirata	563ea29932	[clang-tools-extra] Remove redundant declarations (NFC) (#166894 ) In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. Identified with readability-redundant-declaration.	2025-11-07 07:58:32 -08:00
Kazu Hirata	bddab8359e	[BOLT] Remove redundant declarations (NFC) (#166893 ) In C++17, static constexpr members are implicitly inline, so they no longer require an out-of-line definition. Identified with readability-redundant-declaration.	2025-11-07 07:58:24 -08:00
Damian Heaton	70f4b596cf	Add `llvm.vector.partial.reduce.fadd` intrinsic (#159776 ) With this intrinsic, and supporting SelectionDAG nodes, we can better make use of instructions such as AArch64's `FDOT`.	2025-11-07 15:36:54 +00:00
RolandF77	411ea8e9dd	[PowerPC] Lowering support for EVL type VP_LOAD/VP_STORE (#165910 ) Map EVL type VP_LOAD/VP_STORE for fixed length vectors to PPC load/store with length.	2025-11-07 10:27:46 -05:00
LU-JOHN	67d0f181f4	[AMDGPU] Delete redundant s_or_b32 (#165261 ) Transform sequences like: ``` s_cselect_b64 s[12:13], -1, 0 s_or_b32 s6, s12, s13 ``` where s6 is dead to: `s_cselect_b64 s[12:13], -1, 0` --------- Signed-off-by: John Lu <John.Lu@amd.com>	2025-11-07 09:27:20 -06:00
Jonathan Thackray	7377ac037d	[AArch64][llvm] Add support for Neon vmmlaq_{f16,f32}_mf8_fpm intrinsics (#165431 ) Add support for the following new AArch64 Neon intrinsics: ``` float16x8_t vmmlaq_f16_mf8_fpm(float16x8_t, mfloat8x16_t, mfloat8x16_t, fpm_t); float32x4_t vmmlaq_f32_mf8_fpm(float32x4_t, mfloat8x16_t, mfloat8x16_t, fpm_t); ```	2025-11-07 15:24:13 +00:00
Björn Schäpers	bcb1b773f6	[clang-format] Add option to separate comment alignment between ... (#165033 ) normal lines and PP directives. Handling PP directives differently can be desired, like in #161848. Changing the default is not an option, there are tests for exactly the current behaviour.	2025-11-07 15:12:30 +00:00
Benjamin Maxwell	21aa788ae0	[AArch64][CostModel] Replace undef with poison in sve-arith-fp.ll (NFC) (#166930 ) `undef` values are now deprecated (see https://llvm.org/docs/UndefinedBehavior.html#undef-values). Updating this file to avoid triggering the `undef` deprecation warning on future changes.	2025-11-07 15:04:21 +00:00
Jonathan Thackray	9a8781b86f	[AArch64][llvm] Add support for new vcvt* intrinsics (#163572 ) Add support for these new vcvt* intrinsics: ``` int64_t vcvts_s64_f32(float32_t); uint64_t vcvts_u64_f32(float32_t); int32_t vcvtd_s32_f64(float64_t); uint32_t vcvtd_u32_f64(float64_t); int64_t vcvtns_s64_f32(float32_t); uint64_t vcvtns_u64_f32(float32_t); int32_t vcvtnd_s32_f64(float64_t); uint32_t vcvtnd_u32_f64(float64_t); int64_t vcvtms_s64_f32(float32_t); uint64_t vcvtms_u64_f32(float32_t); int32_t vcvtmd_s32_f64(float64_t); uint32_t vcvtmd_u32_f64(float64_t); int64_t vcvtps_s64_f32(float32_t); uint64_t vcvtps_u64_f32(float32_t); int32_t vcvtpd_s32_f64(float64_t); uint32_t vcvtpd_u32_f64(float64_t); int64_t vcvtas_s64_f32(float32_t); uint64_t vcvtas_u64_f32(float32_t); int32_t vcvtad_s32_f64(float64_t); uint32_t vcvtad_u32_f64(float64_t); ```	2025-11-07 14:56:29 +00:00
Florian Hahn	ac047f2bd2	[InstCombine] Add test for sinking with dereferenceable assumes. Add tests showing sinking and dropping dereferenceable assumes prevents vectorization.	2025-11-07 14:41:17 +00:00
Paul Walker	050339b94a	[Clang] Fix comment typo in BuiltinTargetFeatures.h	2025-11-07 14:40:23 +00:00
Mehdi Amini	037fd30562	Revert "[NVGPU] Fix nvdsl examples" (#166943 ) Reverts llvm/llvm-project#156830 This broke the bots.	2025-11-07 15:36:44 +01:00
KaiWeng	d9c7c76269	Revert "Ignore trailing NullStmts in StmtExprs for GCC compatibility." (#166036 ) This reverts commit `b1e511bf5a`. https://github.com/llvm/llvm-project/issues/160243 Reverting because the GCC C front end is incorrect. --------- Co-authored-by: Jim Lin <jim@andestech.com>	2025-11-07 09:30:53 -05:00
Rolf Morel	d78e0ded52	[MLIR][Transform][Python] Sync derived classes and their wrappers (#166871 ) Updates the derived Op-classes for the main transform ops to have all the arguments, etc, from the auto-generated classes. Additionally updates and adds missing snake_case wrappers for the derived classes which shadow the snake_case wrappers of the auto-generated classes, which were hitherto exposed alongside the derived classes.	2025-11-07 14:04:53 +00:00
Florian Hahn	3ee2f07e17	[VPlan] Support multiple F(Max|Min)Num reductions. (#161735 ) Generalize handleMaxMinNumReductions to handle any number of F(Max|Min)Num reductions by collecting a vector of reductions to convert. We then add NaN checks for all of them, followed by adjusting the branch controlling the vector loop region, and updating the resume phis. Addresses a TODO from https://github.com/llvm/llvm-project/pull/148239 PR: https://github.com/llvm/llvm-project/pull/161735	2025-11-07 13:59:06 +00:00
lonely eagle	281e3844f6	[mlir] Use LDBG to replace LLVM_DEBUG in IntegerRelation.cpp (NFC) (#166772 )	2025-11-07 21:51:05 +08:00
nerix	311d115ed8	[LLDB] Run MSVC STL string(-view) tests with PDB (#166833 ) PDB doesn't include the typedefs for types, so all types use their full name. For `std::string` and friends, this means they show up as `std::basic_string<char, std::char_traits<char>, std::allocator<char>>`. This PR updates the `std::{,w,u8,u16,u32}string(_view)` tests to account for this and runs them with PDB.	2025-11-07 14:16:44 +01:00
Twice	7ac6a95a11	[MLIR][Pygments] Refine the pygments MLIR lexer (#166406 ) Recently, the MLIR website added API documentation for the Python bindings generated via Sphinx ([https://mlir.llvm.org/python-bindings/](https://mlir.llvm.org/python-bindings/)). In [https://github.com/llvm/mlir-www/pull/245](https://github.com/llvm/mlir-www/pull/245), I introduced the Pygments lexer from the MLIR repository to enable syntax highlighting for MLIR code blocks in these API docs. However, since the existing Pygments lexer was fairly minimal, it didn’t fully handle all aspects of the MLIR syntax, leading to imperfect highlighting in some cases. In this PR, I used ChatGPT to rewrite the lexer by combining it with the TextMate grammar for MLIR ([https://github.com/llvm/llvm-project/blob/main/mlir/utils/textmate/mlir.json](https://github.com/llvm/llvm-project/blob/main/mlir/utils/textmate/mlir.json)). After some manual adjustments, the results look good—so I’m submitting this to improve the syntax highlighting experience in the Python bindings API documentation.	2025-11-07 21:12:28 +08:00
hev	cdc3cb2054	[LoongArch] Add `isSafeToMove` hook to prevent unsafe instruction motion (#163725 ) This patch introduces a new virtual method `TargetInstrInfo::isSafeToMove()` to allow backends to control whether a machine instruction can be safely moved by optimization passes. The `BranchFolder` pass now respects this hook when hoisting common code. By default, all instructions are considered safe to move. For LoongArch, `isSafeToMove()` is overridden to prevent relocation-related instruction sequences (e.g. PC-relative addressing and calls) from being broken by instruction motion. Correspondingly, `isSchedulingBoundary()` is updated to reuse this logic for consistency. Fixes #163681	2025-11-07 21:01:53 +08:00
Simon Pilgrim	3719c438dc	[X86] Add some initial add i64 test coverage for #142308 (#166929 ) Pulled from the abandoned #144066 patch	2025-11-07 12:28:58 +00:00
Krzysztof Parzyszek	3c81587f6a	[OpenMP] Add definitions for DECLARE_INDUCTION and related clauses (#166235 ) Add definitions for DECLARE_INDUCTION, COLLECTOR, and INDUCTOR to OMP.td.	2025-11-07 06:13:55 -06:00
Roberto Turrado Camblor	c2fe1d94ee	[X86][Clang] VectorExprEvaluator::VisitCallExpr / InterpretBuiltin - add AVX512 KTEST/KORTEST intrinsics to be used in constexpr (#166103 ) Add AVX512 KTEST/KORTEST intrinsics to be used in constexpr. Fixes #162051	2025-11-07 11:26:47 +00:00
Karlo Basioli	d07a4fe12a	[bazel][mlir] Fix transform_xegpu_ext.py test for bazel (#166924 )	2025-11-07 12:18:43 +01:00
Giacomo Castiglioni	299df7ed25	[NVGPU] Fix nvdsl examples (#156830 ) This PR aims at fixing the nvdsl examples which got a bit out of sync not being tested in the CI. The fixed bugs were related to the following PRs: - move to nanobind #118583 - split gpu module initialization #135478	2025-11-07 16:23:08 +05:30
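
Commit 093f947202 above notes that the compare function passed to `llvm::sort()` must be a strict less-than. A minimal sketch of that requirement follows, using `std::sort` as a stand-in with the same contract. The `DBDir` struct and its fields are hypothetical illustrations; the real types and comparator in `getAllDBDirs()` are not shown in the commit message.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a database directory entry; not the real type
// used by UnifiedOnDiskCache.
struct DBDir {
  std::string Name;
  unsigned Order;
};

// Correct comparator: a strict "less than". An element never compares less
// than itself, which is what the sort algorithm relies on.
inline bool orderLess(const DBDir &A, const DBDir &B) {
  return A.Order < B.Order;
}

// Incorrect comparator for sorting: "<=" returns true for equal elements,
// so it is not a strict weak ordering and passing it to std::sort is
// undefined behaviour. Shown only for contrast; never passed to a sort here.
inline bool orderLessOrEqual(const DBDir &A, const DBDir &B) {
  return A.Order <= B.Order;
}

// Sorts a copy of the input using the strict comparator.
inline std::vector<DBDir> sortedByOrder(std::vector<DBDir> Dirs) {
  std::sort(Dirs.begin(), Dirs.end(), orderLess);
  return Dirs;
}
```

A quick way to check a comparator is to call it with the same element on both sides: a valid one must return false.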
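
The "Remove redundant declarations (NFC)" commits above (80a5332839, de4d953246, 563ea29932, bddab8359e) rest on the C++17 rule that static constexpr data members are implicitly inline. The sketch below illustrates that rule, assuming a C++17 or later compiler. The `Limits` struct and `maxEntriesRef()` are hypothetical names, not code from those patches.

```cpp
#include <cstddef>

struct Limits {
  // In C++17 this in-class declaration is also the definition, because a
  // static constexpr data member is implicitly inline.
  static constexpr std::size_t MaxEntries = 64;
};

// Before C++17, odr-using Limits::MaxEntries (for example by binding it to a
// reference) needed the out-of-line definition below. In C++17 it is a
// redundant redeclaration, which is what readability-redundant-declaration
// reports:
//
//   constexpr std::size_t Limits::MaxEntries;

// Binding a const reference odr-uses the member. With C++17 this links
// without any out-of-line definition.
inline const std::size_t &maxEntriesRef() { return Limits::MaxEntries; }
```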
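
The foldInsExtFNeg transform in commit 50ba89a22e above replaces a scalar extract/negate/insert with a whole-vector negate and a shuffle, now also when the extract and insert indices differ. The sketch below models both forms on plain 4-lane arrays to show why they agree. It is an illustration only, not LLVM code: `Vec4`, `scalarForm` and `vectorForm` are hypothetical names, and the inner shuffle with `SrcMask` is omitted on the assumption that both vectors have the same length.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;

// Scalar form:
//   insertelt Dest, (fneg (extractelt Src, ExtIdx)), InsIdx
inline Vec4 scalarForm(Vec4 Dest, const Vec4 &Src, std::size_t ExtIdx,
                       std::size_t InsIdx) {
  Dest[InsIdx] = -Src[ExtIdx];
  return Dest;
}

// Vector form: negate every lane of Src, then apply a two-input shuffle.
// Mask values 0..3 select a lane of Dest and 4..7 select a lane of the
// negated Src, following the shufflevector mask convention.
inline Vec4 vectorForm(const Vec4 &Dest, const Vec4 &Src, std::size_t ExtIdx,
                       std::size_t InsIdx) {
  Vec4 Neg{};
  for (std::size_t I = 0; I < 4; ++I)
    Neg[I] = -Src[I];

  // Identity mask for Dest, except the insert lane, which reads the
  // extracted lane of the negated source.
  std::array<std::size_t, 4> Mask{};
  for (std::size_t I = 0; I < 4; ++I)
    Mask[I] = I;
  Mask[InsIdx] = 4 + ExtIdx;

  Vec4 Result{};
  for (std::size_t I = 0; I < 4; ++I)
    Result[I] = Mask[I] < 4 ? Dest[Mask[I]] : Neg[Mask[I] - 4];
  return Result;
}
```

Because only one mask entry refers to the negated source, the two indices are free to differ, which is the case the patch adds.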

558642 Commits