intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-20 01:58:44 +08:00

Author	SHA1	Message	Date
Nico Weber	e05fffbbc5	Revert "[Clang] Add __builtin_common_reference (#121199 )" This reverts commit `3b9e203364`. Causes not-yet-understood semantic differences, see commits on #121199.	2025-12-02 19:37:16 -05:00
Aiden Grossman	c5e9289ba5	[llvm-exegesis] Make rvv/filter.test deterministic This should prevent the flaky failures that have been plaguing the buildbots since the test was introduced and allow for offline investigation without disrupting CI. Reviewers: topperc, mshockwave Reviewed By: mshockwave Pull Request: https://github.com/llvm/llvm-project/pull/170014	2025-12-02 16:36:29 -08:00
Zachary Fogg	325a08267d	[lldb] Fix Doxygen warning in SBTrace.h (#170394 ) Remove errant `\a` command before `<directory>` in `SaveToDisk` documentation. The `\a` Doxygen command expects a word argument, but `<directory>` starts with `<` which Doxygen interprets as HTML. This fixes: ``` llvm-project/lldb/include/lldb/API/SBTrace.h:60: Warning 564: Error parsing Doxygen command a: No word followed the command. Command ignored. ```	2025-12-02 16:36:01 -08:00
Max Desiatov	94c8940f44	lldbgdbremote.md: Update `qWasmLocal` result description (#170393 ) The current description mistakenly specified that an address of a local value in some address space is returned. When testing this with Wasm runtimes that already implement this command, it can be observed that the value itself is returned. The value itself may be an address for languages that use shadow stack in Wasm linear memory, but the value of an arbitrary local does not always contain that address.	2025-12-02 16:27:14 -08:00
Matt Arsenault	9fd288e886	clang/AMDGPU: Enable opencl 2.0 features for unknown target (#170308 ) Assume amdhsa triples support flat addressing, which matches the backend logic for the default target. This fixes the rocm device-libs build.	2025-12-02 19:11:30 -05:00
Stanislav Mekhanoshin	9dd3346589	[AMDGPU] Prevent folding of flat_scr_base_hi into a 64-bit SALU (#170373 ) Fixes: SWDEV-563886	2025-12-02 16:08:00 -08:00
Farzon Lotfi	dd1b4abfb7	[HLSL][Matrix] Add support for Matrix element and trunc Casts (#168915 ) fixes #168737 fixes #168755 This change adds support for Matrix truncations via the ICK_HLSL_Matrix_Truncation enum. That ends up being most of the files changed. It also allows Matrix as an HLSL Elementwise cast as long as the cast does not perform a shape transformation, i.e. 3x2 to 2x3. Tests for the new elementwise and truncation behavior were added, as well as sema tests to make sure we error on the shape transformation cast. I am punting right now on the ConstExpr Matrix support. That will need to be addressed later. Will file a separate issue for that if reviewers agree it can wait.	2025-12-02 19:02:25 -05:00
David Stone	45918f50aa	[llvm][NFC] In `SetVector`, `contains` and `count` now automatically accept `const T *` arguments when the key is `T *` (#170377 ) Also use `is_contained` to implement `contains`, since this tries the `contains` member function of the set type first.	2025-12-02 17:02:14 -07:00
David Stone	6c32535b20	[clang][NFC] Remove unused CFGStmtMap.h includes (#170383 )	2025-12-02 17:02:00 -07:00
Mircea Trofin	e9c127428c	[LTT] mark the CFI jumptable naked on Windows (#170371 ) We were not marking the `.cfi.jumptable` functions as `naked` on Windows. The referenced bug (https://llvm.org/bugs/show_bug.cgi?id=28641#c3) appears to be fixed: ```bash build/bin/opt -S -passes=lowertypetests -mtriple=i686-pc-win32 llvm/test/Transforms/LowerTypeTests/function.ll | build/bin/llc -O0 ``` ``` L_.cfi.jumptable: # @.cfi.jumptable # %bb.0: # %entry #APP jmp _f.cfi@PLT int3 int3 int3 #NO_APP #APP jmp _g.cfi@PLT int3 int3 int3 #NO_APP # -- End function .section .rdata,"dr" .p2align 4, 0x0 # @0 ``` Not seeing the spilled registers described in the bug anymore.	2025-12-02 15:47:35 -08:00
Thibault Monnier	6bdb838a05	[CIR] Upstream vec shuffle builtins in CIR codegen (#169178 ) This PR is part of #167752. It upstreams the codegen and tests for the shuffle builtins implemented in the incubator, including: - `vinsert` + `insert` - `pblend` + `blend` - `vpermilp` - `pshuf` + `shufp` - `palignr` It does NOT upstream the `perm`, `vperm2`, `vpshuf`, `shuf_i` / `shuf_f` and `align` builtins, which are not yet implemented in the incubator. This _is_ a large commit, but most of it is tests. The `pshufd` / `vpermilp` builtins seem to have no test coverage in the incubator, what should I do?	2025-12-02 15:29:12 -08:00
Drew Kersnar	9c78bc5de4	Revert "[LSV] Merge contiguous chains across scalar types" (#170381 ) Reverts llvm/llvm-project#154069. I pointed out a number of issues post-merge, most importantly examples of miscompiles: https://github.com/llvm/llvm-project/pull/154069#issuecomment-3603854626. While the motivation of the change is clear, I think the implementation approach is flawed. It seems like the goal is to allow elements like `load <2xi16>` and `load i32` to be vectorized together despite the current algorithm not grouping them into the same equivalence classes. I personally think that if we want to attempt this it should be a more holistic approach, maybe even redefining the concept of an equivalence class. This current solution seems like it would be really hard to do bug-free, and even if the bugs were not present, it is only able to merge chains that happen to be adjacent to each other after `splitChainByContiguity`, which seems like it is leaving things up to chance whether this optimization kicks in. But we can discuss more in the re-land. Maybe the broader approach I'm proposing is too difficult, and a narrow optimization is worthwhile. Regardless, this should be reverted; it needs more iteration before it is correct.	2025-12-02 18:27:58 -05:00
Hendrik Hübner	e5f1d025aa	[CIR] Lower calls to trivial copy constructor to cir::CopyOp (#168281 ) This PR is a follow up to #167975 and replaces calls to trivial copy constructors with `cir::CopyOp`. --------- Co-authored-by: Andy Kaylor <akaylor@nvidia.com> Co-authored-by: Henrich Lauko <henrich.lau@gmail.com>	2025-12-02 15:22:46 -08:00
Shilei Tian	dbb702fbcb	[NFC][AMDGPU] Remove trailing white spaces in `AMDGPU.td`	2025-12-02 18:17:09 -05:00
Björn Pettersson	0f235c346c	[LowerConstantIntrinsics] Improve tests related to llvm.objectsize. NFC (#132364 ) Adding some new test cases (including FIXME:s) to highlight some bugs related to lowering of llvm.objectsize. One special case is when there are getelementptr instructions with index types that are larger than the index type size for the pointer being analysed. This will add a couple of tests to show what happens both when using a smaller and larger index type, and when having out-of-bounds indices (both too large and negative).	2025-12-02 23:12:42 +00:00
Petar Avramovic	aeea056f60	AMDGPU/GlobalISel: Report RegBankLegalize errors using reportGISelFailure (#169918 ) Use standard GlobalISel error reporting with reportGISelFailure and pass returning false instead of llvm_unreachable. Also enables -global-isel-abort=0 or 2 for -global-isel -new-reg-bank-select. Note: new-reg-bank-select with abort 0 or 2 runs LCSSA, while "intended use" without abort or with abort 1 does not run LCSSA.	2025-12-02 23:49:21 +01:00
Alex Duran	ec6091f4de	[OFFLOAD][LIBOMPTARGET] Start to update debug messages in libomptarget (#170265 ) * Add compatibility support for DP and REPORT macros * Define a set of predefined Debug Type for libomptarget * Start to update libomptarget files (OffloadRTL.cpp, device.cpp)	2025-12-02 23:45:23 +01:00
Valentin Clement (バレンタインクレメン)	9885aed474	[flang][cuda] Add address cast for src and dst in TMA operations (#170375 ) The src and dst pointers need to have an address cast.	2025-12-02 22:31:55 +00:00
Helena Kotas	434127b0c1	[HLSL] Static resources (#166880 ) This change fixes a couple of issues with static resources: - Enables assignment to static resource or resource array variables (fixes #166458) - Initializes static resources and resource arrays with a default constructor that sets the handle to poison	2025-12-02 22:25:17 +00:00
John Harrison	fff45ddcc0	[lldb-dap] Follow the spec more closely on 'initialize' arguments. (#170350 ) Updates `InitializeRequestArguments` to correctly follow the spec, see https://microsoft.github.io/debug-adapter-protocol/specification#Requests_Initialize. This should correct which fields are tracked as optional and simplifies some of the types to make sure they're meaningful (e.g. an `optional<bool>` isn't any more helpful than a `bool` since undefined and false are basically equivalent and it requires us to handle interpreting undefined as the default value in all the places we use the `optional<bool>`).	2025-12-02 14:19:05 -08:00
Florian Hahn	41519b390f	[SCEV] Add UDiv canonicalization tests with nested AddRecs. Add more tests for follow-up to https://github.com/llvm/llvm-project/pull/169576.	2025-12-02 22:18:16 +00:00
Valentin Clement (バレンタインクレメン)	d3256d935d	[flang][cuda] Add alignment to shared memory operation (#170372 ) Shared memory for TMA operations needs to be aligned to 16. Add the ability to set an alignment on the cuf.shared_memory operation.	2025-12-02 22:13:19 +00:00
Florian Hahn	bd5fa63335	[VPlan] Remove duplicated computeCost call (NFC). Remove a redundant duplicated computeCost call. NFC, just skipping an unneeded call.	2025-12-02 21:59:40 +00:00
Erich Keane	4006df9b32	[OpenACC][CIR] Implement 'nohost' lowering. (#170369 ) This clause is pretty small/trivial and is a simple 'set a bool' value on the IR node, so its implementation is quite simple. We create the Operation with this as 'false', so the 'nohost' marks it as true always.	2025-12-02 21:56:42 +00:00
Florian Hahn	f0e1254bce	[LV] Use forced cost once for whole interleave group in legacy costmodel (#168270 ) The VPlan-based cost model assigns the forced cost once for a whole VPInterleaveRecipe. Update the legacy cost model to match this behavior. This fixes a cost-model divergence, and assigns the cost in a way that matches the generated code more accurately. PR: https://github.com/llvm/llvm-project/pull/168270	2025-12-02 21:39:54 +00:00
Jason Macnak	139ebfa63d	[Bazel] Fix `--warn-backrefs` errors in `Analysis` target (#170357 ) Commit `b262785` introduced a separate `AnalysisFpExc` target to try to workaround the lack of a bazel equivalent of single source file properties. However, this introduces backref errors when `--warn-backrefs` is enabled. This change alternatively just adds the `-ftrapping-math` copt to the entire `Analysis` target. Fix suggested by @rocallahan.	2025-12-02 15:32:52 -06:00
asmok-g	d97746c56b	[libc++] Fix the rest of __gnu_cxx::hash_XXX copy construction (#160525 ) Co-authored-by: Alexander Kornienko <alexfh@google.com> Co-authored-by: Louis Dionne <ldionne.2@gmail.com>	2025-12-02 22:18:50 +01:00
Andy Kaylor	12ae72744c	[CIR] Upstream support for builtin_constant_p (#170354 ) This upstreams the handler for the BI__builtin_constant_p function.	2025-12-02 21:15:17 +00:00
Kyungtak Woo	c77fe5845e	[bazel] update bazel build for PluginScriptedProcess (#170364 ) Adding the following dependencies to PluginScriptedProcess: - "//lldb:CoreHeaders", - "//lldb:SymbolHeaders", - "//llvm:Support", For `c50802cbee`	2025-12-02 15:04:07 -06:00
Erich Keane	c910d821dc	[OpenACC][CIR] Add worker/vector clause lowering for Routine (#170358 ) These two are both incredibly similar and simple, basically identical to 'seq'. This patch adds them both together.	2025-12-02 12:58:11 -08:00
Yaxun (Sam) Liu	0bb987f409	Revert "[CUDA][HIP] Fix CTAD for host/device constructors (#168711 )" This reverts commit `e719e93d41`. Reverting since it caused a regression in our internal CI. Deduction guides with host/device attrs have already been used in https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocrand/library/src/rng/utils/cpp_utils.hpp#L249 ``` template<class V> __host__ __device__ vec_wrapper(V) -> vec_wrapper<V>; ```	2025-12-02 15:42:22 -05:00
Andy Kaylor	ca3de05eca	[CIR][NFC] Fix a release build warning (#170359 ) This moves a call inside an assert to avoid a warning about the result variable being unused in release builds.	2025-12-02 20:29:04 +00:00
Philip Reames	49a9787128	[SCEV] Regenerate a subset of auto updated tests Reducing spurious diff in an upcoming change.	2025-12-02 12:16:53 -08:00
Razvan Lupusoru	b50a590984	[acc][flang] Add genLoad and genStore to PointerLikeType (#170348 ) This patch extends the OpenACC PointerLikeType interface with two new methods for generating load and store operations, enabling dialect-agnostic memory access patterns. New Interface Methods: - genLoad(builder, loc, srcPtr, valueType): Generates a load operation from a pointer-like value. Returns the loaded value. - genStore(builder, loc, valueToStore, destPtr): Generates a store operation to a pointer-like value. Implementations provided for FIR pointer-like types, memref type (rank-0 only), and LLVM pointer types. Extended TestPointerLikeTypeInterface.cpp with 'load' and 'store' test modes.	2025-12-02 12:09:32 -08:00
Erich Keane	6dd639ec9e	[CIR][OpenACC] Implement 'routine' lowering + seq clause (#170207 ) The 'routine' construct just adds an acc.routine element to the global module, which contains all of the information about the directive. It contains a reference to the function, which also contains a reference to the acc.routine, which this generates. This handles both the implicit-func version (where the routine is spelled without parens, and just applies to the next function) and the explicit-func version (where the routine is spelled with the func name in parens). The AST stores the directive in an OpenACCRoutineDeclAttr in the implicit case, so we can emit that when we hit the function declaration. The explicit case is held in an OpenACCRoutineAnnotAttr on the function, however, when we emit the function we haven't necessarily seen the construct yet, so we can't depend on that attribute. Instead, we save up the list in Sema so that we can emit them all at the end. This results in the tests getting really hard to read (because ordering is a little awkward based on spelling, with no way to fix it), so we instead split the tests up based on topic. One last thing: Flang spends some time determining if the clause lists of two routines on the same function are identical, and omits the duplicates. However, it seems to do a poor job on this when the ordering isn't the same, or references are slightly different. This patch doesn't bother trying that, and instead emits all, trusting the ACC dialect to remove duplicates/handle duplicates gracefully. Note: This doesn't cause emission of functions that would otherwise not be emitted, but DOES emit routine references based on which function they are attached to.	2025-12-02 11:55:14 -08:00
David Peixotto	fae64adaa6	[lldb] Handle deref of register and implicit locations (#169419 ) This commit modifies the dwarf expression evaluator in how we handle the deref operation for register and implicit locations on the stack. For a typical memory location a deref operation will read the value from memory. For register and implicit locations the deref operation will read the value from the register or its implicit location. In lldb we eagerly read register and implicit values and push them on the stack so the deref operation for these becomes a "no-op" that leaves the value on the stack and updates the tracked location kind. The motivation for this change is to handle `DW_OP_deref` operations on location descriptions as described by the heterogeneous debugging [extensions](https://rocm.docs.amd.com/projects/llvm-project/en/latest/LLVM/llvm/html/AMDGPUDwarfExtensionsForHeterogeneousDebugging.html#a-2-5-4-4-4-register-location-description-operations). Specifically, for register locations it states > These operations obtain a register location. To fetch the contents of > a register, it is necessary to use DW_OP_regval_type, use one of the > DW_OP_breg register-based addressing operations, or use DW_OP_deref* on > a register location description. My understanding is that this is the intended behavior from dwarf5 as well and is not a change in behavior.	2025-12-02 11:13:48 -08:00
Krzysztof Drewniak	3f2e3e67c1	[mlir][AMDGPU][NFC] Fix overlapping masked load refinements (#159805 ) The two patterns for handling vector.maskedload on AMD GPUs had an overlap - both the "scalar mask becomes an if statement" pattern and the "masked loads become a normal load + a select on buffers" patterns could handle a load with a broadcast mask on a fat buffer resource. This commit adds checks to resolve the overlap.	2025-12-02 11:02:45 -08:00
Med Ismail Bennani	c50802cbee	Reland "[lldb] Introduce ScriptedFrameProvider for real threads (#161870 )" (#170236 ) This patch re-lands #161870 with fixes to the previous test failures. rdar://161834688 Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>	2025-12-02 18:59:40 +00:00
David Green	879dddf2b4	[AArch64] Add tests for umulh. NFC	2025-12-02 18:58:32 +00:00
LLVM GN Syncbot	6e262aa8ba	[gn build] Port `41a53c0a23`	2025-12-02 18:51:35 +00:00
Erick Ochoa Lopez	73979c1df9	[mlir][amdgpu] Lower amdgpu.make_dma_base (#169817 ) * Adds lowering for `amdgpu.make_dma_base`	2025-12-02 13:48:31 -05:00
Changpeng Fang	697b1be09c	[AMDGPU][NFC] Put gfx125x common features into 12_50_Common (#170338 )	2025-12-02 10:47:00 -08:00
Robert Imschweiler	5c3c0020af	[NFC] Refactor TargetLowering::getTgtMemIntrinsic to take CallBase parameter (#170334 ) cf. https://github.com/llvm/llvm-project/pull/133907#discussion_r2578576548	2025-12-02 19:42:31 +01:00
hjagasiaAMD	2183846a15	[AMDGPU] Fix AGPR_32 reg assign for mfma scale ops (#168964 ) In MFMA rewrite pass, prevent AGPR_32 reg class assignment for scale operands, not permitted by instruction format. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-12-02 13:41:16 -05:00
Med Ismail Bennani	41a53c0a23	[lldb/Target] Add BorrowedStackFrame and make StackFrame methods virtual (#170191 ) This change makes StackFrame methods virtual to enable subclass overrides and introduces BorrowedStackFrame, a wrapper that presents an existing StackFrame with a different frame index. This enables creating synthetic frame views or renumbering frames without copying the underlying frame data, which is useful for frame manipulation scenarios. This also adds a new borrowed-info format entity to show what was the original frame index of the borrowed frame. Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>	2025-12-02 10:41:03 -08:00
Nick Sarnie	1a3709cc7e	[SPIRV] Error for zero-length arrays if not a shader (#169732 ) I had a case where the frontend was generating a zero elem array in non-shader code so it was just crashing in a release build. Add a real error and make it not crash. --------- Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>	2025-12-02 18:24:59 +00:00
Jasmine Tang	e0db7f347c	[WebAssembly] Optimize away mask of 63 for sra and srl( zext (and i32 63))) (#170128 ) Follow up to #71844 after shl implementation	2025-12-02 18:23:17 +00:00
Shubham Sandeep Rastogi	23a22d0497	[SROA] Unify the names of new instructions created in SROA. (#167917 ) In Debug builds, the names of adjusted pointers have a pointer-specific name prefix which doesn't exist in non-debug builds. This causes differences in output when looking at the output of SROA with a Debug or Release compiler. For most of our ongoing testing, we use essentially Release+Asserts build (basically release but without NDEBUG defined), however we ship a Release compiler. Therefore we want to say with reasonable confidence that building a large project with Release vs a Release+Asserts build gives us the same output when the same compiler version is used. This difference however, makes it difficult to prove that the output is the same if the only difference is the name when using LTO builds and looking at bitcode. Hence this change is being proposed.	2025-12-02 10:12:20 -08:00
serge-sans-paille	4587fe6be8	[lld] Fix typo in lld manpage, nfc (#170299 )	2025-12-02 18:11:27 +00:00
Matt Arsenault	2c38632639	LTO: Remove unused TargetLibraryInfo include (#170340 )	2025-12-02 18:10:48 +00:00

561300 Commits