intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-13 11:02:04 +08:00

Author	SHA1	Message	Date
Michael Buch	8889377f5c	[lldb][docs] DW_AT_APPLE_major_runtime_version -> DW_AT_APPLE_major_runtime_vers (#162062 ) `DW_AT_APPLE_major_runtime_version` doesn't exist. This should be `DW_AT_APPLE_major_runtime_vers`.	2025-10-06 18:23:07 +01:00
Maksim Levental	84a214856a	[MLIR][Python] use `FetchContent_Declare` for nanobind and remove pybind (#161230 ) Inspired by this comment https://github.com/llvm/llvm-project/pull/157930#issuecomment-3346634290 (and long-standing issues related to finding nanobind/pybind in the right place), this PR moves to using `FetchContent_Declare` to get the nanobind dependency. This is pretty standard (see e.g., [IREE](`cf60359b74/CMakeLists.txt (L842-L848)`)). This PR also removes pybind which has been deprecated for almost a year (https://github.com/llvm/llvm-project/pull/117922) and which isn't compatible (for whatever reason) with `FetchContent_Declare`. --------- Co-authored-by: Jacques Pienaar <jpienaar@google.com>	2025-10-06 17:17:04 +00:00
Naveen Seth Hanig	2f3bb76781	[clang][DependencyScanning] Reset options generated for named module compilations. (#161486 ) The driver-generated -cc1 command-lines for C++ named module inputs introduce some command-line options which affect the canonical module build command (and therefore the context hash). This resets those options.	2025-10-06 19:11:52 +02:00
Steven Wu	2aff3c6a6d	Re-land #161264 : [CAS] Add OnDiskDataAllocator (#162112 ) Fix the build configuration that has OnDiskCAS disabled.	2025-10-06 17:07:13 +00:00
Craig Topper	141964b392	[RISCV][GISel] Force atomic G_LOAD/STORE to the GPR register bank. (#162042 ) We don't have FPR isel patterns for G_LOAD/STORE so force to the GPR register bank.	2025-10-06 10:02:33 -07:00
Simon Pilgrim	ec0db6619f	[X86] avx512fp16intrin.h - allow _mm512_cvtsh_h to be used in constexpr (#162114 ) This was missed in the earlier f16c/fp16 constexpr patches	2025-10-06 16:53:31 +00:00
Fei Peng	208231d197	[compiler-rt][TSan] Add support for Android (#147580 ) 1. Fixed Android setjmp issue. The root cause is that TSan initializes before longjmp_xor_key is set up. During __libc_init_vdso, a call to strcmp triggers TSan initialization, which occurs before __libc_init_setjmp_cookie. The solution is to call InitializeLongjmpXorKey on the first use of longjmp_xor_key. Additionally, correct LONG_JMP_SP_ENV_SLOT by following the bionic source code. 2. Skip thr object range check on Android. On Android, thr is allocated on the heap, causing the check to fail. 3. Disable intercepting clone on Android. pthread_create internally calls clone. Disabling the interception of clone resolves the issue in most scenarios. 4. Use a workaround to recover the thr pointer stored in TLS_SLOT_SANITIZER slot, whose value was modified by Skia. This PR solved the issue from NDK https://github.com/android/ndk/issues/1041. Test project: https://github.com/bytedance/android_tsan_sample/	2025-10-06 12:51:26 -04:00
Shawn K	d7feeda437	[Clang] VectorExprEvaluator::VisitCallExpr / InterpretBuiltin - add AVX512 VPTERNLOGD/VPTERNLOGQ intrinsics to be used in constexpr (#158703 ) Fix #157698 Add handling for `__builtin_ia32_pternlog[d/q][128/256/512]_mask[z]` intrinsics to `VectorExprEvaluator::VisitCallExpr` and `InterpBuiltin.cpp` with the corresponding test coverage: ``` _mm_mask_ternarylogic_epi32 _mm_maskz_ternarylogic_epi32 _mm_ternarylogic_epi32 _mm256_mask_ternarylogic_epi32 _mm256_maskz_ternarylogic_epi32 _mm256_ternarylogic_epi32 _mm512_mask_ternarylogic_epi32 _mm512_maskz_ternarylogic_epi32 _mm512_ternarylogic_epi32 _mm_mask_ternarylogic_epi64 _mm_maskz_ternarylogic_epi64 _mm_ternarylogic_epi64 _mm256_mask_ternarylogic_epi64 _mm256_maskz_ternarylogic_epi64 _mm256_ternarylogic_epi64 _mm512_mask_ternarylogic_epi64 _mm512_maskz_ternarylogic_epi64 _mm512_ternarylogic_epi64 ``` --------- Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>	2025-10-06 16:44:37 +00:00
Steven Wu	f23c0e6f55	[CAS] Fix #161548 for broken build (#162116 ) Followup to fix the broken configuration.	2025-10-06 09:42:12 -07:00
Folkert de Vries	3f3d522ba7	[PowerPC] recognize `vmnsub` in older ppc versions (#155465 ) fixes https://github.com/llvm/llvm-project/issues/129432 Recognize expansion sequence of negate where it isn't legal in order to select multiply-subtract.	2025-10-06 12:40:57 -04:00
Steven Wu	c52de9ab48	[CAS] Rename OnDiskTrieRawHashMap::pointer -> OnDiskPtr. NFC (#161548 ) Rename the ondisk pointer type in OnDiskTrieRawHashMap to match OnDiskDataAllocator. NFC.	2025-10-06 16:35:20 +00:00
Matt Arsenault	134407d5f9	AMDGPU: Add gfx1250 to sram-ecc elf header flags test (#162107 )	2025-10-06 16:22:46 +00:00
Nico Weber	ebfb16a285	Revert "[CAS] Add OnDiskDataAllocator (#161264 )" This reverts commit `08e9540575`. Doesn't build on some bots, see comments on https://github.com/llvm/llvm-project/pull/161264	2025-10-06 12:21:43 -04:00
Juan Manuel Martinez Caamaño	27c207ef4c	[NFC][SPIRV] GetElementPtrInst does not need a call to isInstructionTriviallyDead after replaceUsesofWith (#162045 ) A getelementptr is always removable after replacing all its uses, since it doesn't have side effects and always returns.	2025-10-06 18:13:31 +02:00
Steven Wu	08e9540575	[CAS] Add OnDiskDataAllocator (#161264 ) Add OnDiskDataAllocator, which is the data pool implementation inside an OnDiskCAS that stores data in a single file. It is based on MappedFileRegionArena and wrapped inside a CAS database file.	2025-10-06 09:02:55 -07:00
Andy Kaylor	44b2673544	[CIR] Implement initial LoweringPrepare support for global ctors (#161452 ) This adds the initial support for lowering the 'ctor' region of cir.global operations to an init function which is called from a TU-specific static initialization function. This does not yet add an attribute to hold a list of global initializers. That will be added in a future change.	2025-10-06 08:57:29 -07:00
Shilei Tian	9e8dda1034	[NFC] Change spelling of cluster feature to "clusters" (#162103 )	2025-10-06 15:55:39 +00:00
Lei Huang	e9f3be63d3	[NFC][PowerPC] Cleanup isImm and getImmEncoding functions (#161567 ) Refactor and replace explicit Imm `getImmEncoding() | isUImm() | isS*Imm()` functions with a generic one that takes a template. This is in prep for followup batch to implement `paddis` which takes a pcrel Imm == 32bits. Doing this refactor so we don't have to copy and paste the same set of functions again with only the bit length changes.	2025-10-06 11:46:15 -04:00
Matt Arsenault	1cc9a8c127	AMDGPU: Stop using the wavemask register class for SCC cross class copies (#161801 ) SCC should be copied to a 32-bit SGPR. Using a wave mask doesn't make sense.	2025-10-07 00:44:45 +09:00
Amr Hesham	6620e53511	[CIR][NFC] Update Complex CXX new test to use regex (#162024 ) Update Complex CXX new test to use regex for variable names	2025-10-06 17:41:22 +02:00
Ramkumar Ramachandra	a663119455	[LV] Fix verifier failures due to `93073af` (#162097 ) Follow up on `93073af` ([LV] Move 3 functions into VPlanTransforms (NFC)) to not call runPass on the moved functions, as that results in verifier failures. Ref: https://lab.llvm.org/buildbot/#/builders/187/builds/12178	2025-10-06 16:36:50 +01:00
Sarah Spall	35c57a778b	[HLSL] Add support for elementwise and aggregate splat casting struct types with bitfields (#161263 ) Adds support for elementwise and aggregate splat casting struct types with bitfields. Replacing existing Flattening function which used to produce a list of GEPs representing a flattened object with one that produces a list of LValues representing a flattened object. The LValues can be used by EmitStoreThroughLValue and EmitLoadOfLValue, ensuring bitfields are properly loaded and stored. This also simplifies the code in the elementwise and aggregate splat casting functions. Closes #125986	2025-10-06 08:26:23 -07:00
Hristo Hristov	45c41247f8	[libc++][ranges] P3060R3: Add `std::views::indices(n)` (#146823 ) Implements [P3060R3](https://wg21.link/P3060R3) Closes #148175 # References - https://github.com/cplusplus/draft/issues/7966 - https://github.com/cplusplus/draft/pull/8006 - https://wg21.link/customization.point.object - https://wg21.link/range.iota.overview - https://wg21.link/ranges.syn --------- Co-authored-by: Hristo Hristov <zingam@outlook.com> Co-authored-by: A. Jiang <de34@live.cn>	2025-10-06 18:13:25 +03:00
Cameron McInally	23e35bd43c	[Flang][Tests] Add GPL notice to GFortran test suite documentation. (#161912 ) Add a GPL notice to the GFortran test suite documentation and redirect to the LICENSE file distributed with the test suite. Co-authored-by: Cameron McInally <cmcinally@nvidia.com>	2025-10-06 10:59:17 -04:00
Rahul Joshi	95215a3f0d	[NFC][MLIR][TableGen] Change `emitSummaryAndDescComments` to write to os directly (#162014 ) Change `emitSummaryAndDescComments` to directly write to the output stream, avoiding creating large intermediate strings.	2025-10-06 07:59:04 -07:00
Rahul Joshi	2b153a4c93	[NFC][MLIR][TableGen] Use ArrayRef instead of const vector reference (#162016 )	2025-10-06 07:58:11 -07:00
Oleksandr "Alex" Zinenko	6b5fecf93b	[mlir] transform dialect: don't crash in verifiers (#161098 ) Fix crashes in the verifier of `transform.with_named_sequence` attribute attached to a symbol table operation caused by it constructing a call graph inside the symbol table. The call graph construction assumes calls and callables, such as functions or named sequences, have been verified, but it is not yet the case when the attribute verifier on the (parent) symbol table operation runs. Trigger such verification manually before constructing the call graph. This adds redundancy in verification, but there is currently no mechanism to change the order of verification. In performance-critical scenarios, verification can be disabled altogether. Remove unnecessary verification from `transform::IncludeOp::getEffects`. It was introduced along with the op definition as the op used to inspect the body of callee, which assumed the body existed, to identify handle consumption behavior. This later evolved into having explicit argument attributes on the callee, which handles the absence of such attributes gracefully without the need for verification, but the verification was never removed. It would have been causing infinite recursion if kept in place. Fixes #159646. Fixes #159734. Fixes #159736.	2025-10-06 16:58:00 +02:00
Usha Gupta	47d74ca157	[FuncAttrs][LTO] Relax norecurse attribute inference during postlink LTO (#158608 ) This PR, which supersedes https://github.com/llvm/llvm-project/pull/139943, extends the scenarios where the 'norecurse' attribute can be inferred. Currently, the 'norecurse' attribute is only inferred if all called functions also have this attribute. This change introduces a new pass in the LTO pipeline, run after Whole Program Devirtualization, to broaden the inference criteria. The new pass inspects all functions in the module and sets a flag if any functions are external or have their addresses taken (while ignoring those already marked norecurse). This flag is then used with the existing conditions to enable inference in more cases. This enhancement allows 'norecurse' to be applied in situations where a function calls a recursive function, but is not part of the same recursion chain. For example, foo can now be marked 'norecurse' in the following scenarios: `foo -> callee1 -> callee2 -> callee2` In this case, foo and callee1 can both be marked 'norecurse' because they're not part of the callee2 recursion. Similarly, foo can be marked 'norecurse' here: `foo -> callee1 -> callee2 -> callee1` Here, foo is not part of the callee1 -> callee2 -> callee1 recursion chain, so it can be marked 'norecurse'.	2025-10-06 15:57:27 +01:00
Matt Arsenault	919470311f	clang/AMDGPU: Report some missing OpenCL 2.0 feature macros (#160826 ) Report __opencl_c_program_scope_global_variables and __opencl_c_device_enqueue as supported. These 2.0 features are supported but were missing from the extension map. __opencl_c_atomic_scope_all_devices should also be reported, but for some reason that does not seem to work just by adding it to the map. The existing test for these macros was also broken, since it was missing CL3.0 run lines, so add those.	2025-10-06 23:57:19 +09:00
David Spickett	4efe170d85	[llvm-exegesis] Disable load store aliasing test. Test added by #159366. This is causing objdump to crash more often than not on our 2 stage SVE bots, so I am disabling it and will investigate tomorrow. Could be the changes in the PR, or a pre-existing codegen or llvm-objdump problem.	2025-10-06 14:52:32 +00:00
Matt Arsenault	48db3fd702	AMDGPU: Stop handling AGPR case in getCrossCopyRegClass (#161800 ) This isn't what this is for. In the sense this hook is concerned with, you can copy between AGPRs. This only changes some DAG scheduling decisions; later passes are responsible for dealing with the bad agpr-agpr handling.	2025-10-06 23:34:39 +09:00
Sander de Smalen	f3a952311c	[AArch64] Return Invalid partial reduction cost for i128 accumulator. (#162066 ) PR #158641 introduced an issue where i128 accumulator types resulted in a valid cost, because for a <2 x i128> type the code that checks for unsupported type legalization would see a type action of 'TypeSplitVector' which is supported, even though the legalised type of <1 x i128> would require further scalarization. This fixes https://github.com/llvm/llvm-project/issues/162009	2025-10-06 15:32:13 +01:00
Nikita Popov	f31bc666f4	[IR] Handle addrspacecast in findBaseObject() (#162076 ) Make findBaseObject() look through addrspacecast, so that getAliaseeObject() works with an aliasee that uses an addrspacecast. This fixes a crash during module summary index emission. Fixes https://github.com/llvm/llvm-project/issues/161646.	2025-10-06 16:18:12 +02:00
Matt Arsenault	c6a4e84a10	AMDGPU: Remove unnecessary reference (#162085 )	2025-10-06 13:54:41 +00:00
Erich Keane	542cba8930	[OpenACC][CIR] Handle firstprivate bounds recipe lowering (#161873 ) These work the same as the other two (private and reduction) except that the expression for the 'init' is a copy instead of a default/value init, and in a separate region. This patch gets all of that correct, and ensures we generate these as expected. There is a little extra work to make sure that the bounds-loop generation does 2 separate array index operations, otherwise this is very much like the reduction implementation.	2025-10-06 06:46:21 -07:00
Jakub Kuderski	8bab6c4e8c	[mlir] Simplify unreachable type switch cases. NFC. (#162032 ) Use `DefaultUnreachable` from https://github.com/llvm/llvm-project/pull/161970.	2025-10-06 09:23:25 -04:00
marius doerner	5296d01738	[clang][bytecode] Assert on virtual func call from array elem (#158502 ) Fixes #152893. An assert was raised when a constexpr virtual function was called from a constexpr array element with -fexperimental-new-constant-interpreter set.	2025-10-06 15:08:38 +02:00
Luke Hutton	fee71a3474	[mlir][tosa] Apply 'Symbol' trait to `tosa.variable` (#153223 ) Implement SymbolOpInterface on tosa.variable so that its declaration is automatically inserted into its parent's SymbolTable. Verifiers for tosa.variable_read/write can now look up the symbol and guarantee it exists, and duplicate names are caught at creation time. Previously this was completed by walking the graph which could be inefficient. Unfortunately, the Symbol trait expects to find a symbol name via a hard-coded attribute name "sym_name". Therefore, "name" is renamed to "sym_name" and a getName() wrapper is provided for backwards compatibility. This change also restricts tosa.variable declarations to ops that carry a SymbolTable (e.g. modules), rather than allowing them to be placed inside a func.func. Note: EXT-VARIABLE is an experimental extension in the TOSA specification, so is not subject to backwards compatibility guarantees.	2025-10-06 13:14:34 +01:00
Jakub Kuderski	5e07093917	[mlir][spirv] Simplify unreachable default cases in type switch. NFC. (#162010 ) Use `DefaultUnreachable` from https://github.com/llvm/llvm-project/pull/161970.	2025-10-06 07:37:42 -04:00
Lang Hames	f8baf07c7c	[orc-rt] Clean up SPSWrapperFunction unittest names. Drop the redundant 'Test' prefix and rename transparent serialization tests to clarify their purpose.	2025-10-06 22:31:26 +11:00
Lang Hames	9a111ff91c	[orc-rt] Enable transparent SPS conversion for ptrs via ExecutorAddr. (#162069 ) Allows SPS wrapper function calls and handles to use pointer arguments. These will be converted to ExecutorAddr for serialization / deserialization.	2025-10-06 22:30:57 +11:00
Martin Storsjö	7f43b80d85	[libcxx] [ci] Stop manually installing ninja in the Windows build jobs (#161907 ) Ninja is officially included among the preinstalled tools on the Windows runners now. This should reduce the risk of stray failures here; sometimes, attempting to install Ninja through Chocolatey has caused spurious failures.	2025-10-06 14:28:10 +03:00
Jan Patrick Lehr	1c5186c315	[OpenMP][omptest] Enable missing callback (#161650 ) The registration of this callback handler was disabled for some reason. Local testing did not bring up any issues when I enabled it. Side effect is: Silences current warning about unused function.	2025-10-06 13:21:10 +02:00
Alexey Bataev	5d7f324614	[SLP]Enable Shl as a base opcode in copyables (#156766 ) Enables Shl matching for the nodes, where copyable can be modelled as shl %v, 0	2025-10-06 07:07:37 -04:00
Steven Perron	5547c0cff3	[SPIRV] Implement LLVM IR and backend for typed buffer counters (#161425 ) This commit implements the backend portion of the typed buffer counter proposal described in https://github.com/llvm/wg-hlsl/blob/main/proposals/0023-typed-buffer-counters.md. This is the second part of the implementation, focusing on the LLVM IR and SPIR-V backend. Specifically, this commit implements the "LLVM IR Generation and Backend Handling" section of the proposal. This includes: - Adding the `llvm.spv.resource.counterhandlefromimplicitbinding` and `llvm.spv.resource.counterhandlefrombinding` intrinsics. - Implementing the selection of these intrinsics in the SPIRV backend to generate the correct `OpVariable` and `OpDecorate` instructions for the counter buffer. - Handling `IncrementCounter` and `DecrementCounter` via a new `llvm.spv.resource.updatecounter` intrinsic, which is lowered to `OpAtomicIAdd`. - Adding a new test file to verify the implementation. Contributes to https://github.com/llvm/llvm-project/issues/137032 --------- Co-authored-by: Marcos Maronas <marcos.maronas@intel.com>	2025-10-06 06:49:42 -04:00
Nikolas Klauser	4b05a12e9c	[libc++] Fix simd_unary.pass.cpp with AppleClang When using AppleClang the `clang` feature flag is not set, but the compiler supports `-flax-vector-conversions=integer`. This adds another `ADDITIONAL_COMPILE_FLAGS` for AppleClang to fix the CI.	2025-10-06 12:47:51 +02:00
Simon Pilgrim	10da6f05cc	[X86] x86-shrink-wrap-unwind.ll - regenerate test checks (#162061 )	2025-10-06 10:44:09 +00:00
Cullen Rhodes	913ae2d372	[llvm][docs] Minor fixes and improvements for release process (#151956 ) - The list numbering in [1] currently starts again after item 3 due to the code-block. - Remove mentions of Phabricator and Subversion. - In final step of [2] remove mention of llvm/utils/git/sync-release-repo.sh, which was removed in #73682. - Add direct links to: - www-releases repo. - backporting doc [3]. - Getting Started page. - RELEASE_TESTERS.txt. - Release Sources GitHub workflow. [1] https://llvm.org/docs/HowToReleaseLLVM.html#create-release-branch [2] https://llvm.org/docs/HowToReleaseLLVM.html#triaging-bug-reports-for-releases [3] https://llvm.org/docs/GitHub.html#backporting-fixes-to-the-release-branches	2025-10-06 11:30:55 +01:00
Simon Pilgrim	93408f5312	[AArch64] determineSVEStackSizes - fix MSVC signed/unsigned comparison failure. NFC. (#162059 )	2025-10-06 10:30:46 +00:00
Simon Pilgrim	23f010f1ab	[clang] SemaConcept.cpp - fix MSVC "not all control paths return a value" warnings. NFC. (#162060 )	2025-10-06 10:26:51 +00:00
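
The `norecurse` reasoning in commit 47d74ca157 above can be illustrated with a small call-graph sketch. This is not the LLVM pass: it is a hypothetical Python model that assumes the whole program is visible (no external or address-taken functions, which is the condition the commit's flag tracks) and that the call graph is given as a plain dictionary. The names `reaches_itself` and `norecurse_candidates` are invented for illustration. A function is treated as a candidate when it cannot reach itself through any chain of calls, even if it calls something recursive.

```python
def reaches_itself(call_graph, start):
    """Return True if `start` can reach itself through one or more calls."""
    stack = list(call_graph.get(start, ()))
    seen = set()
    while stack:
        fn = stack.pop()
        if fn == start:
            return True  # found a call chain leading back to `start`
        if fn not in seen:
            seen.add(fn)
            stack.extend(call_graph.get(fn, ()))
    return False


def norecurse_candidates(call_graph):
    """Functions that are not part of any recursion chain.

    Assumption: every caller and callee is listed in `call_graph`,
    i.e. nothing is external and no function has its address taken.
    """
    return {fn for fn in call_graph if not reaches_itself(call_graph, fn)}


# foo -> callee1 -> callee2 -> callee2 (callee2 is self-recursive)
self_recursive = {
    "foo": ["callee1"],
    "callee1": ["callee2"],
    "callee2": ["callee2"],
}

# foo -> callee1 -> callee2 -> callee1 (callee1 and callee2 are mutually recursive)
mutual = {
    "foo": ["callee1"],
    "callee1": ["callee2"],
    "callee2": ["callee1"],
}
```

In the first graph the sketch selects `foo` and `callee1`, and in the second only `foo`, matching the two scenarios described in the commit message.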

554995 Commits