intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-25 10:55:58 +08:00

Author	SHA1	Message	Date
Nick Sarnie	031d99836d	[SPIRV] Error in backend for vararg functions (#169111 ) SPIR-V doesn't support variadic functions, though we make an exception for `printf`. If we don't error, we generate invalid SPIR-V because the backend has no idea how to codegen vararg functions as it is not described in the spec. We get asm like this: ``` %27 = OpFunction %6 None %7 %28 = OpFunctionParameter %4 ; -- End function ``` The above asm is totally invalid, there's no `OpFunctionEnd` and it causes crashes in downstream tools like `spirv-as` and `spirv-link`. We already have many `printf` tests locking down that this doesn't break `printf`, it was already handled elsewhere at the time the error check runs. Note the SPIR-V Translator does the same thing, see [here](https://github.com/KhronosGroup/SPIRV-LLVM-Translator/pull/2703). --------- Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>	2025-11-25 15:17:04 +00:00
Nick Sarnie	b3b83ac1e8	[offload][lit] Fix compilation of two offload tests (#169399 ) These are C tests, not C++, so no function parameters means unspecified number of parameters, not `void`. These compile fine on the current tested offload targets because an error is only [thrown](https://github.com/llvm/llvm-project/blob/main/clang/lib/Sema/SemaDecl.cpp#L10695) if the calling convention doesn't support variadic arguments, which they happen to. When compiling this test for other targets that do not support variadic arguments, we get an error, which does not seem intentional. Just add `void` to the parameter list. --------- Signed-off-by: Nick Sarnie <nick.sarnie@intel.com>	2025-11-25 15:16:15 +00:00
Craig Topper	3564870a9f	[RISCV] Initialize AltFmt and TWiden in the VSETVLIInfo default constructor. (#169457 )	2025-11-25 07:05:51 -08:00
Craig Topper	4f5fb36ddb	[RISCV] Use an enum class for AVL state in RISCVInsertVSETVLI. NFC (#169455 )	2025-11-25 07:05:03 -08:00
Craig Topper	9007b36b42	[RISCV] Add a InstRW to COPY in RISCVSchedSpacemitX60.td. (#169423 ) This prevents the scheduler from thinking copy instructions are free. In #167008, we saw cases where the scheduler moved ABI copies past other instructions creating high register pressure that caused the register allocator to run out of registers. They can't be spilled because the physical register lifetime was increased, not the virtual register. Ideally, we would detect what register class the COPY is for, but for now I've just treated it as a scalar integer copy.	2025-11-25 07:04:11 -08:00
Erich Keane	be2dfce647	[OpenACC][CIR] Global declare 'copyin' clause lowering (#169498 ) JUST like the 'create' clause, except the entry op is copyin instead of create. Most of this is the test.	2025-11-25 15:01:14 +00:00
Matt Arsenault	eb5297e0ad	RuntimeLibcalls: Add mustprogress to common function attributes (#167080 )	2025-11-25 09:48:36 -05:00
Florian Hahn	a51e2ef0fe	[VPlan] Treat VPVector(End)PointerRecipe as single-scalar, if ops are. (#169249 ) VPVector(End)PointerRecipes are single-scalar if all their operands are. This should be effectively NFC currently, but it should re-enable cost checking for some more VPWidenMemoryRecipe after https://github.com/llvm/llvm-project/pull/157387 as discovered by John Brawn.	2025-11-25 14:46:30 +00:00
Ming Yan	25c95ebfa8	[flang][fir] Convert `fir.do_loop` with the unordered attribute to `scf.parallel`. (#168510 ) Refines the existing conversion to allow `fir.do_loop` annotated with `unordered` to be lowered to `scf.parallel`, while other loops retain their original lowering.	2025-11-25 14:43:41 +00:00
Matt Arsenault	d8ae4d503a	RuntimeLibcalls: Add __memcpy_chk, __memmove_chk, __memset_chk (#167053 ) These were in TargetLibraryInfo, but missing from RuntimeLibcalls. This only adds the cases that already have the non-chk variants. Copies the enabled-by-default logic from TargetLibraryInfo, which is probably overly permissive. Only isPS opts out.	2025-11-25 09:39:40 -05:00
GrumpyPigSkin	7f8c43a249	[X86][GISel] Fix crash on bitcasting i16 <-> half with gisel enabled. (#168456 ) Added missing checks for casting half to/from i16 with global-isel enabled. Fixes #166557	2025-11-25 14:31:27 +00:00
Matt Arsenault	5818435c43	RuntimeLibcalls: Add a few libm entries from TargetLibraryInfo (#167049 ) These are floating-point functions recorded in TargetLibraryInfo, but missing from RuntimeLibcalls.	2025-11-25 14:30:59 +00:00
jeanPerier	077a280cf5	[flang][acc] remap symbol appearing in reduction clause (#168876 ) This patch is a follow-up of #162306 for the reduction clause. Inside the compute region that carries the reduction clause, a new hlfir.declare is generated for symbol appearing in the reduction clause. The input of this hlfir.declare is the acc.reduction result. The related semantics::Symbol is remapped to the hlfir.declare result so that any reference to the symbol inside the compute region will use this SSA value as the starting point instead of the SSA value for the host address.	2025-11-25 15:25:15 +01:00
Hristo Hristov	b37b307715	[libc++] Applied `[[nodiscard]]` to some general utilities (#169322 ) `[[nodiscard]]` should be applied to functions where discarding the return value is most likely a correctness issue. - https://libcxx.llvm.org/CodingGuidelines.html#apply-nodiscard-where-relevant The following functions/classes have been annotated in this patch: - [x] `bind_back`, `bind_front`, `bind` - [x] `function`, `mem_fn` - [x] `reference_wrapper`	2025-11-25 16:24:38 +02:00
Paul Osmialowski	a7e715a141	[llvm][docs] Correct the list of the available -fveclib= options to match with the reality (#168205 ) The command line reality is this: $ clang -c prog.c -fveclib=accelerate error: invalid value 'accelerate' in '-fveclib=accelerate' $ clang -c prog.c -fveclib=Accelerate prog.c:1:2: warning: This is only a test [-W#warnings] 1 \| #warning This is only a test \| ^ 1 warning generated. $ clang -c prog.c -fveclib=libmvec prog.c:1:2: warning: This is only a test [-W#warnings] 1 \| #warning This is only a test \| ^ 1 warning generated. $ clang -c prog.c -fveclib=LIBMVEC error: invalid value 'LIBMVEC' in '-fveclib=LIBMVEC' $ clang -c prog.c -fveclib=massv error: invalid value 'massv' in '-fveclib=massv' $ clang -c prog.c -fveclib=MASSV prog.c:1:2: warning: This is only a test [-W#warnings] 1 \| #warning This is only a test \| ^ 1 warning generated. $ clang -c prog.c -fveclib=sleef error: invalid value 'sleef' in '-fveclib=sleef' $ clang -c prog.c -fveclib=sleefgnuabi error: invalid value 'sleefgnuabi' in '-fveclib=sleefgnuabi' $ clang -c prog.c -fveclib=SLEEF prog.c:1:2: warning: This is only a test [-W#warnings] 1 \| #warning This is only a test \| ^ 1 warning generated. $ clang -c prog.c -fveclib=darwin_libsystem_m error: invalid value 'darwin' in '-fveclib=darwin_libsystem_m' $ clang -c prog.c -fveclib=Darwin_libsystem_m prog.c:1:2: warning: This is only a test [-W#warnings] 1 \| #warning This is only a test \| ^ 1 warning generated. $ clang -c prog.c -fveclib=armpl error: invalid value 'armpl' in '-fveclib=armpl' $ clang -c prog.c -fveclib=ARMPL error: invalid value 'ARMPL' in '-fveclib=ARMPL' $ clang -c prog.c -fveclib=ArmPL prog.c:1:2: warning: This is only a test [-W#warnings] 1 \| #warning This is only a test \| ^ 1 warning generated. $ clang -c prog.c -fveclib=amdlibm error: invalid value 'amdlibm' in '-fveclib=amdlibm' $ clang -c prog.c -fveclib=AMDLIBM clang: error: unsupported option 'AMDLIBM' for target 'aarch64'	2025-11-25 14:24:24 +00:00
Mikhail R. Gadelha	d615c14c22	[RISCV] Update SpacemiT-X60 vector floating-point instructions latencies (#150618 ) This PR adds hardware-measured latencies for all instructions defined in Section 13 of the RVV specification: "Vector Floating-Point Instructions" to the SpacemiT-X60 scheduling model.	2025-11-25 11:21:34 -03:00
Jay Foad	d54168013a	[LLVM] Use "syncscope" instead of "synchscope" in comments. NFC. (#134615 ) This matches the spelling of the keyword in LLVM IR.	2025-11-25 14:11:49 +00:00
Erich Keane	4e9b76e23b	[OpenACC][CIR] 'declare' lowering for globals/ns/struct-scopes (+create) (#169409 ) This patch does the lowering for a 'declare' construct that is not a function-local-scope. It also does the lowering for 'create', which has an entry-op of create and exit-op of delete. Global/NS/Struct scope 'declare's emit a single 'acc_ctor' and 'acc_dtor' (except in the case of 'link') per variable referenced. The ctor is the entry op followed by a declare_enter. The dtor is a get_device_ptr, followed by a declare_exit, followed by a delete(exit op). This DOES include any necessary bounds. This patch implements all of the above. We use a separate 'visitor' for the clauses here since it is particularly different from the other uses, AND there are only 4 valid clauses. Additionally, we had to split the modifier conversion into its own 'helpers' file, which will hopefully get some additional use in the future.	2025-11-25 05:56:34 -08:00
Colin Kinloch	1919cd6322	[analyzer] Fix non decimal macro values in tryExpandAsInteger (#168632 ) Values were parsed into an unsigned APInt with just enough bit width to hold the number, then interpreted as signed values. This resulted in hex, octal and binary literals being interpreted as negative when the most significant bit is 1. For example, `-0b11` would have a bit width of 2, would be interpreted as -1, then negated to become 1.	2025-11-25 13:41:33 +00:00
Walter Lee	f0e0a22158	[bazel] Delete redundant visibility (#169493 ) default_visibility is already public.	2025-11-25 13:21:56 +00:00
Paul Walker	6bf3249fe9	[Clang][Sema] Emit diagnostic for __builtin_vectorelements(<SVEType>) when SVE is not available. (#168097 ) As is done for other targets, I've moved the target type checking code into SemaARM and migrated existing uses. Fixes https://github.com/llvm/llvm-project/issues/155736	2025-11-25 13:14:10 +00:00
Durgadoss R	9e53ef3d8c	[MLIR][NVVM] Update mbarrier.arrive.* Op (#168758 ) This patch updates the mbarrier.arrive.* family of Ops to include all features added up-to Blackwell. * Update the `mbarrier.arrive` Op to include shared_cluster memory space, cta/cluster scope and an option to lower using relaxed semantics. * An `arrive_drop` variant is added for both the `arrive` and `arrive.nocomplete` operations. * Updates for expect_tx and complete_tx operations. * Verifier checks are added wherever appropriate. * lit tests are added to verify the lowering to the intrinsics. TODO: * Updates for the remaining mbarrier family will be done in subsequent PRs. (mainly, arrive.expect-tx, test_wait and try_waits) Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2025-11-25 18:32:13 +05:30
Jan Patrick Lehr	4bc654d649	Revert "[Flang] Move builtin .mod generation into runtimes" (#169489 ) Reverts llvm/llvm-project#137828 Buildbot error in https://lab.llvm.org/staging/#/builders/105/builds/37275	2025-11-25 13:54:27 +01:00
LLVM GN Syncbot	262716b35b	[gn build] Port `07ad928d92`	2025-11-25 12:47:00 +00:00
Nikolas Klauser	07ad928d92	[libc++] Introduce __specialized_algorithms (#167295 )	2025-11-25 13:46:19 +01:00
Ramkumar Ramachandra	e06c148af7	[IVDesc] Use SCEVPatternMatch to improve code (NFC) (#168397 )	2025-11-25 12:29:56 +00:00
Simon Pilgrim	af3af8ea5a	[X86] setcc-wide-types.ll - cleanup check prefixes NFC (#169488 ) Match typical prefixes used in x86 SSE/AVX tests	2025-11-25 12:19:59 +00:00
Jay Foad	4e37526fdb	[AMDGPU] Fix test after #169378	2025-11-25 12:17:17 +00:00
Balázs Benics	17b19c5034	[analyzer] Unroll loops of compile-time upper-bounded loops (#169400 ) Previously, only literal upper-bounded loops were recognized. This patch relaxes this matching to accept any compile-time deducible constant expression. It would be better to rely on the SVals (values from the symbolic domain), as those could potentially have more accurate answers, but this one is much simpler. Note that at the time we calculate this value, we have not evaluated the sub-exprs of the condition, consequently, we can't just query the Environment for the folded SVal. Because of this, the next best tool in our toolbox is comp-time evaluating the Expr. rdar://165363923	2025-11-25 12:16:56 +00:00
Jay Foad	d748c81218	[AMDGPU] Change the immediate operand of s_waitcnt_depctr / s_wait_alu (#169378 ) The 16-bit immediate operand of s_waitcnt_depctr / s_wait_alu has some unused bits. Previously codegen would set these bits to 1, but setting them to 0 matches the SP3 assembler behaviour better, which in turn means that we can print them using the human readable SP3 syntax: s_wait_alu 0xfffd ; unused bits set to 1 s_wait_alu 0xff9d ; unused bits set to 0 s_wait_alu depctr_va_vcc(0) ; unused bits set to 0, human readable Note that the set of unused bits changed between GFX10.1 and GFX10.3.	2025-11-25 11:55:26 +00:00
Nikolas Klauser	105900ced1	[libc++] Always define _LIBCPP_GLIBC_PREREQ (#169405 ) Always defining the macro allows us to simplify the few places where it's used.	2025-11-25 12:51:23 +01:00
Nikolas Klauser	68c2a8140f	[libc++][C++03] Fix ODR tests (#169349 ) We don't really need to include `<__config>`. We just need to include a public C++ header.	2025-11-25 12:49:59 +01:00
Aiden Grossman	51dd3ec13c	[MLIR][OpenMP] Bail early in sortMapIndices if indices are the same (#169474 ) If we are given the same index in the comparator callback, simply return false. Otherwise we will end up adding invalid items to occludedChildren, causing extra items to get removed that should not be, resulting in failures that manifest in different forms (assertions, asan failures, ubsan failures, etc.).	2025-11-25 06:23:12 -05:00
Sander de Smalen	e1b08731e5	Revert "Reland "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG"" This reverts commit `bb78728826`.	2025-11-25 11:01:27 +00:00
Ravil Dorozhinskii	bc4143b27a	[DAG] SDPatternMatch - add m_SpecificFP matcher (#167438 ) This patch introduces SpecificFP matcher for SelectionDAG nodes. This includes: Adding SpecificFP_match() in SDPatternMatch.h. Adding test coverage in SelectionDAGPatternMatchTest.cpp. Closes #165566	2025-11-25 11:49:36 +01:00
Felipe de Azevedo Piovezan	4b137e7446	[lldb][NFC] Remove code dupl in favour of a named variable in UnwindAssemblyInstEmulation (#169369 )	2025-11-25 10:34:39 +00:00
Ramkumar Ramachandra	cb63e99e58	[VPlan] Include flags in VectorPointerRecipe::printRecipe (#169466 ) The change is non-functional with respect to emitted IR.	2025-11-25 10:26:51 +00:00
Zhaoxin Yang	5e7631e14a	[LoongArch][DAGCombiner] Combine vand (vnot ..) to vandn (#161037 ) After this commit, DAGCombiner will have more opportunities to perform vector folding. This patch includes several foldings, as follows: - VANDN(x,NOT(y)) -> AND(NOT(x),NOT(y)) -> NOT(OR(X,Y)) - VANDN(x, SplatVector(Imm)) -> AND(NOT(x), NOT(SplatVector(~Imm)))	2025-11-25 17:46:28 +08:00
Simon Pilgrim	f287abd53e	[DAG][X86] Improve custom i256/i512 AVX512 CTLZ/CTTZ Handling with MVT::i256/i512 (#168860 ) This patch proposes to move the AVX512 CTLZ/CTTZ i256/i512 codegen to ReplaceNodeResults to allow them to be declared as custom lowering - this allows expansion of larger int types (e.g. i1024) to fall back to them during their expansion. However, to declare these i256/i512 ops as custom, we need to add MVT::i256/i512 simple types - I'm intending to add further large integer handling in the future, some of which will use vector register instructions, and it's going to be much easier if this can be handled with i128/i256/i512 types that match the vector register sizes. This exposed a regression in NVPTX due to their use of EVT::isSimple() to match their upper integer size bounds.	2025-11-25 09:46:14 +00:00
Michael Kruse	86fbaef99a	[Flang] Move builtin .mod generation into runtimes (#137828 ) Move building the .mod files from openmp/flang to openmp/flang-rt using a shared mechanism. Motivations to do so are: 1. Most modules are target-dependent and need to be re-compiled for each target separately, which is something the LLVM_ENABLE_RUNTIMES system already does. Prime example is `iso_c_binding.mod` which encodes the target's ABI. Most other modules have `#ifdef`-enclosed code as well. 2. CMake has support for Fortran that we should use. Among other things, it automatically determines module dependencies so there is no need to hardcode them in the CMakeLists.txt. 3. It allows using Fortran itself to implement Flang-RT. Currently, only `iso_fortran_env_impl.f90` emits object files that are needed by Fortran applications (#89403). The workaround of #95388 could be reverted. Some new dependencies come into play: * openmp depends on flang-rt for building `lib_omp.mod` and `lib_omp_kinds.mod`. Currently, if flang-rt is not found then the modules are not built. * check-flang depends on flang-rt: If not found, the majority of tests are disabled. If not building in a bootstrapping build, the location of the module files can be pointed to using `-DFLANG_INTRINSIC_MODULES_DIR=<path>`, e.g. in a flang-standalone build. Alternatively, the test needing any of the intrinsic modules could be marked with `REQUIRES: flangrt-modules`. * check-flang depends on openmp: Not a change; tests requiring `lib_omp.mod` and `lib_omp_kinds.mod` are already marked with `openmp_runtime`. As intrinsic modules are now specific to the target, their location is moved from `include/flang` to `<resource-dir>/finclude/flang/<triple>`. The mechanism to compute the location has been moved from flang-rt (previously used to compute the location of `libflang_rt.*.a`) to common locations in `cmake/GetToolchainDirs.cmake` and `runtimes/CMakeLists.txt` so it can be used by both openmp and flang-rt. Potentially the mechanism could also be shared by other libraries such as compiler-rt. `finclude` was chosen because `gfortran` uses it as well and avoids misuse such as `#include <flang/iso_c_binding.mod>`. The search location is now determined by `ToolChain` in the driver, instead of by the frontend. Now the driver adds `-fintrinsic-module-path` for that location to the frontend call (just like gfortran does). `-fintrinsic-module-path` had to be fixed for this because ironically it was only added to `searchDirectories`, but not `intrinsicModuleDirectories_`. Since the driver determines the location, tests invoking `flang -fc1` and `bbc` must also be passed the location by llvm-lit. This works like llvm-lit does for finding the include dirs for Clang using `-print-file-name=...`.	2025-11-25 10:33:58 +01:00
Cullen Rhodes	a11e7347fb	[llvm][nfc] Ignore OpenAI Codex artifacts (#162481 ) Follow-up to #153853 to also ignore Codex artifacts [1]. AGENTS.md may be at the root or in sub-directories, so unlike other Markdown config files I've not prefixed it with '/'. [1] https://github.com/openai/codex/blob/main/docs/getting-started.md#memory-with-agentsmd	2025-11-25 09:32:55 +00:00
Jie Fu	cf5234bac4	[AArch64] Silence a warning (NFC) /llvm-project/llvm/lib/Target/AArch64/MachineSMEABIPass.cpp:952:12: error: unused variable 'SMEFnAttrs' [-Werror,-Wunused-variable] SMEAttrs SMEFnAttrs = AFI->getSMEFnAttrs(); ^ 1 error generated.	2025-11-25 17:23:26 +08:00
Pierre van Houtryve	a086fb2fbb	[AMDGPU][gfx1250] Add wait_xcnt before any access that cannot be repeated (#168852 ) The xcnt wait is actually required before any memory access that can only be done once, so atomic stores and volatile accesses are affected. This patch also ensures buffer instructions are handled.	2025-11-25 10:11:04 +01:00
Benjamin Maxwell	eb568d6d0c	[AArch64][SME] Handle zeroing ZA and ZT0 in functions with ZT0 state (#166361 ) In the MachineSMEABIPass, if we have a function with ZT0 state, then there are some additional cases where we need to zero ZA and ZT0. If the function has a private ZA interface, i.e., new ZT0 (and new ZA if present), then ZT0/ZA must be zeroed when committing the incoming ZA save. If the function has a shared ZA interface, e.g. new ZA and shared ZT0, then ZA must be zeroed on function entry (without a ZA save commit). The logic in the ABI pass has been reworked to use an "ENTRY" state to handle this (rather than the more specific "CALLER_DORMANT" state).	2025-11-25 09:09:47 +00:00
LLVM GN Syncbot	2ce363d252	[gn build] Port `a39af125db`	2025-11-25 08:55:46 +00:00
Gergely Bálint	ed95c4d6ec	[BOLT][BTI] Add MCPlusBuilder::createBTI (#167305 ) - creates a BTI j\|c landing pad MCInst. - create getBTIHintNum utility in AArch64/Utils, to make sure BOLT generates BTI immediates the same way as LLVM. - add MCPlusBuilder unittests to cover new function.	2025-11-25 09:51:40 +01:00
Tomer Shafir	6193f2aeda	[AArch64] Assert `expandMOVImm` prioritizes optimal single MOVZ/N (#169341 ) The expansion of move immediate in `expandMOVImm` follows the priority of the `MOV` alias. In addition, the selection there properly prefers expansion based on perf optimality order. This change adds a simple assert that `expandMOVImmSimple` expands a single optimal MOVZ/MOVK.	2025-11-25 10:48:23 +02:00
Dharuni R Acharya	a39af125db	[NVVM] Move pretty-print functions from NVVMIntrinsicUtils.h to cpp file (#168997 ) This patch moves the print functions from `NVVMIntrinsicUtils.h` to `NVVMIntrinsicUtils.cpp`, a file created in the `llvm/lib/IR` directory. Signed-off-by: Dharuni R Acharya <dharunira@nvidia.com>	2025-11-25 14:12:50 +05:30
Longsheng Mou	f817a1b039	[NFC] Fix typo of `integer` (#169325 )	2025-11-25 16:06:13 +08:00
Maksim Panchenko	5490bcf4aa	[BOLT] Add missing new line. NFC	2025-11-25 00:05:13 -08:00
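
Note on 1919cd6322 ([analyzer] tryExpandAsInteger): the sign problem described in that message can be reproduced with plain integer arithmetic. The sketch below is only an illustration in Python, not the analyzer's code. The helper names `parse_minimal_width` and `as_signed` are hypothetical, and adding one spare bit is an assumed remedy chosen for the demonstration; the actual patch may fix the problem differently.

```python
def parse_minimal_width(literal):
    """Parse a 0b/0x literal and return (value, minimal bit width).

    Mimics storing the number in an unsigned integer that is just wide
    enough to hold it (hypothetical helper, for illustration only).
    """
    value = int(literal, 0)
    return value, max(value.bit_length(), 1)


def as_signed(value, bit_width):
    """Reinterpret an unsigned value of the given width as two's complement."""
    sign_bit = 1 << (bit_width - 1)
    return (value ^ sign_bit) - sign_bit


def negate_buggy(literal):
    # Minimal width: the top bit is always 1, so the value reads as negative.
    value, width = parse_minimal_width(literal)
    return -as_signed(value, width)


def negate_widened(literal):
    # Assumed remedy: one spare bit keeps the sign bit clear.
    value, width = parse_minimal_width(literal)
    return -as_signed(value, width + 1)


print(negate_buggy("0b11"))    # 1, the wrong result described in the commit
print(negate_widened("0b11"))  # -3
```

With the minimal width, `0b11` occupies 2 bits and its top bit doubles as the sign bit, so it reads as -1 and negation yields 1. One spare bit is enough to make the same literal read as 3.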

560624 Commits
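
Note on d748c81218 ([AMDGPU] s_waitcnt_depctr / s_wait_alu): the two encodings quoted in that message, `0xfffd` and `0xff9d`, can be compared bit by bit to see which bits the example treats as unused. This is a small arithmetic sketch in Python, not backend code. The resulting mask is derived only from those two example values; since the message says the set of unused bits changed between GFX10.1 and GFX10.3, it should not be assumed to hold for every target.

```python
# Example immediates quoted in the commit message.
ONES_ENCODING = 0xFFFD   # unused bits set to 1 (old codegen)
ZEROS_ENCODING = 0xFF9D  # unused bits set to 0 (matches SP3)


def differing_bits(a, b):
    """Return the positions (0 = least significant) where two 16-bit values differ."""
    diff = a ^ b
    return [bit for bit in range(16) if (diff >> bit) & 1]


# Bits that flipped between the two encodings in this example.
unused_mask = ONES_ENCODING ^ ZEROS_ENCODING

print(hex(unused_mask))                               # 0x60
print(differing_bits(ONES_ENCODING, ZEROS_ENCODING))  # [5, 6]
print(hex(ONES_ENCODING & ~unused_mask))              # 0xff9d
```

Clearing the mask bits in the old encoding reproduces the new one, which is consistent with the message's description that only the unused bits changed.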