561911 Commits

Author SHA1 Message Date
Chow
3e008cb333 Scalarizer : Fix vector shuffle issue when can't aligned to customized minBits. (#163912)
When minBits is set and the scalarizer pass runs, the last remaining
fragment of a boolean vector may not be aligned to minBits. In that case
the remaining bits must be processed one by one, and a direct shuffle is
not allowed during packing.

Problem:
In the 'concatenate' step, when processing a boolean vector, if the last
remaining bits (fragment) cannot be aligned to minBits but still need to
be packed, those bits must be processed one by one.

A direct call to vector shuffle assumes that the remaining boolean bits
can be packed to the target pack size. For example, when processing a
boolean vector with `size = 7` and `min bits = 4`, the first fragment
of `4` bits can be packed correctly, but the `3` remaining bits cannot
be used in a vector shuffle call.

Solution:
If the remaining bits cannot be aligned to the required target (min bits)
pack size, process them one by one.
(This mostly affects boolean vectors, since their bit width is often
not aligned to pow(2).)
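
The `size = 7`, `min bits = 4` case above can be sketched as follows. This is only an illustration of the splitting arithmetic, not the Scalarizer's code: `Fragment`, `splitIntoFragments`, and the `PackWithShuffle` flag are hypothetical names introduced here.

```cpp
#include <vector>

// Hypothetical description of one fragment of a boolean vector.
struct Fragment {
  unsigned FirstBit;    // index of the first element in the fragment
  unsigned NumBits;     // number of elements in the fragment
  bool PackWithShuffle; // true only when the fragment fills a whole pack
};

// Splits NumBits elements into fragments of at most MinBits elements.
// A trailing fragment smaller than MinBits is flagged so that it is
// handled element by element instead of with a direct shuffle.
std::vector<Fragment> splitIntoFragments(unsigned NumBits, unsigned MinBits) {
  std::vector<Fragment> Fragments;
  if (MinBits == 0)
    return Fragments; // assumption: a zero pack size means nothing to do
  for (unsigned First = 0; First < NumBits; First += MinBits) {
    unsigned Remaining = NumBits - First;
    unsigned Size = Remaining < MinBits ? Remaining : MinBits;
    Fragments.push_back({First, Size, Size == MinBits});
  }
  return Fragments;
}
```

Calling `splitIntoFragments(7, 4)` gives one full fragment of 4 bits and one trailing fragment of 3 bits that is not marked for a shuffle.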

---------

Co-authored-by: Zhou, Shaochi(AMD) <shaozhou@amd.com>
2025-12-08 18:06:39 +00:00
anonymous
57607e2b51 [Clang][Diagnose] Minimal support on emit-include-location in sarif mode (#170415)
Currently, invoking `clang++` with `-fdiagnostics-format=sarif` causes a
crash, with stack traces indicating that
`SARIFDiagnostic::emitIncludeLocation` is unimplemented.

This PR adds minimal support for converting `In file included from ...`
and `In module ...` into `SARIF.result.relatedLocations`. With this
change, `clang++ -fdiagnostics-format=sarif` no longer crashes and now
provides a minimal amount of useful information.

Thank you.
2025-12-08 13:05:25 -05:00
Aiden Grossman
c7f62d28ea [bazel] Port 2a5420ea51 (#171161)
Move some headers into the common __support_time library now that they
are no longer platform specific.
2025-12-08 09:59:45 -08:00
Tarun Prabhu
da7174958b [flang][Driver] Accept (and ignore) some gfortran-specific options
Enable the clang_ignored_gcc_optimization_f_group in flang. These
options are accepted by clang, but ignored after emitting a warning
message. flang's behavior now mirrors both clang and gfortran.

Fixes #158436
2025-12-08 10:58:34 -07:00
Alexey Bataev
f8d0c355f5 [SLP]Prefer instructions, used outside the block, as the initial main copyable instructions
Instructions used outside the block must be considered the first
choice for the main instructions in the copyable nodes, to avoid
use-before-def.

Fixes #171055
2025-12-08 09:46:15 -08:00
Kevin Sala Penades
0e92beb0c0 [Clang][OpenMP] Switch to __kmpc_parallel_60 with strict parameter (#171082)
This commit switches the `__kmpc_parallel_51` to `__kmpc_parallel_60`,
and adds the strict boolean for the number of threads.
2025-12-08 09:37:11 -08:00
Michael Liao
039c834b03 [scavenger] Remove unused iterator. NFC 2025-12-08 12:36:02 -05:00
Artem Kroviakov
66134e7136 [MLIR][XeVM] Improve matrix ops lowering (#170268) 2025-12-08 18:29:15 +01:00
Michael Klemm
8c8074f0fa [Flang][OpenMP] Fix OpenMP static scheduling when trip count is zero (#170863)
Code-gen produced incorrect code for cases when the trip count of an
associated DO loop was zero. The generated code evaluated the trip count
of the loop and subtracted 1 from it. When this was passed to
__kmpc_for_static_init_4u, the value was interpreted as unsigned, which
made the upper bound of the worksharing loop 2^32-1 and caused a
division by zero in the calculation of the loop bounds for the threads.
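
The wraparound described above can be reproduced in a few lines. This is a minimal sketch of the arithmetic only. `naiveUpperBound` and `checkedUpperBound` are hypothetical names, and the guarded variant is one possible way to avoid the wrap, not necessarily what the patch does.

```cpp
#include <cstdint>
#include <optional>

// Mirrors the problematic computation: trip count minus one, read as an
// unsigned 32-bit value. For a trip count of zero this wraps to 2^32-1.
std::uint32_t naiveUpperBound(std::uint32_t TripCount) {
  return static_cast<std::uint32_t>(TripCount - 1u);
}

// Hypothetical guard: report "no upper bound" for an empty loop instead
// of letting the subtraction wrap.
std::optional<std::uint32_t> checkedUpperBound(std::uint32_t TripCount) {
  if (TripCount == 0)
    return std::nullopt;
  return static_cast<std::uint32_t>(TripCount - 1u);
}
```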
2025-12-08 18:29:02 +01:00
Folkert de Vries
fdd0d53430 cmse: emit __acle_se_ symbol for aliases to entry functions (#162109)
Emitting the symbol in `emitGlobalAlias` seemed most efficient,
otherwise I think you'd have to traverse all aliases. I have verified
that the additional symbol is picked up by `arm-none-eabi-ld` and
correctly generates an entry in `veneers.o`.

Fixes #162084
2025-12-08 17:26:21 +00:00
Sang Ik Lee
c3579f0199 [MLIR][XeGPU][Conversion] Add 2D block op support for sub byte types (#169099)
Some use cases or shapes for 2D block ops with sub-byte types can be
emulated with 2D block operations on non-sub-byte types. Add the sub-byte
type i4 as a valid XeGPU type, and add lowering of certain 2D
block operations by emulating them with larger element types.
2025-12-08 09:23:07 -08:00
Ivan Butygin
f88d060c41 [mlir][amdgpu] memory_counter_wait tensor counter support (#171153) 2025-12-08 20:02:40 +03:00
Felipe de Azevedo Piovezan
f27fbca37c [lldb][NFC] Replace const std::vector& with ArrayRef in APIs (#170834)
Inside the LLVM codebase, const vector& should just be ArrayRef, as this
more general API works both with vectors, SmallVectors and
SmallVectorImpl, as well as with single elements.

This commit replaces two uses introduced in
https://github.com/llvm/llvm-project/pull/168797 .
2025-12-08 16:59:32 +00:00
Anutosh Bhat
cbce30e0f8 [clang-repl] Skip CodeGen for top-level decls when diagnostics report errors (#169989)
We can see the following while running clang-repl in C mode 
```
anutosh491@vv-nuc:/build/anutosh491/llvm-project/build/bin$ ./clang-repl --Xcc=-x --Xcc=c --Xcc=-std=c23
clang-repl> printf("hi\n");
In file included from <<< inputs >>>:1:
input_line_1:1:1: error: call to undeclared library function 'printf' with type 'int (const char *, ...)'; ISO C99 and
      later do not support implicit function declarations [-Wimplicit-function-declaration]
    1 | printf("hi\n");
      | ^
input_line_1:1:1: note: include the header <stdio.h> or explicitly provide a declaration for 'printf'
error: Parsing failed.
clang-repl> #include <stdio.h>
hi
```

In debug mode, while dumping the generated Module, I see this:
```
clang-repl> printf("hi\n");
In file included from <<< inputs >>>:1:
input_line_1:1:1: error: call to undeclared library function 'printf' with type 'int (const char *, ...)'; ISO C99 and
      later do not support implicit function declarations [-Wimplicit-function-declaration]
    1 | printf("hi\n");
      | ^
input_line_1:1:1: note: include the header <stdio.h> or explicitly provide a declaration for 'printf'
error: Parsing failed.
clang-repl> #include <stdio.h>

=== compile-ptu 1 ===
[TU=0x55556cfbf830, M=0x55556cfc13a0 (incr_module_1)]
[LLVM IR]
; ModuleID = 'incr_module_1'
source_filename = "incr_module_1"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

@.str = private unnamed_addr constant [4 x i8] c"hi\0A\00", align 1
@llvm.global_ctors = appending global [1 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 65535, ptr @_GLOBAL__sub_I_incr_module_1, ptr null }]

define internal void @__stmts__0() #0 {
entry:
  %call = call i32 (ptr, ...) @printf(ptr noundef @.str)
  ret void
}

declare i32 @printf(ptr noundef, ...) #1

; Function Attrs: noinline nounwind uwtable
define internal void @_GLOBAL__sub_I_incr_module_1() #2 section ".text.startup" {
entry:
  call void @__stmts__0()
  ret void
}

attributes #0 = { "min-legal-vector-width"="0" }
attributes #1 = { "frame-pointer"="all" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }
attributes #2 = { noinline nounwind uwtable "frame-pointer"="all" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cmov,+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" }

!llvm.module.flags = !{!0, !1, !2, !3, !4}
!llvm.ident = !{!5}

!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{i32 8, !"PIC Level", i32 2}
!2 = !{i32 7, !"PIE Level", i32 2}
!3 = !{i32 7, !"uwtable", i32 2}
!4 = !{i32 7, !"frame-pointer", i32 2}
!5 = !{!"clang version 22.0.0git (https://github.com/anutosh491/llvm-project.git 81ad8fbc2b)"}
=== end compile-ptu ===

execute-ptu 1: [TU=0x55556cfbf830, M=0x55556cfc13a0 (incr_module_1)]
hi
```

Basically I see that CodeGen emits IR for a cell before we know whether
DiagnosticsEngine has an error. For C code like `printf("hi\n");`
without <stdio.h>, Sema emits a diagnostic but still produces a
"codegen-able" `TopLevelStmt`, so the `printf` call is IR-generated into
the current module.

Previously, when `Diags.hasErrorOccurred()` was true, we only cleaned up
the PTU AST and left the CodeGen module untouched. The next successful
cell then called `GenModule()`, which returned that same module (now
also containing the next cell’s IR), causing side effects from the
failed cell (e.g. printf)
2025-12-08 22:23:54 +05:30
Krzysztof Parzyszek
58cb4873f9 Revert "[mlir] Fix '-Wtemplate-id-cdtor'. NFC"
This reverts commit fccb65ef8f.

It breaks pre-merge CI:
```
2025-12-08T16:35:11.7239054Z /home/gha/actions-runner/_work/llvm-project/llvm-project/mlir/lib/Pass/PassRegistry.cpp:439:37: error: ISO C++ requires the name after '::~' to be found in the same scope as the name before '::~' [-Werror,-Wdtor-name]
2025-12-08T16:35:11.7240458Z   439 | llvm::cl::OptionValue<OpPassManager>::~OptionValue() = default;
2025-12-08T16:35:11.7241014Z       | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
2025-12-08T16:35:11.7241494Z       |                                     ::OptionValue
2025-12-08T16:35:11.7241903Z 1 error generated.
```
2025-12-08 10:48:53 -06:00
Shreeyash Pandey
2a5420ea51 [libc] move abs_timeout and monotonicity out of linux dir (#167719)
This patch moves abs_timeout and monotonicity out of the linux dir into
common. Both of these functions depend on clock_gettime which is the
actual os-dependent component. As other features in `__support/threads`
may want to use these, it's better to share it in common.
2025-12-08 22:14:12 +05:30
Men-cotton
614fe6da14 [mlir][OpenMP] Fix crash in MapInfoOp conversion when type conversion fails (#171045)
Check the result of `convertType` before calling `TypeAttr::get`. This
prevents a crash on unsupported types (e.g. `tensor`) by ensuring the
pattern fails gracefully.

Added regression test: map-info-type-conversion-fail.mlir

Fixes: #108159
2025-12-08 17:30:22 +01:00
Rana Pratap Reddy
b32a2f418a [Clang][OpenCL][AMDGPU] Allow _Float16 and half vector type compatibility (#170605)
## Summary
Allowing implicit compatibility between `_Float16` vector types and
`half` vector types in OpenCL mode. This enables AMDGPU builtins to work
correctly across OpenCL, HIP, and C++ without requiring separate builtin
definitions.
## Problem Statement
When using AMDGPU image builtins that return half-precision vectors in
OpenCL, users encounter type incompatibility errors:
**Builtin Definition:**
`TARGET_BUILTIN(__builtin_amdgcn_image_load_1d_v4f16_i32, "V4xiiQtii",
"nc", "image-insts")`

**Test Case:**
```
typedef half half4 __attribute__((ext_vector_type(4)));
half4 test_builtin_image_load_1d_2(half4 v4f16, int i32, __amdgpu_texture_t tex) {
  return __builtin_amdgcn_image_load_1d_v4f16_i32(100, i32, tex, 120, i32);
}
```
**Error:**
```
error: returning '__attribute__((__vector_size__(4 * sizeof(_Float16)))) _Float16' 
(vector of 4 '_Float16' values) from a function with incompatible result type 
'half4' (vector of 4 'half' values)
```
## Solution
In OpenCL, allow implicit compatibility between `_Float16` vector types
and `half` vector types. This is needed for AMDGPU builtins that may
return _Float16 vectors to work correctly with OpenCL half vector types.
2025-12-08 21:56:35 +05:30
Michael Liao
fccb65ef8f [mlir] Fix '-Wtemplate-id-cdtor'. NFC 2025-12-08 11:22:48 -05:00
Simon Pilgrim
a05fc9edb9 HexagonGenWideningVecInstr.cpp - fix MSVC "result of 32-bit shift implicitly converted to 64 bits" warning. NFC. (#171095) 2025-12-08 16:17:25 +00:00
zhijian lin
d1ad0856f8 Fix [PowerPC] llc crashed at -O1/O2/O3: Assertion `isImm() && "Wrong MachineOperand mutator"' failed. (#170548)
Fixed issue 
[[PowerPC] llc crashed at -O1/O2/O3: Assertion `isImm() && "Wrong
MachineOperand mutator"'
failed.](https://github.com/llvm/llvm-project/issues/167672)

The root cause of the crash is that the IMM operand sits at a different
operand number in PPC::XXSPLTW than in PPC::XXSPLTB/PPC::XXSPLTH.

The patch also fixes a potential bug: the new element index for
PPC::XXSPLTB/PPC::XXSPLTH/XXSPLTW was computed with the same logic, but it
should differ. We need to convert the element index into the proper unit
(byte for VSPLTB, halfword for VSPLTH, word for VSPLTW) because
PPC::XXSLDWI interprets its ShiftImm in 32-bit word units.
2025-12-08 11:16:55 -05:00
Sang Ik Lee
447af32fbb [MLIR][XeGPU][XeVM] create_nd_tdesc: use correct pitch from strides. (#170384)
Base memory pitch should be derived from base stride, not base width.
Remove offset fields from tensor descriptor payload and add pitch field.
2025-12-08 08:15:44 -08:00
Sang Ik Lee
b8ddbc4f03 [MLIR][XeVM] gpu.printf test: use correct runtime. (#170754)
The gpu printf test was not using the runtime required by lit.local.cfg.
All other tests in the directory correctly use the level zero runtime,
but the gpu printf test was using the sycl runtime.
2025-12-08 08:14:56 -08:00
Ivan Butygin
ca8419d6cc [mlir][amdgpu] Fuse adjacent MemoryCounterWaitOp (#171148)
Taking the minimum value.
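
A rough model of "taking the minimum" when two adjacent waits are fused is sketched below. The `Counter` alias and `fuseCounter` are hypothetical, and treating an absent counter as "no wait requested" is an assumption of this sketch rather than something stated in the commit.

```cpp
#include <algorithm>
#include <optional>

// Hypothetical model of one wait counter. std::nullopt stands for
// "no wait requested on this counter" (an assumption of this sketch).
using Counter = std::optional<unsigned>;

// Fuses the same counter from two adjacent waits: when both are present
// the stricter (smaller) value wins, otherwise the present one is kept.
Counter fuseCounter(Counter First, Counter Second) {
  if (!First)
    return Second;
  if (!Second)
    return First;
  return std::min(*First, *Second);
}
```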
2025-12-08 18:52:26 +03:00
Simon Pilgrim
ebdb903c10 [X86] Handle X86ISD::EXPAND/COMPRESS nodes as target shuffles (#171119)
Allows for shuffle simplification

Required a minor fix to the overly reduced compress-undef-float-passthrough.ll regression test
2025-12-08 15:48:43 +00:00
Anchu Rajendran S
b08c72b26c [Flang][OpenMP] Enables parsing of threadset clause (#169856) 2025-12-08 07:47:05 -08:00
Ramkumar Ramachandra
c5b90103da [VPlan] Use nuw when computing {VF,VScale}xUF (#170710)
These quantities should never unsigned-wrap. This matches the behavior
if only VFxUF is used (and not VF): when computing both VF and VFxUF,
nuw should hold for each step separately.
2025-12-08 15:46:02 +00:00
Benjamin Maxwell
9a5fa3075a [ADT] Add llvm::reverse_conditionally() iterator (#171040)
This patch adds a simple iterator range that allows conditionally
iterating a collection in reverse. It works with any collection
supported by `llvm::reverse(Collection)`.

```
void foo(bool Reverse, std::vector<int>& C) {
  for (int I : reverse_conditionally(C, Reverse)) {
    // ...
  }
}
```
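
For readers without the LLVM headers at hand, the idea can be approximated with a callback-based helper. This is a different, simpler technique than the iterator range added by the patch, and `forEachConditionallyReversed` and `collect` are hypothetical names used only for illustration.

```cpp
#include <vector>

// Visits every element of C, back to front when Reverse is true and
// front to back otherwise. Not the llvm::reverse_conditionally range,
// just a standard-library stand-in for the same behaviour.
template <typename Container, typename Fn>
void forEachConditionallyReversed(Container &C, bool Reverse, Fn Visit) {
  if (Reverse) {
    for (auto It = C.rbegin(); It != C.rend(); ++It)
      Visit(*It);
  } else {
    for (auto &Elem : C)
      Visit(Elem);
  }
}

// Small helper that records the visiting order.
std::vector<int> collect(std::vector<int> &C, bool Reverse) {
  std::vector<int> Order;
  forEachConditionallyReversed(C, Reverse, [&](int V) { Order.push_back(V); });
  return Order;
}
```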
2025-12-08 15:28:09 +00:00
Matt Arsenault
886f54a04c DAG: Set MachinePointerInfo for stack when expanding divrem libcall (#170537) 2025-12-08 16:25:19 +01:00
Sameer Sahasrabuddhe
1ae957515c [AMDGPU][NFC] Update a comment about FLAT v/s LDSDMA
The change in #170263 does not do justice to common knowledge in the backend.
Fix the comment to reflect the relation between FLAT encoding, flat pointer
access, and LDSDMA operations.
2025-12-08 20:49:19 +05:30
Victor Chernyakin
a6fc5a1d77 [clang-tidy][NFC] Refactor fuchsia-multiple-inheritance (#171059) 2025-12-08 07:19:04 -08:00
Matt Arsenault
ce73cbb6ab clang: Use generic builtins in cuda complex builtins header (#171106)
There's no reason to use the ocml or nv prefixed functions and
maintain this list of alias macros. I left these macros in for
NVPTX in the scalbn and logb case, since those have a special
case hack in the AMDGPU codegen and probably do not work on ptx.
2025-12-08 16:16:24 +01:00
Dark Steve
cc19f420b9 [AMDGPU][NPM] Port AMDGPUArgumentUsageInfo to NPM (#170886)
Port AMDGPUArgumentUsageInfo analysis to the NPM to fix suboptimal code
generation when NPM is enabled by default.

Previously, DAG.getPass() returned nullptr when using NPM, causing the
argument usage info to be unavailable during ISel. This resulted in
fallback to FixedABIFunctionInfo which assumes all implicit arguments
are needed, generating unnecessary register setup code for entry
functions.

Fixes LLVM::CodeGen/AMDGPU/cc-entry.ll

Changes:
- Split AMDGPUArgumentUsageInfo into a data class and NPM analysis
wrapper
- Update SIISelLowering to use DAG.getMFAM() for NPM path
- Add RequireAnalysisPass in addPreISel() to ensure analysis
availability

This follows the same pattern used for PhysicalRegisterUsageInfo.
2025-12-08 20:38:00 +05:30
Tim Gymnich
0487154588 [mlir][amdgpu] Add workgroup_mask to MakeDmaDescriptorOp (#171103)
- add `workgroup_mask` and `early_timeout`
2025-12-08 16:02:18 +01:00
Luke Lau
e8219e5ce8 [VPlan] Use BlockFrequencyInfo in getPredBlockCostDivisor (#158690)
In 531.deepsjeng_r from SPEC CPU 2017 there's a loop that we
unprofitably loop vectorize on RISC-V.

The loop looks something like:

```c
  for (int i = 0; i < n; i++) {
    if (x0[i] == a)
      if (x1[i] == b)
        if (x2[i] == c)
          // do stuff...
  }
```

Because it's so deeply nested the actual inner level of the loop rarely
gets executed. However we still deem it profitable to vectorize, which
due to the if-conversion means we now always execute the body.

This stems from the fact that `getPredBlockCostDivisor` currently
assumes that blocks have 50% chance of being executed as a heuristic.

We can fix this by using BlockFrequencyInfo, which gives a more accurate
estimate of the innermost block being executed 12.5% of the time. We can
then calculate the probability as `HeaderFrequency / BlockFrequency`.
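
The ratio above can be illustrated with plain integers. This is a sketch only: `predBlockCostDivisor` here is a hypothetical free function taking raw frequency counts, not the VPlan member, and the clamping of zero frequencies is an assumption.

```cpp
#include <cstdint>

// Hypothetical stand-in for the divisor: how many header executions
// correspond to one execution of the predicated block.
std::uint64_t predBlockCostDivisor(std::uint64_t HeaderFreq,
                                   std::uint64_t BlockFreq) {
  if (BlockFreq == 0)
    BlockFreq = 1; // assumption: clamp to avoid dividing by zero
  std::uint64_t Divisor = HeaderFreq / BlockFreq;
  return Divisor == 0 ? 1 : Divisor; // never scale the cost up
}
```

With a header frequency of 1000 and a block frequency of 125 (12.5%), the divisor is 8, compared with 2 under the fixed 50% heuristic.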

Fixing the cost here gives a 7% speedup for 531.deepsjeng_r on RISC-V.

Whilst there are a lot of changes in the in-tree tests, this doesn't
affect llvm-test-suite or SPEC CPU 2017 that much:

- On armv9-a -flto -O3 there are 0.0%/0.2% more geomean loops vectorized
on llvm-test-suite/SPEC CPU 2017.
- On x86-64 -flto -O3 **with PGO** there are 0.9%/0% fewer geomean loops
vectorized on llvm-test-suite/SPEC CPU 2017.

Overall geomean compile time impact is 0.03% on stage1-ReleaseLTO:
https://llvm-compile-time-tracker.com/compare.php?from=9eee396c58d2e24beb93c460141170def328776d&to=32fbff48f965d03b51549fdf9bbc4ca06473b623&stat=instructions%3Au
2025-12-08 14:28:26 +00:00
Erich Keane
dd06214394 [OpenACC][CIR] Implement routine 'bind'-with-a-string lowering (#170916)
The 'bind' clause emits an attribute on the RoutineOp that states which
function it should call on the device side. When provided in
double-quotes, the function on the device side should be the exact name
given. This patch emits the IR to do that.

As a part of that, we add a helper function to the OpenACC dialect to do
so, as well as a version that adds the ID version (though we don't
exercise that yet).

The 'bind' with an ID should do the MANGLED name, but it isn't quite
clear what that name SHOULD be yet. Since the signature of a function is
included in its mangling, and we're not providing said signature, we
have to come up with something. This is left as an exercise for a future
patch.
2025-12-08 06:23:13 -08:00
Simon Pilgrim
bab4d1e8b2 [X86] shift-i512.ll - extend test coverage (#171125)
Remove v8i64 dependency from original shift-by-1 tests - this was added for #132601 but is unlikely to be necessary

Add tests for general shifts as well as shift-by-constant and shift-of-constant examples
2025-12-08 14:17:00 +00:00
Hongyu Chen
11866c499b [DAGCombiner] Don't peek through bitcast when checking isMulAddWithConstProfitable (#171056)
Fixes https://github.com/llvm/llvm-project/issues/171035
Peeking through bitcast may cause type mismatch between `AddNode` and
`ConstNode` in `isMulAddWithConstProfitable`.
2025-12-08 22:09:12 +08:00
Mend Renovate
f1af9b027e Update [Github] Update GHA Dependencies (#171064)
This PR contains the following updates:

| Package | Type | Update | Change | Pending |
|---|---|---|---|---|
| [actions/checkout](https://redirect.github.com/actions/checkout) | action | patch | `v6.0.0` -> `v6.0.1` | |
| [actions/setup-node](https://redirect.github.com/actions/setup-node) | action | minor | `v6.0.0` -> `v6.1.0` | |
| [github/codeql-action](https://redirect.github.com/github/codeql-action) | action | patch | `v4.31.5` -> `v4.31.6` | `v4.31.7` |
2025-12-08 06:06:43 -08:00
Aiden Grossman
f29f01db8f [Sanitizer] Bump soft_rss_limit_mb in test (#170911)
This test is failing on some buildbots now that the internal shell has
been turned on and was failing previously on some ppc bots when turning
it on a while back (before it got reverted).

At least one X86 bot is barely hitting the limit
(https://lab.llvm.org/buildbot/#/builders/174/builds/28487 224MB-235MB).

This likely needs to be bumped due to changes in the process tree (now
that we invoke things through python rather than a bash shell) with the
enablement of the internal shell.
2025-12-08 06:04:41 -08:00
David Spickett
7fbd443491 [lldb] Remove printf in breakpoint add command
Added in 2110db0f49 / #156067.
2025-12-08 13:53:55 +00:00
Mehdi Amini
c1d030e9a4 [MLIR][ExecutionEngine] Don't create a _mlir_ wrapper function for internal linkage (#171115)
This is somewhat NFC: we were creating wrappers for internal functions,
which are de facto not callable.
2025-12-08 14:42:00 +01:00
Jay Foad
07bafab83d [AMDGPU] Do not generate V_FMAC_DX9_ZERO_F32 on GFX12 (#171116)
GFX12 does not have the FMAC form of this instruction, only the FMA
form.

Fixes: #170437
2025-12-08 13:20:02 +00:00
Robert Imschweiler
33d779dfbf [OpenMP] Fix undefined symbol for Darwin builds (#170999)
cf.
https://github.com/llvm/llvm-project/pull/168554#issuecomment-3617253169
2025-12-08 14:15:39 +01:00
Adrian Vogelsgesang
7c832fca53 [lldb] Fix command line of target frame-provider register (#167803)
So far, the syntax was `target frame-provider register <cmd-options>
[<run-args>]`. Note the optional `run-args` at the end. They are
completely ignored by the actual command, but the command line parser
still accepts them.

This commit removes them.

This was probably a copy-paste error from `CommandObjectProcessLaunch`,
which was probably used as a blueprint for `target frame-provider
register`.
2025-12-08 13:14:41 +00:00
Gergely Bálint
a5e8e77f7c [BOLT][PAC] Warn about synchronous unwind tables (#165227)
BOLT currently ignores functions with synchronous PAuth DWARF info.
If more than 10% of functions get ignored for inconsistencies, we
should emit a warning to only use asynchronous unwind tables.
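
The 10% threshold can be written with integer arithmetic, as in the sketch below. `shouldWarnAboutSyncUnwindTables` and its parameters are hypothetical names, and how BOLT actually counts ignored functions is not shown here.

```cpp
#include <cstddef>

// Returns true when strictly more than 10% of the functions were ignored.
// Ignored / Total > 1 / 10 is rewritten as Ignored * 10 > Total so that
// no floating point is needed. An empty binary never triggers the warning.
bool shouldWarnAboutSyncUnwindTables(std::size_t Ignored, std::size_t Total) {
  return Total != 0 && Ignored * 10 > Total;
}
```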

See related issue: #165215
2025-12-08 13:34:48 +01:00
Mehdi Amini
60492898f8 [MLIR] Apply clang-tidy fixes for readability-identifier-naming in ShardOps.cpp (NFC) 2025-12-08 04:12:47 -08:00
Mehdi Amini
1bbff7290f [MLIR] Apply clang-tidy fixes for llvm-qualified-auto in VulkanRuntimeWrappers.cpp (NFC) 2025-12-08 04:12:47 -08:00
Tirthankar Mazumder
d94958b2f2 [InstCombine] Fold icmp samesign u{gt/lt} (X +nsw C2), C -> icmp s{gt/lt} X, (C - C2) (#169960)
Fixes #166973

Partially addresses #134028

Alive2 proof: https://alive2.llvm.org/ce/z/BqHQNN
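
Independently of the Alive2 proof, the fold in the title can be checked exhaustively on 8-bit values. The sketch below is not InstCombine code. It models `nsw` and `samesign` by skipping inputs that would be poison, and it additionally assumes that `C - C2` does not overflow, which is a precondition chosen for this sketch.

```cpp
#include <cstdint>

// Returns true if V is representable as a signed 8-bit integer.
bool fitsInI8(int V) { return V >= -128 && V <= 127; }

// Counts i8 inputs where
//   icmp samesign u{gt,lt} (X +nsw C2), C
// disagrees with
//   icmp s{gt,lt} X, (C - C2).
int countCounterexamples() {
  int Bad = 0;
  for (int X = -128; X <= 127; ++X) {
    for (int C2 = -128; C2 <= 127; ++C2) {
      int Sum = X + C2;
      if (!fitsInI8(Sum))
        continue; // nsw violated: the add would be poison
      for (int C = -128; C <= 127; ++C) {
        if ((Sum < 0) != (C < 0))
          continue; // samesign violated: the compare would be poison
        int NewC = C - C2;
        if (!fitsInI8(NewC))
          continue; // assumption of this sketch: C - C2 must not overflow
        // Unsigned view of the original operands.
        unsigned USum = static_cast<std::uint8_t>(Sum);
        unsigned UC = static_cast<std::uint8_t>(C);
        if ((USum > UC) != (X > NewC))
          ++Bad; // ugt form disagrees with sgt form
        if ((USum < UC) != (X < NewC))
          ++Bad; // ult form disagrees with slt form
      }
    }
  }
  return Bad;
}
```

Under these preconditions the function is expected to find no disagreement, since `X + C2 > C` and `X > C - C2` are equivalent when neither side overflows.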
2025-12-08 13:05:37 +01:00
Simon Pilgrim
3a6781ea4d [X86] vector-shuffle-combining-avx512f.ll - add tests showing failure to simplify expand/compress nodes (#171113) 2025-12-08 12:02:43 +00:00