18653 Commits

3e2f5492aa Changes to MismatchDetected
MismatchDetect wasn't detecting a type size mismatch in this case:

```llvm
%0 = alloca [2 x double]
%1 = getelementptr inbounds [2 x double], ptr %0, i64 0, i64 0
%2 = load <2 x i32>, ptr %1
```

It was comparing the number of bits of the load instruction's type
(<2 x i32>) to the allocated bits of the alloca's scalar type (double),
so the size mismatch went undetected because 64 == 64. I've changed the
approach to use the LLVM type method getScalarSizeInBits() to compare
scalar sizes, similarly to what was done in the typed pointers path (see
SOALayoutChecker::visitBitCastInst). Refactored control flow.
2025-10-06 13:00:08 +02:00
e4b64c1e55 [Autobackout][FunctionalRegression]Revert of change: d062d53b0d: CodeGenContext changes wrt LLVMContext
Do not leak LLVMContext in InitLLVMContextWrapper
Recreate LLVMContext in resetOnRetry
2025-10-05 08:35:15 +02:00
d062d53b0d CodeGenContext changes wrt LLVMContext
Do not leak LLVMContext in InitLLVMContextWrapper
Recreate LLVMContext in resetOnRetry
2025-10-04 21:31:43 +02:00
dc026d107b Internal performance refinements
Improvements to internal analysis and data handling to reduce overhead.
2025-10-04 02:19:20 +02:00
51cd032253 Bump MINOR to 21 2025-10-03 14:49:28 +02:00
8b7ee37523 Adding support for opaque pointers in HandleSpirvDecorationMetadata
For opaque pointers we cannot deduce a type for a `Prefetch` call with `ptr`.
We need this information to create appropriate builtins. To achieve this,
we can use the `llvm::demangle()` function and find the appropriate type.

This change is compatible with both typed and opaque pointers.
v2.21.0
2025-10-03 12:25:49 +02:00
cf6d8b2620 Add new builtin pattern in ProcessFuncAttributes
Add new builtin pattern in ProcessFuncAttributes
2025-10-03 11:59:32 +02:00
b26f997728 [Autobackout][FunctionalRegression]Revert of change: e6a8ee602b: Changes in code
Changes in code
2025-10-03 03:08:09 +02:00
72755ca3a6 Replace EATOMIC_IADD with EATOMIC_INC and EATOMIC_DEC when
immediate is 1 or -1 as increment or decrement

When a shader does typed atomics with typed, untyped atomics with ugm,
or untyped atomics with slm, and the atomic operation is just an
increment or decrement with an immediate of 1 or -1, we can replace
EATOMIC_IADD with EATOMIC_INC(2) or EATOMIC_DEC(3).
2025-10-03 02:45:09 +02:00
c308953542 Small Cmake change for build
Small Cmake change for build
2025-10-03 02:35:16 +02:00
52bed8b3b1 Refactor optimization handling in new inline raytracing
Refactor optimization handling in new inline raytracing to defer modifying the function until we are done with all liveness objects.
This way, we don't invalidate liveness objects and avoid costly recalculations.
2025-10-03 00:47:30 +02:00
f858b9191c Fix vISA assertion "operand with mme must be GRF-aligned"
Fix vISA assertion "operand with mme must be GRF-aligned"
2025-10-02 23:39:04 +02:00
cbac959bf1 Purge unused instructions after value remapping in new inline raytracing
Purge unused instructions after value remapping in new inline raytracing
2025-10-02 20:45:24 +02:00
21bae4bef7 Use cross block load vectorization for new inline raytracing by default
Use cross block load vectorization for new inline raytracing by default
2025-10-02 19:42:19 +02:00
7bcf614eea Minor fixes and refactors. 2025-10-02 19:33:55 +02:00
5a62655cc4 Improve (and fix) cross block load vectorization path
Cross block load vectorization works on the assumption that, within a single block, we can preload the rstack data for multiple rayinfo calls without drastically increasing overall register pressure.
This lets us cull a lot of sends (applications will usually cluster rayinfo calls within a single block).
The first implementation was flawed though. It didn't take into account the following things:
1. Some instructions will write to the stack (like TraceRayInline). This will make the shadow copy stale.
2. RayInfo instructions will create their own blocks when lowered. This will affect basic block -> stack pointer mapping, creating more shadow copies and unnecessary loads.

1. is fixed by splitting the block after instructions that write to the stack.
2. is fixed by collecting ray info instructions first, assigning stack pointers to them, and only then lowering them.
2025-10-02 19:26:20 +02:00
e6a8ee602b Bump MINOR to 21 2025-10-02 11:12:47 +02:00
fdc6004c85 [Autobackout][FunctionalRegression]Revert of change: 4262034272: Purge unused instructions after value remapping in new inline raytracing
Purge unused instructions after value remapping in new inline raytracing
2025-10-02 04:45:00 +02:00
2c373057d5 [Autobackout][FunctionalRegression]Revert of change: 36cd3f0809: Enable Loop unrolling promotion for Alloc for some
platforms

Enable Loop unrolling promotion for Alloc for some platforms
2025-10-02 02:29:41 +02:00
e1467856fe Pre-assign GRF to spillHeader in fail safe RA iteration
spillHeader may be used to store the offset for a spill/fill
instruction. It must be an infinite-spill-cost variable. If spillHeader
gets assigned to a register that causes fragmentation, previously
spilled variables may fail to get an allocation in the fail safe RA
iteration.

With this change, we find the first GRF candidate that can be assigned
to spillHeader. This way, we avoid fragmenting free GRF space.
2025-10-02 00:43:53 +02:00
6105eaf90a Fix RTStack2 size used for memcpy in cross block load vectorization path
Fix RTStack2 size used for memcpy in cross block load vectorization path
2025-10-02 00:29:12 +02:00
8653a04dbe User function call detection changes
User function call detection changes
2025-10-01 20:40:20 +02:00
e07d2a44ef Remove RoundChunkSize assertion checks
These assertion checks are not applicable to non-LSC enabled platforms.
The rounding is already in place for LSC case.
2025-10-01 19:06:53 +02:00
4262034272 Purge unused instructions after value remapping in new inline raytracing
Purge unused instructions after value remapping in new inline raytracing
2025-10-01 18:31:49 +02:00
6fbca28d93 Minor fixes and refactors.
Add std::move if it's at the end of variable's scope.

Small refactor + typo fix in tryPrintLabel lambda.

Replace `construct -> push_back` pattern with emplace_back.

Change args to const ref where it makes sense.
2025-10-01 14:22:10 +02:00
f728b0f9f6 [Autobackout][FunctionalRegression]Revert of change: 973b78365f: Use cross block load vectorization for new inline raytracing by default
Use cross block load vectorization for new inline raytracing by default
2025-10-01 00:46:32 +02:00
df9c8fde8c Fix scalarized GEP index for reinterpreted vectors in LowerGEPForPrivMem
The old scalarization advanced the GEP scalarized index by the number of
smaller vector elements when a GEP indexed through a reinterpreted
vector whose lane size differed from the promoted lane. This
over-advanced the index (e.g. using 8 for <8 x i32> over double lanes
instead of 4), producing incorrect accesses.

The fix:
- Track the promoted lane byte size (m_promotedLaneBytes) in
TransposeHelper and set it in TransposeHelperPromote's constructor.
- In TransposeHelper::getArrSizeAndEltType, when a vector is a
reinterpret of the promoted storage, compute the increment as
vector_byte_size / m_promotedLaneBytes instead of
vector_byte_size / small_element_size.
2025-10-01 00:43:14 +02:00
36cd3f0809 Enable Loop unrolling promotion for Alloc for some
platforms

Enable Loop unrolling promotion for Alloc for some platforms
2025-09-30 23:27:52 +02:00
ea58eb22d7 Restore comment layout
Restore comment layout
2025-09-30 21:45:45 +02:00
ca8161f7e0 Modify Integer MAD Pattern Matching
Modify Integer MAD pattern matching to catch more cases.
2025-09-30 18:46:10 +02:00
6b02723b76 IGC_StackOverflowDetection documentation
Add StackOverflowDetection.md documentation.
2025-09-30 16:46:37 +02:00
8929bc7dbe Minor fixes and refactors. 2025-09-30 16:02:57 +02:00
98feff653c [Autobackout][FunctionalRegression]Revert of change: 025e40ef8f: IGC_StackOverflowDetection documentation
Add StackOverflowDetection.md documentation.
2025-09-30 05:28:24 +02:00
973b78365f Use cross block load vectorization for new inline raytracing by default
Use cross block load vectorization for new inline raytracing by default
2025-09-30 05:25:35 +02:00
d6a14e51b8 Put SplitIndirectEEtoSel back in working order after MemOpt
Put SplitIndirectEEtoSel back in working order after MemOpt
2025-09-29 23:19:02 +02:00
56b3435453 Changes in code. 2025-09-29 19:45:15 +02:00
0513242371 waveall-stub and wider dependency window inside IGCVectorizer
waveall-stub and wider dependency window inside IGCVectorizer
2025-09-29 18:58:25 +02:00
025e40ef8f IGC_StackOverflowDetection documentation
Add StackOverflowDetection.md documentation.
2025-09-29 12:24:38 +02:00
d90ef22b8b [LLVM16] Fixing memory issue during TargetExtensionType resolution
When calling `NewF->setName(OriginalName);`
setName under the hood performs `destroyValueName();` which invalidates OriginalName.

It resulted in such IR:

`define spir_kernel void @"\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD"(ptr addrspace(1)...)`

or

`%14 = call <2 x i32> @"\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD\DD"(ptr addrspace(1) %input...)`
2025-09-29 11:35:33 +02:00
06bdb48bf1 Skip spill coalescing when payload is address taken
When the payload of a spill intrinsic is address taken, we cannot simply
replace the virtual register with a temporary coalesced range, because
address-taken variables can have indirect defs.

Treat such cases as not spill-coalesceable.

This is a functional fix.
2025-09-29 07:18:03 +02:00
5572ee373a [Autobackout][FunctionalRegression]Revert of change: c3870c8a5b: Replace EATOMIC_IADD with EATOMIC_INC and EATOMIC_DEC when
immediate is 1 or -1 as increment or decrement

When a shader does typed atomics with typed, untyped atomics with ugm,
or untyped atomics with slm, and the atomic operation is just an
increment or decrement with an immediate of 1 or -1, we can replace
EATOMIC_IADD with EATOMIC_INC(2) or EATOMIC_DEC(3).
2025-09-28 09:53:08 +02:00
69e5a51b53 Fix assert when not compiling compute input
Fix assert when not compiling compute input
2025-09-28 06:26:42 +02:00
c3870c8a5b Replace EATOMIC_IADD with EATOMIC_INC and EATOMIC_DEC when
immediate is 1 or -1 as increment or decrement

When a shader does typed atomics with typed, untyped atomics with ugm,
or untyped atomics with slm, and the atomic operation is just an
increment or decrement with an immediate of 1 or -1, we can replace
EATOMIC_IADD with EATOMIC_INC(2) or EATOMIC_DEC(3).
2025-09-27 04:30:20 +02:00
1803372348 Changes in code. 2025-09-26 21:16:39 +02:00
b6e17077e1 Fix Typo
Fix Typo
2025-09-26 19:01:51 +02:00
32e337f945 [Autobackout][FunctionalRegression]Revert of change: 840a25e7ab: Replace EATOMIC_IADD with EATOMIC_INC and EATOMIC_DEC when
immediate is 1 or -1 as increment or decrement

When a shader does typed atomics with typed, untyped atomics with ugm,
or untyped atomics with slm, and the atomic operation is just an
increment or decrement with an immediate of 1 or -1, we can replace
EATOMIC_IADD with EATOMIC_INC(2) or EATOMIC_DEC(3).
2025-09-26 04:55:54 +02:00
840a25e7ab Replace EATOMIC_IADD with EATOMIC_INC and EATOMIC_DEC when
immediate is 1 or -1 as increment or decrement

When a shader does typed atomics with typed, untyped atomics with ugm,
or untyped atomics with slm, and the atomic operation is just an
increment or decrement with an immediate of 1 or -1, we can replace
EATOMIC_IADD with EATOMIC_INC(2) or EATOMIC_DEC(3).
2025-09-26 02:10:03 +02:00
148f0bd94e Enable Loads via LSC
Enable loads via LSC
2025-09-25 20:45:30 +02:00
0613f29a34 Changes in code. 2025-09-25 20:30:52 +02:00
65c78717b4 Revert "Move SplitIndirectEEtoSel after MemOpt"
Revert "Move SplitIndirectEEtoSel after MemOpt"
2025-09-25 18:19:48 +02:00