18653 Commits

Author SHA1 Message Date
48d72f91a2 Another pattern match for packing <4 x i8> values.
This PR adds detection for one more pattern that packs four 8-bit integer values
into a single 32-bit value.
2025-10-15 11:27:50 +02:00
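For context on the entry above: packing a `<4 x i8>` vector into one 32-bit value usually comes down to a shift-and-or chain. The commit does not say which pattern it matches or which byte order it uses, so the following is only a minimal Python sketch of the arithmetic, assuming element 0 lands in the least significant byte. `pack_4xi8` and `unpack_4xi8` are hypothetical names, not IGC functions.

```python
def pack_4xi8(values):
    """Pack four 8-bit integers into one 32-bit value.

    Assumption (not from the commit): element 0 occupies the lowest byte.
    """
    if len(values) != 4:
        raise ValueError("expected exactly four elements")
    packed = 0
    for index, value in enumerate(values):
        # Mask to 8 bits so negative (signed i8) inputs wrap to their byte value.
        packed |= (value & 0xFF) << (8 * index)
    return packed


def unpack_4xi8(packed):
    """Inverse of pack_4xi8: split a 32-bit value into four unsigned bytes."""
    return [(packed >> (8 * index)) & 0xFF for index in range(4)]
```

With this layout, `pack_4xi8([0x11, 0x22, 0x33, 0x44])` gives `0x44332211`, and `unpack_4xi8` recovers the original bytes.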
6141525b6f Refactor: cache pointer size in DwarfDebug
Capture the target pointer size once during DwarfDebug construction.
2025-10-15 10:15:25 +02:00
0fe2acfbb4 src2 acc support
For single precision float instruction only
2025-10-15 08:28:51 +02:00
b3e1d9a27b IGA SWSB: Refactor dpas macro builder
Removed DpasMacroBuilder::getSuppressionBlockCandidate. Now the dpas
macro is formed until a dpas is seen that cannot be in a macro, even
if there is no suppression opportunity, i.e. no sources are the same
within the macro. There is no performance drawback doing so. This also
aligns with vISA's dpas macro logic.
2025-10-15 01:18:30 +02:00
025d95a026 Changes in code. 2025-10-14 18:52:58 +02:00
ca6ae8ab9c Fix incorrect fc register offset while saving in SIP
Fix incorrect fc register offset while saving in SIP
2025-10-14 16:19:23 +02:00
a20b78cbf7 Rematerialize runtime_value intrinsics
This change is to rematerialize `runtime_value` instructions.
2025-10-14 15:57:28 +02:00
ecb7315c86 Fix DIStringType length emitting in DWARF
In some cases, DW_AT_string_length was not added to the
variable, which resulted in a VLA being treated as a character.
2025-10-14 15:34:19 +02:00
9d5bec3237 Additional LLVM patching
Add necessary LLVM patches.
2025-10-14 14:05:15 +02:00
7d69a9f62e [Autobackout][FunctionalRegression]Revert of change: 4235871241: Lower loads using PHI instructions
Lowers loads using PHI instructions to incoming blocks to avoid
    unnecessary address space casts.
2025-10-13 21:34:56 +02:00
3cf5b6bc7b Remove redundant guard for a pattern of global imm offsets
This simplifies the control flow for a pattern of global imm offsets.
2025-10-13 18:49:36 +02:00
c44094d341 Clean up code 2025-10-13 18:48:05 +02:00
4235871241 Lower loads using PHI instructions
Lowers loads using PHI instructions to incoming blocks to avoid
unnecessary address space casts.
2025-10-13 16:50:07 +02:00
12448203f5 [Autobackout][FunctionalRegression]Revert of change: 393114b7a2: Replace AtomicSyncAtomic Fence with GenISA_source_value try 2
Replace AtomicSyncAtomic Fence with GenISA_source_value try 2
2025-10-11 05:36:42 +02:00
ab9be4bab1 [Autobackout][FunctionalRegression]Revert of change: 72b78e92ff: IGA SWSB: Refactor dpas macro builder
Removed DpasMacroBuilder::getSuppressionBlockCandidate. Now the dpas
    macro is formed until a dpas is seen that cannot be in a macro, even
    if there is no suppression opportunity, i.e. no sources are the same
    within the macro. There is no performance drawback doing so. This also
    aligns with vISA's dpas macro logic.
2025-10-10 21:48:50 +02:00
816436eff5 Disable LdShrink for smaller than 32-bit types
Do not shrink loads smaller than 32-bit, as some code patterns can
cause non-aligned loads that degrade performance.
2025-10-10 20:30:55 +02:00
aca51efd6b Prevent opencl-clang from automatically enabling extensions
Prevent opencl-clang from automatically enabling extensions by undefining __SPIR__/__SPIRV__
macros. This way we only enable extensions that are passed to IGC from NEO, which are extensions
supported by the device we compile code for.

This change also enables the cl_khr_integer_dot_product extension in OpenCL C < 3.0.
2025-10-10 19:16:37 +02:00
9a2d64427b Minor fixes and refactors.
- Pass string by const ref where it makes sense.

- Construct + push_back -> emplace_back.

- Move NewOutputArgs instead of copying.

- const auto to const auto ref where possible.

- dyn_cast to cast where the cast is certain not to fail.
2025-10-10 18:39:17 +02:00
442b357362 Fixed the dead code issue reported by coverity
Fixed the dead code issue reported by coverity
2025-10-10 18:30:02 +02:00
7d3682fc47 Disable InterpreterPatternMatching
Disable InterpreterPatternMatching by default for performance
reasons. Recompilation will still be triggered in some cases, but now
based on register pressure.
2025-10-10 14:27:22 +02:00
24c1b8353d Bump MINOR to 22 2025-10-10 13:25:20 +02:00
260cf1b4cd [Autobackout][FunctionalRegression]Revert of change: 22d9a8ee99: Prevent opencl-clang from automatically enabling extensions
Prevent opencl-clang from automatically enabling extensions by undefining __SPIR__/__SPIRV__
    macros. This way we only enable extensions that are passed to IGC from NEO, which are extensions
    supported by the device we compile code for.

    This change also enables the cl_khr_integer_dot_product extension in OpenCL C < 3.0.
2025-10-10 03:31:12 +02:00
04e034c396 Implement workaround of no atomic write combined instruction as the first instruction of kernel
Implement workaround of no atomic write combined instruction as the first instruction of kernel
2025-10-10 02:00:07 +02:00
72b78e92ff IGA SWSB: Refactor dpas macro builder
Removed DpasMacroBuilder::getSuppressionBlockCandidate. Now the dpas
macro is formed until a dpas is seen that cannot be in a macro, even
if there is no suppression opportunity, i.e. no sources are the same
within the macro. There is no performance drawback doing so. This also
aligns with vISA's dpas macro logic.
2025-10-10 01:45:45 +02:00
393114b7a2 Replace Atomic Fence with GenISA_source_value try 2
Replace Atomic Fence with GenISA_source_value try 2
2025-10-09 23:02:16 +02:00
9a528053e5 Refactor OverlapsWith to use ContainsInstruction
Refactor OverlapsWith to use ContainsInstruction
2025-10-09 21:48:01 +02:00
2a1debbe4b Refactor AllocationLivenessAnalyzer to use SetVector instead of DenseSet
DenseSet would be preferred since it is a more mature container type than SetVector.
Unfortunately, iteration over lifetime ends and lifetime edges should be deterministic and only SetVector guarantees that.
2025-10-09 20:21:58 +02:00
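The determinism point in the entry above can be illustrated outside LLVM. A SetVector-style container pairs a set (for fast membership tests) with a vector (for stable iteration order). The following Python class is a rough sketch of that idea, not LLVM's implementation; `InsertionOrderedSet` is a hypothetical name.

```python
class InsertionOrderedSet:
    """Set with deterministic, insertion-ordered iteration.

    Sketch of the set-plus-vector idea: membership checks go through a
    hash set, while iteration walks a list in insertion order.
    """

    def __init__(self):
        self._seen = set()
        self._order = []

    def insert(self, item):
        """Add item if absent; return True only when it was newly inserted."""
        if item in self._seen:
            return False
        self._seen.add(item)
        self._order.append(item)
        return True

    def __contains__(self, item):
        return item in self._seen

    def __iter__(self):
        # Iteration order depends only on insertion order, never on hashing.
        return iter(self._order)

    def __len__(self):
        return len(self._order)
```

Iterating such a container visits elements in the order they were first inserted, so any output derived from the iteration is reproducible from run to run.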
ae461354fd Use data structures in STB_TranslateOutputArgs.
This commit changes STB_TranslateOutputArgs to use data structures
instead of pointers to arrays of char. This is done for the purpose of
preventing memory leaks.

For `pOutput` field, llvm::SmallVector is used, as it works with
ZEBinaryBuilder::getBinaryObject(llvm::raw_pwrite_stream).
2025-10-09 14:57:26 +02:00
d53ffddcbc Adjust ConvertUserSemanticDecoratorOnFunctions pass for opaque-pointers
When handling annotations with opaque pointers, a single getOperand() call
on the annotation struct is enough, because no bitcast instruction has to be
looked through, unlike with typed pointers.
2025-10-09 14:08:38 +02:00
5f3b2b4c5a Prevent fast math flag propagation to __spirv_ocl_exp builtin implementation
When `-cl-fast-relaxed-math` is enabled, IGC computes e^x using 2^x by
calculating 2^(x * log2(e)), where log2(e) is `M_LOG2E_F` (≈ 1.44269504).

For `exp(a * b)`, IGC transforms this to `exp2((a * b) * M_LOG2E_F)`.
The compiler must preserve the original multiplication order to avoid
overflow in critical cases.

Critical case: When `a` is large (e.g., `FLOAT_MAX`) and `b` is `0`:
- Correct order: `(a * b) * M_LOG2E_F` = `(FLOAT_MAX * 0) * M_LOG2E_F` = `0`
- Wrong order: `(a * M_LOG2E_F) * b` = `(FLOAT_MAX * M_LOG2E_F) * 0` = `INF * 0` = `NaN`

This change ensures that fast math flags are not applied to the
(x * M_LOG2E_F) multiplication in the exp builtin implementation,
preventing reordering optimization in `CustomUnsafeOptPass.cpp` that
could lead to incorrect results.

The multiplication by `M_LOG2E_F` now happens right before passing
the value to the math.exp instruction, preserving the original
multiplication order.
2025-10-09 13:46:54 +02:00
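The critical case in the entry above can be reproduced with ordinary IEEE-754 arithmetic. The sketch below uses Python doubles as a stand-in for the 32-bit floats the commit discusses (an assumption made for portability; the overflow mechanism is the same), and it only computes the argument that would be handed to the base-2 exponential. The function names are hypothetical.

```python
import math
import sys

# log2(e), the double-precision analogue of M_LOG2E_F.
LOG2E = math.log2(math.e)


def exp2_argument_original_order(a, b):
    """Argument for exp2 when the source multiplication a * b is kept first."""
    return (a * b) * LOG2E


def exp2_argument_reordered(a, b):
    """Argument for exp2 after an unsafe reassociation of the multiplies."""
    return (a * LOG2E) * b


# Largest finite double stands in for the largest finite float.
largest = sys.float_info.max
good = exp2_argument_original_order(largest, 0.0)
bad = exp2_argument_reordered(largest, 0.0)
```

Here `good` is `0.0`, because `largest * 0.0` is computed first. `bad` is NaN, because `largest * LOG2E` overflows to infinity and infinity times zero is NaN.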
9b31b6b98f Fixing GetOrInsert function
Fixing GetOrInsert function invoked in Emu64Ops pass.
2025-10-09 12:54:11 +02:00
22d9a8ee99 Prevent opencl-clang from automatically enabling extensions
Prevent opencl-clang from automatically enabling extensions by undefining __SPIR__/__SPIRV__
macros. This way we only enable extensions that are passed to IGC from NEO, which are extensions
supported by the device we compile code for.

This change also enables the cl_khr_integer_dot_product extension in OpenCL C < 3.0.
2025-10-09 12:30:07 +02:00
92e5116459 Turn invalid asm constraints assert into error
When processing inline vISA, IGC checks that inputs match constraints.
Before this change, if a check failed, the compiler would silently drop the
instruction or, if built in debug mode, trigger an assert.

After this change, the compiler will throw an error regardless of the build type.
2025-10-09 11:21:53 +02:00
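The behavioral difference described in the entry above (an assert that exists only in debug builds versus an error raised in every build) has a close analogue in Python, where `assert` statements are removed under `python -O`. The sketch below is an illustration only: the constraint table, `ConstraintError`, and `check_operand` are made up and do not reflect IGC's actual constraint handling.

```python
class ConstraintError(Exception):
    """Raised when an inline-asm operand violates its constraint."""


# Hypothetical constraint table, invented for this illustration.
ALLOWED_KINDS = {
    "r": {"register"},
    "i": {"immediate"},
    "ri": {"register", "immediate"},
}


def check_operand(constraint, operand_kind):
    """Validate one operand and fail loudly in every build configuration."""
    allowed = ALLOWED_KINDS.get(constraint)
    if allowed is None:
        raise ConstraintError(f"unknown constraint '{constraint}'")
    if operand_kind not in allowed:
        # An `assert` here would vanish under `python -O`, much like a C++
        # assert compiled out of a release build; an exception does not.
        raise ConstraintError(
            f"operand of kind '{operand_kind}' does not satisfy '{constraint}'"
        )
```

A valid pairing such as `check_operand("r", "register")` returns normally, while a mismatch raises `ConstraintError` whether or not optimizations are enabled.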
b7478f2c14 Some code refactor to atomic_inc conversion and AIL flag use
Replacing EATOMIC_IADD with EATOMIC_INC and EATOMIC_DEC seems to
have an unexpected performance impact on some workloads. Use the AIL flag
to bail out of the incompatible use case.
2025-10-09 01:55:00 +02:00
73b524f9b2 [Autobackout][FunctionalRegression]Revert of change: 7cf809e12d: Update RetryManager
Add SetSpillCost and GetLastSpillCost
2025-10-08 12:46:30 +02:00
e5f2c15e95 Revert "[LLVM16][StatelessToStateful] Case where BUFFER_OFFSET doesn't seem to be 0"
Revert "[LLVM16][StatelessToStateful] Case where BUFFER_OFFSET doesn't seem to be 0"
2025-10-08 11:10:01 +02:00
a6a3160415 Don't compare pointer types on opaque pointers in
`SOALayoutChecker::visitBitCastInst()`

`SOALayoutChecker::visitBitCastInst()` assumed we're on typed pointers
and tried to get pointer types, triggering LLVM assert on opaque
pointers.
2025-10-08 09:53:13 +02:00
e3be8fe2a5 Handle metadata users of old function after legalization
Just replace these users with poison instead of dropping them
2025-10-08 00:15:52 +02:00
353f89eeaa Internal metadata changes
Internal metadata changes
2025-10-08 00:10:05 +02:00
053ebf897c [Autobackout][FunctionalRegression]Revert of change: 9379c8d1de: Enable the use of load_status.tgm
This PR enables the use of load_status.tgm.
2025-10-07 23:55:45 +02:00
7cf809e12d Update RetryManager
Add SetSpillCost and GetLastSpillCost
2025-10-07 20:22:58 +02:00
76b4d1bfe2 Refresh workaround files
Refreshes workaround-related files.
2025-10-07 15:01:17 +02:00
900059c7f6 Adjust ocloc/LIT tests for opaque and typed pointers
Adjust ocloc/LIT tests for opaque and typed pointers
2025-10-07 12:34:56 +02:00
9379c8d1de Enable the use of load_status.tgm
This PR enables the use of load_status.tgm.
2025-10-07 12:09:02 +02:00
cc9353f122 Set opaque-pointers to OFF by default
Set opaque-pointers to OFF by default
2025-10-07 11:25:11 +02:00
fc97dc4826 Remove coherence hint function
Remove coherence hint function
2025-10-07 01:23:57 +02:00
74bdcb383f CodeGenContext changes wrt LLVMContext
Do not leak LLVMContext in InitLLVMContextWrapper
Recreate LLVMContext in resetOnRetry
2025-10-06 21:37:42 +02:00
ef6c2002b2 Enable Loop unrolling promotion for Alloc for some
platforms

Enable Loop unrolling promotion for Alloc for some platforms
2025-10-06 20:42:56 +02:00
6adbbc856b Minor fixes and refactors. 2025-10-06 16:02:38 +02:00
d895809a52 Add no-inline and targeted code removal for testing
Adds a new pass to disable inlining, and passes to remove/drop specific
functions and basic blocks.
This is for testing purposes only.
2025-10-06 14:18:52 +02:00