Commit Graph

18372 Commits

Author SHA1 Message Date
cee4ac4e9e Retype TargetExtTy return types of function declarations
Clang 16 still lowers OpenCL/SPIR-V built-ins as ptr to opaque structs,
while SPIR-V Reader uses TargetExtTy values. This patch extends the
retyping function to also retype the return types of function (builtins)
declarations. Please note that the builtin function resolution is already
done earlier by SPIR-V Reader.
2025-07-17 00:53:34 +02:00
0d42d6f769 Apply clang-format
As GitHub enables clang-format, convert files to be compliant with
clang-format.

No functional change
2025-07-16 21:12:24 +02:00
2278af1cfe Fixing ShaderOverride case path on Linux
"ShaderOVerride" => "ShaderOverride"
2025-07-16 20:18:10 +02:00
b727832427 Refactor SIMDInfo
Refactor SIMDInfo
2025-07-16 18:43:54 +02:00
890b8bf021 Bitcast in StatelessToStateful pass
Add support for bitcast instruction into StatelessToStateful pass.
2025-07-16 17:02:35 +02:00
0e204e2473 Enable SIMD16 drop for more platforms
Enable abort on spills to SIMD16 for more platforms.
2025-07-16 13:56:05 +02:00
317bdf1b8d Enable LVN matching for And
Also change emit pattern for wave shuffles with same index.
With this LVN can deduplicate same index calculations (and+Shl) for
wave shuffles.
2025-07-16 02:04:13 +02:00
cb9e6e5daf Refactoring
Clarify comments for vector alias and remove dead
code.

No Functional change
2025-07-16 01:52:20 +02:00
f69d41cd42 Fix spill threshold bug 2025-07-15 22:21:54 +02:00
6b8e1e040c Move PromoteToPredicatedMemoryAccess pass to OPT stage
Move the PromoteToPredicatedMemoryAccess pass to
the optimization stage of the compiler.
This allows standard LLVM passes to optimize the IR before
the predication pass is applied.
Change the pass to support scalarized loads and stores.
Add a new pass to hoist conversion operations
to the common dominator to unblock the predication pass.
Fix generation of predicated stores, in case address is uniform
and stored value is not.
2025-07-15 21:15:57 +02:00
d5bdfba670 Disable dst LIFETIME_START in resource loop header
Disable LIFETIME_START for destination in resource loop header until
some recently found issues are resolved.
2025-07-15 20:48:59 +02:00
07078a445c Fix docker file due to docker image EOL
The IGC scripts are using "debian buster" for the base docker image,
which seems to be EOL'd and is not available.
2025-07-15 19:06:20 +02:00
19a9fc83de Rematerialize pointer and coordinates in InsertBranchOpt (#23159)
This change is to fix issues with pattern matching of the pointer extraction in `InsertBranchOpt`. This rematerializes pointer and coordinates of typed access insts.

Co-authored-by: Andrzejewski, Krystian <krystian.andrzejewski@intel.com>
2025-07-15 12:30:18 +02:00
398538e95f [Autobackout][FunctionalRegression]Revert of change: cebdde95e5: Enable SIMD16 drop for more platforms
Enable abort on spills to SIMD16 for more platforms.
2025-07-12 02:13:47 +02:00
495e061d87 Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
2025-07-11 23:07:02 +02:00
9d7bfe9724 Reduce the SWSB compilation time when there is subroutine
For subroutines, there is no need to add a live-out dependence of the call BB
2025-07-11 22:54:21 +02:00
957a0090cb Upgrade IGC C++ standard from 17 to 20
Upgrade IGC C++ standard from 17 to 20
2025-07-11 20:32:30 +02:00
fb2c3b0df4 Create a new CSE to remove redundant WaveBallot
Create a new CSE to remove redundant WaveBallot for performance.
2025-07-11 19:00:41 +02:00
4c0406ad9f Clean up dead code related to patch token binary format deprecation
Cleaned up dead code that's related to patch token binary format deprecation. Removed unused code, adjusted some comments.
Most of these changes are related to previous commits that deprecated the format in VC and OCL.

Some parts are still to be refactored; this doesn't cover all patch token code.
2025-07-11 14:45:04 +02:00
8530f90229 Add internal option to turn off PVCSendWARWA
Add internal options `-cl-intel-disable-sendwarwa` and `-ze-opt-disable-sendwarwa`
to turn off PVCSendWARWA
2025-07-11 12:28:48 +02:00
599485a610 [Autobackout][FunctionalRegression]Revert of change: 16e7042597: Reduce the SWSB compilation time when there is subroutine
For subroutines, there is no need to add a live-out dependence of the call BB
2025-07-11 11:36:58 +02:00
b298db3849 PrivateMemoryResolution - fixing cases where elementSize assert was failing
When adding Opaque Pointers support to JointMatrix, I found that 4 tests were failing due to this assert:

	info: error, assertion failed: bits == elementSize
	file: Source\IGC\Compiler\Optimizer\OpenCLPasses\PrivateMemory\PrivateMemoryResolution.cpp
	function: TransposeHelperPrivateMem::handleLoadInst
	line: 665

	Failed Tests (4):
	  SYCL :: Matrix/SG32/joint_matrix_bf16_fill_k_cache_unroll.cpp
	  SYCL :: Matrix/SG32/joint_matrix_bf16_fill_k_cache_unroll_init.cpp
	  SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_unroll.cpp
	  SYCL :: Matrix/joint_matrix_bf16_fill_k_cache_unroll_init.cpp

My investigation showed that the following resolution path:

alloca -> gep -> load

used an invalid vector element count, which caused this assert to fail.
To my understanding, the reason was that we used the elementSize saved in the "TransposeHelperPrivateMem" instance,
but the saved value wasn't updated as we walked through the instructions (alloca->gep->load), so there was a mismatch.
2025-07-11 09:18:13 +02:00
16e7042597 Reduce the SWSB compilation time when there is subroutine
For subroutines, there is no need to add a live-out dependence of the call BB
2025-07-11 08:15:13 +02:00
5e53ab2577 [Autobackout][FunctionalRegression]Revert of change: 6bfa6f3fed: Create a new CSE to remove redundant WaveBallot
Create a new CSE to remove redundant WaveBallot for performance.
2025-07-11 04:30:51 +02:00
2dd58f4311 Fix non-determinism in metadata
Fix non-determinism in metadata
2025-07-11 03:44:37 +02:00
9bd31a245e [Autobackout][FunctionalRegression]Revert of change: 5eee6f4686: Reduce the SWSB compilation time when there is subroutine
For subroutines, there is no need to add a live-out dependence of the call BB
2025-07-11 01:10:11 +02:00
c62187734e Revert "Add EarlyCSE and fix perf regressions" 2025-07-11 01:08:23 +02:00
18ee6a4765 Change UseNewInlineRaytracing to a mask to allow selective enabling
Changes:
* UseNewInlineRaytracing is now a mask that lets the user selectively enable new inline raytracing for a particular shader type
* New regkey AddDummySlotsForNewInlineRaytracing forces an increased number of slots required for rayqueries, to test whether the UMD allocated the necessary HW stacks
2025-07-11 00:59:35 +02:00
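A minimal sketch of how a per-shader-type enable mask like the one described for UseNewInlineRaytracing can be queried. The bit assignments and the names `ShaderTypeBit` and `is_enabled_for` are hypothetical, chosen for illustration only; the commit does not state which bit maps to which shader type.

```python
from enum import IntFlag


class ShaderTypeBit(IntFlag):
    # Hypothetical bit positions; the real mapping is not given in the commit.
    VERTEX = 1 << 0
    PIXEL = 1 << 1
    COMPUTE = 1 << 2


def is_enabled_for(mask: int, shader: ShaderTypeBit) -> bool:
    """Return True if the feature bit for this shader type is set in the mask."""
    return (mask & shader) != 0


# Example: enable the feature for compute shaders only.
compute_only = int(ShaderTypeBit.COMPUTE)
```

With a mask instead of a boolean, a value of 0 keeps the feature off everywhere, and individual shader types can be switched on by OR-ing their bits together.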
1f2e916f54 Fix issue that align=1 can not be parsed correctly
Fix issue that align=1 can not be parsed correctly
2025-07-10 22:41:57 +02:00
a0f4cb0f63 [Autobackout][FunctionalRegression]Revert of change: 089d12eb4b: Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
2025-07-10 22:32:30 +02:00
58e13aeb38 Add heuristic for DPAS macro building in post-RA scheduling
Group DPAS instructions that have no dependences on each other
and can be placed in the same macro block during instruction scheduling
2025-07-10 22:24:01 +02:00
089d12eb4b Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
Fix operands alignment issues for SIMD2 instructions with 64b or float datatype
2025-07-10 20:40:11 +02:00
5eee6f4686 Reduce the SWSB compilation time when there is subroutine
For subroutines, there is no need to add a live-out dependence of the call BB
2025-07-10 19:52:15 +02:00
7f1a0107a4 Add lit for GenSpecificPattern w/ Clang Formatting
Add missing lit for GenSpecificPattern, also align clang fmt.
2025-07-10 19:41:20 +02:00
cebdde95e5 Enable SIMD16 drop for more platforms
Enable abort on spills to SIMD16 for more platforms.
2025-07-10 17:36:31 +02:00
4107480d96 Relax byte destination restrictions
When the destination type is byte (UB or B), the destination subregnum can
be aligned to 2 or 3 of the (DWORD) execution channel.
2025-07-10 16:44:12 +02:00
a4e1f99a95 [Autobackout][FunctionalRegression]Revert of change: 4ca99a33b4: Fix problem in split barrier
Fixed a problem in split barrier when it is used together with a regular barrier.
    Case:
    splitbarrier.signal()
    regularbarrier()
    splitbarrier.wait()

    was causing a hang due to assigning the same barrier ID to the regular barrier and the split barrier.
    Now the split barrier takes a different ID from the regular one.
2025-07-10 16:28:09 +02:00
43a46ca97f Enable MAXNUM by default in IGCVectorizer
Enable MAXNUM by default in IGCVectorizer
2025-07-10 15:15:56 +02:00
c96c687caf Fix RegPressureVerbocity flag
Previously, the flag value was being overridden in code, so it was unusable.
2025-07-10 14:41:17 +02:00
c02e235070 [Autobackout][FunctionalRegression]Revert of change: 10520a29cb: Only modify cr0 on debug SIP exit
Only modify cr0 on debug SIP exit
2025-07-10 05:25:00 +02:00
59271223ce Add EarlyCSE without degrading performance
Add EarlyCSE to pass pipeline without generating weird IR patterns that
degrade performance
2025-07-10 00:00:10 +02:00
6bfa6f3fed Create a new CSE to remove redundant WaveBallot
Create a new CSE to remove redundant WaveBallot for performance.
2025-07-09 19:36:52 +02:00
fc1cb18212 [Autobackout][FunctionalRegression]Revert of change: d8afb7673a: Relax byte destination restrictions
When the destination type is byte (UB or B), the destination subregnum can
    be aligned to 2 or 3 of the (DWORD) execution channel.
2025-07-09 18:56:38 +02:00
4ca99a33b4 Fix problem in split barrier
Fixed a problem in split barrier when it is used together with a regular barrier.
Case:
splitbarrier.signal()
regularbarrier()
splitbarrier.wait()

was causing a hang due to assigning the same barrier ID to the regular barrier and the split barrier.
Now the split barrier takes a different ID from the regular one.
2025-07-09 18:21:45 +02:00
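The ID clash described in this commit can be illustrated with a toy model. This is only a sketch under stated assumptions: `assign_barrier_ids` and `has_id_clash` are hypothetical helpers, not IGC code, and the model only checks whether a regular barrier reuses the ID of a split barrier that has signalled but not yet waited.

```python
def assign_barrier_ids(kinds):
    """Map each barrier kind to its own ID, in order of first appearance."""
    ids = {}
    for kind in kinds:
        ids.setdefault(kind, len(ids))
    return ids


def has_id_clash(sequence, ids):
    """Return True if another barrier kind reuses the ID of a split barrier
    that has signalled but not yet waited."""
    pending = {}  # barrier ID -> kind that signalled on it
    for kind, op in sequence:
        barrier_id = ids[kind]
        if op == "signal":
            pending[barrier_id] = kind
        elif op == "wait":
            pending.pop(barrier_id, None)
        elif barrier_id in pending and pending[barrier_id] != kind:
            return True
    return False


# The sequence from the commit message: signal, regular barrier, wait.
SEQUENCE = [("split", "signal"), ("regular", "barrier"), ("split", "wait")]
shared_ids = {"split": 0, "regular": 0}  # the problematic assignment
distinct_ids = assign_barrier_ids(kind for kind, _ in SEQUENCE)
```

In this model, `has_id_clash(SEQUENCE, shared_ids)` reports a clash, while the distinct assignment does not.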
4895516e1b Bump MINOR to 16 2025-07-09 17:58:04 +02:00
61fda4a0cf Remove left-over patch token logic from ProgramScopeConstantAnalysis
Removed deprecated logic after fully deprecating the patch token format in OCL.
The removed code patches offsets based on the m_PatchLaterDataVector, which is always empty, as it was used on the patch token code path.
2025-07-09 14:24:25 +02:00
8f730d8ea6 Remove non-existent value from binary-format VC option help text
Removed the "ocl" values in "binary-format" and "runtime" VC internal options help text. OCL format was removed in an earlier commit.
2025-07-09 13:12:42 +02:00
d8afb7673a Relax byte destination restrictions
When the destination type is byte (UB or B), the destination subregnum can
be aligned to 2 or 3 of the (DWORD) execution channel.
2025-07-09 11:48:05 +02:00
788da1555d GEP LSR pass - fix trunc/ext handling
GEP LSR pass can strip trunc/ext instructions from SCEV expressions.
Strip only when comparing two expressions, but not when building a new
one.
2025-07-09 10:14:07 +02:00
88a8f29701 [Autobackout][FunctionalRegression]Revert of change: 38ba8f2d84: Unused bindless image args treated as bindless fix.
When emitting zeinfo, IGC tags the addr mode of images with no users as
    stateful even if the module is compiled to use bindless images. This
    caused NEO to throw an error as it disallows the use of both bindless
    and bindful mode in the same module.

    This commit sets the default addr mode to bindless for modules that have
    UseBindlessImage set to true.
2025-07-09 07:22:09 +02:00