Commit Graph

18372 Commits

Author SHA1 Message Date
aefd0097e6 Handle single-index GEPs into flat aggregates in SimplifyConstant
In opaque pointer mode, GEPs that index into globals often have a
different shape. The SimplifyConstant pass assumed two-index GEPs (0, index)
and directly used the second operand as an element index. However, it is
possible to address flat aggregates using single-index GEPs.

See the two examples below from SYCL_CTS-math_builtin_float_double_1_ocl
run in typed and opaque pointer mode.

Two-index GEP example:
%130 = getelementptr inbounds [2 x i32], [2 x i32] addrspace(2)* @__stgamma_ep_nofp64__ones, i64 0, i64 %129
%131 = bitcast i32 addrspace(2)* %130 to float addrspace(2)*
%132 = load float, float addrspace(2)* %131, align 4, !tbaa !5163, !noalias !5409

Single-index GEP example:
%103 = getelementptr inbounds float, ptr addrspace(2) @__stgamma_ep_nofp64__ones, i64 %102
%104 = load float, ptr addrspace(2) %103, align 4, !tbaa !5163, !noalias !5409

This patch changes the pass to always use the last GEP index as the
element selector. This works because the pass only transforms top-level
arrays of scalars/vectors. In these cases, the element being loaded is
always designated by the final GEP index (whether earlier indices select
the actual aggregate or there is only a single index, as in opaque pointer
mode).
2025-08-19 21:52:02 +02:00
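To illustrate the rule described in aefd0097e6, here is a minimal Python sketch that models a GEP as a plain list of indices and takes the last one as the element selector. The function name `element_selector` and the list-based model are assumptions made for illustration; they are not taken from the SimplifyConstant source.

```python
def element_selector(gep_indices):
    """Return the index that designates the loaded element.

    Hypothetical model: a GEP into a top-level array of scalars/vectors
    is represented only by its list of indices.
    """
    if not gep_indices:
        raise ValueError("a GEP needs at least one index")
    # Two-index form [0, i] and single-index form [i] both end in i.
    return gep_indices[-1]


two_index = element_selector([0, 7])    # typed-pointer shape (0, index)
single_index = element_selector([7])    # opaque-pointer shape (index)
```

Both shapes resolve to the same element, which is why the final index can be used regardless of pointer mode, provided the aggregate is a flat top-level array.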
215e971107 [Autobackout][FunctionalRegression]Revert of change: 6876fb54b2: Enable loop unrolling in retry
Enable loop unrolling in retry
2025-08-19 19:22:15 +02:00
38f1569e69 Revert of Remove redundant guard for a pattern of global imm offsets
Revert
2025-08-19 19:15:31 +02:00
e2c4ba8d76 Bitcast in StatelessToStateful pass
The fix prevents a crash in the StatelessToStateful pass
when all uses of the pointer are bitcast instructions.
2025-08-19 13:43:37 +02:00
a057740b8a [Autobackout][FunctionalRegression]Revert of change: 6072b2cdf4: _OS_SUMMARY
_OS_DESCRIPTION
2025-08-19 03:05:22 +02:00
6876fb54b2 Enable loop unrolling in retry
Enable loop unrolling in retry
2025-08-18 21:11:44 +02:00
df0baa89e6 Include chrono explicitly
Include chrono explicitly
2025-08-18 20:41:35 +02:00
95ac72d3b8 Fix remat inst handling in CodeScheduling test
Temporary fix for the remat inst handling in the CodeScheduling LIT test
that allows a nondeterministic order of the first two loads
2025-08-18 18:42:00 +02:00
46f497d623 GenXPromoteArray opaque pointers fix
Do not rely on bitcasts when deciding whether an index adjustment is
necessary. In opaque pointers mode types can change between instructions
without bitcasts.
2025-08-18 14:13:46 +02:00
6072b2cdf4 _OS_SUMMARY
_OS_DESCRIPTION
2025-08-18 10:48:02 +02:00
9d9d6b3e5e enable ShortImplicitPayloadHeader on PVC
Compute workloads add the following implicit arguments:
 * payloadHeader - 8 x i32 packing global_id_offset (3 x i32),
   local_size (3 x i32) and 2 x i32 reserved.
 * enqueued_local_size - 3 x i32
Most of the time only enqueued_local_size is used, leaving local_size
unnecessary. As a result, payloadHeader has 20 unused bytes.

This commit enables the short payload header on the PVC platform.
2025-08-18 09:07:53 +02:00
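The 20-byte figure in 9d9d6b3e5e follows from the field layout given in the commit message. A minimal Python sketch of that arithmetic, assuming 4-byte i32 values; the dictionary and helper names are hypothetical:

```python
I32_BYTES = 4  # size of one i32 in bytes

# payloadHeader layout as described in the commit message (8 x i32 total).
PAYLOAD_HEADER_FIELDS = {
    "global_id_offset": 3,  # 3 x i32
    "local_size": 3,        # 3 x i32, usually unused
    "reserved": 2,          # 2 x i32
}


def field_bytes(name):
    """Size in bytes of one payloadHeader field."""
    return PAYLOAD_HEADER_FIELDS[name] * I32_BYTES


total_bytes = sum(field_bytes(name) for name in PAYLOAD_HEADER_FIELDS)
# local_size and the reserved slots carry no needed data.
unused_bytes = field_bytes("local_size") + field_bytes("reserved")
```

Here `total_bytes` covers the full 8 x i32 header, and `unused_bytes` sums the 3 x i32 of local_size and the 2 x i32 reserved slots.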
3c9eb3b099 [Autobackout][FunctionalRegression]Revert of change: bdd9b15ad7: Fix GEP lowering overflow issues
This change prevents usage of potentially
negative values which are then zero-extended to
64 bits as indexes.
v2.18.0
2025-08-16 02:57:21 +02:00
ceb9c26626 [Autobackout][FunctionalRegression]Revert of change: 76b5b50eb2: Only modify cr0 on debug SIP exit
Only modify cr0 on debug SIP exit
2025-08-16 00:29:20 +02:00
43da807c49 Changes in code. 2025-08-15 02:14:34 +02:00
587c7e9603 [Autobackout][FunctionalRegression]Revert of change: 882201b325: Use Ray Query Return value in Compute Ray Tracing Extension
Modified intel_get_hit_candidate and intel_is_traversal_done functions.
2025-08-15 01:01:14 +02:00
dcc6f77411 Fix issue in emit pattern with LVN matching for And
Fix issue with LVN matching for And when in SIMD32 with mad operation.
2025-08-15 00:20:33 +02:00
8eb1fe42bd Enable loop unroll but only for reducing code size during compilation retry
Enable loop unroll but only for reducing code size during compilation retry
2025-08-15 00:17:07 +02:00
d81684bd3f Fix the access bound check issue of src operand for madw instruction
For the madw instruction, only the dst operand needs special handling in the
verifier; the src operands should be treated as in other instructions.
2025-08-14 22:58:17 +02:00
cedf0f970b Parameterize UnrollMaxCountForAllocai in GenTTI
Parameterize UnrollMaxCountForAllocai in GenTTI
2025-08-14 20:17:15 +02:00
4c2e31a450 Fix the bug of verifying if an operand access exceeds the declared variable size for madw instruction
When verifying whether an operand access exceeds the declared variable size, the
madw instruction needs special handling because it writes both the low and high
results to GRFs.
2025-08-14 19:01:27 +02:00
941ba382ec Fix predicated store sub-DW value handling
This change addresses the handling of predicated
stores for sub-DW values with non-uniform stored values.
Predicate alone is not enough to calculate the correct
offset. So, we use `EMASK & Predicate` to determine the
correct offset.
2025-08-14 18:14:13 +02:00
f68235fad2 Bump MINOR to 18 2025-08-14 13:29:44 +02:00
6cad180e82 Add lit test for conversion from i64 to double
Add lit test for conversion from i64 to double
2025-08-14 12:46:40 +02:00
c442009f88 Bump MINOR to 17 2025-08-14 12:23:27 +02:00
882201b325 Use Ray Query Return value in Compute Ray Tracing Extension
Modified intel_get_hit_candidate and intel_is_traversal_done functions.
2025-08-14 10:59:06 +02:00
e4d71856fa Change schedule priority according to dep type
For a barrier dependency, the latency cycle should not be used to calculate priority, because a barrier is an ordering issue, not a latency issue. Use occupancy instead.
2025-08-14 08:09:25 +02:00
e9afb1822b Changes in code. 2025-08-14 00:04:07 +02:00
e8906d0679 Fix i8/opaque pointer byte offset GEP scalarization in PrivateMemoryResolution
When LLVM IR uses opaque pointers or inserts a bitcast to i8*, a
subsequent GEP is expressed in bytes. The legacy handleGEPInst always
scalarized indices by starting from pGEP->getSourceElementType(). After
the i8* cast, the type is i8, so the algorithm mistakenly treated the
byte index as a count of elements, producing a mis-scaled (too large)
scalarized index.

Example:
%a = alloca [16 x [16 x float]], align 4
%b = bitcast [16 x [16 x float]]* %a to i8*
%c = getelementptr inbounds i8, i8* %b, i64 64

Here, 64 is a byte offset into the original aggregate. The old
implementation, seeing i8, scaled the index as if it were 64 elements, not 64 bytes.

However, the meaningful base of the GEP is the alloca's aggregate type
[16 x [16 x float]], and the element calculations should be based on this
type.

This change:
1. Introduces getFirstNonScalarSourceElementType(GEP), which
walks back from the GEP base through pointer casts to find a root
aggregate element type.
2. Adds additional handling in handleGEPInst, so that i8 GEP byte offset
is converted to an element index of the underlying base type.

This way the algorithm avoids basing element index scalarization on
incidental i8* and keeps index calculation aligned with the underlying
allocation layout.

For reference, in typed pointer mode (or without the bitcast), the GEP
would look like this:
%a = alloca [16 x [16 x float]], align 4
%c = getelementptr inbounds [16 x [16 x float]], [16 x [16 x float]]* %a, i64 0, i64 1

Here, %c is the pointer to the 2nd inner array [16 x float]*.
2025-08-13 22:53:48 +02:00
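The conversion described in e8906d0679, from an i8 byte offset to an element index of the underlying aggregate, can be sketched in Python as follows. This is an illustrative model only: the function name `byte_offset_to_indices` and its parameters are assumptions, and it does not reproduce the actual handleGEPInst logic.

```python
def byte_offset_to_indices(dims, elem_bytes, byte_offset):
    """Convert a byte offset into per-dimension indices of a nested array.

    dims: array extents from outermost to innermost, e.g. [16, 16]
    elem_bytes: size of the scalar element in bytes (4 for float)
    """
    if byte_offset % elem_bytes != 0:
        raise ValueError("offset is not aligned to the element size")
    flat = byte_offset // elem_bytes  # index counted in scalar elements
    indices = []
    # Peel off the innermost dimension first.
    for extent in reversed(dims):
        indices.append(flat % extent)
        flat //= extent
    if flat != 0:
        raise ValueError("offset is outside the aggregate")
    return list(reversed(indices))


# [16 x [16 x float]] with a byte offset of 64, as in the commit example.
indices = byte_offset_to_indices([16, 16], 4, 64)
```

For the commit's example, byte offset 64 maps to outer index 1 and inner index 0, which is the start of the second inner array and matches the typed-pointer GEP `i64 0, i64 1`.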
bdd9b15ad7 Fix GEP lowering overflow issues
This change prevents usage of potentially
negative values which are then zero-extended to
64 bits as indexes.
2025-08-13 20:41:06 +02:00
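The hazard named in bdd9b15ad7 can be shown with a small Python sketch that emulates widening a 32-bit value to 64 bits. It illustrates the problem only, not the fix applied in the commit; the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def zext_i32_to_i64(value):
    """Zero-extend: keep the low 32 bits, fill the upper bits with zeros."""
    return value & MASK32


def sext_i32_to_i64(value):
    """Sign-extend: replicate bit 31 into the upper bits."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


zero_extended = zext_i32_to_i64(-1)  # becomes a huge positive index
sign_extended = sext_i32_to_i64(-1)  # stays -1
```

A negative 32-bit index that is zero-extended turns into a large positive 64-bit value, so any address computed from it lands far from the intended element.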
dcfe3f25db Change new inline raytracing setting
Change new inline raytracing setting
2025-08-13 17:49:52 +02:00
4458a3bfcc Stub vectorization for IGCVectorizer
Allow certain instructions to be "stub-vectorized".
New tests are added to cover the additional flexibility of
vectorization.
2025-08-13 14:54:45 +02:00
d19cdc5a52 Refactor ZEBinary flags and documentation
Refactored all conditions based on enableZEBinary() and supportsZEBin(), as if they were always true. Removed said conditions.
2025-08-13 09:05:48 +02:00
aafca7ed1b Improve spill threshold handling
Improve spill threshold handling in units of GRFs calculated from
byte input.
2025-08-12 23:08:27 +02:00
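As a rough illustration of aafca7ed1b, converting a byte count into GRF units could look like the Python sketch below. The function name, the default GRF size of 32 bytes, and the choice to round up are all assumptions; the commit message does not state how the conversion is done.

```python
def bytes_to_grfs(num_bytes, grf_bytes=32):
    """Convert a byte count to whole GRFs, rounding up (assumed policy).

    grf_bytes is a parameter because the register size differs between
    platforms; 32 is only a placeholder default here.
    """
    if num_bytes < 0 or grf_bytes <= 0:
        raise ValueError("sizes must be non-negative and GRF size positive")
    return (num_bytes + grf_bytes - 1) // grf_bytes


threshold_grfs = bytes_to_grfs(65)
```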
d3ca4a545c Add -vc-codegen option handling for VLD
.
2025-08-12 17:00:51 +02:00
b799e7c1f2 Add GenericCastToPtrOpt pass
In cases where we have no local casts to generics and we allocate
private memory in global space, we can replace GenericCastToPtrExplicit
with simple address space cast.
2025-08-12 15:45:04 +02:00
76b5b50eb2 Only modify cr0 on debug SIP exit
Only modify cr0 on debug SIP exit
2025-08-12 15:20:45 +02:00
aac325449a Enable SIMD16 drop for more platforms
Enable abort on spills to SIMD16 for more platforms.
2025-08-12 14:52:30 +02:00
363ae09a9c Add API option documentation.
Add API option documentation.
2025-08-12 12:33:32 +02:00
bd5e98c59c Restore part runtime loop unroll preference
Restore runtime loop unroll preference but still disable it at high
register pressure
2025-08-12 03:28:16 +02:00
dfa0289e3f Add extra check
.
2025-08-11 21:52:47 +02:00
91c760c8e3 Corrected negative values handling for fast asinh
Use abs value for calculation
2025-08-11 17:21:13 +02:00
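The idea in 91c760c8e3 relies on asinh being an odd function, asinh(-x) = -asinh(x). The Python sketch below computes the result on the absolute value and restores the sign afterwards. The function name and the logarithmic formula are assumptions for illustration, not the implementation used in the commit.

```python
import math


def asinh_via_abs(x):
    """Evaluate asinh on |x| and copy the sign of x onto the result."""
    magnitude = abs(x)
    # ln(|x| + sqrt(x^2 + 1)) is well-behaved for non-negative input,
    # whereas x + sqrt(x^2 + 1) cancels badly for large negative x.
    result = math.log(magnitude + math.sqrt(magnitude * magnitude + 1.0))
    return math.copysign(result, x)


value = asinh_via_abs(-2.0)
```

Applying the formula directly to a large negative input would subtract two nearly equal numbers before taking the logarithm; working on the magnitude avoids that cancellation.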
0bed77b26a Add SPIR-V test suite for OpReadClockKHR instruction
This PR introduces a test suite for the OpReadClockKHR SPIR-V instruction, ensuring proper compilation and intrinsic generation across different scenarios.
2025-08-11 14:35:52 +02:00
97790b75e2 Fix the assertion of "operand with mme must be GRF-aligned"
Fix the assertion of "operand with mme must be GRF-aligned"
2025-08-11 14:31:36 +02:00
d4f76aa175 Support O0 for new pass manager on LLVM16
Prevent crashes when the -no-optimize option is used with the new PM
2025-08-11 14:27:23 +02:00
539f561fe4 Improve handling the remated instructions in CodeScheduling
CloneAddressArithmetic marks rematted instructions with metadata.
Use the metadata in the RematChainsAnalysis pass to mark the patterns that
are safe to consider in the scheduling.
Use the estimation of the target instructions (because it's usually a
load) in the RegisterPressureTracker of the scheduling and schedule the
remat chain as a whole.
2025-08-11 14:21:02 +02:00
64d04206c2 Remove code for reading UBO through sampler
This change removes IGC code for features no longer supported in the driver.
2025-08-08 15:26:08 +02:00
1a98d9f986 Provide IGC API option descriptions
Provide IGC API option descriptions
2025-08-08 11:38:39 +02:00
0e2022d0bf Legacy inliner llvm patch
Introduce an LLVM patch that builds upon commit:
88da019977

The original commit diagnosed an issue in the legacy inliner and claimed
to fix it but the change was non-functional and only added
a debug mode assert.
This patch modifies it to mitigate the problem in the cases where
the assert would happen.
2025-08-08 11:01:43 +02:00
e053866213 Remove deadcode, init members
Default initialize small std::arrays to zero-values in BuildIR.h.
Remove dead code from FlowGraph.cpp.
2025-08-08 10:49:38 +02:00
09f2a4b853 [Autobackout][FunctionalRegression]Revert of change: 610badcf18: Enable SIMD16 drop for more platforms
Enable abort on spills to SIMD16 for more platforms.
2025-08-08 03:11:42 +02:00