In opaque pointer mode, GEPs that index into globals often have a
different shape. The SimplifyConstant pass assumed two-index GEPs
(0, index) and used the second operand directly as an element index.
However, flat aggregates can also be addressed with single-index GEPs.
See the two examples below from SYCL_CTS-math_builtin_float_double_1_ocl,
run in typed and opaque pointer modes.
Two-index GEP example:
%130 = getelementptr inbounds [2 x i32], [2 x i32] addrspace(2)* @__stgamma_ep_nofp64__ones, i64 0, i64 %129
%131 = bitcast i32 addrspace(2)* %130 to float addrspace(2)*
%132 = load float, float addrspace(2)* %131, align 4, !tbaa !5163, !noalias !5409
Single-index GEP example:
%103 = getelementptr inbounds float, ptr addrspace(2) @__stgamma_ep_nofp64__ones, i64 %102
%104 = load float, ptr addrspace(2) %103, align 4, !tbaa !5163, !noalias !5409
This patch changes the pass to always use the last GEP index as the
element selector. This works because the pass only transforms top-level
arrays of scalars/vectors; in these cases the loaded element is always
designated by the final GEP index, whether it is preceded by indices
selecting the aggregate itself (typed pointers) or stands alone as the
only index (opaque pointers).
Additionally, do not rely on bitcasts when deciding whether an index
adjustment is necessary: in opaque pointer mode, types can change
between instructions without any bitcasts.
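As a minimal sketch (using the LLVM C++ API; names are illustrative,
not the actual pass code), picking the element selector without assuming
the two-index shape or inspecting bitcasts could look like this:

#include "llvm/IR/Instructions.h"
using namespace llvm;

// The element loaded from a top-level array of scalars/vectors is always
// selected by the final GEP index, for both the (0, idx) typed-pointer
// shape and the single-index opaque-pointer shape.
static Value *getElementSelector(GetElementPtrInst *GEP) {
  return GEP->getOperand(GEP->getNumOperands() - 1);
}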
Compute workloads add the following implicit arguments:
* payloadHeader - 8 x i32, packing global_id_offset (3 x i32),
local_size (3 x i32) and 2 x i32 of reserved space.
* enqueued_local_size - 3 x i32
Most of the time only enqueued_local_size is used, leaving local_size
unnecessary. As a result, payloadHeader carries 20 unused bytes (see the
layout sketch below).
This commit enables the short payload header on the PVC platform.
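For illustration, a byte-layout sketch of the full vs. short header
(the struct and field names below are assumptions, not the actual IGC
definitions):

#include <cstdint>

struct PayloadHeaderFull {        // 8 x i32 = 32 bytes
  uint32_t global_id_offset[3];   // 12 bytes, still needed
  uint32_t local_size[3];         // 12 bytes, redundant when
                                  // enqueued_local_size is used instead
  uint32_t reserved[2];           // 8 bytes
};

struct PayloadHeaderShort {       // keeps only what is actually read
  uint32_t global_id_offset[3];   // 12 bytes
};

static_assert(sizeof(PayloadHeaderFull) - sizeof(PayloadHeaderShort) == 20,
              "the short header drops the 20 unused bytes");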
When verifying whether an operand access exceeds the declared variable
size, we need special handling for the madw instruction, as it writes
both the low and the high results to GRFs.
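One plausible way to picture the check (the helper and its parameters
are hypothetical; only the doubling for madw reflects this change):

// madw writes both the low and the high parts of the result to GRFs,
// so its destination footprint is twice the nominal access size.
static bool accessExceedsDecl(unsigned accessBytes, unsigned declBytes,
                              bool isMadw) {
  if (isMadw)
    accessBytes *= 2; // account for the additional high-result write
  return accessBytes > declBytes;
}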
This change addresses the handling of predicated stores for sub-DW
values with non-uniform stored values. The predicate alone is not enough
to calculate the correct offset, so we use `EMASK & Predicate` to
determine it.
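A rough sketch of the idea (the lane selection below is an
illustration, not the exact emitter code):

#include <bit>
#include <cstdint>

// For a predicated sub-DW store with a non-uniform value, the lane used
// to derive the offset must be enabled both by the execution mask and by
// the predicate; the predicate alone may point at a disabled lane.
static int firstEnabledLane(uint32_t emask, uint32_t pred) {
  uint32_t active = emask & pred;   // lanes enabled by both
  if (active == 0)
    return -1;                      // nothing to store
  return std::countr_zero(active);  // index of the first enabled lane
}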
When LLVM IR uses opaque pointers or inserts a bitcast to i8*, a
subsequent GEP is expressed in bytes. The legacy handleGEPInst always
scalarized indices starting from pGEP->getSourceElementType(). After the
i8* cast that type is i8, so the algorithm mistakenly treated the byte
index as a count of elements, producing a misscaled (too large)
scalarized index.
Example:
%a = alloca [16 x [16 x float]], align 4
%b = bitcast [16 x [16 x float]]* %a to i8*
%c = getelementptr inbounds i8, i8* %b, i64 64
Here, 64 is a byte offset into the original aggregate. The old
implementation, seeing i8, scaled as if the offset were 64 elements, not
64 bytes. Yet the meaningful base of the GEP is the alloca's aggregate
type [16 x [16 x float]], and element calculations should be based on
this type.
This change:
1. Introduces getFirstNonScalarSourceElementType(GEP), which walks back
from the GEP base through pointer casts to find a root aggregate
element type.
2. Adds handling in handleGEPInst so that an i8 GEP byte offset is
converted to an element index of the underlying base type.
This way the algorithm avoids basing element-index scalarization on an
incidental i8* and keeps the index calculation aligned with the
underlying allocation layout.
For reference, in typed pointer mode (or without the bitcast), the GEP
would look like this:
%a = alloca [16 x [16 x float]], align 4
%c = getelementptr inbounds [16 x [16 x float]], [16 x [16 x float]]* %a, i64 0, i64 1
Here, %c is a pointer to the second inner array ([16 x float]*).
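A minimal sketch of the approach using the LLVM C++ API (simplified;
not the actual IGC implementation):

#include "llvm/IR/DataLayout.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Walk back from the GEP base through pointer casts to the aggregate
// type of the underlying allocation, e.g. [16 x [16 x float]].
static Type *getRootAggregateType(GetElementPtrInst *GEP) {
  Value *Base = GEP->getPointerOperand()->stripPointerCasts();
  if (auto *AI = dyn_cast<AllocaInst>(Base))
    return AI->getAllocatedType();
  return nullptr; // fall back to GEP->getSourceElementType()
}

// With the root type known, an i8 byte offset is rescaled into an element
// index of that type instead of being treated as an element count
// (64 bytes / 64 bytes per [16 x float] gives index 1 in the example).
static uint64_t byteOffsetToElementIndex(const DataLayout &DL, Type *EltTy,
                                         uint64_t ByteOffset) {
  return ByteOffset / DL.getTypeAllocSize(EltTy).getFixedValue();
}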
In cases where there are no local casts to the generic address space
and private memory is allocated in global space, we can replace
GenericCastToPtrExplicit with a simple address space cast.
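Roughly, and assuming the call's first argument is the generic pointer
(a sketch, not the actual pass code):

#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Instructions.h"
using namespace llvm;

// Replace a GenericCastToPtrExplicit call whose result is known to point
// into the global address space with a plain addrspacecast.
static void replaceWithAddrSpaceCast(CallInst *CI) {
  IRBuilder<> B(CI);
  Value *GenericPtr = CI->getArgOperand(0); // assumed generic pointer arg
  Value *Cast = B.CreateAddrSpaceCast(GenericPtr, CI->getType());
  CI->replaceAllUsesWith(Cast);
  CI->eraseFromParent();
}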
This PR introduces a test suite for the OpReadClockKHR SPIR-V
instruction, ensuring proper compilation and intrinsic generation across
different scenarios.
CloneAddressArithmetic marks rematerialized instructions with metadata.
Use this metadata in the RematChainsAnalysis pass to mark the patterns
that are safe to consider in scheduling.
Use the estimation of the target instruction (which is usually a load)
in the RegisterPressureTracker of the scheduler and schedule the remat
chain as a whole.
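For illustration, a small sketch of the metadata handshake between the
two passes (the metadata kind name is an assumption):

#include "llvm/IR/Instruction.h"
#include "llvm/IR/Metadata.h"
using namespace llvm;

// CloneAddressArithmetic side: tag each rematerialized instruction.
static void markRematted(Instruction *I) {
  I->setMetadata("remat", MDNode::get(I->getContext(), {}));
}

// RematChainsAnalysis side: recognize tagged instructions so the remat
// chain can be scheduled as a whole.
static bool isRematted(const Instruction *I) {
  return I->getMetadata("remat") != nullptr;
}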
Introduce an LLVM patch that builds upon commit 88da019977.
The original commit diagnosed an issue in the legacy inliner and claimed
to fix it, but the change was non-functional and only added a debug-mode
assert.
This patch modifies it to mitigate the problem in the cases where the
assert would trigger.