Currently `handleCacheControlINTELForPrefetch` requires the type size to perform the call conversion correctly,
but with opaque pointers that size is not available and cannot simply be recovered.
The planned "OpUntypedPrefetch" extension will support opaque pointers in such cases.
Until then, this change skips the whole prefetch conversion whenever an opaque pointer type is involved,
while we wait for an update on the status of "OpUntypedPrefetch".
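A minimal sketch of the skip, assuming an LLVM version that still distinguishes typed and opaque pointers (`PointerType::isOpaque()`); the helper name and the choice of argument are illustrative, not the actual translator code:
```cpp
// Sketch only: decide whether to skip the prefetch call conversion because the
// pointee type (and hence its size) cannot be recovered from the address operand.
#include "llvm/IR/DerivedTypes.h"
#include "llvm/IR/Instructions.h"

static bool shouldSkipPrefetchConversion(const llvm::CallInst &CI) {
  auto *PtrTy =
      llvm::dyn_cast<llvm::PointerType>(CI.getArgOperand(0)->getType());
  return PtrTy && PtrTy->isOpaque(); // opaque pointer: no element type/size
}
```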
Don't swap src0 and src1 of a pseudo_mad instruction in HWConformity if the swap
would create an invalid datatype combination. For example:
pseudo_mad (32) result1(0,0)<1>:d x1(0,0)<2;0>:uw r0.1<0;0>:d z(0,0)<1;0>:d
Here we normally swap src0 (actually src2) and src1 when src1 is a scalar but src0 is
not, because src0 (actually src2) has no regioning support:
pseudo_mad (32) result1(0,0)<1>:d r0.1<0;0>:d x1(0,0)<2;0>:uw z(0,0)<1;0>:d
After the swap the datatype combination is invalid: it changes from (W * D + D) to
(D * W + D), and when src2 (actually src0) is D, HW only supports (W * D + D).
We would then give up on mad and emit mul+add instead. Without the swap, however,
a mad can still be generated, because src0 (actually src2) is aligned to dst.
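A minimal, self-contained sketch of the legality check (the enum and helpers are illustrative, not the actual vISA/HWConformity types):
```cpp
// Sketch: reject the src0/src1 swap of a pseudo_mad when it would turn the
// HW-supported (W * D + D) datatype combination into the unsupported
// (D * W + D) one.
enum class OpndTy { UB, B, UW, W, UD, D };

static bool isD(OpndTy t) { return t == OpndTy::D || t == OpndTy::UD; }
static bool isW(OpndTy t) { return t == OpndTy::W || t == OpndTy::UW; }

// src0/src1/src2 are the pseudo_mad sources in program order; src0 is the one
// that maps to the real mad's src2, and src2 is the addend.
static bool swapBreaksDatatypes(OpndTy src0, OpndTy src1, OpndTy src2) {
  // before the swap: (src0 * src1 + src2) == (W * D + D)  -> legal
  // after the swap:  (src1 * src0 + src2) == (D * W + D)  -> illegal
  return isW(src0) && isD(src1) && isD(src2);
}
```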
We do not yet have performance parity with the old implementation. One of the reasons is suboptimal loading from the rtstack.
This change should coalesce loads for trivial rayquery usages.
The PredefinedConstantResolving pass caused a type mismatch assertion
in tests while moving to opaque pointers. It happened when there was a
type difference between a global variable and its load instruction
user. With typed pointers the pass was skipped in this scenario, because
the user of the global was a bitcast and only the bitcast's user was a
load. What the pass does is a RAUW on the load to replace it with the
global constant. This fix changes the pass's behaviour by enabling
constant folding even when there is a type difference between the load
instruction and the global constant.
Example of the crashing IR:
```llvm
@global = constant [3 x i64] [i64 16, i64 32, i64 64]
define void @func(i64 %0) {
%2 = load i64, ptr @global ; <-- crash
ret void
}
```
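A minimal sketch of the folding idea, under the assumption that LLVM's `ConstantFoldLoadFromConstPtr` is used; the helper name is made up and the actual pass code may differ:
```cpp
// Sketch: fold a load from a constant global even when the loaded type differs
// from the global's value type, then replace the load with the folded constant.
#include "llvm/Analysis/ConstantFolding.h"
#include "llvm/IR/DataLayout.h"
#include "llvm/IR/GlobalVariable.h"
#include "llvm/IR/Instructions.h"

static bool tryFoldLoadFromGlobal(llvm::LoadInst &LI,
                                  const llvm::DataLayout &DL) {
  auto *GV = llvm::dyn_cast<llvm::GlobalVariable>(
      LI.getPointerOperand()->stripPointerCasts());
  if (!GV || !GV->isConstant() || !GV->hasDefinitiveInitializer())
    return false;
  // Reads the initializer bytes, so a type difference (e.g. loading i64 from
  // a [3 x i64] global) is handled instead of asserting.
  if (llvm::Constant *C =
          llvm::ConstantFoldLoadFromConstPtr(GV, LI.getType(), DL)) {
    LI.replaceAllUsesWith(C);
    LI.eraseFromParent();
    return true;
  }
  return false;
}
```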
In RematChainsAnalysis.hpp, pass arguments by const reference to
avoid copy construction, and initialize pointer members to nullptr
to avoid use of uninitialized memory.
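An illustrative sketch of the two changes (class and member names are made up, not the actual RematChainsAnalysis declarations):
```cpp
#include <vector>

struct RematNode; // placeholder for the analysed instruction chains

class RematChainsAnalysis {
public:
  // Pass the container by const reference so no copy is constructed per call.
  void processChain(const std::vector<RematNode *> &Chain);

private:
  // In-class nullptr initialization prevents reads of uninitialized pointers.
  RematNode *CurrentNode = nullptr;
};
```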
When importing built-in types, the type named "struct.intel_ray_query_opaque_t" was properly imported in typed-pointers mode as:
```
%struct.intel_ray_query_opaque_t = type opaque
```
but in opaque-pointers mode it was not present in the generated BiF .bc file and thus was not imported.
This caused the ResolveOCLRaytracingBuiltins pass to fail, because it relied on that type being present when creating the alloca:
```
auto *allocaType = IGCLLVM::getTypeByName(callInst.getModule(), "struct.intel_ray_query_opaque_t");
auto *alloca = m_builder->CreateAlloca(allocaType);
```
This patch adds a workaround by creating the type when it is not present.
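A sketch of the workaround, extending the snippet above (assuming `llvm::StructType::create` is used to materialize the missing identified type; the surrounding code is illustrative):
```cpp
auto *M = callInst.getModule();
auto *allocaType =
    IGCLLVM::getTypeByName(M, "struct.intel_ray_query_opaque_t");
if (!allocaType) {
  // The type was not pulled in from the BiF module; create an identified
  // struct with no body, which is exactly the "type opaque" declaration.
  allocaType = llvm::StructType::create(M->getContext(),
                                        "struct.intel_ray_query_opaque_t");
}
auto *alloca = m_builder->CreateAlloca(allocaType);
```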
CodeScheduling improvements to ensure better register pressure handling:
- Support handling of rematerialized instructions that are used by a select
(not a memop)
- Add various heuristics to handle situations with small (split) loads
- Add a heuristic to populate the same vector
The old handleStoreInst/loadEltsFromVecAlloca assume a 1:1 lane mapping
and equal sizes between the user value and the promoted vector element
type. This is insufficient for mixed widths (e.g. <4 x i8> and <... x i32>),
for cross-lane accesses created by the new byte-offset GEP lowering, or
for pointers under opaque pointers (bitcasts between pointers and
non-pointers are illegal).
With the changes:
1) Stores (handleStoreInst and storeEltsToVecAlloca) normalize the
source (scalar or vector) to a single integer of NeedBits = N * DstBits
using ptrtoint/bitcast, split the big integer into
K = ceil(NeedBits / SrcBits) chunks, bitcast/inttoptr each chunk back to
the promoted lane type and insert into K consecutive lanes starting at
the scalarized index.
2) Loads (handleLoadInst and loadEltsFromVecAlloca) read K promoted
lanes starting at the scalarized index, convert each lane to iSrcBits,
pack into i(K*SrcBits), truncate to i(NeedBits), then expand to the
requested scalar or <N x DstScalarTy>. Use inttoptr for pointer results.
There is also still the simple (old) path: if SrcBits == DstBits, just
emit extractelement with casts (if needed).
All paths do a single load of the promoted vector, then
extractelement/insertelement, and in the case of stores only a single
store back.
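A rough IRBuilder-based sketch of the store-side packing, assuming the promoted vector has integer lanes of SrcBits and that StartLane is the scalarized i32 start index; the helper signature and the omission of the pointer (ptrtoint/inttoptr) and non-integer lane cases are simplifications, not the actual LowerGEPForPrivMem code:
```cpp
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

void storePacked(IRBuilder<> &B, Value *Src, Value *PromotedVec,
                 Value *VecAllocaPtr, Value *StartLane,
                 unsigned N, unsigned DstBits, unsigned SrcBits) {
  const unsigned NeedBits = N * DstBits;                 // bits the user writes
  const unsigned K = (NeedBits + SrcBits - 1) / SrcBits; // promoted lanes touched
  Type *LaneIntTy = B.getIntNTy(SrcBits);

  // 1) normalize the source (scalar or vector) to one integer of NeedBits,
  //    then widen so the shifts below cover all K * SrcBits bits
  Value *Big = B.CreateBitCast(Src, B.getIntNTy(NeedBits));
  Value *Wide = B.CreateZExtOrBitCast(Big, B.getIntNTy(K * SrcBits));

  // 2) split into K chunks and insert them into consecutive promoted lanes
  Value *Vec = PromotedVec;
  for (unsigned i = 0; i < K; ++i) {
    Value *Chunk = B.CreateTrunc(B.CreateLShr(Wide, i * SrcBits), LaneIntTy);
    Value *Lane = B.CreateAdd(StartLane, B.getInt32(i));
    Vec = B.CreateInsertElement(Vec, Chunk, Lane);
  }

  // 3) the store path still performs only a single vector store back
  B.CreateStore(Vec, VecAllocaPtr);
}
```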
With these changes, the LLVM IR emitted from LowerGEPForPrivMem
will look different. Instead of using plain bitcasts, there are now
ptrtoint/inttoptr instructions and there is additional packing/splitting
logic. For the simple (old) load path, the new implementation should
essentially emit the same pattern (potentially skipping bitcasts).
The additional integer/bitcast instruction sequences should be easily
foldable. Memory traffic is unchanged (still one vector load/store).
Overall register pressure should be similar, and the pass still
eliminates GEPs and avoids private/scratch accesses.
zeinfo now contains information on whether a kernel/function has printf
calls and function pointer calls. This allows NEO to create the
printf_buffer only when it is actually used.