Add a new pass that propagates null pointers across address space
casts, and remove the no longer needed Generic Pointers Comparison
pattern match. This change fixes a bug where comparisons between
generic pointers sometimes returned incorrect results.
The PR adds another pattern that detects packing of 4 i8 values into
a 32-bit scalar. The detected pattern packs values clamped to
[0, 127] using `shl` and `or` instructions.
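The commit text does not include the matched IR, but a minimal C sketch of the kind of source pattern being described (clamp to [0, 127], then pack with shifts and ors; all names here are illustrative, not from the compiler) might look like:

```c
#include <stdint.h>

/* Hypothetical illustration of the matched pattern: four values,
 * each clamped to [0, 127], packed into one 32-bit scalar via the
 * shl/or chain the pattern matcher is said to detect. */
static uint8_t clamp7(int v) {
    if (v < 0) return 0;
    if (v > 127) return 127;
    return (uint8_t)v;
}

uint32_t pack4(int a, int b, int c, int d) {
    return (uint32_t)clamp7(a)
         | ((uint32_t)clamp7(b) << 8)
         | ((uint32_t)clamp7(c) << 16)
         | ((uint32_t)clamp7(d) << 24);
}
```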
Re-apply multiple fixes
c8f734d Fix a warning issue related to the overloaded struct
77fb673 GEP canon is on only for OCL
8d12a0f avoid y*(1/x) for double precision type
fbf1aa9 [IGC VC] GenXVerify pass, initial
0ae6dfb SetMaxRegForThreadDispatch was hardcoded for up to 128 GRFs.
932eafa Minor update to DisableRecompilation regkey description
bc3034f Fix bugs and update the LIT test for linear scan RA
45f1295 Enable GRF read delay of send stall instructions
7c95f49 [Autobackout][FuncReg]Revert of change: 3f0c186620c74c Fix a few register regioning issues for 64b instructions on MTL platform
The SIMD shuffle down intrinsic takes "current" and "next" values and
combines them into a 2N variable (where N is the SIMD size) to deal
with OOB lanes when shuffling. The moves that initialize this
temporary variable are materialized in the emit vISA pass, so when
multiple shuffle intrinsic calls have identical source operands, we
end up with multiple temp variables and redundant moves.
This change adds per-basic-block caching of the temporary variable,
so multiple shuffles in the same BB can share a common source.
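As a rough scalar model of what the 2N temporary buys (assumed SIMD size and lane layout; this is not the backend's actual code), each lane i of the result reads element i + delta of the concatenated current/next buffer, so lanes that would shift out of bounds fall through into "next":

```c
#include <stdint.h>

#define SIMD 8  /* assumed SIMD size N for this sketch */

/* Scalar model of shuffle-down: "current" and "next" are combined
 * into one 2N-element temporary so out-of-bounds lanes read from
 * "next" instead of garbage. In the backend the moves building this
 * temporary are emitted per call site; caching it per basic block
 * lets calls with identical sources share one copy. */
void shuffle_down(const float *current, const float *next,
                  unsigned delta, float *out) {
    float combined[2 * SIMD];
    for (int i = 0; i < SIMD; ++i) {
        combined[i] = current[i];
        combined[SIMD + i] = next[i];
    }
    for (int i = 0; i < SIMD; ++i)
        out[i] = combined[i + delta];  /* assumes delta <= SIMD */
}
```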
When a shader has an early 'return' under non-uniform control flow,
all further calculations are skipped on the selected SIMD lanes. On
those lanes there might be a sample instruction that depends on
uninitialized coordinates, which is undefined behaviour according to
the spec. This WA initializes such lanes to zero to avoid corruptions.
Change sample sources execution mask
Before the change: sample source calculations were marked as
'subspanUse' and noMask was applied. Exception: noMask was not
applied if the sample result was a source to another sample;
'Vmask needed' was set instead.
After the change: sample sources are still marked as 'subspanUse',
but their list is also stored separately, since subspanUse is a
larger set than just the sample source calculations. Additionally, a
list of sample sources under control flow is created. Vmask is still
requested if a sample result is used as a source for another sample.
Execution mask policy: if the shader is executed with the dispatch
mask (Vmask not requested), the old behaviour is preserved and
noMask is applied. If Vmask is requested, noMask is applied only for
sample sources under control flow; this covers common application
misuses of sampling.
As the insertvalue chain is coalesced in deSSA, the emit part needs
to be changed accordingly.
1. Fixed emit errors related to partially-shared insertvalue:
   if src0 and dst are different vISA variables, src0 needs to be
   copied to dst first.
2. There is no need to have patterns for insertvalue/extractvalue.
With this, insertvalue/extractvalue on structs of primitive types
should work as expected.
Remove dead instructions with debug info enabled.
This change removes the IGC backend pattern match dependency setting on arguments to debug instructions, e.g.:
call void @llvm.dbg.value(metadata float %8, metadata !903, metadata !DIExpression()), !dbg !901 ; visa id: 20
This forces generation of the instruction producing %8. However, if that instruction is otherwise unused, it will be generated as dead code.
For example,
%8 = load float, float addrspace(1)* %7, align 4, !dbg !902 ; visa id: 18
...
%simdShuffle = call float @llvm.genx.GenISA.WaveShuffleIndex.f32(float %8, i32 0, i32 0), !dbg !981 ; visa id: 23
...
call void @llvm.dbg.value(metadata float %simdShuffle, metadata !904, metadata !DIExpression()), !dbg !901 ; visa id: 26
...
%9 = fadd fast float %simdShuffle, %simdShuffle.1, !dbg !982 ; visa id: 30
The pattern matcher will link out %simdShuffle and directly use %8
(via regioning). However, llvm.dbg.value creates a false dependency
on %simdShuffle and causes the shuffle to emit a dead broadcast mov
in vISA.
This change has no impact in -O0 debug mode; it only affects -O2
debug mode.
Refactoring by caching emask
Refactor the code to cache execMask and the number of active lanes
of the entire dispatch for reuse within a BB.
With this, GetExecutionMask() and GetNumActiveLanes() become
read-only and their results are cached for reuse within a BB.
Improve scalar atomic add/sub
For a scalar atomic (add/sub/inc/dec) without return and with a
uniform addend, a more efficient code sequence is used. For example,
"atomic_add (16|M0) p, 1" becomes:
emask = current emask
numBits = numOfOne(emask);
(W) atomic_add (1|M0) p, numBits
We basically save numBits for reuse within the same BB.
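The idea above can be sketched in C (names here are stand-ins for the pseudocode, not compiler APIs): rather than every active lane issuing atomic_add(p, 1), count the set bits of the execution mask once and issue a single scalar add of that count.

```c
#include <stdint.h>

/* Portable stand-in for numOfOne(emask): counts set bits. */
static unsigned popcount32(uint32_t x) {
    unsigned n = 0;
    while (x) { x &= x - 1; ++n; }
    return n;
}

/* Returns the single addend the "(W) atomic_add (1|M0) p, numBits"
 * would use for "atomic_add (16|M0) p, 1" under the given execution
 * mask: one add of numBits replaces numBits adds of 1. */
uint32_t scalar_atomic_addend(uint32_t emask) {
    return popcount32(emask);
}
```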
This change introduces clearing of the tag bits before generic
pointer comparison. It is required since some NULL generic pointers
may have the tag set and some may not.
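As a minimal sketch of why this matters (the tag width and position below are assumed for illustration, not taken from the compiler): if generic pointers carry an address-space tag in their high bits, a tagged NULL and an untagged NULL compare unequal bit-for-bit, so the tag bits must be masked away before the compare.

```c
#include <stdint.h>

/* Assumed tag location for this sketch: top bits of a 64-bit
 * generic pointer. A NULL with the tag set is still NULL. */
#define TAG_MASK 0xE000000000000000ull

/* Compare two generic pointers with the tag bits cleared first,
 * so tagged NULL == untagged NULL. */
int generic_ptr_eq(uint64_t a, uint64_t b) {
    return (a & ~TAG_MASK) == (b & ~TAG_MASK);
}
```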
Add helper lane mode to wave intrinsics
Added a helper lane mode argument to wave intrinsics. Much like
GenISA_WaveShuffleIndex, this argument denotes that helper lanes
should be active for the instruction when its value is 1.