Commit Graph

106 Commits

SHA1 Message Date
e3fc880273 Add GenericNullPtrPropagation Pass
Add a new pass that propagates null pointers across address space casts
and remove the no-longer-needed Generic Pointers Comparison pattern match.

This change is needed to fix a bug where comparisons between generic
pointers sometimes return incorrect results.
2025-10-28 14:43:01 +01:00
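
A minimal sketch of the propagation idea, assuming (as the pass relies on) that a null pointer stays null when cast to the generic address space; this is not the actual IGC pass and only handles the simplest case of a direct null operand:

  // Sketch only: rewrite "addrspacecast null" to a null pointer of the
  // destination address space, so later comparisons see a plain null.
  #include "llvm/ADT/STLExtras.h"
  #include "llvm/IR/Constants.h"
  #include "llvm/IR/DerivedTypes.h"
  #include "llvm/IR/Function.h"
  #include "llvm/IR/Instructions.h"

  using namespace llvm;

  static bool propagateNullAcrossAddrSpaceCasts(Function &F) {
    bool Changed = false;
    for (BasicBlock &BB : F) {
      for (Instruction &I : make_early_inc_range(BB)) {
        auto *Cast = dyn_cast<AddrSpaceCastInst>(&I);
        if (!Cast || !isa<ConstantPointerNull>(Cast->getOperand(0)))
          continue;
        // Replace the cast with a null pointer of the destination type.
        auto *DstTy = cast<PointerType>(Cast->getType());
        Cast->replaceAllUsesWith(ConstantPointerNull::get(DstTy));
        Cast->eraseFromParent();
        Changed = true;
      }
    }
    return Changed;
  }
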
ca8161f7e0 Modify Integer MAD Pattern Matching
Modify Integer MAD pattern matching to catch more cases.
2025-09-30 18:46:10 +02:00
0613f29a34 Changes in code. 2025-09-25 20:30:52 +02:00
7a2a2c2e94 Modify Integer MAD Pattern Matching
Modify Integer MAD pattern matching to catch more cases.
2025-09-25 01:10:57 +02:00
befdece36d [Autobackout][FunctionalRegression]Revert of change: 788001ec31: Modify Integer MAD Pattern Matching
Modify Integer MAD pattern matching to catch more cases.
2025-09-04 10:31:46 +02:00
788001ec31 Modify Integer MAD Pattern Matching
Modify Integer MAD pattern matching to catch more cases.
2025-09-03 20:22:43 +02:00
46629d9b5f [Autobackout][FunctionalRegression]Revert of change: 5bffd05743: Modify Integer MAD Pattern Matching
Modify Integer MAD pattern matching to catch more cases.
2025-08-30 08:28:14 +02:00
5bffd05743 Modify Integer MAD Pattern Matching
Modify Integer MAD pattern matching to catch more cases.
2025-08-30 04:02:24 +02:00
420b632df9 Update IGC code format
Update IGC code format
2025-07-20 06:20:11 +02:00
3bcf1d1b3e Another pattern match for packing 4 i8 values
The PR adds another pattern that detects packing of 4 i8 values into a
32-bit scalar. The detected pattern clamps the values to [0, 127] and
packs them using `shl` and `or` instructions.
2025-07-07 11:43:28 +02:00
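
As a hedged illustration of such a pattern (the matcher below is a sketch with made-up names, not the code added by the commit), recognizing `b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)` with LLVM's PatternMatch could look like this; operand commutativity and the [0, 127] clamp check are ignored for brevity:

  #include "llvm/IR/PatternMatch.h"
  #include "llvm/IR/Value.h"

  using namespace llvm;
  using namespace llvm::PatternMatch;

  // Illustrative sketch: recognize four byte-sized values packed into an i32
  // with shl/or; verifying that each value was clamped to [0, 127] is omitted.
  static bool matchPack4xI8(Value *V, Value *(&Bytes)[4]) {
    Value *Lo24 = nullptr, *Lo16 = nullptr;
    if (!match(V, m_Or(m_Value(Lo24),
                       m_Shl(m_Value(Bytes[3]), m_SpecificInt(24)))))
      return false;
    if (!match(Lo24, m_Or(m_Value(Lo16),
                          m_Shl(m_Value(Bytes[2]), m_SpecificInt(16)))))
      return false;
    return match(Lo16, m_Or(m_Value(Bytes[0]),
                            m_Shl(m_Value(Bytes[1]), m_SpecificInt(8))));
  }
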
657fcc9abf Update PatternMatch to generate saturation more often
Detect more cases when `.sat` qualifier can be used
2025-06-26 10:40:40 +02:00
66153b94e5 Optimize operations on packed 8-bit integers
Detect operations on `<4 x i8>` and avoid data unpacking and packing.
2025-06-25 07:46:15 +02:00
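
As a small hedged sketch (not the IGC code), detecting that a binary operation works on a packed `<4 x i8>` value with the LLVM C++ API could look like:

  #include "llvm/IR/DerivedTypes.h"
  #include "llvm/IR/Instructions.h"

  using namespace llvm;

  // Illustrative check: is this a binary op on <4 x i8>, i.e. a candidate for
  // staying in packed 32-bit form instead of being unpacked and repacked?
  static bool isPacked4xI8Op(const BinaryOperator &BO) {
    auto *VTy = dyn_cast<FixedVectorType>(BO.getType());
    return VTy && VTy->getNumElements() == 4 &&
           VTy->getScalarSizeInBits() == 8;
  }
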
de80a52bb3 [Autobackout][FunctionalRegression]Revert of change: b048fd1f97: Changes in code
Changes in code
2024-04-23 16:50:00 +02:00
b048fd1f97 Changes in code. 2024-04-22 22:28:01 +02:00
6d814c9de7 Changes in code. 2024-03-08 18:48:31 +01:00
9ad0bec2a7 Changes in code. 2024-03-08 01:15:41 +01:00
2e455a98b8 Re-apply multiple fixes (#16602)
Re-apply multiple fixes

c8f734d Fix a warning issue related to the overloaded struct
77fb673 GEP canon is on only for OCL
8d12a0f avoid y*(1/x) for double precision type
fbf1aa9 [IGC VC] GenXVerify pass, initial
0ae6dfb SetMaxRegForThreadDispatch was hardcoded for up to 128 GRFs.
932eafa Minor update to DisableRecompilation regkey description
bc3034f Fix bugs and update the LIT test for linear scan RA
45f1295 Enable GRF read delay of send stall instructions
7c95f49 [Autobackout][FuncReg]Revert of change: 3f0c186
620c74c Fix a few register regioning issues for 64b instructions on MTL platform
2023-12-13 23:28:54 +01:00
bfcfad722f Add Xe2 bindless SIP support
Add Xe2 bindless SIP support
2023-12-13 03:55:46 -05:00
8d12a0f6e4 avoid y*(1/x) for double precision type
Avoid y*(1/x) for double precision type.
2023-12-11 11:53:36 -05:00
3fc4f58023 Control over the support for the LSC immediate global base offset on the driver info level
This change adds a new property to the driver info that indicates whether support for the LSC immediate global base offset is enabled.
2023-09-15 22:29:00 +02:00
72f99595b9 PatternMatchPass cleanup
Removed unused code
2023-07-11 16:03:24 +02:00
a32779f332 avoid redundant moves for SIMD shuffle down intrinsic
SIMD shuffle down intrinsic takes "current" and "next" values and
combines them into a 2N-wide variable (where N is the SIMD size) to deal
with OOB lanes when shuffling. The moves that initialize this temporary
variable are materialized in the emit vISA pass, so when multiple
shuffle intrinsic calls have identical source operands, we end
up with multiple temp variables and redundant moves.

This change adds per-basic-block caching of the temporary variable, so
multiple shuffles in the same BB can share a common source.
2023-06-26 09:32:14 +02:00
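
A hedged sketch of the per-basic-block caching described above (type and method names are placeholders, not IGC's):

  #include <map>
  #include <tuple>

  struct Value;        // stand-in for the IR value type
  struct BasicBlock;   // stand-in for the IR basic-block type
  struct CVariable;    // stand-in for the vISA variable type

  // Shuffles in the same BB with identical "current"/"next" sources reuse one
  // combined 2N-wide temporary instead of re-materializing the copy moves.
  class ShuffleTempCache {
    std::map<std::tuple<const BasicBlock *, const Value *, const Value *>,
             CVariable *> Cache;

  public:
    CVariable *lookup(const BasicBlock *BB, const Value *Cur,
                      const Value *Next) const {
      auto It = Cache.find({BB, Cur, Next});
      return It == Cache.end() ? nullptr : It->second;
    }
    void remember(const BasicBlock *BB, const Value *Cur, const Value *Next,
                  CVariable *Tmp) {
      Cache[{BB, Cur, Next}] = Tmp;
    }
  };
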
efee112de6 Revert "avoid redundant moves for SIMD shuffle down intrinsic"
This reverts commit 097633ceb1.

Co-authored-by: Patryk Kwasniewski <patryk.kwasniewski@intel.com>
2023-06-21 15:46:30 +02:00
097633ceb1 avoid redundant moves for SIMD shuffle down intrinsic
SIMD shuffle down intrinsic takes "current" and "next" values and
combines them into a 2N-wide variable (where N is the SIMD size) to deal
with OOB lanes when shuffling. The moves that initialize this temporary
variable are materialized in the emit vISA pass, so when multiple
shuffle intrinsic calls have identical source operands, we end
up with multiple temp variables and redundant moves.

This change adds per-basic-block caching of the temporary variable, so
multiple shuffles in the same BB can share a common source.
2023-06-20 13:54:57 +02:00
fda9211138 WA for uninitialized sample coordinates
When a shader has an early 'return' under non-uniform control flow, all
further calculations are skipped for the affected SIMD lanes. Those lanes
may still reach a sample instruction that depends on uninitialized
coordinates, which is undefined behaviour according to the spec. This WA
initializes such lanes to zero to avoid corruptions.
2023-06-13 13:20:07 +02:00
34b9f38d81 Change sample sources execution mask
Before the change: sample sources calculations were marked as 'subspanUse'
and noMask was applied. Exception - noMask was not applied if the sample
result was a source to another sample; Vmask needed was set instead.
After the change: sample sources are still marked as 'subspanUse', but
their list is also stored separately - subspanUse is a larger set than only
the sample sources calculations. Additionally, a list of sample sources under
control flow is created. Vmask is still requested if the sample result is
used as a source for another sample.
Execution mask policy - if the shader is executed with the dispatch mask (Vmask
not requested), the old behaviour is preserved - noMasks are applied. If
Vmask is requested, noMask is only applied for sample sources under
control flow - this covers common application misuses of samples.
2023-04-21 17:16:21 +02:00
c17a53c177 [Autobackout][FuncReg]Revert of change: 6da2d48944
Change sample sources execution mask

Before the change: sample sources calculations were marked as 'subspanUse'
and noMask was applied. Exception - noMask was not applied if the sample
result was a source to another sample; Vmask needed was set instead.
After the change: sample sources are still marked as 'subspanUse', but
their list is also stored separately - subspanUse is a larger set than only
the sample sources calculations. Additionally, a list of sample sources under
control flow is created. Vmask is still requested if the sample result is
used as a source for another sample.
Execution mask policy - if the shader is executed with the dispatch mask (Vmask
not requested), the old behaviour is preserved - noMasks are applied. If
Vmask is requested, noMask is only applied for sample sources under
control flow - this covers common application misuses of samples.
2023-04-19 04:20:17 +02:00
6da2d48944 Change sample sources execution mask
Before the change: sample sources calculations were marked as 'subspanUse'
and noMask was applied. Exception - noMask was not applied if the sample
result was a source to another sample; Vmask needed was set instead.
After the change: sample sources are still marked as 'subspanUse', but
their list is also stored separately - subspanUse is a larger set than only
the sample sources calculations. Additionally, a list of sample sources under
control flow is created. Vmask is still requested if the sample result is
used as a source for another sample.
Execution mask policy - if the shader is executed with the dispatch mask (Vmask
not requested), the old behaviour is preserved - noMasks are applied. If
Vmask is requested, noMask is only applied for sample sources under
control flow - this covers common application misuses of samples.
2023-04-14 17:30:56 +02:00
3561f6d0bb Revert "Change sample sources calculations execution mask"
Revert
2023-03-22 10:36:31 +01:00
9d2f83d6d6 Change sample sources execution mask
Before the change: sample sources calculations were marked as 'subspanUse'
and noMask was applied. Exception - noMask was not applied if the sample
result was a source to another sample; Vmask needed was set instead.
After the change: sample sources are still marked as 'subspanUse', but
their list is also stored separately - subspanUse is a larger set than only
the sample sources calculations. Additionally, a list of sample sources under
control flow is created. Vmask is still requested if the sample result is
used as a source for another sample.
Execution mask policy - if the shader is executed with the dispatch mask (Vmask
not requested), the old behaviour is preserved - noMasks are applied. If
Vmask is requested, noMask is only applied for sample sources under
control flow - this covers common application misuses of samples.
2023-03-16 15:30:21 +01:00
5cc6599200 Fixed insertValue emit
As the insertvalue chain is coalesced in dessa, the emit part needs
to be changed accordingly.

1. Fixed emit errors related to partially-shared insertvalue.
   If src0 and dst are different visa variables, src0 needs to be
   copied to dst first.

2. No need to have patterns for insertvalue/extractvalue.

With this, insertvalue/extractvalue on structs of primitive
types should work as expected.
2023-02-13 22:45:36 +01:00
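
A minimal IRBuilder illustration (not IGC code) of the construct the fix targets: an insertvalue chain that builds up a struct of primitive types, followed by an extractvalue reading a field back:

  #include "llvm/IR/Constants.h"
  #include "llvm/IR/DerivedTypes.h"
  #include "llvm/IR/IRBuilder.h"

  using namespace llvm;

  // Build a {float, i32} value with an insertvalue chain and read the float back.
  static Value *buildStructChain(IRBuilder<> &B, Value *F, Value *I) {
    auto *STy = StructType::get(B.getFloatTy(), B.getInt32Ty());
    Value *Agg = UndefValue::get(STy);
    Agg = B.CreateInsertValue(Agg, F, {0});  // partially-shared chains arise when
    Agg = B.CreateInsertValue(Agg, I, {1});  // intermediate aggregates have other uses
    return B.CreateExtractValue(Agg, {0});
  }
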
e1395acaf3 [Autobackout][FuncReg]Revert of change: e2d4d33744
remove dead instructions with debug info enabled.

This change removes IGC backend pattern match dependency setting on arguments to debug instructions.
  call void @llvm.dbg.value(metadata float %8, metadata !903, metadata !DIExpression()), !dbg !901		; visa id: 20
This forces generation of the instruction producing %8. However, if that instruction is not otherwise used, it is generated as dead code.
For example,
  %8 = load float, float addrspace(1)* %7, align 4, !dbg !902		; visa id: 18
  ...
  %simdShuffle = call float @llvm.genx.GenISA.WaveShuffleIndex.f32(float %8, i32 0, i32 0), !dbg !981		; visa id: 23
  ...
  call void @llvm.dbg.value(metadata float %simdShuffle, metadata !904, metadata !DIExpression()), !dbg !901		; visa id: 26
  ...
  %9 = fadd fast float %simdShuffle, %simdShuffle.1, !dbg !982		; visa id: 30
The pattern matcher will link out %simdShuffle and directly use %8 (regioning).
However, llvm.dbg.value creates a false dependency on %simdShuffle and causes the shuffle
to emit a dead broadcast mov in vISA.

This change does not affect -O0 debug mode; it only affects -O2 debug mode.
2023-01-18 22:42:41 +01:00
e2d4d33744 remove dead instructions with debug info enabled.
This change removes IGC backend pattern match dependency setting on arguments to debug instructions.
  call void @llvm.dbg.value(metadata float %8, metadata !903, metadata !DIExpression()), !dbg !901		; visa id: 20
This forces generation of the instruction producing %8. However, if that instruction is not otherwise used, it is generated as dead code.
For example,
  %8 = load float, float addrspace(1)* %7, align 4, !dbg !902		; visa id: 18
  ...
  %simdShuffle = call float @llvm.genx.GenISA.WaveShuffleIndex.f32(float %8, i32 0, i32 0), !dbg !981		; visa id: 23
  ...
  call void @llvm.dbg.value(metadata float %simdShuffle, metadata !904, metadata !DIExpression()), !dbg !901		; visa id: 26
  ...
  %9 = fadd fast float %simdShuffle, %simdShuffle.1, !dbg !982		; visa id: 30
The pattern matcher will link out %simdShuffle and directly use %8 (regioning).
However, llvm.dbg.value creates a false dependency on %simdShuffle and causes the shuffle
to emit a dead broadcast mov in vISA.

This change does not affect -O0 debug mode; it only affects -O2 debug mode.
2023-01-18 03:08:55 +01:00
1c2446af09 Fix the warning C4505
Remove definitions of unused functions
2022-09-21 16:10:32 +02:00
fab5576d25 Refactoring by caching emask
Refactor the code to cache the number of active lanes of
the entire dispatch for reuse within a BB.

With this, GetNumActiveLanes() is read-only
and its result is cached for reuse within a BB.
2022-09-12 23:36:53 +02:00
0fa570cbe8 [Autobackout][FuncReg]Revert of change: b71f7d8eb0
Refactoring by caching emask

Refactor the code to cache execMask and the number of active lanes of
the entire dispatch for reuse within a BB.

With this, GetExecutionMask() and GetNumActiveLanes() are read-only
and their results are cached for reuse within a BB.
2022-09-12 13:00:25 +02:00
b71f7d8eb0 Refactoring by caching emask
Refactor the code to cache execMask and the number of active lanes of
the entire dispatch for reuse within a BB.

With this, GetExecutionMask() and GetNumActiveLanes() are read-only
and their results are cached for reuse within a BB.
2022-09-10 23:05:49 +02:00
a92bb4ec49 Improve scalar atomic add/sub
For scalar atomic (add/sub/inc/dec) without return and with uniform addend,
a more efficient code sequence will be used. For example,
"atomic_add (16|M0)  p,  1" will be:

  emask = current emask
  numBits = numOfOne(emask);
  (W) atomic_add (1|M0)  p, numBits

We basically save numBits for reuse within the same BB.
2022-09-07 21:09:20 +02:00
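
A worked illustration in plain C++ (not compiler code) of why the rewrite is legal when the addend is uniform and the return value is unused: every active lane would add the same value, so a single scalar atomic of popcount(emask) times the addend has the same effect.

  #include <atomic>
  #include <bitset>
  #include <cstdint>

  // Equivalent of "atomic_add (16|M0) p, addend" with no return value:
  // one (W)-style scalar atomic covering all active lanes at once.
  void scalarizedAtomicAdd(std::atomic<uint32_t> &p, uint16_t emask,
                           uint32_t uniformAddend) {
    uint32_t numBits = std::bitset<16>(emask).count();  // active lanes
    p.fetch_add(numBits * uniformAddend);
  }
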
c74ad9055f [Autobackout][FuncReg]Revert of change: 065dba60ab
Improve scalar atomic add/sub

For scalar atomic (add/sub/inc/dec) without return and with uniform addend,
a more efficient code sequence will be used. For example,
"atomic_add (16|M0)  p,  1" will be:

  emask = current emask
  numBits = numOfOne(emask);
  (W) atomic_add (1|M0)  p, numBits

We basically save numBits for reuse within the same BB.
2022-09-03 12:32:04 +02:00
065dba60ab Improve scalar atomic add/sub
For scalar atomic (add/sub/inc/dec) without return and with uniform addend,
a more efficient code sequence will be used. For example,
"atomic_add (16|M0)  p,  1" will be:

  emask = current emask
  numBits = numOfOne(emask);
  (W) atomic_add (1|M0)  p, numBits

We basically save numBits for reuse within the same BB.
2022-09-02 23:27:32 +02:00
704d5b637e Fix NULL generic pointers comparison
This change introduces clearing of the tag bits before generic pointer
comparison. It is required since some NULL generic pointers may
have the tag set and some may not.
2022-07-14 14:54:59 +02:00
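
A hedged sketch of the idea (the tag layout and mask below are illustrative assumptions, not IGC's actual encoding): mask off the address-space tag bits of both generic pointers before comparing, so a tagged null and an untagged null compare equal.

  #include <cstdint>

  // Illustrative tag mask; the real bit positions are an implementation detail.
  constexpr uint64_t kGenericTagMask = 0xE000000000000000ull;

  bool genericPtrEqual(uint64_t lhsBits, uint64_t rhsBits) {
    // Clear the tag bits on both sides before the comparison.
    return (lhsBits & ~kGenericTagMask) == (rhsBits & ~kGenericTagMask);
  }
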
23ae7233eb tune down threshold of latency-sched
Tune down the threshold for latency scheduling so it behaves the same
in the non-iterative setting.
2022-06-23 19:22:42 +02:00
4ba11e00ec Enable lrp and predicated mad matching based on fast-math flags
Enable matching `lrp` and predicated `mad` instructions based on the instructions'
fast-math flags.
2022-06-14 13:09:49 +02:00
03da8989d0 Port to LLVM 14
Port to LLVM 14
2022-06-02 10:33:53 +02:00
c223e56b01 Remove unneeded canonicalize instructions
Update PatternMatch to remove canonicalize instructions before and after instructions that flush denorms.
2022-04-20 16:57:04 +02:00
d753951e2f Add helper lane mode to wave intrinsics
Added helper lane mode argument to wave intrinsics. Much like with
GenISA_WaveShuffleIndex, this argument denotes that helper lanes
should be active for this instruction when its value is 1.
2022-04-07 17:38:00 +02:00
1f715498bf [Autobackout][FuncReg]Revert of change: 9e87804e5b
Add helper lane mode to wave intrinsics

Added helper lane mode argument to wave intrinsics. Much like with
GenISA_WaveShuffleIndex, this argument denotes that helper lanes
should be active for this instruction when its value is 1.
2022-04-02 00:24:26 +02:00
9e87804e5b Add helper lane mode to wave intrinsics
Added helper lane mode argument to wave intrinsics. Much like with
GenISA_WaveShuffleIndex, this argument denotes that helper lanes
should be active for this instruction when its value is 1.
2022-03-31 22:26:33 +02:00
0b68ab4123 Adding more DG2 & PVC code
In URBWrites and MeshShader regions.
2022-01-10 19:14:31 +01:00
5a753bbc33 Don't pattern match to frc.sat
Don't pattern match to frc.sat
2021-11-19 03:37:30 +01:00