Commit Graph

18372 Commits

Author SHA1 Message Date
09573028d0 [Autobackout][FunctionalRegression]Revert of change: e640d20fc8: Enable Code Scheduling on recompilation
Enable Code Scheduling on recompilation
2025-07-26 06:22:53 +02:00
74151dcc32 Remove unneeded assert
Remove unneeded assert
2025-07-26 05:07:18 +02:00
e640d20fc8 Enable Code Scheduling on recompilation
Enable Code Scheduling on recompilation
2025-07-25 15:11:38 +02:00
684ab05a6c Improve CodeScheduling
- Add caching for register pressure estimation, real uses computation
  and values size
- Implement fragmentation-aware register pressure adjustment heuristic for large loads
- Add new heuristic for prioritizing loads that unlock DPAS instructions
- Fix initial register pressure estimation for hoisted loads and corresponding IEs in BBIn
- Fix ftobf regpressure estimation
- Some changes of the whole scheduling workflow to take advantage of the
  backtracking
- Add new heuristic to put instructions between the load and the subsequent shuffling to hide latency
2025-07-25 13:25:04 +02:00
f8ce0b6d52 use whole GRF for load 2D
A block2d load's return size per block is a multiple of GRFs. If the data
actually returned per block is not a multiple of GRFs, its size is rounded up
to the next whole GRF, with the unused GRF storage filled with zeros.
2025-07-25 13:11:23 +02:00
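The rounding rule in f8ce0b6d52 can be sketched as a small calculation. This is only an illustration, not IGC code: the function name `padded_block_bytes` is hypothetical, and the 64-byte default GRF size is an assumption (the GRF width depends on the platform), so pass the size that applies to the target.

```python
def padded_block_bytes(block_bytes: int, grf_bytes: int = 64) -> int:
    """Round one block's payload size up to a whole number of GRFs.

    grf_bytes is an assumed register width, not a value taken from the commit.
    """
    if block_bytes < 0 or grf_bytes <= 0:
        raise ValueError("sizes must be non-negative and grf_bytes positive")
    # Ceiling division, then scale back to bytes.
    return -(-block_bytes // grf_bytes) * grf_bytes


if __name__ == "__main__":
    for size in (16, 64, 65):
        padded = padded_block_bytes(size)
        # The difference is the zero-filled tail described in the commit message.
        print(size, padded, padded - size)
```

Under these assumptions, a 65-byte block occupies two 64-byte GRFs (128 bytes), and the remaining 63 bytes are the zero-filled padding.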
b224b29b4d Remove workaround -Wno-error=implicit-int
Previously, some workloads used implicit int types in function
definitions. With LLVM 15, implicit ints are treated as errors.
The workaround to disable this error has now been removed, so IGC
enforces the same behavior as LLVM 15 and treats implicit ints
as errors.
2025-07-25 11:13:24 +02:00
38f7295c4a Mark gradient intrinsics as convergent
Mark gradient intrinsics as convergent
2025-07-25 02:20:46 +02:00
9977bdf206 Add handling for opaque pointer of Function/FunctionType case in ProgramScopeConstantAnalysis
The call to IGCLLVM::getNonOpaquePtrEltTy was failing when opaque pointers were enabled.
This change adds a check and appropriate handling for that scenario.
2025-07-25 02:01:45 +02:00
db24ad6d71 Use clang-format in auto-generated files
This change is to format auto-generated files with `clang-format` using llvm code style.
2025-07-24 23:33:04 +02:00
77cd23b537 Handle inline asm in vector alias
1. Improve vector alias optim to handle inline asm
2. Allow constant insert elements
2025-07-24 18:14:27 +02:00
16fa7371f0 Synchronization between branched
---------------------------
2025-07-24 15:04:24 +02:00
342c4fb729 Allow customizing function patch names
This change allows the compiler to set customized function patch names.
2025-07-23 13:47:37 +02:00
c4c4eb33c6 use whole GRF for load 2D
A block2d load's return size per block is a multiple of GRFs. If the data
actually returned per block is not a multiple of GRFs, its size is rounded up
to the next whole GRF, with the unused GRF storage filled with zeros.
2025-07-23 13:08:32 +02:00
a30d372e95 [Autobackout][FunctionalRegression]Revert of change: 1a88d6ea34: Adjust pre-RA scheduling heuristic
Adjust pre-RA scheduling heuristic
2025-07-23 00:29:49 +02:00
f82224aa18 [Autobackout][FunctionalRegression]Revert of change: 61e417b80b: Allow customizing function patch names
This change allows the compiler to set customized function patch names.
2025-07-22 22:16:28 +02:00
61e417b80b Allow customizing function patch names
This change allows the compiler to set customized function patch names.
2025-07-22 17:07:52 +02:00
bbdc8e9184 [IGC OCL] SYCL Joint Matrix enable 16-bit datatypes for C and D matrices.
Enable 16-bit datatypes for accumulator and output matrices in joint matrix.

Platforms:
PVC, DG2

Keywords:
Feature

Related-to: GSD-11139
Resolves:
2025-07-22 16:41:26 +02:00
33a954cf51 Remove redundant guard for a pattern of global imm offsets
This simplifies the control flow for a pattern of global imm offsets.
2025-07-22 15:53:34 +02:00
349f3e9d21 Disable SIMD16 drop heuristics for XE3
Disable heuristics due to regression.
2025-07-22 14:40:59 +02:00
c8ee2afd36 Update docs
Removed outdated information from the docs and updated the script snippets
so they are easy to copy.
2025-07-22 14:31:04 +02:00
dac4abb69a Expose a static function to identify ContinuationHLIntrinsic
This function makes it possible to identify `ContinuationHLIntrinsic` by the intrinsic ID.
2025-07-22 13:04:41 +02:00
e7ecc4545a Fix tryFindPointerOrigin assert
Fix assert condition to better match the message
and intention behind it.
2025-07-22 12:46:52 +02:00
b63540c8a8 Update discard mask pattern match fix
Update discard mask pattern match fix. Corrected predicate mode is set
for discard branch.
2025-07-22 12:43:45 +02:00
14aa10abcc [Autobackout][FunctionalRegression]Revert of change: 80068a026f: Update discard mask pattern match fix
Update discard mask pattern match fix. Corrected predicate mode is set
    for discard branch.
2025-07-22 04:00:09 +02:00
1a88d6ea34 Adjust pre-RA scheduling heuristic
Adjust pre-RA scheduling heuristic
2025-07-21 23:13:03 +02:00
b4641ae2ef Fix and improve inline asm
If inline asm's operands are aliases, the current code generates a copy
if the operand is an input, and does not handle aliased output operands.

When using copy, it is a little tricky whether to use NoMask or not,
especially for output operands. In addition, using inline asm is most
likely for performance and additional copies should be avoided as much
as possible.

This change fixes output alias operands and also removes copies by
generating visa alias decl with non-zero offset.
2025-07-21 21:42:38 +02:00
a1f7a26a59 Fix subroutine handling for intel_reqd_sub_group_size(32)
Previously, using `intel_reqd_sub_group_size(32)` on DG2 resulted in two
redundant SIMD32 call instructions being generated in vISA, which could
lead to unexpected issues. This change ensures that only a single SIMD32
call instruction is generated. All function arguments and return values
are now correctly passed using two SIMD16 instructions, eliminating
redundancy and improving
2025-07-21 13:58:39 +02:00
ce775e8c5e Create fix_missing_cstdint_dev_gcc15.patch
---------------------------
2025-07-21 12:28:48 +02:00
80068a026f Update discard mask pattern match fix
Update discard mask pattern match fix. Corrected predicate mode is set
for discard branch.
2025-07-21 09:58:25 +02:00
d56e799c0a Reduce number of available colors in RA by # of reserved
GRFs in fail safe

In fail safe RA, we reserve some number of GRFs to guarantee RA
termination. When GRFs are reserved, we must also reduce number of
available colors when determining color ordering.
2025-07-21 07:47:09 +02:00
420b632df9 Update IGC code format
Update IGC code format
v2.16.0
2025-07-20 06:20:11 +02:00
3976c0b30f [Autobackout][FunctionalRegression]Revert of change: c3e6c9d734: Don't cache volatile load store instructions
On platforms whose default cache policy is L1 and L3 cached,
    such as DG2 or BMG, volatile instructions are also cached. Since
    CUDA doesn't cache volatile pointers, some code is not
    supported on Intel GPUs, as caching volatile accesses can lead to hangs.
2025-07-20 03:56:01 +02:00
b23eb1ef49 gather send update
Gather send update
2025-07-19 10:40:25 +02:00
d7e78d5e45 Update clang-format column limit
Update clang-format column limit
2025-07-18 23:45:33 +02:00
a794173711 IGA: minor indent fix
For internal feature
2025-07-18 22:05:08 +02:00
323eca95f9 Support NaN in Bfloat MinMax resolution
Fixing BfloatFuncsResolution pass to support NaN in MinMax resolution.
2025-07-18 14:51:18 +02:00
c3e6c9d734 Don't cache volatile load store instructions
On platforms whose default cache policy is L1 and L3 cached,
such as DG2 or BMG, volatile instructions are also cached. Since
CUDA doesn't cache volatile pointers, some code is not
supported on Intel GPUs, as caching volatile accesses can lead to hangs.
2025-07-18 14:43:40 +02:00
b98a2ba086 Only Rematerialize ptr bitcasts for function calls inside
CloneAddressArithmetic

Only Rematerialize ptr bitcasts for function calls inside
CloneAddressArithmetic
2025-07-18 14:05:49 +02:00
92d6114445 Opaque pointer fixes in LegalizeFunctionSignatures, PromoteBools
In LegalizeFunctionSignatures don't call `getFunction()` which
returns parent function. Add support for llvm15+ which works
with opaque pointers and a legacy llvm 14 path.

In PromoteBools:
- Call `getType()` on load instruction - calling `getType()` on src
returns an opaque pointer.
- Use getValueType() in promoteGlobalVariable to work with
opaque pointers.
2025-07-18 12:51:53 +02:00
acae2f8d37 [Autobackout][FunctionalRegression]Revert of change: df4a2a246d: Fix subroutine handling for intel_reqd_sub_group_size(32)
Previously, using `intel_reqd_sub_group_size(32)` on DG2 resulted in two
    redundant SIMD32 call instructions being generated in vISA, which could
    lead to unexpected issues. This change ensures that only a single SIMD32
    call instruction is generated. All function arguments and return values
    are now correctly passed using two SIMD16 instructions, eliminating
    redundancy and improving
2025-07-18 11:47:49 +02:00
7fd0952833 Fix kernel_arg_base_type for OpenCL type arguments
The metadata node !kernel_arg_base_type must mirror !kernel_arg_type for
OpenCL builtin types (e.g. image1d_t). Unfortunately, this is
inconsistent with LLVM 16-based Common Clang.

This patch ensures that every OpenCL builtin type (*_t) listed in
!kernel_arg_type is also present in !kernel_arg_base_type at the same
position.
2025-07-18 05:45:28 +02:00
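The invariant described in 7fd0952833 can be sketched as follows. This is a hypothetical Python model of the two metadata lists, not the patch itself: `mirror_builtin_types` and `OPENCL_BUILTIN_TYPES` are names invented for illustration, and the set lists only a few OpenCL builtin types rather than all of them.

```python
# Illustrative subset of OpenCL builtin (*_t) types; the real list is longer.
OPENCL_BUILTIN_TYPES = {"image1d_t", "image2d_t", "image3d_t", "sampler_t"}


def mirror_builtin_types(arg_types, base_types):
    """Copy each builtin type in arg_types to the same position in base_types.

    Models !kernel_arg_type and !kernel_arg_base_type as plain lists of strings.
    Entries that are not builtin types keep their existing base type.
    """
    if len(arg_types) != len(base_types):
        raise ValueError("metadata lists must have the same length")
    return [
        arg if arg in OPENCL_BUILTIN_TYPES else base
        for arg, base in zip(arg_types, base_types)
    ]


if __name__ == "__main__":
    # "opaque" stands in for whatever mismatched value the front end produced.
    print(mirror_builtin_types(["image1d_t", "myint"], ["opaque", "int"]))
```

The positional `zip` reflects the commit's requirement that a builtin type appear in both lists "at the same position"; non-builtin entries such as typedefs are left alone.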
3c5f2266ea Re-enable dst for lifetime_start in resource loop header
Re-enable dst for lifetime_start in resource loop header. Set the
condition to benefit compatible workloads.
2025-07-18 05:21:44 +02:00
82a1986c0a RA change
_OS_DESCRIPTION
2025-07-18 02:38:09 +02:00
d2d30ed8fc IGA: Update GEDLibrary version
For internal feature
2025-07-18 02:05:25 +02:00
e2e261bd7e [Autobackout][FunctionalRegression]Revert of change: 890b8bf021: Bitcast in StatelessToStateful pass
Add support for bitcast instruction into StatelessToStateful pass.
2025-07-18 00:23:49 +02:00
bb1ad498e6 Retype TargetExtTy return types of function declarations
Clang 16 still lowers OpenCL/SPIR-V built-ins as ptr to opaque structs,
while SPIR-V Reader uses TargetExtTy values. This patch extends the
retyping function to also retype the return types of function (builtins)
declarations. Please note that the builtin function resolution is already
done earlier by SPIR-V Reader.

This patch also changes how ImageFuncsAnalysis pass recognizes
image/sampler types. Now, instead of relying on pointer element
types, the pass uses IGC metadata (m_OpenCLArgBaseTypes) --
consistent with other passes later on in the pipeline.
2025-07-17 20:38:16 +02:00
df4a2a246d Fix subroutine handling for intel_reqd_sub_group_size(32)
Previously, using `intel_reqd_sub_group_size(32)` on DG2 resulted in two
redundant SIMD32 call instructions being generated in vISA, which could
lead to unexpected issues. This change ensures that only a single SIMD32
call instruction is generated. All function arguments and return values
are now correctly passed using two SIMD16 instructions, eliminating
redundancy and improving
2025-07-17 10:42:38 +02:00
95429e8897 [Autobackout][FunctionalRegression]Revert of change: cee4ac4e9e: Retype TargetExtTy return types of function declarations
Clang 16 still lowers OpenCL/SPIR-V built-ins as ptr to opaque structs,
    while SPIR-V Reader uses TargetExtTy values. This patch extends the
    retyping function to also retype the return types of function (builtins)
    declarations. Please note that the builtin function resolution is already
    done earlier by SPIR-V Reader.

    This patch also changes how ImageFuncsAnalysis pass recognizes
    image/sampler types. Now, instead of relying on pointer element
    types, the pass uses IGC metadata (m_OpenCLArgType) --
    consistent with other passes later on in the pipeline.
2025-07-17 07:27:44 +02:00
2284c57ba1 Apply clang-format
Apply clang-format, no functional change
2025-07-17 02:50:09 +02:00
0b39832e8e [Autobackout][FunctionalRegression]Revert of change: 0e204e2473: Enable SIMD16 drop for more platforms
Enable abort on spills to SIMD16 for more platforms.
2025-07-17 01:56:16 +02:00