Extended retry manager
Extended the behaviour of the retry manager to check whether the previously
compiled kernel was better. If, after a "retry" compilation, the current shader
is worse than the previous one, pick the previous one and abort the retry
manager for this kernel.
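A minimal sketch of the "keep the better attempt" decision described above. The `KernelStats` fields and the quality metric (spill size first, instruction count as a tie-breaker) are assumptions made for illustration; the source does not say how "worse" is measured.

```cpp
#include <cstdint>

// Hypothetical summary of one compilation attempt; the field names are
// illustrative and not taken from the compiler sources.
struct KernelStats {
    std::uint32_t spillSize;  // bytes of spill/fill memory
    std::uint32_t instCount;  // number of emitted instructions
};

// Assumed metric: more spilling is worse; instruction count breaks ties.
inline bool isWorse(const KernelStats& current, const KernelStats& previous) {
    if (current.spillSize != previous.spillSize)
        return current.spillSize > previous.spillSize;
    return current.instCount > previous.instCount;
}

enum class RetryDecision { KeepCurrent, RestorePreviousAndStop };

// After a retry compilation, either keep the new result or fall back to the
// previous one and stop retrying this kernel.
inline RetryDecision decideAfterRetry(const KernelStats& current,
                                      const KernelStats& previous) {
    return isWorse(current, previous) ? RetryDecision::RestorePreviousAndStop
                                      : RetryDecision::KeepCurrent;
}
```

Equal results count as "not worse" here, so the retried kernel is kept in that case; that choice is also an assumption.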
We hit a nonconformance issue when disabling global opt in staged
compilation. CustomSafeOpt, CFGSimplification and JumpThreading are the culprit
passes and cannot be disabled.
Implement logic to detect whether a stateless scratch space pointer is needed
(per-thread scratch space size > 256kB), and set the corresponding information
in the compiler output and payload.
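The check itself can be sketched as below. The constant and function names are hypothetical, and reading "256kB" as 256 * 1024 bytes is an assumption.

```cpp
#include <cstdint>

// Assumed interpretation of the 256kB limit from the description above.
constexpr std::uint64_t kStatelessScratchThresholdBytes = 256ull * 1024ull;

// Returns true when the per-thread scratch space exceeds the limit, i.e. when
// a stateless scratch space pointer would be needed.
inline bool needsStatelessScratchPointer(std::uint64_t perThreadScratchBytes) {
    return perThreadScratchBytes > kStatelessScratchThresholdBytes;
}
```

The comparison is strict, matching the "> 256kB" wording: a kernel using exactly 256kB does not need the pointer.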
Both Scalar and Vector paths need to support the following options:
* `-cl-disable-zebin`
* `-cl[-intel]-allow-zebin`
* `-ze-allow-zebin`
* `-ze-disable-zebin`
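A sketch of how the options listed above could be recognised in an option string. `ZebinRequest` and `parseZebinOptions` are hypothetical names, `-cl[-intel]-allow-zebin` is read as the two spellings `-cl-allow-zebin` and `-cl-intel-allow-zebin`, and letting the last option win is an assumption rather than documented behaviour.

```cpp
#include <sstream>
#include <string>

enum class ZebinRequest { Unspecified, Allow, Disable };

// Scans whitespace-separated options and reports the zebin request.
// Assumption: when several zebin options are present, the last one wins.
inline ZebinRequest parseZebinOptions(const std::string& options) {
    ZebinRequest result = ZebinRequest::Unspecified;
    std::istringstream in(options);
    std::string tok;
    while (in >> tok) {
        if (tok == "-cl-allow-zebin" || tok == "-cl-intel-allow-zebin" ||
            tok == "-ze-allow-zebin") {
            result = ZebinRequest::Allow;
        } else if (tok == "-cl-disable-zebin" || tok == "-ze-disable-zebin") {
            result = ZebinRequest::Disable;
        }
    }
    return result;
}
```

Sharing one helper like this between the scalar and vector paths is one way to keep the two consistent.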
Updated 2 rules in DpasMacroBuilder:
- src2 read suppression can allow dp dpas as long as the rep-count is 4
- WAW and WAR dependencies are allowed between dpas within the same macro
Add CFGSimplification and JumpThreading passes back for correctness
We hit a nonconformance issue when disabling global opt in staged
compilation. CFGSimplification and JumpThreading are the culprit passes
and cannot be disabled.
If an AnyHit shader directly invokes a ClosestHit shader,
CommittedHit has to be manually filled with the
data from PotentialHit.
Copying u, v and HitInfo (3rd dword) was missing.
Check for expensive loops
This change looks for functions/kernels that have loops with high
estimated register pressure in the pre-header and forces a retry
compilation with LICM disabled for those functions.
The root issue is the LICM pass, which moves a lot of invariant code
to the loop pre-header. This lengthens the live ranges of the hoisted
values and ultimately increases the number of spill-fills.
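The detection step can be sketched as follows. `LoopSummary`, `RetryPlan`, and the caller-supplied pressure threshold are hypothetical; the source does not state how the pressure is estimated or what limit is used.

```cpp
#include <vector>

// Hypothetical per-loop summary: estimated number of live registers in the
// loop pre-header.
struct LoopSummary {
    unsigned preheaderPressure;
};

// A function is "expensive" if any of its loops exceeds the threshold.
inline bool hasExpensiveLoop(const std::vector<LoopSummary>& loops,
                             unsigned pressureThreshold) {
    for (const LoopSummary& loop : loops) {
        if (loop.preheaderPressure > pressureThreshold)
            return true;
    }
    return false;
}

struct RetryPlan {
    bool forceRetry;
    bool disableLICM;
};

// Functions with an expensive loop are recompiled with LICM turned off.
inline RetryPlan planRetry(const std::vector<LoopSummary>& loops,
                           unsigned pressureThreshold) {
    const bool expensive = hasExpensiveLoop(loops, pressureThreshold);
    return RetryPlan{expensive, expensive};
}
```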
Make vISA_AbortOnSpillThreshold SaveOption code consistent for
PS, CS, and OCL. SIMD16/SIMD32_SpillThreshold are applied to
SIMD16/SIMD32 respectively.
Run SROA after RemoveunsupportedIntrinsics
RemoveunsupportedIntrinsics changes memcpy into loads and stores.
Running SROA after that can limit private memory usage.
register pressure
In loops with inst count greater than a threshold (500) and register pressure
greater than 98% of max pressure, make split less aggressive to avoid
more spilling.
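The condition above can be written as a small predicate. The function name and parameters are hypothetical; only the two thresholds (500 instructions, 98% of max pressure) come from the description.

```cpp
#include <cstdint>

constexpr unsigned kLoopInstCountThreshold = 500;
constexpr unsigned kPressurePercentThreshold = 98;

// Returns true when splitting should be made less aggressive: the loop is
// large and its register pressure is above 98% of the maximum.
inline bool shouldLimitSplit(unsigned loopInstCount, unsigned regPressure,
                             unsigned maxPressure) {
    // Compare regPressure / maxPressure > 98 / 100 using integers only;
    // 64-bit products avoid overflow.
    const std::uint64_t lhs = static_cast<std::uint64_t>(regPressure) * 100u;
    const std::uint64_t rhs =
        static_cast<std::uint64_t>(maxPressure) * kPressurePercentThreshold;
    return loopInstCount > kLoopInstCountThreshold && lhs > rhs;
}
```

Cross-multiplying keeps the check in integer arithmetic, so the 98% boundary is not affected by floating-point rounding.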