MaxDualSubSlicesSupported in the GT_SYSTEM_INFO structure is
filled with the number of enabled DualSubSlices when the KMD
is queried. However, what we need is the index of the last
enabled DualSubSlice.
Thread scratch space has to be allocated based on the native
die config (including unfused or non-enabled DualSubSlices).
Since HW doesn't give us a way to know the exact native die
config, in SW we need to allocate RT stacks with enough size
based on the last used DualSubSlice.
The IsDynamicallyPopulated field in GT_SYSTEM_INFO indicates
whether system details are populated from fuse registers or
hard-coded. Based on this field's value, we calculate
numRtStacks appropriately.
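A minimal sketch of the calculation described above, not the actual
driver code. SystemInfoSketch, the flattened dssEnabled mask and
stacksPerDss are hypothetical stand-ins; the real GT_SYSTEM_INFO
layout and the real per-DualSubSlice stack count differ. The sketch
also assumes that the hard-coded case can use
MaxDualSubSlicesSupported directly.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical, simplified stand-in for GT_SYSTEM_INFO.
struct SystemInfoSketch {
    bool isDynamicallyPopulated;        // true: values come from fuse registers
    uint32_t maxDualSubSlicesSupported; // count of enabled DSS when dynamic
    std::array<bool, 32> dssEnabled;    // assumed flattened enable mask
};

// Assumed number of RT stacks per DualSubSlice (illustrative value only).
constexpr uint32_t stacksPerDss = 2048;

// Returns (index of the last enabled DSS) + 1, or 0 if none is enabled.
// Disabled DSS below the last enabled one are still counted, so the
// allocation covers the holes in the die config.
uint32_t highestEnabledDssCount(const SystemInfoSketch &info) {
    uint32_t count = 0;
    for (std::size_t i = 0; i < info.dssEnabled.size(); ++i) {
        if (info.dssEnabled[i]) {
            count = static_cast<uint32_t>(i) + 1;
        }
    }
    return count;
}

// Picks the DSS count based on how the system info was populated.
uint32_t numRtStacks(const SystemInfoSketch &info) {
    const uint32_t dssCount = info.isDynamicallyPopulated
                                  ? highestEnabledDssCount(info)
                                  : info.maxDualSubSlicesSupported;
    return dssCount * stacksPerDss;
}
```

With only DSS 0 and DSS 5 enabled, the dynamic path sizes the
allocation for 6 DualSubSlices rather than 2.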
Related-To: LOCI-3954
Signed-off-by: Raiyan Latif <raiyan.latif@intel.com>
This fixes several bugs in the previous (reverted) implementation.
We now use the correct RTStack pointer offset and a larger RTStack
size.
Related-To: LOCI-2966
Signed-off-by: Jim Snow <jim.m.snow@intel.com>
Previously we used an array-of-pointers approach, but using an
array-of-structures is in some ways simpler.
We also split out the RTStack as a separate allocation.
Related-To: LOCI-2966
Signed-off-by: Jim Snow <jim.m.snow@intel.com>
If a kernel has ray tracing calls, we allocate and initialize the
per-device RTDispatchGlobals if needed, and pass a pointer to them
to the running kernel via an implicit parameter.
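The flow above can be sketched roughly as follows. This is an
illustration under assumptions, not the driver implementation:
DeviceSketch, KernelSketch, RtDispatchGlobalsSketch and prepareKernel
are hypothetical names, host memory stands in for a GPU allocation,
and the real RTDispatchGlobals layout is not reproduced here.

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for the real RTDispatchGlobals layout.
struct RtDispatchGlobalsSketch {
    const uint8_t *rtStackBase = nullptr; // start of the RT stack allocation
    uint32_t numRtStacks = 0;
    uint32_t stackSizeBytes = 0;
};

// Hypothetical device wrapper that owns the ray tracing state.
class DeviceSketch {
  public:
    DeviceSketch(uint32_t numRtStacks, uint32_t stackSizeBytes)
        : numRtStacks(numRtStacks), stackSizeBytes(stackSizeBytes) {}

    // Allocates and initializes the globals on first use, then reuses them.
    RtDispatchGlobalsSketch *getOrCreateRtDispatchGlobals() {
        if (!globals) {
            // The RT stack is kept as its own allocation.
            rtStack.resize(static_cast<std::size_t>(numRtStacks) * stackSizeBytes);
            globals = std::make_unique<RtDispatchGlobalsSketch>();
            globals->rtStackBase = rtStack.data();
            globals->numRtStacks = numRtStacks;
            globals->stackSizeBytes = stackSizeBytes;
        }
        return globals.get();
    }

    bool hasRtDispatchGlobals() const { return globals != nullptr; }

  private:
    uint32_t numRtStacks;
    uint32_t stackSizeBytes;
    std::vector<uint8_t> rtStack;
    std::unique_ptr<RtDispatchGlobalsSketch> globals;
};

// Hypothetical kernel description with one implicit parameter slot.
struct KernelSketch {
    bool hasRayTracingCalls = false;
    const RtDispatchGlobalsSketch *implicitRtGlobals = nullptr;
};

// Only kernels with ray tracing calls trigger the allocation.
void prepareKernel(DeviceSketch &device, KernelSketch &kernel) {
    if (kernel.hasRayTracingCalls) {
        kernel.implicitRtGlobals = device.getOrCreateRtDispatchGlobals();
    }
}
```

Kernels without ray tracing calls never trigger the allocation, and
all ray tracing kernels on one device share the same globals.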
Related-To: NEO-5384
Signed-off-by: Jim Snow <jim.m.snow@intel.com>
This allocates the buffer on a per-device basis and enables ray
tracing on devices that support it when given a kernel with ray
tracing calls.
Signed-off-by: Jim Snow <jim.m.snow@intel.com>