Optimize zeKernelSetGroupSize by returning success early if the group
size values have not changed since the last call.
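
The early return can be sketched as follows. This is a minimal
illustration, not the driver code: `KernelSketch`, `Result` and
`recomputeCount` are hypothetical stand-ins for the real kernel object,
`ze_result_t` and the derived-state update.

```cpp
#include <cstdint>

// Hypothetical result type standing in for ze_result_t.
enum class Result { Success, InvalidGroupSize };

// Hypothetical kernel object; only the group-size caching is modeled.
class KernelSketch {
  public:
    Result setGroupSize(uint32_t x, uint32_t y, uint32_t z) {
        if (x == 0 || y == 0 || z == 0) {
            return Result::InvalidGroupSize;
        }
        // Fast path: same values as the previous call, nothing to redo.
        if (x == groupSize[0] && y == groupSize[1] && z == groupSize[2]) {
            return Result::Success;
        }
        groupSize[0] = x;
        groupSize[1] = y;
        groupSize[2] = z;
        // Stands in for the expensive recomputation of derived state.
        ++recomputeCount;
        return Result::Success;
    }

    uint32_t recomputeCount = 0;

  private:
    uint32_t groupSize[3] = {0, 0, 0};
};
```

Calling `setGroupSize(8, 4, 1)` twice in a row increments
`recomputeCount` only once in this sketch.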
Related-To: NEO-7394
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
Two code paths contained invalid logic for traversing the opaque list
of pNext structures. Both have been fixed.
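
The message does not say what the faulty logic was, so the following is
only a sketch of a well-formed traversal of such a chain. `BaseDesc` and
`findInChain` are hypothetical names modeled on the stype/pNext
convention, not the real Level Zero types.

```cpp
#include <cstdint>

// Hypothetical common header assumed to start every struct in the chain.
struct BaseDesc {
    uint32_t stype;
    const void *pNext;
};

// Returns the first node with the requested stype, or nullptr if the
// chain does not contain one.
inline const BaseDesc *findInChain(const void *head, uint32_t wantedStype) {
    auto *node = static_cast<const BaseDesc *>(head);
    while (node != nullptr) {
        if (node->stype == wantedStype) {
            return node;
        }
        // Advance through the current node's pNext, never the head's.
        node = static_cast<const BaseDesc *>(node->pNext);
    }
    return nullptr;
}
```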
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
Add missing allocation of kernel private memory for the scenario when
the private memory is not allocated within `KernelImp::initialize()` but
deferred until `appendLaunchKernelWithParams()` instead.
A single kernel can never allocate more private/scratch memory than
`globalMemorySize`; an attempt to do so results in
`ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY` being returned. However, several
separate kernels together can exceed `globalMemorySize`, and in that
case the private region of each such kernel is allocated at a later
stage, in `appendLaunchKernelWithParams()`.
This mechanism was already present on pre-xehp platforms and is now
added to xehp-and-later.
See:
* ModuleImp::checkIfPrivateMemoryPerDispatchIsNeeded()
* Module::shouldAllocatePrivateMemoryPerDispatch()
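
The decision described above can be sketched as below. This assumes the
check compares the sum of the per-kernel private sizes against
`globalMemorySize`; the helper name and its signature are hypothetical
and do not reproduce the functions listed above.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper: true when the kernels' private regions, taken
// together, would exceed the device's global memory size, so that
// allocation should be deferred to dispatch time.
inline bool needsPrivateMemoryPerDispatch(
    const std::vector<uint64_t> &perKernelPrivateSizes,
    uint64_t globalMemorySize) {
    uint64_t total = 0; // invariant: total <= globalMemorySize
    for (uint64_t size : perKernelPrivateSizes) {
        // Compare against the remaining headroom to avoid overflow.
        if (size > globalMemorySize - total) {
            return true;
        }
        total += size;
    }
    return false;
}
```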
Related-To: NEO-7398
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
This fixes several bugs in the previous (reverted) implementation.
We now use the correct RTStack pointer offset and a larger RTStack size.
Related-To: LOCI-2966
Signed-off-by: Jim Snow <jim.m.snow@intel.com>
Add support for inline samplers in zebin.
Generate required SAMPLER_STATEs in DSH.
Resolves: NEO-7388
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
Instead of only returning the proper error code when the available
Shared Local Memory size is exceeded, we also print an error message
to make debugging easier.
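
A minimal sketch of the idea, assuming the message goes to stderr. The
`Result` values, the function name and the message text are all
hypothetical; the actual error code and wording are not given here.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Hypothetical result type; the real error code is not named above.
enum class Result { Success, SlmSizeExceeded };

// Returns an error and prints a diagnostic when the request does not fit.
inline Result checkSlmSize(uint32_t requestedSize, uint32_t availableSize) {
    if (requestedSize > availableSize) {
        std::fprintf(stderr,
                     "Requested SLM size (%" PRIu32
                     ") exceeds available SLM size (%" PRIu32 ")\n",
                     requestedSize, availableSize);
        return Result::SlmSizeExceeded;
    }
    return Result::Success;
}
```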
Related-To: NEO-7280
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
Previously we used an array-of-pointers approach, but using an
array-of-structures is in some ways simpler.
We also split out the RTStack as a separate allocation.
Related-To: LOCI-2966
Signed-off-by: Jim Snow <jim.m.snow@intel.com>
With compiler LSC WAs this gives better performance.
If the debugger is active, the policy is not changed, i.e. it
remains WBP.
Related-To: NEO-7003
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
This change:
- prevents writing memory outside the range of the destination buffer
- prevents calling strlen() on a non-null-terminated C string
- corrects the range-validation logic so that the copy proceeds
when the real length fits the destination buffer
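
The three points above can be illustrated with a bounded copy. This is
a sketch under assumptions, not the patched function: `copyBounded` and
its parameters are hypothetical, and `memchr` is used here as the
bounded replacement for `strlen()`.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies the string in src (at most srcCapacity
// bytes are inspected) into dst and null-terminates it. Returns false
// when the real length plus the terminator does not fit in dstSize.
inline bool copyBounded(char *dst, size_t dstSize,
                        const char *src, size_t srcCapacity) {
    if (dst == nullptr || src == nullptr || dstSize == 0) {
        return false;
    }
    // Bounded scan: never reads past srcCapacity, unlike strlen().
    const void *end = std::memchr(src, '\0', srcCapacity);
    const size_t length =
        end ? static_cast<size_t>(static_cast<const char *>(end) - src)
            : srcCapacity;
    // Validate the real length, not the capacity of the source buffer.
    if (length >= dstSize) {
        return false;
    }
    std::memcpy(dst, src, length);
    dst[length] = '\0';
    return true;
}
```

Note that a 16-byte source buffer holding a 5-character string is
accepted for an 8-byte destination, because only the real length is
checked.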
Related-To: NEO-7264
Signed-off-by: Wrobel, Patryk <patryk.wrobel@intel.com>
- Check that the image format is valid for the image
argument to a SPIR-V module. If the image format is invalid,
return ZE_RESULT_ERROR_UNSUPPORTED_IMAGE_FORMAT.
Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
The implementation assumed that if HasRTCalls is true, the
RTDispatchGlobals patch token is also valid, but that is not the case
when the application uses its own RTDispatchGlobals instead of the
one provided by the L0 UMD.
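
The corrected condition can be sketched as two independent checks. All
names below are hypothetical, and representing an absent token by a
sentinel offset is an assumption made for illustration only.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical sentinel meaning "patch token not present".
constexpr uint32_t undefinedOffset = std::numeric_limits<uint32_t>::max();

// Hypothetical subset of the kernel metadata relevant to the check.
struct KernelInfoSketch {
    bool hasRTCalls;
    uint32_t rtDispatchGlobalsOffset; // undefinedOffset when absent
};

// Patch only when the kernel makes RT calls AND the token is valid;
// hasRTCalls alone does not imply that the token exists.
inline bool shouldPatchRtDispatchGlobals(const KernelInfoSketch &info) {
    return info.hasRTCalls &&
           info.rtDispatchGlobalsOffset != undefinedOffset;
}
```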
Related-To: LOCI-3323
Signed-off-by: Jim Snow <jim.m.snow@intel.com>