Sizing context (PVC):
When using LargeGRF (a.k.a. GRF256) there are only 4 HW threads per EU
(instead of default 8). Together with SIMD16 that means that there can
be max 64 work-items per EU. With 8 EU per subslice this gives 512
work-items on a single subslice. For correct intra-WG synchronization
all its WIs must be executed on the same subslice (to access the same
SLM, where the synchronization primitives are stored). Thus, with SIMD16
and LargeGRF the work-group size must not exceed 512 (PVC example).
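The arithmetic above can be sketched as follows. This is only an
illustration: `EuConfig`, `maxWorkItemsPerSubslice()` and
`clampMaxWorkGroupSize()` are hypothetical names, not the runtime's
actual API, and the numbers are the PVC values quoted above.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical description of the per-kernel execution configuration.
struct EuConfig {
    uint32_t threadsPerEu;  // 8 by default, 4 when the kernel uses LargeGRF
    uint32_t simdSize;      // work-items per HW thread, e.g. 16 for SIMD16
    uint32_t euPerSubslice; // 8 in the PVC example
};

// Work-items that fit on one subslice: threads/EU * SIMD width * EUs.
constexpr uint32_t maxWorkItemsPerSubslice(const EuConfig &cfg) {
    return cfg.threadsPerEu * cfg.simdSize * cfg.euPerSubslice;
}

// A work-group must fit on a single subslice, so the device-wide limit
// is clamped by what the kernel's configuration allows.
constexpr uint32_t clampMaxWorkGroupSize(uint32_t deviceMax, const EuConfig &cfg) {
    return std::min(deviceMax, maxWorkItemsPerSubslice(cfg));
}
```

With `{4, 16, 8}` (LargeGRF, SIMD16) a device limit of 1024 is clamped
to 512; with the default `{8, 16, 8}` it stays at 1024.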
So far `maxWorkGroupSize` is taken solely from a DeviceInfo structure
both in `ModuleTranslationUnit::processUnpackedBinary()` and
`ModuleImp::initialize()`. This approach does not take kernel parameters
(LargeGRF) into account, so a kernel using LargeGRF with SIMD16 can be
submitted with the work-group size set to 1024, which leads to a hang.
Fix the `.maxWorkGroupSize` computation so that it takes the kernel
parameters into consideration.
Add new tests (for discrete platforms >= XeHP), adapt existing ones, and
fix cosmetics along the way.
Similar check for OCL:
https://github.com/intel/compute-runtime/blob/master/opencl/source/command_queue/enqueue_kernel.h#L130
Related-To: NEO-7684
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
The printf buffer is now flushed when host synchronize is called
during event synchronization in a command list.
Related-To: LOCI-3681
Signed-off-by: Zhang, Winston <winston.zhang@intel.com>
In order to support the latest spec, where sysman's initialization
can happen independently of core's initialization, add a new sysman
directory inside level_zero.
Related-To: LOCI-3887
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
Due to the mixed usage of `std::string` objects and C-strings, null terminator
characters sometimes propagate into possible `program::updateBuildLog()`
inputs. In particular, `TranslationOutput::frontendCompilerLog` and
`::backendCompilerLog` get set this way within the `CompilerInterface::build()`
API implementation.
This becomes an issue whenever both the CL -> IR and IR -> ISA compilation
steps emit warnings and/or errors:
`clGetProgramBuildInfo(..., CL_PROGRAM_BUILD_LOG, ...)` also relies on
`std::string` to C-string conversion, so the output is trimmed at the first
null terminator, i.e. the BE part of the build log simply gets lost.
The change handles possible null terminators within `program::updateBuildLog()`
and adds relevant regression tests.
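A minimal sketch of the idea, assuming the fix amounts to dropping
embedded null terminators before the message is appended to the log.
`appendToBuildLog()` is a hypothetical stand-in, not the actual
`program::updateBuildLog()` implementation, and the newline separator
between entries is likewise an assumption.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: append one compiler message to the build log.
inline void appendToBuildLog(std::string &buildLog, std::string message) {
    // Remove every embedded '\0' so that a later std::string -> C-string
    // conversion does not cut the log short at the first terminator.
    message.erase(std::remove(message.begin(), message.end(), '\0'),
                  message.end());
    if (message.empty()) {
        return;
    }
    if (!buildLog.empty()) {
        buildLog.push_back('\n'); // assumed separator between log entries
    }
    buildLog += message;
}
```

After appending a frontend and a backend message that each carry a
trailing '\0', `buildLog.c_str()` still exposes both parts, because no
null terminator remains inside the string.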
Related-To: IGC-6509
Signed-off-by: Artem Gindinson <artem.gindinson@intel.com>
- add estimation parameter for interface descriptor data count
- add an alignment parameter to the heap estimation for dynamic and surface heaps
- extend encode interface and implementations to allow child heaps
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
This commit switches the device ID logic from the deprecated
implementation to the new one, so that if the user passes a hex value
to the -device parameter, ocloc will use the new implementation in the
product config helper. The change also introduces a fix for setting the
values in the correct order to configure the hwInfo correctly.
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
Related-To: NEO-7487
When a multi-storage image is initialized with memory, we need to track
the location of the actual memory.
Related-To: NEO-5735
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Refactor structure and add field to pass USM memory type.
To maintain backwards compatibility with current applications,
pass 0 as type for device allocations, and 1 for host
allocations.
Related-To: LOCI-3771
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
When a multi-storage buffer is initialized with memory, we need to track
the location of the actual data.
Also remove redundant parameters from the copyHostPointer function.
Related-To: NEO-5735
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Use getSubDevicesCount() from hwInfo to determine whether device
is root or not, instead of isSubDevice(), since the former does not
change with the affinity mask.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>