compute-runtime

Maciej Bielski 2778043d67 fix(l0): check for largeGRF when computing maxWorkGroupSize	2023-02-08 11:20:52 +01:00

Sizing context (PVC): When using LargeGRF (a.k.a. GRF256) there are only 4 HW threads per EU (instead of the default 8). Together with SIMD16, that means there can be at most 64 work-items per EU. With 8 EUs per subslice, this gives 512 work-items on a single subslice. For correct intra-WG synchronization, all of a work-group's WIs must be executed on the same subslice (to access the same SLM, where the synchronization primitives are stored). Thus, with SIMD16 and LargeGRF the work-group size must not exceed 512 (PVC example).

So far, `maxWorkGroupSize` has been taken solely from a DeviceInfo structure, both in `ModuleTranslationUnit::processUnpackedBinary()` and in `ModuleImp::initialize()`. This approach does not take kernel parameters (LargeGRF) into account. It allows a kernel using LargeGRF with SIMD16 to be submitted with the work-group size set to 1024, which leads to a hang.

Fix the `.maxWorkGroupSize` computation so that it takes the kernel parameters into consideration. Add new tests (for discrete platforms >= XeHP), adapt existing ones, and fix cosmetics along the way.

Similar check for OCL: https://github.com/intel/compute-runtime/blob/master/opencl/source/command_queue/enqueue_kernel.h#L130

Related-To: NEO-7684
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
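The sizing arithmetic from the commit message can be sketched as a small compile-time calculation. This is only an illustration of the reasoning, not the code from `kernel_imp.cpp`: the `SubsliceConfig` struct, both function names, and the `pvcExample` constant are hypothetical, and the hardware numbers (8 EUs per subslice, 8 or 4 HW threads per EU, a device-level limit of 1024) are the ones quoted in the message for the PVC example.

```cpp
#include <cstdint>

// Hypothetical container for the hardware parameters named in the commit message.
struct SubsliceConfig {
    uint32_t euPerSubslice;        // 8 in the PVC example
    uint32_t defaultThreadsPerEu;  // 8 HW threads per EU with the default GRF size
    uint32_t largeGrfThreadsPerEu; // 4 HW threads per EU with LargeGRF (GRF256)
};

// Work-items that fit on one subslice: threads per EU * SIMD width * EUs per subslice.
constexpr uint32_t maxWorkItemsPerSubslice(const SubsliceConfig &cfg, uint32_t simdSize, bool largeGrf) {
    const uint32_t threadsPerEu = largeGrf ? cfg.largeGrfThreadsPerEu : cfg.defaultThreadsPerEu;
    return threadsPerEu * simdSize * cfg.euPerSubslice;
}

// A work-group must fit on a single subslice, so the device-wide limit is
// clamped by the per-subslice capacity for the given kernel parameters.
constexpr uint32_t clampMaxWorkGroupSize(uint32_t deviceMax, const SubsliceConfig &cfg, uint32_t simdSize, bool largeGrf) {
    const uint32_t limit = maxWorkItemsPerSubslice(cfg, simdSize, largeGrf);
    return limit < deviceMax ? limit : deviceMax;
}

// Numbers taken from the commit message (PVC example).
constexpr SubsliceConfig pvcExample{8, 8, 4};

// LargeGRF + SIMD16: 4 * 16 * 8 = 512 work-items per subslice.
static_assert(maxWorkItemsPerSubslice(pvcExample, 16, true) == 512, "LargeGRF SIMD16 limit");
// The device-level value of 1024 is reduced to 512 for such a kernel.
static_assert(clampMaxWorkGroupSize(1024, pvcExample, 16, true) == 512, "clamped to subslice capacity");
// With the default GRF size the device-level value is unchanged in this sketch.
static_assert(clampMaxWorkGroupSize(1024, pvcExample, 16, false) == 1024, "default GRF keeps 1024");
```

The `static_assert` lines reproduce the 512 figure from the message. Clamping with a minimum, rather than overwriting the device value, is a choice made for this sketch so that kernels without LargeGRF keep the device-level limit.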
..
kernel.cpp	Copyright header update	2021-05-17 20:38:19 +02:00
kernel.h	feature: print printf contents right after gpu hang detection	2023-01-11 08:14:00 +01:00
kernel_ext.cpp	Add option for extending kernel	2022-05-16 12:08:41 +02:00
kernel_hw.h	Cleanup includes 42	2023-01-25 09:16:39 +01:00
kernel_imp.cpp	fix(l0): check for largeGRF when computing maxWorkGroupSize	2023-02-08 11:20:52 +01:00
kernel_imp.h	Add state base address properties tracking for command lists	2023-01-31 12:47:17 +01:00
patch_with_implicit_surface.inl	Reduce usage of global gfx core helper getter [3/n]	2022-12-13 11:13:11 +01:00
sampler_patch_values.h	Use correct enum values for sampler in clamp mode	2022-01-20 18:15:53 +01:00