Commit Graph

236 Commits

Author SHA1 Message Date
Maciej Bielski 2778043d67 fix(l0): check for largeGRF when computing maxWorkGroupSize
Sizing context (PVC):
When using LargeGRF (a.k.a GRF256) there are only 4 HW threads per EU
(instead of default 8). Together with SIMD16 that means that there can
be max 64 work-items per EU. With 8 EU per subslice this gives 512
work-items on a single subslice. For correct intra-WG synchronization
all its WIs must be executed on the same subslice (to access the same
SLM, where the synchronization primitives are stored). Thus, with SIMD16
and LargeGRF the work-group size must not exceed 512 (PVC example).

So far `maxWorkGroupSize` is taken solely from a DeviceInfo structure
both in `ModuleTranslationUnit::processUnpackedBinary()` and
`ModuleImp::initialize()`. This method does not take kernel parameters
(LargeGRF) into account. It allows to submit a kernel using LargeGRF
with SIMD16 with the work-group size set to 1024. That leads to a hang.

Fix the `.maxWorkGroupSize` computation so that it takes the kernel
parameters into consideration.

Add new (for discrete platforms >= XeHP) and adapt existing tests, fix
cosmetics by the way.

Similar check for OCL:
https://github.com/intel/compute-runtime/blob/master/opencl/source/comma
nd_queue/enqueue_kernel.h#L130

Related-To: NEO-7684
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-02-08 11:20:52 +01:00
Mateusz Jablonski 24c5352350 refactor: remove redundant including of compiler_cache.h
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-02-03 11:16:31 +01:00
Compute-Runtime-Validation 606a900080 Revert "Disable EUFusion for odd work groups with DPAS on DG2"
This reverts commit 017d66a469.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-02-03 02:45:21 +01:00
Maciej Plewka 017d66a469 Disable EUFusion for odd work groups with DPAS on DG2
Related-To: NEO-7495, HSD-14017007475

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-02-02 13:57:42 +01:00
Kamil Kopryk 2484c7ceb2 refactor: rename hw_helper files to gfx_core_helper files
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-02-01 19:37:51 +01:00
Zbigniew Zdanowicz 34b8f08fc6 Add state base address properties tracking for command lists
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-01-31 12:47:17 +01:00
Kamil Kopryk 68bfd49033 refactor: don't use global ProductHelper getter 15/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-01-27 17:51:57 +01:00
Kamil Kopryk 445706361d refactor: don't use global ProductHelper 14/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-01-27 14:51:12 +01:00
Warchulski, Jaroslaw c43233dabf Cleanup includes 42
Cleaned up files:
level_zero/core/source/kernel/kernel_hw.h
shared/source/helpers/common_types.h
shared/test/common/libult/linux/drm_mock.h
shared/test/common/libult/ult_command_stream_receiver.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-25 09:16:39 +01:00
Warchulski, Jaroslaw 49837b7bb5 Cleanup includes 39
Cleaned up files:
shared/source/command_container/command_encoder.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-23 11:56:42 +01:00
Warchulski, Jaroslaw 286c672ef4 Cleanup includes 37
Cleaned up files:
level_zero/core/source/event/event.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-20 12:34:39 +01:00
Warchulski, Jaroslaw 16d5a323c7 Cleanup includes 34
Cleaned up files:
opencl/source/command_queue/cl_local_work_size.h
opencl/test/unit_test/mocks/mock_buffer.h
shared/source/program/kernel_info.cpp

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-17 14:42:04 +01:00
Warchulski, Jaroslaw 3d59dce80c Cleanup includes 27
Cleaned up files:
opencl/source/command_queue/command_queue.h
shared/source/built_ins/registry/built_ins_registry.h
shared/source/kernel/kernel_descriptor.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-11 16:10:28 +01:00
Fabian Zwolinski 9dfed7cd54 Use cached group sizes in zeKernelSetGroupSize
Optimize zeKernelSetGroupSize by early returning success if group size
values have not changed since last function call.

Moved ImplicitArgs construction above setGroupSize call
in kernel initialization to prevent pImplicitArgs being nullptr
in calls in which we use cached group sizes and early return.

Related-To: NEO-7394
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-01-11 12:50:51 +01:00
Mateusz Hoppe d623ef391b feature: print printf contents right after gpu hang detection
- printf used in kernel is printed on synchronize() call, if
hang is detected - printf buffer was not printed immediately but
only when Kernel was destroyed
- this change adds copying printf buffer with internal engine
(whenever available) right after hang detection on
CommandQueue::synchronize() call

Related-To: NEO-6427

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-01-11 08:14:00 +01:00
Fabian Zwoliński 2e2abf1b6e Revert "Use cached group sizes in zeKernelSetGroupSize"
This reverts commit 7ec94c6aaa.

Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-01-03 16:36:36 +01:00
Warchulski, Jaroslaw a2fe929f0c Cleanup includes 18
Cleaned up files:
shared/source/command_stream/command_stream_receiver_hw.h
shared/source/compiler_interface/compiler_interface.h
shared/source/direct_submission/direct_submission_hw.h
shared/source/helpers/dirty_state_helpers.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-02 13:28:29 +01:00
Kamil Kopryk da80d9906e Refactor: don't use global GfxCoreHelper getter in shared files 5/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-01-02 10:54:14 +01:00
Warchulski, Jaroslaw 9f3fc6858e Cleanup includes 16
Cleaned up files:
shared/source/built_ins/built_ins.h
shared/source/command_container/command_encoder.h
shared/source/helpers/hw_helper.h
shared/source/memory_manager/allocation_properties.h
shared/source/xe_hpc_core/hw_cmds.h
shared/test/common/test_macros/test_excludes.h

Related-To: NEO-5548

Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-12-29 15:12:37 +01:00
Kamil Kopryk 16a238895a Refactor: don't use global ProductHelper getter in L0 2/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2022-12-22 11:43:26 +01:00
Fabian Zwolinski 7ec94c6aaa Use cached group sizes in zeKernelSetGroupSize
Optimize zeKernelSetGroupSize by early returning success if group size
values have not changed since last function call.

Related-To: NEO-7394
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2022-12-15 10:00:59 +01:00
Kamil Kopryk 232b886056 Rename HwInfoConfig to ProductHelper
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2022-12-14 14:39:52 +01:00
Mateusz Jablonski 10dbfc0d19 Reduce usage of global gfx core helper getter [3/n]
Related-To: NEO-6853
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-12-13 11:13:11 +01:00
Kamil Kopryk 03b687881f Rename HwHelper -> GfxCoreHelper
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2022-12-09 10:29:06 +01:00
Warchulski, Jaroslaw be647d42d9 Cleanup includes 12
Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-12-07 13:14:15 +01:00
Patryk Wrobel 5793e200e4 Remove possible infinite loops related to pNext
Two code parts contained invalid logic related to traversing
opaque list of pNexts. This has been fixed.

Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
2022-12-06 11:55:14 +01:00
Warchulski, Jaroslaw 1fa5710dff Cleanup includes 10
Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-12-05 12:39:33 +01:00
Andrzej Koska 90034d4173 Added scratch size check
Related-To: NEO-7508
Signed-off-by: Andrzej Koska <andrzej.koska@intel.com>
2022-11-22 14:14:33 +01:00
Kamil Kopryk 002a90c717 Move hwHelper ownership to RootDeviceEnvironment 2/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>

UseRootDeviceEnvironment getHelper<CoreHelper> for:
- getMaxBarrierRegisterPerSlice
- getPaddingForISAAllocation
2022-11-10 16:39:39 +01:00
Maciej Bielski c06ddfc7b8 Allocate kernel private memory for xehp and later
Add missing allocation of kernel private memory for the scenario when
the private memory is not allocated within `KernelImp::initialize()` but
deferred until `appendLaunchKernelWithParams()` instead.

One kernel can never allocate more private/scratch memory than
`globalMemorySize`, that ends up in `ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY`
being returned. However, several separate kernels can exceed the
`globalMemorySize` and then, the private region of each such kernel is
allocated at later stage, in `appendLaunchKernelWithParams()`.

Such mechanism was present on pre-xehp platforms and it is now added to
xehp-and-later.

See:
* ModuleImp::checkIfPrivateMemoryPerDispatchIsNeeded()
* Module::shouldAllocatePrivateMemoryPerDispatch()

Related-To: NEO-7398
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2022-11-08 19:10:26 +01:00
Jim Snow 48ba0554db Allocate RTDispatchGlobals as array-of-structures.
This fixes several bugs in previous (reverted) implementation.
We use correct RTStack pointer offset, and a larger RTStack size.

Related-To: LOCI-2966

Signed-off-by: Jim Snow <jim.m.snow@intel.com>
2022-11-07 21:25:32 +01:00
Maciej Plewka 7f38c5e633 Revert "Return error code for unsuported image arg in gen12lp"
This reverts commit bbc31e6aac


Signed-off-by: Maciej Plewka maciej.plewka@intel.com
2022-11-02 12:57:16 +01:00
Maciej Plewka ff01b9361e Return error code when there is no space for scratch/private
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2022-11-02 11:55:18 +01:00
Dominik Dabek 526ba1bde5 Fix l0 kernel set arg buffer caching
Fix for incorrect cache hit if alloc id was uninitialized
and allocations counter was the same.

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-10-27 17:25:10 +02:00
Maciej Plewka bbc31e6aac Return error code for unsuported image arg in gen12lp
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2022-10-24 16:54:10 +02:00
Jim Snow f976c7a313 Revert "Allocate RTDispatchGlobals as unboxed array"
This reverts commit eaa4965ae8.

Signed-off-by: Jim Snow <jim.m.snow@intel.com>
2022-10-24 05:16:03 +02:00
Compute-Runtime-Validation 7c6783c4a1 Revert "Return error when image arg does not support media block commands"
This reverts commit e56d18b69f.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-10-12 03:58:33 +02:00
Maciej Plewka e56d18b69f Return error when image arg does not support media block commands
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2022-10-11 15:47:27 +02:00
Krystian Chmielewski 73a58aaf9e feat(zebin): inline sampler
Add support for inline samplers in zebin.
Generate required SAMPLER_STATEs in DSH.

Resolves: NEO-7388

Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
2022-10-10 12:47:19 +02:00
Fabian Zwolinski 7953d15826 Print warning when kernel uses too much SLM
Instead of just returning proper error code in case of exceeding
available Shared Local Memory size we also want to print error message
to make debugging easier.

Related-To: NEO-7280
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2022-10-07 19:06:19 +02:00
Fabian Zwolinski 1142404c0c Add error handling when kernel uses too much SLM
API Functions reporting error:
- clCreateKernel
- clEnqueueNDRangeKernel
- zeKernelCreate
- zeCommandListAppendLaunchKernel
- zeKernelSuggestGroupSize

Related-To: NEO-7280
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2022-10-06 16:45:42 +02:00
Jim Snow eaa4965ae8 Allocate RTDispatchGlobals as unboxed array
Previously we used an array-of-pointers approach, but using an
array-of-structures is in some ways simpler.

We also split out the RTStack as a separate allocation.

Related-To: LOCI-2966

Signed-off-by: Jim Snow <jim.m.snow@intel.com>
2022-09-28 03:42:14 +02:00
Dominik Dabek 52ae228535 Add missing unit test
Missing ult for isDebuggerActive
Move inline function to .inl file

Related-To: NEO-7003

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-09-16 15:55:03 +02:00
Dominik Dabek 8cc0177f1c Change DG2 l1 cache policy to WB
With compiler LSC WAs this gives better performance.

If debugger is active, policy will not be changed ie.
will be WBP.

Related-To: NEO-7003

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-08-31 14:31:23 +02:00
Compute-Runtime-Validation 2621460e80 Revert "Change DG2 l1 cache policy to WB"
This reverts commit a820e73dd7.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-08-27 08:04:19 +02:00
Dominik Dabek a820e73dd7 Change DG2 l1 cache policy to WB
With compiler LSC WAs this gives better performance.

If debugger is active, policy will not be changed ie.
will be WBP.

Related-To: NEO-7003

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-08-26 12:58:45 +02:00
Compute-Runtime-Validation a5b4a13452 Revert "Return error when image arg does not support media block commands"
This reverts commit 8388e6cf4a.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-08-24 06:23:22 +02:00
Maciej Plewka 8388e6cf4a Return error when image arg does not support media block commands
Related-To: NEO-7168

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2022-08-23 12:07:29 +02:00
Krystian Chmielewski 18adbed233 feat(zebin): add thread scheduling mode support
Resolves: NEO-7197

Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
2022-08-18 16:10:18 +02:00
Wrobel, Patryk dda5b19859 Adjust the implementation of strncpy_s() for Linux
This change:
- prevents writing memory out of the range of the destination buffer
- prevents calling strlen() with non-null terminated c-string
- corrects the logic, which validates passed range to proceed
when real length fits the destination buffer

Related-To: NEO-7264
Signed-off-by: Wrobel, Patryk <patryk.wrobel@intel.com>
2022-08-17 18:37:46 +02:00