Commit Graph

249 Commits

Author SHA1 Message Date
Rafal Maziejuk
27ff1c911d feature l0: handle additional properties in modules
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-24 10:27:44 +01:00
Maciej Bielski
3ec0a637ba fix(l0): return API error on ISA allocation OOM
It is possible that a module has so many kernels that the 4GB limit of
GPU VA is depleted when each kernel allocates a 64 KB page for its own
ISA. In such case, propagate the ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY to
the API caller to indicate the actual problem.

Currently such scenario is not detected, the execution advances a bit
further and the following crashes do not let the user to easily
understand what happened.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-03-23 17:30:15 +01:00
Krzysztof Gibala
ecd8c6b410 fix l0: Add missing calculation in kernel getProperties
After resolving NEO-7684 in turns out that `zeKernelGetProperties`
is still returning wrong value for `maxNumSubgroups` since it
did not take into account `LargeGRF & SIMD` limitation.

Related-To: NEO-7829
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2023-03-22 16:06:13 +01:00
Mateusz Jablonski
0da5e6f277 refactor l0: cleanup cmake file level_zero/core/source/CMakeLists.txt
Related-To: NEO-7507
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-16 12:38:15 +01:00
Mateusz Hoppe
0204761add feature: gpu assert implementation
- allocate assert buffer when kernel has assert
- track assert kernels in cmdlists and cmdqueues
- check and print assert at sync calls: cmdqueue synchronize(), fence
synchronize(), event hostSynchronize(), synchronous imm cmdlists
append()

Related-To: NEO-5753

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-03-15 19:22:09 +01:00
Dominik Dabek
69a16fd3ed feature: check indirect access for kernel
Do not make indirect allocations resident if kernel does not use
indirect access.
For both level zero and opencl.
Currently disabled by default, enable with debug flag
DetectIndirectAccessInKernel

Related-To: NEO-7712

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-03-08 16:58:26 +01:00
Zbigniew Zdanowicz
c8b90613a8 [perf] simplify command list preemption state transition
- apply revelant flags only on platforms supporting these flags
- update command list preemption level when supported
- use actual kernel preemption level to program interface descriptor data

Related-To: NEO-7771

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-02 12:19:02 +01:00
Zhang, Winston
c584d19a6c modify printPrintfOutput to be an atomic operation
Mutex was added to kernel_imp for atomic operation during
printPrintfOutput on kernel.

Related-To: LOCI-3681

Signed-off-by: Zhang, Winston <winston.zhang@intel.com>
2023-03-02 08:53:18 +01:00
Compute-Runtime-Validation
4a369ad88d Revert "feature: check indirect access for kernel"
This reverts commit 075c96267d.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-02-24 03:48:22 +01:00
Dominik Dabek
075c96267d feature: check indirect access for kernel
Do not make indirect allocations resident if kernel does not use
indirect access.
Enable for both level zero and opencl.

Related-To: NEO-7712

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-02-23 12:38:53 +01:00
Compute-Runtime-Validation
678e47de2d Revert "Adjust maxWorkGroupSize value"
This reverts commit f7685a93e4.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-02-21 14:45:36 +01:00
Rafal Maziejuk
f7685a93e4 Adjust maxWorkGroupSize value
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-02-17 09:34:15 +01:00
Maciej Plewka
429be6b4cb Disable EUFusion for odd work groups with DPAS on DG2
Related-To: NEO-7495, HSD-14017007475

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-02-13 15:27:49 +01:00
Maciej Bielski
2778043d67 fix(l0): check for largeGRF when computing maxWorkGroupSize
Sizing context (PVC):
When using LargeGRF (a.k.a GRF256) there are only 4 HW threads per EU
(instead of default 8). Together with SIMD16 that means that there can
be max 64 work-items per EU. With 8 EU per subslice this gives 512
work-items on a single subslice. For correct intra-WG synchronization
all its WIs must be executed on the same subslice (to access the same
SLM, where the synchronization primitives are stored). Thus, with SIMD16
and LargeGRF the work-group size must not exceed 512 (PVC example).

So far `maxWorkGroupSize` is taken solely from a DeviceInfo structure
both in `ModuleTranslationUnit::processUnpackedBinary()` and
`ModuleImp::initialize()`. This method does not take kernel parameters
(LargeGRF) into account. It allows to submit a kernel using LargeGRF
with SIMD16 with the work-group size set to 1024. That leads to a hang.

Fix the `.maxWorkGroupSize` computation so that it takes the kernel
parameters into consideration.

Add new (for discrete platforms >= XeHP) and adapt existing tests, fix
cosmetics by the way.

Similar check for OCL:
https://github.com/intel/compute-runtime/blob/master/opencl/source/comma
nd_queue/enqueue_kernel.h#L130

Related-To: NEO-7684
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-02-08 11:20:52 +01:00
Mateusz Jablonski
24c5352350 refactor: remove redundant including of compiler_cache.h
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-02-03 11:16:31 +01:00
Compute-Runtime-Validation
606a900080 Revert "Disable EUFusion for odd work groups with DPAS on DG2"
This reverts commit 017d66a469.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-02-03 02:45:21 +01:00
Maciej Plewka
017d66a469 Disable EUFusion for odd work groups with DPAS on DG2
Related-To: NEO-7495, HSD-14017007475

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-02-02 13:57:42 +01:00
Kamil Kopryk
2484c7ceb2 refactor: rename hw_helper files to gfx_core_helper files
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-02-01 19:37:51 +01:00
Zbigniew Zdanowicz
34b8f08fc6 Add state base address properties tracking for command lists
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-01-31 12:47:17 +01:00
Kamil Kopryk
68bfd49033 refactor: don't use global ProductHelper getter 15/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-01-27 17:51:57 +01:00
Kamil Kopryk
445706361d refactor: don't use global ProductHelper 14/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-01-27 14:51:12 +01:00
Warchulski, Jaroslaw
c43233dabf Cleanup includes 42
Cleaned up files:
level_zero/core/source/kernel/kernel_hw.h
shared/source/helpers/common_types.h
shared/test/common/libult/linux/drm_mock.h
shared/test/common/libult/ult_command_stream_receiver.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-25 09:16:39 +01:00
Warchulski, Jaroslaw
49837b7bb5 Cleanup includes 39
Cleaned up files:
shared/source/command_container/command_encoder.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-23 11:56:42 +01:00
Warchulski, Jaroslaw
286c672ef4 Cleanup includes 37
Cleaned up files:
level_zero/core/source/event/event.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-20 12:34:39 +01:00
Warchulski, Jaroslaw
16d5a323c7 Cleanup includes 34
Cleaned up files:
opencl/source/command_queue/cl_local_work_size.h
opencl/test/unit_test/mocks/mock_buffer.h
shared/source/program/kernel_info.cpp

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-17 14:42:04 +01:00
Warchulski, Jaroslaw
3d59dce80c Cleanup includes 27
Cleaned up files:
opencl/source/command_queue/command_queue.h
shared/source/built_ins/registry/built_ins_registry.h
shared/source/kernel/kernel_descriptor.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-11 16:10:28 +01:00
Fabian Zwolinski
9dfed7cd54 Use cached group sizes in zeKernelSetGroupSize
Optimize zeKernelSetGroupSize by early returning success if group size
values have not changed since last function call.

Moved ImplicitArgs construction above setGroupSize call
in kernel initialization to prevent pImplicitArgs being nullptr
in calls in which we use cached group sizes and early return.

Related-To: NEO-7394
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-01-11 12:50:51 +01:00
Mateusz Hoppe
d623ef391b feature: print printf contents right after gpu hang detection
- printf used in kernel is printed on synchronize() call, if
hang is detected - printf buffer was not printed immediately but
only when Kernel was destroyed
- this change adds copying printf buffer with internal engine
(whenever available) right after hang detection on
CommandQueue::synchronize() call

Related-To: NEO-6427

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-01-11 08:14:00 +01:00
Fabian Zwoliński
2e2abf1b6e Revert "Use cached group sizes in zeKernelSetGroupSize"
This reverts commit 7ec94c6aaa.

Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-01-03 16:36:36 +01:00
Warchulski, Jaroslaw
a2fe929f0c Cleanup includes 18
Cleaned up files:
shared/source/command_stream/command_stream_receiver_hw.h
shared/source/compiler_interface/compiler_interface.h
shared/source/direct_submission/direct_submission_hw.h
shared/source/helpers/dirty_state_helpers.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-02 13:28:29 +01:00
Kamil Kopryk
da80d9906e Refactor: don't use global GfxCoreHelper getter in shared files 5/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-01-02 10:54:14 +01:00
Warchulski, Jaroslaw
9f3fc6858e Cleanup includes 16
Cleaned up files:
shared/source/built_ins/built_ins.h
shared/source/command_container/command_encoder.h
shared/source/helpers/hw_helper.h
shared/source/memory_manager/allocation_properties.h
shared/source/xe_hpc_core/hw_cmds.h
shared/test/common/test_macros/test_excludes.h

Related-To: NEO-5548

Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-12-29 15:12:37 +01:00
Kamil Kopryk
16a238895a Refactor: don't use global ProductHelper getter in L0 2/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2022-12-22 11:43:26 +01:00
Fabian Zwolinski
7ec94c6aaa Use cached group sizes in zeKernelSetGroupSize
Optimize zeKernelSetGroupSize by early returning success if group size
values have not changed since last function call.

Related-To: NEO-7394
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2022-12-15 10:00:59 +01:00
Kamil Kopryk
232b886056 Rename HwInfoConfig to ProductHelper
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2022-12-14 14:39:52 +01:00
Mateusz Jablonski
10dbfc0d19 Reduce usage of global gfx core helper getter [3/n]
Related-To: NEO-6853
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-12-13 11:13:11 +01:00
Kamil Kopryk
03b687881f Rename HwHelper -> GfxCoreHelper
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2022-12-09 10:29:06 +01:00
Warchulski, Jaroslaw
be647d42d9 Cleanup includes 12
Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-12-07 13:14:15 +01:00
Patryk Wrobel
5793e200e4 Remove possible infinite loops related to pNext
Two code parts contained invalid logic related to traversing
opaque list of pNexts. This has been fixed.

Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
2022-12-06 11:55:14 +01:00
Warchulski, Jaroslaw
1fa5710dff Cleanup includes 10
Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-12-05 12:39:33 +01:00
Andrzej Koska
90034d4173 Added scratch size check
Related-To: NEO-7508
Signed-off-by: Andrzej Koska <andrzej.koska@intel.com>
2022-11-22 14:14:33 +01:00
Kamil Kopryk
002a90c717 Move hwHelper ownership to RootDeviceEnvironment 2/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>

UseRootDeviceEnvironment getHelper<CoreHelper> for:
- getMaxBarrierRegisterPerSlice
- getPaddingForISAAllocation
2022-11-10 16:39:39 +01:00
Maciej Bielski
c06ddfc7b8 Allocate kernel private memory for xehp and later
Add missing allocation of kernel private memory for the scenario when
the private memory is not allocated within `KernelImp::initialize()` but
deferred until `appendLaunchKernelWithParams()` instead.

One kernel can never allocate more private/scratch memory than
`globalMemorySize`, that ends up in `ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY`
being returned. However, several separate kernels can exceed the
`globalMemorySize` and then, the private region of each such kernel is
allocated at later stage, in `appendLaunchKernelWithParams()`.

Such mechanism was present on pre-xehp platforms and it is now added to
xehp-and-later.

See:
* ModuleImp::checkIfPrivateMemoryPerDispatchIsNeeded()
* Module::shouldAllocatePrivateMemoryPerDispatch()

Related-To: NEO-7398
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2022-11-08 19:10:26 +01:00
Jim Snow
48ba0554db Allocate RTDispatchGlobals as array-of-structures.
This fixes several bugs in previous (reverted) implementation.
We use correct RTStack pointer offset, and a larger RTStack size.

Related-To: LOCI-2966

Signed-off-by: Jim Snow <jim.m.snow@intel.com>
2022-11-07 21:25:32 +01:00
Maciej Plewka
7f38c5e633 Revert "Return error code for unsuported image arg in gen12lp"
This reverts commit bbc31e6aac


Signed-off-by: Maciej Plewka maciej.plewka@intel.com
2022-11-02 12:57:16 +01:00
Maciej Plewka
ff01b9361e Return error code when there is no space for scratch/private
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2022-11-02 11:55:18 +01:00
Dominik Dabek
526ba1bde5 Fix l0 kernel set arg buffer caching
Fix for incorrect cache hit if alloc id was uninitialized
and allocations counter was the same.

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-10-27 17:25:10 +02:00
Maciej Plewka
bbc31e6aac Return error code for unsuported image arg in gen12lp
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2022-10-24 16:54:10 +02:00
Jim Snow
f976c7a313 Revert "Allocate RTDispatchGlobals as unboxed array"
This reverts commit eaa4965ae8.

Signed-off-by: Jim Snow <jim.m.snow@intel.com>
2022-10-24 05:16:03 +02:00
Compute-Runtime-Validation
7c6783c4a1 Revert "Return error when image arg does not support media block commands"
This reverts commit e56d18b69f.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-10-12 03:58:33 +02:00