It is possible for a module to have so many kernels that the 4 GB
limit of GPU VA is depleted when each kernel allocates a 64 KB page
for its own ISA. In such a case, propagate
ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY to the API caller to indicate the
actual problem.
Currently this scenario is not detected; execution advances a bit
further and the subsequent crashes do not let the user easily
understand what happened.
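A minimal sketch of the failure mode, assuming a hypothetical
allocator and simplified result codes (the real runtime returns the
Level Zero ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY code):

```cpp
#include <cstdio>
#include <cstdint>

// Illustrative names, not the runtime's actual types.
enum Result { RESULT_SUCCESS, RESULT_OUT_OF_DEVICE_MEMORY };

constexpr uint64_t gpuVaLimit = 4ull << 30;   // 4 GB VA window for ISA
constexpr uint64_t isaPageSize = 64ull << 10; // 64 KB page per kernel
uint64_t usedVa = 0;

// Each kernel reserves one 64 KB page; fail once the window is full
// instead of advancing and crashing later.
Result allocateKernelIsa() {
    if (usedVa + isaPageSize > gpuVaLimit)
        return RESULT_OUT_OF_DEVICE_MEMORY; // propagate to API caller
    usedVa += isaPageSize;
    return RESULT_SUCCESS;
}

int main() {
    uint64_t kernels = 0;
    while (allocateKernelIsa() == RESULT_SUCCESS)
        ++kernels;
    // 4 GB / 64 KB = 65536 kernels exhaust the window.
    std::printf("ISA VA exhausted after %llu kernels\n",
                static_cast<unsigned long long>(kernels));
}
```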
Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
After resolving NEO-7684 it turns out that `zeKernelGetProperties`
still returns a wrong value for `maxNumSubgroups`, since it did not
take the `LargeGRF & SIMD` limitation into account.
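A minimal sketch of the corrected derivation, assuming the PVC figures
from the NEO-7684 sizing context below; all names are illustrative:

```cpp
#include <cstdio>
#include <cstdint>

// Hypothetical helper: the kernel-specific work-group size limit
// (4 HW threads per EU with LargeGRF, 8 otherwise, 8 EUs/subslice).
uint32_t maxWorkGroupSize(bool largeGrf, uint32_t simd) {
    const uint32_t threadsPerEu = largeGrf ? 4u : 8u;
    const uint32_t eusPerSubslice = 8u;
    return threadsPerEu * simd * eusPerSubslice;
}

// maxNumSubgroups must be derived from the kernel-specific limit,
// not from the device-wide maximum.
uint32_t maxNumSubgroups(bool largeGrf, uint32_t simd) {
    return maxWorkGroupSize(largeGrf, simd) / simd;
}

int main() {
    // LargeGRF + SIMD16: 512 / 16 = 32 subgroups, not 1024 / 16 = 64.
    std::printf("%u\n", maxNumSubgroups(true, 16));
}
```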
Related-To: NEO-7829
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
Do not make indirect allocations resident if the kernel does not use
indirect access.
Applies to both Level Zero and OpenCL.
Currently disabled by default; enable with the debug flag
DetectIndirectAccessInKernel
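A minimal sketch of the residency decision, with hypothetical types;
the real change analyzes the kernel code to detect indirect access:

```cpp
#include <vector>

struct Allocation {}; // stand-in for a GPU allocation

struct Kernel {
    bool hasIndirectAccess = false; // set by kernel-code analysis
};

void makeResident(const std::vector<Allocation *> &) { /* bind to GPU */ }

void prepareResidency(const Kernel &kernel,
                      const std::vector<Allocation *> &indirectAllocs) {
    // Skip the (potentially large) indirect-allocation list entirely
    // when the kernel cannot reference memory indirectly.
    if (kernel.hasIndirectAccess)
        makeResident(indirectAllocs);
}
```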
Related-To: NEO-7712
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
- apply relevant flags only on platforms supporting these flags
- update command list preemption level when supported
- use the actual kernel preemption level to program interface
descriptor data (a sketch follows this list)
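A minimal sketch of the command list update, assuming a simplified
preemption-level ordering; names are illustrative, not the runtime's
actual types:

```cpp
#include <algorithm>

// Illustrative ordering: lower values are more restrictive.
enum class PreemptionMode { Disabled = 0, MidBatch, ThreadGroup, MidThread };

struct CommandList {
    PreemptionMode preemptionMode = PreemptionMode::MidThread;

    void updateForKernel(PreemptionMode kernelLevel, bool platformSupport) {
        if (!platformSupport)
            return; // apply only where the platform supports the flags
        // The command list level can only become more restrictive;
        // the kernel's own level is what gets programmed into the
        // interface descriptor data.
        preemptionMode = std::min(preemptionMode, kernelLevel);
    }
};
```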
Related-To: NEO-7771
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
A mutex was added to kernel_imp so that printPrintfOutput is an atomic
operation on the kernel.
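A minimal sketch of the guarded printing, assuming a simplified
KernelImp; the real method drains the kernel's printf buffer:

```cpp
#include <mutex>

struct KernelImp {
    std::mutex printfLock;

    void printPrintfOutput() {
        // Serialize concurrent callers so output does not interleave.
        std::lock_guard<std::mutex> guard(printfLock);
        // ... decode and print the printf buffer ...
    }
};
```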
Related-To: LOCI-3681
Signed-off-by: Zhang, Winston <winston.zhang@intel.com>
Do not make indirect allocations resident if the kernel does not use
indirect access.
Enable for both Level Zero and OpenCL.
Related-To: NEO-7712
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Sizing context (PVC):
When using LargeGRF (a.k.a. GRF256) there are only 4 HW threads per EU
(instead of the default 8). Together with SIMD16 this means there can
be at most 64 work-items per EU. With 8 EUs per subslice this gives
512 work-items on a single subslice. For correct intra-WG
synchronization all of a work-group's WIs must execute on the same
subslice (to access the same SLM, where the synchronization primitives
are stored). Thus, with SIMD16 and LargeGRF the work-group size must
not exceed 512 (PVC example).
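A minimal sketch of that arithmetic, with illustrative names: clamp
the device-wide maximum by what the kernel's GRF mode and SIMD width
allow.

```cpp
#include <algorithm>
#include <cstdint>

uint32_t computeMaxWorkGroupSize(uint32_t deviceMaxWorkGroupSize,
                                 bool largeGrf, uint32_t simdSize) {
    const uint32_t threadsPerEu = largeGrf ? 4u : 8u; // PVC figures
    const uint32_t eusPerSubslice = 8u;
    const uint32_t kernelLimit = threadsPerEu * simdSize * eusPerSubslice;
    // LargeGRF + SIMD16 on PVC: min(1024, 4 * 16 * 8) = 512.
    return std::min(deviceMaxWorkGroupSize, kernelLimit);
}
```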
So far `maxWorkGroupSize` has been taken solely from a DeviceInfo
structure, both in `ModuleTranslationUnit::processUnpackedBinary()`
and in `ModuleImp::initialize()`. This does not take kernel parameters
(LargeGRF) into account and allows submitting a kernel that uses
LargeGRF with SIMD16 with the work-group size set to 1024, which leads
to a hang.
Fix the `.maxWorkGroupSize` computation so that it takes the kernel
parameters into consideration.
Add new tests (for discrete platforms >= XeHP), adapt existing ones,
and fix cosmetics along the way.
A similar check for OCL:
https://github.com/intel/compute-runtime/blob/master/opencl/source/command_queue/enqueue_kernel.h#L130
Related-To: NEO-7684
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
Optimize zeKernelSetGroupSize by returning success early if the group
size values have not changed since the last call.
Moved the ImplicitArgs construction above the setGroupSize call in
kernel initialization, to prevent pImplicitArgs from being nullptr in
calls that use the cached group sizes and return early.
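A minimal sketch of the early return, with simplified types; the real
code also patches cross-thread data and the (pre-created) implicit
args:

```cpp
#include <cstdint>

struct KernelImp {
    uint32_t groupSize[3] = {0, 0, 0};

    int setGroupSize(uint32_t x, uint32_t y, uint32_t z) {
        // Early out: nothing to re-patch if the sizes are unchanged.
        if (x == groupSize[0] && y == groupSize[1] && z == groupSize[2])
            return 0; // ZE_RESULT_SUCCESS
        groupSize[0] = x;
        groupSize[1] = y;
        groupSize[2] = z;
        // ... recompute per-thread data, local IDs, implicit args ...
        return 0;
    }
};
```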
Related-To: NEO-7394
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
- printf output from a kernel is printed on the synchronize() call if
a hang is detected; previously the printf buffer was not printed
immediately but only when the Kernel was destroyed
- this change adds copying of the printf buffer with the internal
engine (whenever available) right after hang detection on the
CommandQueue::synchronize() call
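A minimal sketch of the flow, with hypothetical names for the wait and
copy steps:

```cpp
#include <cstdio>

struct CommandQueue {
    bool hangDetected = true; // pretend the wait timed out

    bool waitForCompletion() { return !hangDetected; }
    void copyPrintfBuffer() { /* blit via internal engine if available */ }
    void printPrintfOutput() { std::puts("kernel printf output"); }

    void synchronize() {
        if (!waitForCompletion()) {
            // Hang detected: flush what the kernel managed to print
            // now, instead of waiting for the Kernel to be destroyed.
            copyPrintfBuffer();
            printPrintfOutput();
        }
    }
};

int main() { CommandQueue{}.synchronize(); }
```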
Related-To: NEO-6427
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Optimize zeKernelSetGroupSize by returning success early if the group
size values have not changed since the last call.
Related-To: NEO-7394
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
Two code paths contained invalid logic when traversing the opaque list
of pNext structures. This has been fixed.
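For illustration, a correct traversal of such a chain; the struct here
is a minimal stand-in for the ze_api.h types, which all begin with an
stype field and a pNext pointer:

```cpp
#include <cstdint>

struct BaseDesc {
    uint32_t stype;     // structure type tag
    const void *pNext;  // next extension struct, or nullptr
};

// Walk the opaque chain looking for a given stype; a typical bug in
// such loops is advancing with the wrong pointer (or not at all).
const BaseDesc *findInChain(const void *pNext, uint32_t wantedStype) {
    while (pNext != nullptr) {
        auto *desc = static_cast<const BaseDesc *>(pNext);
        if (desc->stype == wantedStype)
            return desc;
        pNext = desc->pNext; // advance past the current element
    }
    return nullptr;
}
```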
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
Add the missing allocation of kernel private memory for the scenario
where private memory is not allocated within `KernelImp::initialize()`
but deferred until `appendLaunchKernelWithParams()` instead.
A single kernel can never allocate more private/scratch memory than
`globalMemorySize`; attempting to do so results in
`ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY` being returned. However,
several separate kernels together can exceed `globalMemorySize`, and
in that case the private region of each such kernel is allocated at a
later stage, in `appendLaunchKernelWithParams()`.
This mechanism was present on pre-XeHP platforms and is now added to
XeHP-and-later.
See:
* ModuleImp::checkIfPrivateMemoryPerDispatchIsNeeded()
* Module::shouldAllocatePrivateMemoryPerDispatch()
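A minimal sketch of the decision, assuming simplified inputs; it
mirrors the idea behind the functions referenced above:

```cpp
#include <cstdint>
#include <vector>

// If all kernels' private memory together would exceed device memory,
// defer each kernel's private allocation to
// appendLaunchKernelWithParams() instead of allocating it up front.
bool shouldAllocatePrivateMemoryPerDispatch(
        const std::vector<uint64_t> &perKernelPrivateSize,
        uint64_t globalMemorySize) {
    uint64_t total = 0;
    for (auto size : perKernelPrivateSize)
        total += size;
    return total > globalMemorySize;
}
```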
Related-To: NEO-7398
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
This fixes several bugs in the previous (reverted) implementation. We
use the correct RTStack pointer offset and a larger RTStack size.
Related-To: LOCI-2966
Signed-off-by: Jim Snow <jim.m.snow@intel.com>