280 Commits

Author SHA1 Message Date
Lu, Wenbin
3d33366ff6 fix: serialize printf kernel accesses using device-wise locks
Related-To: LOCI-4114

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-08-22 14:57:08 +02:00
Mateusz Hoppe
8435160db4 feature: bindless addressing for images
- program surface states for redescribed images correctly. Image copies
to/from memory use redescribed surface states,
- refactor state base address programming - program address and size
together, set max size at the beginning due to lack of Enable flag
- set GpuBase in WddmAllocation when external heap is used
- return max ssh required size from kernelInfo or based on stateful args

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-08-18 15:59:20 +02:00
Fabian Zwolinski
6fca8ee195 refactor: Remove SourceLevelDebugger
Removed:
- SourceLevelDebugger (with tests)
- DebuggerLibrary
- DebuggerLibraryRestore
- debuggerSupported field from hwInfo.capabilityTable
- HasSourceLevelDebuggerSupport matcher
- ExperimentalEnableSourceLevelDebugger debug var
- EnableMockSourceLevelDebugger debug var
- DebuggerOptDisable debug var
- lib_names.h.in file
- third_party/source_level_debugger/igfx_debug_interchange_types.h

Related-To: NEO-7213
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-08-10 11:14:02 +02:00
Compute-Runtime-Validation
b7a56521f8 Revert "refactor: Enable CSR heap sharing on Older Gen platforms"
This reverts commit 160daeb874.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-07-26 05:40:59 +02:00
Jitendra Sharma
160daeb874 refactor: Enable CSR heap sharing on Older Gen platforms
Related-To: LOCI-4312
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2023-07-25 19:37:33 +02:00
Mateusz Hoppe
67d39f88e6 feature: bindless addressing - store bindlessInfo in allocation
- store surface state info for bindless addressing in graphics
allocation
- remove map in BindlessHeapsHelper - bindlessInfo is constant for
the lifetime of an allocation
- program bindless offsets and surface states for images when used in
bindless kernel
- handle out of memory on surface state heap - return error

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-07-24 14:48:35 +02:00
Mateusz Hoppe
9fd7f9cf05 fix: set ImplicitArgs size to size of defined fields
Resolves: NEO-8169

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-07-12 21:30:32 +02:00
Cencelewska, Katarzyna
aa0beb8191 fix: Unify logic calculating threads per work group part 4
- also use the helper when checking for simd1 to keep the same flow

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-07 15:34:59 +02:00
Cencelewska, Katarzyna
61f701aba5 fix: Unify logic calculating threads per work group part 3
Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-04 15:27:44 +02:00
Cencelewska, Katarzyna
2e17c21728 fix: Unify logic calculating threads per work group part 2
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-04 10:34:02 +02:00
Compute-Runtime-Validation
39740da9d1 Revert "fix: Unify logic calculating threads per work group part 2"
This reverts commit 1e8a53bd53.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-07-02 07:09:14 +02:00
Young Jin Yoon
c5d675570a feature: support for zeDriverGetLastErrorDescription
Added setErrorDescription() and getErrorDescription() in DriverHandle
to record and retrieve the custom string for errors.

Related-To: LOCI-4619
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2023-06-30 17:12:32 +02:00
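The record-and-retrieve pattern described in the commit above can be sketched as follows. This is a minimal illustration, not the driver's code: `DriverHandleSketch` is a hypothetical class, the method signatures are assumptions (the commit message names only `setErrorDescription()` and `getErrorDescription()`), and the real implementation may store the string differently, for example per thread.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for DriverHandle; signatures are assumptions.
class DriverHandleSketch {
  public:
    // Record a custom description for the most recent error.
    void setErrorDescription(std::string description) {
        errorDescription = std::move(description);
    }

    // Retrieve the last recorded description (empty if none was set).
    const std::string &getErrorDescription() const {
        return errorDescription;
    }

  private:
    std::string errorDescription;
};
```

A caller would set the description at the point where an error is detected and read it back later through the handle.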
Cencelewska, Katarzyna
1e8a53bd53 fix: Unify logic calculating threads per work group part 2
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-30 14:16:08 +02:00
Cencelewska, Katarzyna
0d7aefe66b fix: Unify logic calculating threads per work group part 1
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-29 10:43:22 +02:00
Cencelewska, Katarzyna
68d81c82a7 fix: Use proper value about hw local id generations
- remove useless flag ForceNumberOfThreadsInGpgpuThreadGroup
- add new flag "RemoveRestrictionsOnNumberOfThreadsInGpgpuThreadGroup"
to restore old path without restrictions about number of threads in
thread group
- fix forwarding information about hw local ids generations to
calculate numOfThreadsInThreadGroup correctly

Related-To: NEO-7952, NEO-7982
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-26 16:35:42 +02:00
Lukasz Jobczyk
bc0a3a7eb5 fix: Consider slm size in suggest work group cache
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-06-26 09:12:54 +02:00
Zbigniew Zdanowicz
ddffb8a67f fix: add missing unrecoverable macro
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-06-22 10:47:18 +02:00
Mateusz Hoppe
111b112729 feature: add assertBufferPtr to ImplicitArgs
Related-To: NEO-5753, NEO-8078

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-06-20 20:43:57 +02:00
Mateusz Hoppe
313fb84fda feature: bindless addressing mode support
- allow bindless kernels to execute
- bindless addressing kernels are using private heaps mode
- do not differentiate bindful and bindless surface state base addresses

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-06-19 12:41:03 +02:00
Lukasz Jobczyk
0cf975605b performance: Cache suggest group size
Resolves: NEO-7968

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-06-16 13:26:55 +02:00
Cencelewska, Katarzyna
7cb3278eb3 fix: add function to calculate number of threads per tg
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-13 14:02:24 +02:00
Neil R Spruit
ba6d447b4d feature: Support for using Reserved address with multiple mappings
Related-To: LOCI-4381

- Enabled support for customers to use full Virtual reservation range
with multiple physical mappings with additional allocations implicitly
included in residency.
- Buffer Surface state size extended for first allocation to stretch to
the bufferSize requested.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-06-07 03:12:29 +02:00
Mateusz Hoppe
1c196b9f3d refactor: change ApiSpecificConfig functions names
- better description of the meaning of functions

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-05-30 09:20:01 +02:00
Neil R Spruit
ded9d7bff2 feature: Get Peer Allocation with specified base Pointer
Related-To: LOCI-4176

- Given a Base Pointer passed into Get Peer Allocation, then the base
pointer is used in the map of the new allocation to the virtual memory.
- Enables users to use the same pointer for all devices in Peer To Peer.
- Currently unsupported on reserved memory due to mapped and exec
residency of Virtual addresses.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-05-24 20:41:20 +02:00
Compute-Runtime-Validation
375f212b2d Revert "fix: setGroupSize caching to not hide error"
This reverts commit 56b167f530.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-05-16 02:58:11 +02:00
Dominik Dabek
56b167f530 fix: setGroupSize caching to not hide error
When setting kernel group size with incorrect values, the error would not
be returned if the method was called with the same arguments a second time.

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-05-15 14:57:46 +02:00
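The problem described in the commit above (a cached argument check hiding an error on the second call) can be illustrated with a small sketch. Everything here is hypothetical: `KernelSketch`, its members, and the limit of 1024 are assumptions for illustration, not the Level Zero driver's code. Note that the log shows this commit being reverted a day later, so the sketch illustrates the described problem only.

```cpp
#include <cstdint>

// Hypothetical stand-in for a kernel object; not the driver's class.
class KernelSketch {
  public:
    // Returns true on success, false for an invalid group size.
    bool setGroupSize(uint32_t x, uint32_t y, uint32_t z) {
        // Fast path: only arguments that were previously accepted are cached.
        if (cacheValid && x == cachedX && y == cachedY && z == cachedZ) {
            return true;
        }
        const uint64_t total = static_cast<uint64_t>(x) * y * z;
        if (total == 0 || total > maxGroupSize) {
            // Rejected arguments are never recorded, so a repeated call
            // with the same values fails again instead of hitting the cache.
            return false;
        }
        cachedX = x;
        cachedY = y;
        cachedZ = z;
        cacheValid = true;
        return true;
    }

  private:
    static constexpr uint64_t maxGroupSize = 1024; // assumed limit
    uint32_t cachedX = 0;
    uint32_t cachedY = 0;
    uint32_t cachedZ = 0;
    bool cacheValid = false;
};
```

The design point is that the cache is updated only after validation succeeds; updating it before validation is what would let a second call with the same bad arguments return success.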
Fabian Zwolinski
cbce863dc2 refactor: Rename member variables to camelCase 3/n
Additionally enable clang-tidy check for member variables

Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-04-28 16:01:14 +02:00
Fabian Zwolinski
e351a90f81 refactor: Rename member variables to camelCase 2/n
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-04-27 20:39:22 +02:00
Zbigniew Zdanowicz
7731264fe3 [fix] update ray tracing commands programming
- 3D btd command should be programmed only once per context
- Add conditional pipe control command prior to dispatching 3D btd command
- share 3D btd state between immediate and regular command lists
- add pipe control after ray tracing kernel to invalidate state cache

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-03 11:21:24 +02:00
Rafal Maziejuk
b9828b543e feature: adjust maxWorkGroupSize value
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-28 15:19:52 +02:00
Mateusz Jablonski
dd39b822d3 feature implicit args: patch rt dispatch global array in implicit args buffer
handle has_rtcalls in kernels and functions in zebin

Related-To: NEO-7818
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-28 12:31:38 +02:00
Rafal Maziejuk
27ff1c911d feature l0: handle additional properties in modules
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-24 10:27:44 +01:00
Maciej Bielski
3ec0a637ba fix(l0): return API error on ISA allocation OOM
It is possible that a module has so many kernels that the 4GB limit of
GPU VA is depleted when each kernel allocates a 64 KB page for its own
ISA. In such case, propagate the ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY to
the API caller to indicate the actual problem.

Currently such a scenario is not detected; execution advances a bit
further, and the resulting crashes make it hard for the user to
understand what happened.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-03-23 17:30:15 +01:00
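The limit mentioned in the commit above can be made concrete with a little arithmetic. This is a rough sketch under stated assumptions: the "4GB" and "64 KB" figures from the message are read as binary units, each kernel is assumed to need exactly one page, and `maxKernelIsaPages` is a hypothetical helper, not a compute-runtime function.

```cpp
#include <cstdint>

// Figures from the commit message, interpreted as binary units (assumption).
constexpr uint64_t gpuVaRangeBytes = 4ull * 1024 * 1024 * 1024; // 4 GB range
constexpr uint64_t isaPageBytes = 64ull * 1024;                 // 64 KB page

// Hypothetical helper: how many per-kernel ISA pages fit in the range.
constexpr uint64_t maxKernelIsaPages(uint64_t rangeBytes, uint64_t pageBytes) {
    return rangeBytes / pageBytes;
}

// 2^32 / 2^16 = 2^16 pages, i.e. at most 65536 single-page kernels.
static_assert(maxKernelIsaPages(gpuVaRangeBytes, isaPageBytes) == 65536,
              "4 GB / 64 KB");
```

In practice the range is shared with other allocations and some kernels need more than one page, so the real ceiling would be lower than this upper bound.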
Krzysztof Gibala
ecd8c6b410 fix l0: Add missing calculation in kernel getProperties
After resolving NEO-7684 it turns out that `zeKernelGetProperties`
still returns the wrong value for `maxNumSubgroups` since it
did not take into account the `LargeGRF & SIMD` limitation.

Related-To: NEO-7829
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2023-03-22 16:06:13 +01:00
Mateusz Jablonski
0da5e6f277 refactor l0: cleanup cmake file level_zero/core/source/CMakeLists.txt
Related-To: NEO-7507
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-16 12:38:15 +01:00
Mateusz Hoppe
0204761add feature: gpu assert implementation
- allocate assert buffer when kernel has assert
- track assert kernels in cmdlists and cmdqueues
- check and print assert at sync calls: cmdqueue synchronize(), fence
synchronize(), event hostSynchronize(), synchronous imm cmdlists
append()

Related-To: NEO-5753

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-03-15 19:22:09 +01:00
Dominik Dabek
69a16fd3ed feature: check indirect access for kernel
Do not make indirect allocations resident if kernel does not use
indirect access.
For both level zero and opencl.
Currently disabled by default, enable with debug flag
DetectIndirectAccessInKernel

Related-To: NEO-7712

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-03-08 16:58:26 +01:00
Zbigniew Zdanowicz
c8b90613a8 [perf] simplify command list preemption state transition
- apply relevant flags only on platforms supporting these flags
- update command list preemption level when supported
- use actual kernel preemption level to program interface descriptor data

Related-To: NEO-7771

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-02 12:19:02 +01:00
Zhang, Winston
c584d19a6c modify printPrintfOutput to be an atomic operation
Mutex was added to kernel_imp for atomic operation during
printPrintfOutput on kernel.

Related-To: LOCI-3681

Signed-off-by: Zhang, Winston <winston.zhang@intel.com>
2023-03-02 08:53:18 +01:00
Compute-Runtime-Validation
4a369ad88d Revert "feature: check indirect access for kernel"
This reverts commit 075c96267d.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-02-24 03:48:22 +01:00
Dominik Dabek
075c96267d feature: check indirect access for kernel
Do not make indirect allocations resident if kernel does not use
indirect access.
Enable for both level zero and opencl.

Related-To: NEO-7712

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-02-23 12:38:53 +01:00
Compute-Runtime-Validation
678e47de2d Revert "Adjust maxWorkGroupSize value"
This reverts commit f7685a93e4.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-02-21 14:45:36 +01:00
Rafal Maziejuk
f7685a93e4 Adjust maxWorkGroupSize value
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-02-17 09:34:15 +01:00
Maciej Plewka
429be6b4cb Disable EUFusion for odd work groups with DPAS on DG2
Related-To: NEO-7495, HSD-14017007475

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-02-13 15:27:49 +01:00
Maciej Bielski
2778043d67 fix(l0): check for largeGRF when computing maxWorkGroupSize
Sizing context (PVC):
When using LargeGRF (a.k.a GRF256) there are only 4 HW threads per EU
(instead of default 8). Together with SIMD16 that means that there can
be max 64 work-items per EU. With 8 EU per subslice this gives 512
work-items on a single subslice. For correct intra-WG synchronization
all its WIs must be executed on the same subslice (to access the same
SLM, where the synchronization primitives are stored). Thus, with SIMD16
and LargeGRF the work-group size must not exceed 512 (PVC example).

So far `maxWorkGroupSize` is taken solely from a DeviceInfo structure
both in `ModuleTranslationUnit::processUnpackedBinary()` and
`ModuleImp::initialize()`. This method does not take kernel parameters
(LargeGRF) into account. It allows submitting a kernel using LargeGRF
with SIMD16 with the work-group size set to 1024. That leads to a hang.

Fix the `.maxWorkGroupSize` computation so that it takes the kernel
parameters into consideration.

Add new (for discrete platforms >= XeHP) and adapt existing tests, fix
cosmetics by the way.

Similar check for OCL:
https://github.com/intel/compute-runtime/blob/master/opencl/source/command_queue/enqueue_kernel.h#L130

Related-To: NEO-7684
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-02-08 11:20:52 +01:00
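The sizing argument in the commit above can be checked with a few lines of arithmetic. This is a minimal sketch, not the driver's computation: `maxWorkGroupSizeSketch` and its parameter names are hypothetical, and the inputs (8 or 4 hardware threads per EU, SIMD16, 8 EUs per subslice) are the PVC example values quoted in the message.

```cpp
#include <cstdint>

// Hypothetical helper, not part of compute-runtime: upper bound on the
// work-items that fit on one subslice, which caps the work-group size.
constexpr uint32_t maxWorkGroupSizeSketch(uint32_t threadsPerEu,
                                          uint32_t simdSize,
                                          uint32_t euPerSubslice) {
    return threadsPerEu * simdSize * euPerSubslice;
}

// Default GRF mode: 8 threads per EU * SIMD16 * 8 EUs per subslice.
constexpr uint32_t defaultGrfLimit = maxWorkGroupSizeSketch(8u, 16u, 8u);
// LargeGRF mode: only 4 threads per EU are available.
constexpr uint32_t largeGrfLimit = maxWorkGroupSizeSketch(4u, 16u, 8u);

static_assert(defaultGrfLimit == 1024u, "8 * 16 * 8");
static_assert(largeGrfLimit == 512u, "4 * 16 * 8");
```

Under these numbers, a work-group of 1024 fits in the default mode but is twice the LargeGRF limit, which matches the hang scenario the message describes.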
Mateusz Jablonski
24c5352350 refactor: remove redundant including of compiler_cache.h
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-02-03 11:16:31 +01:00
Compute-Runtime-Validation
606a900080 Revert "Disable EUFusion for odd work groups with DPAS on DG2"
This reverts commit 017d66a469.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-02-03 02:45:21 +01:00
Maciej Plewka
017d66a469 Disable EUFusion for odd work groups with DPAS on DG2
Related-To: NEO-7495, HSD-14017007475

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-02-02 13:57:42 +01:00
Kamil Kopryk
2484c7ceb2 refactor: rename hw_helper files to gfx_core_helper files
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-02-01 19:37:51 +01:00
Zbigniew Zdanowicz
34b8f08fc6 Add state base address properties tracking for command lists
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-01-31 12:47:17 +01:00