Commit Graph

289 Commits

Author SHA1 Message Date
Compute-Runtime-Validation
913a926fd4 Revert "feature: Optimize intra-module kernel ISA allocations"
This reverts commit c348831470.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-09-19 14:16:05 +02:00
Maciej Bielski
c348831470 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules, which kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-19 12:05:09 +02:00
Maciej Plewka
ee21f7c717 fix: Use cmdlist residency container for reused private allocs
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-09-18 13:50:17 +02:00
Mateusz Hoppe
fb211a921d feature: bindless addressing support for image views
Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-09-15 15:25:47 +02:00
Mateusz Hoppe
93469eaf5d feature: bindless addressing for buffers with offset
- allocate SurfaceStates on kernel's heap for offsetted buffers

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-09-08 12:03:23 +02:00
Joshua Santosh Ranjan
91784a87cc fix: Return success for system address in setArg
This patch avoids returning error for system addresses in setArg

Related-To: GSD-3597

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2023-09-08 05:27:55 +02:00
Maciej Plewka
5807d512b3 fix: Reuse private allocations during cmdList dispatch
Related-To: NEO-8201

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-08-31 14:40:55 +02:00
Compute-Runtime-Validation
21a506b045 Revert "fix: serialize printf kernel accesses using device-wise locks"
This reverts commit 3d33366ff6.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-08-24 19:29:14 +02:00
Dominik Dabek
5c5c718af3 performance: detect indirect access in kernel, PVC
Enabling on pvc after patch in igc.

Enabling only for JIT kernels because AOT could have been compiled with
IGC older than required.

Related-To: NEO-7712

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-08-24 02:15:11 +02:00
Lu, Wenbin
3d33366ff6 fix: serialize printf kernel accesses using device-wise locks
Related-To: LOCI-4114

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-08-22 14:57:08 +02:00
Mateusz Hoppe
8435160db4 feature: bindless addressing for images
- program surface states for redescribed images correctly. Image copy
to/from memory are using redescribed surface states,
- refactor state base address programming - program address and size
together, set max size at the beginning due to lack of Enable flag
- set GpuBase in WddmAllocation when external heap is used
- return max ssh required size from kernelInfo or based on stateful args

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-08-18 15:59:20 +02:00
Fabian Zwolinski
6fca8ee195 refactor: Remove SourceLevelDebugger
Removed:
- SourceLevelDebugger (with tests)
- DebuggerLibrary
- DebuggerLibraryRestore
- debuggerSupported field from hwInfo.capabilityTable
- HasSourceLevelDebuggerSupport matcher
- ExperimentalEnableSourceLevelDebugger debug var
- EnableMockSourceLevelDebugger debug var
- DebuggerOptDisable debug var
- lib_names.h.in file
- third_party/source_level_debugger/igfx_debug_interchange_types.h

Related-To: NEO-7213
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-08-10 11:14:02 +02:00
Compute-Runtime-Validation
b7a56521f8 Revert "refactor: Enable CSR heap sharing on Older Gen platforms"
This reverts commit 160daeb874.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-07-26 05:40:59 +02:00
Jitendra Sharma
160daeb874 refactor: Enable CSR heap sharing on Older Gen platforms
Related-To: LOCI-4312
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2023-07-25 19:37:33 +02:00
Mateusz Hoppe
67d39f88e6 feature: bindless addressing - store bindlessInfo in allocation
- store surface state info for bindless addressing in graphics
allocation
- remove map in BindlessHeapsHelper - bindlessInfo is constant for
the lifetime of an allocation
- program bindless offsets and surface states for images when used in
bindless kernel
- handle ouf of memory on surface state heap - return error

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-07-24 14:48:35 +02:00
Mateusz Hoppe
9fd7f9cf05 fix: set ImplicitArgs size to size of defined fields
Resolves: NEO-8169

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-07-12 21:30:32 +02:00
Cencelewska, Katarzyna
aa0beb8191 fix: Unify logic calculating threads per work group part 4
- also use helper when checking that is simd1 to have same flow

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-07 15:34:59 +02:00
Cencelewska, Katarzyna
61f701aba5 fix: Unify logic calculating threads per work group part 3
Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-04 15:27:44 +02:00
Cencelewska, Katarzyna
2e17c21728 fix: Unify logic calculating threads per work group part 2
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-04 10:34:02 +02:00
Compute-Runtime-Validation
39740da9d1 Revert "fix: Unify logic calculating threads per work group part 2"
This reverts commit 1e8a53bd53.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-07-02 07:09:14 +02:00
Young Jin Yoon
c5d675570a feature: support for zeDriverGetLastErrorDescription
Added setErrorDescription() and getErrorDescription() in DriverHandle
to record and retrieve the custom string for errors.

Related-To: LOCI-4619
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2023-06-30 17:12:32 +02:00
Cencelewska, Katarzyna
1e8a53bd53 fix: Unify logic calculating threads per work group part 2
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-30 14:16:08 +02:00
Cencelewska, Katarzyna
0d7aefe66b fix: Unify logic calculating threads per work group part 1
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-29 10:43:22 +02:00
Cencelewska, Katarzyna
68d81c82a7 fix: Use proper value about hw local id generations
- remove useless flag ForceNumberOfThreadsInGpgpuThreadGroup
- add new flag "RemoveRestrictionsOnNumberOfThreadsInGpgpuThreadGroup"
to restore old path without restrictions about number of threads in
thread group
- fix forwarding information about hw local ids generations to
calculate numOfThreadsInThreadGroup correctly

Related-To: NEO-7952, NEO-7982
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-26 16:35:42 +02:00
Lukasz Jobczyk
bc0a3a7eb5 fix: Consider slm size in suggest work group cache
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-06-26 09:12:54 +02:00
Zbigniew Zdanowicz
ddffb8a67f fix: add missing unrecoverable macro
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-06-22 10:47:18 +02:00
Mateusz Hoppe
111b112729 feature: add assertBufferPtr to ImplicitArgs
Related-To: NEO-5753, NEO-8078

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-06-20 20:43:57 +02:00
Mateusz Hoppe
313fb84fda feature: bindless addressing mode support
- allow bindless kernels to execute
- bindless addressing kernels are using private heaps mode
- do not differentiate bindful and bindless surface state base addresses

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-06-19 12:41:03 +02:00
Lukasz Jobczyk
0cf975605b performance: Cache suggest group size
Resolves: NEO-7968

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-06-16 13:26:55 +02:00
Cencelewska, Katarzyna
7cb3278eb3 fix: add function to calculate number of threads per tg
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-13 14:02:24 +02:00
Neil R Spruit
ba6d447b4d feature: Support for using Reserved address with multiple mappings
Related-To: LOCI-4381

- Enabled support for customers to use full Virtual reservation range
with multiple physical mappings with additional allocations implicitly
included in residency.
- Buffer Surface state size extended for first allocation to stretch to
the bufferSize requested.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-06-07 03:12:29 +02:00
Mateusz Hoppe
1c196b9f3d refactor: change ApiSpecificConfig functions names
- better description of the meaning of functions

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-05-30 09:20:01 +02:00
Neil R Spruit
ded9d7bff2 feature: Get Peer Allocation with specified base Pointer
Related-To: LOCI-4176

- Given a Base Pointer passed into Get Peer Allocation, then the base
pointer is used in the map of the new allocation to the virtual memory.
- Enables users to use the same pointer for all devices in Peer To Peer.
- Currently unsupported on reserved memory due to mapped and exec
resiedency of Virtual addresses.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-05-24 20:41:20 +02:00
Compute-Runtime-Validation
375f212b2d Revert "fix: setGroupSize caching to not hide error"
This reverts commit 56b167f530.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-05-16 02:58:11 +02:00
Dominik Dabek
56b167f530 fix: setGroupSize caching to not hide error
When setting kernel group size with incorrect values, error would not be
returned if method called with same arguments a second time.

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-05-15 14:57:46 +02:00
Fabian Zwolinski
cbce863dc2 refactor: Rename member variables to camelCase 3/n
Additionally enable clang-tidy check for member variables

Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-04-28 16:01:14 +02:00
Fabian Zwolinski
e351a90f81 refactor: Rename member variables to camelCase 2/n
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-04-27 20:39:22 +02:00
Zbigniew Zdanowicz
7731264fe3 [fix] update ray tracing commands programing
- 3D btd command should be programed only once per context
- Add conditional pipe control command prior dispatching 3D btd command
- share 3D btd state between immediate and regular command lists
- add pipe control after ray tracing kernel to invalidate state cache

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-03 11:21:24 +02:00
Rafal Maziejuk
b9828b543e feature: adjust maxWorkGroupSize value
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-28 15:19:52 +02:00
Mateusz Jablonski
dd39b822d3 feature implicit args: patch rt dispatch global array in implicit args buffer
handle has_rtcalls in kernels and functions in zebin

Related-To: NEO-7818
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-28 12:31:38 +02:00
Rafal Maziejuk
27ff1c911d feature l0: handle additional properties in modules
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-24 10:27:44 +01:00
Maciej Bielski
3ec0a637ba fix(l0): return API error on ISA allocation OOM
It is possible that a module has so many kernels that the 4GB limit of
GPU VA is depleted when each kernel allocates a 64 KB page for its own
ISA. In such case, propagate the ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY to
the API caller to indicate the actual problem.

Currently such scenario is not detected, the execution advances a bit
further and the following crashes do not let the user to easily
understand what happened.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-03-23 17:30:15 +01:00
Krzysztof Gibala
ecd8c6b410 fix l0: Add missing calculation in kernel getProperties
After resolving NEO-7684 in turns out that `zeKernelGetProperties`
is still returning wrong value for `maxNumSubgroups` since it
did not take into account `LargeGRF & SIMD` limitation.

Related-To: NEO-7829
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2023-03-22 16:06:13 +01:00
Mateusz Jablonski
0da5e6f277 refactor l0: cleanup cmake file level_zero/core/source/CMakeLists.txt
Related-To: NEO-7507
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-16 12:38:15 +01:00
Mateusz Hoppe
0204761add feature: gpu assert implementation
- allocate assert buffer when kernel has assert
- track assert kernels in cmdlists and cmdqueues
- check and print assert at sync calls: cmdqueue synchronize(), fence
synchronize(), event hostSynchronize(), synchronous imm cmdlists
append()

Related-To: NEO-5753

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-03-15 19:22:09 +01:00
Dominik Dabek
69a16fd3ed feature: check indirect access for kernel
Do not make indirect allocations resident if kernel does not use
indirect access.
For both level zero and opencl.
Currently disabled by default, enable with debug flag
DetectIndirectAccessInKernel

Related-To: NEO-7712

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-03-08 16:58:26 +01:00
Zbigniew Zdanowicz
c8b90613a8 [perf] simplify command list preemption state transition
- apply revelant flags only on platforms supporting these flags
- update command list preemption level when supported
- use actual kernel preemption level to program interface descriptor data

Related-To: NEO-7771

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-02 12:19:02 +01:00
Zhang, Winston
c584d19a6c modify printPrintfOutput to be an atomic operation
Mutex was added to kernel_imp for atomic operation during
printPrintfOutput on kernel.

Related-To: LOCI-3681

Signed-off-by: Zhang, Winston <winston.zhang@intel.com>
2023-03-02 08:53:18 +01:00
Compute-Runtime-Validation
4a369ad88d Revert "feature: check indirect access for kernel"
This reverts commit 075c96267d.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-02-24 03:48:22 +01:00
Dominik Dabek
075c96267d feature: check indirect access for kernel
Do not make indirect allocations resident if kernel does not use
indirect access.
Enable for both level zero and opencl.

Related-To: NEO-7712

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-02-23 12:38:53 +01:00