Commit Graph

298 Commits

Author SHA1 Message Date
Mateusz Hoppe
7ffd151ac3 fix: adjust numArgsStateful based on binding table entries
- global and const buffer may have BTI index allocated, ssh template
must be allocated with size for all stateful args

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-11-23 12:15:39 +01:00
Mateusz Hoppe
1c37da280c fix: fix bindless offset patching for images
- usingSurfaceStateHeap indicates if any of the args is using local ssh
in bindless kernels:

without global allocator - ssh is used for all args
with global bindless allocator - ssh used only for buffer with offset
set in surface state, otherwise not used

When any of the args is using ssh - getSurfaceStateHeapDataSize() returns
non-zero size.

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-11-07 11:39:49 +01:00
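The two cases listed in commit 1c37da280c can be read as a small decision rule. The following is a minimal sketch of that rule, not the driver's code: `StatefulArgSketch`, `argUsesLocalSsh` and `surfaceStateHeapDataSizeSketch` are hypothetical names, and the "one slot per stateful arg" sizing is an assumption made only so that the result is non-zero exactly when some arg uses the local ssh.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical per-argument description; field names are illustrative only.
struct StatefulArgSketch {
    bool isBuffer;                // false means an image or another surface type
    bool offsetSetInSurfaceState; // buffer offset is programmed in the surface state
};

// Mirrors the two cases from the commit message.
inline bool argUsesLocalSsh(const StatefulArgSketch &arg, bool globalBindlessAllocator) {
    if (!globalBindlessAllocator) {
        return true; // without the global allocator, ssh is used for all args
    }
    // with the global allocator, only buffers with an offset set in the surface state use ssh
    return arg.isBuffer && arg.offsetSetInSurfaceState;
}

// Assumption: one fixed-size slot per stateful arg once any arg uses the local ssh.
inline std::size_t surfaceStateHeapDataSizeSketch(const std::vector<StatefulArgSketch> &args,
                                                  bool globalBindlessAllocator,
                                                  std::size_t surfaceStateSize) {
    assert(surfaceStateSize > 0);
    bool usingSurfaceStateHeap = false;
    for (const auto &arg : args) {
        usingSurfaceStateHeap = usingSurfaceStateHeap || argUsesLocalSsh(arg, globalBindlessAllocator);
    }
    return usingSurfaceStateHeap ? args.size() * surfaceStateSize : 0;
}
```

With the global allocator enabled and no offsetted buffers among the args, the sketch returns zero, which matches the "otherwise not used" case in the message.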
Spruit, Neil R
58fa968273 fix: Calculate size of buffer surface state given mapped allocations
Related-To: NEO-8350

- when a virtual address is part of a mapping to multiple physical
allocations, the buffer surface state size is increased to include
the allocations that follow the current allocation, which gives
users access to the remainder of the mapped buffer.

Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
2023-10-19 13:38:51 +02:00
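The size calculation described in commit 58fa968273 can be illustrated as follows. This is a sketch under stated assumptions, not the driver's implementation: `MappedAllocationSketch` and `surfaceStateSizeFrom` are hypothetical names, the mappings are assumed to be sorted by address, and "follow the current allocation" is interpreted here as "contiguous in the virtual address range".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical record of one physical allocation mapped into a reserved virtual range.
struct MappedAllocationSketch {
    std::uint64_t gpuVa; // start of the mapping
    std::uint64_t size;  // size of the mapping in bytes
};

// Returns the number of bytes reachable from ptr: the rest of the allocation that
// contains ptr plus every mapping that follows it without a gap.
// Assumption: mappings are sorted by gpuVa and do not overlap.
inline std::uint64_t surfaceStateSizeFrom(const std::vector<MappedAllocationSketch> &mappings,
                                          std::uint64_t ptr) {
    std::uint64_t total = 0;
    std::uint64_t expectedNext = 0;
    bool found = false;
    for (const auto &mapping : mappings) {
        assert(mapping.size > 0); // empty mappings are not expected in this sketch
        if (!found) {
            if (ptr >= mapping.gpuVa && ptr < mapping.gpuVa + mapping.size) {
                found = true;
                total = mapping.gpuVa + mapping.size - ptr;
                expectedNext = mapping.gpuVa + mapping.size;
            }
        } else if (mapping.gpuVa == expectedNext) {
            total += mapping.size;
            expectedNext += mapping.size;
        } else {
            break; // a gap ends the contiguous range
        }
    }
    return total;
}
```

A pointer outside every mapping yields zero, so a caller can fall back to the size of the single allocation in that case.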
Maciej Bielski
f553d9f76b fix: one transfer per kernel ISA allocation(s) page
If several kernel heaps are sharing the same page then use a temporary
buffer to collect all of them and transfer to memory in one shot.
Previously there were several transfers performed (one per kernel) and,
observably, they happened not to be immediately effective at times.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-10-05 18:29:26 +02:00
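The "collect, then transfer once" idea from commit f553d9f76b can be sketched like this. All names (`KernelIsaChunk`, `FakeDeviceMemory`, `uploadKernelsInOneShot`) are hypothetical, and the device copy is simulated with a `memcpy` and a counter so that the number of transfers is observable. Zero-filling the gaps between kernels is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical description of one kernel heap placed inside a shared page.
struct KernelIsaChunk {
    const std::uint8_t *data; // host copy of the kernel ISA
    std::size_t size;         // size in bytes
    std::size_t offsetInPage; // destination offset inside the shared allocation
};

// Stand-in for device memory; counts how many transfers were issued.
struct FakeDeviceMemory {
    std::vector<std::uint8_t> bytes;
    int transferCount = 0;

    void transfer(std::size_t dstOffset, const std::uint8_t *src, std::size_t size) {
        assert(dstOffset + size <= bytes.size());
        std::memcpy(bytes.data() + dstOffset, src, size);
        ++transferCount;
    }
};

// Gathers all chunks into a temporary buffer and issues a single transfer.
// Assumption: gaps between chunks hold no live data, so they are zero-filled.
inline void uploadKernelsInOneShot(FakeDeviceMemory &device, const std::vector<KernelIsaChunk> &chunks) {
    std::size_t end = 0;
    for (const auto &chunk : chunks) {
        end = std::max(end, chunk.offsetInPage + chunk.size);
    }
    std::vector<std::uint8_t> staging(end, 0);
    for (const auto &chunk : chunks) {
        std::memcpy(staging.data() + chunk.offsetInPage, chunk.data, chunk.size);
    }
    if (!staging.empty()) {
        device.transfer(0, staging.data(), staging.size());
    }
}
```

The per-kernel variant would call `transfer()` once per chunk; the sketch replaces those calls with one copy of the staged page.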
Jitendra Sharma
8a01619310 refactor: Enable CSR heap sharing on Older Generation platforms
Related-To: LOCI-4312
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2023-10-03 18:19:50 +02:00
Compute-Runtime-Validation
1ac37d4a49 Revert "refactor: Enable CSR heap sharing on Older Generation platforms"
This reverts commit 58ff9c6d94.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-10-02 15:13:23 +02:00
Jitendra Sharma
58ff9c6d94 refactor: Enable CSR heap sharing on Older Generation platforms
Related-To: LOCI-4312
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2023-09-29 11:54:51 +02:00
Mateusz Jablonski
b8c3dea8dd refactor: simplify KernelImmutableData dtor
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-28 08:36:01 +02:00
Maciej Bielski
97e7cda912 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules whose kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-21 13:55:45 +02:00
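The fit check implied by commit 97e7cda912 can be sketched as below. This is an illustration only: `packIsaIntoSinglePage` and `alignUp` are hypothetical helpers, and the per-kernel alignment is passed in as a parameter because the commit message does not state one.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t pageSize64k = 64 * 1024;

// Rounds value up to the next multiple of alignment.
constexpr std::size_t alignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Tries to place all kernel ISAs in one 64k parent allocation.
// On success, writes the per-kernel offsets to offsetsOut and returns true.
// On failure, leaves offsetsOut untouched and returns false, in which case a
// caller would fall back to one allocation per kernel.
inline bool packIsaIntoSinglePage(const std::vector<std::size_t> &isaSizes,
                                  std::size_t isaAlignment,
                                  std::vector<std::size_t> &offsetsOut) {
    assert(isaAlignment > 0);
    std::vector<std::size_t> offsets;
    std::size_t cursor = 0;
    for (std::size_t size : isaSizes) {
        cursor = alignUp(cursor, isaAlignment);
        if (cursor > pageSize64k || size > pageSize64k - cursor) {
            return false; // this kernel does not fit in the shared page
        }
        offsets.push_back(cursor);
        cursor += size;
    }
    offsetsOut = offsets;
    return true;
}
```

For three small kernels of 100, 200 and 300 bytes with a 64-byte alignment, the sketch uses 684 bytes of a single page where the per-kernel scheme would have used three 64k pages.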
Compute-Runtime-Validation
913a926fd4 Revert "feature: Optimize intra-module kernel ISA allocations"
This reverts commit c348831470.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-09-19 14:16:05 +02:00
Maciej Bielski
c348831470 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules whose kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-19 12:05:09 +02:00
Maciej Plewka
ee21f7c717 fix: Use cmdlist residency container for reused private allocs
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-09-18 13:50:17 +02:00
Mateusz Hoppe
fb211a921d feature: bindless addressing support for image views
Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-09-15 15:25:47 +02:00
Mateusz Hoppe
93469eaf5d feature: bindless addressing for buffers with offset
- allocate SurfaceStates on kernel's heap for offsetted buffers

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-09-08 12:03:23 +02:00
Joshua Santosh Ranjan
91784a87cc fix: Return success for system address in setArg
This patch avoids returning error for system addresses in setArg

Related-To: GSD-3597

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2023-09-08 05:27:55 +02:00
Maciej Plewka
5807d512b3 fix: Reuse private allocations during cmdList dispatch
Related-To: NEO-8201

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-08-31 14:40:55 +02:00
Compute-Runtime-Validation
21a506b045 Revert "fix: serialize printf kernel accesses using device-wise locks"
This reverts commit 3d33366ff6.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-08-24 19:29:14 +02:00
Dominik Dabek
5c5c718af3 performance: detect indirect access in kernel, PVC
Enabling on PVC after a patch in IGC.

Enabling only for JIT kernels because AOT could have been compiled with
IGC older than required.

Related-To: NEO-7712

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-08-24 02:15:11 +02:00
Lu, Wenbin
3d33366ff6 fix: serialize printf kernel accesses using device-wise locks
Related-To: LOCI-4114

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-08-22 14:57:08 +02:00
Mateusz Hoppe
8435160db4 feature: bindless addressing for images
- program surface states for redescribed images correctly. Image copy
to/from memory are using redescribed surface states,
- refactor state base address programming - program address and size
together, set max size at the beginning due to lack of Enable flag
- set GpuBase in WddmAllocation when external heap is used
- return max ssh required size from kernelInfo or based on stateful args

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-08-18 15:59:20 +02:00
Fabian Zwolinski
6fca8ee195 refactor: Remove SourceLevelDebugger
Removed:
- SourceLevelDebugger (with tests)
- DebuggerLibrary
- DebuggerLibraryRestore
- debuggerSupported field from hwInfo.capabilityTable
- HasSourceLevelDebuggerSupport matcher
- ExperimentalEnableSourceLevelDebugger debug var
- EnableMockSourceLevelDebugger debug var
- DebuggerOptDisable debug var
- lib_names.h.in file
- third_party/source_level_debugger/igfx_debug_interchange_types.h

Related-To: NEO-7213
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-08-10 11:14:02 +02:00
Compute-Runtime-Validation
b7a56521f8 Revert "refactor: Enable CSR heap sharing on Older Gen platforms"
This reverts commit 160daeb874.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-07-26 05:40:59 +02:00
Jitendra Sharma
160daeb874 refactor: Enable CSR heap sharing on Older Gen platforms
Related-To: LOCI-4312
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2023-07-25 19:37:33 +02:00
Mateusz Hoppe
67d39f88e6 feature: bindless addressing - store bindlessInfo in allocation
- store surface state info for bindless addressing in graphics
allocation
- remove map in BindlessHeapsHelper - bindlessInfo is constant for
the lifetime of an allocation
- program bindless offsets and surface states for images when used in
bindless kernel
- handle out of memory on surface state heap - return error

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-07-24 14:48:35 +02:00
Mateusz Hoppe
9fd7f9cf05 fix: set ImplicitArgs size to size of defined fields
Resolves: NEO-8169

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-07-12 21:30:32 +02:00
Cencelewska, Katarzyna
aa0beb8191 fix: Unify logic calculating threads per work group part 4
- also use the helper when checking for simd1 to keep the same flow

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-07 15:34:59 +02:00
Cencelewska, Katarzyna
61f701aba5 fix: Unify logic calculating threads per work group part 3
Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-04 15:27:44 +02:00
Cencelewska, Katarzyna
2e17c21728 fix: Unify logic calculating threads per work group part 2
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-04 10:34:02 +02:00
Compute-Runtime-Validation
39740da9d1 Revert "fix: Unify logic calculating threads per work group part 2"
This reverts commit 1e8a53bd53.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-07-02 07:09:14 +02:00
Young Jin Yoon
c5d675570a feature: support for zeDriverGetLastErrorDescription
Added setErrorDescription() and getErrorDescription() in DriverHandle
to record and retrieve the custom string for errors.

Related-To: LOCI-4619
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2023-06-30 17:12:32 +02:00
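The record-and-retrieve pattern named in commit c5d675570a might look roughly like this. Only the method names `setErrorDescription()` and `getErrorDescription()` come from the commit message; the class name `DriverHandleSketch`, the signatures and the single-string storage are assumptions, and the sketch ignores threading.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a driver handle that remembers the last error text.
class DriverHandleSketch {
  public:
    // Records a custom description for the most recent error.
    void setErrorDescription(const std::string &description) {
        lastError = description;
    }

    // Hands out a pointer to the stored text. The pointer stays valid only
    // until the next setErrorDescription() call on the same object.
    void getErrorDescription(const char **ppString) const {
        assert(ppString != nullptr);
        *ppString = lastError.c_str();
    }

  private:
    std::string lastError; // empty when no error has been recorded
};
```

Returning a pointer into storage owned by the handle lets the caller read the string without allocating, at the cost of the lifetime rule noted in the comment.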
Cencelewska, Katarzyna
1e8a53bd53 fix: Unify logic calculating threads per work group part 2
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-30 14:16:08 +02:00
Cencelewska, Katarzyna
0d7aefe66b fix: Unify logic calculating threads per work group part 1
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-29 10:43:22 +02:00
Cencelewska, Katarzyna
68d81c82a7 fix: Use proper value about hw local id generations
- remove useless flag ForceNumberOfThreadsInGpgpuThreadGroup
- add new flag "RemoveRestrictionsOnNumberOfThreadsInGpgpuThreadGroup"
to restore the old path without restrictions on the number of threads
in a thread group
- fix forwarding of information about hw local id generation to
calculate numOfThreadsInThreadGroup correctly

Related-To: NEO-7952, NEO-7982
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-26 16:35:42 +02:00
Lukasz Jobczyk
bc0a3a7eb5 fix: Consider slm size in suggest work group cache
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-06-26 09:12:54 +02:00
Zbigniew Zdanowicz
ddffb8a67f fix: add missing unrecoverable macro
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-06-22 10:47:18 +02:00
Mateusz Hoppe
111b112729 feature: add assertBufferPtr to ImplicitArgs
Related-To: NEO-5753, NEO-8078

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-06-20 20:43:57 +02:00
Mateusz Hoppe
313fb84fda feature: bindless addressing mode support
- allow bindless kernels to execute
- bindless addressing kernels are using private heaps mode
- do not differentiate bindful and bindless surface state base addresses

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-06-19 12:41:03 +02:00
Lukasz Jobczyk
0cf975605b performance: Cache suggest group size
Resolves: NEO-7968

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-06-16 13:26:55 +02:00
Cencelewska, Katarzyna
7cb3278eb3 fix: add function to calculate number of threads per tg
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-13 14:02:24 +02:00
Neil R Spruit
ba6d447b4d feature: Support for using Reserved address with multiple mappings
Related-To: LOCI-4381

- Enabled support for customers to use full Virtual reservation range
with multiple physical mappings with additional allocations implicitly
included in residency.
- Buffer Surface state size extended for first allocation to stretch to
the bufferSize requested.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-06-07 03:12:29 +02:00
Mateusz Hoppe
1c196b9f3d refactor: change ApiSpecificConfig functions names
- better description of the meaning of functions

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-05-30 09:20:01 +02:00
Neil R Spruit
ded9d7bff2 feature: Get Peer Allocation with specified base Pointer
Related-To: LOCI-4176

- When a Base Pointer is passed into Get Peer Allocation, the base
pointer is used to map the new allocation into virtual memory.
- Enables users to use the same pointer for all devices in Peer To Peer.
- Currently unsupported on reserved memory due to mapped and exec
residency of Virtual addresses.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-05-24 20:41:20 +02:00
Compute-Runtime-Validation
375f212b2d Revert "fix: setGroupSize caching to not hide error"
This reverts commit 56b167f530.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-05-16 02:58:11 +02:00
Dominik Dabek
56b167f530 fix: setGroupSize caching to not hide error
When setting kernel group size with incorrect values, error would not be
returned if method called with same arguments a second time.

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-05-15 14:57:46 +02:00
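The caching problem described in commit 56b167f530 (later reverted by 375f212b2d) can be avoided by caching only values that passed validation. The sketch below shows that idea; it is not the change from the commit. `KernelGroupSizeSketch`, `ResultSketch` and the single `maxGroupSize` limit are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

enum class ResultSketch { success, invalidGroupSize };

class KernelGroupSizeSketch {
  public:
    explicit KernelGroupSizeSketch(std::uint32_t limit) : maxGroupSize(limit) {
        assert(limit > 0);
    }

    ResultSketch setGroupSize(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
        // Fast path: taken only for values that were validated earlier.
        if (cacheValid && x == cached[0] && y == cached[1] && z == cached[2]) {
            return ResultSketch::success;
        }
        // Multiply step by step so the 64-bit intermediate cannot overflow.
        std::uint64_t total = static_cast<std::uint64_t>(x) * y;
        if (total == 0 || total > maxGroupSize) {
            return ResultSketch::invalidGroupSize; // cache is left untouched
        }
        total *= z;
        if (total == 0 || total > maxGroupSize) {
            return ResultSketch::invalidGroupSize; // cache is left untouched
        }
        cached[0] = x;
        cached[1] = y;
        cached[2] = z;
        cacheValid = true;
        return ResultSketch::success;
    }

  private:
    std::uint32_t maxGroupSize;
    std::uint32_t cached[3] = {0, 0, 0};
    bool cacheValid = false;
};
```

Because a failed call never updates the cache, repeating the same invalid arguments returns the error again instead of hitting the fast path.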
Fabian Zwolinski
cbce863dc2 refactor: Rename member variables to camelCase 3/n
Additionally enable clang-tidy check for member variables

Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-04-28 16:01:14 +02:00
Fabian Zwolinski
e351a90f81 refactor: Rename member variables to camelCase 2/n
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-04-27 20:39:22 +02:00
Zbigniew Zdanowicz
7731264fe3 [fix] update ray tracing commands programming
- 3D btd command should be programmed only once per context
- Add conditional pipe control command prior to dispatching 3D btd command
- share 3D btd state between immediate and regular command lists
- add pipe control after ray tracing kernel to invalidate state cache

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-03 11:21:24 +02:00
Rafal Maziejuk
b9828b543e feature: adjust maxWorkGroupSize value
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-28 15:19:52 +02:00
Mateusz Jablonski
dd39b822d3 feature implicit args: patch rt dispatch global array in implicit args buffer
handle has_rtcalls in kernels and functions in zebin

Related-To: NEO-7818
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-28 12:31:38 +02:00
Rafal Maziejuk
27ff1c911d feature l0: handle additional properties in modules
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-24 10:27:44 +01:00