Commit Graph

57 Commits

Author SHA1 Message Date
Maciej Plewka 73e4b6ae7c fix: remove w/a which disables wmtp in kernels with ray tracing
Related-To: NEO-12872
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-10-07 14:28:08 +02:00
John Falkowski 1d51f4b91c feature: Add driver-experimental API for retrieval of kernel binary program data
Related-To: NEO-11651

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2024-09-25 20:38:17 +02:00
Compute-Runtime-Validation 5dddd4a67f Revert "feature: Add experimental API for retrieval of kernel binary program ...
This reverts commit 24682e702b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-09-25 10:01:36 +02:00
John Falkowski 24682e702b feature: Add experimental API for retrieval of kernel binary program data
Related-To: NEO-11651

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2024-09-24 02:48:29 +02:00
Zbigniew Zdanowicz 672d8414f5 fix: remove not needed macro
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-09-19 17:01:17 +02:00
Zbigniew Zdanowicz 7e00590994 performance: get work group count per tile value when setting new group size
- change interface to function to accept external group size

Related-To: NEO-12639

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-09-16 10:45:01 +02:00
Bartosz Dunajski 4f1262645b refactor: pass extra walker params
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-09-10 15:30:03 +02:00
Mateusz Jablonski 14fd9f3f24 fix: correct using L0 loader functions
use zelLoaderTranslateHandle for translating handle to internal handle
get pointer to zelSetDriverTeardown during global ctor
don't load loader library by name
get loader function pointers directly from current process

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-09-04 17:30:25 +02:00
Mateusz Jablonski d45c16dfc2 fix: add fallback for invalid handles in extension functions
handle context, commandlist, driver, device, event, image and kernel handles

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-08-28 17:22:35 +02:00
Zbigniew Zdanowicz 1c1e437d4b refactor: split kernel residency into internal and argument containers
Related-To: NEO-11719

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-07-23 17:22:16 +02:00
Bartosz Dunajski 692def2c79 feature: region group barrier allocation support
Related-To: NEO-11031

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-06-03 18:34:54 +02:00
Dunajski, Bartosz f17f45d63f feature: initial support for patching region params
Related-To: NEO-8070

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-12-20 21:38:39 +01:00
Lu, Wenbin 37deaf1ae5 fix: serialize printf kernel accesses using device-wise locks
Related-To: LOCI-4114

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-11-27 10:42:51 +01:00
Maciej Bielski 97e7cda912 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules whose kernel ISAs can fit together within
a single 64k page. This improves memory utilization at the level of a
single module.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-21 13:55:45 +02:00
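The packing idea described in commit 97e7cda912 above can be illustrated with a minimal sketch. It is not the driver's implementation: the function name `packIsaIntoSharedPage`, the constants, and the 64-byte alignment are assumptions made for illustration. It only shows how several small ISA blocks could be given offsets inside one shared 64 KB allocation, with a fallback signal when they do not fit.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical names throughout; this is not the driver's code.
constexpr std::size_t kPageSize = 64 * 1024; // 64 KB page, as in the commit message
constexpr std::size_t kIsaAlignment = 64;    // assumed alignment of each ISA block

constexpr std::size_t alignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Returns the offset of each kernel's ISA inside one shared page, or
// std::nullopt when the ISAs do not fit together, in which case a caller
// would fall back to one allocation per kernel.
std::optional<std::vector<std::size_t>> packIsaIntoSharedPage(
    const std::vector<std::size_t> &isaSizes) {
    std::vector<std::size_t> offsets;
    std::size_t cursor = 0;
    for (std::size_t size : isaSizes) {
        cursor = alignUp(cursor, kIsaAlignment);
        if (size > kPageSize - cursor) {
            return std::nullopt; // the shared page would overflow
        }
        offsets.push_back(cursor);
        cursor += size;
    }
    return offsets;
}
```

With ISA sizes of 100, 200 and 4096 bytes, the three blocks share a single page instead of occupying three separate 64 KB pages.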
Compute-Runtime-Validation 913a926fd4 Revert "feature: Optimize intra-module kernel ISA allocations"
This reverts commit c348831470.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-09-19 14:16:05 +02:00
Maciej Bielski c348831470 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules whose kernel ISAs can fit together within
a single 64k page. This improves memory utilization at the level of a
single module.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-19 12:05:09 +02:00
Compute-Runtime-Validation 21a506b045 Revert "fix: serialize printf kernel accesses using device-wise locks"
This reverts commit 3d33366ff6.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-08-24 19:29:14 +02:00
Lu, Wenbin 3d33366ff6 fix: serialize printf kernel accesses using device-wise locks
Related-To: LOCI-4114

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-08-22 14:57:08 +02:00
Neil R Spruit ded9d7bff2 feature: Get Peer Allocation with specified base Pointer
Related-To: LOCI-4176

- When a base pointer is passed into Get Peer Allocation, that pointer
is used to map the new allocation into virtual memory.
- Enables users to use the same pointer for all devices in Peer To Peer.
- Currently unsupported on reserved memory due to mapped and exec
residency of virtual addresses.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-05-24 20:41:20 +02:00
Maciej Bielski 3ec0a637ba fix(l0): return API error on ISA allocation OOM
It is possible that a module has so many kernels that the 4 GB limit of
GPU VA is depleted when each kernel allocates a 64 KB page for its own
ISA. In such a case, propagate ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY to
the API caller to indicate the actual problem.

Currently such a scenario is not detected, the execution advances a bit
further, and the resulting crashes make it hard for the user to
understand what happened.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-03-23 17:30:15 +01:00
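The arithmetic behind commit 3ec0a637ba above is worth spelling out: a 4 GB window divided into 64 KB pages gives 65536 pages, so one page per kernel ISA runs out at the 65537th kernel. The sketch below assumes nothing else occupies that address range. `Result` and `allocateIsaPages` are hypothetical stand-ins for illustration, not the driver's types.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint64_t kGpuVaLimit = 4ull * 1024 * 1024 * 1024; // 4 GB window from the commit message
constexpr std::uint64_t kIsaPageSize = 64ull * 1024;              // one 64 KB page per kernel ISA

constexpr std::uint64_t maxKernelsWithPagePerIsa() {
    return kGpuVaLimit / kIsaPageSize;
}
static_assert(maxKernelsWithPagePerIsa() == 65536, "4 GB / 64 KB");

// Stand-in for an API result code such as ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY.
enum class Result { success, outOfDeviceMemory };

// Reserves one page per kernel and reports exhaustion to the caller
// instead of continuing with a failed allocation.
Result allocateIsaPages(std::uint64_t kernelCount) {
    std::uint64_t used = 0;
    for (std::uint64_t i = 0; i < kernelCount; ++i) {
        if (kIsaPageSize > kGpuVaLimit - used) {
            return Result::outOfDeviceMemory;
        }
        used += kIsaPageSize;
    }
    return Result::success;
}
```

Returning the error at the point of exhaustion is what lets the API caller see the real cause rather than a later crash.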
Zbigniew Zdanowicz c8b90613a8 [perf] simplify command list preemption state transition
- apply relevant flags only on platforms supporting these flags
- update command list preemption level when supported
- use actual kernel preemption level to program interface descriptor data

Related-To: NEO-7771

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-02 12:19:02 +01:00
Mateusz Hoppe d623ef391b feature: print printf contents right after gpu hang detection
- kernel printf output is normally printed on the synchronize() call;
if a hang was detected, the printf buffer was not printed immediately
but only when the Kernel was destroyed
- this change copies the printf buffer with the internal engine
(whenever available) right after hang detection in the
CommandQueue::synchronize() call

Related-To: NEO-6427

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-01-11 08:14:00 +01:00
Filip Hazubski 35d1f2e341 Add debug flag to control programming of thread arbitration policy with SCM
Related-To: NEO-6801

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2022-05-27 11:35:41 +02:00
Jaime Arteaga e8a6842b7e Add method to read kernel base address
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2022-03-25 21:49:22 +01:00
Filip Hazubski dd01cff879 Unify logic determining thread arbitration policy value
Related-To: NEO-6728

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2022-03-08 13:14:56 +01:00
Igor Venevtsev 71746a2fff Register zebin binary in L0 debugger
Related-To: NEO-5571

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2022-01-12 23:17:59 +01:00
Mateusz Hoppe 17f82bbe12 Fix double ISA transfer for user kernels in L0
Related-To: NEO-6555

- ISA should only be copied once, after linking phase is complete

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2021-12-21 07:54:51 +01:00
Mateusz Jablonski f958b053ab Merge patchWorkDim method's logic into setGroupCount method
Related-To: NEO-5081
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2021-09-14 08:57:24 +02:00
Mateusz Jablonski caddc63eec Remove not needed function
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2021-09-07 12:24:35 +02:00
Filip Hazubski de1e4e0074 Add adjustMaxWorkGroupCount helper
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2021-08-27 15:39:48 +02:00
Vinod Tipparaju 37670aeb91 Add support for new thread arbitration policies via zeKernelSchedulingHintExp
Related-To: LOCI-2319

Signed-off-by: Vinod Tipparaju <vinod.tipparaju@intel.com>
2021-08-09 21:07:08 +02:00
Jaroslaw Chodor 7c6c45f5b5 Add option to allocate private mem per dispatch
Signed-off-by: Jaroslaw Chodor <jaroslaw.chodor@intel.com>
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
2021-07-27 13:34:12 +02:00
Dominik Dabek dc9b2351d5 Change patchGlobalOffset in l0 kernel to void
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2021-07-06 11:36:53 +02:00
Dominik Dabek 62f89b174a Add work_dim patching to l0 kernel
Related-To: NEO-5931

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2021-07-05 20:09:20 +02:00
lgotszal 3bd4bca911 Copyright header update
Dates corrected in copyright headers to reflect original publication date
(2018 for OpenCL, 2020 for Level Zero).

Signed-off-by: lgotszal <lukasz.gotszald@intel.com>
2021-05-17 20:38:19 +02:00
Mateusz Jablonski 35ff284944 Cleanup Kernel class
move deviceVector to MultiDeviceKernel class
remove Device arg from Kernel's methods

Related-To: NEO-5001
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2021-03-24 09:17:41 +01:00
Filip Hazubski 8d55bfe21d Implement zeCommandListAppendLaunchCooperativeKernel
Resolves: NEO-4725


Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2021-03-22 19:26:41 +01:00
davidoli 8fdd1931a9 improve stub for zetKernelGetProfileInfo with ULT
Signed-off-by: davidoli <david.olien@intel.com>
2021-03-01 00:17:58 +01:00
Mateusz Hoppe 6dd0f0c728 Relocate debug data
Related-To: NEO-4769

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2021-02-16 14:59:30 +01:00
Jaime Arteaga afffedebb2 Move ISA at kernel creation time
Instead of moving the ISAs for all kernels in a module when the module
is created, move the ISA when the kernel is created, to avoid
unnecessary memory transfers.

Related-To: LOCI-2009

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-02-01 13:28:38 +01:00
Jaime Arteaga 05b5ad37ea Initialize kernel private surface when kernel is created
Do this at kernel creation instead of when the associated module is
created, to avoid allocating memory for kernels that are never created or used.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-13 17:22:40 +01:00
Jaime Arteaga 08655a315c Revert "Initialize kernel private surface when kernel is created"
This reverts commit be2a87fe98.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-10 22:56:32 +01:00
Jaime Arteaga be2a87fe98 Initialize kernel private surface when kernel is created
Do this at kernel creation instead of when the associated module is
created, to avoid allocating memory for kernels that are never created or used.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-08 19:22:17 +01:00
Jaime Arteaga d7ea713c5f Revert "Initialize kernel immutable data when kernel is created"
This reverts commit a6ac10088c.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-04 11:11:39 +01:00
Jaime Arteaga a6ac10088c Initialize kernel immutable data when kernel is created
Do this at kernel creation instead of when the associated module is
created, to avoid allocating memory for kernels that are never created or used.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2020-12-29 19:29:10 +01:00
Jim Snow 37cd49330c Implement ZE_CACHE_CONFIG_FLAG_LARGE_DATA for zeKernelSetCacheConfig
Signed-off-by: Jim Snow <jim.m.snow@intel.com>
2020-12-16 07:00:13 +01:00
Jaime Arteaga beb3c5ed05 Add support for global work offset extension in L0
Add an experimental extension to set the global work offset in L0.
The current L0 specification has no interface for exporting
experimental function symbols, so for now applications need
to look the symbol up themselves, for example with dlsym on Linux.

A blackbox test showing functionality is also added.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2020-12-09 07:33:40 +01:00
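The dlsym lookup mentioned in commit beb3c5ed05 above might look roughly like the sketch below. Everything here is an assumption for illustration. The helper name `findSetGlobalOffset` is hypothetical, the handle and function-pointer types are stand-ins so the block compiles without Level Zero headers, and the symbol name is supplied by the caller because the commit message does not state it (the later Level Zero specification names the function `zeKernelSetGlobalOffsetExp`).

```cpp
#include <cassert>
#include <cstdint>
#include <dlfcn.h>

// Stand-in types (assumptions); real code would use the Level Zero headers.
using KernelHandle = void *;
using SetGlobalOffsetFn = int (*)(KernelHandle, std::uint32_t, std::uint32_t, std::uint32_t);

// Looks a symbol up in an already opened library handle.
// Returns nullptr when the symbol is not exported.
SetGlobalOffsetFn findSetGlobalOffset(void *libraryHandle, const char *symbolName) {
    dlerror(); // clear any stale error state before the lookup
    void *address = dlsym(libraryHandle, symbolName);
    return reinterpret_cast<SetGlobalOffsetFn>(address);
}
```

An application would pass the handle it obtained when opening the driver library and check the returned pointer for nullptr before calling through it.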
macabral 720ba46548 Register kernel Elf for debugging purpose
Signed-off-by: macabral <matias.a.cabral@intel.com>
2020-12-01 17:16:14 +01:00
Mateusz Hoppe 0f42ef1ed7 Differentiate between users ISA and internal ISA allocation
Related-To: NEO-5240

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2020-11-16 13:16:30 +01:00
Jaime Arteaga b3700370a6 Remove dead-code functions for cache intermediate/last-level config
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2020-11-14 04:23:36 +01:00