Commit Graph

59 Commits

Author SHA1 Message Date
Filip Hazubski
6b2b42972a fix: Add asserts to ensure NonCopyable and NonMovable 1/n
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-18 09:41:20 +01:00
Wojciech Konior
6b40f9bc5a refactor: engineInstancedType removed
Related-To: NEO-12594

Signed-off-by: Wojciech Konior <wojciech.konior@intel.com>
2024-10-09 16:30:48 +02:00
Maciej Plewka
73e4b6ae7c fix: remove w/a which disables wmtp in kernels with ray tracing
Related-To: NEO-12872
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-10-07 14:28:08 +02:00
John Falkowski
1d51f4b91c feature: Add driver-experimental API for retrieval of kernel binary program data
Related-To: NEO-11651

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2024-09-25 20:38:17 +02:00
Compute-Runtime-Validation
5dddd4a67f Revert "feature: Add experimental API for retrieval of kernel binary program ...
This reverts commit 24682e702b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-09-25 10:01:36 +02:00
John Falkowski
24682e702b feature: Add experimental API for retrieval of kernel binary program data
Related-To: NEO-11651

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2024-09-24 02:48:29 +02:00
Zbigniew Zdanowicz
672d8414f5 fix: remove not needed macro
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-09-19 17:01:17 +02:00
Zbigniew Zdanowicz
7e00590994 performance: get work group count per tile value when setting new group size
- change interface to function to accept external group size

Related-To: NEO-12639

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-09-16 10:45:01 +02:00
Bartosz Dunajski
4f1262645b refactor: pass extra walker params
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-09-10 15:30:03 +02:00
Mateusz Jablonski
14fd9f3f24 fix: correct using L0 loader functions
use zelLoaderTranslateHandle for translating handle to internal handle
get pointer to zelSetDriverTeardown during global ctor
don't load loader library by name
get loader function pointers directly from current process

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-09-04 17:30:25 +02:00
Mateusz Jablonski
d45c16dfc2 fix: add fallback for invalid handles in extension functions
handle context, commandlist, driver, device, event, image and kernel handles

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-08-28 17:22:35 +02:00
Zbigniew Zdanowicz
1c1e437d4b refactor: split kernel residency into internal and argument containers
Related-To: NEO-11719

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-07-23 17:22:16 +02:00
Bartosz Dunajski
692def2c79 feature: region group barrier allocation support
Related-To: NEO-11031

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-06-03 18:34:54 +02:00
Dunajski, Bartosz
f17f45d63f feature: initial support for patching region params
Related-To: NEO-8070

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-12-20 21:38:39 +01:00
Lu, Wenbin
37deaf1ae5 fix: serialize printf kernel accesses using device-wise locks
Related-To: LOCI-4114

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-11-27 10:42:51 +01:00
Maciej Bielski
97e7cda912 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules, which kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-21 13:55:45 +02:00
Compute-Runtime-Validation
913a926fd4 Revert "feature: Optimize intra-module kernel ISA allocations"
This reverts commit c348831470.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-09-19 14:16:05 +02:00
Maciej Bielski
c348831470 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules, which kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-19 12:05:09 +02:00
Compute-Runtime-Validation
21a506b045 Revert "fix: serialize printf kernel accesses using device-wise locks"
This reverts commit 3d33366ff6.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-08-24 19:29:14 +02:00
Lu, Wenbin
3d33366ff6 fix: serialize printf kernel accesses using device-wise locks
Related-To: LOCI-4114

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-08-22 14:57:08 +02:00
Neil R Spruit
ded9d7bff2 feature: Get Peer Allocation with specified base Pointer
Related-To: LOCI-4176

- Given a Base Pointer passed into Get Peer Allocation, then the base
pointer is used in the map of the new allocation to the virtual memory.
- Enables users to use the same pointer for all devices in Peer To Peer.
- Currently unsupported on reserved memory due to mapped and exec
resiedency of Virtual addresses.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-05-24 20:41:20 +02:00
Maciej Bielski
3ec0a637ba fix(l0): return API error on ISA allocation OOM
It is possible that a module has so many kernels that the 4GB limit of
GPU VA is depleted when each kernel allocates a 64 KB page for its own
ISA. In such case, propagate the ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY to
the API caller to indicate the actual problem.

Currently such scenario is not detected, the execution advances a bit
further and the following crashes do not let the user to easily
understand what happened.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-03-23 17:30:15 +01:00
Zbigniew Zdanowicz
c8b90613a8 [perf] simplify command list preemption state transition
- apply revelant flags only on platforms supporting these flags
- update command list preemption level when supported
- use actual kernel preemption level to program interface descriptor data

Related-To: NEO-7771

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-02 12:19:02 +01:00
Mateusz Hoppe
d623ef391b feature: print printf contents right after gpu hang detection
- printf used in kernel is printed on synchronize() call, if
hang is detected - printf buffer was not printed immediately but
only when Kernel was destroyed
- this change adds copying printf buffer with internal engine
(whenever available) right after hang detection on
CommandQueue::synchronize() call

Related-To: NEO-6427

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-01-11 08:14:00 +01:00
Filip Hazubski
35d1f2e341 Add debug flag to control programming of thread arbitration policy with SCM
Related-To: NEO-6801

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2022-05-27 11:35:41 +02:00
Jaime Arteaga
e8a6842b7e Add method to read kernel base address
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2022-03-25 21:49:22 +01:00
Filip Hazubski
dd01cff879 Unify logic determining thread arbitration policy value
Related-To: NEO-6728

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2022-03-08 13:14:56 +01:00
Igor Venevtsev
71746a2fff Register zebin binary in L0 debugger
Related-To: NEO-5571

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2022-01-12 23:17:59 +01:00
Mateusz Hoppe
17f82bbe12 Fix double ISA transfer for user kernels in L0
Related-To: NEO-6555

- ISA should only be copied once, after linking phase is complete

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2021-12-21 07:54:51 +01:00
Mateusz Jablonski
f958b053ab Merge patchWorkDim method's logic into setGroupCount method
Related-To: NEO-5081
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2021-09-14 08:57:24 +02:00
Mateusz Jablonski
caddc63eec Remove not needed function
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2021-09-07 12:24:35 +02:00
Filip Hazubski
de1e4e0074 Add adjustMaxWorkGroupCount helper
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2021-08-27 15:39:48 +02:00
Vinod Tipparaju
37670aeb91 Add support for new thread arbitration policies via zeKernelSchedulingHintExp
Related-To: LOCI-2319

Signed-off-by: Vinod Tipparaju <vinod.tipparaju@intel.com>
2021-08-09 21:07:08 +02:00
Jaroslaw Chodor
7c6c45f5b5 Add option to allocate private mem per dispatch
Signed-off-by: Jaroslaw Chodor <jaroslaw.chodor@intel.com>
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
2021-07-27 13:34:12 +02:00
Dominik Dabek
dc9b2351d5 Change patchGlobalOffset in l0 kernel to void
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2021-07-06 11:36:53 +02:00
Dominik Dabek
62f89b174a Add work_dim patching to l0 kernel
Related-To: NEO-5931

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2021-07-05 20:09:20 +02:00
lgotszal
3bd4bca911 Copyright header update
Dates corrected in copyright headers to reflect original publication date
(2018 for OpenCL, 2020 for Level Zero).

Signed-off-by: lgotszal <lukasz.gotszald@intel.com>
2021-05-17 20:38:19 +02:00
Mateusz Jablonski
35ff284944 Cleanup Kernel class
move deviceVector to MultiDeviceKernel class
remove Device arg from Kernel's methods

Related-To: NEO-5001
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2021-03-24 09:17:41 +01:00
Filip Hazubski
8d55bfe21d Implement zeCommandListAppendLaunchCooperativeKernel
Resolves: NEO-4725


Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2021-03-22 19:26:41 +01:00
davidoli
8fdd1931a9 improve stub for zetKernelGetProfileInfo with ULT
Signed-off-by: davidoli <david.olien@intel.com>
2021-03-01 00:17:58 +01:00
Mateusz Hoppe
6dd0f0c728 Relocate debug data
Related-To: NEO-4769

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2021-02-16 14:59:30 +01:00
Jaime Arteaga
afffedebb2 Move ISA at kernel creation time
Instead of moving the ISAs for all kernel in a module when the module
is created, move the ISA when the kernel is created, to avoid
unnecessary memory transfers.

Related-To: LOCI-2009

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-02-01 13:28:38 +01:00
Jaime Arteaga
05b5ad37ea Initialize kernel private surface when kernel is created
This instead of when the associated module is created, to avoid
allocating memory for kernels that are never created nor used.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-13 17:22:40 +01:00
Jaime Arteaga
08655a315c Revert "Initialize kernel private surface when kernel is created"
This reverts commit be2a87fe98.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-10 22:56:32 +01:00
Jaime Arteaga
be2a87fe98 Initialize kernel private surface when kernel is created
This instead of when the associated module is created, to avoid
allocating memory for kernels that are never created nor used.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-08 19:22:17 +01:00
Jaime Arteaga
d7ea713c5f Revert "Initialize kernel immutable data when kernel is created"
This reverts commit a6ac10088c.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-04 11:11:39 +01:00
Jaime Arteaga
a6ac10088c Initialize kernel immutable data when kernel is created
This instead of when the associated module is created, to avoid
allocating memory for kernels that are never created nor used.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2020-12-29 19:29:10 +01:00
Jim Snow
37cd49330c Implement ZE_CACHE_CONFIG_FLAG_LARGE_DATA for zeKernelSetCacheConfig
Signed-off-by: Jim Snow <jim.m.snow@intel.com>
2020-12-16 07:00:13 +01:00
Jaime Arteaga
beb3c5ed05 Add support for global work offset extension in L0
Add experimental extension to set global work offest in L0.
Current L0 specification does not have interface to export
experimental function symbols, so for now, applications need
to find the symbol like with dlsym on Linux.

A blackbox test showing functionality is also added.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2020-12-09 07:33:40 +01:00
macabral
720ba46548 Register kernel Elf for debugging purpose
Signed-off-by: macabral <matias.a.cabral@intel.com>
2020-12-01 17:16:14 +01:00