Commit Graph

44 Commits

Author SHA1 Message Date
Maciej Bielski 97e7cda912 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules whose kernel ISAs fit together within
a single 64k page. This improves memory utilization at the level of a
single module.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-21 13:55:45 +02:00
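
The packing idea in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the driver's implementation: `planIsaLayout`, `IsaLayout`, `kPageSize` and `kIsaAlignment` are hypothetical names, the 64-byte alignment is an assumption, and the fallback to separate pages per kernel is inferred from the "so far" behavior the message describes.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical constants for illustration only; not taken from the driver sources.
constexpr std::size_t kPageSize = 64 * 1024; // one 64k page
constexpr std::size_t kIsaAlignment = 64;    // assumed alignment of each kernel's ISA

constexpr std::size_t alignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

struct IsaLayout {
    bool sharedPage;                  // true if all ISAs live in the parent allocation
    std::vector<std::size_t> offsets; // offset of each kernel's ISA in its allocation
    std::size_t pagesUsed;            // number of 64k pages consumed in total
};

// Decide whether all kernel ISAs of a module fit into one shared page.
IsaLayout planIsaLayout(const std::vector<std::size_t>& isaSizes) {
    assert(kIsaAlignment > 0);
    IsaLayout layout{true, {}, 0};
    std::size_t cursor = 0;
    for (std::size_t size : isaSizes) {
        layout.offsets.push_back(cursor);
        cursor += alignUp(size, kIsaAlignment);
    }
    if (cursor <= kPageSize) {
        layout.pagesUsed = isaSizes.empty() ? 0 : 1;
        return layout;
    }
    // Fallback (assumed): every kernel gets its own page-granular allocation.
    layout.sharedPage = false;
    layout.offsets.assign(isaSizes.size(), 0);
    for (std::size_t size : isaSizes) {
        layout.pagesUsed += std::max<std::size_t>(1, alignUp(size, kPageSize) / kPageSize);
    }
    return layout;
}
```

With three small kernels of 1000, 2000 and 300 bytes, this sketch places all of them in one page instead of three; two 40000-byte kernels exceed the page and fall back to one page each.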
Compute-Runtime-Validation 913a926fd4 Revert "feature: Optimize intra-module kernel ISA allocations"
This reverts commit c348831470.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-09-19 14:16:05 +02:00
Maciej Bielski c348831470 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules whose kernel ISAs fit together within
a single 64k page. This improves memory utilization at the level of a
single module.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-19 12:05:09 +02:00
Compute-Runtime-Validation 21a506b045 Revert "fix: serialize printf kernel accesses using device-wise locks"
This reverts commit 3d33366ff6.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-08-24 19:29:14 +02:00
Lu, Wenbin 3d33366ff6 fix: serialize printf kernel accesses using device-wise locks
Related-To: LOCI-4114

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-08-22 14:57:08 +02:00
Neil R Spruit ded9d7bff2 feature: Get Peer Allocation with specified base Pointer
Related-To: LOCI-4176

- When a base pointer is passed into Get Peer Allocation, that base
pointer is used to map the new allocation into virtual memory.
- Enables users to use the same pointer for all devices in Peer To Peer.
- Currently unsupported on reserved memory due to mapped and exec
residency of virtual addresses.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-05-24 20:41:20 +02:00
Maciej Bielski 3ec0a637ba fix(l0): return API error on ISA allocation OOM
It is possible that a module has so many kernels that the 4GB limit of
GPU VA is depleted when each kernel allocates a 64 KB page for its own
ISA. In such case, propagate the ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY to
the API caller to indicate the actual problem.

Currently such a scenario is not detected: execution advances a bit
further, and the resulting crashes make it hard for the user to
understand what happened.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-03-23 17:30:15 +01:00
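
As a rough worked example of the limit described in the commit message above: taking the 4GB and 64 KB figures as binary units (an assumption), the address range holds 65536 page-sized ISA allocations. The helper name `maxIsaAllocations` is hypothetical and only illustrates the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Number of page-sized ISA allocations that fit into a given GPU VA range.
// Hypothetical helper for illustration; not part of the driver.
std::uint64_t maxIsaAllocations(std::uint64_t vaBytes, std::uint64_t pageBytes) {
    assert(pageBytes > 0);
    return vaBytes / pageBytes;
}
```

For example, `maxIsaAllocations(4ull << 30, 64ull << 10)` evaluates to 65536, so a module with more kernels than that cannot give each kernel its own page within such a range.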
Zbigniew Zdanowicz c8b90613a8 [perf] simplify command list preemption state transition
- apply relevant flags only on platforms supporting these flags
- update command list preemption level when supported
- use actual kernel preemption level to program interface descriptor data

Related-To: NEO-7771

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-02 12:19:02 +01:00
Mateusz Hoppe d623ef391b feature: print printf contents right after gpu hang detection
- printf output from a kernel is printed on the synchronize() call; if a
hang was detected, the printf buffer was not printed immediately but
only when the Kernel was destroyed
- this change copies the printf buffer with the internal engine
(whenever available) right after hang detection in the
CommandQueue::synchronize() call

Related-To: NEO-6427

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-01-11 08:14:00 +01:00
Filip Hazubski 35d1f2e341 Add debug flag to control programming of thread arbitration policy with SCM
Related-To: NEO-6801

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2022-05-27 11:35:41 +02:00
Jaime Arteaga e8a6842b7e Add method to read kernel base address
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2022-03-25 21:49:22 +01:00
Filip Hazubski dd01cff879 Unify logic determining thread arbitration policy value
Related-To: NEO-6728

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2022-03-08 13:14:56 +01:00
Igor Venevtsev 71746a2fff Register zebin binary in L0 debugger
Related-To: NEO-5571

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2022-01-12 23:17:59 +01:00
Mateusz Hoppe 17f82bbe12 Fix double ISA transfer for user kernels in L0
Related-To: NEO-6555

- ISA should only be copied once, after linking phase is complete

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2021-12-21 07:54:51 +01:00
Mateusz Jablonski f958b053ab Merge patchWorkDim method's logic into setGroupCount method
Related-To: NEO-5081
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2021-09-14 08:57:24 +02:00
Mateusz Jablonski caddc63eec Remove not needed function
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2021-09-07 12:24:35 +02:00
Filip Hazubski de1e4e0074 Add adjustMaxWorkGroupCount helper
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2021-08-27 15:39:48 +02:00
Vinod Tipparaju 37670aeb91 Add support for new thread arbitration policies via zeKernelSchedulingHintExp
Related-To: LOCI-2319

Signed-off-by: Vinod Tipparaju <vinod.tipparaju@intel.com>
2021-08-09 21:07:08 +02:00
Jaroslaw Chodor 7c6c45f5b5 Add option to allocate private mem per dispatch
Signed-off-by: Jaroslaw Chodor <jaroslaw.chodor@intel.com>
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
2021-07-27 13:34:12 +02:00
Dominik Dabek dc9b2351d5 Change patchGlobalOffset in l0 kernel to void
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2021-07-06 11:36:53 +02:00
Dominik Dabek 62f89b174a Add work_dim patching to l0 kernel
Related-To: NEO-5931

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2021-07-05 20:09:20 +02:00
lgotszal 3bd4bca911 Copyright header update
Dates corrected in copyright headers to reflect original publication date
(2018 for OpenCL, 2020 for Level Zero).

Signed-off-by: lgotszal <lukasz.gotszald@intel.com>
2021-05-17 20:38:19 +02:00
Mateusz Jablonski 35ff284944 Cleanup Kernel class
move deviceVector to MultiDeviceKernel class
remove Device arg from Kernel's methods

Related-To: NEO-5001
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2021-03-24 09:17:41 +01:00
Filip Hazubski 8d55bfe21d Implement zeCommandListAppendLaunchCooperativeKernel
Resolves: NEO-4725


Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2021-03-22 19:26:41 +01:00
davidoli 8fdd1931a9 improve stub for zetKernelGetProfileInfo with ULT
Signed-off-by: davidoli <david.olien@intel.com>
2021-03-01 00:17:58 +01:00
Mateusz Hoppe 6dd0f0c728 Relocate debug data
Related-To: NEO-4769

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2021-02-16 14:59:30 +01:00
Jaime Arteaga afffedebb2 Move ISA at kernel creation time
Instead of moving the ISAs for all kernels in a module when the module
is created, move the ISA when the kernel is created, to avoid
unnecessary memory transfers.

Related-To: LOCI-2009

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-02-01 13:28:38 +01:00
Jaime Arteaga 05b5ad37ea Initialize kernel private surface when kernel is created
Do this when the kernel is created rather than when the associated
module is created, to avoid allocating memory for kernels that are
never created or used.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-13 17:22:40 +01:00
Jaime Arteaga 08655a315c Revert "Initialize kernel private surface when kernel is created"
This reverts commit be2a87fe98.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-10 22:56:32 +01:00
Jaime Arteaga be2a87fe98 Initialize kernel private surface when kernel is created
Do this when the kernel is created rather than when the associated
module is created, to avoid allocating memory for kernels that are
never created or used.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-08 19:22:17 +01:00
Jaime Arteaga d7ea713c5f Revert "Initialize kernel immutable data when kernel is created"
This reverts commit a6ac10088c.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2021-01-04 11:11:39 +01:00
Jaime Arteaga a6ac10088c Initialize kernel immutable data when kernel is created
Do this when the kernel is created rather than when the associated
module is created, to avoid allocating memory for kernels that are
never created or used.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2020-12-29 19:29:10 +01:00
Jim Snow 37cd49330c Implement ZE_CACHE_CONFIG_FLAG_LARGE_DATA for zeKernelSetCacheConfig
Signed-off-by: Jim Snow <jim.m.snow@intel.com>
2020-12-16 07:00:13 +01:00
Jaime Arteaga beb3c5ed05 Add support for global work offset extension in L0
Add experimental extension to set global work offset in L0.
The current L0 specification does not have an interface to export
experimental function symbols, so for now, applications need
to look up the symbol themselves, e.g. with dlsym on Linux.

A blackbox test showing functionality is also added.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2020-12-09 07:33:40 +01:00
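
The symbol lookup mentioned in the commit message above might look roughly like this on Linux. It is a sketch under assumptions: `findSymbol` is a hypothetical helper, and the library path and the name of the experimental function are deliberately left to the caller because the message does not state them. Only the POSIX calls `dlopen` and `dlsym` are used.

```cpp
#include <cassert>
#include <dlfcn.h>

// Look up a function symbol by name in a shared library.
// Passing nullptr as libraryPath searches the already loaded program image.
// Returns nullptr if the library or the symbol cannot be found.
void* findSymbol(const char* libraryPath, const char* symbolName) {
    assert(symbolName != nullptr);
    void* handle = dlopen(libraryPath, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        return nullptr;
    }
    // The handle is intentionally left open so the returned pointer stays valid.
    return dlsym(handle, symbolName);
}
```

An application would cast the returned pointer to the function type documented for the extension and check it for null before calling it. On older glibc versions the program may need to be linked with `-ldl`.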
macabral 720ba46548 Register kernel Elf for debugging purpose
Signed-off-by: macabral <matias.a.cabral@intel.com>
2020-12-01 17:16:14 +01:00
Mateusz Hoppe 0f42ef1ed7 Differentiate between users ISA and internal ISA allocation
Related-To: NEO-5240

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2020-11-16 13:16:30 +01:00
Jaime Arteaga b3700370a6 Remove dead-code functions for cache intermediate/last-level config
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2020-11-14 04:23:36 +01:00
Jaroslaw Chodor df2e76f526 Fixing residency of extern device functions
Change-Id: Icad696cbf6fb3fc0276f0d0d488bf92091525d9b
2020-10-02 12:27:59 +02:00
Jaime Arteaga 902fc2f6c4 level-zero v1.0 (2/N)
Change-Id: I1419231a721fab210e166d26a264cae04d661dcd
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Signed-off-by: macabral <matias.a.cabral@intel.com>
Signed-off-by: davidoli <david.olien@intel.com>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
Signed-off-by: Latif, Raiyan <raiyan.latif@intel.com>
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2020-08-03 13:11:13 +02:00
Jaime Arteaga 169089347f Add support for zeKernelGetName
Change-Id: I167cc202436b6a76841c56e46baa684e7be90132
Signed-off: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2020-07-31 00:24:37 +02:00
Jaime Arteaga 5b61ad0966 Add stub for dynamic link function and for extended kernel properties
Change-Id: Ifaaf1226114233618e7959def086989cf93bd0bd
Signed-off: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2020-06-29 02:11:29 +02:00
Mateusz Hoppe 4c23b60b30 Refactor setArgBufferWithAlloc, add zello_world blackbox test
Change-Id: I793f960582ce8c066dedd466befcbf534d6d7ddc
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2020-05-05 14:41:45 +02:00
Jaroslaw Chodor 2c25777f3c DispatchKernelEncoder refactor
Replacing parts of DispatchKernelEncoder with KernelDescriptor

Change-Id: I1c780b04a2d3d1de0fb75d5413a0dde8b41bbe07
2020-04-08 16:19:21 +02:00
Jaime Arteaga d96e462754 Reorganize Level Zero Core API files
Change-Id: I95750b90748dd65310fa72b030ea3ab2f72d3f24
Signed-off: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2020-03-25 11:21:43 +01:00