Commit Graph

1886 Commits

Author SHA1 Message Date
Milczarek, Slawomir 01d03aa5b6 Extended regkey to force prefetch of shared memory in enqueue commands
Extended the regkey ForceMemoryPrefetchForKmdMigratedSharedAllocations
to force meory prefetch of kmd-migrated shared allocation
in clEnqueueNDRangeKernel(), clEnqueueMemFillINTEL, ...

Related-To: NEO-7841

Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
2023-04-11 11:23:48 +02:00
Neil R Spruit d31b950b9a l0_feature: Use L0 Loader teardown callback
Related-To: LOCI-4174

- Call zelSetDriverTeardown during L0 Driver teardown to prevent users
from calling into destroyed functions and encountering crashes
during teardown.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-04-11 11:16:26 +02:00
John Falkowski e056082710 refactor graphics allocation structure elements for sub-allocation properties
Resolves:  LOCI-3772

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2023-04-07 16:53:23 +02:00
Zbigniew Zdanowicz 66c19c7749 [perf] remove redundant for loops in command list execution method
This fix is most important for multi command list execution use cases.
It is also benefitial for single command list execution, as driver saves
on loop enters and exits.
Methods handling single command list instead of array of objects are simpler.

Removed loops were at:
- CommandListExecutionContext constructor
- estimateLinearStreamSizeInitial method
- computePreemptionSize method
- collectPrintfContentsFromAllCommandsLists method

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 15:27:04 +02:00
Zbigniew Zdanowicz d4109eb153 [feat, perf] add closing mechanism to command list primary batch buffers
This change adds space reservation in command list for returning batch buffer
start hw command.
Primary batch buffer can be run from direct submission or from KMD call and
must be aligned to required size.
Ending patch for batch buffer start must be in the last command buffer of the
command list.

Related-To: NEO-7807

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 11:28:41 +02:00
Zbigniew Zdanowicz 1fcf564cc1 Enable state base address tracking
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 11:22:24 +02:00
Zbigniew Zdanowicz 09b58f4a22 [perf] group once per context calls under single condition
Plenty of calls require hw command programming only once per context.
There is no need to visit every method of them every execute call.
Set global init flag only if any of them is true and then visit all of them.
But for regular command list execution it can save time when there is single
global check.

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 09:21:28 +02:00
Zbigniew Zdanowicz 9ce5351d3f [fix] invalidate state caches only for heaps used by initialized context
This is number of small tweaks to state cache invalidation:
1. Invalidate if heap was actually created.
2. Check if os context was actually initialized.
3. Heap allocation was actually submitted, as it might attain zero task count
value, when allocation is stored in csr internal storage, as csr wasn't used,
but the csr task count being zero is assigned to heap allocation when stored.

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 09:16:12 +02:00
Katarzyna Cencelewska 1dc3b9b1e1 fix: add missing call make resident for dummy blit allocation
Resolves: HSD-18028732286
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2023-04-06 09:22:53 +02:00
Zbigniew Zdanowicz e695059152 [perf] reduce host overhead in command list reset call
There is no need to reset all fields and load support flags every reset call.
Add dedicated calls that will reset values and dirty flags.
Call virtual methods only once at init time.

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-05 11:29:39 +02:00
Mateusz Hoppe d83684785c feature: add dumping debug elf file
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-04-04 18:28:16 +02:00
Compute-Runtime-Validation e1af516c25 Revert "Enable state base address tracking"
This reverts commit 6a08d29869.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-04-04 11:37:19 +02:00
Zbigniew Zdanowicz a5179aae0b [perf] add debug key and control variable to command list primary buffer
Related-To: NEO-7807

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-04 10:58:11 +02:00
Brandon Yates 7dc4fd8dda fix(l0): Fix cache properties for subdevice
total cache size calculation should use subdevice count
rather than MultiTileArchInfo to ensure correct size when subdevices
are queried

Related-to: LOCI-4217

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2023-04-04 08:05:09 +02:00
Zbigniew Zdanowicz 6a08d29869 Enable state base address tracking
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-03 15:26:09 +02:00
Zbigniew Zdanowicz 7731264fe3 [fix] update ray tracing commands programing
- 3D btd command should be programed only once per context
- Add conditional pipe control command prior dispatching 3D btd command
- share 3D btd state between immediate and regular command lists
- add pipe control after ray tracing kernel to invalidate state cache

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-03 11:21:24 +02:00
Milczarek, Slawomir 50da94dc56 Add regkey to force prefetch of shared memory in cmd list execute
Add the regkey ForceMemoryPrefetchForKmdMigratedSharedAllocations
to force meory prefetch of kmd-migrated shared allocation
in zeCommandQueueExecuteCommandLists().

Related-To: NEO-7841

Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
2023-04-03 11:14:18 +02:00
Naklicki, Mateusz d9be9b32f7 fix: add offset when alloc address does not match src/dst address for blit cmd
Include an offset for the appendMemoryCopyBlitRegion
parameters when an source/destination pointer is offseted within
allocation.
Lack of this offset could result in an invalid pointer when source and
destination are on the same allocation or they do not point the start of
allocation.

Related-To: NEO-7694
Signed-off-by: Naklicki, Mateusz <mateusz.naklicki@intel.com>
2023-04-03 10:01:11 +02:00
Fabian Zwolinski c0603e0854 Allocate SipKernel per ctx for Offline dbg mode
- Add debuggingEnabledMode getter in ExecutionEnvironment
- Add new overloaded function - BuiltIns::getSipKernel
- Add perContextSipKernels map to BuiltIns
- Add OsContext to PreemptionHelper::programStateSip arguments
- Add new overloaded function - SipKernel::getBindlessDebugSipKernel

Related-To: NEO-7630
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-03-30 16:40:41 +02:00
Compute-Runtime-Validation 2b93126795 Revert "Add alignment support to createUnifiedMemoryAllocation"
This reverts commit ca02bbba4b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-03-30 15:43:47 +02:00
Zbigniew Zdanowicz 2349aa1e30 Do not allow stateful kernels on global stateless command list
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-30 13:15:32 +02:00
Young Jin Yoon 3f6443edfb Change allowZebin to enableZebin for apiOptions
Changed -cl-intel-allow-zebin to -cl-intel-enable-zebin only for
API options.

Related-To: NEO-7801

Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2023-03-30 12:46:10 +02:00
Zbigniew Zdanowicz a761003f4b [fix] pass command list heap address model to container
This fix will allow save memory and do not allocate private heaps
when global heaps are enabled.

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-29 11:04:22 +02:00
Rafal Maziejuk b9828b543e feature: adjust maxWorkGroupSize value
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-28 15:19:52 +02:00
Zbigniew Zdanowicz 6437c1a91e Flush state caches after command list is destroyed
When state base address tracking is enabled and command list use private heaps
then command list at destroy time must calls all compute CSRs that were using
that heap to invalidate state caches.
This allows new command list to reuse the same heap allocation for different
surface states, so before new use cached states are invalidated.

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-28 14:52:30 +02:00
Compute-Runtime-Validation 322c89cd1e Revert "Traverse pNext chain for memory allocations extensions"
This reverts commit e81fb20505.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-03-28 13:02:45 +02:00
Mateusz Jablonski dd39b822d3 feature implicit args: patch rt dispatch global array in implicit args buffer
handle has_rtcalls in kernels and functions in zebin

Related-To: NEO-7818
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-28 12:31:38 +02:00
Lu, Wenbin ca02bbba4b Add alignment support to createUnifiedMemoryAllocation
Allows the user to use alignments > 64KB in `createUnifiedMemoryAllocation`

So that the restriction in `piextUSMDeviceAlloc` of the DPC++ runtime
could be lifted

Related-To: LOCI-4168

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-03-28 10:57:04 +02:00
Rafal Maziejuk 27ff1c911d feature l0: handle additional properties in modules
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-24 10:27:44 +01:00
Zbigniew Zdanowicz b4cce380c8 Revert "Enable state base address tracking"
This reverts commit 6fb905acb2.

Resolves: HSD-18028477709

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-24 10:20:36 +01:00
Raiyan Latif e81fb20505 Traverse pNext chain for memory allocations extensions
Related-To: LOCI-4036

Signed-off-by: Raiyan Latif <raiyan.latif@intel.com>
2023-03-24 07:43:15 +01:00
Raiyan Latif e3f732f5a6 feature: Add support for P2P Image Copy
Enables P2P Copy support for all Image API related calls:
- zeCommandListAppendImageCopy
- zeCommandListAppendImageCopyRegion
- zeCommandListAppendImageCopyToMemory
- zeCommandListAppendImageCopyFromMemory

Related-To: LOCI-4112

Signed-off-by: Raiyan Latif <raiyan.latif@intel.com>
2023-03-24 07:36:01 +01:00
Maciej Bielski 3ec0a637ba fix(l0): return API error on ISA allocation OOM
It is possible that a module has so many kernels that the 4GB limit of
GPU VA is depleted when each kernel allocates a 64 KB page for its own
ISA. In such case, propagate the ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY to
the API caller to indicate the actual problem.

Currently such scenario is not detected, the execution advances a bit
further and the following crashes do not let the user to easily
understand what happened.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-03-23 17:30:15 +01:00
Zbigniew Zdanowicz ef12312672 [perf] add selective properties update for one-time and multi-time properties
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-23 15:59:50 +01:00
Zbigniew Zdanowicz 38e50007f7 [perf] simplify memory layout of command container class
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-23 13:31:47 +01:00
Krzysztof Gibala ecd8c6b410 fix l0: Add missing calculation in kernel getProperties
After resolving NEO-7684 in turns out that `zeKernelGetProperties`
is still returning wrong value for `maxNumSubgroups` since it
did not take into account `LargeGRF & SIMD` limitation.

Related-To: NEO-7829
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2023-03-22 16:06:13 +01:00
Lu, Wenbin 299985f15e Add extension property reporting for zeImageViewCreateExt
`ZE_extension_image_view` and `ZE_extension_image_view_planar`
should be reported by NEO, and `ZE_STRUCTURE_TYPE_IMAGE_VIEW_PLANAR_EXT_DESC`
needs to be recognized

Related-to: LOCI-3769

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-03-21 20:09:25 +01:00
Zbigniew Zdanowicz 6fb905acb2 Enable state base address tracking
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-21 15:53:24 +01:00
John Falkowski a1e2eca9e8 Add zeMemGetAllocProperties extension for sub-allocations
Signed-off-by: John Falkowski <john.falkowski@intel.com>
2023-03-17 21:21:44 +01:00
Cencelewska, Katarzyna a4a296d59f wa: enable wa to add additional dummy blits after blit copy
- reduce number of dummy blits where are not needed
- track if dummy blit required in cmdlist

Related-To: NEO-7450
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-03-17 10:43:00 +01:00
Fabian Zwolinski 65c73a690f Introduce Online, Offline, Disabled DebuggingModes
This change allows to set DebuggingMode via
ZET_ENABLE_PROGRAM_DEBUGGING env var
0: Disabled
1: Online
2: Offline

Related-To: NEO-7630
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-03-17 09:31:17 +01:00
Zbigniew Zdanowicz bc4e540c33 [fix] unify heaps size programing
- share same code between csr and cmd container to get default heap size
- share handling of debug flag to change heap size
- share platform level surface heap size between csr and command list
- refactor heap size files
- put heap size constant and function into namespace
- command list surface heap size increased to 2MB for xehp+ to match csr
- command list increased surface heap size only for sba tracking
- sba tracking heap consumption increased due to different reset policy

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-17 08:34:06 +01:00
Compute-Runtime-Validation 9c0ad71700 Revert "Add extension property reporting for zeImageViewCreateExt"
This reverts commit f087a4cf70.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-03-16 15:14:41 +01:00
Mateusz Jablonski 933d01549f refactor l0 core: cleanup cmake files 2/n
cleanup files per core/platform, cache and os specific

Related-To: NEO-7507
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-16 14:09:12 +01:00
Mateusz Jablonski 0da5e6f277 refactor l0: cleanup cmake file level_zero/core/source/CMakeLists.txt
Related-To: NEO-7507
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-16 12:38:15 +01:00
Mateusz Hoppe 0204761add feature: gpu assert implementation
- allocate assert buffer when kernel has assert
- track assert kernels in cmdlists and cmdqueues
- check and print assert at sync calls: cmdqueue synchronize(), fence
synchronize(), event hostSynchronize(), synchronous imm cmdlists
append()

Related-To: NEO-5753

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-03-15 19:22:09 +01:00
Zbigniew Zdanowicz e645f58b65 [fix] Do not reset state heap position for command list reset
- state base address tracking allows to reuse base address state
- surface state slots can be reused after sba reload or cache flush
- to avoid cache flush after each reset, then allow to gradualy consume heaps
- only until natural heap depletion and then dispatch reload of sba state

Related-To : NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-15 19:04:20 +01:00
Neil R Spruit 75fbaa0642 fix l0: Set isHostVisibleEventPoolAllocation for all host allocated EventPools
Related-To: LOCI-4147

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-03-15 16:56:13 +01:00
Fabian Zwolinski 93a30f002b L0 Debugger - check debug_eu entry.
Related-To: NEO-7790
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-03-15 16:14:49 +01:00
Mateusz Hoppe e62c5e25d5 refactor: change debugging enabled to debugging mode
Related-To: NEO-7630

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-03-15 13:41:41 +01:00