Commit Graph

2688 Commits

Author SHA1 Message Date
Milczarek, Slawomir
01d03aa5b6 Extended regkey to force prefetch of shared memory in enqueue commands
Extended the regkey ForceMemoryPrefetchForKmdMigratedSharedAllocations
to force meory prefetch of kmd-migrated shared allocation
in clEnqueueNDRangeKernel(), clEnqueueMemFillINTEL, ...

Related-To: NEO-7841

Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
2023-04-11 11:23:48 +02:00
Neil R Spruit
d31b950b9a l0_feature: Use L0 Loader teardown callback
Related-To: LOCI-4174

- Call zelSetDriverTeardown during L0 Driver teardown to prevent users
from calling into destroyed functions and encountering crashes
during teardown.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-04-11 11:16:26 +02:00
John Falkowski
e056082710 refactor graphics allocation structure elements for sub-allocation properties
Resolves:  LOCI-3772

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2023-04-07 16:53:23 +02:00
Zbigniew Zdanowicz
66c19c7749 [perf] remove redundant for loops in command list execution method
This fix is most important for multi command list execution use cases.
It is also benefitial for single command list execution, as driver saves
on loop enters and exits.
Methods handling single command list instead of array of objects are simpler.

Removed loops were at:
- CommandListExecutionContext constructor
- estimateLinearStreamSizeInitial method
- computePreemptionSize method
- collectPrintfContentsFromAllCommandsLists method

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 15:27:04 +02:00
Mateusz Jablonski
31f32cc16e fix implicit args: generate local ids as for grf size 32
Related-To: IGC-6936

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-04-07 11:37:07 +02:00
Zbigniew Zdanowicz
d4109eb153 [feat, perf] add closing mechanism to command list primary batch buffers
This change adds space reservation in command list for returning batch buffer
start hw command.
Primary batch buffer can be run from direct submission or from KMD call and
must be aligned to required size.
Ending patch for batch buffer start must be in the last command buffer of the
command list.

Related-To: NEO-7807

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 11:28:41 +02:00
Zbigniew Zdanowicz
1fcf564cc1 Enable state base address tracking
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 11:22:24 +02:00
Zbigniew Zdanowicz
09b58f4a22 [perf] group once per context calls under single condition
Plenty of calls require hw command programming only once per context.
There is no need to visit every method of them every execute call.
Set global init flag only if any of them is true and then visit all of them.
But for regular command list execution it can save time when there is single
global check.

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 09:21:28 +02:00
Zbigniew Zdanowicz
9ce5351d3f [fix] invalidate state caches only for heaps used by initialized context
This is number of small tweaks to state cache invalidation:
1. Invalidate if heap was actually created.
2. Check if os context was actually initialized.
3. Heap allocation was actually submitted, as it might attain zero task count
value, when allocation is stored in csr internal storage, as csr wasn't used,
but the csr task count being zero is assigned to heap allocation when stored.

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 09:16:12 +02:00
Matias Cabral
97bc43d18b Use correct timer resolution in zello_timestamp
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2023-04-06 14:34:30 +02:00
Katarzyna Cencelewska
1dc3b9b1e1 fix: add missing call make resident for dummy blit allocation
Resolves: HSD-18028732286
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2023-04-06 09:22:53 +02:00
Zbigniew Zdanowicz
e695059152 [perf] reduce host overhead in command list reset call
There is no need to reset all fields and load support flags every reset call.
Add dedicated calls that will reset values and dirty flags.
Call virtual methods only once at init time.

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-05 11:29:39 +02:00
Mateusz Hoppe
d83684785c feature: add dumping debug elf file
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-04-04 18:28:16 +02:00
Compute-Runtime-Validation
e1af516c25 Revert "Enable state base address tracking"
This reverts commit 6a08d29869.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-04-04 11:37:19 +02:00
Dominik Dabek
1c52017ceb fix: use correct allocation type in program init
Globals surface allocation via USM manager will have correct allocation
type set (instead of just BUFFER) and will use cpu copy when possible.

Related-To: NEO-7796

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-04-04 11:31:11 +02:00
Zbigniew Zdanowicz
a5179aae0b [perf] add debug key and control variable to command list primary buffer
Related-To: NEO-7807

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-04 10:58:11 +02:00
Brandon Yates
7dc4fd8dda fix(l0): Fix cache properties for subdevice
total cache size calculation should use subdevice count
rather than MultiTileArchInfo to ensure correct size when subdevices
are queried

Related-to: LOCI-4217

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2023-04-04 08:05:09 +02:00
Zbigniew Zdanowicz
6a08d29869 Enable state base address tracking
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-03 15:26:09 +02:00
Zbigniew Zdanowicz
7731264fe3 [fix] update ray tracing commands programing
- 3D btd command should be programed only once per context
- Add conditional pipe control command prior dispatching 3D btd command
- share 3D btd state between immediate and regular command lists
- add pipe control after ray tracing kernel to invalidate state cache

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-03 11:21:24 +02:00
Milczarek, Slawomir
50da94dc56 Add regkey to force prefetch of shared memory in cmd list execute
Add the regkey ForceMemoryPrefetchForKmdMigratedSharedAllocations
to force meory prefetch of kmd-migrated shared allocation
in zeCommandQueueExecuteCommandLists().

Related-To: NEO-7841

Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
2023-04-03 11:14:18 +02:00
Naklicki, Mateusz
d9be9b32f7 fix: add offset when alloc address does not match src/dst address for blit cmd
Include an offset for the appendMemoryCopyBlitRegion
parameters when an source/destination pointer is offseted within
allocation.
Lack of this offset could result in an invalid pointer when source and
destination are on the same allocation or they do not point the start of
allocation.

Related-To: NEO-7694
Signed-off-by: Naklicki, Mateusz <mateusz.naklicki@intel.com>
2023-04-03 10:01:11 +02:00
Fabian Zwolinski
c0603e0854 Allocate SipKernel per ctx for Offline dbg mode
- Add debuggingEnabledMode getter in ExecutionEnvironment
- Add new overloaded function - BuiltIns::getSipKernel
- Add perContextSipKernels map to BuiltIns
- Add OsContext to PreemptionHelper::programStateSip arguments
- Add new overloaded function - SipKernel::getBindlessDebugSipKernel

Related-To: NEO-7630
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-03-30 16:40:41 +02:00
Compute-Runtime-Validation
2b93126795 Revert "Add alignment support to createUnifiedMemoryAllocation"
This reverts commit ca02bbba4b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-03-30 15:43:47 +02:00
Zbigniew Zdanowicz
2349aa1e30 Do not allow stateful kernels on global stateless command list
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-30 13:15:32 +02:00
Mateusz Jablonski
8119905325 fix usm shared: propagate resource48Bit when allocating with kmd migrations
Related-To: NEO-7665
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-30 13:00:40 +02:00
Young Jin Yoon
3f6443edfb Change allowZebin to enableZebin for apiOptions
Changed -cl-intel-allow-zebin to -cl-intel-enable-zebin only for
API options.

Related-To: NEO-7801

Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2023-03-30 12:46:10 +02:00
Zbigniew Zdanowicz
a761003f4b [fix] pass command list heap address model to container
This fix will allow save memory and do not allocate private heaps
when global heaps are enabled.

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-29 11:04:22 +02:00
Rafal Maziejuk
b9828b543e feature: adjust maxWorkGroupSize value
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-28 15:19:52 +02:00
Zbigniew Zdanowicz
6437c1a91e Flush state caches after command list is destroyed
When state base address tracking is enabled and command list use private heaps
then command list at destroy time must calls all compute CSRs that were using
that heap to invalidate state caches.
This allows new command list to reuse the same heap allocation for different
surface states, so before new use cached states are invalidated.

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-28 14:52:30 +02:00
Compute-Runtime-Validation
322c89cd1e Revert "Traverse pNext chain for memory allocations extensions"
This reverts commit e81fb20505.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-03-28 13:02:45 +02:00
Mateusz Jablonski
dd39b822d3 feature implicit args: patch rt dispatch global array in implicit args buffer
handle has_rtcalls in kernels and functions in zebin

Related-To: NEO-7818
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-28 12:31:38 +02:00
Lu, Wenbin
ca02bbba4b Add alignment support to createUnifiedMemoryAllocation
Allows the user to use alignments > 64KB in `createUnifiedMemoryAllocation`

So that the restriction in `piextUSMDeviceAlloc` of the DPC++ runtime
could be lifted

Related-To: LOCI-4168

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-03-28 10:57:04 +02:00
Dunajski, Bartosz
e49e245bec Revert "Disable RelaxedOrdering if UpdateTagFromWait is disabled"
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-03-27 11:47:10 +02:00
Rafal Maziejuk
27ff1c911d feature l0: handle additional properties in modules
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-24 10:27:44 +01:00
Zbigniew Zdanowicz
b4cce380c8 Revert "Enable state base address tracking"
This reverts commit 6fb905acb2.

Resolves: HSD-18028477709

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-24 10:20:36 +01:00
Raiyan Latif
e81fb20505 Traverse pNext chain for memory allocations extensions
Related-To: LOCI-4036

Signed-off-by: Raiyan Latif <raiyan.latif@intel.com>
2023-03-24 07:43:15 +01:00
Raiyan Latif
e3f732f5a6 feature: Add support for P2P Image Copy
Enables P2P Copy support for all Image API related calls:
- zeCommandListAppendImageCopy
- zeCommandListAppendImageCopyRegion
- zeCommandListAppendImageCopyToMemory
- zeCommandListAppendImageCopyFromMemory

Related-To: LOCI-4112

Signed-off-by: Raiyan Latif <raiyan.latif@intel.com>
2023-03-24 07:36:01 +01:00
Maciej Bielski
3ec0a637ba fix(l0): return API error on ISA allocation OOM
It is possible that a module has so many kernels that the 4GB limit of
GPU VA is depleted when each kernel allocates a 64 KB page for its own
ISA. In such case, propagate the ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY to
the API caller to indicate the actual problem.

Currently such scenario is not detected, the execution advances a bit
further and the following crashes do not let the user to easily
understand what happened.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-03-23 17:30:15 +01:00
Zbigniew Zdanowicz
ef12312672 [perf] add selective properties update for one-time and multi-time properties
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-23 15:59:50 +01:00
Zbigniew Zdanowicz
38e50007f7 [perf] simplify memory layout of command container class
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-23 13:31:47 +01:00
Dunajski, Bartosz
151aecc8bd Disable RelaxedOrdering if UpdateTagFromWait is disabled
Related-To: NEO-7458

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-03-22 18:15:39 +01:00
Krzysztof Gibala
ecd8c6b410 fix l0: Add missing calculation in kernel getProperties
After resolving NEO-7684 in turns out that `zeKernelGetProperties`
is still returning wrong value for `maxNumSubgroups` since it
did not take into account `LargeGRF & SIMD` limitation.

Related-To: NEO-7829
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2023-03-22 16:06:13 +01:00
Lu, Wenbin
299985f15e Add extension property reporting for zeImageViewCreateExt
`ZE_extension_image_view` and `ZE_extension_image_view_planar`
should be reported by NEO, and `ZE_STRUCTURE_TYPE_IMAGE_VIEW_PLANAR_EXT_DESC`
needs to be recognized

Related-to: LOCI-3769

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-03-21 20:09:25 +01:00
Zbigniew Zdanowicz
6fb905acb2 Enable state base address tracking
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-21 15:53:24 +01:00
John Falkowski
a1e2eca9e8 Add zeMemGetAllocProperties extension for sub-allocations
Signed-off-by: John Falkowski <john.falkowski@intel.com>
2023-03-17 21:21:44 +01:00
Mateusz Jablonski
659cacf2c9 refactor l0 cmake: reduce include directories
Related-To: NEO-7507
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-17 13:41:55 +01:00
Cencelewska, Katarzyna
a4a296d59f wa: enable wa to add additional dummy blits after blit copy
- reduce number of dummy blits where are not needed
- track if dummy blit required in cmdlist

Related-To: NEO-7450
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-03-17 10:43:00 +01:00
Fabian Zwolinski
65c73a690f Introduce Online, Offline, Disabled DebuggingModes
This change allows to set DebuggingMode via
ZET_ENABLE_PROGRAM_DEBUGGING env var
0: Disabled
1: Online
2: Offline

Related-To: NEO-7630
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-03-17 09:31:17 +01:00
Zbigniew Zdanowicz
bc4e540c33 [fix] unify heaps size programing
- share same code between csr and cmd container to get default heap size
- share handling of debug flag to change heap size
- share platform level surface heap size between csr and command list
- refactor heap size files
- put heap size constant and function into namespace
- command list surface heap size increased to 2MB for xehp+ to match csr
- command list increased surface heap size only for sba tracking
- sba tracking heap consumption increased due to different reset policy

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-03-17 08:34:06 +01:00
Compute-Runtime-Validation
9c0ad71700 Revert "Add extension property reporting for zeImageViewCreateExt"
This reverts commit f087a4cf70.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-03-16 15:14:41 +01:00