Commit Graph

3517 Commits

Author SHA1 Message Date
Zbigniew Zdanowicz
21ac5f2835 [perf] transition hw state only once, then dispatch command when needed
Before state transition was done twice, 1st time for estimation, 2nd time for
dispatch.
Now state transitions only during estimation and required state is saved then.
Commands are dispatched only when command list and property are marked to
dispatch.
During regular workload submission transition is performed only once and it
should be benefitial to reduce host overhead.

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-19 16:31:12 +02:00
Lu, Wenbin
c66546df73 Disable kernel timestamp when not using implicit scaling
Related-To: LOCI-2826

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-04-19 12:14:17 +02:00
Kacper Nowak
c7adbc2140 Add debug key for dumping ELF to file
Add "DumpZEBin" debug flag. When this flag is enabled, Zebin will be
dumped to a .elf file (with appropiate suffix, in case such file has
been dumped before).
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Related-To: NEO-7895
2023-04-18 20:40:25 +02:00
Mateusz Jablonski
51b8dc66a3 fix ocloc/ult: set default PVC device to pvc xt C0
ensure default hw ip version matches the value from helper
change pvc ult execution to revision 3

Related-To: NEO-7738
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-04-18 13:48:48 +02:00
Mateusz Hoppe
bfd32d6284 fix: read state save area once for all threads for resume(ALL)
- when resume(all) is called - all threads' sr counter needs to be
verified. Reading state save area separately for all threads takes
longer than reading whole state save area once. State save area is
only read again if sr counter wasn't updated
- fail while reading state save area means threads might have completed
execution
- this fix optimizes time spent in resume(all), that may be called before
debugger detaches

Related-To: NEO-7897

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-04-17 15:41:13 +02:00
Devarinti, Puneeth Kumar Reddy
239ce79f43 Debug: Add debug logs for global module
Related-To: LOCI-3876

Signed-off-by: Devarinti, Puneeth Kumar Reddy <puneeth.kumar.reddy.devarinti@intel.com>
2023-04-17 11:30:12 +02:00
Compute-Runtime-Validation
e79fb5f39b Revert "fix ocloc/ult: set default PVC device id to pvc xt device id"
This reverts commit bd84ba819b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-04-15 11:43:21 +02:00
Bellekallu Rajkiran
007f5d70bf [Fix, Sysman] Map uevent to device based on device path
Related-To: LOCI-4307

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2023-04-14 23:25:39 +02:00
Zbigniew Zdanowicz
4ef879867c [fix] correct fence not ready value
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-14 16:43:45 +02:00
Dominik Dabek
8d834202af feat(l0): enable cpu copy for USM D2H
Enable cpu copy for USM device to USM host transfer in level zero
immediate cmdlist.

Related-To: NEO-7553

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-04-14 15:33:45 +02:00
Mateusz Hoppe
079105a5c2 fix: optimize ATT handling - read state save area once for all threads
- reading state save area for every threads takes too long when all
application threads have completed and there are stale ATT events to
process
- on detach gdb seemed to be frozen waiting for ATT event to be handled
- fix is to read state save area once - and check SIP counter for every
thread in ATT bitmask

Related-To: NEO-7897

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-04-14 10:57:18 +02:00
Zbigniew Zdanowicz
f5f073b9fc [perf] move validation call before lock
Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-14 10:53:46 +02:00
Bari, Pratik
e03b1581b9 Added support for the ECC APIs
- Added support for the ECC APIs in the new sysman design.
- Added ULTs for the ECC APIs in the new sysman design.

Related-To: LOCI-4244

Signed-off-by: Bari, Pratik <pratik.bari@intel.com>
2023-04-14 07:17:12 +02:00
Devarinti, Puneeth Kumar Reddy
94dc789212 Debug: Add debug logs for pci module
Related-To: LOCI-3876

Signed-off-by: Devarinti, Puneeth Kumar Reddy <puneeth.kumar.reddy.devarinti@intel.com>
2023-04-13 13:00:01 +02:00
Zbigniew Zdanowicz
cd899871b1 [perf] tweak front end programing to remove not needed steps
1. separate front end programing when tracking is enabled and disabled, it will
limit number of conditional checks.
2. setup command list front end properties only when front end state is dirty.
3. instanced context id should be set once, as this is one time per context
property.

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-13 11:43:26 +02:00
Zbigniew Zdanowicz
1a4dda57e7 [perf] reallocate residency container once for all command lists
When getting residency count for all command lists, driver is able to
reallocate container only once and not per each command list.
Add non-zero initial value for command queue residual allocations.

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-13 11:28:42 +02:00
Zbigniew Zdanowicz
63eb88b819 [refactor] reposition level zero command list implementations
- group same implementation into dedicated inl files
- remove double implementations for the similiar hw generations

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-13 11:00:24 +02:00
Mateusz Jablonski
bd84ba819b fix ocloc/ult: set default PVC device id to pvc xt device id
ensure default hw ip version matches the value from helper
change pvc ult execution to revision 3

Related-To: NEO-7738
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-04-13 10:54:28 +02:00
Devarinti, Puneeth Kumar Reddy
a684e0ffc6 Debug: Add debug logs for fabricport module
Related-To: LOCI-3882

Signed-off-by: Devarinti, Puneeth Kumar Reddy <puneeth.kumar.reddy.devarinti@intel.com>
2023-04-13 10:47:04 +02:00
Konstanty Misiak
1f37e69fd2 Refactor of IO functions
Related-To: NEO-4562

Signed-off-by: Konstanty Misiak <konstanty.misiak@intel.com>
2023-04-13 10:46:47 +02:00
Zbigniew Zdanowicz
c0f0472b6e test l0: add command queue tests
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-13 10:14:05 +02:00
Daria Hinz
c3f4074f0a fix: Unification of aot config with hw ip version
In the case of mtl+ platforms, the returned config value
should equal the hardware ip version value.
This change fixes situations where some config has not been
added and in this case we returned an unknown value.

Signed-off-by: Daria Hinz <daria.hinz@intel.com>
Related-To: NEO-7738
2023-04-12 18:34:03 +02:00
Zbigniew Zdanowicz
f12b11786e [feat, perf] add primary batch buffer support to front end properties update
For primary batch buffer command list driver should not use return point.
Return points are useful when batch buffers are dispatched as secondary,
for primary buffers, patching of front end command is more desirable option.

Related-To: NEO-7807

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-12 16:18:55 +02:00
Zbigniew Zdanowicz
62ea1b1a58 [feat, perf] add primary batch buffer support to multi-tile barrier
Implicit Scaling barrier have the same requirements as kernel.
It must dispach bb start command with the same level as the command list
is dispatched.

Related-To: NEO-7807

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-12 16:18:38 +02:00
Bellekallu Rajkiran
24f73f4686 fix(Sysman): Support for fabric port health change event.
Related-To: LOCI-4053

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2023-04-12 06:46:19 +02:00
Compute-Runtime-Validation
41ad05eb52 Revert "l0_feature: Use L0 Loader teardown callback"
This reverts commit d31b950b9a.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-04-12 06:45:46 +02:00
Milczarek, Slawomir
01d03aa5b6 Extended regkey to force prefetch of shared memory in enqueue commands
Extended the regkey ForceMemoryPrefetchForKmdMigratedSharedAllocations
to force meory prefetch of kmd-migrated shared allocation
in clEnqueueNDRangeKernel(), clEnqueueMemFillINTEL, ...

Related-To: NEO-7841

Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
2023-04-11 11:23:48 +02:00
Neil R Spruit
d31b950b9a l0_feature: Use L0 Loader teardown callback
Related-To: LOCI-4174

- Call zelSetDriverTeardown during L0 Driver teardown to prevent users
from calling into destroyed functions and encountering crashes
during teardown.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-04-11 11:16:26 +02:00
Singh, Prasoon
0165f6158c [Sysman] Replace normal pointers with smart pointers (13/n)
Replacing normal pointers by smart pointers in scheduler module
of LO sysman(zesinit).

Related-To: LOCI-2810

Signed-off-by: Singh, Prasoon <prasoon.singh@intel.com>
2023-04-10 23:14:00 +02:00
Kulkarni, Ashwin Kumar
d0fb5a6e51 Support Windows Sysman initialization
Support for Initialization using zesInit
Support Power module using new sysman initialization

Related-To: LOCI-4134

Signed-off-by: Kulkarni, Ashwin Kumar <ashwin.kumar.kulkarni@intel.com>
2023-04-10 06:59:47 +02:00
Bari, Pratik
f8623fadaf Added support for Standby APIs
- Added support for the Standby APIs in the new sysman design.
- Added ULTs for the Standby APIs in the new sysman design.

Related-To: LOCI-4097

Signed-off-by: Bari, Pratik <pratik.bari@intel.com>
2023-04-10 06:54:50 +02:00
John Falkowski
e056082710 refactor graphics allocation structure elements for sub-allocation properties
Resolves:  LOCI-3772

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2023-04-07 16:53:23 +02:00
Zbigniew Zdanowicz
66c19c7749 [perf] remove redundant for loops in command list execution method
This fix is most important for multi command list execution use cases.
It is also benefitial for single command list execution, as driver saves
on loop enters and exits.
Methods handling single command list instead of array of objects are simpler.

Removed loops were at:
- CommandListExecutionContext constructor
- estimateLinearStreamSizeInitial method
- computePreemptionSize method
- collectPrintfContentsFromAllCommandsLists method

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 15:27:04 +02:00
Mateusz Jablonski
31f32cc16e fix implicit args: generate local ids as for grf size 32
Related-To: IGC-6936

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-04-07 11:37:07 +02:00
Zbigniew Zdanowicz
d4109eb153 [feat, perf] add closing mechanism to command list primary batch buffers
This change adds space reservation in command list for returning batch buffer
start hw command.
Primary batch buffer can be run from direct submission or from KMD call and
must be aligned to required size.
Ending patch for batch buffer start must be in the last command buffer of the
command list.

Related-To: NEO-7807

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 11:28:41 +02:00
Zbigniew Zdanowicz
1fcf564cc1 Enable state base address tracking
Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 11:22:24 +02:00
Zbigniew Zdanowicz
09b58f4a22 [perf] group once per context calls under single condition
Plenty of calls require hw command programming only once per context.
There is no need to visit every method of them every execute call.
Set global init flag only if any of them is true and then visit all of them.
But for regular command list execution it can save time when there is single
global check.

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 09:21:28 +02:00
Zbigniew Zdanowicz
9ce5351d3f [fix] invalidate state caches only for heaps used by initialized context
This is number of small tweaks to state cache invalidation:
1. Invalidate if heap was actually created.
2. Check if os context was actually initialized.
3. Heap allocation was actually submitted, as it might attain zero task count
value, when allocation is stored in csr internal storage, as csr wasn't used,
but the csr task count being zero is assigned to heap allocation when stored.

Related-To: NEO-5055

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-07 09:16:12 +02:00
Matias Cabral
97bc43d18b Use correct timer resolution in zello_timestamp
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2023-04-06 14:34:30 +02:00
Katarzyna Cencelewska
1dc3b9b1e1 fix: add missing call make resident for dummy blit allocation
Resolves: HSD-18028732286
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2023-04-06 09:22:53 +02:00
Jitendra Sharma
d29ed25f8b Add support for global_operations in new sysman design
Related-To: LOCI-4135
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2023-04-05 17:25:03 +02:00
Singh, Prasoon
88fe17e50a [Sysman] Replace normal pointers with smart pointers (12/n)
Replacing normal pointers by smart pointers in engine module
of LO sysman(zesinit).

Related-To: LOCI-2810

Signed-off-by: Singh, Prasoon <prasoon.singh@intel.com>
2023-04-05 15:18:33 +02:00
Singh, Prasoon
afe99d7dbc [Sysman] Replace normal pointers with smart pointers (11/n)
Replacing normal pointers by smart pointers in memory module
of LO sysman(zesinit).

Related-To: LOCI-2810

Signed-off-by: Singh, Prasoon <prasoon.singh@intel.com>
2023-04-05 14:15:27 +02:00
Mateusz Hoppe
d1393e08d3 refactor: remove debug break when EU CONTROL ioctl fails
- when no threads are executing, interrupt all may fail and debug break
fires - although error is handled and correct event is returned. To
prevent abort, debug break has to be removed

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-04-05 13:07:31 +02:00
Mateusz Hoppe
e26ebfc51b fix: check exception reason for stopped threads
- before marking interrupt request check exception reason. If there is
exception other than forced exception or forced external halt treat
thread as stopped and generate distinct event for it.

Related-To: NEO-7869

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-04-05 12:01:20 +02:00
Zbigniew Zdanowicz
e695059152 [perf] reduce host overhead in command list reset call
There is no need to reset all fields and load support flags every reset call.
Add dedicated calls that will reset values and dirty flags.
Call virtual methods only once at init time.

Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-05 11:29:39 +02:00
Singh, Prasoon
71fe65b327 [Sysman] Replace normal pointers with smart pointers (10/n)
Replacing normal pointers by smart pointers in fan module of L0 sysman.

Related-To: LOCI-2810

Signed-off-by: Singh, Prasoon <prasoon.singh@intel.com>
2023-04-05 11:05:44 +02:00
Mateusz Hoppe
d83684785c feature: add dumping debug elf file
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-04-04 18:28:16 +02:00
Singh, Prasoon
53fe4de534 [Sysman] Replace normal pointers with smart pointers (9/n)
Replacing  normal pointers by smart pointers in fabric_port module
of LO sysman(zesinit).

Related-To: LOCI-2810

Signed-off-by: Singh, Prasoon <prasoon.singh@intel.com>
2023-04-04 13:49:20 +02:00
Compute-Runtime-Validation
e1af516c25 Revert "Enable state base address tracking"
This reverts commit 6a08d29869.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-04-04 11:37:19 +02:00