Before state transition was done twice, 1st time for estimation, 2nd time for
dispatch.
Now state transitions only during estimation and required state is saved then.
Commands are dispatched only when command list and property are marked to
dispatch.
During regular workload submission transition is performed only once and it
should be benefitial to reduce host overhead.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Add "DumpZEBin" debug flag. When this flag is enabled, Zebin will be
dumped to a .elf file (with appropiate suffix, in case such file has
been dumped before).
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Related-To: NEO-7895
ensure default hw ip version matches the value from helper
change pvc ult execution to revision 3
Related-To: NEO-7738
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
- when resume(all) is called - all threads' sr counter needs to be
verified. Reading state save area separately for all threads takes
longer than reading whole state save area once. State save area is
only read again if sr counter wasn't updated
- fail while reading state save area means threads might have completed
execution
- this fix optimizes time spent in resume(all), that may be called before
debugger detaches
Related-To: NEO-7897
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Enable cpu copy for USM device to USM host transfer in level zero
immediate cmdlist.
Related-To: NEO-7553
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
- reading state save area for every threads takes too long when all
application threads have completed and there are stale ATT events to
process
- on detach gdb seemed to be frozen waiting for ATT event to be handled
- fix is to read state save area once - and check SIP counter for every
thread in ATT bitmask
Related-To: NEO-7897
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- Added support for the ECC APIs in the new sysman design.
- Added ULTs for the ECC APIs in the new sysman design.
Related-To: LOCI-4244
Signed-off-by: Bari, Pratik <pratik.bari@intel.com>
1. separate front end programing when tracking is enabled and disabled, it will
limit number of conditional checks.
2. setup command list front end properties only when front end state is dirty.
3. instanced context id should be set once, as this is one time per context
property.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
When getting residency count for all command lists, driver is able to
reallocate container only once and not per each command list.
Add non-zero initial value for command queue residual allocations.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- group same implementation into dedicated inl files
- remove double implementations for the similiar hw generations
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
ensure default hw ip version matches the value from helper
change pvc ult execution to revision 3
Related-To: NEO-7738
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
In the case of mtl+ platforms, the returned config value
should equal the hardware ip version value.
This change fixes situations where some config has not been
added and in this case we returned an unknown value.
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
Related-To: NEO-7738
For primary batch buffer command list driver should not use return point.
Return points are useful when batch buffers are dispatched as secondary,
for primary buffers, patching of front end command is more desirable option.
Related-To: NEO-7807
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Implicit Scaling barrier have the same requirements as kernel.
It must dispach bb start command with the same level as the command list
is dispatched.
Related-To: NEO-7807
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Extended the regkey ForceMemoryPrefetchForKmdMigratedSharedAllocations
to force meory prefetch of kmd-migrated shared allocation
in clEnqueueNDRangeKernel(), clEnqueueMemFillINTEL, ...
Related-To: NEO-7841
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Related-To: LOCI-4174
- Call zelSetDriverTeardown during L0 Driver teardown to prevent users
from calling into destroyed functions and encountering crashes
during teardown.
Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
Replacing normal pointers by smart pointers in scheduler module
of LO sysman(zesinit).
Related-To: LOCI-2810
Signed-off-by: Singh, Prasoon <prasoon.singh@intel.com>
Support for Initialization using zesInit
Support Power module using new sysman initialization
Related-To: LOCI-4134
Signed-off-by: Kulkarni, Ashwin Kumar <ashwin.kumar.kulkarni@intel.com>
- Added support for the Standby APIs in the new sysman design.
- Added ULTs for the Standby APIs in the new sysman design.
Related-To: LOCI-4097
Signed-off-by: Bari, Pratik <pratik.bari@intel.com>
This fix is most important for multi command list execution use cases.
It is also benefitial for single command list execution, as driver saves
on loop enters and exits.
Methods handling single command list instead of array of objects are simpler.
Removed loops were at:
- CommandListExecutionContext constructor
- estimateLinearStreamSizeInitial method
- computePreemptionSize method
- collectPrintfContentsFromAllCommandsLists method
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
This change adds space reservation in command list for returning batch buffer
start hw command.
Primary batch buffer can be run from direct submission or from KMD call and
must be aligned to required size.
Ending patch for batch buffer start must be in the last command buffer of the
command list.
Related-To: NEO-7807
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Plenty of calls require hw command programming only once per context.
There is no need to visit every method of them every execute call.
Set global init flag only if any of them is true and then visit all of them.
But for regular command list execution it can save time when there is single
global check.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
This is number of small tweaks to state cache invalidation:
1. Invalidate if heap was actually created.
2. Check if os context was actually initialized.
3. Heap allocation was actually submitted, as it might attain zero task count
value, when allocation is stored in csr internal storage, as csr wasn't used,
but the csr task count being zero is assigned to heap allocation when stored.
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Replacing normal pointers by smart pointers in engine module
of LO sysman(zesinit).
Related-To: LOCI-2810
Signed-off-by: Singh, Prasoon <prasoon.singh@intel.com>
Replacing normal pointers by smart pointers in memory module
of LO sysman(zesinit).
Related-To: LOCI-2810
Signed-off-by: Singh, Prasoon <prasoon.singh@intel.com>
- when no threads are executing, interrupt all may fail and debug break
fires - although error is handled and correct event is returned. To
prevent abort, debug break has to be removed
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- before marking interrupt request check exception reason. If there is
exception other than forced exception or forced external halt treat
thread as stopped and generate distinct event for it.
Related-To: NEO-7869
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
There is no need to reset all fields and load support flags every reset call.
Add dedicated calls that will reset values and dirty flags.
Call virtual methods only once at init time.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Replacing normal pointers by smart pointers in fabric_port module
of LO sysman(zesinit).
Related-To: LOCI-2810
Signed-off-by: Singh, Prasoon <prasoon.singh@intel.com>