Before state transition was done twice, 1st time for estimation, 2nd time for
dispatch.
Now state transitions only during estimation and required state is saved then.
Commands are dispatched only when command list and property are marked to
dispatch.
During regular workload submission transition is performed only once and it
should be benefitial to reduce host overhead.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
1. separate front end programing when tracking is enabled and disabled, it will
limit number of conditional checks.
2. setup command list front end properties only when front end state is dirty.
3. instanced context id should be set once, as this is one time per context
property.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
When getting residency count for all command lists, driver is able to
reallocate container only once and not per each command list.
Add non-zero initial value for command queue residual allocations.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Extended the regkey ForceMemoryPrefetchForKmdMigratedSharedAllocations
to force meory prefetch of kmd-migrated shared allocation
in clEnqueueNDRangeKernel(), clEnqueueMemFillINTEL, ...
Related-To: NEO-7841
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
This fix is most important for multi command list execution use cases.
It is also benefitial for single command list execution, as driver saves
on loop enters and exits.
Methods handling single command list instead of array of objects are simpler.
Removed loops were at:
- CommandListExecutionContext constructor
- estimateLinearStreamSizeInitial method
- computePreemptionSize method
- collectPrintfContentsFromAllCommandsLists method
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Plenty of calls require hw command programming only once per context.
There is no need to visit every method of them every execute call.
Set global init flag only if any of them is true and then visit all of them.
But for regular command list execution it can save time when there is single
global check.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- 3D btd command should be programed only once per context
- Add conditional pipe control command prior dispatching 3D btd command
- share 3D btd state between immediate and regular command lists
- add pipe control after ray tracing kernel to invalidate state cache
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Add the regkey ForceMemoryPrefetchForKmdMigratedSharedAllocations
to force meory prefetch of kmd-migrated shared allocation
in zeCommandQueueExecuteCommandLists().
Related-To: NEO-7841
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
- reduce number of dummy blits where are not needed
- track if dummy blit required in cmdlist
Related-To: NEO-7450
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
- full properties update is time intesive task and must be done only once
- selective update can be done after initial update
- dirty flag will allow to distinguish initial update is done
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- debug address tracking estimation added for state base address tracking
- fix bt command estimation for private heap command lists
- set immediate command lists default one-time settings in csr
- simplify interface of estimate state base address
- set correct mode for legacy unit test expecting preamble
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
to guarantee that all subblt got complete for previous copy
affect xe hpg
temporary changes under flag ForceDummyBlitWa
Related-To: NEO-7450
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
This fix handles scenario when regular command list uses context first,
then immediate command list is used for the first time.
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- This change gets level one cache policy from cached values instead
of calling virtual methods
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- When enabled, sba tracking dispatches preambleless, tracked sba commands
in command lists and command queues.
- Tracking disallows any untracked sba commands.
- Adding some tweaks to data initialization and processing.
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
This is small optimization to replace virtual call and retrieved struct with
cached value.
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- add correct stateless mocs state update in immediate command lists
- disallow stateless mocs dirty sba command dispatch when sba tracking enabled
- checks support first, only then do the dirty state check in csr
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Propogate error codes from ioctl failure properly up the layers
so that we skip exposing bad root devices.
Related-To: NEO-7709
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@intel.com>