- when resume(all) is called, the sr counter of every thread needs to be
verified. Reading the state save area separately for each thread takes
longer than reading the whole state save area once. The state save area
is only read again if the sr counter wasn't updated
- a failure while reading the state save area means threads might have
completed execution
- this fix reduces the time spent in resume(all), which may be called
before the debugger detaches
Related-To: NEO-7897
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Add debug key LogZEInfo for logging ZE Info from zebin elf.
ZE Info will be dumped to a file (default igdrcl.log)
Related-To: NEO-7895
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Enable cpu copy for USM device to USM host transfer in level zero
immediate cmdlist.
Related-To: NEO-7553
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Related-To: NEO-6206
With this commit, OpenCL reports the cl_khr_integer_dot_product extension
in version 2, with all properties enabled.
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
Unit tests should not write output to the console.
Instead, every output should be captured.
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
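One way to keep a test's output off the console is to redirect the stream
for the duration of the test. The following is a minimal sketch of that
idea, not the mechanism used by this commit; the CoutCapture class is a
hypothetical helper and it only handles std::cout.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical RAII helper: while an instance is alive, everything written
// to std::cout lands in an internal buffer instead of the console.
class CoutCapture {
  public:
    CoutCapture() : previous(std::cout.rdbuf(buffer.rdbuf())) {}
    // Restore the original stream buffer so later tests are unaffected.
    ~CoutCapture() { std::cout.rdbuf(previous); }

    CoutCapture(const CoutCapture &) = delete;
    CoutCapture &operator=(const CoutCapture &) = delete;

    // Text captured so far.
    std::string str() const { return buffer.str(); }

  private:
    std::ostringstream buffer; // declared first, so it is constructed first
    std::streambuf *previous;
};
```

A test would create a CoutCapture at the top of its body, run the code
under test, and then assert on str() rather than letting the text reach
the terminal.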
- reading the state save area for every thread takes too long when all
application threads have completed and there are stale ATT events to
process
- on detach, gdb seemed to be frozen, waiting for an ATT event to be handled
- the fix is to read the state save area once and check the SIP counter
for every thread in the ATT bitmask
Related-To: NEO-7897
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
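The read-once approach described above can be sketched roughly as follows.
This is an illustration under stated assumptions, not the driver's code:
SaveAreaLayout, ReadWholeArea and findThreadsWithNewCounter are
hypothetical names, and the fixed-size per-thread slot with a 64-bit
counter is an assumed layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// Assumed layout: each thread owns a fixed-size slot in the state save
// area, and its counter sits at a fixed offset inside that slot.
struct SaveAreaLayout {
    std::size_t slotSize;
    std::size_t counterOffset;
};

// Hypothetical reader: fills the buffer with the whole state save area and
// returns false if the read fails.
using ReadWholeArea = std::function<bool(std::vector<std::uint8_t> &)>;

// Returns the indices of threads set in the bitmask whose counter differs
// from the last seen value. The area is read exactly once, no matter how
// many bits are set.
std::vector<std::size_t> findThreadsWithNewCounter(
    const ReadWholeArea &readWholeArea,
    const SaveAreaLayout &layout,
    const std::vector<bool> &attBitmask,
    const std::vector<std::uint64_t> &lastSeenCounters) {
    std::vector<std::size_t> updated;
    std::vector<std::uint8_t> area;
    if (!readWholeArea(area)) {
        return updated; // read failed: threads may have completed already
    }
    for (std::size_t thread = 0;
         thread < attBitmask.size() && thread < lastSeenCounters.size();
         ++thread) {
        if (!attBitmask[thread]) {
            continue; // thread not reported in the bitmask
        }
        const std::size_t offset =
            thread * layout.slotSize + layout.counterOffset;
        if (offset + sizeof(std::uint64_t) > area.size()) {
            break; // slot lies outside the data that was read
        }
        std::uint64_t counter = 0;
        std::memcpy(&counter, area.data() + offset, sizeof(counter));
        if (counter != lastSeenCounters[thread]) {
            updated.push_back(thread);
        }
    }
    return updated;
}
```

The cost of the expensive read no longer scales with the number of threads
in the bitmask; only the cheap in-memory counter comparison does.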
- Added support for the ECC APIs in the new sysman design.
- Added ULTs for the ECC APIs in the new sysman design.
Related-To: LOCI-4244
Signed-off-by: Bari, Pratik <pratik.bari@intel.com>
Improves performance for benchmarks with KMD-migrated shared allocation
in scenarios with ZE_AFFINITY_MASK=0.1.
Related-To: NEO-7881
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
1. separate front end programming when tracking is enabled and disabled; this
limits the number of conditional checks.
2. setup command list front end properties only when front end state is dirty.
3. instanced context id should be set once, as it is a one-time, per-context
property.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
When getting the residency count for all command lists, the driver is able
to reallocate the container only once instead of once per command list.
Add a non-zero initial value for command queue residual allocations.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
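The single-reallocation idea from the residency change above can be
sketched like this. The types and the collectResidency function are
hypothetical stand-ins, and residualCount is an assumed parameter playing
the role of the non-zero initial value for residual allocations.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Allocation {}; // hypothetical stand-in for a graphics allocation

struct CommandList {
    std::vector<Allocation *> residency; // allocations this list needs resident
};

// Gathers residency from all command lists into `out`, growing `out` at
// most once instead of once per command list.
void collectResidency(const std::vector<CommandList> &lists,
                      std::size_t residualCount,
                      std::vector<Allocation *> &out) {
    // First pass: count everything, starting from the residual estimate.
    std::size_t total = residualCount;
    for (const auto &list : lists) {
        total += list.residency.size();
    }
    // One reservation covers all the inserts below.
    out.reserve(out.size() + total);
    // Second pass: append without triggering further reallocations.
    for (const auto &list : lists) {
        out.insert(out.end(), list.residency.begin(), list.residency.end());
    }
}
```

Counting first costs an extra pass over the command lists, but it replaces
repeated grow-and-copy cycles with a single reservation.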
- remove duplicate implementations between similar hw generations.
- group the same implementations into dedicated inl files.
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- group the same implementations into dedicated inl files
- remove duplicate implementations for similar hw generations
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
ensure default hw ip version matches the value from helper
change pvc ult execution to revision 3
Related-To: NEO-7738
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
In the case of mtl+ platforms, the returned config value
should equal the hardware ip version value.
This change fixes cases where a config had not been added,
for which we returned an unknown value.
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
Related-To: NEO-7738
For a primary batch buffer command list, the driver should not use return
points. Return points are useful when batch buffers are dispatched as
secondary; for primary buffers, patching the front end command is the more
desirable option.
Related-To: NEO-7807
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Implicit Scaling barrier has the same requirements as a kernel.
It must dispatch the bb start command at the same level as the one at which
the command list is dispatched.
Related-To: NEO-7807
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
If indirect accesses in kernel are not detected by IGC, indirect
allocations will not be made resident for this kernel.
Related-To: NEO-7712
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Previously a single std::once_flag was assigned to the whole map:
std::unordered_map<ContextId, std::unique_ptr<SipKernel>>. This was
incorrect: SipKernel was allocated only for the first context and
skipped for all other contexts, so we ended up with only one allocation
regardless of the number of contexts.
This change assigns a std::once_flag to each allocated SipKernel.
Related-To: NEO-7630
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
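The difference between one flag per map and one flag per entry can be
sketched as follows. This is a simplified illustration, not the driver's
code: SipKernelCache, Entry, getOrCreate and the allocations counter are
hypothetical, and SipKernel is reduced to a placeholder struct.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <mutex>
#include <unordered_map>

using ContextId = unsigned int;

struct SipKernel { // placeholder for the real SipKernel
    ContextId owner;
};

class SipKernelCache {
  public:
    // Returns the SipKernel for the given context, creating it on first use.
    SipKernel *getOrCreate(ContextId id) {
        Entry *entry = nullptr;
        {
            // Guard the map itself; pointers to unordered_map elements
            // remain valid when other entries are inserted later.
            std::lock_guard<std::mutex> lock(mapMutex);
            entry = &entries[id];
        }
        // Each entry has its own flag, so every context gets one allocation.
        // A single flag shared by the map would fire for the first context only.
        std::call_once(entry->flag, [&] {
            entry->kernel = std::make_unique<SipKernel>(SipKernel{id});
            ++allocations;
        });
        return entry->kernel.get();
    }

    std::atomic<int> allocations{0}; // number of SipKernel objects created

  private:
    struct Entry {
        std::once_flag flag;
        std::unique_ptr<SipKernel> kernel;
    };
    std::mutex mapMutex;
    std::unordered_map<ContextId, Entry> entries;
};
```

Repeated calls for the same context return the same object, while distinct
contexts each trigger exactly one allocation.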
Add new regkey KMDSupportForCrossTileMigrationPolicy
(disabled by default, in absence of KMD support for cross-tile migrations)
to control placement of shared allocation and memory prefetch behavior.
Related-To: NEO-7885
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
This change is part of a performance feature to start command list batch
buffers as primary.
Implicit Scaling sometimes requires jumping over the control section, and
these jumps must maintain the same batch buffer level as the whole command
list.
Related-To: NEO-7807
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- For the static create() methods of Kernel and MultiDeviceKernel, force the
errcodeRet parameter to be passed by reference (instead of by pointer)
- Move part of the kernel's creation logic to the initialize() method
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
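The shape of the create()/initialize() split described above might look
like the sketch below. It is a simplified illustration with assumed names:
the ErrorCode alias, the error constants and the validProgram parameter
are hypothetical and do not reflect the real Kernel signature.

```cpp
#include <cassert>
#include <memory>

using ErrorCode = int;                  // hypothetical error type
constexpr ErrorCode success = 0;        // hypothetical success value
constexpr ErrorCode invalidKernel = -1; // hypothetical failure value

class Kernel {
  public:
    // errcodeRet is a reference, so callers cannot pass nullptr and the
    // function never needs a null check before reporting the result.
    static std::unique_ptr<Kernel> create(bool validProgram,
                                          ErrorCode &errcodeRet) {
        std::unique_ptr<Kernel> kernel(new Kernel());
        errcodeRet = kernel->initialize(validProgram);
        if (errcodeRet != success) {
            kernel.reset(); // creation failed: hand back no object
        }
        return kernel;
    }

  private:
    Kernel() = default;

    // Creation logic that can fail lives here rather than in create().
    ErrorCode initialize(bool validProgram) {
        return validProgram ? success : invalidKernel;
    }
};
```

With a reference parameter the error code is always written, so a caller
can rely on errcodeRet after every call to create().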
Extended the regkey ForceMemoryPrefetchForKmdMigratedSharedAllocations
to force memory prefetch of kmd-migrated shared allocations
in clEnqueueNDRangeKernel(), clEnqueueMemFillINTEL, ...
Related-To: NEO-7841
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Related-To: LOCI-4174
- Call zelSetDriverTeardown during L0 Driver teardown to prevent users
from calling into destroyed functions and encountering crashes
during teardown.
Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>