- when bindless adressing is set in zeinfo, kernel attributes should
reflect that.
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- move isCachingOnCpuAvailable to product helper
- isCachingOnCpuAvailable should return false on mtl
- if wsl, skip checking method from product helper
Related-To: NEO-7194
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
- IGC stores invalid string index at the end of printf buffer,
invalid index is equal to (int32_t)-1 extended to (char*)
- printKernelOutput() should not parse buffer after invalid index was
found to avoid invalid string pointer dereference
Related-To: NEO-5753
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Have makeResident return error to the caller, instead of always
SUCCESS. This will allow interfaces like zeContextMakeMemoryResident
to fail properly.
Additionally, change the parsing of MemoryOperationsStatus from
ZE_RESULT_ERROR_OUT_OF_HOST_MEMORY to
ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY, since when making resources
resident, it is the device running out of memory, instead of the
host.
Related-To: LOCI-4443
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Related-To: LOCI-4176
- Given a Base Pointer passed into Get Peer Allocation, then the base
pointer is used in the map of the new allocation to the virtual memory.
- Enables users to use the same pointer for all devices in Peer To Peer.
- Currently unsupported on reserved memory due to mapped and exec
resiedency of Virtual addresses.
Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
- vm_bind with user fence updates fence value independently for every
VM hence with per-context VMs, every context needs its unique fence
address. This prevents 2 contexts from updating value possibly
writing lower value than the one that was already stored
Resolves: NEO-8004
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Get ipVersion from productHelper function on xe and upstream.
On prelim first try to query ipVersion from kmd,
if it fails, get ipVersion from productHelper function.
Related-To: NEO-7786
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
This fixes illegal memory accesses by the job submitted to the GuC.
Also some unit tests are added to harness the vmBind operation.
Related-To: NEO-7996
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
when flag disabled, gmm flag Cacheable won't set on xe_hp and later
Related-To: NEO-7194
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
- add debug flag EnableCpuCacheForResources to be able to allow coherency when
resources could be cacheable
Resolves: NEO-7194
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
This commit keeps KMD migration still disabled by default on PVC platform.
Related-To: NEO-6465
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
This patch add new environment variables to control compiler cache.
Works as follow: If persistent cache is set driver check if NEO_CACHE_DIR
is set. If not then driver checks XDG_CACHE_HOME - If exists
then driver create neo_compiler_cache folder, if
not then driver checks HOME directory. If each NEO_CACHE_DIR,
XDG_CACHE_HOME and HOME are not set then compiler cache is disabled.
Current support is for Linux only.
Signed-off-by: Diedrich, Kamil <kamil.diedrich@intel.com>
Related-To: NEO-4262
- allocateGraphicsMemoryUsingKmdAndMapItToCpuVA in case of no compression
- allocate32BitGraphicsMemoryImpl in case of allocate by KMD
remove redundant ctor of StorageInfo class
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Enable for zebin format but not CM kernels.
Use heuristic of simdSize == 1 to detect CM kernels.
Related-To: NEO-7712
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
This change matches the appropriate aot config
for the combination of device ID and revision ID.
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
Related-To: NEO-7905
Related-To: LOCI-4332
- Signal non-timestamp Walkers with in-order CL value
- Event host synchronization based on CL signal value
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
populate memory info based on mem usage and gts info
propagate error from xeWaitUserFence function
Related-To: NEO-7931
Co-authored-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Add debug variable to set sleep duration for HBM
IFR to complete
Related-To: LOCI-4298
Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
Related-To: LOCI-4174
- Call zelSetDriverTeardown during L0 Driver teardown to prevent users
from calling into destroyed functions and encountering crashes
during teardown.
Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
Allows the user to use alignments > 64KB in `createUnifiedMemoryAllocation`
So that the restriction in `piextUSMDeviceAlloc` of the DPC++ runtime
could be lifted
Related-To: LOCI-4168
Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
- set by default flag ZebinIgnoreIcbeVersion to true
- for zebin icbe version check is only inside flag
- only when use patchtoken then check icbe version is mandatory
Resolves: NEO-7904
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
adjust thread group dispatch size on pvc if chosen size does not evenly
divide dimension
this is to avoid leftover thread groups
Related-To: NEO-7927
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
use StackVec instead of unordered map
resize container at MemoryManager's creation time
Related-To: NEO-7925
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
in most cases we need to iterate over engines associated to single root device
Related-To: NEO-7925
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Related-To: LOCI-4172, LOCI-4305, LOCI-4306
- Create a new IPC Memory handle upon call to getIpcMemHandle if the
previous handle has been freed.
- Release the Ipc Memory Handle when zeMemPutIpcHandle is called.
- Create a new IPC Handle for tracking thru zeMemGetAllocProperties
when ze_external_memory_export_fd_t is used.
- Convert FD to opaque IPC handle and IPC Handle to FD.
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
Current flow will be to have one synchronization point
config.file. Read remains unblocking, only write(caching)
operation will be blocking (lock on config.file)
Related-To: NEO-4262
Signed-off-by: Diedrich, Kamil <kamil.diedrich@intel.com>
Enable for zebin format but not CM kernels.
Use heuristic of simdSize == 1 to detect CM kernels.
Related-To: NEO-7712
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
The memadvise with preferred location for kmd-migrated shared allocation
is set to device associated with cmd list by default to migrate data
to lmem on non-atomic gpu page fault as well (for performance reasons).
Related-To: NEO-7252
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
- initialization of FileLogger always removed log file - this change only
removes old file when logging is enabled in current run
Resolves: NEO-7199
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- caps check is not needed when link engines are not available for
product
Related-To: NEO-7886
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
Ocloc supports passing hw ip version value to -device arg in
the form of major.minor.revision.
This change adds support for directly passed value as uint32_t as well.
Support added for single and fat binary.
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
Related-To: NEO-7903
The memadvise with preferred location for kmd-migrated shared allocation
is set to device associated with cmd list by default to migrate data
to lmem on non-atomic gpu page fault too (for performance reasons).
Related-To: NEO-7252
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Add "DumpZEBin" debug flag. When this flag is enabled, Zebin will be
dumped to a .elf file (with appropiate suffix, in case such file has
been dumped before).
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Related-To: NEO-7895
Add mechanism to increase direct submission timeout up to a maximum
value when no new submissions were made since last sleep.
This should help in workloads that have delays between iterations larger
than current direct submission controller timeout.
Related-To: NEO-7878
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
ensure default hw ip version matches the value from helper
change pvc ult execution to revision 3
Related-To: NEO-7738
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Add debug key LogZEInfo for logging ZE Info from zebin elf.
ZE Info will be dumped to a file (default igdrcl.log)
Related-To: NEO-7895
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Related-To: NEO-6206
With this commit OpenCL will report cl_khr_integer_dot_product extension
in version 2. With all properties enabled.
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
Unit tests should not write output to the console.
Instead, every output should be captured.
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
Improves performance for benchmarks with KMD-migrated shared allocation
in scenarios with ZE_AFFINITY_MASK=0.1.
Related-To: NEO-7881
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
- remove double implementation between similar hw generation.
- group the same implementations into dedicated inl files.
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
ensure default hw ip version matches the value from helper
change pvc ult execution to revision 3
Related-To: NEO-7738
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
In the case of mtl+ platforms, the returned config value
should equal the hardware ip version value.
This change fixes situations where some config has not been
added and in this case we returned an unknown value.
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
Related-To: NEO-7738
If indirect accesses in kernel are not detected by IGC, indirect
allocations will not be made resident for this kernel.
Related-To: NEO-7712
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Previously std::once_flag was assigned per map:
std::unordered_map<ContextId, std::unique_ptr<SipKernel>> which was
incorrect and caused the situation in which SipKernel is allocated only
on 1 context and was skipped for other contexts, so we ended up with
only one allocation regardless of the number of contexts.
This change assigns std::once_flag for each allocated SipKernel.
Related-To: NEO-7630
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
Add new regkey KMDSupportForCrossTileMigrationPolicy
(disabled by default, in absence of KMD suppport for cross-tile migrations)
to control placement of shared allocation and memory prefetch behavior.
Related-To: NEO-7885
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
This change is part of performance feature to start command list batch buffers
as primary.
Implicit Scaling sometimes require to jump over control section and these jumps
must maintain the same level of batch buffer as the whole command list.
Related-To: NEO-7807
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Extended the regkey ForceMemoryPrefetchForKmdMigratedSharedAllocations
to force meory prefetch of kmd-migrated shared allocation
in clEnqueueNDRangeKernel(), clEnqueueMemFillINTEL, ...
Related-To: NEO-7841
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Related-To: LOCI-4174
- Call zelSetDriverTeardown during L0 Driver teardown to prevent users
from calling into destroyed functions and encountering crashes
during teardown.
Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
This change adds space reservation in command list for returning batch buffer
start hw command.
Primary batch buffer can be run from direct submission or from KMD call and
must be aligned to required size.
Ending patch for batch buffer start must be in the last command buffer of the
command list.
Related-To: NEO-7807
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
This is number of small tweaks to state cache invalidation:
1. Invalidate if heap was actually created.
2. Check if os context was actually initialized.
3. Heap allocation was actually submitted, as it might attain zero task count
value, when allocation is stored in csr internal storage, as csr wasn't used,
but the csr task count being zero is assigned to heap allocation when stored.
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
This feature is part of performance improvement to dispatch and start
command buffers as primary batch buffers.
When exhausted command buffer is closed, then reserve exact space for chained
batch buffer start and bind it to the next command buffer.
When closing command buffer, then save ending pointer and
reserve aligned space.
Related-To: NEO-7807
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
There is no need to reset all fields and load support flags every reset call.
Add dedicated calls that will reset values and dirty flags.
Call virtual methods only once at init time.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
On linux OfflineDumpContextId consists of
32b processId in bits 63:32
32b drmContestId in bits 31:0
Also cache linux implementation of getProcessId since
the value is constant.
Related-To: NEO-7630
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
Call VM prefetch ioctl on all VMs for the KMD to apply
a synchronoues bind operation of buffer objects on all VMs.
Related-To: NEO-7841
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Globals surface allocation via USM manager will have correct allocation
type set (instead of just BUFFER) and will use cpu copy when possible.
Related-To: NEO-7796
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
State base address size proprties are not used to track state changes, but
they are important to carry size values.
Simplify state base address tracking, so they can update the value of the
property, but not the dirty state.
Related-To: NEO-7828
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- 3D btd command should be programed only once per context
- Add conditional pipe control command prior dispatching 3D btd command
- share 3D btd state between immediate and regular command lists
- add pipe control after ray tracing kernel to invalidate state cache
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Add the regkey ForceMemoryPrefetchForKmdMigratedSharedAllocations
to force meory prefetch of kmd-migrated shared allocation
in zeCommandQueueExecuteCommandLists().
Related-To: NEO-7841
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
- when destination surface is media compressed then disable compression bit
- rename command field CompressionType->ControlSurfaceType
- program this field only on Xe Hpg platforms
Related-To: NEO-7415
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Use cpu copy for globals surface when allocated through svm, allocation
not set as lockable but locking allocation succeeds.
Make sure gfx allocations is unlocked after copy.
Related-To: NEO-7796
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Apply the KMD advise with preferred device location for KMD-migrated
shared allocation to migrate to lmem on every GPU page fault
(default KMD migration policy).
Related-To: NEO-7851
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Changed -cl-intel-allow-zebin to -cl-intel-enable-zebin only for
API options.
Related-To: NEO-7801
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
- Added ULTS to improve the code coverage in fs_access.
- Used the shared functions in the ULTs.
- Added the function for pwrite in the shared code.
Related-To: LOCI-2117
Signed-off-by: Bari, Pratik <pratik.bari@intel.com>
Use cpu copy for globals surface when allocated through svm, allocation
not set as lockable but locking allocation succeeds.
Related-To: NEO-7796
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
When state base address tracking is enabled and command list use private heaps
then command list at destroy time must calls all compute CSRs that were using
that heap to invalidate state caches.
This allows new command list to reuse the same heap allocation for different
surface states, so before new use cached states are invalidated.
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Allows the user to use alignments > 64KB in `createUnifiedMemoryAllocation`
So that the restriction in `piextUSMDeviceAlloc` of the DPC++ runtime
could be lifted
Related-To: LOCI-4168
Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
zeCommandQueueExecuteCommandLists return ZE_RESULT_ERROR_UNKNOWN when OOM
in some scenario of direct submission.
Related-To: NEO-7840
Signed-off-by: Pan Zhenjie <zhenjie.pan@intel.com>