Temporarily reverting back to previously reported value to avoid corner-case
regression. Intent is reintroduce as soon as regressions are rootcaused.
This only affects Linux.
Related-To: GSD-5474, GSD-5412
This reverts commit e8ac22c265.
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
By default prefer allocating memory first by KMD, instead of malloc first.
By default prefer not caching allocations on MTL devices. This results
in allocations being handled with non-coherent pat index.
For integrated devices when caching is not preferred do not allow
direct memory access in CPU domain. For map/unmap operations create
a dedicated memory allocation for CPU access, instead of accessing it
directly, reusing the same logic as when mapping/unmapping local memory.
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
The new classes SysmanKmdInterface, SysmanKmdInterfaceI915 and
SysmanKmdInterfaceXe have been introduced.
A map is maintained in the SysmanKmdInterfaceI915 and
SysmanKmdInterfaceXe class for the sysfs file names.
The access specifier of the function getDrmVersion has been changed from
protected to public so as to use it in the sysman code. This is required
for the SysmanKmdInterface pointer to point to the
SysmanKmdInterfaceI915 and SysmanKmdInterfaceXe accordingly.
The ULTs have been added for the new sysfs file path corresponding to
the i915 and the Xe driver.
Related-To: LOCI-4399
Signed-off-by: Bari, Pratik <pratik.bari@intel.com>
Added if conditions to enable useChunking flag by checking
with ExecutionEnvironment::isDebuggingEnabled.
Related-To: NEO-8164
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
- Preserve releases on CMake level.
- Instead of generating builtins per platform, generate them per-release
(+ correct naming accordingly).
- Stop using revisions in builtin compilation logic path, as they are
already embedded in release (device ip).
- Remove platform names & revisions from names for generated files
(related to builtins).
- Remove unnecessary code, refactor ULT logic.
Related-To: NEO-7783
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
The previous name "pageSize2Mb" defined in
shared/source/helpers/constant.h is inconsistent to other variable,
i.e. pageSize64k.
Furthermore, it's a bit misleading because the page size is defined in
Megabytes (MB), not in Megabits (Mb).
Related-to: NEO-7695
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
Related-to: NEO-7695
New debug keys added:
EnableBOChunking is now a mask
0 = no chunking (default).
1 = shared allocations only
2 = device allocations only
3 = shared and device allocations
MinimalAllocationSizeForChunking sets the minimum allocation
size to apply chunking. Default is 2MB.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
- all bos from Module must have requireImmediateBinding
flag set
- this change fixes hang in debugger - where MODULE LOAD event
was not sent
Resolves: NEO-8121
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
With direct submission disabled this resulted in waitTimeout long enough
that kmdWait fallback was rarely used.
This caused more CPU spin time.
Related-To: GSD-3612
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
When upstream ioctl helper is created it will try to create small
allocation, adding I915_GEM_CREATE_EXT_SET_PAT extension. If it
succeeds, for all resources with valid pat index value it will then
explicitly program pat index value with gem create ext call.
PrintBOCreateDestroyResult value can be used to:
- print whether the set pat extension is supported by the kernel, when
ioctl helper is created
- print whether set pat extension was added for a given gem create ext
call and what pat index value was programmed
Resolves: NEO-7896
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
- Abstracts product helpers logic for uuid
- Add UUID retrieval for Linux for Sysman via zesInit path
Related-To: LOCI-4137
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@intel.com>
The Memory Info object is used in the getState function for memory.
Some of the ULTS in the memory modules has been modified.
A function to return the sysfs nodes for the Memory address range has
been added in the IoctlHelper class corresponding to the XE and i915
driver.
Related-To: LOCI-4397
Signed-off-by: Bari, Pratik <pratik.bari@intel.com>
Set no preference for KMD migrated shared allocation and rely on ACC to migrate.
Related-To: NEO-6839
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
- by default ZE_ENABLE_PCI_ID_DEVICE_ORDER is disabled
- by default devices are sorted by type (discrete first), then by pci order
- when ZE_ENABLE_PCI_ID_DEVICE_ORDER is enabled, devices are sorted by pci id
Related-To: LOCI-4520
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Use flushStamp=taskCount when passed flushStamp==0.
This will cause driver to busy wait for a short while before falling
back to use kmd notify.
Related-To: GSD-3612
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
Related-To: NEO-6075
Ngen binaries contain stateful information, however they are
not used in isa on Pvc. Therefore, we can just ignore them.
Read if support for chunking is available in the KMD.
If available, KMD will create a BO with 1 or more chunks,
depending on the chunk size selected.
Related-To: NEO-7695
Sync to
https://github.com/intel-gpu/drm-uapi-helper/releases/tag/v2.0-rc18
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Signed-off-by: John Falkowski <john.falkowski@intel.com>
- new debug key EnableDeviceStateVerification to check device state not
ony in debug mode
Related-To: NEO-7669
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
- move isCachingOnCpuAvailable to product helper
- isCachingOnCpuAvailable should return false on mtl
- if wsl, skip checking method from product helper
Related-To: NEO-7194
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
Have makeResident return error to the caller, instead of always
SUCCESS. This will allow interfaces like zeContextMakeMemoryResident
to fail properly.
Additionally, change the parsing of MemoryOperationsStatus from
ZE_RESULT_ERROR_OUT_OF_HOST_MEMORY to
ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY, since when making resources
resident, it is the device running out of memory, instead of the
host.
Related-To: LOCI-4443
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Related-To: LOCI-4176
- Given a Base Pointer passed into Get Peer Allocation, then the base
pointer is used in the map of the new allocation to the virtual memory.
- Enables users to use the same pointer for all devices in Peer To Peer.
- Currently unsupported on reserved memory due to mapped and exec
resiedency of Virtual addresses.
Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
- vm_bind with user fence updates fence value independently for every
VM hence with per-context VMs, every context needs its unique fence
address. This prevents 2 contexts from updating value possibly
writing lower value than the one that was already stored
Resolves: NEO-8004
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Get ipVersion from productHelper function on xe and upstream.
On prelim first try to query ipVersion from kmd,
if it fails, get ipVersion from productHelper function.
Related-To: NEO-7786
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
This fixes illegal memory accesses by the job submitted to the GuC.
Also some unit tests are added to harness the vmBind operation.
Related-To: NEO-7996
Signed-off-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
This commit keeps KMD migration still disabled by default on PVC platform.
Related-To: NEO-6465
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
This patch add new environment variables to control compiler cache.
Works as follow: If persistent cache is set driver check if NEO_CACHE_DIR
is set. If not then driver checks XDG_CACHE_HOME - If exists
then driver create neo_compiler_cache folder, if
not then driver checks HOME directory. If each NEO_CACHE_DIR,
XDG_CACHE_HOME and HOME are not set then compiler cache is disabled.
Current support is for Linux only.
Signed-off-by: Diedrich, Kamil <kamil.diedrich@intel.com>
Related-To: NEO-4262
- allocateGraphicsMemoryUsingKmdAndMapItToCpuVA in case of no compression
- allocate32BitGraphicsMemoryImpl in case of allocate by KMD
remove redundant ctor of StorageInfo class
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
populate memory info based on mem usage and gts info
propagate error from xeWaitUserFence function
Related-To: NEO-7931
Co-authored-by: Francois Dugast <francois.dugast@intel.com>
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
use StackVec instead of unordered map
resize container at MemoryManager's creation time
Related-To: NEO-7925
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
in most cases we need to iterate over engines associated to single root device
Related-To: NEO-7925
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Related-To: LOCI-4172, LOCI-4305, LOCI-4306
- Create a new IPC Memory handle upon call to getIpcMemHandle if the
previous handle has been freed.
- Release the Ipc Memory Handle when zeMemPutIpcHandle is called.
- Create a new IPC Handle for tracking thru zeMemGetAllocProperties
when ze_external_memory_export_fd_t is used.
- Convert FD to opaque IPC handle and IPC Handle to FD.
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
Current flow will be to have one synchronization point
config.file. Read remains unblocking, only write(caching)
operation will be blocking (lock on config.file)
Related-To: NEO-4262
Signed-off-by: Diedrich, Kamil <kamil.diedrich@intel.com>
- Check allocation root device index during eviction
- Wait for and marked allocation only from the current root device index
Related-To: NEO-7920
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
Enable for zebin format but not CM kernels.
Use heuristic of simdSize == 1 to detect CM kernels.
Related-To: NEO-7712
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
The memadvise with preferred location for kmd-migrated shared allocation
is set to device associated with cmd list by default to migrate data
to lmem on non-atomic gpu page fault as well (for performance reasons).
Related-To: NEO-7252
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Extract distinct steps as dedicated functions, especially when the code
is duplicated. This eases analysis of the logic and highlights
differences between callers of a common code.
Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
Unify fp16/fp32/fp64 across all platforms. The capabilities indicated by
those flags now refer to both emulated and native-supported (HW) ones:
- Global/local atomic load: HW support on all platforms (handled by native
i16 atomic or) for FP16, FP32 and FP64.
- Global/local atomic store: HW support on all platforms (handled by
native i16 atomic exchange) for FP16, FP32 and FP64.
- Global/local atomic compare/exchange: HW support on all platforms
for FP32.
- Global/local atomic min/max: Emulation support on all platforms for
FP64, HW support on all platforms for FP32, HW support on XE+ platforms
and emulation support on all others for FP16.
- Global atomic add: HW support for PVC+ platforms, emulation support on
all other platforms for FP64, HW support on XE+ platforms and emulation
support on all other platforms for FP32.
- Local atomic add: Emulation on all platforms for both FP64 and FP32.
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Related-To: NEO-7734
- caps check is not needed when link engines are not available for
product
Related-To: NEO-7886
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
Ocloc supports passing hw ip version value to -device arg in
the form of major.minor.revision.
This change adds support for directly passed value as uint32_t as well.
Support added for single and fat binary.
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
Related-To: NEO-7903
Improves performance for benchmarks with KMD-migrated shared allocation
in scenarios with ZE_AFFINITY_MASK=0.1.
Related-To: NEO-7881
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
In the case of mtl+ platforms, the returned config value
should equal the hardware ip version value.
This change fixes situations where some config has not been
added and in this case we returned an unknown value.
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
Related-To: NEO-7738
If indirect accesses in kernel are not detected by IGC, indirect
allocations will not be made resident for this kernel.
Related-To: NEO-7712
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Add new regkey KMDSupportForCrossTileMigrationPolicy
(disabled by default, in absence of KMD suppport for cross-tile migrations)
to control placement of shared allocation and memory prefetch behavior.
Related-To: NEO-7885
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
On linux OfflineDumpContextId consists of
32b processId in bits 63:32
32b drmContestId in bits 31:0
Also cache linux implementation of getProcessId since
the value is constant.
Related-To: NEO-7630
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
Call VM prefetch ioctl on all VMs for the KMD to apply
a synchronoues bind operation of buffer objects on all VMs.
Related-To: NEO-7841
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Use cpu copy for globals surface when allocated through svm, allocation
not set as lockable but locking allocation succeeds.
Make sure gfx allocations is unlocked after copy.
Related-To: NEO-7796
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Apply the KMD advise with preferred device location for KMD-migrated
shared allocation to migrate to lmem on every GPU page fault
(default KMD migration policy).
Related-To: NEO-7851
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Use cpu copy for globals surface when allocated through svm, allocation
not set as lockable but locking allocation succeeds.
Related-To: NEO-7796
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
zeCommandQueueExecuteCommandLists return ZE_RESULT_ERROR_UNKNOWN when OOM
in some scenario of direct submission.
Related-To: NEO-7840
Signed-off-by: Pan Zhenjie <zhenjie.pan@intel.com>
Keep resolving with semaphores if multiple (>2) queues are submitting to
the same CSR. In such case, semaphores allow concurrent execution while
pipecontrols would serialize it.
Related-To: NEO-7321
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Topology map was only being created when ZET_ENABLE_PROGAM_DEBUGGING was
set. This was not correct. Now it is unconditionally created at init,
and debug attach will fail if it is not valid.
Related-to: LOCI-3937
Signed-off-by: Yates, Brandon <brandon.yates@intel.com>
- Explicitly force unbind of Buffer Objects during unmap to ensure that
Buffer Objects can be reused in the same application.
Related-To: LOCI-4162
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
It is possible that a module has so many kernels that the 4GB limit of
GPU VA is depleted when each kernel allocates a 64 KB page for its own
ISA. In such case, propagate the ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY to
the API caller to indicate the actual problem.
Currently such scenario is not detected, the execution advances a bit
further and the following crashes do not let the user to easily
understand what happened.
Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
Select ccs engine for usm device and shared allocations
(i.e. for transfers from local to local).
Related-To: NEO-7252
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Do not make indirect allocations resident if kernel does not use
indirect access.
For both level zero and opencl.
Currently disabled by default, enable with debug flag
DetectIndirectAccessInKernel
Related-To: NEO-7712
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Related-To: LOCI-3871
- Relaxed the Virtual Memory Reservation to allow pStart and not fail if
the pStart value is not obtained.
- Moves checks on pStart to the user to check and determine if they want
to re-reserve or use the address allocated.
- Changed reserveGpuAddress to use unit64_t type to allow internal
address range structure assignment without cast.
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
reuse EncodeEnableRayTracing in CommandStreamReceiver
add method to determine need for 48b resource flag for RT allocations
Related-To: NEO-7606
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
to guarantee that all subblt got complete for previous copy
affect xe hpg
Related-To: NEO-7450
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
This patch adds support to open and close the drm fd,
so that more granular control of drm fd could be
achieved.
Related-To: LOCI-3986
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
In level_zero/sysman directory This change:
- Adds support for accessing linux based filesystem
- Add support for telemetry
- Add support for LinuxSysmanImp
Related-To: LOCI-3889
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
Propogate error codes from ioctl failure properly up the layers
so that we skip exposing bad root devices.
Related-To: NEO-7709
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@intel.com>
- stateless mocs is present in all state base address commands
- select GPGPU pipeline is present in all pipeline select commands
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Related-To: LOCI-3871
- Enabled allocation of specified base address in the targeted heap.
- Enabled virtual memory reservations to grow by allocating at the start
of the heap vs the end of the heap.
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
When one process had exported and then opened IPC handle
of memory, then close function was called twice for the
same BO handle. It caused debugBreak() and aborted
an application.
This change allows multiple separate BOs to share one
handle. The last shared handle owner calls close() function.
Related-To: NEO-7200
Signed-off-by: Wrobel, Patryk <patryk.wrobel@intel.com>
- Added support for mapping any portion of a virtual allocation to a
physical mapping with a lookup function for reserved virtual addresses.
- Added support for multiple mappings linked to the same virtual
reservation.
- Fixed bug with 64 bit addresses on windows with invalid addresses
passed to the user.
Related-To: LOCI-3904, LOCI-3914, LOCI-3931
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>