This to follow specification, which says:
zeMemOpenIpcHandle:
- Multiple calls to this function with the same IPC handle will return
unique pointers.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
- Enable support for L0 Virtual Memory reservation on Linux and Windows.
- Excludes support for Linux to allow pStart option
Related-To: LOCI-3397, LOCI-1543
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
If a handle cannot be obtained, like PRIME_HANDLE_TO_FD, then
properly check for the error and propagate it upwards.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Ensure memory prefetch be applied in every execution of command list.
Related-To: NEO-6740
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
This patch force KMD allocation path for USM shared
Additionally we force 64kb page from lock which is
required to properly program GPU VA
Related-To: NEO-6913
Signed-off-by: Kamil Diedrich kamil.diedrich@intel.com
If a handle cannot be obtained, like PRIME_HANDLE_TO_FD, then
properly check for the error and propagate it upwards.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Allow to: disable performance hints, make allocation lockable
Used in BufferPoolAllocator
Related-To: NEO-7332
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Allocations of buffers <= 64KB will be lockable, to
allow copying through locked pointer.
Related-To: NEO-7332
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
PVC platform with no support for atomic operations on system memory
must always allocate buffers in local memory to avoid atomic access violation.
Note: the feature is being implemented under the new registry key
AllocateBuffersInLocalMemoryForMultiRootDeviceContexts (disabled by default)
Related-To: NEO-7092
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
This change replaces unneeded copying of std::vectors
with usage of const references. Furthermore, it adds
reserve() call before filling the container via push_back().
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
Discarding RAII lock returned from function almost always
is a bug. This change introduces usage of [[no_discard]]
attribute from C++17 to prevent such misues.
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
Do not use aligned size when storing allocation
Trim allocation cache before deleting devices
Related-To: NEO-6893
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
- Properly check for IPC event handle flag to determine if the event
pool memory is sharable between processes.
- Given Host Visible Event Pool, a check is done to determine if the
Host memory can be shared between the processes.
- Enabled handling if Event Host Memory is shareable for DRM
- If Event Pool Memory is Not shareable, then retrieving the IPC Event
Pool Handle returns unsupported.
Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
Add virtual deconstructor to OsHandle and deconstructor to OsHandleLinux
Add override keyword to destructor
Add overriding deconstructor to OsHandleWin
Add newline before private members
https://github.com/intel/compute-runtime/pull/550
Signed-off-by: Cameron S Murtagh <cameron.murtagh00@gmail.com>
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
EOT WA requires allocating last 64KB of kernel heap and putting EOT
signature at the last 16 bytes of kernel heap
Related-To: NEO-7099
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Remove the restriction on USM allocation created in a single local memory region
with latest KMD fix for cross tile migration thrashing b/t lmem (dii-3516)
Related-To: NEO-6909
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
This commit fixes problem with untransfered shared usm memory to gpu
when there is submit to gpu trigerred by user event. Also there is a fix
for dead lock problem caused by mixed orders of locking mutexes in csr
and in direct submission controller.
Related-To: NEO-6762
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
With this commit on DG2 32bit driver will check if passed host ptr for
clEnqueueReadBuffer is write combined memory. If check will be true copy
will be make on CPU.
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
With flag enabled, when app calls freeSVMAlloc on device usm allocation,
don't free it immediately but save it,
and try to use it on subsequent allocations.
This allocation cache will be trimmed if an allocation fails.
Related-To: NEO-6893
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Use enum class for MemoryPool in GraphicsAllocation
This change will ensure that GA is constructed in the proper way
- Rename namespace for isSystemMemoryPool method
- Add method getMemoryPoolString for logging actual pool which is in used
- Remove wrong pattern in GraphicsAllocation constructor
Related-To: NEO-6523
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
when using implicit scaling, 2 dma-buf handles, one per tile, are
needed to support dma access from peer.
Related-To: LOCI-3122
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Ensure KMD migrations for USM allocations to occur between smem and lmem only
Related-To: NEO-6969
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Define single .clang-tidy configuration with all used checks and use
NOLINT to selectively silence tool. That way cleanup should be easier.
third_part/ has its own configuration that disables clang-tidy for this
folder.
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
When using implicit scaling, device allocations may have
more than one internal allocation created internally. In that case,
a separate dma-buf handle per internal allocation needs to be
exported.
So introduced two driver experimental extensions to export and
import more than one IPC handle:
- zexMemGetIpcHandles
- zexMemOpenIpcHandles
Related-To: LOCI-2919
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
This is fixed reupload of this commit after auto revert
With this commit OpenCL will track if external host memory is used from
few threads and will secure to update task count in all threads before
destroing allocation.
Resolves: NEO-6807
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
With this commit OpenCL will track if external host memory is used from
few threads and will secure to update task count in all threads before
destroing allocation.
Resolves: NEO-6807
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
- different physical storage for every HW context
- adds support for debugging with implicit scaling on
- reorganize tests
Relates-To: NEO-6883
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Stack vector will not cause dynamic allocations in most circumstances
ie. number of root device indices not more than 16
Related-To: NEO-6837
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
This change introduces checking of the return value
of wait function in case of blocking version of
evictUnusedAllocations(). Furthermore, it propagates
the error to the callers. It contains also ULTs.
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
There is no need to force 2MB alignment for CPU allocation in dual
storage usage. Additionaly for WSL this will allow to avoid usage of
malloc in driver path.
Relates-To: NEO-6620
Signed-off-by: Kamil Diedrich <kamil.diedrich@intel.com>
require 48bit resource for ring/semaphore buffer
for multi tile allocations select first tile
for single tile allocation select preferred tile
Related-To: NEO-6698
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
In direct submission scenario command/ring/semaphore buffer allocations
are placed in the same memory bank to ensure that their memory is updated in
correct order
Related-To: NEO-6698
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
When making graphics allocations resident in multi-GPU scenarios,
we should make them resident only if there's an allocation for that
device. So return appropriate null pointer and skip it.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Add a per-instance SVMAllocsManager::nonGpuDomainAllocs container for
all allocations to be removed in
moveAllocationsWithinUMAllocsManagerToGpuDomain. This approach replaces
the current iterative search and performs the task faster.
Add 7 new unit-tests to verify the functionality related to
nonGpuDomainAllocs container, both in expected and unexpected/synthetic
scenarios.
For UTs replace a dummy unifiedMemoryManager pointer with a pointer to
an instace of SVMAllocsManager, otherwise a SegFault error is thrown at
the end of tests.
Perform overall cleanup in related tests implementation, includes but
not limited to removal of:
- givenInitialPlacementGpu\
WhenMovingToGpuDomainThenFirstAccessDoesNotInvokeTransfer
As it is fully covered by:
givenAllocationMovedToGpuDomain\
WhenVerifyingPagefaultThenAllocationIsMovedToCpuDomain
- givenInitialPlacementGpu\
WhenVerifyingPagefaultThenFirstAccessDoesNotInvokeTransfer
As it is fully covered by:
givenTbxAndnitialPlacementGpu\
WhenVerifyingPagefaultThenMemoryIsUnprotectedOnly
Finally, reduce code duplication where possible.
Related-To: NEO-6658
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
This feature is disabled by default, controlled with the knob
AppendMemoryPrefetchForKmdMigratedSharedAllocations
Related-To: NEO-6740
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
- Add getMemoryManagerType to check which memory manager has been init
to determine if Linux + WDDM memory manager is in use.
- Add isNTHandle to test and verify if a handle is an NT handle during
L0 Open IPC Handle.
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
Make them resident directly instead of populating residency container
Remove finds, not needed, CSR resolves duplicates at makeResident calls
Observed gain is 32x for 10k indirect allocations.
Co-authored-by: Michal Mrozek <michal.mrozek@intel.com>
Co-authored-by: Dominik Dabek <dominik.dabek@intel.com>
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
This patch implements the fallback method for Device UUID
using the BDF of the device for all product families.
Related-To: LOCI-2827
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
Add support for ZE_DRIVER_MEMORY_FREE_POLICY_EXT_FLAG_BLOCKING_FREE
added in v1.3.
Related-To: LOCI-2672
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>