Improve L0 fill operations by copying the pattern using
two kernels: one that copies four bytes at a time, and one
that takes care of the remainder. Additionally, a new
allocation is created to fill up at least a cacheline.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
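The split described above can be modelled on the host as a minimal
sketch. This is not the driver's kernels: the names makeStaging and
fillWithPattern, the 64-byte cacheline size, and the staging layout
are all assumptions made for illustration. The staging buffer plays
the role of the new allocation that holds at least a cacheline of
repeated pattern.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed cacheline size; the real value depends on the hardware.
constexpr size_t kCacheline = 64;

// Hypothetical helper: replicate the pattern into a buffer that is at
// least one cacheline long and a multiple of both the pattern size and
// four bytes, so that four-byte reads never straddle the buffer end.
std::vector<uint8_t> makeStaging(const uint8_t *pattern, size_t patternSize) {
    assert(patternSize > 0);
    const size_t unit = patternSize * sizeof(uint32_t);
    const size_t length = unit * ((kCacheline + unit - 1) / unit);
    std::vector<uint8_t> staging(length);
    for (size_t i = 0; i < length; ++i) {
        staging[i] = pattern[i % patternSize];
    }
    return staging;
}

// Hypothetical host model of the two kernels.
void fillWithPattern(uint8_t *dst, size_t size,
                     const uint8_t *pattern, size_t patternSize) {
    const std::vector<uint8_t> staging = makeStaging(pattern, patternSize);
    const size_t mainBytes = size - size % sizeof(uint32_t);

    // "Main kernel": copy four bytes at a time.
    for (size_t offset = 0; offset < mainBytes; offset += sizeof(uint32_t)) {
        std::memcpy(dst + offset,
                    staging.data() + offset % staging.size(),
                    sizeof(uint32_t));
    }
    // "Remainder kernel": copy the trailing bytes one at a time.
    for (size_t offset = mainBytes; offset < size; ++offset) {
        dst[offset] = staging[offset % staging.size()];
    }
}
```

Because the staging length is a multiple of the pattern size, indexing
it modulo its length yields the same byte as indexing the pattern
modulo the pattern size, so both loops write the correct pattern byte.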
store reference to std of root device indices and device bitfields
store NEO::Device in USM properties
Related-To: NEO-3691
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
This is done instead of when the associated module is created, to avoid
allocating memory for kernels that are never created or used.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
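The idea of deferring allocation from module creation to kernel
creation can be sketched as follows. Module, Kernel, KernelInfo and
the allocatedBytes counter are hypothetical stand-ins, not the
driver's classes, and a plain byte vector stands in for whatever
memory the driver allocates.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of a kernel contained in a module.
struct KernelInfo {
    std::string name;
    size_t requiredBytes;
};

// Hypothetical kernel object that owns its memory.
struct Kernel {
    std::vector<unsigned char> memory;
};

class Module {
  public:
    // Creating the module only records kernel descriptions;
    // no per-kernel memory is allocated here.
    explicit Module(std::vector<KernelInfo> kernelInfos)
        : infos(std::move(kernelInfos)) {}

    // Memory is allocated only when a kernel is actually created.
    std::unique_ptr<Kernel> createKernel(const std::string &name) {
        for (const KernelInfo &info : infos) {
            if (info.name == name) {
                auto kernel = std::make_unique<Kernel>();
                kernel->memory.resize(info.requiredBytes);
                assert(kernel->memory.size() == info.requiredBytes);
                allocatedBytes += info.requiredBytes;
                return kernel;
            }
        }
        return nullptr; // unknown kernel name
    }

    size_t allocatedBytes = 0; // total bytes allocated so far

  private:
    std::vector<KernelInfo> infos;
};
```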
Shared allocations are currently migrated to the GPU by the page-fault
manager when calling executeCommandLists. Allocations to migrate are
taken from the lists container. However, if a shared allocation
has been made resident with zeContextMakeMemoryResident(), it is not
added to that container, and hence it is not migrated to the device.
So, add a container of resident allocations to the driver and migrate
them along with the other allocations.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
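A minimal sketch of the fix described above, assuming simplified
stand-in types: SharedAllocation, CommandList, Driver, makeResident
and migrateForExecute are hypothetical names, and a boolean flag
stands in for the actual migration done by the page-fault manager.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical shared allocation; onDevice models its current placement.
struct SharedAllocation {
    bool onDevice = false;
};

// Hypothetical command list with the allocations it references.
struct CommandList {
    std::vector<SharedAllocation *> allocations;
};

// Hypothetical driver state with the added container of
// allocations that were made resident explicitly.
struct Driver {
    std::set<SharedAllocation *> residentAllocations;
};

void makeResident(Driver &driver, SharedAllocation *allocation) {
    assert(allocation != nullptr);
    driver.residentAllocations.insert(allocation);
}

// Migrate allocations from the command lists and, in addition, the
// resident ones. Returns how many allocations were moved to the device.
size_t migrateForExecute(Driver &driver,
                         const std::vector<CommandList *> &lists) {
    size_t migrated = 0;
    auto migrate = [&migrated](SharedAllocation *allocation) {
        if (!allocation->onDevice) {
            allocation->onDevice = true;
            ++migrated;
        }
    };
    for (const CommandList *list : lists) {
        for (SharedAllocation *allocation : list->allocations) {
            migrate(allocation);
        }
    }
    for (SharedAllocation *allocation : driver.residentAllocations) {
        migrate(allocation);
    }
    return migrated;
}
```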
Add experimental extension to set global work offset in L0.
The current L0 specification does not have an interface to export
experimental function symbols, so for now, applications need
to look up the symbol themselves, for example with dlsym on Linux.
A blackbox test showing functionality is also added.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
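A minimal sketch of such a lookup on Linux, using only dlopen and
dlsym from <dlfcn.h>. The helper name findSymbol is hypothetical, and
the library path and symbol name are left to the caller because the
message above does not name them.

```cpp
#include <cassert>
#include <dlfcn.h>

// Hypothetical helper: look up symbolName in the library at libraryPath.
// Passing nullptr as libraryPath searches the running program and the
// libraries it has already loaded. Returns nullptr when the library or
// the symbol cannot be found.
void *findSymbol(const char *libraryPath, const char *symbolName) {
    assert(symbolName != nullptr);
    void *handle = dlopen(libraryPath, RTLD_NOW);
    if (handle == nullptr) {
        return nullptr;
    }
    // The handle is intentionally not closed, so that the returned
    // address stays valid for the lifetime of the process.
    return dlsym(handle, symbolName);
}
```

An application would cast the returned pointer to the function
pointer type declared for the experimental function before calling
it, and should handle a null result as "extension not available".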
Add support for device and shared allocations that use the
ZE_DEVICE_MEM_ALLOC_FLAG_BIAS_UNCACHED flag, whether the
kernel using the memory is stateless or stateful.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
In this example, two processes are launched on different devices
if more than one device is detected. Then, P2P capabilities are
queried through zeDeviceCanAccessPeer().
If P2P capabilities are available, then an IPC memory handle is
exchanged from server to client, and the client process running on
device 1 copies data from its buffer (allocated on device 1) to
the buffer exported by the server (allocated on device 0).
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
In this example, two processes are launched on the same device,
and an IPC memory handle is exchanged from server to client.
Then, the client process copies data from its buffer
to the buffer exported by the server.
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
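How the handle travels from server to client is not stated above.
One common approach on Linux, assumed here purely for illustration,
is to pass a file descriptor over a Unix-domain socket with
SCM_RIGHTS. Whether the IPC memory handle wraps such a descriptor is
an assumption, and sendFd and receiveFd are hypothetical helper names.

```cpp
#include <cassert>
#include <cstring>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

// Send fd over a connected Unix-domain socket. Returns true on success.
bool sendFd(int socketFd, int fd) {
    assert(fd >= 0);
    char payload = 'h'; // at least one data byte must accompany the fd
    iovec iov{&payload, 1};
    alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};

    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    std::memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(socketFd, &msg, 0) == 1;
}

// Receive a file descriptor sent with sendFd. Returns -1 on failure.
int receiveFd(int socketFd) {
    char payload = 0;
    iovec iov{&payload, 1};
    alignas(cmsghdr) char control[CMSG_SPACE(sizeof(int))] = {};

    msghdr msg{};
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = control;
    msg.msg_controllen = sizeof(control);

    if (recvmsg(socketFd, &msg, 0) != 1) {
        return -1;
    }
    cmsghdr *cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == nullptr || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS) {
        return -1;
    }
    int fd = -1;
    std::memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

The received descriptor is a new number in the client process that
refers to the same underlying object as the one the server sent.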
Update usage of SUPPORTED_IMAGES flag and do not use images when disabled.
Use SUPPORTED_2_0 only on fully OCL 2.1 conformant platforms.
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>