Addresses regressions from the reverted merge
of the tbx fault manager for host memory.
Recursive locking of mutex caused deadlock.
To fix, separate tbx fault data from base
cpu fault data, allowing separate mutexes
for each, eliminating recursive locks on
the same mutex.
By separating, we also help ensure that tbx-related
changes don't affect the original cpu fault manager code
paths.
As an added safe guard preventing critical regressions
and avoiding another auto-revert, the tbx fault manager
is hidden behind a new debug flag which is disabled by default.
Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
Addresses regressions from the reverted merge
of the tbx fault manager for host memory.
This fixes attempts by the tbx fault manager
to protect/unprotect host buffer memory, even
if the host ptr was not driver-allocated.
In the case of the smoke test that triggered
the critical regression, clCreateBuffer was
called with the CL_MEM_USE_HOST_PTR flag.
The subsequent `mprotect` calls on the
provided host ptr then failed.
Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
In TBX mode, the host could not write to host buffers after access from device
code due to the lack of a migration mechanism post-initial TBX upload.
Migration is unnecessary with real hardware, but required for TBX.
This patch introduces a new page fault manager type that extends the original
CPU fault manager, enabling automatic migration of host buffers in TBX mode.
Refactoring was necessary to avoid diamond inheritance, achieved by using a
template parameter as the base class for OS-specific fault managers.
Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
Related-To: NEO-13340
When regular copy CSR has enabled direct submission,
stop it before migration on internal CSR.
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
Related-To: NEO-13340
When regular copy CSR has enabled direct submission,
stop it before migration on internal CSR.
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
Enable eviction of CPU side USM allocation for UMD migrations on Windows.
Reverts incorrect auto-revert commit 218de586a4f28b1de3e983b9006e7a99d3a4d10e.
Related-To: NEO-8015
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Enable eviction of CPU side USM allocation for UMD migrations on Windows.
Related-To: NEO-8015
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Enable eviction of CPU side USM allocation for UMD migrations on Windows.
Related-To: NEO-8015
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Have makeResident return error to the caller, instead of always
SUCCESS. This will allow interfaces like zeContextMakeMemoryResident
to fail properly.
Additionally, change the parsing of MemoryOperationsStatus from
ZE_RESULT_ERROR_OUT_OF_HOST_MEMORY to
ZE_RESULT_ERROR_OUT_OF_DEVICE_MEMORY, since when making resources
resident, it is the device running out of memory, instead of the
host.
Related-To: LOCI-4443
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
- Added support for disabling CPU migration of USM memory given
ZE_MEMORY_ADVICE_SET_READ_MOSTLY && ZE_MEMORY_ADVICE_SET_PREFERRED_LOCATION
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
Dates corrected in copyright headers to reflect original publication date
(2018 for OpenCL, 2020 for Level Zero).
Signed-off-by: lgotszal <lukasz.gotszald@intel.com>
Make sure UNCACHED flags are translated into setting the MOCS index
for uncaching L3.
Related-To: NEO-5500
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>