Pass hwInfo to isHeaplessModeEnabled and isForceBindlessRequired functions.
Related-To: NEO-14526
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
A change related to the tbx fault manager
incorrectly removed a switch case from
`AubHelper::isOneTimeAubWritableAllocationType`.
This fixes that and refactors some APIs to prevent
similar mistakes from happening again by cleaning
up logic.
Addresses show stopper for pre-si pytorch workflows.
Resolves: NEO-14399
Signed-off-by: Jack Myers <jack.myers@intel.com>
Test program in the linked, related issue
is crashing in tbx mode. Tbx server indicated
upload of invalid memory was made before exit.
Running with debug messages showed that the
problematic upload was an svmcpu buffer when
running neo with separate cpu and gpu
buffers for shared memory management.
Using this info, the problem was narrowed down
to a missing unprotect call in page fault manager
related code, resulting in a protected(invalid)
memory region getting uploaded to tbx.
It is unclear yet why this unprotect call was not made,
since other svmcpu buffers were uploaded without issue.
This hotfix forces the unprotect call in the fault handler,
which allows the test program to run to completion. However,
there is now a failing test case.
Considering the critical nature of the associated
NEO issue and that this patch should unblock
the work depending on the fix, this hotfix should
get merged regardless of the failing test case.
In the meantime, I will continue triaging the
failing test and will implement a proper fix
once the root cause is isolated.
Related-To: NEO-13404
Signed-off-by: Jack Myers <jack.myers@intel.com>
- command list append state is managed from internal queue and can be skipped
- initial state configuration should be processed by both kernel and non-kernel
- only kernel operation can process required state, as non-kernel cannot change
Related-To: NEO-10356
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Related-To: NEO-13870
Currently all monitor fences are triggering
interrupt due to Notify Enable field.
With this change, such field is programmed
right before KMD wait.
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
- add new enum type for command list flush from immediate
- add new argument for flushing immediate command list - regular command list
- add capability to provide additional stream for epilogue commands
- add pointer to provide external csr mutex to lock both execution and flush
Related-To: NEO-10356
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Do not check if ULLS light is active during every Csr::makeResident
call. Store that information once during ULLS init.
Related-To: NEO-13922
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
- Updated `isAllocTbxFaultable` to exclude `gpuTimestampDeviceBuffer` from being
faultable.
- Replaced `SpinLock` with `RecursiveSpinLock` in `CpuPageFaultManager` and
`TbxPageFaultManager` to allow recursive locking.
- Added unit tests to verify the correct handling of `gpuTimestampDeviceBuffer`
in `TbxCommandStreamTests`.
Related-To: NEO-13748
Signed-off-by: Jack Myers <jack.myers@intel.com>