Pass hwInfo to isHeaplessModeEnabled and isForceBindlessRequired functions.
Related-To: NEO-14526
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
Test program in the linked, related issue
is crashing in tbx mode. Tbx server indicated
upload of invalid memory was made before exit.
Running with debug messages showed that the
problematic upload was an svmcpu buffer when
running neo with separate cpu and gpu
buffers for shared memory management.
Using this info, the problem was narrowed down
to a missing unprotect call in page fault manager
related code, resulting in a protected(invalid)
memory region getting uploaded to tbx.
It is unclear yet why this unprotect call was not made,
since other svmcpu buffers were uploaded without issue.
This hotfix forces the unprotect call in the fault handler,
which allows the test program to run to completion. However,
there is now a failing test case.
Considering the critical nature of the associated
NEO issue and that this patch should unblock
the work depending on the fix, this hotfix should
get merged regardless of the failing test case.
In the meantime, I will continue triaging the
failing test and will implement a proper fix
once the root cause is isolated.
Related-To: NEO-13404
Signed-off-by: Jack Myers <jack.myers@intel.com>
- command list append state is managed from internal queue and can be skipped
- initial state configuration should be processed by both kernel and non-kernel
- only kernel operation can process required state, as non-kernel cannot change
Related-To: NEO-10356
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- Updated `isAllocTbxFaultable` to exclude `gpuTimestampDeviceBuffer` from being
faultable.
- Replaced `SpinLock` with `RecursiveSpinLock` in `CpuPageFaultManager` and
`TbxPageFaultManager` to allow recursive locking.
- Added unit tests to verify the correct handling of `gpuTimestampDeviceBuffer`
in `TbxCommandStreamTests`.
Related-To: NEO-13748
Signed-off-by: Jack Myers <jack.myers@intel.com>
- Updated `isAllocTbxFaultable` to exclude `gpuTimestampDeviceBuffer` from being
faultable.
- Replaced `SpinLock` with `RecursiveSpinLock` in `CpuPageFaultManager` and
`TbxPageFaultManager` to allow recursive locking.
- Added unit tests to verify the correct handling of `gpuTimestampDeviceBuffer`
in `TbxCommandStreamTests`.
Related-To: NEO-13748
Signed-off-by: Jack Myers <jack.myers@intel.com>
This commit addresses a bug in the previous implementation where almost all once
writable types, except `gpuTimestampBuffers`, were incorrectly enabled for TBX
faultable checks. The fix ensures that only the subset of once writable
types that are also lockable are considered TBX faultable, using the lockable
check to avoid manual exceptions and re-inventing the wheel.
Changes:
- Updated `isAllocTbxFaultable` method to check if the allocation type is
lockable in addition to being once writable.
- Refactored unit tests to include separate checks for lockable and non-lockable
allocation types.
Performance optimization:
- Removed unnecessary memory data erasure in `handlePageFault` to avoid constant
erase/insert operations, leveraging the O(1) search time of unordered maps.
Related-To: NEO-12319
Signed-off-by: Jack Myers <jack.myers@intel.com>
Patch #34223 introduced the TbxPageFaultManager for handling
uploads/downloads of host buffers to the Tbx server, ensuring
host memory is kept consistent between the host and device,
even after multiple alternating writes from the host and gpu.
This patch enable fault handling for all `isAubOnceWritable`
types.
Minor exception for gpuTimestampBuffers as enabling this type
seems to break things in real-world use cases outside of ULTs.
Related-To: NEO-12319
Signed-off-by: Jack Myers <jack.myers@intel.com>