968 Commits

Author SHA1 Message Date
Lukasz Jobczyk
6f4a56d440 refactor: pass product helper to isFenceAllocationRequired
Related-To: NEO-14642

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-04-28 14:09:02 +02:00
Szymon Morek
3596522637 refactor: remove unused logic in ULLS controller
Related-To: NEO-13843

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-04-17 18:35:20 +02:00
Jaroslaw Warchulski
3e1aa33924 refactor: cleanup headers
Related-To: NEO-5548
Signed-off-by: Jaroslaw Warchulski <jaroslaw.warchulski@intel.com>
2025-04-14 14:59:40 +02:00
Mateusz Hoppe
3204411aca refactor: use deviceBitfield from CSR when creating engine
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2025-04-11 17:30:29 +02:00
Filip Hazubski
504440fc4d feature: Add ftrHeaplessMode flag
Pass hwInfo to isHeaplessModeEnabled and isForceBindlessRequired functions.

Related-To: NEO-14526

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-04-02 21:06:05 +02:00
Brandon Yates
a48d66ad75 feature: Add programExceptions stub to CSR
Related-to: NEO-12967

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2025-04-01 18:33:40 +02:00
Szymon Morek
62964a0b08 fix: invalidate caches when heap is placed into reuse list
Related-To: NEO-9004

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-03-31 12:30:29 +02:00
Jack Myers
0aa2c4f0cb feature: allow removal of heapful code paths
Related-To: NEO-13007

Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-03-27 01:34:35 +01:00
Lukasz Jobczyk
60b551758c performance: Adjust waitpkg threshold for discrete devices
Related-To: NEO-14336

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-03-26 14:59:19 +01:00
Kamil Kopryk
73795ced64 refactor: add setupTimestampPacketFlushL3 function
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-03-24 13:10:03 +01:00
Lukasz Jobczyk
54cb0e24f8 performance: Switch waitpkg use to tpause for ULLS light
Related-To: NEO-13922, NEO-14336

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-03-21 18:09:37 +01:00
Lukasz Jobczyk
8a85a96ed2 feature: Add 3-level wait scheme with tpause intrinsic
Related-To: NEO-14336

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-03-21 12:12:57 +01:00
Jack Myers
0e25970853 fix: re-add switch case for once writable query
A change related to the tbx fault manager
incorrectly removed a switch case from
`AubHelper::isOneTimeAubWritableAllocationType`.

This fixes that and refactors some APIs to prevent
similar mistakes from happening again by cleaning
up logic.

Addresses a showstopper for pre-si PyTorch workflows.

Resolves: NEO-14399
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-03-19 09:54:54 +01:00
Compute-Runtime-Validation
5f7f0dd785 Revert "performance: Enable waitpkg"
This reverts commit 8ec5434460.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-03-18 13:28:29 +01:00
Maciej Plewka
36fa6d66ae fix: lock csr in stopDirectSubmission if needed
Related-To: NEO-13875, NEO-14143, HSD-16026538384, HSD-16026780358
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2025-03-18 09:30:38 +01:00
Jack Myers
5f78147e16 fix: hotfix for svmcpu tbx uploads
The test program in the linked, related issue
crashes in tbx mode. The tbx server indicated that
an upload of invalid memory was made before exit.

Running with debug messages showed that the
problematic upload was an svmcpu buffer when
running neo with separate cpu and gpu
buffers for shared memory management.

Using this info, the problem was narrowed down
to a missing unprotect call in page fault manager
related code, resulting in a protected (invalid)
memory region getting uploaded to tbx.

It is not yet clear why this unprotect call was not made,
since other svmcpu buffers were uploaded without issue.

This hotfix forces the unprotect call in the fault handler,
which allows the test program to run to completion. However,
there is now a failing test case.

Considering the critical nature of the associated
NEO issue and that this patch should unblock
the work depending on the fix, this hotfix should
get merged regardless of the failing test case.

In the meantime, I will continue triaging the
failing test and will implement a proper fix
once the root cause is isolated.

Related-To: NEO-13404
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-03-14 04:47:21 +01:00
Zbigniew Zdanowicz
ddc0b0d03b feature: disable flat ring buffer for command list append operation
Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-03-13 06:06:55 +01:00
Lukasz Jobczyk
8ec5434460 performance: Enable waitpkg
Resolves: NEO-14336

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-03-11 14:43:01 +01:00
Zbigniew Zdanowicz
cd904269ed fix: request for task count should enable monitor fence dispatch
Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-03-10 18:25:35 +01:00
Lukasz Jobczyk
53062056ec performance: Enable wait pkg for ULLS light
Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-03-07 12:18:38 +01:00
Damian Tomczak
81b0cac65f fix: raytracing heapless missing allocation
Related-to: NEO-12737

Signed-off-by: Damian Tomczak <damian.tomczak@intel.com>
2025-03-06 17:26:09 +01:00
Brandon Yates
64b027f71c feature: Add gfxCoreHelper for StateSip required
Related-to: NEO-12967

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2025-03-05 20:24:17 +01:00
Zbigniew Zdanowicz
ae1eb076b7 feature: add optional epilogue to flush task method
Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-03-05 15:15:59 +01:00
Zbigniew Zdanowicz
27d7d72033 feature: add pipeline state management for append command list operation
- command list append state is managed from internal queue and can be skipped
- initial state configuration should be processed by both kernel and non-kernel
- only kernel operation can process required state, as non-kernel cannot change

Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-03-04 20:43:22 +01:00
Szymon Morek
ff4da67979 fix: signal notify field before KMD wait
Related-To: NEO-13870

Currently, all monitor fences trigger an
interrupt because of the Notify Enable field.
With this change, that field is programmed
right before the KMD wait.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-03-04 15:19:16 +01:00
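
The message of ff4da67979 above says the Notify Enable field moves from every monitor fence to the point right before a KMD wait. A minimal sketch of that idea follows, assuming a simplified fence record and a stream that collects fences. `MonitorFenceSketch`, `FenceStreamSketch` and `prepareKmdWait` are hypothetical names for illustration and are not taken from the NEO sources.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified fence record (not the NEO command layout).
struct MonitorFenceSketch {
    unsigned taskCount;
    bool notifyEnable; // true means the fence requests an interrupt
};

// Hypothetical stream that records the fences it was asked to emit.
class FenceStreamSketch {
  public:
    // Regular fence after a submission: no interrupt is requested.
    void flushMonitorFence(unsigned taskCount) {
        fences.push_back({taskCount, false});
    }

    // Assumed to be called only when the caller is about to block in the
    // kernel-mode driver, so this is the one fence that requests an interrupt.
    void prepareKmdWait(unsigned taskCount) {
        fences.push_back({taskCount, true});
    }

    // Number of recorded fences that request an interrupt.
    std::size_t interruptsRequested() const {
        std::size_t count = 0;
        for (const auto &fence : fences) {
            if (fence.notifyEnable) {
                ++count;
            }
        }
        return count;
    }

    std::vector<MonitorFenceSketch> fences;
};
```

In this sketch, three regular flushes followed by one `prepareKmdWait` record four fences, of which only the last requests an interrupt.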
Zbigniew Zdanowicz
db99c25c79 feature: add support to dispatch epilogue commands into dedicated stream
Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-03-03 14:39:34 +01:00
Zbigniew Zdanowicz
08b13750a1 fix: set stall cmd flag for bcs flush task count flag
Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-02-28 12:24:51 +01:00
Zbigniew Zdanowicz
cae3bb1d0a feature: add internal interfaces to manage all dispatch models of command lists
- add new enum type for command list flush from immediate
- add new argument for flushing immediate command list - regular command list
- add capability to provide additional stream for epilogue commands
- add pointer to provide external csr mutex to lock both execution and flush

Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-02-28 12:02:22 +01:00
Lukasz Jobczyk
be946ae56c performance: Optimize make resident for ULLS light
Do not check if ULLS light is active during every Csr::makeResident
call. Store that information once during ULLS init.

Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-24 12:59:56 +01:00
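
The optimization described in be946ae56c above (query the ULLS light state once at init instead of on every `Csr::makeResident` call) can be sketched roughly as below. This is an illustration under assumptions, not the NEO code: `DirectSubmissionSketch`, `CsrSketch` and all of their members are hypothetical stand-ins.

```cpp
#include <cassert>

// Hypothetical stand-in for a direct-submission (ULLS) controller.
struct DirectSubmissionSketch {
    bool ullsLight = false;
    mutable int queries = 0; // counts how often the controller is asked

    bool isUllsLightActive() const {
        ++queries;
        return ullsLight;
    }
};

// Hypothetical stand-in for a command stream receiver (CSR).
class CsrSketch {
  public:
    explicit CsrSketch(const DirectSubmissionSketch &ds) : directSubmission(ds) {}

    // Called once during ULLS init: ask the controller and cache the answer.
    void initDirectSubmission() {
        ullsLightActive = directSubmission.isUllsLightActive();
    }

    // Hot path: reads only the cached flag and never queries the controller.
    void makeResident(int allocationId) {
        if (ullsLightActive) {
            lastLightResidentId = allocationId;
        }
        ++residentCalls;
    }

    int residentCalls = 0;
    int lastLightResidentId = -1;

  private:
    const DirectSubmissionSketch &directSubmission;
    bool ullsLightActive = false;
};
```

The trade-off in this sketch is that the cached flag is assumed not to change after init. If the ULLS light state could flip at runtime, the cache would need to be refreshed at that point.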
Filip Hazubski
b60c02d597 fix: Add asserts to ensure NonCopyable and NonMovable n/n
Related-To: NEO-14068

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-19 11:36:24 +01:00
Filip Hazubski
4be1153253 fix: Remove pragma once from inl files
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-18 20:19:15 +01:00
Filip Hazubski
6b6202446b fix: Add asserts to ensure NonCopyable and NonMovable 3/n
Related-To: NEO-14068

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-18 17:16:03 +01:00
Bartosz Dunajski
c1f2ff1ad6 fix: disable batched dispatch mode in aub csr
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2025-02-18 13:45:15 +01:00
Jack Myers
c26d24e555 fix: tbx page fault manager hang issue
- Updated `isAllocTbxFaultable` to exclude `gpuTimestampDeviceBuffer` from being
faultable.
- Replaced `SpinLock` with `RecursiveSpinLock` in `CpuPageFaultManager` and
`TbxPageFaultManager` to allow recursive locking.
- Added unit tests to verify the correct handling of `gpuTimestampDeviceBuffer`
in `TbxCommandStreamTests`.

Related-To: NEO-13748
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-02-18 05:05:38 +01:00
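
Commit c26d24e555 above replaces `SpinLock` with `RecursiveSpinLock` so that the same thread can take the lock again while already holding it. The sketch below shows one common way to build such a lock. It is not the NEO implementation: `RecursiveSpinLockSketch`, `FaultManagerSketch` and the nested call path are assumptions made for illustration.

```cpp
#include <algorithm>
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical recursive spin lock: the owning thread may lock it repeatedly.
class RecursiveSpinLockSketch {
  public:
    void lock() {
        const std::thread::id self = std::this_thread::get_id();
        if (owner.load(std::memory_order_acquire) == self) {
            ++depth; // re-entry by the current owner, no spinning needed
            return;
        }
        while (locked.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield(); // another thread holds the lock
        }
        owner.store(self, std::memory_order_release);
        depth = 1;
    }

    void unlock() {
        if (--depth == 0) {
            owner.store(std::thread::id{}, std::memory_order_release);
            locked.store(false, std::memory_order_release);
        }
    }

    // Only meaningful when read by the thread that holds the lock.
    int recursionDepth() const { return depth; }

  private:
    std::atomic<bool> locked{false};
    std::atomic<std::thread::id> owner{std::thread::id{}};
    int depth = 0; // touched only by the thread that owns the lock
};

// Hypothetical manager where a fault handler calls back into a locked method.
class FaultManagerSketch {
  public:
    void insertAllocation(int id) {
        std::lock_guard<RecursiveSpinLockSketch> guard(mtx);
        lastInserted = id;
        maxDepthSeen = std::max(maxDepthSeen, mtx.recursionDepth());
    }

    void handleFault(int id) {
        std::lock_guard<RecursiveSpinLockSketch> guard(mtx);
        insertAllocation(id); // would spin forever on a non-recursive spin lock
    }

    int currentDepth() const { return mtx.recursionDepth(); }

    int lastInserted = -1;
    int maxDepthSeen = 0;

  private:
    RecursiveSpinLockSketch mtx;
};
```

With a plain spin lock, the nested `insertAllocation` call inside `handleFault` would never acquire the lock, which is the kind of hang a recursive lock avoids.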
Filip Hazubski
4c7900008f refactor: Change wording from NonCopyableOrMovable to NonCopyableAndNonMovable
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-17 14:19:10 +01:00
Lukasz Jobczyk
356d89d608 performance: Disable USM cleaner for ULLS light
Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-14 12:38:16 +01:00
Lukasz Jobczyk
c7c7ae9d49 refactor: Remove redundancy around gemCloseWorker in csr
Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-13 21:42:28 +01:00
Lukasz Jobczyk
bc2b49b958 feature: Introduce ULLS light
Add core implementation of ULLS without VM_BIND interface aka ULLS
light.

Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-12 17:52:02 +01:00
Compute-Runtime-Validation
116f7270be Revert "fix: tbx page fault manager hang issue"
This reverts commit 7d4e70a25b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-02-12 10:38:05 +01:00
Jack Myers
7d4e70a25b fix: tbx page fault manager hang issue
- Updated `isAllocTbxFaultable` to exclude `gpuTimestampDeviceBuffer` from being
faultable.
- Replaced `SpinLock` with `RecursiveSpinLock` in `CpuPageFaultManager` and
`TbxPageFaultManager` to allow recursive locking.
- Added unit tests to verify the correct handling of `gpuTimestampDeviceBuffer`
in `TbxCommandStreamTests`.

Related-To: NEO-13748
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-02-12 02:19:37 +01:00
Michał Pryba
9119a1e802 refactor: adjust file names after pre-gen12 removal 6/n
Related-To: NEO-12681
Signed-off-by: Michał Pryba <michal.pryba@intel.com>
2025-02-06 14:00:01 +01:00
Michał Pryba
2cdd9f46cd refactor: adjust file names after pre-gen12 removal 5/n
Related-To: NEO-12681
Signed-off-by: Michał Pryba <michal.pryba@intel.com>
2025-02-06 08:24:40 +01:00
Michał Pryba
75bc74089b refactor: adjust file names after pre-gen12 removal 2/3
Related-To: NEO-12681
Signed-off-by: Michał Pryba <michal.pryba@intel.com>
2025-02-03 15:31:51 +01:00
Szymon Morek
254e7c5c6a fix: set notify enable flag when flushing monitor fence
Related-To: NEO-13848

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-01-24 22:19:24 +01:00
Bartosz Dunajski
c2dbdb6797 refactor: move blit post sync data to BlitProperties
Related-To: NEO-13003

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2025-01-22 18:35:21 +01:00
Jack Myers
d62122a656 fix: exceptions to TBX faultable types
This commit addresses a bug in the previous implementation where almost all once
writable types, except `gpuTimestampBuffers`, were incorrectly enabled for TBX
faultable checks. The fix ensures that only the subset of once writable
types that are also lockable are considered TBX faultable, using the lockable
check to avoid manual exceptions and re-inventing the wheel.

Changes:

- Updated `isAllocTbxFaultable` method to check if the allocation type is
lockable in addition to being once writable.
- Refactored unit tests to include separate checks for lockable and non-lockable
allocation types.

Performance optimization:

- Removed unnecessary memory data erasure in `handlePageFault` to avoid constant
erase/insert operations, leveraging the O(1) search time of unordered maps.

Related-To: NEO-12319
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-17 00:52:49 +01:00
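
The two changes described in d62122a656 above can be illustrated with a small sketch: an allocation is treated as TBX faultable only if it is both once writable and lockable, and the fault handler looks an entry up in an unordered map and updates it in place rather than erasing and re-inserting it. All names below (`AllocTraitsSketch`, `isAllocTbxFaultableSketch`, `TbxFaultTableSketch`) are hypothetical. The real `isAllocTbxFaultable` works on NEO allocation types that are not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical per-type traits; in NEO these come from the allocation type.
struct AllocTraitsSketch {
    bool onceWritable;
    bool lockable;
};

// Faultable only when the type is once writable AND lockable.
bool isAllocTbxFaultableSketch(const AllocTraitsSketch &traits) {
    return traits.onceWritable && traits.lockable;
}

// Hypothetical bookkeeping kept per tracked host pointer.
struct PageFaultDataSketch {
    std::size_t size;
    int faults;
};

class TbxFaultTableSketch {
  public:
    void insert(const void *ptr, std::size_t size) {
        table[ptr] = PageFaultDataSketch{size, 0};
    }

    // Looks the entry up (average O(1)) and updates it in place.
    // The entry stays in the map, so no erase/insert pair is needed.
    bool handlePageFault(const void *ptr) {
        auto it = table.find(ptr);
        if (it == table.end()) {
            return false; // not a tracked allocation
        }
        ++it->second.faults;
        return true;
    }

    std::size_t size() const { return table.size(); }

    int faultCount(const void *ptr) const {
        auto it = table.find(ptr);
        return it == table.end() ? 0 : it->second.faults;
    }

  private:
    std::unordered_map<const void *, PageFaultDataSketch> table;
};
```

Deriving the faultable set from the lockable check means that any type which is once writable but not lockable is excluded without a hand-written exception list.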
Jack Myers
0b2ac4d331 feature: Tbx faults for all once writable types
Patch #34223 introduced the TbxPageFaultManager for handling
uploads/downloads of host buffers to the Tbx server, ensuring
host memory is kept consistent between the host and device,
even after multiple alternating writes from the host and gpu.

This patch enables fault handling for all `isAubOnceWritable`
types.

Minor exception for gpuTimestampBuffers as enabling this type
seems to break things in real-world use cases outside of ULTs.

Related-To: NEO-12319
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-16 01:43:19 +01:00
Krzysztof Sprzaczkowski
a17745532c performance: Move preemption allocation init to the first submission
Related-To: NEO-12323
Signed-off-by: Krzysztof Sprzaczkowski <krzysztof.sprzaczkowski@intel.com>
2025-01-15 20:22:50 +01:00
Compute-Runtime-Validation
af031ee0e3 Revert "performance: align structures for 64-bit platforms"
This reverts commit 9f07f56f7f.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-01-15 09:02:01 +01:00
Vysochyn, Illia
ca72dff1ab feature: Add missing pipelined EU thread arbitration on Xe3
Related-To: NEO-13682

Signed-off-by: Vysochyn, Illia <illia.vysochyn@intel.com>
2025-01-15 08:24:43 +01:00