959 Commits

Author SHA1 Message Date
Kamil Kopryk 73795ced64 refactor: add setupTimestampPacketFlushL3 function
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-03-24 13:10:03 +01:00
Lukasz Jobczyk 54cb0e24f8 performance: Switch waitpkg use to tpause for ULLS light
Related-To: NEO-13922, NEO-14336

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-03-21 18:09:37 +01:00
Lukasz Jobczyk 8a85a96ed2 feature: Add 3-level wait scheme with tpause intrinsic
Related-To: NEO-14336

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-03-21 12:12:57 +01:00
Jack Myers 0e25970853 fix: re-add switch case for once writable query
A change related to the tbx fault manager
incorrectly removed a switch case from
`AubHelper::isOneTimeAubWritableAllocationType`.

This fixes that and refactors some APIs to prevent
similar mistakes from happening again by cleaning
up logic.

Addresses a showstopper for pre-si PyTorch workflows.

Resolves: NEO-14399
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-03-19 09:54:54 +01:00
Compute-Runtime-Validation 5f7f0dd785 Revert "performance: Enable waitpkg"
This reverts commit 8ec5434460.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-03-18 13:28:29 +01:00
Maciej Plewka 36fa6d66ae fix: lock csr in stopDirectSubmission if needed
Related-To: NEO-13875, NEO-14143, HSD-16026538384, HSD-16026780358
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2025-03-18 09:30:38 +01:00
Jack Myers 5f78147e16 fix: hotfix for svmcpu tbx uploads
The test program in the linked, related issue
crashes in tbx mode. The tbx server indicated that
invalid memory was uploaded before exit.

Running with debug messages showed that the
problematic upload was an svmcpu buffer when
running neo with separate cpu and gpu
buffers for shared memory management.

Using this info, the problem was narrowed down
to a missing unprotect call in page fault manager
related code, resulting in a protected (invalid)
memory region getting uploaded to tbx.

It is not yet clear why this unprotect call was not made,
since other svmcpu buffers were uploaded without issue.

This hotfix forces the unprotect call in the fault handler,
which allows the test program to run to completion. However,
there is now a failing test case.

Considering the critical nature of the associated
NEO issue and that this patch should unblock
the work depending on the fix, this hotfix should
get merged regardless of the failing test case.

In the meantime, I will continue triaging the
failing test and will implement a proper fix
once the root cause is isolated.

Related-To: NEO-13404
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-03-14 04:47:21 +01:00
Zbigniew Zdanowicz ddc0b0d03b feature: disable flat ring buffer for command list append operation
Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-03-13 06:06:55 +01:00
Lukasz Jobczyk 8ec5434460 performance: Enable waitpkg
Resolves: NEO-14336

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-03-11 14:43:01 +01:00
Zbigniew Zdanowicz cd904269ed fix: request for task count should enable monitor fence dispatch
Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-03-10 18:25:35 +01:00
Lukasz Jobczyk 53062056ec performance: Enable wait pkg for ULLS light
Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-03-07 12:18:38 +01:00
Damian Tomczak 81b0cac65f fix: raytracing heapless missing allocation
Related-to: NEO-12737

Signed-off-by: Damian Tomczak <damian.tomczak@intel.com>
2025-03-06 17:26:09 +01:00
Brandon Yates 64b027f71c feature: Add gfxCoreHelper for StateSip required
Related-to: NEO-12967

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2025-03-05 20:24:17 +01:00
Zbigniew Zdanowicz ae1eb076b7 feature: add optional epilogue to flush task method
Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-03-05 15:15:59 +01:00
Zbigniew Zdanowicz 27d7d72033 feature: add pipeline state management for append command list operation
- command list append state is managed from internal queue and can be skipped
- initial state configuration should be processed by both kernel and non-kernel
- only kernel operation can process required state, as non-kernel cannot change

Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-03-04 20:43:22 +01:00
Szymon Morek ff4da67979 fix: signal notify field before KMD wait
Related-To: NEO-13870

Currently, all monitor fences trigger an
interrupt because of the Notify Enable field.
With this change, the field is programmed
right before the KMD wait.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-03-04 15:19:16 +01:00
Zbigniew Zdanowicz db99c25c79 feature: add support to dispatch epilogue commands into dedicated stream
Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-03-03 14:39:34 +01:00
Zbigniew Zdanowicz 08b13750a1 fix: set stall cmd flag for bcs flush task count flag
Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-02-28 12:24:51 +01:00
Zbigniew Zdanowicz cae3bb1d0a feature: add internal interfaces to manage all dispatch models of command lists
- add new enum type for command list flush from immediate
- add new argument for flushing immediate command list - regular command list
- add capability to provide additional stream for epilogue commands
- add pointer to provide external csr mutex to lock both execution and flush

Related-To: NEO-10356

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2025-02-28 12:02:22 +01:00
Lukasz Jobczyk be946ae56c performance: Optimize make resident for ULLS light
Do not check if ULLS light is active during every Csr::makeResident
call. Store that information once during ULLS init.

Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-24 12:59:56 +01:00
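The idea described in be946ae56c (evaluate a condition once at init and reuse the stored result on a hot path) can be sketched roughly as follows. This is a minimal illustration, not the NEO code: `CsrSketch`, `initDirectSubmission`, and both counters are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a command stream receiver; not the NEO class.
class CsrSketch {
  public:
    // Called once during (assumed) ULLS initialization.
    void initDirectSubmission(bool ullsLightSupported) {
        ullsLightActive = ullsLightSupported; // stored once, never re-queried
    }

    // Hot path: reads the cached flag instead of re-evaluating the condition.
    void makeResident(uint64_t allocationHandle) {
        assert(allocationHandle != 0);
        if (ullsLightActive) {
            ++ullsLightResidentCalls;
        }
        ++totalResidentCalls;
    }

    uint32_t totalResidentCalls = 0;
    uint32_t ullsLightResidentCalls = 0;

  private:
    bool ullsLightActive = false; // cached result of the init-time check
};
```

The trade-off in this sketch is that the cached flag must not go stale; it assumes the ULLS light state cannot change after init.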
Filip Hazubski b60c02d597 fix: Add asserts to ensure NonCopyable and NonMovable n/n
Related-To: NEO-14068

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-19 11:36:24 +01:00
Filip Hazubski 4be1153253 fix: Remove pragma once from inl files
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-18 20:19:15 +01:00
Filip Hazubski 6b6202446b fix: Add asserts to ensure NonCopyable and NonMovable 3/n
Related-To: NEO-14068

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-18 17:16:03 +01:00
Bartosz Dunajski c1f2ff1ad6 fix: disable batched dispatch mode in aub csr
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2025-02-18 13:45:15 +01:00
Jack Myers c26d24e555 fix: tbx page fault manager hang issue
- Updated `isAllocTbxFaultable` to exclude `gpuTimestampDeviceBuffer` from being
faultable.
- Replaced `SpinLock` with `RecursiveSpinLock` in `CpuPageFaultManager` and
`TbxPageFaultManager` to allow recursive locking.
- Added unit tests to verify the correct handling of `gpuTimestampDeviceBuffer`
in `TbxCommandStreamTests`.

Related-To: NEO-13748
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-02-18 05:05:38 +01:00
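For readers unfamiliar with the distinction in c26d24e555, the sketch below shows what makes a spin lock recursive: the owning thread may re-enter without spinning, and the lock is released only when the outermost `unlock` runs. `RecursiveSpinLockSketch` is a hypothetical class written for this example; NEO's `RecursiveSpinLock` may be implemented differently.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical sketch of a recursive spin lock; not the NEO implementation.
class RecursiveSpinLockSketch {
  public:
    void lock() {
        const auto self = std::this_thread::get_id();
        if (owner.load(std::memory_order_acquire) == self) {
            ++depth; // re-entry by the owning thread: no spinning
            return;
        }
        std::thread::id expected{};
        while (!owner.compare_exchange_weak(expected, self,
                                            std::memory_order_acquire,
                                            std::memory_order_relaxed)) {
            expected = std::thread::id{}; // a failed exchange overwrote it
            std::this_thread::yield();
        }
        depth = 1;
    }

    void unlock() {
        assert(owner.load(std::memory_order_relaxed) == std::this_thread::get_id());
        if (--depth == 0) {
            owner.store(std::thread::id{}, std::memory_order_release);
        }
    }

    // Only meaningful when called by the owning thread.
    int recursionDepth() const { return depth; }

  private:
    std::atomic<std::thread::id> owner{}; // default id means "unlocked"
    int depth = 0;                        // touched only by the owner
};
```

With a non-recursive spin lock, the second `lock()` call from the same thread would spin forever, which is the kind of hang a recursive lock avoids.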
Filip Hazubski 4c7900008f refactor: Change wording from NonCopyableOrMovable to NonCopyableAndNonMovable
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-17 14:19:10 +01:00
Lukasz Jobczyk 356d89d608 performance: Disable USM cleaner for ULLS light
Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-14 12:38:16 +01:00
Lukasz Jobczyk c7c7ae9d49 refactor: Remove redundancy around gemCloseWorker in csr
Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-13 21:42:28 +01:00
Lukasz Jobczyk bc2b49b958 feature: Introduce ULLS light
Add the core implementation of ULLS without the VM_BIND interface,
a.k.a. ULLS light.

Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-12 17:52:02 +01:00
Compute-Runtime-Validation 116f7270be Revert "fix: tbx page fault manager hang issue"
This reverts commit 7d4e70a25b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-02-12 10:38:05 +01:00
Jack Myers 7d4e70a25b fix: tbx page fault manager hang issue
- Updated `isAllocTbxFaultable` to exclude `gpuTimestampDeviceBuffer` from being
faultable.
- Replaced `SpinLock` with `RecursiveSpinLock` in `CpuPageFaultManager` and
`TbxPageFaultManager` to allow recursive locking.
- Added unit tests to verify the correct handling of `gpuTimestampDeviceBuffer`
in `TbxCommandStreamTests`.

Related-To: NEO-13748
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-02-12 02:19:37 +01:00
Michał Pryba 9119a1e802 refactor: adjust file names after pre-gen12 removal 6/n
Related-To: NEO-12681
Signed-off-by: Michał Pryba <michal.pryba@intel.com>
2025-02-06 14:00:01 +01:00
Michał Pryba 2cdd9f46cd refactor: adjust file names after pre-gen12 removal 5/n
Related-To: NEO-12681
Signed-off-by: Michał Pryba <michal.pryba@intel.com>
2025-02-06 08:24:40 +01:00
Michał Pryba 75bc74089b refactor: adjust file names after pre-gen12 removal 2/3
Related-To: NEO-12681
Signed-off-by: Michał Pryba <michal.pryba@intel.com>
2025-02-03 15:31:51 +01:00
Szymon Morek 254e7c5c6a fix: set notify enable flag when flushing monitor fence
Related-To: NEO-13848

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-01-24 22:19:24 +01:00
Bartosz Dunajski c2dbdb6797 refactor: move blit post sync data to BlitProperties
Related-To: NEO-13003

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2025-01-22 18:35:21 +01:00
Jack Myers d62122a656 fix: exceptions to TBX faultable types
This commit addresses a bug in the previous implementation where almost all once
writable types, except `gpuTimestampBuffers`, were incorrectly enabled for TBX
faultable checks. The fix ensures that only the subset of once writable
types that are also lockable are considered TBX faultable, using the lockable
check to avoid manual exceptions and re-inventing the wheel.

Changes:

- Updated `isAllocTbxFaultable` method to check if the allocation type is
lockable in addition to being once writable.
- Refactored unit tests to include separate checks for lockable and non-lockable
allocation types.

Performance optimization:

- Removed unnecessary memory data erasure in `handlePageFault` to avoid constant
erase/insert operations, leveraging the O(1) search time of unordered maps.

Related-To: NEO-12319
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-17 00:52:49 +01:00
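The predicate change described in d62122a656 (faultable only if once writable and also lockable) can be illustrated with a small sketch. The enum values and the classification of each value are assumptions made up for this example; the real allocation-type list and the real `isAllocTbxFaultable` signature are not reproduced here.

```cpp
#include <cassert>

// Hypothetical, reduced set of allocation types; names are illustrative only.
enum class AllocTypeSketch {
    onceWritableLockable,
    onceWritableNotLockable,
    rewritableLockable
};

// Assumed classification for the sketch, not NEO's actual list.
constexpr bool isOnceWritable(AllocTypeSketch type) {
    return type == AllocTypeSketch::onceWritableLockable ||
           type == AllocTypeSketch::onceWritableNotLockable;
}

// Assumed classification for the sketch, not NEO's actual list.
constexpr bool isLockable(AllocTypeSketch type) {
    return type == AllocTypeSketch::onceWritableLockable ||
           type == AllocTypeSketch::rewritableLockable;
}

// Faultable only when both checks pass, so no per-type exceptions are needed.
constexpr bool isTbxFaultableSketch(AllocTypeSketch type) {
    return isOnceWritable(type) && isLockable(type);
}
```

Composing two existing checks means a newly added type is classified automatically, rather than relying on a manually maintained exception list.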
Jack Myers 0b2ac4d331 feature: Tbx faults for all once writable types
Patch #34223 introduced the TbxPageFaultManager for handling
uploads/downloads of host buffers to the Tbx server, ensuring
host memory is kept consistent between the host and device,
even after multiple alternating writes from the host and gpu.

This patch enables fault handling for all `isAubOnceWritable`
types.

Minor exception for gpuTimestampBuffers as enabling this type
seems to break things in real-world use cases outside of ULTs.

Related-To: NEO-12319
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-16 01:43:19 +01:00
Krzysztof Sprzaczkowski a17745532c performance: Move preemption allocation init to the first submission
Related-To: NEO-12323
Signed-off-by: Krzysztof Sprzaczkowski <krzysztof.sprzaczkowski@intel.com>
2025-01-15 20:22:50 +01:00
Compute-Runtime-Validation af031ee0e3 Revert "performance: align structures for 64-bit platforms"
This reverts commit 9f07f56f7f.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-01-15 09:02:01 +01:00
Vysochyn, Illia ca72dff1ab feature: Add missing pipelined EU thread arbitration on Xe3
Related-To: NEO-13682

Signed-off-by: Vysochyn, Illia <illia.vysochyn@intel.com>
2025-01-15 08:24:43 +01:00
Damian Tomczak 9a149b6da5 refactor: useGlobalHeaps naming unification
Related-to: NEO-12737

Signed-off-by: Damian Tomczak <damian.tomczak@intel.com>
2025-01-14 11:01:07 +01:00
Mateusz Jablonski 112abeeeef fix: don't adjust programmed per thread scratch size
When adjusting scratch space size, adjust only the allocation size.

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-10 11:35:50 +01:00
Mateusz Jablonski a3b6c1fa6d fix: correct thread/eu ratio for scratch to Xe2
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-09 22:42:36 +01:00
Szymon Morek f3c9362fc5 fix: check for gpu hang during wait for ring completion
Related-To: NEO-13490

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-01-09 18:44:25 +01:00
Jack Myers 7f9fadc314 fix: regression caused by tbx fault mngr
Addresses regressions from the reverted merge
of the tbx fault manager for host memory.

Recursive locking of mutex caused deadlock.

To fix, separate tbx fault data from base
cpu fault data, allowing separate mutexes
for each, eliminating recursive locks on
the same mutex.

By separating, we also help ensure that tbx-related
changes don't affect the original cpu fault manager code
paths.

As an added safeguard preventing critical regressions
and avoiding another auto-revert, the tbx fault manager
is hidden behind a new debug flag which is disabled by default.

Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-09 07:48:53 +01:00
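The fix described in 7f9fadc314 (separate fault data with separate mutexes, so the same mutex is never locked twice by one thread) might look roughly like the sketch below. All class, member, and method names are hypothetical and chosen for illustration; they are not taken from the NEO sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical base manager: owns the cpu fault data and its own mutex.
class CpuFaultManagerSketch {
  public:
    virtual ~CpuFaultManagerSketch() = default;

    void insertCpuAllocation(std::uintptr_t address, std::size_t size) {
        std::lock_guard<std::mutex> guard(cpuMutex);
        cpuFaultData[address] = size;
    }

    std::size_t cpuEntryCount() {
        std::lock_guard<std::mutex> guard(cpuMutex);
        return cpuFaultData.size();
    }

  private:
    std::mutex cpuMutex;
    std::unordered_map<std::uintptr_t, std::size_t> cpuFaultData;
};

// Hypothetical derived manager: tbx data is guarded by a different mutex.
class TbxFaultManagerSketch : public CpuFaultManagerSketch {
  public:
    void insertTbxAllocation(std::uintptr_t address, std::size_t size) {
        std::lock_guard<std::mutex> guard(tbxMutex);
        tbxFaultData[address] = size;
        // Safe while holding tbxMutex: the base call locks cpuMutex,
        // a different mutex, so nothing is locked recursively.
        insertCpuAllocation(address, size);
    }

    std::size_t tbxEntryCount() {
        std::lock_guard<std::mutex> guard(tbxMutex);
        return tbxFaultData.size();
    }

  private:
    std::mutex tbxMutex;
    std::unordered_map<std::uintptr_t, std::size_t> tbxFaultData;
};
```

Had both maps shared one non-recursive `std::mutex`, the nested call in `insertTbxAllocation` would lock it a second time on the same thread, which is undefined behavior and typically deadlocks.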
Semenov Herman (Семенов Герман) 9f07f56f7f performance: align structures for 64-bit platforms
Signed-off-by: Semenov Herman (Семенов Герман) <GermanAizek@yandex.ru>
2025-01-09 06:03:39 +01:00
Mateusz Jablonski 5eece6d578 feature: add enableVariableRegisterSizeAllocation to StateComputeModeProperties
Related-To: NEO-12803

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-02 17:15:18 +01:00
Mateusz Jablonski 165c294590 refactor: extract methods to setup SCM state per context
Per-context properties are now set explicitly.

Related-To: NEO-12803, NEO-13632
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-02 15:19:39 +01:00
Compute-Runtime-Validation ed24c07227 Revert "feature: add enableVariableRegisterSizeAllocation to StateComputeMode...
This reverts commit 9ccecb5a35.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-01-01 02:35:32 +01:00