Commit Graph

940 Commits

Author SHA1 Message Date
Lukasz Jobczyk
be946ae56c performance: Optimize make resident for ULLS light
Do not check if ULLS light is active during every Csr::makeResident
call. Store that information once during ULLS init.

Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-24 12:59:56 +01:00
Filip Hazubski
b60c02d597 fix: Add asserts to ensure NonCopyable and NonMovable n/n
Related-To: NEO-14068

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-19 11:36:24 +01:00
Filip Hazubski
4be1153253 fix: Remove pragma once from inl files
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-18 20:19:15 +01:00
Filip Hazubski
6b6202446b fix: Add asserts to ensure NonCopyable and NonMovable 3/n
Related-To: NEO-14068

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-18 17:16:03 +01:00
Bartosz Dunajski
c1f2ff1ad6 fix: disable batched dispatch mode in aub csr
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2025-02-18 13:45:15 +01:00
Jack Myers
c26d24e555 fix: tbx page fault manager hang issue
- Updated `isAllocTbxFaultable` to exclude `gpuTimestampDeviceBuffer` from being
faultable.
- Replaced `SpinLock` with `RecursiveSpinLock` in `CpuPageFaultManager` and
`TbxPageFaultManager` to allow recursive locking.
- Added unit tests to verify the correct handling of `gpuTimestampDeviceBuffer`
in `TbxCommandStreamTests`.

Related-To: NEO-13748
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-02-18 05:05:38 +01:00
Filip Hazubski
4c7900008f refactor: Change wording from NonCopyableOrMovable to NonCopyableAndNonMovable
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2025-02-17 14:19:10 +01:00
Lukasz Jobczyk
356d89d608 performance: Disable USM cleaner for ULLS light
Realted-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-14 12:38:16 +01:00
Lukasz Jobczyk
c7c7ae9d49 refactor: Remove redundancy around gemCloseWorker in csr
Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-13 21:42:28 +01:00
Lukasz Jobczyk
bc2b49b958 feature: Introduce ULLS light
Add core implementation of ULLS without VM_BIND interface aka ULLS
light.

Related-To: NEO-13922

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-02-12 17:52:02 +01:00
Compute-Runtime-Validation
116f7270be Revert "fix: tbx page fault manager hang issue"
This reverts commit 7d4e70a25b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-02-12 10:38:05 +01:00
Jack Myers
7d4e70a25b fix: tbx page fault manager hang issue
- Updated `isAllocTbxFaultable` to exclude `gpuTimestampDeviceBuffer` from being
faultable.
- Replaced `SpinLock` with `RecursiveSpinLock` in `CpuPageFaultManager` and
`TbxPageFaultManager` to allow recursive locking.
- Added unit tests to verify the correct handling of `gpuTimestampDeviceBuffer`
in `TbxCommandStreamTests`.

Related-To: NEO-13748
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-02-12 02:19:37 +01:00
Michał Pryba
9119a1e802 refactor: adjust file names after pre-gen12 removal 6/n
Related-To: NEO-12681
Signed-off-by: Michał Pryba <michal.pryba@intel.com>
2025-02-06 14:00:01 +01:00
Michał Pryba
2cdd9f46cd refactor: adjust file names after pre-gen12 removal 5/n
Related-To: NEO-12681
Signed-off-by: Michał Pryba <michal.pryba@intel.com>
2025-02-06 08:24:40 +01:00
Michał Pryba
75bc74089b refactor: adjust file names after pre-gen12 removal 2/3
Related-To: NEO-12681
Signed-off-by: Michał Pryba <michal.pryba@intel.com>
2025-02-03 15:31:51 +01:00
Szymon Morek
254e7c5c6a fix: set notify enable flag when flushing monitor fence
Related-To: NEO-13848

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-01-24 22:19:24 +01:00
Bartosz Dunajski
c2dbdb6797 refactor: move blit post sync data to BlitProperties
Related-To: NEO-13003

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2025-01-22 18:35:21 +01:00
Jack Myers
d62122a656 fix: exceptions to TBX faultable types
This commit addresses a bug in the previous implementation where almost all once
writable types, except `gpuTimestampBuffers`, were incorrectly enabled for TBX
faultable checks. The fix ensures that only the subset of once writable
types that are also lockable are considered TBX faultable, using the lockable
check to avoid manual exceptions and re-inventing the wheel.

Changes:

- Updated `isAllocTbxFaultable` method to check if the allocation type is
lockable in addition to being once writable.
- Refactored unit tests to include separate checks for lockable and non-lockable
allocation types.

Performance optimization:

- Removed unnecessary memory data erasure in `handlePageFault` to avoid constant
erase/insert operations, leveraging the O(1) search time of unordered maps.

Related-To: NEO-12319
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-17 00:52:49 +01:00
Jack Myers
0b2ac4d331 feature: Tbx faults for all once writable types
Patch #34223 introduced the TbxPageFaultManager for handling
uploads/downloads of host buffers to the Tbx server, ensuring
host memory is kept consistent between the host and device,
even after multiple alternating writes from the host and gpu.

This patch enable fault handling for all `isAubOnceWritable`
types.

Minor exception for gpuTimestampBuffers as enabling this type
seems to break things in real-world use cases outside of ULTs.

Related-To: NEO-12319
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-16 01:43:19 +01:00
Krzysztof Sprzaczkowski
a17745532c performance: Move preemption allocation init to the first submission
Related-To: NEO-12323
Signed-off-by: Krzysztof Sprzaczkowski <krzysztof.sprzaczkowski@intel.com>
2025-01-15 20:22:50 +01:00
Compute-Runtime-Validation
af031ee0e3 Revert "performance: align structures for 64-bit platforms"
This reverts commit 9f07f56f7f.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-01-15 09:02:01 +01:00
Vysochyn, Illia
ca72dff1ab feature: Add missing pipelined EU thread arbitration on Xe3
Related-To: NEO-13682

Signed-off-by: Vysochyn, Illia <illia.vysochyn@intel.com>
2025-01-15 08:24:43 +01:00
Damian Tomczak
9a149b6da5 refactor: useGlobalHeaps naming unification
Related-to: NEO-12737

Signed-off-by: Damian Tomczak <damian.tomczak@intel.com>
2025-01-14 11:01:07 +01:00
Mateusz Jablonski
112abeeeef fix: don't adjust programmed per thread scratch size
when adjusting scratch space size then adjust only allocation size

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-10 11:35:50 +01:00
Mateusz Jablonski
a3b6c1fa6d fix: correct thread/eu ratio for scratch to Xe2
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-09 22:42:36 +01:00
Szymon Morek
f3c9362fc5 fix: check for gpu hang during wait for ring completion
Related-To: NEO-13490

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-01-09 18:44:25 +01:00
Jack Myers
7f9fadc314 fix: regression caused by tbx fault mngr
Addresses regressions from the reverted merge
of the tbx fault manager for host memory.

Recursive locking of mutex caused deadlock.

To fix, separate tbx fault data from base
cpu fault data, allowing separate mutexes
for each, eliminating recursive locks on
the same mutex.

By separating, we also help ensure that tbx-related
changes don't affect the original cpu fault manager code
paths.

As an added safe guard preventing critical regressions
and avoiding another auto-revert, the tbx fault manager
is hidden behind a new debug flag which is disabled by default.

Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-09 07:48:53 +01:00
Semenov Herman (Семенов Герман)
9f07f56f7f performance: align structures for 64-bit platforms
Signed-off-by: Semenov Herman (Семенов Герман) <GermanAizek@yandex.ru>
2025-01-09 06:03:39 +01:00
Mateusz Jablonski
5eece6d578 feature: add enableVariableRegisterSizeAllocation to StateComputeModeProperties
Related-To: NEO-12803

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-02 17:15:18 +01:00
Mateusz Jablonski
165c294590 refactor: extract methods to setup SCM state per context
per context properties are now set explicitly

Related-To: NEO-12803, NEO-13632
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-02 15:19:39 +01:00
Compute-Runtime-Validation
ed24c07227 Revert "feature: add enableVariableRegisterSizeAllocation to StateComputeMode...
This reverts commit 9ccecb5a35.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-01-01 02:35:32 +01:00
Vysochyn, Illia
f198507875 refactor: Remove 3DSTATE_BTD_BODY structure
Removes 3DSTATE_BTD_BODY as redundant structure.

Related-To: NEO-13147

Signed-off-by: Vysochyn, Illia <illia.vysochyn@intel.com>
2024-12-31 16:27:29 +01:00
Mateusz Jablonski
9ccecb5a35 feature: add enableVariableRegisterSizeAllocation to StateComputeModeProperties
Related-To: NEO-12803

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-12-31 12:28:19 +01:00
Compute-Runtime-Validation
124e755b9d Revert "fix: regression caused by tbx fault mngr"
This reverts commit 9a14fe2478.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-19 17:35:03 +01:00
Jack Myers
9a14fe2478 fix: regression caused by tbx fault mngr
Addresses regressions from the reverted merge
of the tbx fault manager for host memory.

This fixes attempts by the tbx fault manager
to protect/unprotect host buffer memory, even
if the host ptr was not driver-allocated.

In the case of the smoke test that triggered
the critical regression, clCreateBuffer was
called with the CL_MEM_USE_HOST_PTR flag.
The subsequent `mprotect` calls on the
provided host ptr then failed.

Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
2024-12-18 23:16:36 +01:00
Bartosz Dunajski
e8cfb38db4 performance: improve relaxed ordering task count tracking
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-18 11:36:22 +01:00
Bartosz Dunajski
b1dea19fbd refactor: move tag initialization to allocator [1/n]
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-17 17:53:13 +01:00
Filip Hazubski
a0cc124b2e performance: Pass RootDeviceIndicesContainer by reference
Additionally pass std::map by reference in UsmMemAllocPoolsManager c-tor.

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2024-12-17 14:18:30 +01:00
Compute-Runtime-Validation
6c5d9a6ed7 Revert "feature: extend TBX page fault manager from CPU implementation"
This reverts commit 51c0e80299.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-12 12:30:22 +01:00
Jack Myers
51c0e80299 feature: extend TBX page fault manager from CPU implementation
In TBX mode, the host could not write to host buffers after access from device
code due to the lack of a migration mechanism post-initial TBX upload.
Migration is unnecessary with real hardware, but required for TBX.

This patch introduces a new page fault manager type that extends the original
CPU fault manager, enabling automatic migration of host buffers in TBX mode.

Refactoring was necessary to avoid diamond inheritance, achieved by using a
template parameter as the base class for OS-specific fault managers.

Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
2024-12-11 09:09:50 +01:00
Dunajski, Bartosz
37e81d2a11 feature: new heuristic to enable relaxed ordering 2
Related-To: NEO-13431

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2024-12-09 11:58:42 +01:00
Compute-Runtime-Validation
af8ad3aa7a Revert "feature: new heuristic to enable relaxed ordering"
This reverts commit 526f9c5e81.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-08 16:01:32 +01:00
Bartosz Dunajski
526f9c5e81 feature: new heuristic to enable relaxed ordering
Related-To: GSD-10308

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-06 17:04:39 +01:00
Bartosz Dunajski
9629ab3cc3 fix: disable fence wait if not supported on given CSR type
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-02 15:32:23 +01:00
Bartosz Dunajski
5e1fa75676 refactor: adjust code to compile with c++20
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-11-29 10:27:29 +01:00
Mateusz Jablonski
fa58073095 refactor: remove not used usings/typedefs/variables
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-11-28 16:19:39 +01:00
Zbigniew Zdanowicz
6453a5ec31 fix: correct sequence of estimates to get correct size for start command
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-11-27 09:38:39 +01:00
Bartosz Dunajski
dab4166837 fix: add missing aub polls on sync points
Related-To: HSD-14023925176

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-11-21 09:17:54 +01:00
Szymon Morek
1f60935930 fix: don't return csr as busy if gpu hang is detected
Related-To: NEO-13071

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-10-30 16:17:38 +01:00
Szymon Morek
01a0b8e7f7 performance: improve ULLS controller timeout detection
Related-To: NEO-12991

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-10-22 13:53:25 +02:00