Commit Graph

475 Commits

Author SHA1 Message Date
Mateusz Hoppe 1ce795c265 refactor: fixes in ults
Related-To: NEO-13789

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2025-01-20 09:57:05 +01:00
Jack Myers d62122a656 fix: exceptions to TBX faultable types
This commit addresses a bug in the previous implementation where almost all once
writable types, except `gpuTimestampBuffers`, were incorrectly enabled for TBX
faultable checks. The fix ensures that only the subset of once writable
types that are also lockable are considered TBX faultable, using the lockable
check to avoid manual exceptions and re-inventing the wheel.

Changes:

- Updated `isAllocTbxFaultable` method to check if the allocation type is
lockable in addition to being once writable.
- Refactored unit tests to include separate checks for lockable and non-lockable
allocation types.

Performance optimization:

- Removed unnecessary memory data erasure in `handlePageFault` to avoid constant
erase/insert operations, leveraging the O(1) search time of unordered maps.

Related-To: NEO-12319
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-17 00:52:49 +01:00
Kamil Kopryk 7d8e08f00b test: adjust code to compile with c++20 2/n
Related-To: NEO-10767
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-01-16 15:24:25 +01:00
Kamil Kopryk 0278d2e652 test: adjust code to compile with c++20
Related-To: NEO-10767
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-01-16 12:04:50 +01:00
Kamil Kopryk c5ba3dd575 test: remove not needed volatile
Related-To: NEO-10767
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-01-16 11:36:45 +01:00
Jack Myers 0b2ac4d331 feature: Tbx faults for all once writable types
Patch #34223 introduced the TbxPageFaultManager for handling
uploads/downloads of host buffers to the Tbx server, ensuring
host memory is kept consistent between the host and device,
even after multiple alternating writes from the host and gpu.

This patch enable fault handling for all `isAubOnceWritable`
types.

Minor exception for gpuTimestampBuffers as enabling this type
seems to break things in real-world use cases outside of ULTs.

Related-To: NEO-12319
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-16 01:43:19 +01:00
Kamil Kopryk 99a7b5a4fb refactor: remove not needed volatile
Related-To: NEO-10767
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-01-16 00:43:10 +01:00
Krzysztof Sprzaczkowski a17745532c performance: Move preemption allocation init to the first submission
Related-To: NEO-12323
Signed-off-by: Krzysztof Sprzaczkowski <krzysztof.sprzaczkowski@intel.com>
2025-01-15 20:22:50 +01:00
Mateusz Jablonski 112abeeeef fix: don't adjust programmed per thread scratch size
when adjusting scratch space size then adjust only allocation size

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-10 11:35:50 +01:00
Mateusz Jablonski a3b6c1fa6d fix: correct thread/eu ratio for scratch to Xe2
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-09 22:42:36 +01:00
Jack Myers 7f9fadc314 fix: regression caused by tbx fault mngr
Addresses regressions from the reverted merge
of the tbx fault manager for host memory.

Recursive locking of mutex caused deadlock.

To fix, separate tbx fault data from base
cpu fault data, allowing separate mutexes
for each, eliminating recursive locks on
the same mutex.

By separating, we also help ensure that tbx-related
changes don't affect the original cpu fault manager code
paths.

As an added safe guard preventing critical regressions
and avoiding another auto-revert, the tbx fault manager
is hidden behind a new debug flag which is disabled by default.

Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-09 07:48:53 +01:00
Mateusz Jablonski 5eece6d578 feature: add enableVariableRegisterSizeAllocation to StateComputeModeProperties
Related-To: NEO-12803

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-02 17:15:18 +01:00
Mateusz Jablonski 165c294590 refactor: extract methods to setup SCM state per context
per context properties are now set explicitly

Related-To: NEO-12803, NEO-13632
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-02 15:19:39 +01:00
Compute-Runtime-Validation ed24c07227 Revert "feature: add enableVariableRegisterSizeAllocation to StateComputeMode...
This reverts commit 9ccecb5a35.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-01-01 02:35:32 +01:00
Vysochyn, Illia f198507875 refactor: Remove 3DSTATE_BTD_BODY structure
Removes 3DSTATE_BTD_BODY as redundant structure.

Related-To: NEO-13147

Signed-off-by: Vysochyn, Illia <illia.vysochyn@intel.com>
2024-12-31 16:27:29 +01:00
Mateusz Jablonski 9ccecb5a35 feature: add enableVariableRegisterSizeAllocation to StateComputeModeProperties
Related-To: NEO-12803

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-12-31 12:28:19 +01:00
Compute-Runtime-Validation 124e755b9d Revert "fix: regression caused by tbx fault mngr"
This reverts commit 9a14fe2478.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-19 17:35:03 +01:00
Jack Myers 9a14fe2478 fix: regression caused by tbx fault mngr
Addresses regressions from the reverted merge
of the tbx fault manager for host memory.

This fixes attempts by the tbx fault manager
to protect/unprotect host buffer memory, even
if the host ptr was not driver-allocated.

In the case of the smoke test that triggered
the critical regression, clCreateBuffer was
called with the CL_MEM_USE_HOST_PTR flag.
The subsequent `mprotect` calls on the
provided host ptr then failed.

Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
2024-12-18 23:16:36 +01:00
Compute-Runtime-Validation 6c5d9a6ed7 Revert "feature: extend TBX page fault manager from CPU implementation"
This reverts commit 51c0e80299.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-12 12:30:22 +01:00
Jack Myers 51c0e80299 feature: extend TBX page fault manager from CPU implementation
In TBX mode, the host could not write to host buffers after access from device
code due to the lack of a migration mechanism post-initial TBX upload.
Migration is unnecessary with real hardware, but required for TBX.

This patch introduces a new page fault manager type that extends the original
CPU fault manager, enabling automatic migration of host buffers in TBX mode.

Refactoring was necessary to avoid diamond inheritance, achieved by using a
template parameter as the base class for OS-specific fault managers.

Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
2024-12-11 09:09:50 +01:00
Bartosz Dunajski 9629ab3cc3 fix: disable fence wait if not supported on given CSR type
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-02 15:32:23 +01:00
Mateusz Jablonski b1f7a3d125 test: remove not used usings/typedefs in shared tests
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-11-29 16:31:25 +01:00
Mateusz Jablonski d4e201db86 test: remove not used usings/typedefs/variables in shared tests
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-11-29 10:46:10 +01:00
Bartosz Dunajski 5e1fa75676 refactor: adjust code to compile with c++20
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-11-29 10:27:29 +01:00
Zbigniew Zdanowicz 6453a5ec31 fix: correct sequence of estimates to get correct size for start command
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-11-27 09:38:39 +01:00
Bartosz Dunajski dab4166837 fix: add missing aub polls on sync points
Related-To: HSD-14023925176

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-11-21 09:17:54 +01:00
Wojciech Konior 6b40f9bc5a refactor: engineInstancedType removed
Related-To: NEO-12594

Signed-off-by: Wojciech Konior <wojciech.konior@intel.com>
2024-10-09 16:30:48 +02:00
Compute-Runtime-Validation ef1b569a85 Revert "performance: Do not create global fence allocation on integrated"
This reverts commit 6bf5183eff.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-10-01 08:14:39 +02:00
Lukasz Jobczyk 6bf5183eff performance: Do not create global fence allocation on integrated
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-09-30 13:13:27 +02:00
Compute-Runtime-Validation 6cb0e45330 Revert "performance: Do not create global fence allocation on integrated"
This reverts commit 50eb6af9ac.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-09-27 11:48:01 +02:00
Lukasz Jobczyk 50eb6af9ac performance: Do not create global fence allocation on integrated
Resolves: NEO-12324

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-09-27 09:32:42 +02:00
Lukasz Jobczyk c93998bcb9 performance: Do not program additional synchronization on integrated
Related-To: NEO-12324

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-09-26 10:54:31 +02:00
Bartosz Dunajski 5b1bd4b088 refactor: dont mix aub and hw wait prints
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-09-26 08:51:23 +02:00
Maciej Plewka 80f75ceace fix: submit dummy exec to pin memory during zeContextMakeMemoryResident call
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-09-23 14:43:59 +02:00
Wojciech Konior c1edd23fd2 test: removing isEngineInstanced from tests
Related-To: NEO-12594

Signed-off-by: Wojciech Konior <wojciech.konior@intel.com>
2024-09-20 15:08:12 +02:00
Mateusz Hoppe 4a068c8eab fix: correclty program StateBaseAddress in global bindless mode
- prepare bindful ssh when kernel requires ssh heap and
SurfaceStateBaseAddress
- remove lastAppendedKernelBindlesMode - local ssh heap may be needed
for bindless kernels with scratch or misaligned buffer args
- use ssh heap gpu address to program SurfaceStateBaseAddress, global base is
used for BindlessSurfaceState and DynamicState

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-09-20 11:57:05 +02:00
Mateusz Jablonski 7e218a5f70 test: simplify IsAtLeastGen12lp and IsAtMostGen12lp matchers
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-09-17 13:42:19 +02:00
Bartosz Dunajski d3d8b5fcc1 fix: inherit work partition allocation from primary root csr
Related-To: NEO-8171

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-09-16 18:45:16 +02:00
Kamil Kopryk 20c4a75171 test: correct expectation in ults if heapless enabled
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2024-09-13 09:57:26 +02:00
Jitendra Sharma f6a89bbc03 fix: initialize debugger before creating engines
Related-To: NEO-12571
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2024-09-12 18:12:12 +02:00
Compute-Runtime-Validation d842f65cf1 Revert "fix: submit dummy exec to pin memory during zeContextMakeMemoryReside...
This reverts commit f9b87d53e6.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-09-05 03:28:03 +02:00
Maciej Plewka f9b87d53e6 fix: submit dummy exec to pin memory during zeContextMakeMemoryResident call
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>

Related-To: NEO-11879
2024-09-04 14:07:29 +02:00
Szymon Morek e6abfafa16 fix: drain paging fence queue before waiting for resources
Related-To: NEO-12197

If ULLS controller waits for CSR lock, and driver must
wait for resources due to OOM, then draing paging fence queue
directly

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-09-03 07:45:25 +02:00
Bartosz Dunajski db611962f7 fix: improve task count handling in tbx download path
Related-To: HSD-18039789178

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-08-28 15:32:15 +02:00
Mateusz Hoppe d9864eca7a feature: add context group support for root device engine
Related-To: NEO-12257

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-08-27 10:07:28 +02:00
Kamil Kopryk fc3646b58c test: correct expectations in shared ults if heapless enabled
Related-To: NEO-10641
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2024-08-26 09:51:19 +02:00
Compute-Runtime-Validation 956dd8e17d Revert "fix: set properly resource params when setAllocationType"
This reverts commit 2e0884a301.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-08-24 05:01:38 +02:00
Katarzyna Cencelewska 2e0884a301 fix: set properly resource params when setAllocationType
gmm params: usage, cachable and resource info
should be set properly when override allocation type

Resolves: HSD-22020344331
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2024-08-23 16:57:23 +02:00
Szymon Morek b8f181d50e performance: remove trim candidate list
Related-To: NEO-11755

Removing trim candidate list reduces overhead
caused by residency handling. Allocations required
for eviction are placed in eviction container managed
by CSR.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-23 12:21:50 +02:00
Bartosz Dunajski 696b02bfd3 fix: improve TBX downloading after L0 Event sync
Related-To: HSD-18038498579

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-08-23 10:42:17 +02:00