Commit Graph

905 Commits

Author SHA1 Message Date
Bartosz Dunajski
e8cfb38db4 performance: improve relaxed ordering task count tracking
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-18 11:36:22 +01:00
Bartosz Dunajski
b1dea19fbd refactor: move tag initialization to allocator [1/n]
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-17 17:53:13 +01:00
Filip Hazubski
a0cc124b2e performance: Pass RootDeviceIndicesContainer by reference
Additionally pass std::map by reference in UsmMemAllocPoolsManager c-tor.

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2024-12-17 14:18:30 +01:00
Compute-Runtime-Validation
6c5d9a6ed7 Revert "feature: extend TBX page fault manager from CPU implementation"
This reverts commit 51c0e80299.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-12 12:30:22 +01:00
Jack Myers
51c0e80299 feature: extend TBX page fault manager from CPU implementation
In TBX mode, the host could not write to host buffers after access from device
code due to the lack of a migration mechanism post-initial TBX upload.
Migration is unnecessary with real hardware, but required for TBX.

This patch introduces a new page fault manager type that extends the original
CPU fault manager, enabling automatic migration of host buffers in TBX mode.

Refactoring was necessary to avoid diamond inheritance, achieved by using a
template parameter as the base class for OS-specific fault managers.

Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
2024-12-11 09:09:50 +01:00
Dunajski, Bartosz
37e81d2a11 feature: new heuristic to enable relaxed ordering 2
Related-To: NEO-13431

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2024-12-09 11:58:42 +01:00
Compute-Runtime-Validation
af8ad3aa7a Revert "feature: new heuristic to enable relaxed ordering"
This reverts commit 526f9c5e81.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-08 16:01:32 +01:00
Bartosz Dunajski
526f9c5e81 feature: new heuristic to enable relaxed ordering
Related-To: GSD-10308

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-06 17:04:39 +01:00
Bartosz Dunajski
9629ab3cc3 fix: disable fence wait if not supported on given CSR type
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-02 15:32:23 +01:00
Bartosz Dunajski
5e1fa75676 refactor: adjust code to compile with c++20
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-11-29 10:27:29 +01:00
Mateusz Jablonski
fa58073095 refactor: remove not used usings/typedefs/variables
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-11-28 16:19:39 +01:00
Zbigniew Zdanowicz
6453a5ec31 fix: correct sequence of estimates to get correct size for start command
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-11-27 09:38:39 +01:00
Bartosz Dunajski
dab4166837 fix: add missing aub polls on sync points
Related-To: HSD-14023925176

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-11-21 09:17:54 +01:00
Szymon Morek
1f60935930 fix: don't return csr as busy if gpu hang is detected
Related-To: NEO-13071

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-10-30 16:17:38 +01:00
Szymon Morek
01a0b8e7f7 performance: improve ULLS controller timeout detection
Related-To: NEO-12991

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-10-22 13:53:25 +02:00
Wojciech Konior
6b40f9bc5a refactor: engineInstancedType removed
Related-To: NEO-12594

Signed-off-by: Wojciech Konior <wojciech.konior@intel.com>
2024-10-09 16:30:48 +02:00
Mateusz Jablonski
1f801f412f fix: don't program mid thread preemption for pre-Xe2 platforms
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-10-08 14:16:14 +02:00
Mateusz Jablonski
552930a75f fix: don't setup preemption surface when debugger is active
Related-To: NEO-12878
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-10-08 13:58:11 +02:00
Maciej Plewka
73e4b6ae7c fix: remove w/a which disables wmtp in kernels with ray tracing
Related-To: NEO-12872
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-10-07 14:28:08 +02:00
Szymon Morek
a915ef4b7b fix: Don't program redundant paging fence semaphores
Related-To: NEO-12197

Don't program semaphore to wait for paging fence if it was
already programmed with the same value

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-10-07 11:42:06 +02:00
Andrzej Koska
6abc5eb1a1 fix: using releaseHelper to determine MTP enablement
Related-To: NEO-12466

Signed-off-by: Andrzej Koska <andrzej.koska@intel.com>
2024-10-01 15:06:07 +02:00
Slawomir Milczarek
edeb7bdd4b refactor: Allocate copy source for work partition allocation on heap
No perforamce impact expected since it is initialized once only,
but has the advantage of using custom allocator by overriding malloc.

Related-To: NEO-12846

Signed-off-by: Slawomir Milczarek <slawomir.milczarek@intel.com>
2024-10-01 13:32:55 +02:00
Bartosz Dunajski
5b1bd4b088 refactor: dont mix aub and hw wait prints
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-09-26 08:51:23 +02:00
Maciej Plewka
80f75ceace fix: submit dummy exec to pin memory during zeContextMakeMemoryResident call
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-09-23 14:43:59 +02:00
Mateusz Hoppe
4a068c8eab fix: correclty program StateBaseAddress in global bindless mode
- prepare bindful ssh when kernel requires ssh heap and
SurfaceStateBaseAddress
- remove lastAppendedKernelBindlesMode - local ssh heap may be needed
for bindless kernels with scratch or misaligned buffer args
- use ssh heap gpu address to program SurfaceStateBaseAddress, global base is
used for BindlessSurfaceState and DynamicState

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-09-20 11:57:05 +02:00
Kamil Kopryk
ec5beaf616 refactor: reduce csr class size
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2024-09-18 13:33:55 +02:00
Kamil Kopryk
d2bf3e4431 refactor: remove not needed volatile keywords
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2024-09-18 11:54:38 +02:00
Bartosz Dunajski
d3d8b5fcc1 fix: inherit work partition allocation from primary root csr
Related-To: NEO-8171

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-09-16 18:45:16 +02:00
Mateusz Jablonski
8e7959b243 refactor: remove not needed code
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-09-16 14:55:55 +02:00
Mateusz Jablonski
78604bd475 refactor: remove not needed code
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-09-16 12:12:43 +02:00
Zbigniew Zdanowicz
b7dfc5c1de refactor: remove not used code
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-09-12 13:40:11 +02:00
Bartosz Dunajski
dd8460beba refactor: reduce TBX download timeout for unit tests
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-09-09 19:05:03 +02:00
Lukasz Jobczyk
a54a3bf624 performance: Optimize heap handling when mitigate dc flush
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-09-06 04:33:41 +02:00
Andrzej Koska
b0e7a11e9a refactor: Improving information transfer about the copy engine
Related-To: NEO-11934

Signed-off-by: Andrzej Koska <andrzej.koska@intel.com>
2024-09-05 16:11:52 +02:00
Compute-Runtime-Validation
d842f65cf1 Revert "fix: submit dummy exec to pin memory during zeContextMakeMemoryReside...
This reverts commit f9b87d53e6.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-09-05 03:28:03 +02:00
Maciej Plewka
f9b87d53e6 fix: submit dummy exec to pin memory during zeContextMakeMemoryResident call
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>

Related-To: NEO-11879
2024-09-04 14:07:29 +02:00
Kamil Kopryk
6d7e2760dc refactor: correct expectations in level zero tests if heapless enabled 3/n
Related-To: NEO-10641
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2024-09-03 15:11:59 +02:00
Compute-Runtime-Validation
dc84b163b5 Revert "performance: Optimize heap handling when mitigate dc flush"
This reverts commit 9249c5c65c.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-09-03 08:33:20 +02:00
Szymon Morek
e6abfafa16 fix: drain paging fence queue before waiting for resources
Related-To: NEO-12197

If ULLS controller waits for CSR lock, and driver must
wait for resources due to OOM, then draing paging fence queue
directly

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-09-03 07:45:25 +02:00
Lukasz Jobczyk
9249c5c65c performance: Optimize heap handling when mitigate dc flush
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-09-02 18:12:19 +02:00
Compute-Runtime-Validation
63528e70a7 Revert "performance: Optimize heap handling when mitigate dc flush"
This reverts commit 1a8149e91c.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-08-30 05:59:25 +02:00
Lukasz Jobczyk
1a8149e91c performance: Optimize heap handling when mitigate dc flush
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-08-29 17:03:55 +02:00
Bartosz Dunajski
db611962f7 fix: improve task count handling in tbx download path
Related-To: HSD-18039789178

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-08-28 15:32:15 +02:00
Szymon Morek
b8f181d50e performance: remove trim candidate list
Related-To: NEO-11755

Removing trim candidate list reduces overhead
caused by residency handling. Allocations required
for eviction are placed in eviction container managed
by CSR.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-23 12:21:50 +02:00
Bartosz Dunajski
696b02bfd3 fix: improve TBX downloading after L0 Event sync
Related-To: HSD-18038498579

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-08-23 10:42:17 +02:00
Mateusz Jablonski
7ac41615cd fix: create thread with function pointer
don't create async thread in neo shared tests

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-08-21 18:02:37 +02:00
Aravind Gopalakrishnan
cb8063f71d feature: Append recorded command list into immediate (3/N)
- Use correct stream for dispatch
- Add capability to append signal event
- Check available space globally in immediate append call

Related-To: NEO-10356

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@intel.com>
2024-08-16 17:40:28 +02:00
Compute-Runtime-Validation
9b652f4a34 Revert "feature: Improving information transfer about the copy engine"
This reverts commit 17ffdff4f1.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-08-15 22:06:31 +02:00
Andrzej Koska
17ffdff4f1 feature: Improving information transfer about the copy engine
Related-To: NEO-11934

Signed-off-by: Andrzej Koska <andrzej.koska@intel.com>
2024-08-14 11:28:29 +02:00
Fabian Zwoliński
f4ad45eafd fix: initialize engine in AubMemoryOperationsHandler::makeResident
Signed-off-by: Fabian Zwoliński <fabian.zwolinski@intel.com>
2024-08-13 20:57:57 +02:00