Commit Graph

6327 Commits

Author SHA1 Message Date
Szymon Morek b6fc2b5861 performance: enable copy through staging buffers on xe2 linux
Related-To: NEO-13456

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-12-16 11:18:33 +01:00
Andrzej Koska 105a586615 feature: Enable Tile64 Optimization Flag
Related-To: NEO-12134

Signed-off-by: Andrzej Koska <andrzej.koska@intel.com>
2024-12-16 11:09:26 +01:00
Marcel Skierkowski e27a6dc280 refactor: use virtualFileSystem in ULTs
reducing the number of tests that have interactions with filesystem.
writeDataToFile() saves filename and content in std::map.
fileExistsHasSize() checks if file was previously written to virtualFileSystem
loadDataFromVirtualFile() fetches data from std::map based on filename

Related-To: NEO-7006
Signed-off-by: Marcel Skierkowski <marcel.skierkowski@intel.com>
2024-12-13 17:57:07 +01:00
Mateusz Jablonski 8f7bacdd95 feature: add eudebug interface class
eudebug interface is now hidden under EuDebugInterface class
shared code uses generic object and param values

layout of structs is guarded by static asserts

eudebug support is guarded by cmake flags:
- NEO_ENABLE_XE_EU_DEBUG_SUPPORT - enables eudebug in general
- NEO_USE_XE_EU_DEBUG_EXP_UPSTREAM - registers exp upstream uAPI support
- NEO_ENABLE_XE_PRELIM_DETECTION - registers prelim uAPI support

This way we can support two different xe-eudebug interfaces within
single binary.

In unit tests there is mock eudebug interface enabled (even if no
eudebug support is enabled by cmake flag).

Related-To: NEO-13472
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-12-13 17:07:12 +01:00
Mateusz Jablonski 49c028858a fix: optimize bind info in ioctl helper xe
bind info container is needed to distinguish between userptr and handles
with this change it stores only userptrs and their corresponding gpu va

Related-To: NEO-13501
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-12-13 16:07:30 +01:00
Lukasz Jobczyk 093d987e33 performance: Enable timestamp wait for queues on Xe2
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-12-13 09:48:12 +01:00
Maciej Plewka 8151224501 fix: add microsecond resolution for timeout
Related-To: NEO-13445
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-12-13 09:32:18 +01:00
Wenbin Lu c19df80bd8 feature: add key to force GPU status check in event synchronization
Related-To: GSD-10187

Signed-off-by: Wenbin Lu <wenbin.lu@intel.com>
2024-12-12 22:24:36 +01:00
Mateusz Hoppe 0589a70dc7 feature(sysman): reinitialize gfxPartition on reset
Related-To: NEO-13203

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-12-12 16:27:25 +01:00
Damian Tomczak 98331e7d63 feature: is48bResourceNeededForRayTracing specialization for rt encoder
Related-to: NEO-10074

Signed-off-by: Damian Tomczak <damian.tomczak@intel.com>
2024-12-12 13:29:07 +01:00
Compute-Runtime-Validation 6c5d9a6ed7 Revert "feature: extend TBX page fault manager from CPU implementation"
This reverts commit 51c0e80299.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-12 12:30:22 +01:00
Chandio, Bibrak Qamar 8cf4804fcd fix: Overhead in zeDeviceGetGlobalTimestamps
Related-To: NEO-11908

There is overhead when submission method is used for
zeDeviceGetGlobalTimestamps. This fixes it.

Signed-off-by: Chandio, Bibrak Qamar <bibrak.qamar.chandio@intel.com>
2024-12-12 08:54:19 +01:00
Compute-Runtime-Validation 4f12ee40e9 Revert "performance: Enable timestamp wait for queues on Xe2"
This reverts commit f43bfa4cc1.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-12 03:55:22 +01:00
Lukasz Jobczyk f43bfa4cc1 performance: Enable timestamp wait for queues on Xe2
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-12-11 15:05:27 +01:00
Szymon Morek 6c4eb322b1 performance: introduce staging reads from image
Related-To: NEO-12968

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-12-11 14:43:45 +01:00
Lukasz Jobczyk f2725f217e refactor: Introduce debug flags to manipulate event's signal visibility
-Add AbortHostSyncOnNonHostVisibleEvent which abort when waiting for non
host visible event from host
-Add ForceHostSignalScope which forces add or clear of host scope to
event's signal scope

Related-To: NEO-13441

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-12-11 14:07:43 +01:00
Bartosz Dunajski eca3d5a677 feature: debug flag to clear timestamp before submission
Related-To: HSD-18040896547

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-11 13:56:49 +01:00
Szymon Morek fa2ff678fa performance: enable direct submission on BMG linux
Related-To: NEO-13454

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-12-11 11:45:20 +01:00
Lukasz Jobczyk c2093990d4 fix: Flush monitor fence only to context where needed
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-12-11 10:06:43 +01:00
Jack Myers 51c0e80299 feature: extend TBX page fault manager from CPU implementation
In TBX mode, the host could not write to host buffers after access from device
code due to the lack of a migration mechanism post-initial TBX upload.
Migration is unnecessary with real hardware, but required for TBX.

This patch introduces a new page fault manager type that extends the original
CPU fault manager, enabling automatic migration of host buffers in TBX mode.

Refactoring was necessary to avoid diamond inheritance, achieved by using a
template parameter as the base class for OS-specific fault managers.

Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
2024-12-11 09:09:50 +01:00
Compute-Runtime-Validation 924ad580bd Revert "fix: enable scratch pages on xekmd"
This reverts commit 74824f659a.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-11 07:10:27 +01:00
Filip Hazubski 3315db7d92 fix: Correct mutex logic in SVMAllocsManager::freeSVMAllocImpl
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2024-12-10 16:16:53 +01:00
Szymon Morek c3ed6e062a performance: enable timestamp reuse on linux
Related-To: NEO-13456

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-12-10 11:57:45 +01:00
Naklicki, Mateusz 74824f659a fix: enable scratch pages on xekmd
Signed-off-by: Naklicki, Mateusz <mateusz.naklicki@intel.com>
2024-12-10 11:48:09 +01:00
Maciej Bielski 1fafd44af5 refactor: use level-specific name for CacheInfo instances
Related-To: NEO-12837

Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2024-12-10 11:45:11 +01:00
Fabian Zwoliński 5f8e761541 fix: HeapAllocator - ensure getBaseAddress returns initial base address
getBaseAddress was incorrectly returning pLeftBound which changes after
memory allocation.
Added baseAddress field to store and return initial address value.

Signed-off-by: Fabian Zwoliński <fabian.zwolinski@intel.com>
2024-12-10 10:52:36 +01:00
Mateusz Jablonski d78ca38d9b refactor: remove dead code
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-12-09 19:26:00 +01:00
Maciej Bielski c9726dbb10 refactor: simplify tracking CacheRegion reservations
Leverage features of the mechanism to simplify implementation:
- The maximum number of possible cache-region reservations is a small
value known at compile-time
- Each reservation is unique (described by `CacheRegion`) so can have
a dedicated entry with either zero (free) or non-zero (reserved) value

So, there is no need for a dynamic collection (unordered_map here) to
keep track of reservations. A simple array is enough for that purpose.

Also, add some helper-code to enable array-indexing with the values of
`CacheRegion` enum.

Related-To: NEO-12837
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2024-12-09 16:50:28 +01:00
Dominik Dabek 894c74b62d fix: disable indirects detection on non-PVC
Also add ULTs for getting indirect detection version product helper
methods.

Related-To: GSD-10453

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-12-09 15:11:30 +01:00
Maciej Bielski 4467b1e8de feature: specify cache level when reserving a region
Related-To: NEO-12837
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2024-12-09 12:54:48 +01:00
Dunajski, Bartosz 37e81d2a11 feature: new heuristic to enable relaxed ordering 2
Related-To: NEO-13431

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2024-12-09 11:58:42 +01:00
Lukasz Jobczyk 8f671cb6a8 fix: Disable dc flush mitigation
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-12-09 11:48:11 +01:00
Vysochyn, Illia 0b7367ed5f refactor: Update STATE_BASE_ADDRESS
Refactors the STATE_BASE_ADDRESS to align with the latest specification.

Removes redundant functionality for multiple GPU partial writes and
atomics.

Related-To: NEO-13147

Signed-off-by: Vysochyn, Illia <illia.vysochyn@intel.com>
2024-12-09 08:50:59 +01:00
Compute-Runtime-Validation af8ad3aa7a Revert "feature: new heuristic to enable relaxed ordering"
This reverts commit 526f9c5e81.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-08 16:01:32 +01:00
Compute-Runtime-Validation 58e45afd39 Revert "fix: HeapAllocator - ensure getBaseAddress returns initial base address"
This reverts commit ffec97acc5.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-08 14:35:22 +01:00
Compute-Runtime-Validation ff1c5837fa Revert "fix: Disable dc flush mitigation"
This reverts commit 60a6d3875b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-07 06:21:41 +01:00
Bartosz Dunajski 526f9c5e81 feature: new heuristic to enable relaxed ordering
Related-To: GSD-10308

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-06 17:04:39 +01:00
Lukasz Jobczyk 60a6d3875b fix: Disable dc flush mitigation
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-12-06 10:49:53 +01:00
Compute-Runtime-Validation 484210d656 Revert "fix: limit usm device reuse based on used memory"
This reverts commit 1252b10ba9.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-05 23:17:51 +01:00
Fabian Zwoliński ffec97acc5 fix: HeapAllocator - ensure getBaseAddress returns initial base address
getBaseAddress was incorrectly returning pLeftBound which changes after
memory allocation.
Added baseAddress field to store and return initial address value.

Signed-off-by: Fabian Zwoliński <fabian.zwolinski@intel.com>
2024-12-05 20:42:16 +01:00
Fabian Zwoliński d2ce3badfc fix: bindlessHeapsHelper handle unavailable external heap
This PR handles the situation in which a component
has reserved a front window space for itself in the external heap,
so that the Compute Runtime cannot access this area.

In such a situation, we perform the following steps:
1. reserve 4GB chunk in heapStandard
2. split our chunk into 2 parts: heapFrontWindow, heapRegular
3. from this point on, map all linearStream allocations in reserved 4GB
chunk

Patch applies to Windows and WSL.
Patch only applies when the bindless global allocator is enabled.

Related-To: HSD-16025889919
Signed-off-by: Fabian Zwoliński <fabian.zwolinski@intel.com>
2024-12-05 14:18:01 +01:00
Vysochyn, Illia e3bb555f1d refactor: Modify thread group batch size naming
Modifies thread group batch size enumerator naming to follow
the specification.

Related-To: NEO-13147

Signed-off-by: Vysochyn, Illia <illia.vysochyn@intel.com>
2024-12-05 08:48:58 +01:00
Chandio, Bibrak Qamar ab2e831a4a fix: zeDeviceGetGlobalTimestamp to use submisison
Related-To: GSD-10253, GSD-9467, GSD-9381, NEO-11908

When EnableGlobalTimestampViaSubmission is set then
zeDeviceGetGlobalTimestamp uses immediate cmd submission
method to get GPU time.

Signed-off-by: Chandio, Bibrak Qamar <bibrak.qamar.chandio@intel.com>
2024-12-04 19:10:07 +01:00
Dominik Dabek 819ffea90f fix: reenable indirect detection for non-VC, PVC
Issue is limited to detection in VC, can reenable for other kernels.

Related-To: NEO-13372

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-12-04 14:59:02 +01:00
Lukasz Jobczyk d40a804bca performance: Allocate by KMD on BMG
Related-To: NEO-10526

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-12-04 14:41:25 +01:00
Dominik Dabek 1252b10ba9 fix: limit usm device reuse based on used memory
Calculate available memory for usm device reuse based as (total device
memory - used memory) * fraction for reuse.

Use sys mem allocs for devices without local memory.

Related-To: NEO-12902

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-12-04 08:11:23 +01:00
Compute-Runtime-Validation d4bfa0f758 Revert "performance: Allocate by KMD on BMG"
This reverts commit 331fffaeea.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-04 07:03:32 +01:00
Compute-Runtime-Validation b1fa7a0e24 Revert "performance: Enable timestamp wait for queues on Xe2"
This reverts commit 2789c50090.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-04 02:53:31 +01:00
Chodor, Jaroslaw 49e904df74 feature: Parse actual_kernel_start_offset zeinfo entry
This is a deprecated and redundant entry but needs to
be preserved for compatibility reasons.

Related-To: GSD-10402
Signed-off-by: Chodor, Jaroslaw <jaroslaw.chodor@intel.com>
2024-12-03 18:57:06 +01:00
Lukasz Jobczyk 331fffaeea performance: Allocate by KMD on BMG
Related-To: NEO-10526

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-12-03 15:25:56 +01:00