Commit Graph

225 Commits

Author SHA1 Message Date
John Falkowski 8e59ac7576 refactor: Gate shared system mem caps with KMD cap
Enabled only when the EnableSharedSystemUsmSupport=1 flag is set

Related-To: NEO-12988

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2025-05-07 06:38:01 +02:00
Compute-Runtime-Validation d477935ab9 Revert "refactor: Gate shared system mem caps with KMD cap"
This reverts commit f38fae3b18.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-05-06 12:43:16 +02:00
John Falkowski f38fae3b18 refactor: Gate shared system mem caps with KMD cap
Related-To: NEO-12988

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2025-05-06 07:28:59 +02:00
Brandon Yates 4651e72b0b fix: Fail device init if kernel debugging is misconfigured
Also print error to stderr

Related-to: GSD-10780

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2025-04-02 21:06:30 +02:00
Compute-Runtime-Validation a89113fa1a Revert "fix: Fail device init if kernel debugging is misconfigured"
This reverts commit c122bc51f9.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-03-26 13:27:12 +01:00
Brandon Yates c122bc51f9 fix: Fail device init if kernel debugging is misconfigured
Also print error to stderr

Related-to: GSD-10780

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2025-03-25 20:40:25 +01:00
John Falkowski 4d281cf51d feature: Implement appendMemoryPrefetch for Shared System USM allocations
Related-To: NEO-12989

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2025-03-13 06:26:38 +01:00
Compute-Runtime-Validation fa2e3adad3 Revert "feature: Implement appendMemoryPrefetch for Shared System USM Allocat...
This reverts commit 97799b3faf.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-03-12 05:55:32 +01:00
John Falkowski 97799b3faf feature: Implement appendMemoryPrefetch for Shared System USM Allocations
Related-To: NEO-12989

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2025-03-11 09:12:48 +01:00
Compute-Runtime-Validation 6ee39ed94c Revert "fix: Fail device init if kernel debugging is misconfigured"
This reverts commit b0c92ea425.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-03-10 12:23:07 +01:00
Brandon Yates b0c92ea425 fix: Fail device init if kernel debugging is misconfigured
Also print error to stderr

Related-to: GSD-10780

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2025-03-08 05:02:42 +01:00
Maciej Bielski 6924a48ca6 refactor: prepare CLOS logic for extension
Prepare the cache setup and reservation logic to be extended to other
cache levels.

Conceptually, this change is like adding a switch statement in several
places, with the existing code as the single case. The larger
development was split to ease review, and further cases will be added
in following steps. This approach can produce code that seems
redundant, but it is meant to make the following extensions easy to
plug in.

Related-To: NEO-12837
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2025-02-17 10:43:08 +01:00
Chandio, Bibrak Qamar 7149743162 fix: Set vmbind user fence when makeMemoryResident
Related-To: NEO-11977, GSD-10293

Signed-off-by: Chandio, Bibrak Qamar <bibrak.qamar.chandio@intel.com>
2025-02-10 14:20:09 +01:00
Mateusz Hoppe 6e35d055f2 feature: make contextGroupSize dependent on number of processes
Related-To: NEO-12952

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2025-02-03 18:13:33 +01:00
Compute-Runtime-Validation d23249b061 Revert "fix: Set vmbind user fence when makeMemoryResident"
This reverts commit 80dc4fb43a.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-01-31 11:36:29 +01:00
Chandio, Bibrak Qamar 80dc4fb43a fix: Set vmbind user fence when makeMemoryResident
Related-To: NEO-11977, GSD-10293

Signed-off-by: Chandio, Bibrak Qamar <bibrak.qamar.chandio@intel.com>
2025-01-28 22:04:37 +01:00
John Falkowski e11e7b9b94 feature: Add shared System USM Allocation in support of appendLaunchKernel
Related-To: NEO-12988

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2025-01-24 23:41:26 +01:00
Mateusz Jablonski 270570c5d3 refactor: move i915 specific logic to ioctl helper i915
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-12-18 09:44:30 +01:00
Maciej Bielski 1fafd44af5 refactor: use level-specific name for CacheInfo instances
Related-To: NEO-12837

Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2024-12-10 11:45:11 +01:00
Slawomir Milczarek ec66d9e82d feature: Add getter to query drm about device node
Related-To: NEO-11817

Signed-off-by: Slawomir Milczarek <slawomir.milczarek@intel.com>
2024-10-14 17:58:24 +02:00
Wojciech Konior 6b40f9bc5a refactor: engineInstancedType removed
Related-To: NEO-12594

Signed-off-by: Wojciech Konior <wojciech.konior@intel.com>
2024-10-09 16:30:48 +02:00
Mateusz Jablonski a168bf2f33 fix: ensure drm topology is queried only once
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-26 14:56:41 +02:00
Mateusz Jablonski 7f6c6c6bb9 fix: ensure system info is queried only once
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-26 12:42:27 +02:00
Mateusz Jablonski 3d3dff8dc2 fix: ensure engine info is queried only once
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-25 15:49:37 +02:00
Mateusz Jablonski ef1075a06a fix: ensure memory info is queried only once
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-25 15:01:43 +02:00
Young Jin Yoon 06faaab5bb refactor: read scratch page options during init
Change scratch page logic to initialize during Drm::create.

Related-To: GSD-7742
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-15 08:56:14 +02:00
Young Jin Yoon 2c488d9e84 fix: check reset status after completion
Added logic to check the reset status after completion, to make
sure the check runs at least once

Related-To: GSD-8902
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-14 06:38:07 +02:00
Bartosz Dunajski e5882e0d31 feature: pass GraphicsAllocation to fence wait
Related-To: NEO-8179

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-05-07 17:59:28 +02:00
Young Jin Yoon 07aa53fd87 fix: disable scratch page by default only on PVC
Disabled scratch page by default only on PVC, using productHelper.

Related-To: GSD-7742
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-01 23:44:48 +02:00
Bartosz Dunajski 806da85ec6 refactor: prework to pass interrupt hint
Related-To: NEO-8179

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-04-29 11:14:53 +02:00
Bartosz Dunajski 8e5f9e72c8 refactor: simplify waiting for fence logic
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-04-25 22:36:19 +02:00
Young Jin Yoon 907129bb33 feature: disable scratch page by default
Changed the default values of disableScratch and gpuPageFault
to true and 10 respectively in drm_neo.cpp, in order to
disable scratch pages by default.
When the scratch page is not disabled, gpuPageFault now defaults
to 0.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-04-04 09:50:02 +02:00
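
The defaults described in commit 907129bb33 can be sketched roughly as below. This is only an illustration of the rule stated in the message, not the code from drm_neo.cpp: the struct, the function name, and the convention that a negative override means "not set" are all assumptions; only disableScratch and gpuPageFault come from the commit text.

```cpp
#include <cstdint>

// Hypothetical type: only the two field names appear in the commit message.
struct ScratchPageSettings {
    bool disableScratch;
    std::uint32_t gpuPageFault;
};

// Hypothetical helper. Assumption: a negative override means "not set".
constexpr ScratchPageSettings resolveScratchPageSettings(int disableScratchOverride,
                                                         int gpuPageFaultOverride) {
    // Default from the commit message: scratch page disabled.
    const bool disableScratch =
        (disableScratchOverride < 0) ? true : (disableScratchOverride != 0);
    // The gpuPageFault default depends on the scratch setting:
    // 10 when the scratch page is disabled, 0 otherwise.
    const std::uint32_t defaultGpuPageFault = disableScratch ? 10u : 0u;
    const std::uint32_t gpuPageFault =
        (gpuPageFaultOverride < 0) ? defaultGpuPageFault
                                   : static_cast<std::uint32_t>(gpuPageFaultOverride);
    return {disableScratch, gpuPageFault};
}
```

With no overrides the sketch yields disableScratch = true and gpuPageFault = 10; explicitly re-enabling the scratch page drops the gpuPageFault default to 0.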
Mateusz Jablonski 420e1391b2 fix: handle not aligned gtt size reported by i915
When i915 reports a GTT size between 47 and 48 bits, treat
it as a 48-bit VA space.

Related-To: GSD-8215
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-29 07:51:06 +01:00
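
The rounding rule from commit 420e1391b2 could look roughly like the sketch below. The function name and the fallback for other sizes are assumptions made for illustration, and the exact boundary handling (whether 2^47 itself counts as "between") is not stated in the commit message.

```cpp
#include <cstdint>

// Hypothetical helper, not taken from the NEO sources: maps a GTT size
// reported by the kernel driver to the number of VA bits to assume.
constexpr std::uint32_t addressBitsForGttSize(std::uint64_t gttSize) {
    constexpr std::uint64_t size47Bit = 1ull << 47;
    constexpr std::uint64_t size48Bit = 1ull << 48;
    // A size above 2^47 and up to 2^48, aligned or not, is treated as 48-bit.
    if (gttSize > size47Bit && gttSize <= size48Bit) {
        return 48u;
    }
    // Assumed fallback: the smallest bit width whose range covers gttSize.
    std::uint32_t bits = 0u;
    while (bits < 64u && (1ull << bits) < gttSize) {
        ++bits;
    }
    return bits;
}
```

Under these assumptions, a size of 2^47 + 4096 maps to 48 bits, while an exact 2^47 stays at 47.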
Compute-Runtime-Validation e3f50e8aa9 Revert "fix: handle not aligned gtt size reported by i915"
This reverts commit dae901c13f.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-28 12:03:23 +01:00
Mateusz Jablonski dae901c13f fix: handle not aligned gtt size reported by i915
When i915 reports a GTT size between 47 and 48 bits, treat
it as a 48-bit VA space.

Related-To: GSD-8215
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-28 08:46:53 +01:00
Mateusz Jablonski d94be09020 refactor: remove not needed check for exec softpin
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-22 17:30:49 +01:00
Compute-Runtime-Validation 016c234893 Revert "feature: disable scratch page by default"
This reverts commit dab5469f81.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-16 01:52:00 +01:00
Young Jin Yoon dab5469f81 feature: disable scratch page by default
Changed the default values of disableScratch and gpuPageFault
to true and 10 respectively in drm_neo.cpp, in order to
disable scratch pages by default.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 11:44:10 +01:00
Young Jin Yoon 9633f49dab fix: make gpuFaultCheckCounter more robust
Modified drm_neo.h and drm_neo.cpp to compare with greater than or
equal to instead of equal to, and made gpuFaultCheckCounter
atomic

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 10:40:12 +01:00
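
To illustrate why commit 9633f49dab pairs an atomic counter with a ">=" comparison, here is a minimal sketch. Only the name gpuFaultCheckCounter comes from the commit message; the class, the method, and the reset-to-zero behavior are assumptions, not the actual drm_neo.cpp code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical wrapper around a counter named after the one in the commit.
class FaultCheckGate {
  public:
    explicit FaultCheckGate(std::uint32_t threshold) : threshold(threshold) {
        assert(threshold > 0u);
    }

    // Returns true when the caller should run the expensive fault check.
    bool shouldCheck() {
        // fetch_add returns the previous value, so add one for this call's count.
        const std::uint32_t count = gpuFaultCheckCounter.fetch_add(1u) + 1u;
        // ">=" instead of "==": if concurrent callers push the counter past
        // the threshold before it is reset, the check still fires.
        if (count >= threshold) {
            gpuFaultCheckCounter.store(0u); // assumed reset policy
            return true;
        }
        return false;
    }

  private:
    std::atomic<std::uint32_t> gpuFaultCheckCounter{0u};
    const std::uint32_t threshold;
};
```

With an equality test, two threads incrementing at the same moment could both skip the exact threshold value, so the check would not run until the counter wrapped around. The increment and the reset are still two separate operations here, so under contention more than one caller may run the check; the sketch accepts that rather than missing a check.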
Young Jin Yoon 82728ff394 feature: add logic to iterate for all contexts to check GPU pagefault
Iterate over all contexts in the process and query the reset status
of each to detect unexpected GPU page faults.

Added a new debug variable, GpuFaultCheckThreshold, to control how
often the check runs during hang checks, for performance analysis.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 07:48:39 +01:00
Brandon Yates 76de854a69 feature: Set Debug Attach Available for Xe
Related-to: NEO-8402

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-01-24 09:04:11 +01:00
Jitendra Sharma aa191b6f88 feature: Set runalone mode for contexts with online debugging
Related-To: NEO-9139

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2024-01-17 09:01:30 +01:00
Brandon Yates ba0db2488a refactor: Implement Xe Resource Registration (2/x)
Refactor drm_debug.cpp into IoctlHelper

Related-to: NEO-9161

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-01-11 08:26:29 +01:00
Mateusz Jablonski 138fb65401 refactor: correct naming of enum class constants 11/n
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-19 14:52:57 +01:00
Mateusz Jablonski dd1b9d6abc refactor: correct naming of enum class constants 8/n
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-19 08:18:18 +01:00
Mateusz Jablonski 35c1f34672 refactor: move number of threads per eu to release helper
Related-To: HSD-18034098647
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-11-20 12:16:33 +01:00
John Falkowski f0175b3916 feature: set device allocation chunking as default
Device allocation chunking applies only to multi-tile mode with implicit scaling

Related-To: NEO-9051

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2023-11-07 10:58:17 +01:00
Mateusz Jablonski 4dfa12c8eb fix: add mechanism to detect gpu timestamp overflows
unify naming CpuGpu to GpuCpu

Related-To: NEO-8394
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-19 16:31:06 +02:00
Dunajski, Bartosz 06a02552ce refactor: debug flag to override PAT index for given memory type
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-10-12 15:47:28 +02:00
Mateusz Jablonski 85eafc9e61 fix: query drm info to aligned storages
Query xe topology info into byte-aligned storage,
xe engine info into 2-byte-aligned storage,
system info into 4-byte-aligned storage,
and all other info into 8-byte-aligned storage.

Related-To: NEO-9038
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-06 16:11:40 +02:00
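
One way to obtain the aligned storage that commit 85eafc9e61 describes is to back each query buffer with a vector of suitably sized elements. The helpers below are hypothetical and are not claimed to match the NEO implementation; they only show how a byte count maps to an element count for a given alignment.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: number of Element-sized slots needed to hold
// sizeInBytes bytes, rounding up.
template <typename Element>
constexpr std::size_t elementCountFor(std::size_t sizeInBytes) {
    return (sizeInBytes + sizeof(Element) - 1) / sizeof(Element);
}

// Hypothetical helper: zero-initialized storage of at least sizeInBytes
// bytes whose start is aligned to alignof(Element).
template <typename Element>
std::vector<Element> makeAlignedStorage(std::size_t sizeInBytes) {
    return std::vector<Element>(elementCountFor<Element>(sizeInBytes));
}
```

For example, `makeAlignedStorage<std::uint64_t>(10)` holds two 8-byte elements, so a 10-byte query result fits and starts on an 8-byte boundary, whereas `makeAlignedStorage<std::uint16_t>(10)` would only guarantee 2-byte alignment.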