Commit Graph

456 Commits

Author SHA1 Message Date
Young Jin Yoon
e204d27190 fix: print to stdout for disable scratch page
Print error messages to stdout when the disable scratch page option
is used.

Related-To: GSD-7611
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-16 15:05:07 +02:00
Krzysztof Gibala
a70aaa72ed refactor: add debug message about the zero engine info size
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2024-05-15 09:07:40 +02:00
Young Jin Yoon
06faaab5bb refactor: read scratch page options during init
Change scratch page logic to initialize during Drm::create.

Related-To: GSD-7742
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-15 08:56:14 +02:00
Young Jin Yoon
2c488d9e84 fix: check reset status after completion
Added logic to check the reset status after completion, to make
sure the check runs at least once.

Related-To: GSD-8902
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-14 06:38:07 +02:00
Bartosz Dunajski
e5882e0d31 feature: pass GraphicsAllocation to fence wait
Related-To: NEO-8179

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-05-07 17:59:28 +02:00
Young Jin Yoon
07aa53fd87 fix: disable scratch page by default only on PVC
Disabled scratch page by default only on PVC, using productHelper.

Related-To: GSD-7742
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-01 23:44:48 +02:00
Bartosz Dunajski
806da85ec6 refactor: prework to pass interrupt hint
Related-To: NEO-8179

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-04-29 11:14:53 +02:00
Bartosz Dunajski
2a2596c13b refactor: pass additional data to ioctl helper
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-04-26 14:53:14 +02:00
Bartosz Dunajski
8e5f9e72c8 refactor: simplify waiting for fence logic
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-04-25 22:36:19 +02:00
Mateusz Jablonski
62390d3def feature: add number of l3 banks to TopologyData
Related-To: NEO-11125
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-04-15 11:20:43 +02:00
Mateusz Jablonski
9468915768 fix: correct preemption support in xe path
preemption is always supported by xe kmd

Related-To: NEO-10496, HSD-18037744953
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-04-04 13:29:02 +02:00
Young Jin Yoon
907129bb33 feature: disable scratch page by default
Modified default values for disableScratch and gpuPageFault
to true and 10 respectively in drm_neo.cpp, in order to
disable scratch pages by default.
Modified to set gpuPageFault to 0 as a default value when
scratch page is not disabled.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-04-04 09:50:02 +02:00
Mateusz Jablonski
420e1391b2 fix: handle not aligned gtt size reported by i915
when i915 reports gtt size between 47 and 48 bits we consider
it as 48 bit VA space

Related-To: GSD-8215
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-29 07:51:06 +01:00
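
The rule described in 420e1391b2 can be illustrated with a small sketch. This is not the code from the commit: the function name normalizeGttSize is hypothetical, and treating the range as strictly between 2^47 and 2^48 bytes is an assumption about what "between 47 and 48 bits" means.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the commit): a GTT size that falls strictly
// between 2^47 and 2^48 bytes is treated as a full 48-bit VA space.
// Sizes at or below 2^47, or at or above 2^48, are returned unchanged.
constexpr uint64_t normalizeGttSize(uint64_t reportedSize) {
    constexpr uint64_t size47 = 1ull << 47;
    constexpr uint64_t size48 = 1ull << 48;
    if (reportedSize > size47 && reportedSize < size48) {
        return size48;
    }
    return reportedSize;
}

// Small self-check of the assumed behaviour.
inline void checkNormalizeGttSize() {
    assert(normalizeGttSize((1ull << 47) + 4096) == (1ull << 48)); // unaligned size is rounded up
    assert(normalizeGttSize(1ull << 47) == (1ull << 47));          // exact 47-bit size is kept
    assert(normalizeGttSize(1ull << 48) == (1ull << 48));          // exact 48-bit size is kept
}
```

Rounding up rather than down keeps the address-space width a whole number of bits, which is presumably what later width checks expect.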
Young Jin Yoon
d6a14d4ed5 feature: support explicit memory locking
Added lockMemory in context to explicitly lock memory,
Added a boolean flag in graphics_allocation to indicate the allocation
is locked, and modified memory_operations_handler to add lock().

Related-To: NEO-8277
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-29 07:31:22 +01:00
Compute-Runtime-Validation
e3f50e8aa9 Revert "fix: handle not aligned gtt size reported by i915"
This reverts commit dae901c13f.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-28 12:03:23 +01:00
Mateusz Jablonski
dae901c13f fix: handle not aligned gtt size reported by i915
when i915 reports gtt size between 47 and 48 bits we consider
it as 48 bit VA space

Related-To: GSD-8215
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-28 08:46:53 +01:00
Maciej Plewka
b722f3b579 feature: Add interface to bind resources as readonly
Related-To: NEO-10398
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-03-27 14:24:58 +01:00
Compute-Runtime-Validation
8e44a46983 Revert "feature: bind resources as read only"
This reverts commit f3d36d3350.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-27 08:51:47 +01:00
Maciej Plewka
f3d36d3350 feature: bind resources as read only
Related-to: NEO-10398
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-03-26 14:11:57 +01:00
Young Jin Yoon
068f6a25c6 Revert "feature: support explicit memory locking"
This reverts commit 27a3307bb0.

Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-25 20:10:20 +01:00
Mateusz Jablonski
78a4a92b44 refactor: reorder members to reduce internal padding in structs
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-25 15:50:00 +01:00
Young Jin Yoon
27a3307bb0 feature: support explicit memory locking
Added lockMemory in context to explicitly lock memory,
Added a boolean flag in graphics_allocation to indicate the allocation
is locked, and modified memory_operations_handler to add lock().
Change the logic to work correctly with makeResident() when lock() is
called previously for the same memory region

Related-To: NEO-8277
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-25 09:49:18 +01:00
Mateusz Jablonski
d94be09020 refactor: remove not needed check for exec softpin
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-22 17:30:49 +01:00
Mateusz Jablonski
ec19ce536a refactor: store userptr value in buffer object
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-22 12:49:13 +01:00
Young Jin Yoon
ec009cf9e3 fix: abort only when disabling scratch page
Modified getResetStatus to abort only when scratch page is disabled.
Removed an incorrect UNRECOVERABLE_IF statement based on the status:
validPageFault can be true when banned flag is not set, if CAT error
does not occur as a result of page fault.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-21 21:55:25 +01:00
Mateusz Jablonski
1e343053ba refactor: remove redundant recreating vector of engines in xe kmd path
make ContextParamEngine structure more generic and populate engines
by drm specific methods

Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-21 17:55:39 +01:00
Mateusz Jablonski
0270cd6a5b fix: respect gt id when getting engines for drm context under xe kmd
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-15 16:02:47 +01:00
Young Jin Yoon
9633f49dab fix: make gpuFaultCheckCounter more robust
Modified drm_neo.h and .cpp to check the condition with greater than
or equal to instead of equal to, and changed gpuFaultCheckCounter
to be atomic.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 10:40:12 +01:00
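
The change in 9633f49dab can be illustrated with a minimal sketch. The class FaultCheckGate and its members are hypothetical names for illustration, not the code in drm_neo.cpp; only the two ideas come from the commit message: an atomic counter and a greater-than-or-equal comparison.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical illustration (not the driver's code): decide whether an
// expensive fault check should run on this call, based on a call counter.
class FaultCheckGate {
  public:
    explicit FaultCheckGate(uint32_t limit) : threshold(limit) {}

    // Returns true once the number of calls since the last check has
    // reached the threshold, then restarts the count.
    bool shouldCheck() {
        // fetch_add returns the previous value, so add 1 to get this call's count.
        const uint32_t seen = counter.fetch_add(1u) + 1u;
        // ">=" still fires if concurrent callers push the counter past the
        // threshold; an "==" test could miss that value and never fire.
        if (seen >= threshold) {
            counter.store(0u);
            return true;
        }
        return false;
    }

  private:
    std::atomic<uint32_t> counter{0u};
    uint32_t threshold;
};
```

With a threshold of 3, the third call returns true and the count starts over. Under contention more than one thread may observe a value at or above the threshold before the reset, so this sketch can run the check slightly more often than once per threshold calls; that trade-off is an assumption of the sketch, not a statement about the driver.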
Young Jin Yoon
82728ff394 feature: add logic to iterate for all contexts to check GPU pagefault
Iterate over all contexts in the process and query the reset status
of each one to detect an unexpected GPU page fault.

Added a new debug variable GpuFaultCheckThreshold to change the checking
frequency for each hang check for performance analysis.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 07:48:39 +01:00
Compute-Runtime-Validation
94cc48f81b Revert "fix: don't use fake userptr flag in ioctl helper xe"
This reverts commit d3ab256f55.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-15 03:08:01 +01:00
Compute-Runtime-Validation
e11917cfcd Revert "fix: remove not needed checks in ioctl helper xe"
This reverts commit 5a6d0b21ac.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-14 21:38:09 +01:00
Mateusz Jablonski
d3ab256f55 fix: don't use fake userptr flag in ioctl helper xe
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-14 18:41:17 +01:00
Mateusz Jablonski
5a6d0b21ac fix: remove not needed checks in ioctl helper xe
pass gt id to contextSetParam

Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-14 18:14:50 +01:00
Young Jin Yoon
7b81c4e08f feature: abort when unexpected GPU page fault detected
If ResetStats from i915 is from the GPU page fault, abort
the entire process instead of disabling engines.
Added a fallback mechanism when prelim_drm_i915_reset_stats
fails.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-14 08:14:59 +01:00
Compute-Runtime-Validation
9cce1183cd Revert "feature: use prelim reset_stats for detailed statisics"
This reverts commit 835dc8b594.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-13 14:31:57 +01:00
Aravind Gopalakrishnan
3f20dd3b49 refactor: Add optional user fence during unbind
Add optional fence and wait operations after unbind operation.

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@intel.com>
2024-03-13 12:47:44 +01:00
Young Jin Yoon
835dc8b594 feature: use prelim reset_stats for detailed statisics
Added getResetStats() in ioctl_helper.h to support extended header for
prelim_drm_i915_reset_stats.
Added a new data structure to capture the fault data for prelim.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-13 11:37:04 +01:00
Brandon Yates
7a0d2df2fe fix: Handle Pat Index Ext not supported on Xe
Xe does not support VmBindPatIndexExtension. This patch
fixes the handling of this case and prevents corrupting
other extensions

Related-to: NEO-9674

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-03-06 11:18:31 +01:00
Brandon Yates
fa4b737326 feature: Implement metadata attaching for vm_bind in xe
Related-to: NEO-9674

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-02-28 01:36:20 +01:00
Dominik Dabek
ed011de03e performance: program pat index on mtl linux
Enable programming pat indexes on mtl linux for device buffers.

Change DrmMemoryManager::allocateMemoryByKMD to use gemCreateExt.

Changes currently disabled, can be enabled with flag
DisableGemCreateExtSetPat=0

Related-To: NEO-7896

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-02-15 17:15:28 +01:00
Compute-Runtime-Validation
7b340775c6 Revert "performance: program pat index on mtl linux"
This reverts commit 8e0b23db84.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-02-15 02:06:03 +01:00
Dominik Dabek
8e0b23db84 performance: program pat index on mtl linux
Enable programming pat indexes on mtl linux for device buffers.

Change DrmMemoryManager::allocateMemoryByKMD to use gemCreateExt.

Related-To: NEO-7896

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-02-14 18:42:04 +01:00
Lukasz Jobczyk
486cc71b76 refactor: Add GDI profiling
Resolves: NEO-9236
Related-To: NEO-10036

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-02-07 18:44:11 +01:00
Katarzyna Cencelewska
e6ba9766bd feature: add debug flags to force pat index
for cached resources: OverridePatIndexForCachedTypes
for uncached resources: OverridePatIndexForUncachedTypes

Related-To: NEO-10157

Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2024-02-02 16:11:34 +01:00
Compute-Runtime-Validation
fa9c79fb63 Revert "refactor: Add GDI profiling"
This reverts commit 524ae7713a.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-01-30 10:47:34 +01:00
Mateusz Jablonski
da16dad344 fix: don't limit vm bind support based on platform
Related-To: GSD-7097
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-01-29 19:43:15 +01:00
Lukasz Jobczyk
524ae7713a refactor: Add GDI profiling
Resolves: NEO-9236
Related-To: NEO-10036

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-01-29 11:36:04 +01:00
Brandon Yates
76de854a69 feature: Set Debug Attach Available for Xe
Related-to: NEO-8402

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-01-24 09:04:11 +01:00
Compute-Runtime-Validation
e949ba7144 Revert "refactor: Add GDI profiling"
This reverts commit 8d56f8fb6b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-01-23 06:13:02 +01:00
Lukasz Jobczyk
8d56f8fb6b refactor: Add GDI profiling
Resolves: NEO-9236
Related-To: NEO-10036

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-01-22 14:24:08 +01:00