Commit Graph

1147 Commits

Author SHA1 Message Date
Young Jin Yoon
82728ff394 feature: add logic to iterate for all contexts to check GPU pagefault
Implemented to go through entire contexts in the process and then query
reset status to check the unexpected GPU segfault.

Added a new debug variable GpuFaultCheckThreshold to change the checking
frequency for each hang check for performance analysis.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 07:48:39 +01:00
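
The checking scheme described in 82728ff394 can be sketched roughly as follows. This is an illustration only, not the commit's code: `FaultChecker`, `onHangCheck` and the `queryFaulted` callback are hypothetical names, and reading GpuFaultCheckThreshold as "query every N-th hang check" is an assumption about how the debug variable is applied.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper: walks all contexts of the process and asks a callback
// whether the reset status of a context indicates a GPU page fault.
struct FaultChecker {
    uint32_t threshold;      // stand-in for GpuFaultCheckThreshold (assumed semantics)
    uint32_t hangChecks = 0; // hang checks seen since the last full query

    // Returns true when any context reports a fault. The query is skipped
    // until `threshold` hang checks have accumulated.
    bool onHangCheck(const std::vector<uint32_t> &contextIds,
                     const std::function<bool(uint32_t)> &queryFaulted) {
        if (++hangChecks < threshold) {
            return false; // not yet time to query the reset status
        }
        hangChecks = 0;
        for (uint32_t id : contextIds) {
            if (queryFaulted(id)) {
                return true;
            }
        }
        return false;
    }
};
```

With a threshold of 0 or 1 this sketch queries on every hang check; larger values trade detection latency for fewer queries.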
Compute-Runtime-Validation
94cc48f81b Revert "fix: don't use fake userptr flag in ioctl helper xe"
This reverts commit d3ab256f55.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-15 03:08:01 +01:00
Compute-Runtime-Validation
e11917cfcd Revert "fix: remove not needed checks in ioctl helper xe"
This reverts commit 5a6d0b21ac.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-14 21:38:09 +01:00
Mateusz Jablonski
d3ab256f55 fix: don't use fake userptr flag in ioctl helper xe
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-14 18:41:17 +01:00
Mateusz Jablonski
5a6d0b21ac fix: remove not needed checks in ioctl helper xe
pass gt id to contextSetParam

Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-14 18:14:50 +01:00
Neil R. Spruit
b5f8a38f19 feature: Enable Per IP euStall Functionality
Related-To: NEO-10220

Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
2024-03-14 16:49:52 +01:00
Compute-Runtime-Validation
ef7dbc99f1 Revert "fix: don't use fake userptr flag in ioctl helper xe"
This reverts commit 98824fdaf6.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-14 14:35:14 +01:00
Mateusz Jablonski
833fa6bce1 fix: correct querying engines from xe kmd
we get drm_xe_query_engines, not array of drm_xe_engine_class_instance

Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-14 12:06:25 +01:00
Mateusz Jablonski
98824fdaf6 fix: don't use fake userptr flag in ioctl helper xe
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-14 10:07:38 +01:00
Young Jin Yoon
7b81c4e08f feature: abort when unexpected GPU page fault detected
If ResetStats from i915 is from the GPU page fault, abort
the entire process instead of disabling engines.
Added a fallback mechanism when prelim_drm_i915_reset_stats
fails.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-14 08:14:59 +01:00
Mateusz Jablonski
0210e37f03 fix: respect gt id when finding xe engine info
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-13 20:52:36 +01:00
Francois Dugast
78e55f31b6 fix: Remove unused constant USER_FENCE_VALUE
Related-to: NEO-10321

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
2024-03-13 15:26:26 +01:00
Compute-Runtime-Validation
9cce1183cd Revert "feature: use prelim reset_stats for detailed statisics"
This reverts commit 835dc8b594.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-13 14:31:57 +01:00
Aravind Gopalakrishnan
3f20dd3b49 refactor: Add optional user fence during unbind
Add optional fence and wait operations after unbind operation.

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@intel.com>
2024-03-13 12:47:44 +01:00
Young Jin Yoon
835dc8b594 feature: use prelim reset_stats for detailed statisics
Added getResetStats() in ioctl_helper.h to support extended header for
prelim_drm_i915_reset_stats.
Added new data structure to capture the fault data structure for prelim.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-13 11:37:04 +01:00
Francois Dugast
5483e466e8 fix: Align on strings returned for unknown values
Related-to: NEO-10321

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
2024-03-13 11:21:51 +01:00
Mateusz Jablonski
973757a58d build: enable xe drm detection by default
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-11 14:29:20 +01:00
Young Jin Yoon
65f3e80796 Revert "build: remove static_assert for drm header change"
This reverts commit 219470f60d.

Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-08 09:54:30 +01:00
Bartosz Dunajski
79d80047ef refactor: improve mmap logging logic
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-03-07 12:15:39 +01:00
Bartosz Dunajski
fcd57f94cf refactor: capability to print mmap and munmap calls
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-03-06 14:29:01 +01:00
Brandon Yates
7a0d2df2fe fix: Handle Pat Index Ext not supported on Xe
Xe does not support VmBindPatIndexExtension. This patch
fixes the handling of this case and prevents corrupting
other extensions

Related-to: NEO-9674

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-03-06 11:18:31 +01:00
Young Jin Yoon
bf9805b0bb fix: override reset_stat IOCTL macro for prelim
Modified to return DRM_IOCTL_I915_GET_RESET_STATS of prelim headers
as the macro values used for non-prelim is different from the prelim
value due to sizeof() embedded in _IOWR()

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-02-28 10:09:27 +01:00
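
The reason given in bf9805b0bb, that sizeof() is embedded in _IOWR(), can be shown with a small sketch. The two structs and the 'd'/0x32 type and number values below are made up for illustration and are not the real i915 or prelim definitions; only the `_IOWR`, `_IOC_SIZE` and `_IOC_NR` macros come from the Linux headers.

```cpp
#include <cassert>
#include <cstdint>
#include <linux/ioctl.h> // _IOWR, _IOC_SIZE, _IOC_NR (Linux only)

// Hypothetical argument layouts: a base struct and an extended variant.
struct ResetStatsBase {
    uint32_t ctxId;
    uint32_t flags;
    uint32_t resetCount;
    uint32_t batchActive;
    uint32_t batchPending;
    uint32_t pad;
};

struct ResetStatsExtended {
    ResetStatsBase base;
    uint64_t faultAddress; // extra field changes sizeof()
};

// Same type character and same command number, different argument struct.
constexpr unsigned long requestBase = _IOWR('d', 0x32, ResetStatsBase);
constexpr unsigned long requestExtended = _IOWR('d', 0x32, ResetStatsExtended);

// The encoded size differs, so the request values differ as well.
static_assert(requestBase != requestExtended, "size is part of the request number");
```

Because the struct size is folded into the request value, a macro built from one header cannot be reused with a differently sized struct from another header.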
Brandon Yates
fa4b737326 feature: Implement metadata attaching for vm_bind in xe
Related-to: NEO-9674

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-02-28 01:36:20 +01:00
Young Jin Yoon
219470f60d build: remove static_assert for drm header change
Removed static_assert for reset_stats before updating
drm header to v2.0-r23.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-02-27 17:42:01 +01:00
Dunajski, Bartosz
6cdd2d5dca fix: add missing gt_id when creating XE context
Related-To: GSD-8046

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2024-02-23 16:50:32 +01:00
Brandon Yates
0fa730e524 build: Update debugger uapi headers to latest
Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-02-22 16:07:06 +01:00
Michal Mrozek
27f4eab52f fix: restore previous order of variables
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
Resolves: NEO-10439
2024-02-19 14:13:54 +01:00
Dominik Dabek
0120d8a58d performance: program pat index on mtl linux
Enable programming pat indexes on mtl linux for device buffers.

Change DrmMemoryManager::allocateMemoryByKMD to use gemCreateExt.

Set mmap flags based on coherency.
Map as write back on legacy and coherent.
On non-coherent map as write combined.

Changes currently disabled, to enable use debug keys:
DisableGemCreateExtSetPat=0
UseGemCreateExtInAllocateMemoryByKMD=1

Reorder BufferObject to decrease padding.

Related-To: NEO-7896

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-02-16 17:33:07 +01:00
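
The mapping rule listed in 0120d8a58d (write back on legacy and coherent, write combined on non-coherent) amounts to a small decision function. The sketch below uses hypothetical names (`CachingMode`, `AllocationTraits`, `selectMmapCachingMode`) and assumes "legacy" means the path that does not go through gemCreateExt; the commit does not spell this out.

```cpp
#include <cassert>

// Hypothetical caching modes for a CPU mapping.
enum class CachingMode { writeBack, writeCombined };

// Hypothetical description of an allocation.
struct AllocationTraits {
    bool legacyPath; // assumption: allocation not created via gemCreateExt
    bool coherent;   // allocation is coherent with the CPU caches
};

// Legacy or coherent allocations map as write back,
// non-coherent ones map as write combined.
constexpr CachingMode selectMmapCachingMode(const AllocationTraits &traits) {
    return (traits.legacyPath || traits.coherent) ? CachingMode::writeBack
                                                  : CachingMode::writeCombined;
}
```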
Dominik Dabek
ed011de03e performance: program pat index on mtl linux
Enable programming pat indexes on mtl linux for device buffers.

Change DrmMemoryManager::allocateMemoryByKMD to use gemCreateExt.

Changes currently disabled, can be enabled with flag
DisableGemCreateExtSetPat=0

Related-To: NEO-7896

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-02-15 17:15:28 +01:00
Compute-Runtime-Validation
7b340775c6 Revert "performance: program pat index on mtl linux"
This reverts commit 8e0b23db84.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-02-15 02:06:03 +01:00
Dominik Dabek
8e0b23db84 performance: program pat index on mtl linux
Enable programming pat indexes on mtl linux for device buffers.

Change DrmMemoryManager::allocateMemoryByKMD to use gemCreateExt.

Related-To: NEO-7896

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-02-14 18:42:04 +01:00
Mateusz Jablonski
bb5f6d9660 fix: don't query vm bind support on i915 prelim for pre-Xe platforms
Related-To: HSD-18036843571
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-02-09 12:50:48 +01:00
Brandon Yates
ea7ae7564f feature: Implement read/writeGpuMemory for Xe debugger
- makes prelim read/writeGpuMemory generic
- Implements Xe specific ioctls and fsyncs
- Refactors dbg IoctlHelper to use shared base class
for Xe and i915

Related-to: NEO-9668

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-02-08 22:09:59 +01:00
Lukasz Jobczyk
486cc71b76 refactor: Add GDI profiling
Resolves: NEO-9236
Related-To: NEO-10036

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-02-07 18:44:11 +01:00
Maciej Plewka
ce17580b28 fix: Use Rcs engine in blender on DG2
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-02-07 18:21:54 +01:00
Naklicki, Mateusz
eb0b0c2c89 refactor: add missing xe logs
Signed-off-by: Naklicki, Mateusz <mateusz.naklicki@intel.com>
2024-02-06 13:03:46 +01:00
Kamil Kopryk
a4f7dda98f refactor: Add xe print debug key
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2024-02-02 16:39:51 +01:00
Katarzyna Cencelewska
e6ba9766bd feature: add debug flags to force pat index
for cached resources: OverridePatIndexForCachedTypes
for uncached resources: OverridePatIndexForUncachedTypes

Related-To: NEO-10157

Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2024-02-02 16:11:34 +01:00
Brandon Yates
27c089d60d feature: Register ELF for xe debugger
Related-to: NEO-9674

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-02-02 09:19:19 +01:00
Jitendra Sharma
00b1f1c5b5 fix: set runalone mode in xe only for render and compute
Runalone mode in XE is supported only for RENDER and COMPUTE.

Related-To: NEO-9139

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2024-02-02 09:13:17 +01:00
Francois Dugast
e311ba5597 refactor: Move ownership of engine type to caller of setDefaultEngine
Related-To: GSD-7097

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
2024-02-01 14:20:20 +01:00
Filip Hazubski
d920753ca6 fix: Disable related logic when EnableHostAllocationMemPolicy is not set
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2024-01-31 13:45:52 +01:00
Yoon, Young Jin
cbe35d70a5 fix: initialize libnuma only when flag is set
Modified in memory_info.cpp to initialize libnuma only when
EnableHostAllocationMemPolicy is set.

Related-To: NEO-8276
Signed-off-by: Yoon, Young Jin <young.jin.yoon@intel.com>
2024-01-30 18:27:43 +01:00
Maciej Plewka
564e0f0319 performance: Align host mem to 2MB when range is not limited
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>

Related-To: NEO-10217
2024-01-30 14:43:13 +01:00
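
The 2MB alignment named in 564e0f0319 can be illustrated with the usual power-of-two round-up. This is a sketch under assumptions: `alignHostMemory` is a hypothetical helper, the 4 KiB fallback for the limited-range case is assumed, and the commit does not say whether the address, the size, or both are aligned.

```cpp
#include <cassert>
#include <cstddef>

constexpr std::size_t twoMegabytes = 2u * 1024u * 1024u;
constexpr std::size_t pageSize = 4096u; // assumed fallback alignment

// Rounds value up to the next multiple of alignment (alignment must be a power of two).
constexpr std::size_t alignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Uses 2MB alignment only when the address range is not limited.
constexpr std::size_t alignHostMemory(std::size_t value, bool rangeLimited) {
    return alignUp(value, rangeLimited ? pageSize : twoMegabytes);
}
```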
Francois Dugast
278ced35dc fix: Use capability table to determine engine type for defaultEngine
Related-To: GSD-7097

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
2024-01-30 14:28:09 +01:00
Compute-Runtime-Validation
fa9c79fb63 Revert "refactor: Add GDI profiling"
This reverts commit 524ae7713a.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-01-30 10:47:34 +01:00
Mateusz Jablonski
da16dad344 fix: don't limit vm bind support based on platform
Related-To: GSD-7097
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-01-29 19:43:15 +01:00
Lukasz Jobczyk
524ae7713a refactor: Add GDI profiling
Resolves: NEO-9236
Related-To: NEO-10036

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-01-29 11:36:04 +01:00
Maciej Plewka
7728123907 fix: Do not use 2mb alignment for host ptr allocs
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>

Related-To: NEO-9945
2024-01-29 11:01:00 +01:00
Jitendra Sharma
548ecec7f8 feature: Implement debugger open IOCTL
Related-To: NEO-8405

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2024-01-24 09:50:39 +01:00