Added lockMemory in context to explicitly locking memory,
Added a boolean flag in graphics_allocation to indicate the allocation
is locked, and modified memory_operations_handler to add lock().
Change the logic to work correctly with makeResident() when lock() is
called previously for the same memory region
Related-To: NEO-8277
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
Modifed getResetStatus to abort only when scratch page is disabled
Removed an incorrect UNRECOVERABLE_IF statement based on the status:
validPageFault can be true when banned flag is not set, if CAT error
does not occur as a result of page fault.
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
make ContextParamEngine structure more generic and populate engines
by drm specific methods
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
driver is built with xe drm support by default
added cmake flag to control xe eu debug API support
NEO_ENABLE_XE_EU_DEBUG_SUPPORT
This flag is disabled by default and uapi-eu-debug headers are not
needed for driver compilation as these headers are not a part of
upstream kernel yet.
Related-To: NEO-10780
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Modified default values for disableScratch and gpuPageFault
to true and 10 respectively in drm_nep.cpp, in order to
disable scratch pages by default.
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
Modified drm_neo.h and .cpp to check when condition is greater
than and equal to instead of equal, and changed gpuFaultCheckCounter
to be atomic
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
Implemented to go through entire contexts in the process and then query
reset status to check the unexpected GPU segfault.
Added a new debug variable GpuFaultCheckThreshold to change the checking
frequency for each hang check for performance analysis.
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
we get drm_xe_query_engines, not array of drm_xe_engine_class_instance
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
If ResetStats from i915 is from the GPU page fault, abort
the entire process instead of disabling engines.
Added a fallback mechanism when prelim_drm_i915_reset_stats
fails.
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
Added getResetStats() in ioctl_helper.h to support extended header for
prelim_drm_i915_reset_stats.
Added new data structure to capture the fault data structure for prelim.
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
Xe does not support VmBindPatIndexExtension. This patch
fixes the handling of this case and prevents corrupting
other extensions
Related-to: NEO-9674
Signed-off-by: Brandon Yates <brandon.yates@intel.com>
Modified to return DRM_IOCTL_I915_GET_RESET_STATS of prelim headers
as the macro values used for non-prelim is different from the prelim
value due to sizeof() embedded in _IOWR()
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
Removed static_assert for reset_stats before updating
drm header to v2.0-r23.
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>