The previous method for retrieving the sub-device count in this path
did not take into account the device affinity mask or the device
hierarchy mode, resulting in segmentation faults when attempting to
allocate the rtDispatchGlobals structure using improper deviceBitFields.
Related-To: NEO-8422
Signed-off-by: Raiyan Latif <raiyan.latif@intel.com>
The correct implementation is in DebugSessionLinux and
the overrides in DebugSessionLinuxXe are not needed
Related-To: NEO-9669
Signed-off-by: Brandon Yates <brandon.yates@intel.com>
The driver is built with xe drm support by default.
Added the cmake flag NEO_ENABLE_XE_EU_DEBUG_SUPPORT to control
xe eu debug API support.
This flag is disabled by default, and the uapi-eu-debug headers are
not needed for driver compilation, as these headers are not yet part
of the upstream kernel.
Related-To: NEO-10780
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Add support for configuring ccs mode for all applicable devices
before KMD is loaded.
Use ZEX_NUMBER_OF_CCS to configure ccs mode.
The format is as follows:
ZEX_NUMBER_OF_CCS=NumberOfCcs
For example, setting ZEX_NUMBER_OF_CCS to 4 sets the ccs mode to 4
for all devices for which configuration is supported.
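
As a rough illustration of the format above, the value could be read and
validated along these lines. This is a minimal sketch, not the code from
this change: parseNumberOfCcs and requestedCcsMode are hypothetical names,
and rejecting zero or non-numeric input is an assumption.

```cpp
#include <cctype>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <optional>

// Hypothetical helper: parses a value in the "NumberOfCcs" form.
// Returns std::nullopt for missing, non-numeric, zero or out-of-range input
// (rejecting zero is an assumption made for this sketch).
std::optional<uint32_t> parseNumberOfCcs(const char *value) {
    if (value == nullptr || !std::isdigit(static_cast<unsigned char>(*value))) {
        return std::nullopt;
    }
    char *end = nullptr;
    const unsigned long parsed = std::strtoul(value, &end, 10);
    if (*end != '\0' || parsed == 0 || parsed > std::numeric_limits<uint32_t>::max()) {
        return std::nullopt;
    }
    return static_cast<uint32_t>(parsed);
}

// Hypothetical wrapper: reads the variable named in this commit message.
std::optional<uint32_t> requestedCcsMode() {
    return parseNumberOfCcs(std::getenv("ZEX_NUMBER_OF_CCS"));
}
```

With ZEX_NUMBER_OF_CCS=4 in the environment, requestedCcsMode() yields 4,
which would then be applied to every device that supports configuration.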
Related-To: NEO-10378
Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
Modified the default values of disableScratch and gpuPageFault
to true and 10, respectively, in drm_neo.cpp in order to
disable scratch pages by default.
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
Modified drm_neo.h and .cpp so that the check uses a
greater-than-or-equal comparison instead of equality, and changed
gpuFaultCheckCounter to be atomic.
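
The reason for both changes can be sketched as follows. This is an
illustrative stand-in, not the actual Drm code: FaultCheckGate and
shouldCheck are hypothetical names, and comparing the counter against a
threshold is an assumption about what the condition checks.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the counter logic; not the real Drm members.
class FaultCheckGate {
  public:
    explicit FaultCheckGate(uint32_t thresholdValue) : threshold(thresholdValue) {
        assert(thresholdValue > 0);
    }

    // Returns true when the caller should run the (expensive) fault check.
    bool shouldCheck() {
        // fetch_add returns the previous value, so add 1 for this call's count.
        const uint32_t count = counter.fetch_add(1u) + 1u;
        // ">=" rather than "==": if concurrent increments push the counter
        // past the threshold before a reset lands, the check still fires
        // instead of being skipped until the counter wraps around.
        if (count >= threshold) {
            counter.store(0u);
            return true;
        }
        return false;
    }

  private:
    const uint32_t threshold;
    std::atomic<uint32_t> counter{0u};
};
```

With a threshold of 3, every third call to shouldCheck() returns true in
the single-threaded case.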
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
Go through all contexts in the process and query their reset status
to detect an unexpected GPU segfault.
Added a new debug variable, GpuFaultCheckThreshold, to control how
frequently this check runs during hang checks, for performance analysis.
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
Pass actual pool address to heap allocator. This removes the need to
calculate pooled pointer from pool address and offset.
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
We get a drm_xe_query_engines, not an array of drm_xe_engine_class_instance.
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
When calculating the size of the indirect object heap,
the local work group size from kernel implicit args is taken into account.
If the LWS is not set before this calculation,
it can lead to insufficient ioh allocation size.
Such a problem is seen when local ids are generated by the runtime
and then written to ioh. The write fails due to lack of space in the allocation.
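
The ordering problem can be illustrated with a deliberately simplified
size calculation. Everything in this sketch is an assumption made for
illustration: requiredIohSize, the header size of 64 bytes and the layout
of three 16-bit local ids per work item are hypothetical and do not
reflect the real formula.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed fixed part of the allocation, for illustration only.
constexpr std::size_t implicitArgsHeaderSize = 64;

// Hypothetical, simplified sizing: header plus three 16-bit local ids
// (x, y, z) for every work item in the local work group.
std::size_t requiredIohSize(const std::array<uint32_t, 3> &lws) {
    const std::size_t workItems = static_cast<std::size_t>(lws[0]) *
                                  static_cast<std::size_t>(lws[1]) *
                                  static_cast<std::size_t>(lws[2]);
    return implicitArgsHeaderSize + workItems * 3 * sizeof(uint16_t);
}
```

If the size is computed while the LWS still holds a placeholder such as
{1, 1, 1} and the real LWS of {16, 4, 1} is set afterwards, the allocation
is far smaller than what the local id write needs.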
Related-To: IGC-7708
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
If the ResetStats from i915 indicates a GPU page fault, abort
the entire process instead of disabling engines.
Added a fallback mechanism for when prelim_drm_i915_reset_stats
fails.
Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>