Commit Graph

1783 Commits

Author SHA1 Message Date
Mateusz Jablonski
0b2e8e2848 fix: remove hardcoded caps reported by ioctl helper xe
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-22 10:41:54 +01:00
Young Jin Yoon
ec009cf9e3 fix: abort only when disabling scratch page
Modified getResetStatus to abort only when scratch page is disabled.
Removed an incorrect UNRECOVERABLE_IF statement based on the status:
validPageFault can be true when the banned flag is not set, if a CAT error
does not occur as a result of the page fault.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-21 21:55:25 +01:00
Mateusz Jablonski
4df0dd7894 fix: remove hardcoded caps reported by ioctl helper xe
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-21 21:09:20 +01:00
Mateusz Jablonski
92d37b20a6 fix: setup gpu address space based on config info from xe kmd
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-21 18:49:19 +01:00
Mateusz Jablonski
1e343053ba refactor: remove redundant recreating vector of engines in xe kmd path
make ContextParamEngine structure more generic and populate engines
by drm specific methods

Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-21 17:55:39 +01:00
Mateusz Jablonski
a2742492ab feature: enable xe drm detection by default
driver is built with xe drm support by default

added cmake flag to control xe eu debug API support
NEO_ENABLE_XE_EU_DEBUG_SUPPORT

This flag is disabled by default and uapi-eu-debug headers are not
needed for driver compilation as these headers are not a part of
upstream kernel yet.

Related-To: NEO-10780

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-19 08:47:13 +01:00
Mateusz Jablonski
6b33d91140 fix: remove not needed check for context param engine count
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-18 13:17:05 +01:00
Mateusz Jablonski
19dcc80e44 Revert "build: enable xe drm detection by default"
This reverts commit 973757a58d.

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-18 09:04:59 +01:00
Compute-Runtime-Validation
016c234893 Revert "feature: disable scratch page by default"
This reverts commit dab5469f81.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-16 01:52:00 +01:00
Mateusz Jablonski
1319ab4efc refactor: don't setup struct members with designated initializers
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-15 16:36:00 +01:00
Mateusz Jablonski
e21180992f fix: remove not needed check for engine instance count
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-15 16:16:21 +01:00
Mateusz Jablonski
0270cd6a5b fix: respect gt id when getting engines for drm context under xe kmd
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-15 16:02:47 +01:00
Bellekallu Rajkiran
9736313d10 feature: Support for ccs mode configuration via SysFs
Add support for configuring ccs mode for all applicable devices
before KMD is loaded.

Use ZEX_NUMBER_OF_CCS to configure ccs mode.

Format is as follows:

ZEX_NUMBER_OF_CCS=NumberOfCcs, i.e. setting ZEX_NUMBER_OF_CCS
to 4 sets ccs mode to 4 for all devices for which configuration
is supported.

Related-To: NEO-10378

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2024-03-15 15:51:45 +01:00
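
A minimal sketch of the `ZEX_NUMBER_OF_CCS=NumberOfCcs` format described in the commit above, assuming the value is a single positive integer. The helper names `parseNumberOfCcs` and `numberOfCcsFromEnvironment` are hypothetical and not taken from the driver, which may validate the value differently.

```cpp
#include <cstdlib>
#include <optional>

// Hypothetical helper (not from the driver): parse a value such as "4"
// into a CCS count. Returns std::nullopt for a null or empty string,
// trailing non-digit characters, or a value below 1.
std::optional<long> parseNumberOfCcs(const char *value) {
    if (value == nullptr || *value == '\0') {
        return std::nullopt;
    }
    char *end = nullptr;
    const long parsed = std::strtol(value, &end, 10);
    if (*end != '\0' || parsed < 1) {
        return std::nullopt;
    }
    return parsed;
}

// Read ZEX_NUMBER_OF_CCS from the environment; std::nullopt if unset or invalid.
std::optional<long> numberOfCcsFromEnvironment() {
    return parseNumberOfCcs(std::getenv("ZEX_NUMBER_OF_CCS"));
}
```

With `ZEX_NUMBER_OF_CCS=4` exported before the process starts, `numberOfCcsFromEnvironment()` would yield 4 in this sketch.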
Young Jin Yoon
dab5469f81 feature: disable scratch page by default
Modified default values for disableScratch and gpuPageFault
to true and 10 respectively in drm_neo.cpp, in order to
disable scratch pages by default.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 11:44:10 +01:00
Young Jin Yoon
9633f49dab fix: make gpuFaultCheckCounter more robust
Modified drm_neo.h and .cpp to check whether the condition is greater
than or equal to, instead of equal to, and changed gpuFaultCheckCounter
to be atomic.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 10:40:12 +01:00
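
The reasoning behind the commit above can be sketched as follows: when several threads bump a shared counter, an equality test can miss the exact threshold value, while a greater-than-or-equal test still fires. This is an illustration only; the class `FaultCheckGate` and its method `shouldCheck` are hypothetical names, not the driver's code.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical gate (illustrative names): returns true on roughly every
// `threshold`-th call, signalling that the expensive fault query should run.
class FaultCheckGate {
  public:
    explicit FaultCheckGate(std::uint32_t threshold) : threshold(threshold) {
        assert(threshold > 0);
    }

    bool shouldCheck() {
        // fetch_add returns the previous value, so +1 is this call's count.
        const std::uint32_t count = counter.fetch_add(1) + 1;
        // ">=" still fires if concurrent callers pushed the counter past the
        // threshold before it was reset; "==" would then never match again.
        if (count >= threshold) {
            counter.store(0);
            return true;
        }
        return false;
    }

  private:
    const std::uint32_t threshold;
    std::atomic<std::uint32_t> counter{0};
};
```

With a threshold of 3, single-threaded calls to `shouldCheck()` return false, false, true, and then the cycle restarts.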
Young Jin Yoon
82728ff394 feature: add logic to iterate for all contexts to check GPU pagefault
Iterate over all contexts in the process and query the reset status
of each one to detect an unexpected GPU page fault.

Added a new debug variable GpuFaultCheckThreshold to change the checking
frequency for each hang check for performance analysis.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 07:48:39 +01:00
Compute-Runtime-Validation
94cc48f81b Revert "fix: don't use fake userptr flag in ioctl helper xe"
This reverts commit d3ab256f55.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-15 03:08:01 +01:00
Compute-Runtime-Validation
e11917cfcd Revert "fix: remove not needed checks in ioctl helper xe"
This reverts commit 5a6d0b21ac.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-14 21:38:09 +01:00
Mateusz Jablonski
d3ab256f55 fix: don't use fake userptr flag in ioctl helper xe
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-14 18:41:17 +01:00
Mateusz Jablonski
5a6d0b21ac fix: remove not needed checks in ioctl helper xe
pass gt id to contextSetParam

Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-14 18:14:50 +01:00
Neil R. Spruit
b5f8a38f19 feature: Enable Per IP euStall Functionality
Related-To: NEO-10220

Signed-off-by: Neil R. Spruit <neil.r.spruit@intel.com>
2024-03-14 16:49:52 +01:00
Zbigniew Zdanowicz
8fe1a460f8 refactor: simplify isDcFlushAllowed implementation
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-03-14 15:09:39 +01:00
Compute-Runtime-Validation
ef7dbc99f1 Revert "fix: don't use fake userptr flag in ioctl helper xe"
This reverts commit 98824fdaf6.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-14 14:35:14 +01:00
Mateusz Jablonski
833fa6bce1 fix: correct querying engines from xe kmd
we get drm_xe_query_engines, not array of drm_xe_engine_class_instance

Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-14 12:06:25 +01:00
Mateusz Jablonski
98824fdaf6 fix: don't use fake userptr flag in ioctl helper xe
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-14 10:07:38 +01:00
Zbigniew Zdanowicz
9815f1e99b refactor: group template implementations and change inl file names
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-03-14 08:38:05 +01:00
Young Jin Yoon
7b81c4e08f feature: abort when unexpected GPU page fault detected
If the reset stats reported by i915 indicate a GPU page fault, abort
the entire process instead of disabling engines.
Added a fallback mechanism for when prelim_drm_i915_reset_stats
fails.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-14 08:14:59 +01:00
Mateusz Jablonski
0210e37f03 fix: respect gt id when finding xe engine info
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-13 20:52:36 +01:00
Francois Dugast
78e55f31b6 fix: Remove unused constant USER_FENCE_VALUE
Related-to: NEO-10321

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
2024-03-13 15:26:26 +01:00
Compute-Runtime-Validation
9cce1183cd Revert "feature: use prelim reset_stats for detailed statisics"
This reverts commit 835dc8b594.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-13 14:31:57 +01:00
Aravind Gopalakrishnan
3f20dd3b49 refactor: Add optional user fence during unbind
Add optional fence and wait operations after unbind operation.

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@intel.com>
2024-03-13 12:47:44 +01:00
Young Jin Yoon
835dc8b594 feature: use prelim reset_stats for detailed statisics
Added getResetStats() in ioctl_helper.h to support extended header for
prelim_drm_i915_reset_stats.
Added new data structure to capture the fault data structure for prelim.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-13 11:37:04 +01:00
Francois Dugast
5483e466e8 fix: Align on strings returned for unknown values
Related-to: NEO-10321

Signed-off-by: Francois Dugast <francois.dugast@intel.com>
2024-03-13 11:21:51 +01:00
Lukasz Jobczyk
c3f1eba24a refactor: Add flag to control DC flush
Related-To: NEO-10556

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-03-12 14:54:16 +01:00
Mateusz Jablonski
973757a58d build: enable xe drm detection by default
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-11 14:29:20 +01:00
Dominik Dabek
5ba9308804 performance: debug flag for localPreferred
Add flag for setting localPreferred (implicit when gmm localOnly=0 and
NonLocalOnly=0) when allocating buffer, svmGpu and image.

Related-To: NEO-9695

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-03-11 10:51:49 +01:00
Young Jin Yoon
65f3e80796 Revert "build: remove static_assert for drm header change"
This reverts commit 219470f60d.

Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-08 09:54:30 +01:00
Mrozek, Michal
10313b7b84 refactor: remove not needed code
Signed-off-by: Mrozek, Michal <michal.mrozek@intel.com>
2024-03-07 18:50:16 +01:00
Lukasz Jobczyk
6d1a3d404e refactor: Add helper to control flat ring buffer
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-03-07 17:52:23 +01:00
Bartosz Dunajski
79d80047ef refactor: improve mmap logging logic
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-03-07 12:15:39 +01:00
Mateusz Jablonski
8ae4a3bc7a fix: pass Sku/Wa tables for gmm without additional translations on Windows
Related-To: NEO-10623
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-06 14:58:58 +01:00
Bartosz Dunajski
fcd57f94cf refactor: capability to print mmap and munmap calls
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-03-06 14:29:01 +01:00
Brandon Yates
7a0d2df2fe fix: Handle Pat Index Ext not supported on Xe
Xe does not support VmBindPatIndexExtension. This patch
fixes the handling of this case and prevents corrupting
other extensions

Related-to: NEO-9674

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-03-06 11:18:31 +01:00
Dominik Dabek
a04c67ec52 performance(ocl): refactor pool allocators tests
add explicit tests for xe hpc
Related-To: NEO-9700

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-03-05 18:48:55 +01:00
Kozlowski, Marek
6751d19c19 fix: decanonize pointer to match GPU heap address space
* `zeVirtualMemReserve` `pStart` address may be passed in canonized form.

Resolves: NEO-10086

Signed-off-by: Kozlowski, Marek <marek.kozlowski@intel.com>
2024-03-01 12:18:11 +01:00
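
To illustrate what canonized and decanonized forms mean in the commit above: a canonical address has its upper bits filled with copies of the highest implemented address bit, and decanonizing clears those upper bits again. The sketch below assumes a 48-bit address width purely for illustration; the function names and the constant `kAddressWidth` are hypothetical, not the driver's helpers.

```cpp
#include <cstdint>

// Assumed for illustration only: a 48-bit virtual address space.
constexpr unsigned kAddressWidth = 48;

// Clear the sign-extension bits above the address width.
// `width` must be in the range 1..63.
constexpr std::uint64_t decanonize(std::uint64_t address, unsigned width = kAddressWidth) {
    return address & ((std::uint64_t{1} << width) - 1);
}

// Copy bit (width - 1) into all higher bits, producing the canonical form.
// `width` must be in the range 1..63.
constexpr std::uint64_t canonize(std::uint64_t address, unsigned width = kAddressWidth) {
    const std::uint64_t mask = (std::uint64_t{1} << width) - 1;
    const bool topBitSet = ((address >> (width - 1)) & 1u) != 0;
    return topBitSet ? (address | ~mask) : (address & mask);
}

static_assert(decanonize(0xFFFF800000001000ull) == 0x0000800000001000ull,
              "upper 16 bits are cleared");
static_assert(canonize(0x0000800000001000ull) == 0xFFFF800000001000ull,
              "bit 47 is replicated into bits 48..63");
static_assert(canonize(0x0000700000001000ull) == 0x0000700000001000ull,
              "addresses with bit 47 clear are unchanged");
```

An address handed in by a caller in canonical form would be passed through `decanonize` before being compared against a heap range expressed in the decanonized address space.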
Lukasz Jobczyk
e5db84f370 performance: Use GEMCreateExt when allocate by KMD
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-02-29 18:01:55 +01:00
Lukasz Jobczyk
409e19a832 performance: Enable cmd buffer preallocation per CmdQ on xe and later
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-02-29 17:41:58 +01:00
Lukasz Jobczyk
676644bc50 performance: Enable internal heap preallocation on xe and later
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-02-28 17:58:52 +01:00
Lukasz Jobczyk
d1dd34d0c7 performance: Enable timestamp wait for events on xe and later
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-02-28 14:18:14 +01:00
Young Jin Yoon
bf9805b0bb fix: override reset_stat IOCTL macro for prelim
Modified to return DRM_IOCTL_I915_GET_RESET_STATS from the prelim headers,
because the macro value used for non-prelim headers differs from the prelim
value due to the sizeof() embedded in _IOWR().

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-02-28 10:09:27 +01:00
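
Why the request value changes with the struct can be sketched as follows. In the generic Linux layout (asm-generic/ioctl.h, used on x86 and Arm), `_IOWR()` packs the argument size into bits 16 to 29 of the request number, so two structs of different sizes give different request values even with the same type and command number. The struct layouts and the command number below are hypothetical stand-ins, not the real i915 or prelim definitions.

```cpp
#include <cstddef>
#include <cstdint>

// Generic Linux ioctl number layout: bits 0-7 command number, bits 8-15 type,
// bits 16-29 argument size, bits 30-31 direction (read | write == 3).
constexpr std::uint32_t ioctlReadWrite(std::uint32_t type, std::uint32_t nr, std::size_t size) {
    constexpr std::uint32_t readWrite = 3u;
    return (readWrite << 30) | (static_cast<std::uint32_t>(size) << 16) | (type << 8) | nr;
}

// Hypothetical layouts: the extended struct only stands in for a variant
// that carries extra fields.
struct ResetStatsBase {
    std::uint32_t fields[6];
};
struct ResetStatsExtended {
    ResetStatsBase base;
    std::uint64_t extra;
};

constexpr std::uint32_t kType = 'd'; // DRM ioctl type character
constexpr std::uint32_t kNr = 0x72;  // hypothetical command number

constexpr std::uint32_t baseRequest = ioctlReadWrite(kType, kNr, sizeof(ResetStatsBase));
constexpr std::uint32_t extendedRequest = ioctlReadWrite(kType, kNr, sizeof(ResetStatsExtended));

static_assert(baseRequest != extendedRequest,
              "different argument sizes yield different request numbers");
static_assert((baseRequest & 0xFFFFu) == (extendedRequest & 0xFFFFu),
              "type and command number are identical");
```

A request value built from one header set therefore cannot be reused with the other struct, which is consistent with the commit returning the value defined by the prelim headers.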