Commit Graph

82 Commits

Author SHA1 Message Date
Young Jin Yoon
07aa53fd87 fix: disable scratch page by default only on PVC
Disabled scratch paged by default only on PVC with productHelper.

Related-To: GSD-7742
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-01 23:44:48 +02:00
Mateusz Jablonski
9468915768 fix: correct preemption support in xe path
preemption is always supported by xe kmd

Related-To: NEO-10496, HSD-18037744953
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-04-04 13:29:02 +02:00
Young Jin Yoon
907129bb33 feature: disable scratch page by default
Modified default values for disableScratch and gpuPageFault
to true and 10 respectively in drm_neo.cpp, in order to
disable scratch pages by default.
Modified to set gpuPageFault to 0 as a default value when
scratch page is not disabled.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-04-04 09:50:02 +02:00
Mateusz Jablonski
420e1391b2 fix: handle not aligned gtt size reported by i915
when i915 reports gtt size between 47 and 48 bits we consider
it as 48 bit VA space

Related-To: GSD-8215
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-29 07:51:06 +01:00
Compute-Runtime-Validation
e3f50e8aa9 Revert "fix: handle not aligned gtt size reported by i915"
This reverts commit dae901c13f.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-28 12:03:23 +01:00
Mateusz Jablonski
dae901c13f fix: handle not aligned gtt size reported by i915
when i915 reports gtt size between 47 and 48 bits we consider
it as 48 bit VA space

Related-To: GSD-8215
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-28 08:46:53 +01:00
Mateusz Jablonski
d94be09020 refactor: remove not needed check for exec softpin
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-22 17:30:49 +01:00
Young Jin Yoon
ec009cf9e3 fix: abort only when disabling scratch page
Modifed getResetStatus to abort only when scratch page is disabled
Removed an incorrect UNRECOVERABLE_IF statement based on the status:
validPageFault can be true when banned flag is not set, if CAT error
does not occur as a result of page fault.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-21 21:55:25 +01:00
Compute-Runtime-Validation
016c234893 Revert "feature: disable scratch page by default"
This reverts commit dab5469f81.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-16 01:52:00 +01:00
Young Jin Yoon
dab5469f81 feature: disable scratch page by default
Modified default values for disableScratch and gpuPageFault
to true and 10 respectively in drm_nep.cpp, in order to
disable scratch pages by default.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 11:44:10 +01:00
Young Jin Yoon
82728ff394 feature: add logic to iterate for all contexts to check GPU pagefault
Implemented to go through entire contexts in the process and then query
reset status to check the unexpected GPU segfault.

Added a new debug variable GpuFaultCheckThreshold to change the checking
frequency for each hang check for performance analysis.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 07:48:39 +01:00
Young Jin Yoon
7b81c4e08f feature: abort when unexpected GPU page fault detected
If ResetStats from i915 is from the GPU page fault, abort
the entire process instead of disabling engines.
Added a fallback mechanism when prelim_drm_i915_reset_stats
fails.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-14 08:14:59 +01:00
Brandon Yates
76de854a69 feature: Set Debug Attach Available for Xe
Related-to: NEO-8402

Signed-off-by: Brandon Yates <brandon.yates@intel.com>
2024-01-24 09:04:11 +01:00
Mateusz Jablonski
2eba5b35e4 refactor: correct naming of DrmParam enum values
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-13 15:43:46 +01:00
Mateusz Jablonski
739d181026 refactor: correct naming of enum class constants 6/n
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-13 14:48:52 +01:00
Mateusz Jablonski
432142c574 refactor: correct naming of enum class constants 4/n
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-13 08:08:51 +01:00
Mateusz Jablonski
cff6c81be0 refactor: correct naming of DrmIoctl enums
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-12 10:02:19 +01:00
Mateusz Jablonski
beafea9b39 refactor: correct naming of enum class constants 2/n
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-11 13:13:35 +01:00
Mateusz Jablonski
c9664e6bad refactor: rename global debug manager to debugManager
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-11-30 13:00:59 +01:00
Mateusz Jablonski
36194c4e7d refactor: correct variable namings
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-11-29 23:49:03 +01:00
Mateusz Jablonski
35c1f34672 refactor: move number of threads per eu to release helper
Related-To: HSD-18034098647
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-11-20 12:16:33 +01:00
Mateusz Jablonski
3ceafa2259 fix: remove setting debug flags for ioctl helper xe
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-26 15:42:52 +02:00
Cencelewska, Katarzyna
98dae70415 fix: add helper to proper call GemCreate on xe kmd
Related-To: NEO-8325
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-09-08 12:27:11 +02:00
Bari, Pratik
3f083360a2 feature(sysman): Added sysfs filenames for the memory module
- The sysfs filenames have been added in the sysfsNameToFileMap of the
SysmanKmdInterface classes.
- The functions returning the sysfs filenames have been removed from the
shared directory.
- The ULTs have been added to return the sysfs filenames.

Related-To: LOCI-4699

Signed-off-by: Bari, Pratik <pratik.bari@intel.com>
2023-08-03 22:36:17 +02:00
Kamil Kopryk
082d33bb7c fix: correct query topology on xe
Related-To: NEO-7996
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-06-22 13:24:52 +02:00
Matias Cabral
96517a08aa feature: Implement zetMetricGroupGetGlobalTimestampsExp()
Resolves: LOCI-3072

Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2023-06-21 09:48:41 +02:00
Bari, Pratik
a15e8a9679 feature: Added changes for Porting Memory API with XE driver
The Memory Info object is used in the getState function for memory.
Some of the ULTS in the memory modules has been modified.
A function to return the sysfs nodes for the Memory address range has
been added in the IoctlHelper class corresponding to the XE and i915
driver.

Related-To: LOCI-4397

Signed-off-by: Bari, Pratik <pratik.bari@intel.com>
2023-06-20 21:38:17 +02:00
Matias Cabral
cfa187aec6 feature: Support for metrics group exp extension
Support zet_metric_global_timestamps_resolution_exp_t

Resolves: LOCI-4350

Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2023-06-16 07:48:32 +02:00
Mateusz Jablonski
3b981331c9 fix: correct handling ZE_ENABLE_PCI_ID_DEVICE_ORDER flag
- by default ZE_ENABLE_PCI_ID_DEVICE_ORDER is disabled
- by default devices are sorted by type (discrete first), then by pci order
- when ZE_ENABLE_PCI_ID_DEVICE_ORDER is enabled, devices are sorted by pci id

Related-To: LOCI-4520

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-06-14 16:27:55 +02:00
Mateusz Jablonski
6f21d133cf fix: extend MemoryInfo class interface to expose single memory region
unify logic of OverrideDrmRegion debug flag

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-05-30 16:27:42 +02:00
Mateusz Hoppe
9c17cb9bd9 fix: add CLOEXEC flag when opening gpu cards
- close-on-exec prevents old file descriptor to leak when exec() is
called

Resolves: NEO-7944

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-05-09 11:53:57 +02:00
Mateusz Jablonski
fd1ad7c1f0 feature: setup heap extended host size based on system memory size
Related-To: NEO-7665
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-04-28 15:23:01 +02:00
Mateusz Jablonski
e4a446df58 feature usm: add debug flag to allocate shared USM in heap extended
Related-To: NEO-7665
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-04-13 11:30:09 +02:00
Compute-Runtime-Validation
b11a64718a Revert "feature usm: allocate shared USM in heap extended"
This reverts commit 03ed1e1e12.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-03-30 11:39:59 +02:00
Mateusz Jablonski
03ed1e1e12 feature usm: allocate shared USM in heap extended
Related-To: NEO-7665
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-29 16:04:05 +02:00
Mateusz Hoppe
d8f99161dd fix: create VMs with correct flags when perContextVms used
Related-To: NEO-7813

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-03-28 13:09:46 +02:00
Fabian Zwolinski
65c73a690f Introduce Online, Offline, Disabled DebuggingModes
This change allows to set DebuggingMode via
ZET_ENABLE_PROGRAM_DEBUGGING env var
0: Disabled
1: Online
2: Offline

Related-To: NEO-7630
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-03-17 09:31:17 +01:00
Fabian Zwolinski
93a30f002b L0 Debugger - check debug_eu entry.
Related-To: NEO-7790
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-03-15 16:14:49 +01:00
Mateusz Hoppe
e62c5e25d5 refactor: change debugging enabled to debugging mode
Related-To: NEO-7630

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-03-15 13:41:41 +01:00
Compute-Runtime-Validation
3e1d931296 Revert "L0 Debugger - check debug_eu entry"
This reverts commit 9f935276a0.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-03-15 12:28:08 +01:00
Fabian Zwolinski
9f935276a0 L0 Debugger - check debug_eu entry
Related-To: NEO-7790
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-03-13 14:46:28 +01:00
Mateusz Jablonski
553dd7f21f refactor: return thread per eu from compiler product helper
Related-To: NEO-7442
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-08 16:25:20 +01:00
Aravind Gopalakrishnan
d75c4d3ec7 fix: Skip adding device to list if context creation fails
Propogate error codes from ioctl failure properly up the layers
so that we skip exposing bad root devices.

Related-To: NEO-7709

Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@intel.com>
2023-02-16 11:40:54 +01:00
Warchulski, Jaroslaw
11764dd9bf Cleanup includes 40
Cleaned up files:
shared/source/os_interface/linux/drm_neo.h
shared/source/os_interface/windows/wddm/um_km_data_translator.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-23 16:19:35 +01:00
Mateusz Jablonski
1fd8b26499 refactor: rename IoctlHelper::get to IoctlHelper::getI915Helper
remove drm version parameter as i915 is always expected

Related-To: NEO-7578
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-01-09 12:32:45 +01:00
Kamil Kopryk
468d722efb Move clGfxCoreHelper ownership to rootDeviceEnv
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-01-05 12:58:38 +01:00
Mateusz Hoppe
f19abda0e2 Set root device index in OsContext
- correclty choose default engine context accounting for root device
index and  subdevices bitfield

Related-To: NEO-7516

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-11-16 23:02:19 +01:00
Mateusz Jablonski
930ca001a1 Add a layer to translate exec buffer error to SubmissionStatus value
handle errors: EWOULDBLOCK, ENOSPC, ENOMEM, ENXIO

Related-To: NEO-7144, NEO-7412
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-11-04 10:32:54 +01:00
Dunajski, Bartosz
a9ba581d97 Always use unrecoverable drm context
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2022-10-31 10:08:57 +01:00
Compute-Runtime-Validation
040d6693cd Revert "Always use unrecoverable drm context"
This reverts commit 343371faad.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-10-29 19:28:04 +02:00