Commit Graph

375 Commits

Author SHA1 Message Date
Mateusz Jablonski
55907c853b fix: limit EU count based on subslice count and info from GuC
Related-To: NEO-12073
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-17 16:40:20 +02:00
Mateusz Jablonski
48d6788b3a fix: add fallback for missing eu count in topology
For Xe KMD there will be a new query for EU per DSS on PVC, LNL, and BMG platforms.
When the new query is available, the previous one (currently used in NEO) will be empty.
To avoid integration issues, this commit adds a fallback that sets up that value based
on the max EU per DSS that we get from GuC in the device blob.

Support for new query is in PR https://github.com/intel/compute-runtime/pull/745

Related-To: NEO-12012
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-17 14:31:29 +02:00
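
The fallback described in commit 48d6788b3a can be pictured roughly as below. This is a minimal sketch under stated assumptions, not the code from the commit: `TopologyCounts`, `resolveEuCount`, and `maxEuPerDssFromBlob` are hypothetical names, and an empty query result is assumed to show up as an EU count of zero.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for topology data; not a NEO type.
struct TopologyCounts {
    uint32_t subSliceCount; // number of enabled DSS / subslices
    uint32_t euCount;       // assumed to be 0 when the older query returns nothing
};

// Use the EU count from the topology query when it is present; otherwise
// derive it from the subslice count and the per-DSS maximum (assumed to
// come from the device blob).
uint32_t resolveEuCount(const TopologyCounts &topology, uint32_t maxEuPerDssFromBlob) {
    if (topology.euCount != 0) {
        return topology.euCount;
    }
    return topology.subSliceCount * maxEuPerDssFromBlob;
}
```

With 4 subslices, an empty EU count, and 8 EUs per DSS, this sketch yields 32; a non-zero EU count from the query is returned unchanged.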
Mateusz Jablonski
9aa7ad0fd7 refactor: remove redundant querying topology info
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-17 07:34:27 +02:00
Mateusz Jablonski
e39994f525 fix: setup slm size based on gt system info when not set in capability table
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-09 15:21:35 +02:00
Compute-Runtime-Validation
991640f558 Revert "fix: update slm size in capability table based on gt system info"
This reverts commit 47e064a686.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-07-09 03:31:42 +02:00
Mateusz Jablonski
47e064a686 fix: update slm size in capability table based on gt system info
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-08 09:35:25 +02:00
Compute-Runtime-Validation
c679e7df30 Revert "fix: update slm size in capability table based on gt system info"
This reverts commit e624a4b0ab.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-07-06 03:40:49 +02:00
Mateusz Jablonski
e624a4b0ab fix: update slm size in capability table based on gt system info
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-05 14:25:33 +02:00
Mateusz Jablonski
30fac27508 fix: setup slm size in hw info based on device blob
Related-To: NEO-8188, NEO-10774
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-04 15:25:13 +02:00
Maciej Bielski
bfaf2309e8 refactor: cleanup setupHardwareInfo
Use only `usDeviceID` and `usRevId` (known to be correct) instead of
declaring `hwInfo` pointer until the type of the latter is completely
initialized.

Put initialization of all `hwInfo` fields into `setupHardwareInfo()` for
consistency and make overrides explicit.

Related-To: NEO-9754
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2024-07-04 08:54:57 +02:00
Mateusz Jablonski
116c3ef771 refactor: remove not needed check
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-06-26 16:56:05 +02:00
Mateusz Jablonski
43ca7c082e fix: query csr size in mb and slm size per dss from device blob
Related-To: NEO-8188, NEO-10774
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-06-26 14:42:18 +02:00
Compute-Runtime-Validation
2800282bdb Revert "fix: unblock xekmd recoverable pagefaults vmbind"
This reverts commit 5e144dae16.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-06-25 18:09:25 +02:00
Naklicki, Mateusz
5e144dae16 fix: unblock xekmd recoverable pagefaults vmbind
Related-To: HSD-13011898606
Signed-off-by: Naklicki, Mateusz <mateusz.naklicki@intel.com>
2024-06-25 13:23:48 +02:00
Compute-Runtime-Validation
1731c09d3a Revert "fix: add missing pagefault support query on XeKMD"
This reverts commit 50dfe33efa.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-06-04 22:46:01 +02:00
Naklicki, Mateusz
50dfe33efa fix: add missing pagefault support query on XeKMD
Signed-off-by: Naklicki, Mateusz <mateusz.naklicki@intel.com>
2024-06-04 15:47:14 +02:00
Maciej Bielski
dafaaf5f7d fix: add hasEngines() to check for engines detection
Related-To: NEO-9754
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2024-05-22 13:51:53 +02:00
John Falkowski
cf8f0b9cd8 feature: 2-Tile device memory chunking independent of KMD migration
Related-To: NEO-10916

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2024-05-21 08:07:10 +02:00
Young Jin Yoon
49cc1a0ba0 fix: use llx for fprintf and IoctlFunctions
Changed format for address printing from %lx to %llx for
fprintf introduced in drm_neo.cpp, and then use
IoctlFunctions::fprintf instead of std::printf to avoid
errors on gcc.

Changed format for address printing from %lx to %llx for
snprintf introduced in drm_test.cpp, and then explicitly cast
to long long unsigned int to avoid errors.

Related-To: GSD-7611
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-20 16:39:49 +02:00
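
The format change in commit 49cc1a0ba0 follows a common pattern: `%llx` expects an `unsigned long long` argument, so a 64-bit address is cast explicitly before printing. The helper below is an illustrative sketch only; `formatAddress` is a hypothetical name and does not appear in the commit.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Format a 64-bit address as hex. The explicit cast makes the argument type
// match %llx even on platforms where uint64_t is defined as unsigned long.
std::string formatAddress(uint64_t address) {
    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "0x%llx",
                  static_cast<unsigned long long>(address));
    return std::string(buffer);
}
```

Without the cast, passing a `uint64_t` to `%llx` (or `%lx`) can trigger format warnings, which become errors under `-Werror`, depending on how the platform defines `uint64_t`.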
Young Jin Yoon
e204d27190 fix: print to stdout for disable scratch page
Modified to print out error messages to stdout when disable scratch page
is used.

Related-To: GSD-7611
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-16 15:05:07 +02:00
Krzysztof Gibala
a70aaa72ed refactor: add debug message about the zero engine info size
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2024-05-15 09:07:40 +02:00
Young Jin Yoon
06faaab5bb refactor: read scratch page options during init
Change scratch page logic to initialize during Drm::create.

Related-To: GSD-7742
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-15 08:56:14 +02:00
Young Jin Yoon
2c488d9e84 fix: check reset status after completion
Added logic to check the reset status after the completion to make
sure we go through the logic at least once.

Related-To: GSD-8902
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-14 06:38:07 +02:00
Bartosz Dunajski
e5882e0d31 feature: pass GraphicsAllocation to fence wait
Related-To: NEO-8179

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-05-07 17:59:28 +02:00
Young Jin Yoon
07aa53fd87 fix: disable scratch page by default only on PVC
Disabled scratch page by default only on PVC with productHelper.

Related-To: GSD-7742
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-05-01 23:44:48 +02:00
Bartosz Dunajski
806da85ec6 refactor: prework to pass interrupt hint
Related-To: NEO-8179

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-04-29 11:14:53 +02:00
Bartosz Dunajski
2a2596c13b refactor: pass additional data to ioctl helper
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-04-26 14:53:14 +02:00
Bartosz Dunajski
8e5f9e72c8 refactor: simplify waiting for fence logic
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-04-25 22:36:19 +02:00
Mateusz Jablonski
62390d3def feature: add number of l3 banks to TopologyData
Related-To: NEO-11125
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-04-15 11:20:43 +02:00
Mateusz Jablonski
9468915768 fix: correct preemption support in xe path
preemption is always supported by xe kmd

Related-To: NEO-10496, HSD-18037744953
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-04-04 13:29:02 +02:00
Young Jin Yoon
907129bb33 feature: disable scratch page by default
Modified default values for disableScratch and gpuPageFault
to true and 10 respectively in drm_neo.cpp, in order to
disable scratch pages by default.
Modified to set gpuPageFault to 0 as a default value when
scratch page is not disabled.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-04-04 09:50:02 +02:00
Mateusz Jablonski
420e1391b2 fix: handle not aligned gtt size reported by i915
when i915 reports gtt size between 47 and 48 bits we consider
it as 48 bit VA space

Related-To: GSD-8215
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-29 07:51:06 +01:00
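
The rule stated in commit 420e1391b2 can be sketched as follows. This is an assumption-laden illustration rather than the actual change: `normalizeGttSize` is a hypothetical name, and "between 47 and 48 bits" is read here as strictly greater than 2^47 and strictly less than 2^48.

```cpp
#include <cstdint>

// Treat a GTT size that falls strictly between 2^47 and 2^48 as a full
// 48-bit VA space; leave every other size unchanged.
uint64_t normalizeGttSize(uint64_t reportedGttSize) {
    constexpr uint64_t size47Bit = 1ull << 47;
    constexpr uint64_t size48Bit = 1ull << 48;
    if (reportedGttSize > size47Bit && reportedGttSize < size48Bit) {
        return size48Bit;
    }
    return reportedGttSize;
}
```

For example, a reported size one page short of 2^48 is normalized to 2^48, while an exact 2^47 is left as it is.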
Young Jin Yoon
d6a14d4ed5 feature: support explicit memory locking
Added lockMemory in context to explicitly lock memory,
added a boolean flag in graphics_allocation to indicate the allocation
is locked, and modified memory_operations_handler to add lock().

Related-To: NEO-8277
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-29 07:31:22 +01:00
Compute-Runtime-Validation
e3f50e8aa9 Revert "fix: handle not aligned gtt size reported by i915"
This reverts commit dae901c13f.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-28 12:03:23 +01:00
Mateusz Jablonski
dae901c13f fix: handle not aligned gtt size reported by i915
when i915 reports gtt size between 47 and 48 bits we consider
it as 48 bit VA space

Related-To: GSD-8215
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-28 08:46:53 +01:00
Maciej Plewka
b722f3b579 feature: Add interface to bind resources as readonly
Related-To: NEO-10398
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-03-27 14:24:58 +01:00
Compute-Runtime-Validation
8e44a46983 Revert "feature: bind resources as read only"
This reverts commit f3d36d3350.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-27 08:51:47 +01:00
Maciej Plewka
f3d36d3350 feature: bind resources as read only
Related-to: NEO-10398
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-03-26 14:11:57 +01:00
Young Jin Yoon
068f6a25c6 Revert "feature: support explicit memory locking"
This reverts commit 27a3307bb0.

Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-25 20:10:20 +01:00
Mateusz Jablonski
78a4a92b44 refactor: reorder members to reduce internal padding in structs
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-25 15:50:00 +01:00
Young Jin Yoon
27a3307bb0 feature: support explicit memory locking
Added lockMemory in context to explicitly lock memory,
added a boolean flag in graphics_allocation to indicate the allocation
is locked, and modified memory_operations_handler to add lock().
Changed the logic to work correctly with makeResident() when lock() was
called previously for the same memory region.

Related-To: NEO-8277
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-25 09:49:18 +01:00
Mateusz Jablonski
d94be09020 refactor: remove not needed check for exec softpin
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-22 17:30:49 +01:00
Mateusz Jablonski
ec19ce536a refactor: store userptr value in buffer object
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-22 12:49:13 +01:00
Young Jin Yoon
ec009cf9e3 fix: abort only when disabling scratch page
Modified getResetStatus to abort only when scratch page is disabled.
Removed an incorrect UNRECOVERABLE_IF statement based on the status:
validPageFault can be true when banned flag is not set, if CAT error
does not occur as a result of page fault.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-21 21:55:25 +01:00
Mateusz Jablonski
1e343053ba refactor: remove redundant recreating vector of engines in xe kmd path
make ContextParamEngine structure more generic and populate engines
by drm specific methods

Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-21 17:55:39 +01:00
Mateusz Jablonski
0270cd6a5b fix: respect gt id when getting engines for drm context under xe kmd
Related-To: NEO-10496
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-03-15 16:02:47 +01:00
Young Jin Yoon
9633f49dab fix: make gpuFaultCheckCounter more robust
Modified drm_neo.h and .cpp to check whether the condition is greater
than or equal to instead of equal, and changed gpuFaultCheckCounter
to be atomic.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 10:40:12 +01:00
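
The idea behind commit 9633f49dab, an atomic counter compared with `>=` rather than `==`, can be sketched as below. This is not the NEO implementation: `FaultCheckGate` and `shouldCheck` are hypothetical names, and resetting the counter to zero after a check is an assumption.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical gate that lets an expensive check run once every
// `threshold` calls.
class FaultCheckGate {
  public:
    explicit FaultCheckGate(uint32_t thresholdIn) : threshold(thresholdIn) {}

    // Returns true when the caller should perform the check on this call.
    bool shouldCheck() {
        // fetch_add returns the previous value, so add 1 for the new count.
        const uint32_t count = counter.fetch_add(1) + 1;
        // ">=" still fires if the counter has already moved past the
        // threshold, which a strict "==" comparison would miss.
        if (count >= threshold) {
            counter.store(0);
            return true;
        }
        return false;
    }

  private:
    std::atomic<uint32_t> counter{0};
    uint32_t threshold;
};
```

With a threshold of 3, the first two calls return false, the third returns true, and the count then starts over.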
Young Jin Yoon
82728ff394 feature: add logic to iterate for all contexts to check GPU pagefault
Implemented iteration over all contexts in the process, querying the
reset status of each to check for an unexpected GPU segfault.

Added a new debug variable GpuFaultCheckThreshold to change the checking
frequency for each hang check for performance analysis.

Related-To: GSD-5673
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-03-15 07:48:39 +01:00
Compute-Runtime-Validation
94cc48f81b Revert "fix: don't use fake userptr flag in ioctl helper xe"
This reverts commit d3ab256f55.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-15 03:08:01 +01:00
Compute-Runtime-Validation
e11917cfcd Revert "fix: remove not needed checks in ioctl helper xe"
This reverts commit 5a6d0b21ac.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-03-14 21:38:09 +01:00