Commit Graph

719 Commits

Author SHA1 Message Date
Jack Myers 7f9fadc314 fix: regression caused by tbx fault mngr
Addresses regressions from the reverted merge
of the tbx fault manager for host memory.

Recursive locking of mutex caused deadlock.

To fix, separate tbx fault data from base
cpu fault data, allowing separate mutexes
for each, eliminating recursive locks on
the same mutex.

By separating, we also help ensure that tbx-related
changes don't affect the original cpu fault manager code
paths.

As an added safeguard preventing critical regressions
and avoiding another auto-revert, the tbx fault manager
is hidden behind a new debug flag which is disabled by default.

Related-To: NEO-12268
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-01-09 07:48:53 +01:00
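
The locking change described in the commit above can be pictured with a minimal sketch. It assumes two independent maps, each guarded by its own `std::mutex`, so that the cpu path can call into the tbx path without re-locking a mutex it already holds. All names here (`FaultManagerSketch`, `CpuFaultData`, `TbxFaultData`, `handleFault`) are hypothetical and are not taken from the NEO sources.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <unordered_map>

// Hypothetical data records; the real NEO structures are not shown in the commit log.
struct CpuFaultData {
    std::size_t size;
};
struct TbxFaultData {
    bool hasBeenDownloaded;
};

class FaultManagerSketch {
  public:
    void insertCpuAllocation(void *ptr, std::size_t size) {
        std::lock_guard<std::mutex> lock(cpuMutex);
        cpuFaults[ptr].size = size;
    }

    void insertTbxAllocation(void *ptr) {
        std::lock_guard<std::mutex> lock(tbxMutex);
        tbxFaults[ptr].hasBeenDownloaded = false;
    }

    // Holds cpuMutex, then calls into the tbx path, which takes only tbxMutex.
    // With a single shared non-recursive mutex this nested call would deadlock.
    bool handleFault(void *ptr) {
        std::lock_guard<std::mutex> lock(cpuMutex);
        if (cpuFaults.count(ptr) == 0) {
            return false;
        }
        markTbxDownloaded(ptr);
        return true;
    }

    bool isTbxDownloaded(void *ptr) {
        std::lock_guard<std::mutex> lock(tbxMutex);
        auto it = tbxFaults.find(ptr);
        return it != tbxFaults.end() && it->second.hasBeenDownloaded;
    }

  private:
    void markTbxDownloaded(void *ptr) {
        std::lock_guard<std::mutex> lock(tbxMutex);
        auto it = tbxFaults.find(ptr);
        if (it != tbxFaults.end()) {
            it->second.hasBeenDownloaded = true;
        }
    }

    std::mutex cpuMutex; // guards cpuFaults only
    std::mutex tbxMutex; // guards tbxFaults only
    std::unordered_map<void *, CpuFaultData> cpuFaults;
    std::unordered_map<void *, TbxFaultData> tbxFaults;
};
```

In this sketch the locks are always taken in the order cpu, then tbx, and the tbx path never reaches back for the cpu mutex, which is one way to rule out both recursive locking and lock-order inversion.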
Mateusz Jablonski bb1a125f0c feature: add support for Panther Lake platform
Related-To: NEO-12803

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-01-07 11:39:56 +01:00
Compute-Runtime-Validation b83db7ee32 Revert "feature: disable page fault handler on migration"
This reverts commit a258c9b010.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-31 03:57:02 +01:00
Young Jin Yoon a258c9b010 feature: disable page fault handler on migration
Disabled RegisterPageFaultHandlerOnMigration by default

Related-To: NEO-11563
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-12-30 09:42:52 +01:00
Lukasz Jobczyk 83ebbb01d3 performance: Add flag to mitigate host visible signal in CB events
Related-To: NEO-13441

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-12-24 12:35:55 +01:00
Mateusz Jablonski 593a6c54ea feature: add debug flag to ignore product specific ioctl helper creation
Related-To: NEO-13527
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-12-18 13:52:30 +01:00
Kamil Kopryk ae394d815c refactor: fix typo
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2024-12-18 10:21:48 +01:00
Andrzej Koska 105a586615 feature: Enable Tile64 Optimization Flag
Related-To: NEO-12134

Signed-off-by: Andrzej Koska <andrzej.koska@intel.com>
2024-12-16 11:09:26 +01:00
Wenbin Lu c19df80bd8 feature: add key to force GPU status check in event synchronization
Related-To: GSD-10187

Signed-off-by: Wenbin Lu <wenbin.lu@intel.com>
2024-12-12 22:24:36 +01:00
Chandio, Bibrak Qamar 8cf4804fcd fix: Overhead in zeDeviceGetGlobalTimestamps
Related-To: NEO-11908

There is overhead when the submission method is used for
zeDeviceGetGlobalTimestamps. This change removes it.

Signed-off-by: Chandio, Bibrak Qamar <bibrak.qamar.chandio@intel.com>
2024-12-12 08:54:19 +01:00
Lukasz Jobczyk f2725f217e refactor: Introduce debug flags to manipulate event's signal visibility
- Add AbortHostSyncOnNonHostVisibleEvent, which aborts when the host waits
for a non-host-visible event
- Add ForceHostSignalScope, which forces adding or clearing the host scope
in the event's signal scope

Related-To: NEO-13441

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-12-11 14:07:43 +01:00
Bartosz Dunajski eca3d5a677 feature: debug flag to clear timestamp before submission
Related-To: HSD-18040896547

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-11 13:56:49 +01:00
Dunajski, Bartosz 37e81d2a11 feature: new heuristic to enable relaxed ordering 2
Related-To: NEO-13431

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2024-12-09 11:58:42 +01:00
Vysochyn, Illia 0b7367ed5f refactor: Update STATE_BASE_ADDRESS
Refactors the STATE_BASE_ADDRESS to align with the latest specification.

Removes redundant functionality for multiple GPU partial writes and
atomics.

Related-To: NEO-13147

Signed-off-by: Vysochyn, Illia <illia.vysochyn@intel.com>
2024-12-09 08:50:59 +01:00
Compute-Runtime-Validation af8ad3aa7a Revert "feature: new heuristic to enable relaxed ordering"
This reverts commit 526f9c5e81.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-12-08 16:01:32 +01:00
Bartosz Dunajski 526f9c5e81 feature: new heuristic to enable relaxed ordering
Related-To: GSD-10308

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-12-06 17:04:39 +01:00
Chandio, Bibrak Qamar ab2e831a4a fix: zeDeviceGetGlobalTimestamp to use submission
Related-To: GSD-10253, GSD-9467, GSD-9381, NEO-11908

When EnableGlobalTimestampViaSubmission is set then
zeDeviceGetGlobalTimestamp uses immediate cmd submission
method to get GPU time.

Signed-off-by: Chandio, Bibrak Qamar <bibrak.qamar.chandio@intel.com>
2024-12-04 19:10:07 +01:00
Dominik Dabek 99a353a15a feature: flags for logging indirect detection
Add flag to log information for indirect detection debugging.
Add flag to disable indirect detection by kernel name.
Add flag to force indirect detection enable/disable for CM kernels.

Related-To: NEO-13372

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-11-29 16:37:47 +01:00
Katarzyna Cencelewska 4ad8c17db9 feature: add debug flags for timestamps
PrintCalculatedTimestamps - print ts in level zero paths
PrintTimestampPacketContents - add logging also to level zero paths
ForceUseOnlyGlobalTimestamps - force using a global ts

Related-To: HSD-14023527252
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2024-11-21 11:28:08 +01:00
Bartosz Dunajski ed20069d47 feature: debug flag to override region count
Related-To: HSD-18040537404

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-11-19 16:46:43 +01:00
Mateusz Jablonski 608c1d30c5 feature: add support for release helper 30.0/30.1
Related-To: NEO-12803

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-11-13 10:51:39 +01:00
Joshua Santosh Ranjan d294d71f95 feature: make programmable metrics enabled by default
Related-To: NEO-13011

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2024-11-13 09:42:23 +01:00
Dominik Dabek 0a12817664 performance: flag, force zero copy for host ptr
When debug flag ForceZeroCopyForUseHostPtr is set, add
CL_MEM_FORCE_HOST_MEMORY_INTEL flag to buffers created with
CL_MEM_USE_HOST_PTR.
This makes the buffers use zero copy.

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-11-12 15:43:17 +01:00
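
The flag handling described in the commit above amounts to a conditional bitwise OR. The sketch below assumes a hypothetical helper `adjustFlags` and uses local placeholder constants in place of the OpenCL header definitions; the bit value chosen for the Intel flag is illustrative only and should be taken from the real headers.

```cpp
#include <cassert>
#include <cstdint>

// Placeholder constants standing in for CL_MEM_USE_HOST_PTR and
// CL_MEM_FORCE_HOST_MEMORY_INTEL; the second value is illustrative only.
constexpr std::uint64_t kUseHostPtr = 1ull << 3;
constexpr std::uint64_t kForceHostMemory = 1ull << 20;

// Hypothetical helper: when the debug flag is set and the buffer was created
// with the use-host-ptr flag, also request forced host memory (zero copy).
std::uint64_t adjustFlags(std::uint64_t flags, bool forceZeroCopyForUseHostPtr) {
    if (forceZeroCopyForUseHostPtr && (flags & kUseHostPtr) != 0) {
        flags |= kForceHostMemory;
    }
    return flags;
}
```

Buffers created without the use-host-ptr flag pass through unchanged in this sketch, matching the commit's description that only `CL_MEM_USE_HOST_PTR` buffers are affected.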
Wojciech Konior c65b45471b feature: support binary compatibility across multiple HW targets
- EnableCompatibilityMode flag added
- validateTargetDevice func modified to take into account the flag

Related-To: NEO-11568

Signed-off-by: Wojciech Konior <wojciech.konior@intel.com>
2024-11-04 16:53:57 +01:00
Lukasz Jobczyk 1f6eaf2525 refactor: Add debug flags to set PATs for dc flush mitigation
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-10-29 21:12:59 +01:00
Wenbin Lu 1c46ed9b40 feature: enable reservation from SVM range by default
Related-To: NEO-11981

Signed-off-by: Wenbin Lu <wenbin.lu@intel.com>
2024-10-24 20:01:18 +02:00
Jitendra Sharma b51be4e2dd refactor: fix description of debug variables
Fix description of debug variable DebugUmdInterruptTimeout
and DebugUmdMaxReadWriteRetry.

Related-To: NEO-13046
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2024-10-24 09:35:57 +02:00
Joshua Santosh Ranjan c9e48d0d2b refactor: support oa programmable metric group
Related-To: NEO-12184


Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2024-10-24 08:35:12 +02:00
Chodor, Jaroslaw 5f908ce092 feature: adding support for custom compiler backends
This adds the ability to load different versions of the backend
compiler based on underlying device.

Related-To: NEO-12747

Signed-off-by: Chodor, Jaroslaw <jaroslaw.chodor@intel.com>
2024-10-23 19:55:36 +02:00
Bartosz Dunajski 9d76158c1f feature: debug flag to change ULLS BCS timeout
Related-To: HSD-18040119232

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-10-23 18:18:42 +02:00
Lukasz Jobczyk e687e11ab1 performance: Add CCS Optimization
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-10-23 11:35:57 +02:00
Jitendra Sharma 171f1e27a3 fix: Add debug variables for configurable timeouts in debugger
Related-To: NEO-13046
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2024-10-23 10:30:03 +02:00
Wenbin Lu a8a40d2afd feature: support SVM heap in reserveVirtualMem
Related-To: NEO-11981

Signed-off-by: Wenbin Lu <wenbin.lu@intel.com>
2024-10-22 16:47:14 +02:00
Szymon Morek 01a0b8e7f7 performance: improve ULLS controller timeout detection
Related-To: NEO-12991

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-10-22 13:53:25 +02:00
Compute-Runtime-Validation e10998db45 Revert "performance: Add CCS Optimization"
This reverts commit e7b3a40aa7.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-10-22 05:52:14 +02:00
Michal Mrozek 18d828421d performance: add debug flag to control huge chunk size on wddm.
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-10-21 16:51:03 +02:00
Naklicki, Mateusz a5a11f4a0e feature: add debug key to set MaxSubSlicesSupported
Related-To: HSD-16025421624
Signed-off-by: Naklicki, Mateusz <mateusz.naklicki@intel.com>
2024-10-21 13:34:43 +02:00
Lukasz Jobczyk e7b3a40aa7 performance: Add CCS Optimization
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-10-21 09:59:14 +02:00
Jitendra Sharma 26709ba124 fix: Implement polling of SW FIFO
Related-To: NEO-12955
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2024-10-21 07:19:42 +02:00
Chodor, Jaroslaw 5463ddea06 feature: New forward-compatibility model for zeinfo
Up till now, NEO ignored unknown attributes in zeinfo
which could lead to undefined behavior. With this change
NEO will emit an error whenever an unknown attribute is
encountered.

Note: the old behavior can be restored using the new
IgnoreZebinUnknownAttributes debug environment variable

Resolves: NEO-11762

Signed-off-by: Chodor, Jaroslaw <jaroslaw.chodor@intel.com>
2024-10-17 14:03:01 +02:00
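
The stricter handling described in the commit above can be illustrated with a small sketch. It assumes a hypothetical `checkAttributes` function that compares attribute names against a known set and takes a boolean mirroring the IgnoreZebinUnknownAttributes debug variable. The function, the enum, and the attribute names used in any call are illustrative and do not reflect NEO's actual decoder.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical result type for this sketch.
enum class DecodeResult { success, unknownAttribute };

// Returns an error on the first attribute that is not in the known set,
// unless ignoreUnknown is true, in which case unknown attributes are skipped.
DecodeResult checkAttributes(const std::vector<std::string> &attributes,
                             const std::set<std::string> &known,
                             bool ignoreUnknown) {
    for (const auto &attribute : attributes) {
        if (known.count(attribute) == 0 && !ignoreUnknown) {
            return DecodeResult::unknownAttribute;
        }
    }
    return DecodeResult::success;
}
```

Failing early on an unknown attribute trades leniency for predictability: a binary produced by a newer compiler is rejected outright rather than being run with attributes the runtime does not understand.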
Compute-Runtime-Validation 2098e64dc1 Revert "feature: adding support for custom compiler backends"
This reverts commit 8098bcc48d.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-10-16 02:07:25 +02:00
Chodor, Jaroslaw 8098bcc48d feature: adding support for custom compiler backends
This adds the ability to load different versions of the backend
compiler based on underlying device.

Related-To: NEO-12747

Signed-off-by: Chodor, Jaroslaw <jaroslaw.chodor@intel.com>
2024-10-14 18:23:11 +02:00
Szymon Morek 7f2b806413 fix: Override timestamp width from KMD
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-10-14 13:38:33 +02:00
Wojciech Konior 6b40f9bc5a refactor: engineInstancedType removed
Related-To: NEO-12594

Signed-off-by: Wojciech Konior <wojciech.konior@intel.com>
2024-10-09 16:30:48 +02:00
Matias Cabral 6ddb550e05 feature: improve metrics debug messages
Resolves: NEO-12640

Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2024-10-07 17:58:41 +02:00
Michal Mrozek cb3463db05 fix: update description of debug env
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-10-03 10:10:33 +02:00
Bartosz Dunajski 0564a41919 feature: debug flag to control direct submission semaphore mode
Related-To: HSD-18039521047

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-10-02 21:06:53 +02:00
Filip Hazubski 72cf384c7d refactor: Fix typo
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2024-10-01 09:31:02 +02:00
Compute-Runtime-Validation 4f96b6132f Revert "feature: capture multiple cpu pagefault handler"
This reverts commit 4b3a6e9cfe.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-09-28 07:53:30 +02:00
Young Jin Yoon 4b3a6e9cfe feature: capture multiple cpu pagefault handler
Recorded multiple page fault handlers by using vector in
cpu_page_fault_manager_linux.

Added a static handlerIndex in order to track the depth of
handler logic to call appropriate previous handlers.

Related-To: NEO-11563
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2024-09-27 00:34:45 +02:00
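
The idea in the commit above, keeping several previous handlers in a vector and tracking call depth with an index, can be sketched as follows. This is an assumption-laden illustration rather than the code in cpu_page_fault_manager_linux: `HandlerChainSketch`, `callPreviousHandler`, and the boolean "handled" return value are hypothetical, and the index is a member here rather than a static.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical chain of previously installed fault handlers, oldest first.
struct HandlerChainSketch {
    std::vector<std::function<bool(int)>> previousHandlers;
    std::size_t handlerIndex = 0; // depth: handlers already tried for this fault

    // Tries the newest untried handler; if it does not handle the signal,
    // recurses one level deeper to the next older handler.
    bool callPreviousHandler(int signal) {
        if (handlerIndex >= previousHandlers.size()) {
            return false; // every recorded handler has been tried
        }
        auto &handler = previousHandlers[previousHandlers.size() - 1 - handlerIndex];
        ++handlerIndex;
        bool handled = handler(signal) || callPreviousHandler(signal);
        --handlerIndex; // restore depth so the next fault starts from the newest handler
        return handled;
    }
};
```

The depth counter is restored on the way out, so each new fault starts again from the most recently recorded handler.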