Commit Graph

605 Commits

Author SHA1 Message Date
Yates, Brandon
66581a0a1d L0 Debug - Avoid SSAH lookup when no threads are stopped
During async thread event processing, it was possible to
read SSAH before any threads had stopped and before it was
resident, resulting in an assert. This is both a fix for the
assertion and a minor optimization.

Signed-off-by: Yates, Brandon <brandon.yates@intel.com>
2022-12-13 03:07:22 +01:00
Jitendra Sharma
391941c447 Sysman: Enhance Scheduler compute unit debug mode implementation
This change helps in achieving the following:
- Moves the OS-specific code from scheduler_imp.cpp to OS-specific
files.
- Frees any drm resources, including Level Zero's, before enabling or
disabling Debug mode. Once Debug mode is toggled, Level Zero is
reinitialized.
- If the current mode is Debug mode and the user requests any other mode,
the new mode takes effect by unsetting Debug mode.

Related-To: LOCI-866

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2022-12-12 17:58:28 +01:00
Kamil Kopryk
03b687881f Rename HwHelper -> GfxCoreHelper
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2022-12-09 10:29:06 +01:00
Dunajski, Bartosz
61544f13cf RelaxedOrdering: Don't apply optimization for barrier calls
Related-To: NEO-7458

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2022-12-08 16:21:16 +01:00
Yates, Brandon
43ddabd8e6 L0 Debug - Change interrupt return code to match linux
Signed-off-by: Yates, Brandon <brandon.yates@intel.com>
2022-12-08 09:42:08 +01:00
Yates, Brandon
668149ab81 [L0 Debug] Add additional log messages to gfxmem r/w
Signed-off-by: Yates, Brandon <brandon.yates@intel.com>
2022-12-07 21:08:10 +01:00
Warchulski, Jaroslaw
be647d42d9 Cleanup includes 12
Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-12-07 13:14:15 +01:00
Jitendra Sharma
5baf75b9a8 Sysman: Redesign event API to effectively use uevents
The earlier implementation of the sysman events API was based on file
creation in the filesystem. Whenever a uevent arrived for an event
that needed to be monitored, a file was created in the filesystem
based on preinstalled udev rules.
This approach was inefficient because it depended heavily on the
filesystem, and it always carried a risk of losing events.

With this change, we remove the dependency on file
creation in the filesystem and instead use the libudev library
to monitor uevents. This approach could also be extended
to listen to all uevents for all the GPU
devices present in the system.

Related-To: LOCI-2140
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2022-12-07 07:29:57 +01:00
Warchulski, Jaroslaw
c10aa90815 Cleanup includes 11
Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-12-06 12:25:30 +01:00
Kamil Kopryk
73b2104183 Rename L0HwHelper -> L0GfxCoreHelper
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2022-12-05 11:26:05 +01:00
Mayank Raghuwanshi
69e6c8b1c6 Add support for retrieving max b/w for DG2
Related-To: LOCI-3425

Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
2022-12-01 10:45:32 +01:00
Bellekallu Rajkiran
6806a0fb36 Fix memory error counter reporting issue
Allocating too small a buffer to retrieve the memory error
count resulted in a failure to get the error count.

Add support to the igsc interface for getting information related to
buffer allocation.

Related-To: LOCI-3667

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-12-01 07:21:08 +01:00
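The fix described in 6806a0fb36 (ask the interface how much buffer space is needed before allocating) can be sketched in C++ as a size-then-read pattern. This is only an illustration under stated assumptions: `FakeFirmware`, `requiredSize`, `read` and `readAll` are hypothetical names and do not reflect the real igsc interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a firmware interface; this is NOT the igsc API.
struct FakeFirmware {
    std::vector<uint8_t> payload;

    // Reports how many bytes the caller must allocate.
    size_t requiredSize() const { return payload.size(); }

    // Fails (returns false) when the caller's buffer is too small.
    bool read(uint8_t *buffer, size_t size) const {
        if (size < payload.size()) {
            return false;
        }
        for (size_t i = 0; i < payload.size(); ++i) {
            buffer[i] = payload[i];
        }
        return true;
    }
};

// Query the size first, then allocate exactly that much before reading.
bool readAll(const FakeFirmware &fw, std::vector<uint8_t> &out) {
    out.resize(fw.requiredSize());
    return fw.read(out.data(), out.size());
}
```

With a fixed, undersized buffer the read fails, which mirrors the failure described in the commit message; sizing the buffer from the queried value avoids it.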
Matias Cabral
467119931c Add SIP version check
Make SLM access a single template function

Resolves: NEO-7335

Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2022-12-01 00:55:04 +01:00
Joshua Santosh Ranjan
fb8aa01a01 Metrics: Use physical subdevice index when using affinity mask
Related-To: LOCI-2975

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2022-11-29 07:48:39 +01:00
Warchulski, Jaroslaw
4100e1aa72 Cleanup includes 7
Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-11-28 17:01:48 +01:00
Mayank Raghuwanshi
eacf42455d Fix setting perf factor for media
Related-To: LOCI-3554

Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
2022-11-26 20:52:54 +01:00
Bellekallu Rajkiran
47a2d309bb Fix issue with board number property
Using too small a buffer resulted in an invalid board number.
Added logic to use a buffer of sufficient size to retrieve the
board number from PMT.

Added logic to provide decoded values rather than ASCII
characters.

Related-To: LOCI-3545

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-11-22 11:23:45 +01:00
Warchulski, Jaroslaw
f35f59b573 Cleanup includes 5
Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-11-18 22:46:38 +01:00
Igor Venevtsev
271a50d48e L0Debug Win: Fix process hang on exit in L0 debugger tests
Some L0 debug CTSs are intentionally written to exit without proper
resource cleanup, e.g. they do not call zetDebugDetach().
On Windows, cleanup of DebugSession may then be
called in the context of DllMain(DLL_PROCESS_DETACH).
At this point all threads other than main have already been terminated
by Windows, see the remarks for DLL_PROCESS_DETACH in
https://learn.microsoft.com/en-us/windows/win32/dlls/dllmain
In this case the worker thread object still exists and its handle and Id
are not null, but the corresponding Windows thread no longer exists and
the application waits forever for the threadFinished variable. We can
safely omit this wait, since join() will either return immediately if the
thread was killed by Windows or wait until the thread terminates normally.

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2022-11-18 17:34:11 +01:00
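The reasoning in 271a50d48e can be illustrated with a minimal C++ sketch. The `Worker` type and its members are hypothetical and only mirror the description in the commit message: shutdown signals the thread and relies on join() rather than spinning on a `threadFinished` flag that an already terminated thread would never set.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical worker, loosely modeled on the commit description;
// not the driver's actual DebugSession code.
struct Worker {
    std::atomic<bool> keepRunning{true};
    std::atomic<bool> threadFinished{false};
    std::thread thread;

    void start() {
        thread = std::thread([this]() {
            while (keepRunning.load()) {
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
            }
            threadFinished.store(true);
        });
    }

    void close() {
        keepRunning.store(false);
        // No busy-wait on threadFinished here: if the OS had already
        // terminated the thread, the flag would never be set and such a
        // wait would never finish. join() is the only synchronization.
        if (thread.joinable()) {
            thread.join();
        }
    }
};
```

In portable C++ the thread always exits normally, so this sketch only shows the shape of the shutdown path; the immediate return of join() for a thread killed during DLL_PROCESS_DETACH is the behavior the commit message describes for Windows.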
Mateusz Hoppe
5c23d05312 L0Debug - add support for blocking VM BIND on fence
Related-To: NEO-7454

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-11-18 12:00:23 +01:00
Mateusz Hoppe
e0370d25b9 L0Debug - Fix scratch offset calculation
- euRatio should only affect EUs offsets - not thread offsets

Resolves: NEO-7520

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-11-18 09:52:07 +01:00
Yates, Brandon
c95d40ccdd L0 debug- Windows - Add gpuVA to log for r/w gfx mem
Signed-off-by: Yates, Brandon <brandon.yates@intel.com>
2022-11-17 19:08:50 +01:00
Joshua Santosh Ranjan
7c050291bf Fix fabric ras errors accumulated to all devices
This patch fixes the issue that fabric RAS errors
from all devices are reported for every device.

Related-To: LOCI-3548

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2022-11-16 12:03:50 +01:00
Kamil Kopryk
88ed486f6b Move L0HwHelper ownership to RootDeviceEnvironment 3/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>

Use RootDeviceEnvironment getHelper<L0CoreHelper> for
- getAttentionBitmaskForSingleThreads
- getThreadsFromAttentionBitmask
2022-11-15 17:30:15 +01:00
Kamil Kopryk
aaa4e90ad4 Move L0HwHelper ownership to RootDeviceEnvironment 1/n
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>

Use RootDeviceEnvironment getHelper<L0CoreHelper> for
- setAdditionalGroupProperty
- createEvent
- isResumeWARequired
2022-11-15 08:24:23 +01:00
Mayank Raghuwanshi
ffcca3ba53 Use physical subdeviceId for sysman ras, freq and standby module
Related-To: LOCI-2925, LOCI-2926, LOCI-3236
Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
2022-11-14 14:10:23 +01:00
Mateusz Hoppe
5206fd1b9a L0Debug - interrupt stopped events for newly stopped threads
- do not mark the interrupt as complete when the thread was stopped
before handling the ATT event
- if no newly stopped threads are reported in the ATT event, the
interrupt triggers a thread unavailable event

Related-To: NEO-7501

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-11-10 15:25:24 +01:00
Yates, Brandon
99ef6c499b L0 debug - fix windows bitmask decode
Keep threads created in EU range (0,7)

Signed-off-by: Yates, Brandon <brandon.yates@intel.com>
2022-11-07 14:41:29 +01:00
Igor Venevtsev
f47e1306f2 L0Debug: do not set acknowledge flag for MODULE_UNLOAD event
Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2022-11-07 14:01:29 +01:00
Warchulski, Jaroslaw
fb25f96081 Cleanup includes 2
Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-11-07 10:36:50 +01:00
Bellekallu Rajkiran
4bbec2dbf4 Add support for board number property
Related-To: LOCI-3545

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-11-07 08:40:11 +01:00
Warchulski, Jaroslaw
ef95bfb45e Cleanup includes
Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-11-04 18:04:13 +01:00
Mateusz Hoppe
6f710bfad7 L0Debug - disallow attaching to multiple pids
Resolves: NEO-7476

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-11-03 19:21:53 +01:00
Matias Cabral
8d8f821b6c Optimize SLM ULTs execution time
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2022-11-03 11:12:13 +01:00
Matias Cabral
0772d32a76 Windows debugger access to SLM
Resolves: NEO-7382

Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2022-11-02 10:58:09 +01:00
Ezhilsivam Shanmugam
0a7166d10e Implement FanGetConfig sysman API for windows
Signed-off-by: Ezhilsivam Shanmugam <ezhilsivam.shanmugam@intel.com>
2022-10-29 00:55:11 +02:00
Joshua Santosh Ranjan
436ec1234b Sysman Add support for auxiliary bus for fabric Ras
Related-To: LOCI-3531

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2022-10-28 18:18:33 +02:00
Compute-Runtime-Validation
d653779098 Revert "L0 debug - Fix thread creation for windows DSS"
This reverts commit 3724807eed.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-10-22 14:38:24 +02:00
Artur Harasimiuk
9ad3f6190f do not sleep in ULTs
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2022-10-21 19:37:52 +02:00
Yates, Brandon
3724807eed L0 debug - Fix thread creation for windows DSS
Signed-off-by: Yates, Brandon <brandon.yates@intel.com>
2022-10-21 18:47:49 +02:00
Matias Cabral
b103b0c43f Reduce the SLM time waiting on ready CMD to 100 uSec
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2022-10-21 12:50:58 +02:00
Warchulski, Jaroslaw
90bc1a69d2 L0Debug - wait for the thread to start
Related-To: NEO-7322
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-10-20 10:44:16 +02:00
Yates, Brandon
789a53e8f8 L0 Debug - Don't pollute debug log with event timeouts
Signed-off-by: Yates, Brandon <brandon.yates@intel.com>
2022-10-19 17:26:31 +02:00
Jitendra Sharma
6a73001a3f Implement zesSchedulerComputeUnitDebugMode sysman Api
Related-To: LOCI-866

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2022-10-18 13:43:29 +02:00
Matias Cabral
4affe9907c Update SLM access offset behavior
Related-To: NEO-5998

Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2022-10-17 18:07:27 +02:00
Mateusz Hoppe
95505d87a5 L0Debug - fix interrupt
- pass deviceIndex based on deviceBitfield
- do not call ioctl again on EBUSY error

Resolves: NEO-7414

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-10-14 13:56:57 +02:00
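Commit 95505d87a5 stops re-issuing the interrupt ioctl when it fails with EBUSY. A minimal C++ sketch of such a retry policy follows. The `retryIoctl` wrapper, its `maxAttempts` parameter and the callable standing in for the real ioctl are hypothetical, and treating EINTR and EAGAIN as the retryable errors is an assumption, not something the commit message states.

```cpp
#include <cerrno>
#include <functional>

// Hypothetical wrapper, not the driver's actual helper.
// Re-issues the call while it fails with a transient error (assumed here
// to be EINTR or EAGAIN), but returns immediately on EBUSY.
int retryIoctl(const std::function<int()> &call, int maxAttempts = 8) {
    int ret = -1;
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        errno = 0;
        ret = call();
        if (ret == 0) {
            return 0;
        }
        if (errno != EINTR && errno != EAGAIN) {
            // EBUSY (and any other error) is reported to the caller as-is.
            return ret;
        }
    }
    return ret;
}
```

Returning on EBUSY leaves errno untouched, so the caller can inspect it and decide how to report the busy state instead of looping inside the wrapper.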
Yates, Brandon
44894c1fdf L0 Debug- Create generic topologyMap interface
Move Linux DebugSession thread conversion functions up to
DebugSessionImp to allow reuse in the Windows implementation

Signed-off-by: Yates, Brandon <brandon.yates@intel.com>
2022-10-13 15:12:05 +02:00
Matias Cabral
56109b882f Support debugger SLM write
Resolves: NEO-5998

Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2022-10-11 16:37:14 +02:00
Bellekallu Rajkiran
3323deb825 Add support for serial number property
Related-To: LOCI-3396

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-10-09 19:37:30 +02:00
Compute-Runtime-Validation
668f988e61 Revert "Add support for serial number property"
This reverts commit ba461e565e.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-10-09 06:59:32 +02:00