During async thread event processing, the SSAH could be read
before any thread had stopped and before it was resident,
which triggered an assert. This change fixes the assertion
and is also a minor optimization.
Signed-off-by: Yates, Brandon <brandon.yates@intel.com>
This change achieves the following:
- Moves the OS-specific code from scheduler_imp.cpp to OS-specific
files.
- Frees all DRM resources, including Level Zero's, before enabling or
disabling Debug mode. Once Debug mode is toggled, Level Zero is
reinitialized.
- If the current mode is Debug mode and the user requests any other
mode, Debug mode is unset so that the new mode takes effect.
Related-To: LOCI-866
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
The earlier implementation of the sysman events API was based on
file creation in the filesystem. Whenever a uevent arrived for an
event that needed to be monitored, a file was created in the
filesystem based on preinstalled udev rules.
This approach was inefficient because it depended heavily on the
filesystem, and it always carried the possibility of losing
events.
This change removes the dependency on file creation in the
filesystem. Instead, the libudev library is used to monitor
uevents. The approach can also be extended to listen to all
uevents for all GPU devices present in the system.
Related-To: LOCI-2140
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
Allocating too small a buffer to retrieve the memory error
count caused the error count query to fail.
Add support to the igsc interface to get information related to
buffer allocation.
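A minimal sketch of the general "query the size first, then allocate"
pattern this describes. All names here (QueryCountFn, ReadErrorsFn,
readMemoryErrorCounts) are hypothetical placeholders and are not igsc
functions; the callbacks only stand in for whatever the firmware
interface actually exposes.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical callbacks standing in for the firmware interface calls.
// Both return 0 on success and non-zero on failure (an assumption).
using QueryCountFn = std::function<int(uint32_t *count)>;
using ReadErrorsFn = std::function<int(uint32_t *entries, uint32_t capacity)>;

// Ask how many entries are needed, size the buffer accordingly, then read.
// Returns false if either step fails.
inline bool readMemoryErrorCounts(const QueryCountFn &queryCount,
                                  const ReadErrorsFn &readErrors,
                                  std::vector<uint32_t> &out) {
    uint32_t count = 0;
    if (queryCount(&count) != 0) {
        return false;
    }
    out.assign(count, 0u); // buffer sized from the reported count
    return readErrors(out.data(), count) == 0;
}
```

Sizing the buffer from the reported count avoids the failure caused
by a fixed, undersized allocation.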
Related-To: LOCI-3667
Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
Using a buffer that was too small resulted in an invalid board
number. Added logic to use a sufficiently large buffer when
retrieving the board number from PMT.
Added logic to provide decoded values rather than ASCII
characters.
Related-To: LOCI-3545
Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
Some L0 debug CTS tests are intentionally written to exit without
proper resource cleanup, e.g. they do not call zetDebugDetach().
On Windows, cleanup of the DebugSession can then be called in the
context of DllMain(DLL_PROCESS_DETACH).
At this point all threads other than the main thread have already
been terminated by Windows, see the remarks for DLL_PROCESS_DETACH in
https://learn.microsoft.com/en-us/windows/win32/dlls/dllmain
In this case the worker thread object still exists, and its handle
and ID are not null, but the corresponding Windows thread no longer
exists, so the application waits forever for the threadFinished
variable. This wait can safely be omitted, since join() will either
return immediately if the thread was killed by Windows or wait until
the thread terminates in the normal way.
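A minimal sketch of the shutdown pattern described above, written
with portable std::thread. AsyncWorker and its members are
illustrative names, not the actual DebugSession classes, and the
Windows thread-termination case itself is not reproduced here.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical worker wrapper, for illustration only.
struct AsyncWorker {
    std::thread thread;
    std::atomic<bool> stopRequested{false};
    std::atomic<bool> threadFinished{false};

    void start() {
        thread = std::thread([this] {
            while (!stopRequested.load()) {
                std::this_thread::yield();
            }
            // Set only when the thread exits normally.
            threadFinished.store(true);
        });
    }

    // Shutdown relies on join() alone. There is no loop waiting on
    // threadFinished, because that flag is never set if the OS has
    // already terminated the thread.
    void close() {
        stopRequested.store(true);
        if (thread.joinable()) {
            thread.join();
        }
    }
};
```

In the normal path join() still waits for the worker to finish, so
dropping the separate wait on the flag does not change ordering.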
Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
This patch fixes an issue where fabric RAS errors
from all devices were reported for every device.
Related-To: LOCI-3548
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
- do not mark the interrupt as complete when the thread was stopped
before the ATT event was handled
- if the ATT event reports no newly stopped threads, the interrupt
triggers a thread unavailable event
Related-To: NEO-7501
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- pass deviceIndex based on deviceBitfield
- do not call ioctl again on EBUSY error
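A minimal sketch of the two points above, under stated assumptions.
deviceIndexFromBitfield and ioctlWithRetry are hypothetical helper
names. Picking the lowest set bit as the index, and retrying only on
EINTR and EAGAIN, are assumptions; the message itself only states
that the call is not repeated on EBUSY.

```cpp
#include <bitset>
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <functional>

// Index of the lowest set bit in the bitfield, or 0 when no bit is set
// (assumed selection rule).
inline uint32_t deviceIndexFromBitfield(const std::bitset<32> &deviceBitfield) {
    for (uint32_t i = 0; i < deviceBitfield.size(); i++) {
        if (deviceBitfield.test(i)) {
            return i;
        }
    }
    return 0;
}

// Runs an ioctl-style call that returns -1 and sets errno on failure.
// Retries only on EINTR and EAGAIN; EBUSY is returned to the caller
// after a single attempt.
inline int ioctlWithRetry(const std::function<int()> &call) {
    int ret;
    do {
        ret = call();
    } while (ret == -1 && (errno == EINTR || errno == EAGAIN));
    return ret;
}
```

The call is passed in as a callable so the retry policy can be
exercised without a real device file descriptor.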
Resolves: NEO-7414
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Move the Linux DebugSession thread conversion functions up to
DebugSessionImp to allow reuse in the Windows implementation.
Signed-off-by: Yates, Brandon <brandon.yates@intel.com>