57 Commits

Author SHA1 Message Date
Mayank Raghuwanshi b733d56a36 Make calls to igsc from Sysman thread safe
Related-To: LOCI-4325

Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
2023-04-21 15:51:27 +02:00
Mayank Raghuwanshi 3816b85fa0 Add check for memory type before calculating ras hbm errors
Related-To: LOCI-3500

Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
2023-03-31 13:47:41 +02:00
Mateusz Jablonski 659cacf2c9 refactor l0 cmake: reduce include directories
Related-To: NEO-7507
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-17 13:41:55 +01:00
Bellekallu Rajkiran 2282f26734 feature(sysman): Support events for multiple devices
Related-To: LOCI-3683

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2023-03-07 09:50:32 +01:00
Warchulski, Jaroslaw c275008e51 Cleanup includes 32
Cleaned up files:
level_zero/core/source/cmdlist/cmdlist_hw.h
level_zero/core/source/cmdqueue/cmdqueue.h
level_zero/core/source/event/event.h
opencl/source/helpers/get_info_status_mapper.h
opencl/source/helpers/hardware_commands_helper.h
shared/source/helpers/per_thread_data.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-16 20:41:37 +01:00
Warchulski, Jaroslaw f275eea6ec Cleanup includes 14
Cleaned up files:
shared/source/device/device.h

Related-To: NEO-5548

Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2022-12-23 10:46:34 +01:00
Mayank Raghuwanshi 5edbca1aa2 Use physical subdevice for sysman engine module
Related-To: LOCI-3231

Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
2022-12-16 07:11:57 +01:00
Jitendra Sharma 391941c447 Sysman: Enhance Scheduler compute unit debug mode implementation
This change achieves the following:
- Moves the OS specific code from scheduler_imp.cpp to OS specific
files.
- Frees any drm resource, including level zero's, before enabling or
disabling Debug mode. Once Debug mode is toggled, level zero is
reinitialized.
- If the current mode is Debug mode and the user requests any other
mode, the new mode takes effect by unsetting Debug mode.

Related-To: LOCI-866

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2022-12-12 17:58:28 +01:00
Jitendra Sharma 5baf75b9a8 Sysman: Redesign event API to effectively use uevents
The earlier implementation of the sysman events API was based on file
creation in the filesystem. Whenever a uevent for a monitored event
arrived, a file was created in the filesystem based on preinstalled
udev rules. This approach was inefficient because it depended heavily
on the filesystem, and losing events was always a possibility.

With this change, we remove the dependency on file creation in the
filesystem and use the libudev library to monitor uevents instead.
This approach could also be extended to listen to all uevents for
all the gpu devices present in the system.

Related-To: LOCI-2140
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2022-12-07 07:29:57 +01:00
Artur Harasimiuk 9ad3f6190f do not sleep in ULTs
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2022-10-21 19:37:52 +02:00
Bellekallu Rajkiran 3323deb825 Add support for serial number property
Related-To: LOCI-3396

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-10-09 19:37:30 +02:00
Compute-Runtime-Validation 668f988e61 Revert "Add support for serial number property"
This reverts commit ba461e565e.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-10-09 06:59:32 +02:00
Bellekallu Rajkiran ba461e565e Add support for serial number property
Related-To: LOCI-3396

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-10-07 20:08:01 +02:00
Vilvaraj, T J Vivek 67b670c5b9 Sysman: warm reset needs hot plug interrupts disabled
Hot plug interrupts need to be disabled before issuing a warm
reset and re-enabled after the warm reset completes.

Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2022-09-29 17:21:23 +02:00
Vilvaraj, T J Vivek 0d23fa1a98 Sysman: increase sleep duration in warm reset
Warm reset needs a longer sleep duration after remove to ensure
that the PCIe state is saved and restored safely.

Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2022-09-26 17:57:54 +02:00
Bellekallu Rajkiran de06d91db8 Sysman: Fix few memory leaks
Invoking FwDeviceInit from several modules without closing the
igsc device results in a memory leak.

Invoke FwDeviceInit only during creation of the fw util
interface.

Related-To: LOCI-3204

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-09-14 16:16:30 +02:00
Bellekallu Rajkiran ceff16084d Fix temperature handle enumeration issue on single tile devices
Add platform check to read pmt offsets corresponding to
tile instead of root node for single tile devices.

Related-To: LOCI-2575

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-09-06 10:57:35 +02:00
Kulkarni, Ashwin Kumar 44649faa0f Defer Sysman Engine Module Initialization
With this change, the sysman Engine API is no longer initialized
during zeInit. Initialization, and with it Engine API handle creation,
happens only when the user explicitly requests handle enumeration
using zesDeviceEnumEngineGroups.

Related-To: LOCI-3127

Signed-off-by: Kulkarni, Ashwin Kumar <ashwin.kumar.kulkarni@intel.com>
2022-08-26 08:41:22 +02:00
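
Note: the deferred initialization described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the class name, the placeholder handles, and the call counter are all hypothetical, and the mutex-guarded flag is only one possible way to run init on first enumeration.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a sysman module's handle context; not the driver's real class.
class LazyHandleContextSketch {
  public:
    // Enumeration entry point: runs the expensive init on first use only.
    std::size_t enumerateHandles() {
        std::lock_guard<std::mutex> lock(initMutex);
        if (!initialized) {
            init();
            initialized = true;
        }
        return handles.size();
    }

    // Exposed only so the sketch can show how often init ran.
    int initCallCount() const { return initCalls; }

  private:
    void init() {
        ++initCalls;
        handles = {0, 1, 2}; // placeholder handles, invented for illustration
    }

    std::mutex initMutex;
    bool initialized = false;
    std::vector<int> handles;
    int initCalls = 0;
};
```

Constructing the context does no work; the first enumerateHandles() call pays the init cost and later calls reuse the result.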
Kulkarni, Ashwin Kumar 137959c647 Defer Sysman Power and Performance Module Initialization
With this change, the sysman Power/Performance API is no longer
initialized during zeInit.
Initialization, and with it Power/Performance API handle creation,
happens only when the user explicitly requests handle enumeration
using zesDeviceEnumPowerDomains/zesDeviceEnumPerformanceFactorDomains.

Related-To: LOCI-3127

Signed-off-by: Kulkarni, Ashwin Kumar <ashwin.kumar.kulkarni@intel.com>
2022-07-25 08:17:46 +02:00
Mateusz Hoppe 5956aea18d Limit header includes from level_zero device.h
- remove including debugger_l0.h from device.h
- add getL0Debugger() to shared NEO Device

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-07-06 16:41:17 +02:00
Kulkarni, Ashwin Kumar 49aaf62bbd Lazy init implementation for RAS module
Related-To: LOCI-3127

Signed-off-by: Kulkarni, Ashwin Kumar <ashwin.kumar.kulkarni@intel.com>
2022-07-04 18:29:57 +02:00
Vilvaraj, T J Vivek 894f90f89e SysMan:fix device reset
The distance from the SGUnit to the root port is constant, so the
root port and card bus are calculated based on this observation.
The warm reset function uses the root port and card bus to
preserve the PCI config space.

Resolves: LOCI-2899

Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2022-06-25 03:13:52 +02:00
Mayank Raghuwanshi 281c98dcf9 Add firmware util interface for sysman windows
Related-To: LOCI-3132

Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
2022-06-24 08:42:48 +02:00
Vilvaraj, T J Vivek c0121eb824 SysMan: fix issues in execution environment restoration.
The scope of the restorer lasts until the LinuxSysmanImp is deleted.
Ideally, the scope of the restorer should be function level.

Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2022-06-15 11:38:23 +02:00
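
Note: a function-level scope of the kind the commit above asks for is commonly obtained with an RAII guard. The sketch below is an assumption-laden illustration, not the driver's restorer: every name is hypothetical, and pairing incRef/decRef is chosen as the guarded action only because the neighboring commit mentions Inc/Dec calls on the execution environment.

```cpp
// Hypothetical reference-counted environment; stands in for the execution environment.
class RefCountedEnvSketch {
  public:
    void incRef() { ++refs; }
    void decRef() { --refs; }
    int refCount() const { return refs; }

  private:
    int refs = 0;
};

// Guard whose lifetime is the enclosing scope: incRef on entry, decRef on exit.
class ScopedEnvRefSketch {
  public:
    explicit ScopedEnvRefSketch(RefCountedEnvSketch &e) : env(e) { env.incRef(); }
    ~ScopedEnvRefSketch() { env.decRef(); }
    ScopedEnvRefSketch(const ScopedEnvRefSketch &) = delete;
    ScopedEnvRefSketch &operator=(const ScopedEnvRefSketch &) = delete;

  private:
    RefCountedEnvSketch &env;
};

// Hypothetical operation that needs the environment kept alive only while it runs.
int resetSketch(RefCountedEnvSketch &env) {
    ScopedEnvRefSketch guard(env); // reference held for this function only
    return env.refCount();         // count observed while the guard is alive
}
```

Because the guard is a local variable, the reference is dropped on every exit path of the function rather than when a longer-lived owner object is destroyed.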
Vilvaraj, T J Vivek 973bcb9dbc Sysman: cleanup execution env referrals
convert the present system of calling Inc/Dec of
execution environment to a more elegant solution

Resolves: LOCI-3165

Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2022-06-06 18:40:23 +02:00
Vilvaraj, T J Vivek da52303e6e SysMan: Diagnostics warm reset fix.
The following modifications were done as part of the fix
for warm reset.
1. Release sysman resources before quiescing the GPU.
2. Add additional checks to confirm the GPU is quiesced
before launching the diagnostics tests.
3. Fixed warm reset with wait time to allow the changes to be
propagated to the entire GPU PCI tree.
4. Modified the ULTs completely to avoid the usage of mocks.
5. Changed Diagnostics handle creation from per-SubDevice to per-device.

Related-To: LOCI-3053

Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2022-06-06 10:09:47 +02:00
Mateusz Jablonski 2a4c68dc38 Remove not needed dependencies from device_imp.h
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-06-02 18:35:34 +02:00
Artur Harasimiuk 819e0f5515 style: configure readability-identifier-naming.LocalVariableCase
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2022-05-16 12:39:44 +02:00
Vilvaraj, T J Vivek 47f7b4f509 sysman: clean up code duplication for reset
warm and cold reset are common functionality,
the code is being moved to the common sysman implementation
from diagnostics specific files.

Related-To: LOCI-1908
Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2022-03-31 14:10:39 +02:00
T J Vivek Vilvaraj b95428078e sysman: filter diagnostics related accesses
prevent diagnostics related calls on unsupported platforms


Signed-off-by: T J Vivek Vilvaraj <t.j.vivek.vilvaraj@intel.com>
2021-12-20 12:49:49 +01:00
Pichika Uday Kiran 7764924387 sysman: Avoid creating the IGSC library handle in ULTs
- Contains the changes to avoid invoking IGSC library
during ULT execution.


Related-To: LOCI-2719
Signed-off-by: Pichika Uday Kiran <pichika.uday.kiran@intel.com>
2021-12-02 19:30:35 +01:00
Vilvaraj, T J Vivek b91cec5655 sysman: mock firmware utility in sysman
firmware utility needs to be mocked to prevent file access while
executing ULT's


Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2021-11-17 07:57:46 +01:00
T J Vivek Vilvaraj e1a1e96110 sysman:close diagnostics handles before reset
Relates-to:LOCI-2650

Signed-off-by: T J Vivek Vilvaraj <t.j.vivek.vilvaraj@intel.com>
2021-11-15 21:30:13 +01:00
Jitendra Sharma 1186c1aae3 zesSysmanDeviceReset: Reinitialize device after device reset
Before performing a gpu device reset, all level zero resources
and gpu device specific resources have to be cleaned up. Because
the state of the gpu device is lost after a reset, the level zero
device has to be reinitialized afterwards by querying the gpu
device again.
This change reinitializes the level zero resources after a gpu
device reset, so that the user can continue using level zero
devices after the reset.

Related-To: LOCI-2627

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2021-11-07 23:43:48 +01:00
T J Vivek Vilvaraj f27f430429 sysman: fix firmware device enumeration in firmware Utils
firmware Utils was always enumerating the same firmware
device handle for all sysman devices.

Related-To:LOCI-2609

Signed-off-by: T J Vivek Vilvaraj <t.j.vivek.vilvaraj@intel.com>
2021-10-14 18:02:55 +02:00
Jitendra Sharma 135ec380fc Don't initialize sysman for linux if Driver Model is not DRM
Related-To: LOCI-2533

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2021-10-04 12:15:21 +02:00
Jitendra Sharma c46f591a99 Do not initialize Sysman if OsSysman Init failed
Related-To: LOCI-2552

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2021-09-30 09:11:55 +02:00
T J Vivek Vilvaraj aba7d74bcd remove unused libxml module from sysman
Signed-off-by: T J Vivek Vilvaraj <t.j.vivek.vilvaraj@intel.com>
2021-09-30 09:01:35 +02:00
Jaroslaw Chodor 0e9aa45e46 Improving OS abstraction
Signed-off-by: Jaroslaw Chodor <jaroslaw.chodor@intel.com>
2021-05-23 21:40:37 +02:00
Jitendra Sharma e1458fc95c To avoid seg fault, add null check before deleting PMT objects
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2021-05-18 12:37:15 +02:00
lgotszal 3bd4bca911 Copyright header update
Dates corrected in copyright headers to reflect original publication date
(2018 for OpenCL, 2020 for Level Zero).

Signed-off-by: lgotszal <lukasz.gotszald@intel.com>
2021-05-17 20:38:19 +02:00
Jitendra Sharma 46c51cb8a9 Sysman device reset stability fix
Close PMT, and PMU fds created during Sysman's init before calling
device reset.

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2021-04-15 11:53:10 +02:00
Jitendra Sharma 3597093758 Update Temperature APIs to get correct temperature
This change updates Temperature APIs to get correct current
temperature based on updated PMT interface.

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2021-02-06 18:39:58 +01:00
Vilvaraj, T J Vivek bddc63e8bd add firmware flashing utility interface
Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2020-12-22 09:10:11 +01:00
Bill Jordan 909107cab6 Fix zesDeviceReset for Spec version 1.
This patch does the following:
- Fixes a bug in FsAccess::listDirectory that could return
ZE_RESULT_UNKNOWN_ERROR when no error has occurred.
- Fixes a bug in zesDeviceReset that would reset the device
if force was set to false, even if the device was in use.
- Fixes a bug in zesDeviceReset that would reset the device
if force was set to false without closing the file descriptor.
- Added a releaseResources method to Device object.
This method does the same thing as the DeviceImp
destructor except it does not free the DeviceImp object
and it does not free the SysmanDeviceImp object.
- Added the releaseResources methods to Mock<Device> object.
- Moved the reset of the debugger out of DriverHandleImp
destructor and into DeviceImp releaseResources.
- Added a releaseEngine method to the EngineHandleContext. This
method frees all the Engine handles.
- On reset, I call the Device->releaseResources and
EngineHandleContext->releaseEngines before resetting the device.
- Added a -r (--reset) option to zello_sysman so I
could easily test resets.

With these patches, the L0 Sysman CTS for zesDeviceReset
both pass.

Change-Id: I31fad1b27bc5cc6befe31cd6f9319748e2683424
2020-10-05 19:55:14 +02:00
Jitendra Sharma 38ecb25ee6 Add support to query PMU counters
This change adds support to get engine active time and timestamp
from PMU interface.

Change-Id: I486dcce9fef3c7dc3f73fb8c7ea4c0bd020a6807
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2020-09-07 16:15:21 +02:00
mraghuwa 4b8d4285d7 add getDeviceHandle() at OS specific level
Change-Id: I95fc24043f8f603d6270323b0f23a78f9d8ad2f1
Signed-off-by: mraghuwa <mayank.raghuwanshi@intel.com>
2020-08-18 17:11:22 +02:00
Jaime Arteaga 902fc2f6c4 level-zero v1.0 (2/N)
Change-Id: I1419231a721fab210e166d26a264cae04d661dcd
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Signed-off-by: macabral <matias.a.cabral@intel.com>
Signed-off-by: davidoli <david.olien@intel.com>
Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
Signed-off-by: Latif, Raiyan <raiyan.latif@intel.com>
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2020-08-03 13:11:13 +02:00
Bill Jordan 6e20dfafab Added limited XML parsing capability to L0 Sysman using libxml2.
XML parsing for linux only, hanging off the OsSysmanImp object.

Change-Id: I473dc8dde0611cc13f38a2c0b59e839fced2e59e
Signed-off-by: Bill Jordan <bill.jordan@intel.com>
2020-07-16 12:12:45 -07:00
Jitendra Sharma 146fc900c3 Add initial sysman stub as per latest spec
Change-Id: I6f36b9faa21e05a6954de0b50ea01240539441d1
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2020-07-11 06:54:08 +05:30