Commit Graph

381 Commits

Author SHA1 Message Date
Bellekallu Rajkiran
5a2145ad8d Add prelim support for ras diagnostics and firmware
Related-To: LOCI-2864

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-03-03 18:51:21 +01:00
Bellekallu Rajkiran
922a224cc9 Add prelim support for temperature, power and global operations
Related-To: LOCI-2864

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-03-03 18:11:34 +01:00
Mateusz Hoppe
7a2c5e28c1 Add getLastCounter() to EuThread
Related-To: NEO-6447

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-02-25 17:14:01 +01:00
Bellekallu Rajkiran
b6d3b4cca6 Sysman: Fix incorrect frequency request value
Sysman queries the frequency request value from an incorrect
sysFs node, which results in an incorrect frequency request
value.
Query the value from the correct sysFs node instead.

Related-To: LOCI-2887

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-02-24 05:15:18 +01:00
Joshua Santosh Ranjan
d24c6cedfb Metrics Ip Sampling Fix Inclusions
This patch fixes isolation build issues
due to inclusions.

Related-To: LOCI-2707

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2022-02-23 10:02:21 +01:00
Joshua Santosh Ranjan
10f98b45db Metrics Add Os specific implementation Structure for IP Sampling
This patch adds OS specific implementation for IP Sampling.
Implementation for linux is provided as part of this patch.

Related-To: LOCI-2787

--- master-files
level_zero/tools/source/metrics/linux/os_metric_ip_sampling_imp_linux.cpp
level_zero/tools/source/metrics/os_metric_ip_sampling.h
level_zero/tools/source/metrics/windows/os_metric_ip_sampling_imp_windows.cpp
level_zero/tools/test/unit_tests/sources/metrics/linux/test_metric_ip_sampling_linux_prelim.cpp
level_zero/tools/test/unit_tests/sources/metrics/linux/test_metric_ip_sampling_linux_upstream.cpp
level_zero/tools/test/unit_tests/sources/metrics/windows/test_metric_ip_sampling_windows.cpp
--- master-files

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2022-02-18 23:15:44 +01:00
T.J.Vivek Vilvaraj
1e6a38035e sysman: serialize access to libigsc.
In a multi thread environment the access to the external
library needs to be synchronized.

Resolves: LOCI-2871, LOCI-2873

Signed-off-by: T.J.Vivek Vilvaraj <t.j.vivek.vilvaraj@intel.com>
2022-02-15 08:08:44 +01:00
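One common way to serialize access to an external library, as the commit above describes, is to route every call through a single mutex. The sketch below is illustrative only: `SerializedLibraryAccess` and `call` are hypothetical names, not taken from the commit, and the actual change may be structured differently.

```cpp
#include <mutex>
#include <utility>

// Hypothetical wrapper: every call into the external library goes through
// call(), which holds a mutex for the duration of the callable.
class SerializedLibraryAccess {
  public:
    template <typename Fn>
    auto call(Fn &&fn) -> decltype(fn()) {
        // lock_guard releases the mutex when this scope ends,
        // including when fn throws.
        std::lock_guard<std::mutex> lock(accessMutex);
        return std::forward<Fn>(fn)();
    }

  private:
    std::mutex accessMutex;
};
```

A caller would wrap each library entry point in a lambda passed to `call`, so that two threads never execute library code at the same time.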
Joshua Santosh Ranjan
cec0ea2809 Metrics Rename OA specific files
Rename OA specific files with _oa so that
implementation of other metric sources
could be added seamlessly.

Related-To: LOCI-2945

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2022-02-14 18:35:19 +01:00
Joshua Santosh Ranjan
596fe02dd3 Metrics Refactor Metric Streamer
This patch moves OA specific Metric Streamer implementation
to OA specific classes.

Related-To: LOCI-2905

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2022-02-08 22:42:05 +01:00
Joshua Santosh Ranjan
82ad3d61be Metrics Refactor MetricQuery and Metric Query Pool
This patch is to refactor MetricQuery and MetricQueryPool
so that Stall sampling metric could be integrated seamlessly

Related-To: LOCI-2904

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2022-02-07 19:34:22 +01:00
Joshua Santosh Ranjan
93e117fa9e Metrics Refactor Metric Group
Refactor Metric Group Implementation to move OA specific
implementation to OA specific classes.
This is so that stall sampling specific Metric Group
implementation could be done seamlessly.

Related-To: LOCI-2753

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2022-02-04 18:28:39 +01:00
Bartosz Dunajski
4b0d986876 Move AllocationType enum out of GraphicsAllocation class
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2022-02-04 17:49:09 +01:00
Joshua Santosh Ranjan
f1c50a8c89 Metric Refactor Metric Context
Rename Metric Context and move OA specific functions
and members from Metric Context to an OA specific class (MetricSource).
This refactoring is done so that additional Metric Sources
like Stall Sampling can be implemented seamlessly.

Related-To: LOCI-2753


Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2022-02-03 21:34:14 +01:00
Joshua Santosh Ranjan
78fa21f31a Metrics Refactor Rename Metric Source Specific Classes
Renaming Oa Specific classes.

Related-To: LOCI-2753

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2022-02-01 21:42:57 +01:00
Bellekallu Rajkiran
0bd60e524a Initialize telemetry device entry variable
Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-01-31 20:33:53 +01:00
Mayank Raghuwanshi
90963b95ad Update mechanism for getting subdeviceId and onSubdevice for memory
Earlier, the sysman memory module used the logical subdeviceId
exposed by core to retrieve memory telemetry data. Replace
the logical subdeviceId with the actual subdeviceId for collecting
telemetry data.

Related-To: LOCI-2828

Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
2022-01-28 07:52:47 +01:00
Ayush Pandey
715b9d31d2 Find sscanf alternative.
Used strtol() to write sscanfUtil, which extracts BDF info from the PCI path.

Related-To: LOCI-1002

Signed-off-by: Ayush Pandey <ayush.pandey@intel.com>
2022-01-21 09:02:48 +01:00
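As a rough illustration of parsing a PCI BDF string with strtol() instead of sscanf(), the sketch below assumes the usual `domain:bus:device.function` hexadecimal layout (for example `0000:3a:00.0`). `PciBdf` and `parseBdf` are hypothetical names; the commit's own sscanfUtil helper is not shown in this log and may differ.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical result type for the four BDF fields.
struct PciBdf {
    long domain;
    long bus;
    long device;
    long function;
};

// Parses "domain:bus:device.function" (hexadecimal fields).
// Returns false if a field has no digits or a separator is wrong.
// Note: strtol itself tolerates leading whitespace, a sign and a "0x"
// prefix; a stricter parser would reject those.
bool parseBdf(const std::string &text, PciBdf &out) {
    const char separators[3] = {':', ':', '.'};
    long fields[4] = {0, 0, 0, 0};
    const char *cursor = text.c_str();
    char *end = nullptr;

    for (int i = 0; i < 4; ++i) {
        fields[i] = std::strtol(cursor, &end, 16);
        if (end == cursor) {
            return false; // no hexadecimal digits were consumed
        }
        if (i < 3) {
            if (*end != separators[i]) {
                return false; // unexpected separator
            }
            cursor = end + 1;
        }
    }
    if (*end != '\0') {
        return false; // trailing characters after the function number
    }
    out = {fields[0], fields[1], fields[2], fields[3]};
    return true;
}
```

Unlike sscanf(), strtol() reports where parsing stopped through its end pointer, which is what makes the per-field validation above possible.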
Robert Krzemien
c724f35abb Fixed offsets in calculation for multidevices. Fixed metric types.
Related-To: LOCI-2870
Signed-off-by: Robert Krzemien <robert.krzemien@intel.com>
2022-01-19 17:10:09 +01:00
Szymon Morek
26a24e8fde Query engine info with distances
If prelim kernel is being used, query distances
and correctly set the number of available engines.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2022-01-10 13:30:26 +01:00
Szymon Morek
6258575e5e Use queryEngineInfo with prelim ioctls
If prelim kernel is being used, query engines
with prelim ioctls.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2022-01-04 12:15:04 +01:00
Mayank Raghuwanshi
25403cf85d Add support for device level performance factor
Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
2021-12-28 07:13:51 +01:00
Joshua Santosh Ranjan
e1ef9ac79a Revert "Metric Detect Paranoid mode and fail gracefully"
This reverts commit a720282358dff08fb36b95eaf9bf184efa315f48.

This revert is to avoid suggesting to disable paranoid mode.
This revert also avoids L0 metrics mandating paranoid mode
setting.

Related-To: LOCI-2822

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2021-12-23 19:30:47 +01:00
T J Vivek Vilvaraj
9a39cad07d sysman:add reinitialization code to diagnostics
Signed-off-by: T J Vivek Vilvaraj <t.j.vivek.vilvaraj@intel.com>
2021-12-23 18:27:06 +01:00
Vilvaraj, T J Vivek
15f102a7cb sysman:modify diagnostics tests
Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2021-12-22 14:53:38 +01:00
Szymon Morek
2647d563c7 Remove i915 structs from MemoryInfo
Use structs defined in ioctl_helper.h instead of
i915 dependent ones to avoid conflicts between
different kernels

Related-To: NEO-6149

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2021-12-22 12:46:59 +01:00
T J Vivek Vilvaraj
b95428078e sysman: filter diagnostics related accesses
prevent diagnostics related calls on unsupported platforms


Signed-off-by: T J Vivek Vilvaraj <t.j.vivek.vilvaraj@intel.com>
2021-12-20 12:49:49 +01:00
Michal Mrozek
62faecf6d5 Optimize virtual calls #2.
Reduce the cost of frequently used virtual calls.
The compiler cannot inline them, which causes overhead.

Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2021-12-16 16:26:59 +01:00
Ranjan, Joshua Santhosh
5a2a19fa1a Sysman Fix FirmwareUtil Cleanup
Fixed by avoiding library function access if library is unavailable.


Related-To: LOCI-2719

Signed-off-by: Ranjan, Joshua Santhosh <joshua.santosh.ranjan@intel.com>
2021-12-15 09:18:18 +01:00
Bellekallu Rajkiran
4ae2f6e111 Sysman: Add support for device level energy counters
Related-To: LOCI-2724

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2021-12-14 01:33:59 +01:00
Aleksei Keisel
1e2a57d533 Update MDAPI headers
Resolves: LOCI-2692
Signed-off-by: Aleksei Keisel <aleksei.keisel@intel.com>
2021-12-10 21:17:33 +01:00
Daniel Enriquez
cf70a57efb Sysman Windows: Fix Max Memory Bandwidth.
Signed-off-by: Daniel Enriquez <daniel.enriquez.montanez@intel.com>
2021-12-10 00:36:15 +01:00
Filip Hazubski
cf4ce308d9 Rename function
Rename multiDeviceCapable to implicitScalingCapable
Rename isMultiDeviceCapable to isImplicitScalingCapable

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2021-12-09 11:14:08 +01:00
Mayank Raghuwanshi
94d09f75b7 Get RAS HBM errors count using firmware interface
-- master-commit
Add functionality to retrieve memory errors from Firmware
-- master-commit

Related-To: LOCI-2491, LOCI-2726

Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
2021-12-08 18:57:24 +01:00
Joshua Santosh Ranjan
feae44bce8 Added Fabric RAS error support
fabric error counts are read from sysfs nodes

Related-To: LOCI-2613

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2021-12-08 07:00:39 +01:00
Bellekallu Rajkiran
a1121ccb6b Sysman: Replace mmap with pread
Use pread sys call instead of mmap and munmap
to get telemetry info.

Related-To: LOCI-2634


Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2021-12-06 17:44:29 +01:00
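The general shape of reading a value at a fixed file offset with pread(), rather than mapping the file with mmap() and unmapping it afterwards, might look like the sketch below. `readAtOffset` is a hypothetical helper name, and the path and offsets a caller would pass are assumptions; the actual telemetry code is not shown in this log.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: reads exactly `size` bytes starting at `offset`
// from the file at `path`. Returns false on open failure, read error,
// or a short read.
bool readAtOffset(const char *path, off_t offset, void *buffer, size_t size) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return false;
    }
    // pread reads at an explicit offset and does not move the file
    // position, so no lseek and no mapping are needed.
    ssize_t bytesRead = pread(fd, buffer, size, offset);
    close(fd);
    return bytesRead == static_cast<ssize_t>(size);
}
```

A single pread() call avoids setting up and tearing down a mapping for what is usually a small read.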
Pichika Uday Kiran
7764924387 sysman: Avoid creating the IGSC library handle in ULTs
- Contains the changes to avoid invoking IGSC library
during ULT execution.


Related-To: LOCI-2719
Signed-off-by: Pichika Uday Kiran <pichika.uday.kiran@intel.com>
2021-12-02 19:30:35 +01:00
Mayank Raghuwanshi
2ec2d514ec Update create Handle mechanism for sysman RAS
Use set instead of vector to get the supported error types.
Using a vector may cause duplication of error types when querying
supported error types from different interfaces, which in turn
may cause duplication of handles.

Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
2021-12-02 12:39:30 +01:00
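The deduplication effect described in the commit above can be sketched as follows. The `ErrorType` enum and `collectSupportedErrorTypes` are hypothetical stand-ins; the real sysman RAS types and interfaces are not shown in this log.

```cpp
#include <set>
#include <vector>

// Hypothetical error categories, for illustration only.
enum class ErrorType { Correctable, Uncorrectable };

// Merges the error types reported by several interfaces. Because the
// result is a std::set, a type reported by more than one interface is
// stored once, so one handle per type can be created afterwards.
std::set<ErrorType> collectSupportedErrorTypes(
    const std::vector<std::vector<ErrorType>> &reportedPerInterface) {
    std::set<ErrorType> supported;
    for (const auto &reported : reportedPerInterface) {
        supported.insert(reported.begin(), reported.end());
    }
    return supported;
}
```

With a vector as the accumulator, two interfaces that both report the same type would yield two entries, and therefore two handles, for that type.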
Vilvaraj, T J Vivek
0d86842780 Sysman: add Cold Reset to diagnostics API
The diagnostics API expects the device to be correctly reset based on
the type of diagnostics result. A cold reset is expected when a
repair is scheduled on the device.

Relates-to:LOCI-2508


Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2021-11-30 20:12:34 +01:00
Bellekallu Rajkiran
ede0123561 Update sysfs path for setting standby mode
Related-To: LOCI-2734

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2021-11-29 23:54:06 +01:00
Szymon Morek
12777bd758 Move MemoryInfoImpl logic to MemoryInfo
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2021-11-29 13:13:11 +01:00
Daniel Enriquez
0ce5c6c9c9 Windows Sysman: Updating VRAM memory Type.
Adding support for the complete range of memory types.

Signed-off-by: Daniel Enriquez <daniel.enriquez.montanez@intel.com>
2021-11-27 11:51:44 +01:00
Vilvaraj, T J Vivek
35607e7830 sysman: add warm reset capability to diagnostics.
Relates-to:LOCI-2507

Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2021-11-25 21:00:19 +01:00
Joshua Santosh Ranjan
ed6b30af12 Metrics Library Release For Query Case
Release Metrics Library after Query related objects are released

Related-To: LOCI-2656

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2021-11-24 19:24:31 +01:00
Daniel Enriquez
f131b75d39 Events Windows:Fix corner case for the exit handle.
Corner case where the signal state is not restored after registering events.

Signed-off-by: Daniel Enriquez <daniel.enriquez.montanez@intel.com>
2021-11-22 15:19:26 +01:00
Joshua Santosh Ranjan
d15eed035b Metrics Restore addressOffsetCCSOffset after query programming
Related-To: LOCI-2711

Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
2021-11-17 11:57:51 +01:00
Vilvaraj, T J Vivek
b91cec5655 sysman: mock firmware utility in sysman
The firmware utility needs to be mocked to prevent file access while
executing ULTs.


Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
2021-11-17 07:57:46 +01:00
Mateusz Hoppe
35795357e9 DebugSession - add printBitmask()
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2021-11-15 21:40:27 +01:00
T J Vivek Vilvaraj
e1a1e96110 sysman:close diagnostics handles before reset
Relates-to:LOCI-2650

Signed-off-by: T J Vivek Vilvaraj <t.j.vivek.vilvaraj@intel.com>
2021-11-15 21:30:13 +01:00
Zbigniew Zdanowicz
f90932cca7 Use references instead of copy ctors
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2021-11-15 14:42:33 +01:00
Jitendra Sharma
1186c1aae3 zesSysmanDeviceReset: Reinitialize device after device reset
Before performing a gpu device reset, all level zero resources
and gpu device specific resources have to be cleaned up first. The
state of the gpu device is also lost after a device reset.
Hence, after performing a gpu device reset, the level zero device has
to be reinitialized by querying the gpu device again.
This change reinitializes the level zero resources
after a gpu device reset, so that the user can continue using level zero
devices after the reset.

Related-To: LOCI-2627

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2021-11-07 23:43:48 +01:00