This patch adds an OS-specific implementation for IP Sampling.
The Linux implementation is provided as part of this patch.
Related-To: LOCI-2787
--- master-files
level_zero/tools/source/metrics/linux/os_metric_ip_sampling_imp_linux.cpp
level_zero/tools/source/metrics/os_metric_ip_sampling.h
level_zero/tools/source/metrics/windows/os_metric_ip_sampling_imp_windows.cpp
level_zero/tools/test/unit_tests/sources/metrics/linux/test_metric_ip_sampling_linux_prelim.cpp
level_zero/tools/test/unit_tests/sources/metrics/linux/test_metric_ip_sampling_linux_upstream.cpp
level_zero/tools/test/unit_tests/sources/metrics/windows/test_metric_ip_sampling_windows.cpp
--- master-files
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
Rename OA specific files with _oa so that
implementation of other metric sources
could be added seamlessly.
Related-To: LOCI-2945
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
This change uses the value of cpuAddress from the monitored fence
to detect a GPU hang.
Related-To: NEO-5313
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
This patch refactors MetricQuery and MetricQueryPool
so that the Stall Sampling metric can be integrated seamlessly.
Related-To: LOCI-2904
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
Refactor Metric Group Implementation to move OA specific
implementation to OA specific classes.
This is so that stall sampling specific Metric Group
implementation could be done seamlessly.
Related-To: LOCI-2753
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
Rename Metric Context and move OA specific functions
and members from Metric Context to an OA specific class (MetricSource).
This refactoring is done so that additional Metric Sources
like Stall Sampling can be implemented seamlessly.
Related-To: LOCI-2753
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
Related-To: NEO-6575
This is needed to fix accesses to IoctlHelper
after driver detach.
This also reduces accesses to the
sysfs file in Drm::getPrelimVersion.
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
Earlier, the sysman memory module used the logical subdeviceId
exposed by core to retrieve memory telemetry data. Replace
the logical subdeviceId with the actual subdeviceId for collecting
telemetry data.
Related-To: LOCI-2828
Signed-off-by: Mayank Raghuwanshi <mayank.raghuwanshi@intel.com>
- Added functionality to pass ze_power_saving_hint_type_t to zeContextCreate
through the pNext extensions in ze_context_desc_t.
- Enables handling a hint value of 0-100, with 0 being no power savings
and 100 being maximum power savings.
- ZE_RESULT_ERROR_INVALID_ENUMERATION is returned when given an invalid hint.
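
The range check described above could look roughly like the sketch below. This is only an illustration: Result, PowerSavingHintDesc, ContextDesc and validatePowerSavingHint are hypothetical stand-ins, not the Level Zero types from ze_api.h, and the real driver code may also inspect the structure type of each pNext entry.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration only; the real types live in ze_api.h.
enum class Result { Success, ErrorInvalidEnumeration };

struct PowerSavingHintDesc { // models the extension struct chained through pNext
    uint32_t hint;           // 0 = no power savings, 100 = maximum power savings
};

struct ContextDesc { // models ze_context_desc_t
    const void *pNext = nullptr;
};

constexpr uint32_t kHintMax = 100;

// Accepts a missing hint or a hint in [0, 100]; rejects anything larger.
Result validatePowerSavingHint(const ContextDesc &desc) {
    if (desc.pNext == nullptr) {
        return Result::Success; // no hint supplied
    }
    const auto *hintDesc = static_cast<const PowerSavingHintDesc *>(desc.pNext);
    if (hintDesc->hint > kHintMax) {
        return Result::ErrorInvalidEnumeration;
    }
    return Result::Success;
}

// Usage: a hint of 50 is accepted, a hint of 150 is rejected.
void usageExample() {
    PowerSavingHintDesc hint{50};
    ContextDesc desc;
    desc.pNext = &hint;
    assert(validatePowerSavingHint(desc) == Result::Success);

    hint.hint = 150;
    assert(validatePowerSavingHint(desc) == Result::ErrorInvalidEnumeration);
}
```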
Related-To: LOCI-2567
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
In some of the drm functions there is a pattern
of storing an array in a unique_ptr and passing its length
as an argument. This commit simplifies this.
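
As an illustration of the pattern, the sketch below contrasts a unique_ptr array plus a separate length with a container that carries its own size. The function name sumRegions is hypothetical, and using std::vector is an assumption; the message does not say which replacement the commit actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <numeric>
#include <vector>

// Before: the buffer and its length travel separately and can get out of sync.
uint64_t sumRegions(const std::unique_ptr<uint32_t[]> &regions, size_t length) {
    uint64_t total = 0;
    for (size_t i = 0; i < length; ++i) {
        total += regions[i];
    }
    return total;
}

// After (assumed simplification): the container knows its own size.
uint64_t sumRegions(const std::vector<uint32_t> &regions) {
    return std::accumulate(regions.begin(), regions.end(), uint64_t{0});
}

// Usage: both forms give the same result for the same data.
void usageExample() {
    std::unique_ptr<uint32_t[]> raw(new uint32_t[3]{1, 2, 3});
    std::vector<uint32_t> owned{1, 2, 3};
    assert(sumRegions(raw, 3) == sumRegions(owned));
}
```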
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
This reverts commit a720282358dff08fb36b95eaf9bf184efa315f48.
This revert is to avoid suggesting to disable paranoid mode.
This revert also avoids L0 metrics mandating paranoid mode
setting.
Related-To: LOCI-2822
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
Use structs defined in ioctl_helper.h instead of
i915-dependent ones to avoid conflicts between
different kernels.
Related-To: NEO-6149
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
Fixed by avoiding library function access if library is unavailable.
Related-To: LOCI-2719
Signed-off-by: Ranjan, Joshua Santhosh <joshua.santosh.ranjan@intel.com>
Use pread sys call instead of mmap and munmap
to get telemetry info.
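
A minimal sketch of reading a value at a known offset with pread, without mapping the file, is shown below. The helper name readTelemetryValue, the 32-bit value width and the temporary file in the usage example are assumptions for illustration; they are not taken from the sysman sources.

```cpp
#include <cassert>
#include <cstdint>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

// Reads one 32-bit value located byteOffset bytes into the file at path.
// Returns true only if the full value was read.
bool readTelemetryValue(const char *path, off_t byteOffset, uint32_t &value) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return false;
    }
    // pread takes the offset directly, so no mmap/munmap or lseek is needed.
    ssize_t bytesRead = pread(fd, &value, sizeof(value), byteOffset);
    close(fd);
    return bytesRead == static_cast<ssize_t>(sizeof(value));
}

// Usage: write two values to a temporary file and read back the second one.
void usageExample() {
    char path[] = "/tmp/telem_XXXXXX";
    int fd = mkstemp(path);
    assert(fd >= 0);
    const uint32_t values[2] = {0x11111111u, 0xCAFEF00Du};
    ssize_t written = write(fd, values, sizeof(values));
    close(fd);

    uint32_t value = 0;
    bool ok = written == static_cast<ssize_t>(sizeof(values)) &&
              readTelemetryValue(path, sizeof(uint32_t), value);
    unlink(path);
    assert(ok && value == 0xCAFEF00Du);
}
```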
Related-To: LOCI-2634
Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
The diagnostics API expects the device to be correctly reset based on
the type of diagnostics result. A cold reset is expected when there is
a repair scheduled on the device.
Relates-to:LOCI-2508
Signed-off-by: Vilvaraj, T J Vivek <t.j.vivek.vilvaraj@intel.com>
Release Metrics Library after Query related objects are released
Related-To: LOCI-2656
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
Long timeout for event listen API is resulting in higher ULT execution time
for sysman ULTs. Hence reducing this timeout.
Related-To: NEO-6412
Signed-off-by: Ayush Pandey <ayush.pandey@intel.com>
Before performing a GPU device reset, all Level Zero resources
and GPU device specific resources have to be cleaned up first. Also,
the state of the GPU device is lost after a device reset.
Hence, after performing a GPU device reset, the Level Zero device has
to be reinitialized by querying the GPU device again.
This change reinitializes the Level Zero resources
after a GPU device reset, so that the user can continue using Level Zero
devices after the reset.
Related-To: LOCI-2627
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>