documentation: Add metrics GPU specific document

Add new file with L0 GPU specific details of metrics support Resolves: LOCI-4209 Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2023-05-17 06:30:31 -07:00 · 2023-05-17 06:30:31 -07:00 · 8f159308ac
parent d02885767d
commit 8f159308ac
4 changed files with 70 additions and 10 deletions
--- a/README.md
+++ b/README.md
@ -1,6 +1,6 @@
 <!---
-Copyright (C) 2018-2021 Intel Corporation
+Copyright (C) 2018-2023 Intel Corporation
 SPDX-License-Identifier: MIT
@ -85,8 +85,8 @@ Directly linking to the runtime library is not supported:
 * Intel Graphics Compiler - https://github.com/intel/intel-graphics-compiler
 In addition, to enable performance counters support, the following packages are needed:
-* Intel Metrics Discovery - https://github.com/intel/metrics-discovery
+* Intel(R) Metrics Discovery (MDAPI) - https://github.com/intel/metrics-discovery
-* Intel Metrics Library for MDAPI - https://github.com/intel/metrics-library
+* Intel(R) Metrics Library for MDAPI - https://github.com/intel/metrics-library
 ## How to provide feedback
@ -103,6 +103,7 @@ for more details.
 ## See also
 * [Contribution guidelines](https://github.com/intel/compute-runtime/blob/master/CONTRIBUTING.md)
 * [Programmers Guide](https://github.com/intel/compute-runtime/blob/master/programmers-guide/PROGRAMMERS_GUIDE.md)
 * [Frequently Asked Questions](https://github.com/intel/compute-runtime/blob/master/FAQ.md)
 ### Level Zero specific
--- a/programmers-guide/IMMEDIATE_COMMANDLIST.md
+++ b/programmers-guide/IMMEDIATE_COMMANDLIST.md
@ -31,7 +31,7 @@ Since the intention of immediate command lists are to primarily provide a razor
 ## Programming model
 Pseudo-code for creating immediate command list (async mode):
-```
+```cpp
    ze_command_queue_desc_t cmdQueueDesc = {ZE_STRUCTURE_TYPE_COMMAND_QUEUE_DESC};
    cmdQueueDesc.pNext = nullptr;
    cmdQueueDesc.flags = 0;
@ -44,7 +44,7 @@ Pseudo-code for creating immediate command list (async mode):
 Submitting commands and synchronization:
 Launching kernels:
-```
+```cpp
    zeCommandListAppendLaunchKernel(cmdList, kernel, &dispatchTraits,
                                    events[0], 0, nullptr);
    // If Async mode, use event for sync
@ -52,7 +52,7 @@ Launching kernels:
 ```
 Performing copies:
-```
+```cpp
    zeCommandListAppendMemoryCopy(cmdList, deviceBuffer, hostBuffer, allocSize,
                                    events[0],
                                    0, nullptr);
@ -66,7 +66,7 @@ Performing copies:
 ```
 Pseudo-code for creating immediate command list (sync mode):
-```
+```cpp
    ze_command_queue_desc_t cmdQueueDesc = {ZE_STRUCTURE_TYPE_COMMAND_QUEUE_DESC};
    cmdQueueDesc.pNext = nullptr;
    cmdQueueDesc.flags = 0;
@ -78,7 +78,7 @@ Pseudo-code for creating immediate command list (sync mode):
 ```
 Launching kernel:
-```
+```cpp
    zeCommandListAppendLaunchKernel(cmdList, kernel, &dispatchTraits,
                                    nullptr, 0, nullptr);
 ```
--- a/programmers-guide/METRICS.md
+++ b/programmers-guide/METRICS.md
@ -0,0 +1,58 @@
 <!---
 Copyright (C) 2023 Intel Corporation
 SPDX-License-Identifier: MIT
 -->
 # GPU Metrics collection in Level Zero
 * [Introduction](#Introduction)
 * [Dependencies](#Dependencies)
 * [Environment Setup](#Environment-Setup)
 * [EU Stall Sampling](#EU-Stall-Sampling)
 * [Limitations](#Limitations)
 # Introduction
 Implementation independent details of Level-Zero metrics are described in the Level-Zero specification [Metrics Section](https://spec.oneapi.io/level-zero/latest/tools/PROG.html#metrics). This implementation supports Time-based and Event-based sampling. Two domains are supported, one for collecting GPU performance metrics and one for collecting EU stall sampling data (type ZET_METRIC_TYPE_IP).
 # Dependencies
 Metrics collection depends on:
 * Intel(R) Metrics Discovery (MDAPI) - https://github.com/intel/metrics-discovery
 * Intel(R) Metrics Library for MDAPI - https://github.com/intel/metrics-library
 # Environment Setup
 As described in Level-Zero specification [Tools Section](https://spec.oneapi.io/level-zero/latest/tools/PROG.html#environment-variables) environment variable `ZET_ENABLE_METRICS` must be set to 1.
 ## Linux
 Additionally in Linux environment, is is required to disable the kernel module driver i915 performance stream paranoid mode. This can be done with command
 ```
 sudo sysctl dev.i915.perf_stream_paranoid=0
 ```
 # EU Stall Sampling 
 HW-assisted EU stall sampling allows statistically correlating Xe-Vector Engine (XVE) stall events to the executed instructions and breaks down the stall events by different stall reasons. Using the Instruction Pointer it is possible to point to the GPU kernel source code line causing the most stalls. 
 # Limitations
 ## EU Stall Sampling 
 * Only supported on Linux
 * Does not support streamer markers
 * The inherent nature of the samples only make sense for Time-based sampling. Therefore, Event-based sampling is not supported. 
 ## GPU performance metrics
 * To obtain the most recent metric values using Time-based sampling, it is necessary to read all metrics reports from the hardware buffer and calculate them all. This may be costly operation if the hardware buffer is not read at frequent intervals. Therefore, it is recommended to call zetMetricStreamerReadData() at a time interval that does not require processing big number of reports. This can be calculated based on the sampling rate decided when opening the metrics streamer (zet_metric_streamer_desc_t.samplingPeriod).
 ## notifyEveryNReports
 * Linux support for notifyEveryNReports on performance metrics will always return true when one metric report is available.
--- a/programmers-guide/PROGRAMMERS_GUIDE.md
+++ b/programmers-guide/PROGRAMMERS_GUIDE.md
@ -12,8 +12,9 @@ SPDX-License-Identifier: MIT
 This document provides the architectural design followed in the Intel(R) Graphics Compute Runtime for oneAPI Level Zero and OpenCL(TM) Driver. Implementation details and optimization guidelines are explained, as well as a description of the different features available for the different supported platforms.
 ### [Allocations greater than 4GB](ALLOCATIONS_GREATER_THAN_4GB.md)
 ### [Implicit scaling](IMPLICIT_SCALING.md)
 ### [Immediate Commandlist](IMMEDIATE_COMMANDLIST.md)
-### [System Memory Allocations in Level Zero](SYSTEM_MEMORY_ALLOCATIONS.md)
+### [L0 Metrics](METRICS.md)
 ### [Module Symbols and Linking in Level Zero](MODULE_SYMBOL_SUPPORT.md)
-### [Allocations greater than 4GB](ALLOCATIONS_GREATER_THAN_4GB.md)
+### [System Memory Allocations in Level Zero](SYSTEM_MEMORY_ALLOCATIONS.md)