This change:
- moves NEO::WaitStatus to a separate file
- enables detection of GPU hang in clWaitForEvents
- adjusts most of blocking calls in CommandStreamReceiver to return WaitStatus
- adds ULTs to cover the new code
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
This patch adds OS specific implementation for IP Sampling.
Implementation for linux is provided as part of this patch.
Related-To: LOCI-2787
--- master-files
level_zero/tools/source/metrics/linux/os_metric_ip_sampling_imp_linux.cpp
level_zero/tools/source/metrics/os_metric_ip_sampling.h
level_zero/tools/source/metrics/windows/os_metric_ip_sampling_imp_windows.cpp
level_zero/tools/test/unit_tests/sources/metrics/linux/test_metric_ip_sampling_linux_prelim.cpp
level_zero/tools/test/unit_tests/sources/metrics/linux/test_metric_ip_sampling_linux_upstream.cpp
level_zero/tools/test/unit_tests/sources/metrics/windows/test_metric_ip_sampling_windows.cpp
--- master-files
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
according to implicit args design for SIMD-1 local ids are one-by-one
Resolves: NEO-6692
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
- PROCESS_ENTRY - triggered by first zeCommandQueueCreate()
- PROCESS_EXIT - triggered by last zeCommandQueueDestroy()
Resolves: NEO-6503
Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
This change introduces detection of GPU hangs in
zeEventHostSynchronize and zeFenceHostSynchronize.
Furthermore, if CommandQueueHw::executeCommandLists
uses ZE_COMMAND_QUEUE_MODE_SYNCHRONOUS and hang occurs,
the information about it is propagated to the caller.
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
This commit fixes setting usesStringMap flag for printf, taking into
account using indirect functions in legacy (non-zebinary) path. It also
adds new field to kernelDescriptor, specifying the binary type
(legacy/zebin).
Related-To: NEO-6604
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Related-To: NEO-6075
After this change driver will fail clBuildProgram/zeModuleCreate api calls
whenever stateful access is discovered on PVC.
This is required since in this case allocation greater than 4GB
will not work.
If user still wants to use stateful addressing mode,
-cl-opt-smaller-than-4GB-buffers-only / -ze-opt-smaller-than-4GB-buffers-only
build option should be passed as build option, but then user can not use
bufers greater than 4GB.
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
This commits removes part of condition requiring requiresImplicitArgs
flag set in kernel descriptor in order to set usesStringMap flag.
Related-To: NEO-6604
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
This patch moves OA specific Metric Streamer implementation
to OA specific classes.
Related-To: LOCI-2905
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
- every isa allocation will have ELF linked
- fix for debug elf from patchtoken binary:
pass relocated ELF when exists
- simplify code
Related-To: NEO-5571
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
In OCL product family of target device is not set
which leads to a fail on validating target device in
ZEBin path.
This change adds function that sets all
necessary fields based on provided hardware info.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
Forces extended buffer size by adding pageSize specify by number when
debug flag is >=1 in L0 USM calls
Usage:
ForceExtendedUSMBufferSize=2
size += (2 * pageSize)
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
This patch is to refactor MetricQuery and MetricQueryPool
so that Stall sampling metric could be integrated seamlessly
Related-To: LOCI-2904
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
- In WSL the handle for DMA buf is an NT handle.
- To share and import this memory we check if the
handle is an NT handle before attempting to
load as an FD.
- If the handle is and NT handle, then we open the fd
as an NT handle.
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
when local ids are generated by HW, use same dim order for runtime generation
move common logic to separated file
Related-To: NEO-5081
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
- Allow flushTask usage for XeHp+ only
- Fix black box test to only use Copy queue if found
Related-To: LOCI-1988
Signed-off-by: Aravind Gopalakrishnan <aravind.gopalakrishnan@intel.com>
Rename Metric Context and move OA specific functions
and members from Metric Context to OA specific class(MetricSource).
This refactoring is done so that additional Metric Source
like Stall Sampling could be Seamlessly implemented.
Related-To: LOCI-2753
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
If kernel has no stateless indirect accesses don't set the
kernelHasIndirectAccess flag.
Don't make resident or migrate if kernel has no indirect accesses.
Changed initial values in KernelAttributes.
Related-To: NEO-6597
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Instead of creating new allocation per fence, use the task count.
Fence synchronize will wait for task count update.
Related-To: NEO-6634
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>