In most cases, there was code redundancy, which was minimized in this change.
The setupHardwareInfoBase extraction will also be used in ocloc.
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
Related-To: NEO-6910
This change:
* Adds support for build options section in zebinary - using
build options in binary when rebuilding.
* Appends "-cl-intel-allow-zebin" flag to build options when zebin is
used.
Resolves: NEO-6916
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
With this commit, OpenCL tracks whether external host memory is used from
several threads and ensures that the task count is updated in all threads
before destroying the allocation.
Resolves: NEO-6807
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
use tag allocation address as a completion address in exec call
wait for completion value before destroying drm direct submission
Related-To: NEO-6643
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Do not apply relocations with types other than {1, 2, 3} when creating
debug zebin.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
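A minimal sketch of the relocation filtering described in the commit above. The commit message only names the numeric types {1, 2, 3}; the Relocation struct and both function names below are hypothetical and are not taken from the driver sources.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical relocation record; field names are illustrative only.
struct Relocation {
    uint64_t offset;
    uint32_t type;
};

// The commit message lists only the numeric types {1, 2, 3} as applicable.
inline bool isSupportedDebugRelocation(uint32_t type) {
    return type == 1u || type == 2u || type == 3u;
}

// Returns only the relocations that would be applied to the debug zebin.
inline std::vector<Relocation> filterDebugRelocations(const std::vector<Relocation> &all) {
    std::vector<Relocation> applicable;
    for (const auto &reloc : all) {
        if (isSupportedDebugRelocation(reloc.type)) {
            applicable.push_back(reloc);
        }
    }
    return applicable;
}
```

Relocations of any other type are skipped silently in this sketch; whether the driver logs or counts them is not stated in the commit message.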
- helper sets all SbaAddresses for debugger in
EncodeStateBaseAddress<GfxFamily>::setSbaAddressesForDebugger()
- change DebuggerL0::captureStateBaseAddress() to take
LinearStream
- move getSbaTrackingCommandsSize() to Debugger class
Related-To: NEO-6845
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
This commit prevents a YAML parsing error in case a data type is passed
after an empty vector type data entry with the same indentation. In this
case, a parsing error will be returned.
- Corrected .ze_info section in valid empty program (zebin mock)
- Minor ults refactor in order to use mock zebin program with valid
.ze_info
Related-To: NEO-6735
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
feat(zebin): Use addend from RELA sections when performing relocations.
Resolves: NEO-6898
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
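As an illustration of the RELA handling named in the commit above: a RELA entry carries an explicit addend, so the patched value is the symbol address plus that addend. The sketch below assumes a 64-bit absolute relocation; RelaEntry and applyRela64 are hypothetical names, not the driver's API.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified RELA entry: unlike REL, it stores an explicit addend.
struct RelaEntry {
    uint64_t offset;      // where to patch, relative to the start of the section
    uint64_t symbolValue; // resolved address of the referenced symbol
    int64_t addend;       // constant stored in the RELA entry itself
};

// Patches a 64-bit value at entry.offset with symbolValue + addend.
// Returns false when the patch would fall outside of the section.
inline bool applyRela64(std::vector<uint8_t> &section, const RelaEntry &entry) {
    if (entry.offset > section.size() || section.size() - entry.offset < sizeof(uint64_t)) {
        return false;
    }
    const uint64_t value = entry.symbolValue + static_cast<uint64_t>(entry.addend);
    std::memcpy(section.data() + entry.offset, &value, sizeof(value));
    return true;
}
```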
Stack vector will not cause dynamic allocations in most circumstances,
i.e. when the number of root device indices is not more than 16
Related-To: NEO-6837
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
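The idea behind a stack vector, as used in the commit above, can be sketched as a container with inline storage that falls back to the heap only once its inline capacity is exceeded. SmallVector below is a simplified, hypothetical stand-in and is not the driver's own StackVec implementation; it assumes a default-constructible, copyable element type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

template <typename T, std::size_t InlineCapacity>
class SmallVector {
  public:
    void push_back(const T &value) {
        if (!spilled && count < InlineCapacity) {
            inlineStorage[count++] = value; // no dynamic allocation on this path
            return;
        }
        if (!spilled) {
            // First element beyond the inline capacity: move everything to the heap.
            heapStorage.assign(inlineStorage.begin(), inlineStorage.begin() + count);
            spilled = true;
        }
        heapStorage.push_back(value);
        ++count;
    }
    const T &operator[](std::size_t i) const { return spilled ? heapStorage[i] : inlineStorage[i]; }
    std::size_t size() const { return count; }
    bool usesDynamicMemory() const { return spilled; }

  private:
    std::array<T, InlineCapacity> inlineStorage{};
    std::vector<T> heapStorage;
    std::size_t count = 0;
    bool spilled = false;
};
```

With an inline capacity of 16, a list of up to 16 root device indices stays entirely in the object itself; only the 17th element triggers a heap allocation.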
This change introduces checking of the return value
of wait function in case of blocking version of
evictUnusedAllocations(). Furthermore, it propagates
the error to the callers. It also contains ULTs.
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
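A rough sketch of the error propagation described in the commit above. The WaitStatus enum here is a stand-in with assumed values, and the free function with its two callbacks is hypothetical; the real evictUnusedAllocations() has a different signature.

```cpp
#include <cassert>
#include <functional>

// Stand-in status type with assumed values.
enum class WaitStatus { NotReady, Ready, GpuHang };

// Hypothetical eviction routine: waitForCompletion models the blocking wait,
// evict models releasing the unused allocations.
inline WaitStatus evictUnusedAllocationsSketch(bool blocking,
                                               const std::function<WaitStatus()> &waitForCompletion,
                                               const std::function<void()> &evict) {
    if (blocking) {
        const WaitStatus status = waitForCompletion();
        if (status == WaitStatus::GpuHang) {
            return status; // propagate the failure to the caller instead of evicting
        }
    }
    evict();
    return WaitStatus::Ready;
}
```

Callers can then check the returned status and translate it into an API-level error rather than assuming the eviction always succeeded.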
feat: Set text sections' addresses with valid GPU VA. Offset debug symbols
with text segment names by corresponding segment's GPU VA.
Resolves: NEO-6873
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
Previous change regarding NEO-6785 added encoding of number of barriers
to specific value representation depending on hardware that we program for.
In patch token format encoding of number of barriers is sent via
hasBarriers field in a token.
In zebin true number of barriers is sent via barrier_count field in
zeInfo.
To remove this discrepancy, translate encoded number of barriers into
true number of barriers in legacy format.
Resolves: NEO-6785
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
This commit adds rebuilding of zebin binaries.
If zebin is built for a different device and has SPIRV, then a new ze binary
will be built using SPIRV.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
Change modifies the encoding entry in fatbinary for platforms.
If numbering in -device is used, the value PRODUCT_CONFIG will be encoded.
The functionality that returns the correct product config values has
also been added.
Related-To: NEO-6744
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
This change introduces checking of waits status in
CommandQueue and CommandList classes.
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
When programming number of barriers use BARRIER_SIZE enumeration.
Resolves: NEO-6785
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
There is only one implementation of said class and we don't
even adhere to the interface it provides.
Signed-off-by: Daniel Chabrowski <daniel.chabrowski@intel.com>
This change introduces unit tests related to
member functions of OfflineCompiler class.
OfflineCompiler::initialize() is not covered
and it will be added in a separate commit.
Related-To: NEO-6834
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
This commit removes the early fail in linking with zebin and external
functions, which happens when there is a relocation to the external
functions section but it does not modify any external function. It also
treats only GLOBAL FUNC symbols pointing to the external functions section
as external function symbols.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
Do not allocate heap if command list is copy only.
Related-To: NEO-6821
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Co-authored-by: Michal Mrozek <michal.mrozek@intel.com>
This change introduces checking of values returned
by blocking calls used in cmdlist_hw_immediate.inl.
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
Related-To: NEO-6681
require 48bit resource for ring/semaphore buffer
for multi tile allocations select first tile
for single tile allocation select preferred tile
Related-To: NEO-6698
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
For different platforms based on number of available threads
and debug surface layout, calculate max debug surface size.
Related-To: NEO-6676
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
This commit removes custom definition of
_PATCH_TOKEN_GLOBAL_HOST_ACCESS_TABLE and
instead uses one provided by IGC.
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
In direct submission scenario command/ring/semaphore buffer allocations
are placed in the same memory bank to ensure that their memory is updated in
correct order
Related-To: NEO-6698
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
This patch adds support for reading PCI bandwidth, generation
and linkwidth information from sysfs nodes for the linux
platform.
Related-To: LOCI-2969
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
Remove parameter requiredThreadArbitrationPolicy
from PreambleHelper::programPreamble function.
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
This commit moves patch token with global host access table to the
kernel scope from the program scope.
Related-To: NEO-6734
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Add a per-instance SVMAllocsManager::nonGpuDomainAllocs container for
all allocations to be removed in
moveAllocationsWithinUMAllocsManagerToGpuDomain. This approach replaces
the current iterative search and performs the task faster.
Add 7 new unit-tests to verify the functionality related to
nonGpuDomainAllocs container, both in expected and unexpected/synthetic
scenarios.
For UTs, replace a dummy unifiedMemoryManager pointer with a pointer to
an instance of SVMAllocsManager; otherwise a SegFault error is thrown at
the end of tests.
Perform overall cleanup in related tests implementation, including but
not limited to removal of:
- givenInitialPlacementGpu\
WhenMovingToGpuDomainThenFirstAccessDoesNotInvokeTransfer
As it is fully covered by:
givenAllocationMovedToGpuDomain\
WhenVerifyingPagefaultThenAllocationIsMovedToCpuDomain
- givenInitialPlacementGpu\
WhenVerifyingPagefaultThenFirstAccessDoesNotInvokeTransfer
As it is fully covered by:
givenTbxAndnitialPlacementGpu\
WhenVerifyingPagefaultThenMemoryIsUnprotectedOnly
Finally, reduce code duplication where possible.
Related-To: NEO-6658
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
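The bookkeeping idea from the commit above (tracking non-GPU-domain allocations in a dedicated container instead of searching all allocations) can be sketched as follows. AllocsManagerSketch and its methods are hypothetical; only the container name nonGpuDomainAllocs comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <unordered_set>

enum class Domain { Cpu, Gpu };

// Hypothetical manager; only the bookkeeping idea is shown.
class AllocsManagerSketch {
  public:
    void insert(const void *ptr, Domain domain) {
        domains[ptr] = domain;
        if (domain == Domain::Gpu) {
            nonGpuDomainAllocs.erase(ptr);
        } else {
            nonGpuDomainAllocs.insert(ptr);
        }
    }
    void moveToCpuDomain(const void *ptr) {
        auto it = domains.find(ptr);
        if (it == domains.end()) {
            return; // unknown pointer, nothing to track
        }
        it->second = Domain::Cpu;
        nonGpuDomainAllocs.insert(ptr);
    }
    // Visits only the tracked allocations instead of scanning every allocation.
    // Returns the number of allocations that were moved.
    std::size_t moveAllToGpuDomain() {
        const std::size_t moved = nonGpuDomainAllocs.size();
        for (const void *ptr : nonGpuDomainAllocs) {
            domains[ptr] = Domain::Gpu;
        }
        nonGpuDomainAllocs.clear();
        return moved;
    }
    Domain domainOf(const void *ptr) const { return domains.at(ptr); }

  private:
    std::unordered_map<const void *, Domain> domains;
    std::unordered_set<const void *> nonGpuDomainAllocs;
};
```

The cost of the move now scales with the number of allocations outside the GPU domain rather than with the total number of allocations.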
Logic related to programming non coherent and thread arbitration policy for
gens 9 and 11 has been moved to EncodeComputeMode object, where similar
logic for gen12lp and newer is located.
Functions PreambleHelper::programThreadArbitration and
PreambleHelper::getThreadArbitrationCommandsSize have been removed.
Redundant setForceNonCoherent call has been removed from XE HPG
Related-To: NEO-6728
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
This patch adds support for collecting IP Metrics using
StreamerOpen, StreamerClose and StreamerReadData.
Related-To: LOCI-2755
Related-To: LOCI-2756
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
This commit removes ZebinTargetMetadata struct, and uses
ZebinTargetFlags for both target validations: via machine type, and
via intel gt notes.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
This change introduces a new flag called DisableGpuHangDetection.
By default the flag is disabled, so hang detection stays active.
To disable hang checking, set this flag to true.
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
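How a flag such as DisableGpuHangDetection could gate the hang check is sketched below. DebugFlagsSketch, pollOnce and the WaitStatus values are hypothetical stand-ins; the driver reads its debug flags through its own debug settings mechanism.

```cpp
#include <cassert>

// Stand-in status type with assumed values.
enum class WaitStatus { NotReady, Ready, GpuHang };

// Hypothetical stand-in for the debug flag; false by default, so detection is active.
struct DebugFlagsSketch {
    bool disableGpuHangDetection = false;
};

// One polling step of a wait loop: completion wins, then the optional hang check.
inline WaitStatus pollOnce(const DebugFlagsSketch &flags, bool taskCompleted, bool hangReported) {
    if (taskCompleted) {
        return WaitStatus::Ready;
    }
    if (!flags.disableGpuHangDetection && hangReported) {
        return WaitStatus::GpuHang;
    }
    return WaitStatus::NotReady;
}
```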
This feature is disabled by default, controlled with the knob
AppendMemoryPrefetchForKmdMigratedSharedAllocations
Related-To: NEO-6740
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
Change ThreadArbitrationPolicy::NotPresent value to -1
Update initial values to ThreadArbitrationPolicy::NotPresent
Related-To: NEO-6728
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
This commit adds a check in Linker::resolveExternalFunctions that verifies
whether external functions are present before trying to resolve dependencies,
and adds default values for ExternalFunctionInfo.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
Remove function clearComputeModePropertiesIfNeeded.
If a field has to be programmed unconditionally, ignore isDirty flag.
Related-To: NEO-6728
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
This commit enables parsing symbol infos
passed in the PATCH_TOKEN_PROGRAM_SYMBOL_TABLE patchtoken.
Related-To: NEO-6734
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
This change allows for modifying kernel's barrier count
based on called external functions metadata passed
via zeInfo section in zebin.
Added parsing external functions metadata.
Added resolving external functions call graph.
Added updating kernel barriers based on called external functions.
Added support for L0 dynamic link.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
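One way to picture the call graph resolution from the commit above: barrier counts are propagated from callees to callers until nothing changes. This sketch assumes that a caller needs at least as many barriers as any function it calls, directly or transitively; the function name and the index-based representation are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// barrierCounts[i] is the barrier count of function or kernel i.
// Each call edge is (caller index, callee index); indices are assumed to be valid.
inline void propagateBarrierCounts(std::vector<uint8_t> &barrierCounts,
                                   const std::vector<std::pair<std::size_t, std::size_t>> &calls) {
    bool changed = true;
    while (changed) { // terminates: counts only grow and are bounded by the maximum
        changed = false;
        for (const auto &call : calls) {
            const std::size_t caller = call.first;
            const std::size_t callee = call.second;
            if (barrierCounts[callee] > barrierCounts[caller]) {
                barrierCounts[caller] = barrierCounts[callee];
                changed = true;
            }
        }
    }
}
```

The fixed-point loop also handles edges listed in arbitrary order and cyclic call graphs, at the cost of repeated passes over the edge list.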
This commit adds support for querying global pointers via decorated
names passed in zeInfo.
Related-To: NEO-6734
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
This change introduces detection of GPU hangs in
clFinish function as well as unit tests to cover
the new code.
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
On pre-XeHp platforms implicit args aren't at the beginning of indirect data,
GPU address of implicit args buffer is programmed within cross thread data
Related-To: NEO-5081, IGC-4710
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
This commit fixes generating localIDs in zebin.
With this fix, Emit Local in compute walker will be set according to
the size of local_id argtype (currently, Emit Local is set to Emit None,
which prevents generating local IDs).
Related-To: NEO-6089
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
This change:
- moves NEO::WaitStatus to a separate file
- enables detection of GPU hang in clWaitForEvents
- adjusts most of blocking calls in CommandStreamReceiver to return WaitStatus
- adds ULTs to cover the new code
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
This patch adds OS specific implementation for IP Sampling.
Implementation for linux is provided as part of this patch.
Related-To: LOCI-2787
--- master-files
level_zero/tools/source/metrics/linux/os_metric_ip_sampling_imp_linux.cpp
level_zero/tools/source/metrics/os_metric_ip_sampling.h
level_zero/tools/source/metrics/windows/os_metric_ip_sampling_imp_windows.cpp
level_zero/tools/test/unit_tests/sources/metrics/linux/test_metric_ip_sampling_linux_prelim.cpp
level_zero/tools/test/unit_tests/sources/metrics/linux/test_metric_ip_sampling_linux_upstream.cpp
level_zero/tools/test/unit_tests/sources/metrics/windows/test_metric_ip_sampling_windows.cpp
--- master-files
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
according to implicit args design for SIMD-1 local ids are one-by-one
Resolves: NEO-6692
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
In order to setup ioctl helper we need to call ioctl to get hw info
Related-To: NEO-6591
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
When running multiple threads, one thread could clear
allocationsForDownload while another was iterating over it.
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
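The race described in the commit above (one thread clearing allocationsForDownload while another iterates over it) is commonly avoided by swapping the container out under a lock and iterating over the private copy. The commit message does not say which fix was chosen, so the class below is only one possible approach with hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

class DownloadListSketch {
  public:
    void add(int allocationId) {
        std::lock_guard<std::mutex> lock(mtx);
        allocationsForDownload.push_back(allocationId);
    }
    // Takes ownership of the pending entries under the lock, then iterates
    // without holding it, so concurrent add() calls cannot invalidate the loop.
    template <typename Fn>
    void downloadAll(Fn &&download) {
        std::vector<int> pending;
        {
            std::lock_guard<std::mutex> lock(mtx);
            pending.swap(allocationsForDownload);
        }
        for (int id : pending) {
            download(id);
        }
    }
    std::size_t pendingCount() {
        std::lock_guard<std::mutex> lock(mtx);
        return allocationsForDownload.size();
    }

  private:
    std::mutex mtx;
    std::vector<int> allocationsForDownload;
};
```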
- PROCESS_ENTRY - triggered by first zeCommandQueueCreate()
- PROCESS_EXIT - triggered by last zeCommandQueueDestroy()
Resolves: NEO-6503
Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
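The two triggers listed in the commit above can be modeled with a counter of live command queues. This sketch interprets "first" and "last" as the transitions from zero to one and from one to zero live queues, which is an assumption; the tracker class and its callback are hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <string>
#include <utility>

class ProcessEventTrackerSketch {
  public:
    explicit ProcessEventTrackerSketch(std::function<void(const std::string &)> notifyFn)
        : notify(std::move(notifyFn)) {}

    // Would be called from the command queue creation path.
    void onCommandQueueCreate() {
        if (liveQueues.fetch_add(1) == 0) {
            notify("PROCESS_ENTRY");
        }
    }
    // Would be called from the command queue destruction path.
    void onCommandQueueDestroy() {
        if (liveQueues.fetch_sub(1) == 1) {
            notify("PROCESS_EXIT");
        }
    }

  private:
    std::function<void(const std::string &)> notify;
    std::atomic<unsigned> liveQueues{0};
};
```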
Following API calls are being tested:
- enqueueWriteImage
- enqueueReadImage
- enqueueCopyImage
Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
Related-To: NEO-6135
This change introduces detection of GPU hangs in
zeEventHostSynchronize and zeFenceHostSynchronize.
Furthermore, if CommandQueueHw::executeCommandLists
uses ZE_COMMAND_QUEUE_MODE_SYNCHRONOUS and hang occurs,
the information about it is propagated to the caller.
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
This commit fixes setting usesStringMap flag for printf, taking into
account using indirect functions in legacy (non-zebinary) path. It also
adds new field to kernelDescriptor, specifying the binary type
(legacy/zebin).
Related-To: NEO-6604
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
This change prevents embedding identical SPIR-V section for each
target requested in fatbinary build. Instead of duplicating SPIR-V,
a new file called 'generic_ir' is added to AR archive. It contains
SPIR-V, which was used to build fatbinary. Build fallback in runtime
has also been adjusted - if 'generic_ir' file is defined in fatbinary
and there is no matching binary, then this generic SPIR-V is used to
rebuild for the requested target.
Additionally, MockOclocArgumentHelper::loadDataFromFile() was adjusted
to ensure null-termination of returned strings.
This change also removes possible undefined behavior, which was
related to reading names of files from AR archive. Previously,
if filename was shorter than requested target name, we tried to
read more memory than allowed.
Related-To: NEO-6490
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
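The fallback and the bounded name comparison from the commit above can be sketched together. The prefix-based matching rule, the ArFile struct and both function names are assumptions for illustration; only the 'generic_ir' file name comes from the commit message.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical view of one file stored in the AR archive.
struct ArFile {
    std::string name;
    std::string data;
};

// Bounded prefix check: never reads past the end of a name shorter than the prefix.
inline bool startsWith(std::string_view name, std::string_view prefix) {
    return name.size() >= prefix.size() && name.compare(0, prefix.size(), prefix) == 0;
}

// Returns the binary matching the target, else the generic SPIR-V file, else nullptr.
inline const ArFile *selectBinary(const std::vector<ArFile> &files, std::string_view target) {
    const ArFile *genericIr = nullptr;
    for (const auto &file : files) {
        if (startsWith(file.name, target)) {
            return &file;
        }
        if (file.name == "generic_ir") {
            genericIr = &file;
        }
    }
    return genericIr;
}
```

When the generic file is returned, the caller would still have to rebuild for the requested target from the contained SPIR-V.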
Rename:
- debug flag ProgramPipeControlPriorToNonPipelinedStateCommand
to ProgramExtendedPipeControlPriorToNonPipelinedStateCommand
- local variables
Related-To: NEO-6615
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
use waitUserFence to wait for fence value
move some tests to shared
- ioctl helper tests
- drm memory info tests
- drm cache info
Related-To: NEO-6591
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Add pipe control before state base address, state compute
mode and state sip commands.
Related-To: NEO-6615
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
Add pipe control before state base address, state compute
mode and state sip commands on DG2 and PVC when CCS flow is used.
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
This change uses value of cpuAddress from monitored fence
to detect GPU hang.
Related-To: NEO-5313
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
when local ids are generated by HW, use same dim order for runtime generation
move common logic to a separate file
Related-To: NEO-5081
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>