This change replaces mechanism of patching global constants and
variables in kernel per relocation to patching them only once. This
would improve linking time performance for kernels with multiple global
symbols.
Signed-off-by: Luzynski, Sebastian Jozef <sebastian.jozef.luzynski@intel.com>
This commit adds support for retrieving extended args metadata passed in
.kernel_misc_info zeInfo's section on clGetKernelArgInfo call.
Related-To: NEO-7372
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Removes usage of precompiled binaries in debug program tests.
Related-To: NEO-7383
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
This change makes that drm file is opened in nonblocking mode for prelim
kernels. In such case when calling exec buffer ioctl and get
EAGAIN (aka EWOULDBLOCK) we may return error to API level
Related-To: NEO-7144
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
- Only initialize vertexes when queried.
- Return also bandwidth when calling PRELIM_DRM_I915_QUERY_FABRIC_INFO.
Related-To: LOCI-3464
Signed-off-by: Jaime A Arteaga Molina <jaime.a.arteaga.molina@intel.com>
This commit fixes problem in zebin manipulator when dump was not
created.
* Explicitly create dump directory.
* Add slash to dump argument.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
Heuristic is enabled by default
to disable, set:
AdjustThreadGroupDispatchSize=0
Related-To: NEO-6989
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
This change replaces unneeded copying of std::vectors
with usage of const references. Furthermore, it adds
reserve() call before filling the container via push_back().
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
This commit adds support for parsing new .kernel_arg_metadata zeInfo's section,
which will be parsed only on demand (it won't get parsed on initial
zeInfo parsing).
Usage of populated structs will be added in the next commit.
Implemented section's parsing, decoding & populating corresponding fields in
kernelDescriptor.
Related-To: NEO-7372
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
This patch force KMD allocation path for USM host allocation
and also for host part of USM shared allocation
Related-To: NEO-6913
Signed-off-by: Kamil Diedrich <kamil.diedrich@intel.com>
Add support for newly added indirect statelss count check;
populate related field in kernelInfo.
- Move hasIndirectStatelessAccess check from KernelInfo to
KernelDescriptor.
Related-To: NEO-7428
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
This patch force KMD allocation path for USM host allocation
and also for host part of USM shared allocation
Related-To: NEO-6913
Signed-off-by: Kamil Diedrich <kamil.diedrich@intel.com>
Improves performance in workloads that create small opencl buffers.
To enable, set env var ExperimentalSmallBufferPoolAllocator=1
Known issues (will be addressed in further commits):
- cannot create subBuffer from such buffer
- pool buffer allocation should be reused
Related-To: NEO-7332
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
- Added Support for reading the Device LUID of the given device used in
Windows WDDM given EnableL0ReadLUIDExtension=1.
- Added inital support for passing back the NodeMask of 1.
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
On pre-Gen12 platform we use igfxdcd kernel module for debugging, which
does not support zebinary format.
- When platform is pre-Gen12 an and debugger is
attached, if binary format is zebin and it's not a builtin:
- If SPIR-V is available - force rebuild with zebin disabled
- Otherwise, return an error.
- Minor refactor: extend check for ir presence for each case of
rebuilt in OCL.
Related-To: NEO-7328
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Command buffers in CommandContainer are removed
through cmdBufferAllocations. This PR ensures
that allocations will be stored there if they
are currently used by given cmd container.
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
This change makes that drm file is opened in nonblocking mode for prelim
kernels. In such case when calling exec buffer ioctl and get
EAGAIN (aka EWOULDBLOCK) we may return error to API level
Related-To: NEO-7144
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Move Batching code under batching if to not call not required functions.
Update task level only if level is closed.
70ns gain.
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
This commit adds option to disassemble and assemble zebinary.
Disasm disassembles zebinary into sections. Text sections are
translated to assembly, relocations and symbols are
translated into human readable format.
Asm assembles zebinary from files generated by disasm.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
Add support for inline samplers in zebin.
Generate required SAMPLER_STATEs in DSH.
Resolves: NEO-7388
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
Current implementation of getIntelGTNotes function causes
segfault in case empty owner name string would be passed.
This commit removes potential out-of-bound array access.
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
- copy att bitmask to event in tests, not in mockWddm
- simplifies logic, gets rid of redundant struct, allows
setting different att bitmask for every event
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
This change is intended to be used in immediate command lists that are
using flush task functionality.
With this change all immediate command list using the same csr will consume
shared allocations for dsh and ssh heaps. This will decrease number of SBA
commands dispatched when multiple command lists coexists and dispatch kernels.
With this change new SBA command should be dispatched only when current heap
allocation is exhausted.
Functionality is currently disabled and available under debug key.
Functionality will be enabled by default for all immediate command lists
with flush task functionality enabled.
Related-To: NEO-7142
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
This patch makes iaf nl interfaces in sysman be usable
by other modules.
1.change Iaf_nl interfaces to remove sysman dependencies
2.move iaf specific interfaces(getPorts()) to iaf specific file
3.move iaf_nl files to sysman/linux
Related-To: LOCI-3357
Signed-off-by: Joshua Santosh Ranjan <joshua.santosh.ranjan@intel.com>
Improve performance by binding the command buffer together with other
allocations if VM_BIND feature is available. Remove the legacy
flag PassBoundBOToExec from DebugManager to simplify the logic.
Adapt unit tests and reuse handy macros to generate proxy mock-methods.
Related-To: NEO-7348
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
This extension allows pairing two buffer objects so they can be
exported using a single dma-buf handle. When imported, a single
buffer object is created with a total size of the two buffer
objects.
Related-To: LOCI-3355
Sync to
https://github.com/intel-gpu/drm-uapi-helper/releases/tag/v2.0-rc15
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Previously we used an array-of-pointers approach, but using an
array-of-structures is in some ways simpler.
We also split out the RTStack as a separate allocation.
Related-To: LOCI-2966
Signed-off-by: Jim Snow <jim.m.snow@intel.com>
use cpu copy with locked pointer if possible
because this is faster than copy on gpu
limit to buffers of size at most 64kb
Related-To: NEO-7332
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Related-To: NEO-7237
Enable copy on cpu by default.
This commit also changes barrierCounter to bool
barrierCalled
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
This commit adds support for 32 bit zebinary in NEO runtime and in
ocloc validate.
Resolves: NEO-7288
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
- Added Support for reading the Device LUID of the given device used in
Windows WDDM.
- Added inital support for passing back the NodeMask of 1.
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
- Added Support for reading the Device LUID of the given device used in
Windows WDDM.
- Added inital support for passing back the NodeMask of 1.
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
This change reflects exact nature of debug variable and what is code
actually doing
Related-To: NEO-7187
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- enable tile attach mode by default
- both root device and subdevice may be attached to
Related-To: NEO-7347
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Related-To: NEO-7237
If size is small enough, it is more efficient to
perform copy through locked ptr on CPU.
This change also introduces experimental flag to
enable this.
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
Parse and adjust ccs count on reset so that initial
environment is restored.
Related-To: LOCI-3435
Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
With this change, module's data sections will be allocated in USM device
pool instead of SVM or USM shared.
Signed-off-by: Luzynski, Sebastian Jozef <sebastian.jozef.luzynski@intel.com>
This optimization removes pipeline select from command list preamble
and presented to command queue for necessary state update.
Code is disabled by default and available under debug key.
Related-To: NEO-5019
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Move env variable check to setAlarm function. This will help in
disabling test alarm across all test binaries.
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
We want to return error code to the application instead of aborting when
we are not able to make more memory resident.
Related-To: NEO-7289
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
- Enabled default setting of Program & Global Symbols to be generated by
IGC when building L0 Modules with the ability to fallback to previous
behavior thru build failure checks.
- Enabled selective disable of default program or global symbol
generation thru debug variables.
Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
In case of debuggable context device should be additionally
initialized by early empty submission issue.
Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
Another step towards cleaner callers of
StateBaseAddressHelper<>::programStateBaseAddress.
Export programming state base address into a separate function to
improve code reuse and reduce copy-pasted fragments, which make code
modifications or maintenance more and more difficult over time. Use
specialization for gen-specific variations.
Related-To: NEO-6774
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
This change prepares infrastructure for pipeline select handling in
command lists and queues by optimization of number of commands dispatched.
State is synchronized between flush-task immediate and regular command lists.
Next step is to add optimization itself which disables legacy hw command
dispatch algorithm.
This change corrects ADL-P support for systolic mode changes.
Related-To: NEO-5019
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
With this change, module's data sections will be allocated in USM device pool
instead of SVM or USM shared.
Signed-off-by: Sebastian Luzynski <sebastian.jozef.luzynski@intel.com>
Adding new interface to cooperate with hw context state
Simplify programming removing unnecessary functions
Code optimization that stop using expensive call and instead
stores configuration parameter
Related-To: NEO-5019
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
So far captureStateBaseAddress() was a wrapper around
programSbaTrackingCommands(), doing an additional checking before
calling the latter. The checking is apparently no longer relevant, so
unify the distinction and remove part of the code which is no longer
needed.
In practice, keep the captureStateBaseAddress() while moving the body of
programSbaTrackingCommands() into it. This imposes lower diff-impact
onto the class hierarchy. Remove the second function. Simplify the
caller which had to distinct these two functions previously.
Related-To: NEO-6774
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
Fixes found out while working on the StateBaseAddress adaptation to
StreamProperties. Removing unused parameters, improving code reuse
(further improvements come with following commits).
Related-To: NEO-6774
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
- Enhance zebinary module mock for L0 with kernel attribute,
so at this point L0 ULTs are completely free from IGC dependencies.
Related-To: NEO-7281
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
Related-To: NEO-7156
Additional pipeline select is required only on specific
gen12lp products. That check was missing only in cmdstream size
estimation path, hence we had an overestimation.
This commits add support for relocating
symbols with local binding and of functional type
(STB_LOCAL, STT_FUNC).
Related-To: NEO-7299
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
It was only needed for LOAD_BALANCED scenarios, so with recent disabling
of this feature in KMD, it is no longer required.
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
With compiler LSC WAs this gives better performance.
If debugger is active, policy will not be changed ie.
will be WBP.
Related-To: NEO-7003
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
This commit adds parsing of "user_attributes" section of zeInfo
containing kernel's language attributes.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
This change gives fine grain control over front end configuration for each
kernel.
As it gives possible to inject FE command in command queue and return to exact
place in command list.
Programming commands in queue makes patching commands in command lists
not needed as that operation is costly.
And it allows to program context information for each command list too.
Related-To: NEO-5019
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- For test kernels compiled with options passed, change their naming to
following convention:
{basename}_{options_passed}_{suffix}.
- Correct CMake variables naming.
- Refactor logic of retrieving test kernels' data (also in compilers
mock)
- In relation to previous changes: do not generate unnecessary
.gen binary for L0 test kernel
Related-To: NEO-7285
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
This change removes the following UBs:
- calling std::vector::insert() with source range, which comes
from the destination vector; when vector needs to grow then
the source iterators are invalid
- comparison of singular (default-constructed) iterators is
disallowed
Signed-off-by: Wrobel, Patryk <patryk.wrobel@intel.com>
- Enabled default setting of Program & Global Symbols to be generated by
IGC when building L0 Modules with the ability to fallback to previous
behavior thru build failure checks.
- Enabled selective disable of default program or global symbol
generation thru debug variables.
Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
This commit adds handling for "invalid_kernel" kernel's attribute.
This attribute is present when kernel is invalid e.g. could not be
correctly compiled due to missing feature.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
Discarding RAII lock returned from function almost always
is a bug. This change introduces usage of [[no_discard]]
attribute from C++17 to prevent such misues.
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
With compiler LSC WAs this gives better performance.
If debugger is active, policy will not be changed ie.
will be WBP.
Related-To: NEO-7003
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>