Expands support for deprecated acronyms to fatbinary. Previously,
these were allowed only in single-target builds.
Related-To: NEO-10190
Signed-off-by: Chodor, Jaroslaw <jaroslaw.chodor@intel.com>
Increase pool size to 2MB and threshold to 1MB.
Add a limit to the number of pools, set to 2.
Related-To: NEO-9690
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Fix querying allocation info from pooled svm ptr.
Handle requested allocation alignment.
Refactor sorted vector usage.
Do not associate device with host pool allocation.
Related-To: NEO-9700
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Add EnableDeviceUsmAllocationPool and EnableHostUsmAllocationPool flags for
device and host allocations, respectively.
Pool size will be set to flag value * MB.
Allocation size threshold to be pooled is 1MB.
Pools are created per context.
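A minimal sketch of the sizing and threshold rules described above; the helper names (poolSizeFromFlag, isPoolingCandidate) are illustrative, not NEO's implementation:

```cpp
#include <cstddef>

// Hedged sketch of the pooling policy described above; names and the exact
// threshold handling are illustrative, not NEO's implementation.
constexpr size_t MB = 1024u * 1024u;

size_t poolSizeFromFlag(int flagValue) {
    return static_cast<size_t>(flagValue) * MB; // pool size = flag value * MB
}

bool isPoolingCandidate(size_t allocationSize) {
    return allocationSize <= 1 * MB; // only allocations up to 1MB are served from the pool
}
```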
Related-To: NEO-9700
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Move code out to a base class. This will allow using the sorted vector
class with value types other than SvmData.
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Increase pool allocator threshold to 1MB
Remove stack allocations based on threshold in tests.
Related-To: NEO-9690
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
* Add new Software Tag `ArbitraryString`
* Add minimal interface for injecting software tags
Related-To: NEO-5550
Signed-off-by: Marek Kozlowski <marek.kozlowski@intel.com>
When a class defines a copy/move constructor, the corresponding assignment
operator(s) should be defined or deleted.
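A minimal illustration of the rule with a hypothetical Buffer class (not taken from the codebase):

```cpp
// Illustrative example of the rule: a class that defines a copy or move
// constructor also explicitly defines or deletes the corresponding
// assignment operator, so the implicitly generated one is never relied upon.
class Buffer {
  public:
    Buffer() = default;
    Buffer(const Buffer &other) = default;            // copy constructor defined
    Buffer &operator=(const Buffer &) = delete;       // copy assignment explicitly deleted
    Buffer(Buffer &&other) noexcept = default;        // move constructor defined
    Buffer &operator=(Buffer &&) noexcept = default;  // move assignment explicitly defined
};
```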
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Instances returned by `getAllocationsVector()` in some cases cannot be
freed (in the `malloc/new` sense) until the `drain()` function invokes
`allocInUse()` on them. Plus, the `chunksToFree` container operates on
pairs `{offset, size}`, not pointers, so such pair cannot be used to
release allocations either.
Provide an optional callback, which can be implemented by the custom
pool derived from `AbstractBuffersPool`. This callback can be used, for
example, to perform actual release of an allocation related to the
currently processed chunk.
Additionally, provide the `drain()` and `tryFreeFromPoolBuffer()`
functions with pool-independent versions and keep the previous versions
as defaults (for allocators with a single pool). The new versions allow
reusing the code for cases when the allocator has multiple pools.
In both cases, there was no such need so far, but it arose when working
on `IsaBuffersAllocator`. The latter is coming with future commits, but
the shared code modifications are extracted as an independent step.
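A hedged sketch of the callback idea only; the types and names below are illustrative and differ from the real `AbstractBuffersPool` interface:

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hedged sketch: drain() walks the chunks whose users have finished and, for
// each one, gives the derived pool a chance to release the related allocation
// via an optional callback, since an {offset, size} pair alone cannot be used
// to free anything.
struct BuffersPoolSketch {
    using Chunk = std::pair<size_t, size_t>; // {offset, size}

    std::vector<Chunk> chunksToFree;
    std::function<void(const Chunk &)> onChunkFreeCallback; // optional, set by a derived pool

    void drain() {
        for (const auto &chunk : chunksToFree) {
            // ... return the chunk to the underlying chunk allocator ...
            if (onChunkFreeCallback) {
                onChunkFreeCallback(chunk); // e.g. free the allocation tied to this chunk
            }
        }
        chunksToFree.clear();
    }
};
```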
Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
Currently the whole code resides within the opencl/ tree, but the
mechanism is meant to be reused in L0 for kernel-ISA allocations
optimization (further work).
This commit is a preparation step, which extracts the generic mechanism
and moves the extracted part under the shared/ tree.
Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
Remove the method for removing duplicates from StackVec, as the method
implicitly sorted the vector.
Related-To: GSD-4692
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
- Initialization of FileLogger always removed the log file; this change
removes the old file only when logging is enabled in the current run
Resolves: NEO-7199
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
When using ReturnSubDevicesAsApiDevices=1 to have
sub-devices exposed as root devices, the driver should read the values
passed in the mask as those corresponding to the physical
sub-devices; a sketch of this mapping follows the examples below.
For instance, in a dual-card system with multi-tile devices, we would have:
card 0, tile 0
card 0, tile 1
card 1, tile 0
card 1, tile 1
With:
ReturnSubDevicesAsApiDevices=0
ZE_AFFINITY_MASK=0,1
Then all tiles in card 0 and card 1 need to be exposed.
With:
ReturnSubDevicesAsApiDevices=1
ZE_AFFINITY_MASK=0,3
Then card 0 tile 0, and card 1 tile 1 need to be exposed.
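A hedged sketch of the flat-index interpretation; the helper below is illustrative and not part of the driver:

```cpp
#include <cstdint>
#include <utility>

// Hedged sketch: with ReturnSubDevicesAsApiDevices=1 each mask entry is a
// flat physical sub-device index, mapped to {card, tile} as shown here.
std::pair<uint32_t, uint32_t> maskIndexToCardAndTile(uint32_t maskIndex, uint32_t tilesPerCard) {
    return {maskIndex / tilesPerCard, maskIndex % tilesPerCard}; // {card, tile}
}
// With tilesPerCard = 2: mask index 0 -> card 0 tile 0, mask index 3 -> card 1 tile 1
```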
Related-To: NEO-7137
Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Related-To: LOCI-3871
- Enabled allocation of a specified base address in the targeted heap.
- Enabled virtual memory reservations to grow by allocating at the start
of the heap instead of at the end of the heap.
Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
This patch forces the KMD allocation path for USM shared allocations.
Additionally, we force a 64KB page from lock, which is
required to properly program the GPU VA.
Related-To: NEO-6913
Signed-off-by: Kamil Diedrich <kamil.diedrich@intel.com>
Fix the following scenarios in YAML parsing (a sketch of the trimming rule
follows the list):
- Correctly read a string containing multiple words separated by
whitespace (space/tab) when retrieving a token value
- Remove any unnecessary whitespace from the end of a string when
retrieving a token value
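A hedged sketch of the trailing-whitespace rule; the helper name is illustrative and not the parser's actual code:

```cpp
#include <string_view>

// Hedged sketch: whitespace between the words of a multi-word value is
// preserved, while any trailing spaces/tabs are dropped from the token value.
std::string_view trimTrailingWhitespace(std::string_view value) {
    const auto lastNonWhitespace = value.find_last_not_of(" \t");
    if (lastNonWhitespace == std::string_view::npos) {
        return {};
    }
    return value.substr(0, lastNonWhitespace + 1);
}
// Example: "multi word value \t" -> "multi word value"
```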
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Limit the number of times compare_exchange_weak is called,
to avoid issues with contention when multiple CPU cores request
the same address.
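A hedged sketch of the bounded-retry idea; the function name and retry limit below are illustrative:

```cpp
#include <atomic>
#include <cstdint>

// Hedged sketch: instead of spinning on compare_exchange_weak indefinitely,
// give up after a few attempts and let the caller take a slower fallback
// path, which reduces cache-line contention between CPU cores.
bool tryIncrement(std::atomic<uint64_t> &counter, int maxAttempts = 4) {
    uint64_t expected = counter.load(std::memory_order_relaxed);
    for (int attempt = 0; attempt < maxAttempts; ++attempt) {
        if (counter.compare_exchange_weak(expected, expected + 1,
                                          std::memory_order_acq_rel,
                                          std::memory_order_relaxed)) {
            return true; // update succeeded
        }
        // on failure, expected was refreshed with the current value; retry
    }
    return false; // caller falls back to a slower path
}
```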
Related-To: NEO-7030
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
This commit fixes a problem in the zebin manipulator where the dump was not
created.
* Explicitly create dump directory.
* Add slash to dump argument.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
This commit adds an option to disassemble and assemble a zebinary.
Disasm disassembles a zebinary into sections. Text sections are
translated to assembly; relocations and symbols are
translated into a human-readable format.
Asm assembles a zebinary from the files generated by disasm.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
Historically, FileLogger was intended to work with scalar types.
Therefore, its member functions utilized parameter packs, which
copied the arguments. However, some time ago std::string arguments
started to be passed to these functions. The recursion was performing
multiple copies of the same std::string, which could cause
unneeded memory allocations.
This change (a sketch of the idea follows the list):
- replaces copying with const references
- applies std::move() where possible
- replaces std::unique_lock with std::lock_guard
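A hedged sketch of the idea; the class below is illustrative and not FileLogger's actual interface:

```cpp
#include <mutex>
#include <sstream>
#include <string>

// Hedged sketch: forwarding the arguments as const references avoids copying
// std::string values at each recursion level, and std::lock_guard replaces
// std::unique_lock when the lock is never released early.
class SimpleLogger {
  public:
    template <typename... Args>
    void log(const Args &...args) {
        std::ostringstream line;
        (line << ... << args); // fold expression, no argument copies
        std::lock_guard<std::mutex> guard(mutex);
        buffer += line.str();
        buffer += '\n';
    }

  private:
    std::mutex mutex;
    std::string buffer;
};
```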
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
This commit simplifies parsing of enums in zebin decoder and removes
unnecessary tests.
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
In addition to supporting the official -device acronyms
(e.g. xe-hpg), support for shorter and deprecated acronyms
has also been added.
Examples of supported variants:
- xehpg
- xe_hpg
- xe_hpg_core
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
Related-To: NEO-6910
Use enum class for MemoryPool in GraphicsAllocation.
This change will ensure that GA is constructed in the proper way
(see the sketch after the list):
- Rename the namespace for the isSystemMemoryPool method
- Add a getMemoryPoolString method for logging the actual pool which is in use
- Remove an incorrect pattern in the GraphicsAllocation constructor
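A minimal sketch of the scoped-enum idea; the names and enumerators below are illustrative, not the actual NEO definitions:

```cpp
#include <cstdint>

// Hedged sketch: a scoped enum cannot be constructed from a bare integer, so
// every allocation has to state its memory pool explicitly and type-checked.
enum class MemoryPool : uint32_t {
    MemoryNull,
    System4KBPages,
    System64KBPages,
    LocalMemory,
};

struct AllocationSketch {
    explicit AllocationSketch(MemoryPool pool) : pool(pool) {}
    MemoryPool pool;
};

// AllocationSketch wrong(0);                        // no longer compiles
// AllocationSketch ok(MemoryPool::System4KBPages);  // explicit and type-checked
```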
Related-To: NEO-6523
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
Define a single .clang-tidy configuration with all used checks and use
NOLINT to selectively silence the tool. That way cleanup should be easier.
third_party/ has its own configuration that disables clang-tidy for this
folder.
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
A stack vector will not cause dynamic allocations in most circumstances,
i.e., when the number of root device indices is not more than 16.
Related-To: NEO-6837
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
When programming the number of barriers, use the BARRIER_SIZE enumeration.
Resolves: NEO-6785
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
This change introduces detection of GPU hangs
in CommandMapUnmap::submit() as well as in Event::submitCommand().
ULTs have been added to cover the new code.
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
When the linear stream created for a command container does not have enough
space for a command and BB_END, it will program BB_END and allocate a new
command buffer allocation. In this case, the pointer returned from getSpace
will point to storage from the new command buffer allocation.
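A hedged sketch of the described behavior; the types below are illustrative and not the real CommandContainer/LinearStream interface:

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hedged sketch: when the current buffer cannot hold the command plus a
// terminating BB_END, program BB_END, switch to a freshly allocated buffer
// and return space from that new buffer.
constexpr size_t bbEndSize = 8;

struct StreamSketch {
    explicit StreamSketch(size_t size) : storage(size) {}
    size_t available() const { return storage.size() - used; }
    void *getSpace(size_t size) { void *p = storage.data() + used; used += size; return p; }
    std::vector<uint8_t> storage;
    size_t used = 0;
};

struct ContainerSketch {
    std::vector<std::unique_ptr<StreamSketch>> buffers;
    StreamSketch *current = nullptr;

    StreamSketch *allocateNewBuffer(size_t size) {
        buffers.push_back(std::make_unique<StreamSketch>(size));
        return buffers.back().get();
    }

    void *getSpaceForCommand(size_t commandSize, size_t bufferSize) {
        if (current == nullptr || current->available() < commandSize + bbEndSize) {
            if (current != nullptr) {
                current->getSpace(bbEndSize); // program BB_END in the old buffer
            }
            current = allocateNewBuffer(bufferSize);
        }
        return current->getSpace(commandSize); // storage from the (possibly new) buffer
    }
};
```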
Related-To: NEO-5707
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
- Clear dependencies even if last engine changed
- Do not program semaphore waiting for blit when blit is submitted with gpgpu
- Track barrier timestamps to correctly synchronize blits in OOQ
Related-To: NEO-6444
Signed-off-by: Maciej Dziuban <maciej.dziuban@intel.com>