Instances returned by `getAllocationsVector()` in some cases cannot be
freed (in the `malloc/new` sense) until the `drain()` function invokes
`allocInUse()` on them. Plus, the `chunksToFree` container operates on
pairs `{offset, size}`, not pointers, so such pair cannot be used to
release allocations either.
Provide an optional callback, which can be implemented by the custom
pool derived from `AbstractBuffersPool`. This callback can be used, for
example, to perform actual release of an allocation related to the
currently processed chunk.
Additionally, provide the `drain()` and `tryFreeFromPoolBuffer()`
functions with pool-independent versions and keep the previous versions
as defaults (for allocators with a single pool). The new versions allow
reusing the code for cases when allocator has multiple pools.
In both cases, there was no such needs so far but it arose when working
on `IsaBuffersAllocator`. The latter is coming with future commits, but
the shared code modifications are extracted as an independent step.
Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups
Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups
Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
This commit adds a validation layer in ocloc,
which is designed to check if the data read from
the binary file does not exceed the size of the section.
Related-To: NEO-8062
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
With direct submission disabled this resulted in waitTimeout long enough
that kmdWait fallback was rarely used.
This caused more CPU spin time.
Related-To: GSD-3612
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
For image view mapped directly to UV plane,
the dimensions should 2 times smaller than
dimensions of the source image.
(1 raw UV pair maps to 2x2 block of original image)
Related-To: NEO-7936
Signed-off-by: Jaroslaw Chodor <jaroslaw.chodor@intel.com>
- remove useless flag ForceNumberOfThreadsInGpgpuThreadGroup
- add new flag "RemoveRestrictionsOnNumberOfThreadsInGpgpuThreadGroup"
to restore old path without restrictions about number of threads in
thread group
- fix forwarding information about hw local ids generations to
calculate numOfThreadsInThreadGroup correctly
Related-To: NEO-7952, NEO-7982
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
denorm support is controlled by IGC, we should just set zero by default
Related-To: NEO-8059
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
- Value correction: IntelGTSectionType::ProductConfig to 6, add new type
IntelGTSectionType::vISAAbiVersion = 5 - currently ignored by the
runtime
- For zebin manipulator: allow to extract PRODUCT_FAMILY from AOT
productConfig - required by IGA wrapper for binary encoding/decoding +
add tests
- Bump ZEInfo version to the latest: 1.32
Related-To: IGC-6300
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Remove not needed c-style cl_context handle casting on
clCreateContextFromType API call. This bug is currently also visible
when using OCL tracing API.
Related-To: NEO-8011
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Add new BB test for OCL tracing interface.
- Test usage of OCL tracing API
- Test usage of user-definied callback
- Test scenario with possible infinite recursion (nested call in
callback).
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Prevent from tracing nested API calls (case when similar
call is invoked in tracing callback) in OCL.
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Related-To: NEO-7958
- by default ZE_ENABLE_PCI_ID_DEVICE_ORDER is disabled
- by default devices are sorted by type (discrete first), then by pci order
- when ZE_ENABLE_PCI_ID_DEVICE_ORDER is enabled, devices are sorted by pci id
Related-To: LOCI-4520
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Use flushStamp=taskCount when passed flushStamp==0.
This will cause driver to busy wait for a short while before falling
back to use kmd notify.
Related-To: GSD-3612
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
denorm support is controlled by IGC, we should just set zero by default
Related-To: NEO-8059
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Currently the whole code resides within the opencl/ tree, but the
mechanism is meant to be reused in L0 for kernel-ISA allocations
optimization (further work).
This commit is a preparation step, which extracts the generic mechanism
and moves the extracted part under the shared/ tree.
Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
remove method to removing duplicates from StackVec as the method
implicitly sorted the vector
Related-To: GSD-4692
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
Related-To: NEO-6075
Ngen binaries contain stateful information, however they are
not used in isa on Pvc. Therefore, we can just ignore them.
- getGlobalBindlessHeapConfiguration() should be used to choose global
alloctor for SSH
- remove not needed and incorrect unit tests
- remove not needed branches
- bindless mode controls bindless compilation only
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Change DEBUG_BREAK to UNRECOVERABLE macro in the case of offset greater
than 32 bit (4 GB). Such huge offsets are not supported.
Current implementation is able to hide issues leading to incorrect
behaviour (i.e. overwritting indirect data).
Related-To: NEO-7970
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
- move isCachingOnCpuAvailable to product helper
- isCachingOnCpuAvailable should return false on mtl
- if wsl, skip checking method from product helper
Related-To: NEO-7194
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>