Increase chunk alignment from 256 to 512.
This restores performance in some workloads with the pool enabled, but lowers
the maximum possible number of buffers in a pool from 256 to 128.
MemObj size keeps the value passed to clCreateBuffer, i.e. it is not
aligned up to the chunk alignment.
CL_MEM_SIZE now returns the same value as with the pool disabled.
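The arithmetic behind these numbers can be sketched as follows. This is only an illustration: the 64 KB pool size is an assumption inferred from the figures above (256 * 256 == 128 * 512 == 65536), and `poolSize`, `alignUp`, `maxBuffersInPool`, `PooledBuffer` and `describePooledBuffer` are hypothetical names, not the allocator's actual code.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: pool size inferred from the commit's numbers, not taken from the source.
constexpr std::size_t poolSize = 64 * 1024;

// Round size up to the next multiple of alignment (alignment must be a power of two).
constexpr std::size_t alignUp(std::size_t size, std::size_t alignment) {
    return (size + alignment - 1) & ~(alignment - 1);
}

// Upper bound on buffers per pool, reached when every buffer occupies exactly one chunk.
constexpr std::size_t maxBuffersInPool(std::size_t chunkAlignment) {
    return poolSize / chunkAlignment;
}

// Hypothetical view of one pooled buffer.
struct PooledBuffer {
    std::size_t reservedSize; // bytes taken from the pool, aligned to the chunk alignment
    std::size_t reportedSize; // value reported through CL_MEM_SIZE, unchanged from the request
};

inline PooledBuffer describePooledBuffer(std::size_t requestedSize, std::size_t chunkAlignment) {
    assert(chunkAlignment != 0 && (chunkAlignment & (chunkAlignment - 1)) == 0);
    return {alignUp(requestedSize, chunkAlignment), requestedSize};
}

static_assert(maxBuffersInPool(256) == 256, "old alignment allows up to 256 buffers");
static_assert(maxBuffersInPool(512) == 128, "new alignment allows up to 128 buffers");
```

Under these assumptions a 100-byte request reserves one 512-byte chunk in the pool while the size reported to the application stays at 100.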
Related-To: NEO-7332
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Additional tag allocations are not needed before creating OCL contexts
with multiple root devices.
Related-To: NEO-7634
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
With this commit, events created on multi-root-device contexts
synchronize using signaled TagNodes instead of taskCounts.
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
Related-To: NEO-7105
Flag == -1 - platform default
Flag == 0 - disabled
Flag == 1 - enabled for single device contexts
Flag == 2 - enabled for all contexts
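One possible way to interpret these values is sketched below. The enum and function names are hypothetical, "single device context" is assumed to mean a context with exactly one device, and treating undocumented values as disabled is an assumption as well.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names for the four documented flag values.
enum class PoolFlag : int32_t {
    platformDefault = -1,
    disabled = 0,
    singleDeviceContexts = 1,
    allContexts = 2
};

// Decide whether a context may use the feature.
// platformDefault must itself be one of the concrete modes (0, 1 or 2).
inline bool isPoolEnabledForContext(int32_t flag, PoolFlag platformDefault, uint32_t numDevicesInContext) {
    assert(platformDefault != PoolFlag::platformDefault);
    PoolFlag mode = (flag == -1) ? platformDefault : static_cast<PoolFlag>(flag);
    switch (mode) {
    case PoolFlag::singleDeviceContexts:
        return numDevicesInContext == 1;
    case PoolFlag::allContexts:
        return true;
    case PoolFlag::disabled:
    default:
        // Assumption: values outside the documented set behave like "disabled".
        return false;
    }
}
```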
Related-To: NEO-7332
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
check if flags allow buffer from pool
add buffer offset to aubtests
disable pool buffer where required
Related-To: NEO-7332
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Allow disabling performance hints and making the allocation lockable.
Used in BufferPoolAllocator.
Related-To: NEO-7332
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Allocations of buffers <= 64KB will be lockable, to
allow copying through a locked pointer.
Related-To: NEO-7332
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Allow creating a subBuffer from a buffer allocated by the buffer pool
allocator by redirecting the call to the pool buffer and adjusting the offset.
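The redirection can be pictured with the minimal sketch below. The `Buffer` struct and `createSubBuffer` function are hypothetical stand-ins and do not mirror the driver's classes; the point is only that a sub-buffer of a pooled buffer ends up parented to the pool buffer, with both offsets added together.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified buffer description.
struct Buffer {
    Buffer *parent;         // buffer this one is carved out of, nullptr for a standalone buffer
    std::size_t offset;     // byte offset inside parent
    std::size_t size;       // size in bytes
    bool allocatedFromPool; // true when this buffer is a chunk of a pool buffer
};

// Create a sub-buffer covering [origin, origin + size) of the given buffer.
inline Buffer createSubBuffer(Buffer &buffer, std::size_t origin, std::size_t size) {
    assert(origin + size <= buffer.size);
    if (buffer.allocatedFromPool) {
        // Redirect to the pool buffer and shift the origin by the chunk's own offset.
        return Buffer{buffer.parent, buffer.offset + origin, size, false};
    }
    return Buffer{&buffer, origin, size, false};
}
```

For example, a sub-buffer at origin 128 of a pooled buffer that sits at offset 1024 in its pool would be created at offset 1152 of the pool buffer.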
Related-To: NEO-7332
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Improves performance in workloads that create small OpenCL buffers.
To enable, set env var ExperimentalSmallBufferPoolAllocator=1
Known issues (will be addressed in further commits):
- cannot create subBuffer from such buffer
- pool buffer allocation should be reused
Related-To: NEO-7332
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Notify gtpin onContextDestroy before SVM Allocations are deleted.
Resolves: NEO-6985
Signed-off-by: Sebastian Luzynski <sebastian.jozef.luzynski@intel.com>
Define a single .clang-tidy configuration with all used checks and use
NOLINT to selectively silence the tool. That way cleanup should be easier.
third_party/ has its own configuration that disables clang-tidy for this
folder.
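As an illustration of the selective silencing, clang-tidy accepts `NOLINT` and `NOLINTNEXTLINE` comments, optionally limited to a named check. The identifiers below are made up for the example, and `readability-identifier-naming` is used only as a plausible check; which checks the repository's configuration actually enables is not stated here.

```cpp
#include <cassert>

// Suppress a single check on the following line only.
// NOLINTNEXTLINE(readability-identifier-naming)
inline int legacy_entry_point(int value) {
    return value + 1;
}

// Suppress a single check on the same line.
constexpr int MAGIC_value = 42; // NOLINT(readability-identifier-naming)

// Code without a NOLINT comment is still checked as usual.
inline int callLegacy(int value) {
    return legacy_entry_point(value) + MAGIC_value;
}
```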
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
Stack vector will not cause dynamic allocations in most circumstances,
i.e. when the number of root device indices is not more than 16.
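The idea can be sketched with a small container that keeps its first elements in inline storage and only touches the heap once that storage is full. `SmallVec` and `RootDeviceIndices` are hypothetical names for this illustration and are not the driver's stack vector implementation; the sketch also assumes a default-constructible element type.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

template <typename T, std::size_t inlineCapacity>
class SmallVec {
  public:
    void push_back(const T &value) {
        if (spilled) {
            heap.push_back(value);
            return;
        }
        if (count < inlineCapacity) {
            inlineStorage[count++] = value; // no dynamic allocation on this path
            return;
        }
        // Inline storage is full: copy everything to the heap (first dynamic allocation).
        heap.assign(inlineStorage.begin(), inlineStorage.end());
        heap.push_back(value);
        spilled = true;
    }

    std::size_t size() const { return spilled ? heap.size() : count; }

    const T &operator[](std::size_t index) const {
        assert(index < size());
        return spilled ? heap[index] : inlineStorage[index];
    }

    bool usesHeap() const { return spilled; }

  private:
    std::array<T, inlineCapacity> inlineStorage{};
    std::size_t count = 0;
    std::vector<T> heap;
    bool spilled = false;
};

// Up to 16 root device indices fit without any heap allocation.
using RootDeviceIndices = SmallVec<uint32_t, 16>;
```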
Related-To: NEO-6837
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Dates corrected in copyright headers to reflect original publication date
(2018 for OpenCL, 2020 for Level Zero).
Signed-off-by: lgotszal <lukasz.gotszald@intel.com>
return if context has multiple sub devices related to a given root device
Related-To: NEO-3691
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
store reference to std of root device indices and device bitfields
store NEO::Device in USM properties
Related-To: NEO-3691
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>