Commit Graph

139 Commits

Author SHA1 Message Date
Lukasz Jobczyk b050a83242 performance: Use lock pointer copy for dc flush mitigation
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-10-31 10:53:39 +01:00
Compute-Runtime-Validation 3fcb9b18ee Revert "performance: Use lock pointer copy for dc flush mitigation"
This reverts commit b8be102455.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-10-31 05:15:07 +01:00
Lukasz Jobczyk b8be102455 performance: Use lock pointer copy for dc flush mitigation
Resolves: NEO-12898

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-10-29 21:37:53 +01:00
Dominik Dabek b2fc7345cf performance: redesign usm alloc reuse mechanism
Dedicated pools for different allocations size ranges.
Additional reused allocations will create their own pools.
Do not reuse allocations >256MB.

Related-To: NEO-6893, NEO-12299, NEO-12349

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-09-10 18:12:14 +02:00
Lukasz Jobczyk 03690e9b83 fix: Set special queue after its setup
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-09-06 12:05:24 +02:00
Dominik Dabek a47ca96a42 fix(ocl): allocate small buffer pool uncompressed
Related-To: HSD-15016054429

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-08-26 13:00:47 +02:00
Lukasz Jobczyk 9152b6ac04 performance: Defer special queue init to first use
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-08-23 14:22:51 +02:00
Compute-Runtime-Validation 4b01058706 Revert "performance: Defer special queue init to first use"
This reverts commit 25bb3c87ad.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-08-23 09:09:17 +02:00
Lukasz Jobczyk 25bb3c87ad performance: Defer special queue init to first use
Resolves: NEO-12332

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-08-22 10:07:44 +02:00
Michal Mrozek d52ca080bd Revert "performance: improve pool handling"
This reverts commit a3c3b6533a.

Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-08-06 13:04:02 +02:00
Michal Mrozek 20d6910b66 performance: move usm pool init to first alloc call
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-07-18 16:07:22 +02:00
Dominik Dabek c1c9ac634b performance(ocl): enable host usm alloc recycle
Enable at threshold of 2% system memory.

Related-To: NEO-6893

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-07-17 19:33:56 +02:00
Michal Mrozek a3c3b6533a performance: improve pool handling
Related-To: NEO-11731
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-06-25 17:04:17 +02:00
Szymon Morek 29e3eb512c performance: non-usm copy through staging buffers
Related-To: NEO-11501

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-06-25 07:18:53 +02:00
Compute-Runtime-Validation 7136dfbd38 Revert "performance: improve pool handling"
This reverts commit 5f0b9efd2b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-06-24 12:47:19 +02:00
Dominik Dabek b6d86d2648 refactor: tests for buffer pool
add support for future AIL

Related-To: NEO-11694

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-06-18 13:55:21 +02:00
Mrozek, Michal 5f0b9efd2b performance: improve pool handling
Signed-off-by: Mrozek, Michal <michal.mrozek@intel.com>
Resolves: NEO-11731
2024-06-14 12:02:34 +02:00
Dominik Dabek b4d839fe29 performance(usm): l0, add usm host memory pooling
Disabled by default.

Related-To: NEO-11356

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-05-15 15:20:51 +02:00
Szymon Morek 10ed479b16 performance: share inter-module ISA allocations
Related-To: NEO-10258

Currently each module has it's own GA
for kernel ISA's. This change allows new modules to
reuse existing allocation.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-05-09 08:43:55 +02:00
Dominik Dabek 371788210d performance: limit usm host allocation recycle
Query system total memory size and limit usm host allocation recycle to
use at most x%.
x is read from ExperimentalEnableDeviceAllocationCache for device and
ExperimentalEnableHostAllocationCache for host.

Related-To: GSD-7497

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-02-07 17:45:41 +01:00
Dominik Dabek dcab4863d5 performance(ocl): calculate max buffer pool count
Set max buffer pool count to use at most 2 percent of device total memory.

Related-To: NEO-9690

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-01-30 07:04:36 +01:00
Dominik Dabek 6e434e0424 performance(ocl): increase buffer pool size
increase pool size to 2MB and threshold to 1MB
add limit to the number of pools, set to 2

Related-To: NEO-9690

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-01-24 10:55:29 +01:00
Dominik Dabek 9b52d52062 performance(ocl): enable usm pool allocator
Enable on xe hpg and lpg platforms

Related-To: NEO-9700

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-01-05 16:17:48 +01:00
Dominik Dabek af1620a308 fix(ocl): allocation info from pool svm ptr
Fix querying allocation info from pooled svm ptr.
Handle requested allocation alignment.
Refactor sorted vector usage.
Do not associate device with host pool allocation.

Related-To: NEO-9700

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-01-05 15:20:01 +01:00
Compute-Runtime-Validation 5535ef3049 Revert "performance(ocl): enable usm pool allocator"
This reverts commit 7bc8424a69.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-12-29 05:54:07 +01:00
Dominik Dabek 7bc8424a69 performance(ocl): enable usm pool allocator
Enable opencl usm pool allocator by default

Related-To: NEO-9700

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-12-28 13:14:41 +01:00
Dominik Dabek d238a68bae fix(ocl): usm pool allocator correct size
Wrong debug flag was used for setting host allocation pool size

Related-To: NEO-9700

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-12-27 23:14:28 +01:00
Dominik Dabek 2fe3804cc2 performance(ocl): add usm allocation pooling flag
EnableDeviceUsmAllocationPool and EnableHostUsmAllocationPool for device
and host allocations respectively.

Pool size will be set to flag value * MB.

Allocation size threshold to be pooled is 1MB.

Pools are created per context.

Related-To: NEO-9700

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-12-27 11:42:01 +01:00
Mateusz Jablonski 27fbdde4c5 refactor: correct naming of unified memory enums
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-13 15:58:21 +01:00
Mateusz Jablonski 739d181026 refactor: correct naming of enum class constants 6/n
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-13 14:48:52 +01:00
Compute-Runtime-Validation a2994e9b29 Revert "performance(ocl): set pool allocator threshold 1MB"
This reverts commit fc1d93af8e.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-12-09 07:02:42 +01:00
Dominik Dabek fc1d93af8e performance(ocl): set pool allocator threshold 1MB
Increase pool allocator threshold to 1MB
Remove stack allocations based on threshold in tests.

Related-To: NEO-9690

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-12-06 19:55:48 +01:00
Mateusz Jablonski c9664e6bad refactor: rename global debug manager to debugManager
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-11-30 13:00:59 +01:00
Maciej Bielski c7a971a28f feature: add optional onChunkFree callback to AbstractBuffersPool
Instances returned by `getAllocationsVector()` in some cases cannot be
freed (in the `malloc/new` sense) until the `drain()` function invokes
`allocInUse()` on them. Plus, the `chunksToFree` container operates on
pairs `{offset, size}`, not pointers, so such pair cannot be used to
release allocations either.

Provide an optional callback, which can be implemented by the custom
pool derived from `AbstractBuffersPool`. This callback can be used, for
example, to perform actual release of an allocation related to the
currently processed chunk.

Additionally, provide the `drain()` and `tryFreeFromPoolBuffer()`
functions with pool-independent versions and keep the previous versions
as defaults (for allocators with a single pool). The new versions allow
reusing the code for cases when allocator has multiple pools.

In both cases, there was no such needs so far but it arose when working
on `IsaBuffersAllocator`. The latter is coming with future commits, but
the shared code modifications are extracted as an independent step.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-07-13 17:26:51 +02:00
Compute-Runtime-Validation 9c7950cd22 Revert "feature: add optional onChunkFree callback to AbstractBuffersPool"
This reverts commit b7ecf99abb.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-07-07 04:31:30 +02:00
Maciej Bielski b7ecf99abb feature: add optional onChunkFree callback to AbstractBuffersPool
Instances returned by `getAllocationsVector()` in some cases cannot be
freed (in the `malloc/new` sense) until the `drain()` function invokes
`allocInUse()` on them. Plus, the `chunksToFree` container operates on
pairs `{offset, size}`, not pointers, so such pair cannot be used to
release allocations either.

Provide an optional callback, which can be implemented by the custom
pool derived from `AbstractBuffersPool`. This callback can be used, for
example, to perform actual release of an allocation related to the
currently processed chunk.

Additionally, provide the `drain()` and `tryFreeFromPoolBuffer()`
functions with pool-independent versions and keep the previous versions
as defaults (for allocators with a single pool). The new versions allow
reusing the code for cases when allocator has multiple pools.

In both cases, there was no such needs so far but it arose when working
on `IsaBuffersAllocator`. The latter is coming with future commits, but
the shared code modifications are extracted as an independent step.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-07-06 10:38:55 +02:00
Maciej Bielski 7ea8ed1757 refactor: extract generic parts of small buffers allocator
Currently the whole code resides within the opencl/ tree, but the
mechanism is meant to be reused in L0 for kernel-ISA allocations
optimization (further work).

This commit is a preparation step, which extracts the generic mechanism
and moves the extracted part under the shared/ tree.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-06-13 10:46:03 +02:00
Mateusz Jablonski 4f72835b7d fix: create dedicated class for root device indices to store unique values
remove method to removing duplicates from StackVec as the method
implicitly sorted the vector

Related-To: GSD-4692
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-06-12 22:24:06 +02:00
Fabian Zwolinski e351a90f81 refactor: Rename member variables to camelCase 2/n
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-04-27 20:39:22 +02:00
Krzysztof Gibala 16db7cc890 fix: Add missing checks in multi gpu scenario
- Check allocation root device index during eviction
- Wait for and marked allocation only from the current root device index

Related-To: NEO-7920
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2023-04-24 23:26:28 +02:00
Igor Venevtsev 3e5101424d Optimize small buffers allocator
- Do not wait for GPU completion on pool exhaust if allocs are in use,
allocate new pool instead
- Free small buffer address range if allocs are not in use and
buffer pool is exhausted

Resolves: NEO-7769, NEO-7836

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2023-04-19 11:56:50 +02:00
Igor Venevtsev 6aadf63725 Revert "Optimize small buffers allocator"
This reverts commit f57ff2913c.

Resolves: HSD-15013057572

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2023-03-24 12:17:54 +01:00
Igor Venevtsev f57ff2913c Optimize small buffers allocator
- Do not wait for GPU completion on pool exhaust if allocs are in use,
allocate new pool instead
- Reuse existing pool if allocs are not in use

Related-To: NEO-7769

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2023-03-15 19:12:30 +01:00
Kacper Nowak efba242570 fix(zebin): Extend oneDNN WA for whole application context
When a dummy kernel "kernel void_(){}" is passed in sources - specific
for workloads with ngen backend - enforce fallback to CTNI for the whole
application context (mark the context as non-zebinary).

Related-To: NEO-7772
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
2023-03-07 14:21:57 +01:00
Warchulski, Jaroslaw 64f735481d Cleanup includes 48
Cleaned up files:
shared/source/command_container/command_encoder.inl
shared/source/os_interface/hw_info_config.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-02-10 17:23:02 +01:00
Mateusz Jablonski 24c5352350 refactor: remove redundant including of compiler_cache.h
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-02-03 11:16:31 +01:00
Warchulski, Jaroslaw 439aa6c87f Cleanup includes 43
Cleaned up files:
level_zero/core/test/unit_tests/mocks/mock_kernel.h
opencl/source/mem_obj/mem_obj.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-25 11:33:39 +01:00
Warchulski, Jaroslaw c43233dabf Cleanup includes 42
Cleaned up files:
level_zero/core/source/kernel/kernel_hw.h
shared/source/helpers/common_types.h
shared/test/common/libult/linux/drm_mock.h
shared/test/common/libult/ult_command_stream_receiver.h

Related-To: NEO-5548
Signed-off-by: Warchulski, Jaroslaw <jaroslaw.warchulski@intel.com>
2023-01-25 09:16:39 +01:00
Maciej Plewka fa4830036a feature(ocl) use tags to synchronize multi root device events
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-01-23 10:28:01 +01:00
Maciej Plewka 1421796541 Revert "feature(ocl) use tags to synchronize multi root device events"
This reverts commit 353a7510b2bd2d774d0b7ee82ee48eae7f5dc1d3.

Signed-off-by: Maciej Plewka maciej.plewka@intel.com
2023-01-17 11:29:58 +01:00