Commit Graph

168 Commits

Author SHA1 Message Date
Dominik Dabek 752f313808 fix: limit allocation cache memory wastage
Allocations over a certain size will be checked for memory utilization
when chosen for reuse.
If utilization is below a threshold, they will not be reused.

Related-To: NEO-6893

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-10-01 09:49:19 +02:00
Bartosz Dunajski fa4812f963 fix: add alignment flag support in svm path
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-08-01 10:40:47 +02:00
Dominik Dabek 9b3ccf73b7 refactor: host usm recycle
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-07-23 16:20:21 +02:00
Szymon Morek 0e6729062a performance: enable compression on shared USM
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-22 15:36:37 +02:00
Dominik Dabek 4fa6711025 performance(ocl): change device usm recycle to 8%
Increase threshold of device usm allocation recycling to 8% of device
memory.

Related-To: NEO-6893

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-07-17 19:50:46 +02:00
Szymon Morek b03ac6abd1 fix: disable usm compression on linux
Related-To: NEO-12047

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-16 14:59:33 +02:00
Szymon Morek 432ecbc8f4 fix: disable compression for exported allocations
Related-To: NEO-12021

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-16 05:41:14 +02:00
Szymon Morek 35cbbfe43a performance: Don't wait for taskCount for indirect allocs
Related-To: GSD-9385

In case of indirect allocations, we don't really know
their task count because we can't track their true usage
on GPU.
In case of non-blocking free, don't wait for latestSentTaskCount.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-10 15:51:04 +02:00
Szymon Morek 457cb005de performance: iterate over indirect allocations once
Related-To: NEO-11921

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-09 09:31:52 +02:00
Szymon Morek e8ee91a694 fix: iterate over each indirect allocation
Related-To: GSD-9450

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-04 12:01:46 +02:00
Szymon Morek 3dd051c3ee performance: adjust compression handling
Related-To: NEO-11882

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-03 09:37:11 +02:00
Dominik Dabek 79b9e73311 fix: device usm alloc reuse
Do not put into usm reuse if is internal.
Set new isInternalAllocation flag for globals allocations.

Use actual size on device for tracking memory usage.

Related-To: NEO-6893

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-05-29 12:18:34 +02:00
Compute-Runtime-Validation dd55225041 Revert "fix: device usm alloc reuse"
This reverts commit 7cb1819b22.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-05-28 21:19:40 +02:00
Dominik Dabek 7cb1819b22 fix: device usm alloc reuse
Do not put into usm reuse if is internal.
Set new isInternalAllocation flag for globals allocations.

Use actual size on device for tracking memory usage.

Related-To: NEO-6893

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-05-27 15:34:05 +02:00
Dominik Dabek c9758216fc fix(ocl): do not reuse usm for globals export
Allocating global surface is expecting that the usm allocation is zeroed
out. Reusing allocations can be filled with junk data and this caused
errors.

Resolves: HSD-18038551036, HSD-18038551766, HSD-18038551957, HSD-18038552252

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-05-21 14:38:28 +02:00
Dominik Dabek a236171f0d performance(ocl): enable device usm alloc reuse
Enabling on MTL+
Limited to use max 2% of global device memory.

Related-To: NEO-6893, NEO-11463

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-05-17 13:32:45 +02:00
Szymon Morek aa0441bc63 fix: Iterate from oldest allocation to latest
Related-To: NEO-11409

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-05-13 19:05:11 +02:00
Szymon Morek e35b951a00 performance: Allow indirect allocs as pack on OpenCL
Related-To: NEO-11228

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-05-10 17:57:42 +02:00
Szymon Morek 6df46aa062 performance: Iterate over indirect allocations once
Related-To: NEO-11228

Iterate only on new allocations when making indirect
allocations resident.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-05-06 15:51:37 +02:00
Lu, Wenbin 5e562ae7b0 fix: store the correct pagesize in SvmAllocationData
Also use the same alignment for both CPU & GPU in shared USM

Related-To: GSD-7103, NEO-9812

Signed-off-by: Wenbin Lu <wenbin.lu@intel.com>
2024-02-08 10:10:22 +01:00
Dominik Dabek 371788210d performance: limit usm host allocation recycle
Query system total memory size and limit usm host allocation recycle to
use at most x%.
x is read from ExperimentalEnableDeviceAllocationCache for device and
ExperimentalEnableHostAllocationCache for host.

Related-To: GSD-7497

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-02-07 17:45:41 +01:00
Dominik Dabek 2cad595a0d performance: debug flag for usm host alloc recycle
set ExperimentalEnableHostAllocationCache=1 to recycle host usm
allocations

Related-To: GSD-7497

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-02-01 16:47:59 +01:00
Compute-Runtime-Validation e7b7eb06e4 Revert "fix: store the correct pagesize in SvmAllocationData"
This reverts commit a104d9199d.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-02-01 05:00:45 +01:00
Lu, Wenbin a104d9199d fix: store the correct pagesize in SvmAllocationData
Related-To: GSD-7103, NEO-9812

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2024-01-31 09:12:20 +01:00
Dominik Dabek 2fe3804cc2 performance(ocl): add usm allocation pooling flag
EnableDeviceUsmAllocationPool and EnableHostUsmAllocationPool for device
and host allocations respectively.

Pool size will be set to flag value * MB.

Allocation size threshold to be pooled is 1MB.

Pools are created per context.

Related-To: NEO-9700

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-12-27 11:42:01 +01:00
Mateusz Jablonski dd1b9d6abc refactor: correct naming of enum class constants 8/n
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-19 08:18:18 +01:00
Mateusz Jablonski 27fbdde4c5 refactor: correct naming of unified memory enums
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-13 15:58:21 +01:00
Lu, Wenbin 67fa39c9a1 fix: get right page size when malloc uses 0 alignment
Related-To: GSD-7103

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-12-13 15:00:56 +01:00
Dominik Dabek 2146cd07ee refactor: SortedVectorBasedAllocationTracker
Move code out to base class. This will allow to use the sorted vector
class with different values than only SvmData.

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-12-13 12:47:04 +01:00
Mateusz Jablonski b182917d9d refactor: correct naming of allocation types
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-11 16:23:37 +01:00
Mateusz Jablonski c9664e6bad refactor: rename global debug manager to debugManager
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-11-30 13:00:59 +01:00
Lukasz Jobczyk ac8c00048e performance: optimize svm allocation tracking
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-11-23 10:54:01 +01:00
Compute-Runtime-Validation 7f61217a44 Revert "performance: optimize svm allocation tracking"
This reverts commit e91ce78ec8.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-11-16 11:03:19 +01:00
Lukasz Jobczyk e91ce78ec8 performance: optimize svm allocation tracking
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-11-15 13:58:05 +01:00
Lukasz Jobczyk 9a8138725a fix: Deferred SVM allocations look up by gpu address
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-11-14 15:27:01 +01:00
John Falkowski f156a74f54 fix: split chunking prefetch flags
Related-To: NEO-9120

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2023-10-18 19:20:42 +02:00
Mateusz Jablonski fc508212de refactor: pass big parameters as reference instead of by value
Related-To: NEO-9038
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-04 14:53:13 +02:00
John Falkowski 2403212dcd fix: chunking prefetch add USER_FENCE
Add USER_FENCE before PREFETCH call and after the BIND

Related-To: NEO-8098

Signed-off by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Signed-off-by: John Falkowski <john.falkowski@intel.com>
2023-08-17 21:32:47 +02:00
Lukasz Jobczyk 3ab72e7d79 fix: Align svm cpu to alignment passed to properties
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-08-11 14:57:49 +02:00
Compute-Runtime-Validation 820e94e89c Revert "fix: Align svm cpu to alignment passed to properties"
This reverts commit d66da494d4.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-08-11 07:14:30 +02:00
Lukasz Jobczyk d66da494d4 fix: Align svm cpu to alignment passed to properties
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-08-09 11:52:21 +02:00
Milczarek, Slawomir 1195578d96 fix: KW issue with dereference in function call that may return null
Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
2023-07-27 07:39:15 +02:00
Lu, Wenbin 4de792cee0 fix: support alignments in host and shared UnifiedMemoryAllocation
Related-To: LOCI-4334

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-07-13 08:48:41 +02:00
Compute-Runtime-Validation 02436b8877 Revert "fix: support alignments in host and shared UnifiedMemoryAllocation"
This reverts commit c11809e002.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-06-14 06:32:40 +02:00
Jitendra Sharma 38415162c5 fix: While creating shared memory use given device
When creating shared USM, currently default root device index
is used when accessing memoryManager.
This change fixes this issue, by using device provided by caller.
In case device is not provided, then default root device index
could be used.

Related-To: LOCI-4474

Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2023-06-13 15:44:14 +02:00
Lu, Wenbin c11809e002 fix: support alignments in host and shared UnifiedMemoryAllocation
Related-To: LOCI-4334

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-06-13 10:01:11 +02:00
Neil R Spruit ded9d7bff2 feature: Get Peer Allocation with specified base Pointer
Related-To: LOCI-4176

- Given a Base Pointer passed into Get Peer Allocation, then the base
pointer is used in the map of the new allocation to the virtual memory.
- Enables users to use the same pointer for all devices in Peer To Peer.
- Currently unsupported on reserved memory due to mapped and exec
resiedency of Virtual addresses.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2023-05-24 20:41:20 +02:00
Compute-Runtime-Validation b2b41e613b Revert "fix: add alignment support to host and shared UnifiedMemoryAllocation"
This reverts commit c3df92ac41.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-05-12 13:04:08 +02:00
Lu, Wenbin c3df92ac41 fix: add alignment support to host and shared UnifiedMemoryAllocation
Related-To: LOCI-4334

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-05-11 11:45:12 +02:00
Lu, Wenbin 5d653c8536 fix: Add alignment support to createUnifiedMemoryAllocation
Allows the user to use alignments > 64KB in `createUnifiedMemoryAllocation`

So that the restriction in `piextUSMDeviceAlloc` of the DPC++ runtime
could be lifted

Related-To: LOCI-4168

Signed-off-by: Lu, Wenbin <wenbin.lu@intel.com>
2023-05-02 09:19:23 +02:00