Commit Graph

333 Commits

Author SHA1 Message Date
Milczarek, Slawomir
6a9fcd38b1 Create KMD-migrated unified shared memory with multiple local memory regions
Remove the restriction on USM allocation created in a single local memory region
with latest KMD fix for cross tile migration thrashing b/t lmem (dii-3516)

Related-To: NEO-6909

Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
2022-07-04 13:33:23 +02:00
Mateusz Hoppe
31dbc04f23 L0Debug - capture SurfaceState heaps
Related-To: NEO-7103

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-06-24 15:05:56 +02:00
Maciej Plewka
6ab6e1abff Fix mutex order for event task and move args to gpu
This commit fixes problem with untransfered shared usm memory to gpu
when there is submit to gpu trigerred by user event. Also there is a fix
for dead lock problem caused by mixed orders of locking mutexes in csr
and in direct submission controller.

Related-To: NEO-6762

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2022-06-21 11:28:25 +02:00
Maciej Plewka
213dc2fe24 Make CPU copy for read buffer when host ptr is write combined on DG2
With this commit on DG2 32bit driver will check if passed host ptr for
clEnqueueReadBuffer is write combined memory. If check will be true copy
will be make on CPU.

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2022-06-13 21:23:21 +02:00
Dominik Dabek
cdfe2ce8ad Feature: Flag for device usm allocation reusing
With flag enabled, when app calls freeSVMAlloc on device usm allocation,
don't free it immediately but save it,
and try to use it on subsequent allocations.

This allocation cache will be trimmed if an allocation fails.

Related-To: NEO-6893

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-06-13 20:02:52 +02:00
Bartosz Dunajski
39c1c4d530 Remove virtual padding support
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2022-06-08 12:42:44 +02:00
Krzysztof Gibala
77dde01503 Pass canonized gpuAddress in setCpuPtrAndGpuAddress
Related-To: NEO-6523
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2022-06-07 10:05:18 +02:00
Krzysztof Gibala
c51ce2a35c Use GmmHelper as an object in freeGraphicsMemoryImpl
Related-To: NEO-6523
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2022-06-07 09:53:41 +02:00
Krzysztof Gibala
81899c4477 Add canonized gpuAddress to GraphicsAllocation constructor
Related-To: NEO-6523
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2022-06-06 18:17:36 +02:00
Krzysztof Gibala
dc1fe7d59a Change MemoryPool to enum class
Use enum class for MemoryPool in GraphicsAllocation
This change will ensure that GA is constructed in the proper way

- Rename namespace for isSystemMemoryPool method
- Add method getMemoryPoolString for logging actual pool which is in used
- Remove wrong pattern in GraphicsAllocation constructor

Related-To: NEO-6523
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2022-06-02 12:46:15 +02:00
Jaime Arteaga
325db6a99c Fix P2P support for implicit scaling
when using implicit scaling, 2 dma-buf handles, one per tile, are
needed to support dma access from peer.

Related-To: LOCI-3122

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2022-06-01 20:10:32 +02:00
Krzysztof Gibala
ae56d50b4f Pass canonized gpuAddress to GraphicsAllocation
Related-To: NEO-6523

Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2022-05-30 16:54:38 +02:00
Daniel Chabrowski
7463e1970b Cleanup headers
Make TUs and headers self-contained, remove unused headers

Signed-off-by: Daniel Chabrowski <daniel.chabrowski@intel.com>
2022-05-18 11:42:06 +02:00
Artur Harasimiuk
3f04769f07 style: configure readability-identifier-naming.FunctionCase
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2022-05-17 20:55:56 +02:00
Szymon Morek
4266f861ac Make implicit flush for cross-device dependency
Related-To: NEO-6418

If there's a cross-device dependency, flush batched
submissions to avoid deadlock.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2022-05-16 16:29:09 +02:00
Milczarek, Slawomir
9d31d36491 Disable cross-tile kmd migration for usm allocations
Ensure KMD migrations for USM allocations to occur between smem and lmem only

Related-To: NEO-6969

Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
2022-05-16 15:23:34 +02:00
Artur Harasimiuk
819e0f5515 style: configure readability-identifier-naming.LocalVariableCase
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2022-05-16 12:39:44 +02:00
Artur Harasimiuk
e9be9b64c6 clang-tidy configuration cleanup
Define single .clang-tidy configuration with all used checks and use
NOLINT to selectively silence tool. That way cleanup should be easier.
third_part/ has its own configuration that disables clang-tidy for this
folder.

Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2022-05-11 14:02:04 +02:00
Krzysztof Gibala
2fcda0a528 Refactor: Change decanonize method accessing point
Accessing decanonize method as a member of GmmHelper class object

Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2022-05-11 12:57:02 +02:00
Mateusz Jablonski
d935815d74 Move residency data from WddmAllocation to GraphicsAllocation
Related-To: NEO-6848

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-05-10 18:16:42 +02:00
Mateusz Jablonski
943ad0e1eb style: skip redundant unique_ptr::get function
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-05-10 13:22:40 +02:00
Krzysztof Gibala
4f1e01d279 Create getGmmHelper function in MemoryManager
Related-To: NEO-6523
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2022-05-09 16:51:23 +02:00
Krzysztof Gibala
1c366d1ec0 Refactor: Change canonize method accessing point
Accessing canonize method as a member of GmmHelper class object

Related-To: NEO-6523
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2022-05-09 14:16:31 +02:00
Jaime Arteaga
3f26f45c10 Add support for IPC handles with implicit scaling
When using implicit scaling, device allocations may have
more than one internal allocation created internally. In that case,
a separate dma-buf handle per internal allocation needs to be
exported.

So introduced two driver experimental extensions to export and
import more than one IPC handle:

- zexMemGetIpcHandles
- zexMemOpenIpcHandles

Related-To: LOCI-2919

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2022-05-09 00:38:17 +02:00
Maciej Plewka
0a16dc6c47 Fix multi thread usage of external host alloc
This is fixed reupload of this commit after auto revert
With this commit OpenCL will track if external host memory is used from
few threads and will secure to update task count in all threads before
destroing allocation.

Resolves: NEO-6807

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2022-05-05 13:32:15 +02:00
Compute-Runtime-Validation
00a1a14652 Revert "Fix multi thread usage of external host alloc"
This reverts commit 54eee2a88b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-04-28 17:42:07 +02:00
Krzysztof Gibala
9b778863b4 Store GmmHelper in Gmm class
Store GmmHelper in Gmm class instead of GmmClientContext

Related-To: NEO-6523
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2022-04-27 15:45:49 +02:00
Maciej Plewka
54eee2a88b Fix multi thread usage of external host alloc
With this commit OpenCL will track if external host memory is used from
few threads and will secure to update task count in all threads before
destroing allocation.

Resolves: NEO-6807

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2022-04-26 15:31:24 +02:00
Mateusz Hoppe
c9d61840a2 Allocate SBA buffers per HW context
- different physical storage for every HW context
- adds support for debugging with implicit scaling on
- reorganize tests

Relates-To: NEO-6883

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-04-15 10:01:28 +02:00
Dominik Dabek
8d1ad5a4f3 Refactor: use stack vector for root device indices
Stack vector will not cause dynamic allocations in most circumstances
ie. number of root device indices not more than 16

Related-To: NEO-6837

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-04-14 14:05:42 +02:00
Patryk Wrobel
352583b9d9 Detect GPU hang in evictUnusedAllocations()
This change introduces checking of the return value
of wait function in case of blocking version of
evictUnusedAllocations(). Furthermore, it propagates
the error to the callers. It contains also ULTs.

Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
2022-04-13 15:39:02 +02:00
John Falkowski
789587bacc Return nullptr for gfx alloc mmap error
Signed-off-by: John Falkowski <john.falkowski@intel.com>
2022-04-12 10:08:19 +02:00
Dominik Dabek
29e4518407 Update formatting in dynamic memory tracking
Format as follows:
backtrace start
call_1
...
call_n
backtrace end

Related-To: NEO-6837

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-04-08 13:17:30 +02:00
Lukasz Jobczyk
fffcf2612e Flush tag only if needed in deferred deleter
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-04-07 15:36:29 +02:00
Compute-Runtime-Validation
7a3976ad64 Revert "Force 64KB page size for cpu alignment in dual storage allocation"
This reverts commit 7ff6a5c1fa.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-04-07 00:45:07 +02:00
Kamil Diedrich
7ff6a5c1fa Force 64KB page size for cpu alignment in dual storage allocation
There is no need to force 2MB alignment for CPU allocation in dual
storage usage. Additionaly for WSL this will allow to avoid usage of
malloc in driver path.

Relates-To: NEO-6620
Signed-off-by: Kamil Diedrich <kamil.diedrich@intel.com>
2022-04-06 15:28:17 +02:00
Dominik Dabek
8d5c674110 Dynamic memory tracking, update function printing
Print only demangled name if it succeeded.

Related-To: NEO-6837

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-04-06 14:34:24 +02:00
Dominik Dabek
f7a767c767 Add demangling to dynamic memory tracking
Related-To: NEO-6837

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-04-05 20:38:27 +02:00
Filip Hazubski
88193cc242 Minor fixes
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2022-03-31 20:02:11 +02:00
Michal Mrozek
0c0603966b Add debug functionality to track dynamic allocations.
- available only via manual build with ENABLE_DYNAMIC_MEMORY_TRACKING.

Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2022-03-31 13:24:15 +02:00
Compute-Runtime-Validation
91cfd3cd1a Revert "Unify command/ring/semaphore buffers placement"
This reverts commit e035199de4.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-03-30 14:05:47 +02:00
Mateusz Jablonski
e035199de4 Unify command/ring/semaphore buffers placement
put them all to the same memory location

Related-To: NEO-6698
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-03-29 17:55:48 +02:00
Lukasz Jobczyk
a230f267e1 Poll task count indefinitely on high throttle command queue
Resolves: NEO-6781

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-03-25 10:06:16 +01:00
Mateusz Jablonski
8a8b4866cb XeHPC: force local memory for command/ring/semaphore buffer
require 48bit resource for ring/semaphore buffer
for multi tile allocations select first tile
for single tile allocation select preferred tile

Related-To: NEO-6698
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-03-24 10:54:26 +01:00
Milczarek, Slawomir
f03f530327 Extend zeCommandListAppendMemoryPrefetch to migrate to associated device
Related-To: NEO-6740

Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
2022-03-23 14:21:17 +01:00
Mateusz Jablonski
3792481d33 XeHPC Implicit scaling: put command/ring/semaphore buffer to first memory bank
In direct submission scenario command/ring/semaphore buffer allocations
are placed in the same memory bank to ensure that their memory is updated in
correct order

Related-To: NEO-6698
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-03-21 08:10:50 +01:00
Daniel Chabrowski
754efbdaff Cleanup mocks
Signed-off-by: Daniel Chabrowski daniel.chabrowski@intel.com
Related-To: NEO-6591
2022-03-18 11:54:03 +01:00
Jaime Arteaga
c0e2251ceb Skip adding allocations to remote devices if not allocated there
When making graphics allocations resident in multi-GPU scenarios,
we should make them resident only if there's an allocation for that
device. So return appropriate null pointer and skip it.

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
2022-03-16 23:58:07 +01:00
Maciej Bielski
a71b88fefb PageFaultHandler: speedup UM allocations lookup
Add a per-instance SVMAllocsManager::nonGpuDomainAllocs container for
all allocations to be removed in
moveAllocationsWithinUMAllocsManagerToGpuDomain. This approach replaces
the current iterative search and performs the task faster.

Add 7 new unit-tests to verify the functionality related to
nonGpuDomainAllocs container, both in expected and unexpected/synthetic
scenarios.

For UTs replace a dummy unifiedMemoryManager pointer with a pointer to
an instace of SVMAllocsManager, otherwise a SegFault error is thrown at
the end of tests.

Perform overall cleanup in related tests implementation, includes but
not limited to removal of:

- givenInitialPlacementGpu\
WhenMovingToGpuDomainThenFirstAccessDoesNotInvokeTransfer

As it is fully covered by:

givenAllocationMovedToGpuDomain\
WhenVerifyingPagefaultThenAllocationIsMovedToCpuDomain

- givenInitialPlacementGpu\
WhenVerifyingPagefaultThenFirstAccessDoesNotInvokeTransfer

As it is fully covered by:

givenTbxAndnitialPlacementGpu\
WhenVerifyingPagefaultThenMemoryIsUnprotectedOnly

Finally, reduce code duplication where possible.

Related-To: NEO-6658
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2022-03-16 11:18:24 +01:00
Milczarek, Slawomir
c0b7f05897 Add memory prefetch for kmd migrated shared allocations
This feature is disabled by default, controlled with the knob
AppendMemoryPrefetchForKmdMigratedSharedAllocations

Related-To: NEO-6740

Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
2022-03-09 16:02:18 +01:00