658 Commits

Author SHA1 Message Date
Szymon Morek
80ef56ef4a fix: Fix residency handling when out-of-memory occurs
Related-To: NEO-12434 , NEO-11755

When OOM was triggered from KMD then reiterate
over allocations again since allocations which
should be resident could be evicted during trim process.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-09-05 13:23:37 +02:00
Compute-Runtime-Validation
d842f65cf1 Revert "fix: submit dummy exec to pin memory during zeContextMakeMemoryReside...
This reverts commit f9b87d53e6.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-09-05 03:28:03 +02:00
Maciej Plewka
f9b87d53e6 fix: submit dummy exec to pin memory during zeContextMakeMemoryResident call
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>

Related-To: NEO-11879
2024-09-04 14:07:29 +02:00
Compute-Runtime-Validation
99f62ac866 Revert "feature: support SVM heap in reserveVirtualMem"
This reverts commit 93cde3ee12.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-09-03 20:20:25 +02:00
Wenbin Lu
93cde3ee12 feature: support SVM heap in reserveVirtualMem
Related-To: NEO-11981

Signed-off-by: Wenbin Lu <wenbin.lu@intel.com>
2024-09-03 11:38:51 +02:00
Szymon Morek
e6abfafa16 fix: drain paging fence queue before waiting for resources
Related-To: NEO-12197

If ULLS controller waits for CSR lock, and driver must
wait for resources due to OOM, then drain paging fence queue
directly

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-09-03 07:45:25 +02:00
Compute-Runtime-Validation
fbabe203c1 Revert "fix: update completion data after makeResident"
This reverts commit 13ec7ad32a.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-08-31 20:54:02 +02:00
Szymon Morek
13ec7ad32a fix: update completion data after makeResident
Completion data can't be updated when makeResident fails
Resources could have updated state but paging fence remains the same.
Wait on paging fence during freeing these resources might result in hang.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-30 13:29:48 +02:00
Mateusz Jablonski
c934877790 refactor: remove not needed function
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-08-30 12:18:14 +02:00
Szymon Morek
df859c6d4a fix: skip always resident allocations during trim
Related-To: NEO-12461

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-30 11:22:44 +02:00
Szymon Morek
a0b789bf9c fix: make removal from container thread safe
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-30 09:24:20 +02:00
Compute-Runtime-Validation
78d9af04e7 Revert "fix: change mutex when destroying allocation"
This reverts commit 7628966f80.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-08-29 18:14:01 +02:00
Compute-Runtime-Validation
2d8397a36f Revert "fix: skip always resident allocations during trim"
This reverts commit c9457bb5eb.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-08-29 15:57:37 +02:00
Szymon Morek
7628966f80 fix: change mutex when destroying allocation
Current mutex is not preventing destroying resources
when trim callback is currently evicting same allocation.
New mutex does.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-28 15:31:45 +02:00
Szymon Morek
c9457bb5eb fix: skip always resident allocations during trim
Related-To: NEO-12461

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-27 15:04:50 +02:00
Compute-Runtime-Validation
5dbbaa39b9 Revert "fix: ulls controller sleep, windows"
This reverts commit 6455d4648c.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-08-24 06:37:58 +02:00
Szymon Morek
b8f181d50e performance: remove trim candidate list
Related-To: NEO-11755

Removing trim candidate list reduces overhead
caused by residency handling. Allocations required
for eviction are placed in eviction container managed
by CSR.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-23 12:21:50 +02:00
Dominik Dabek
6455d4648c fix: ulls controller sleep, windows
Request higher resolution for windows periodic timers for ulls
controller sleep.

Allows for controller thread to sleep with granularity of 1ms.

Related-To: NEO-10800

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-08-23 12:05:26 +02:00
Mateusz Jablonski
7ac41615cd fix: create thread with function pointer
don't create async thread in neo shared tests

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-08-21 18:02:37 +02:00
Mateusz Hoppe
8b1bedd1f6 fix: call processFlushResidency on aub operations handler
- in Os with AubDump mode

Related-To: NEO-11719

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-08-20 21:13:10 +02:00
Mateusz Jablonski
579af57161 refactor: don't call OsLibrary::load directly, use function pointer
this allows mocking this call in ULT

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-08-20 08:59:26 +02:00
Mateusz Jablonski
efb8240a00 refactor: rename OsLibrary::load function to distinguish functionality
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-08-16 01:35:41 +02:00
Szymon Morek
7ebf2e1994 performance: unlock shared mutex during evict
Related-To: NEO-11755

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-09 10:35:12 +02:00
Szymon Morek
556a116987 fix: make paging fence address volatile
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-07 15:09:34 +02:00
Szymon Morek
d7d6996464 performance: initialize timeout params once
Currently this is done per each enqueue
which is not really needed

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-07 14:35:12 +02:00
Szymon Morek
0d6c506c0b performance: enable wait on paging fence on semaphore
Related-To: NEO-12197

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-07 10:20:03 +02:00
Szymon Morek
d4c1631ac7 performance: don't wait for paging fence on user thread
Related-To: NEO-12197

Currently for new resources user thread must wait before submitting
actual workload. With this commit, instead of waiting on user thread,
request is sent to background ULLS controller thread and additional
semaphore is programmed. ULLS controller will perform actual wait
and signal semaphore when paging fence reaches required value.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-08-07 08:30:51 +02:00
shubham kumar
0002eb3fcc feature: adding eu stall support on windows
Related-To: NEO-12174


Signed-off-by: shubham kumar <shubham.kumar@intel.com>
2024-08-06 06:47:11 +02:00
Dominik Dabek
e9e6cc05e3 fix: mem alloc size tracking safety
Make sure local mem alloc size atomic array is initialized with 0.
Add debug breaks to catch possible overflow on unregistering
allocations.

Related-To: NEO-11356

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-08-05 11:51:17 +02:00
Dominik Dabek
26428d5af3 feature: track used memory by allocations
Track memory used by memory allocations. System and local per device.
Will be used for heuristics in memory pooling.

Related-To: NEO-11356

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-08-02 17:54:34 +02:00
Mateusz Jablonski
afc1664fce fix: fail wddm initialization when cannot create topology map
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-30 09:35:42 +02:00
Mateusz Hoppe
8a7923c6ee fix: allow fork() after zeInit()
- do not release resources derived from parent process
- zeInit() in child should initialize new driver

Related-To: NEO-11761

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-07-23 15:31:50 +02:00
Mateusz Hoppe
0800ab54f5 refactor: remove redundant getPid()
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-07-22 22:51:47 +02:00
Compute-Runtime-Validation
0cb2a22c55 Revert "fix: correct number of slice count in configureHwInfoDrm"
This reverts commit b597f47a70.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-07-19 04:35:03 +02:00
Mateusz Jablonski
b597f47a70 fix: correct number of slice count in configureHwInfoDrm
adjust slice count to proper value based on previously calculated
max slices and max subslice counts

Related-To: NEO-12073
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-18 16:54:51 +02:00
Grzegorz Choinski
46f2568902 build: fixes for windows clang with -m32
Related-To: NEO-10748
Signed-off-by: Grzegorz Choinski <grzegorz.choinski@intel.com>
2024-07-18 14:49:56 +02:00
Mateusz Jablonski
1d7ce005d7 refactor: extract common logic from wddm and drm product helpers
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-17 11:03:02 +02:00
Compute-Runtime-Validation
e3053121cb Revert "refactor: extract common logic from wddm and drm product helpers"
This reverts commit 585caab757.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-07-17 04:53:28 +02:00
Michal Mrozek
61e08ef4ff performance: add debug flag to allow non zero for compressed
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-07-16 12:46:10 +02:00
Mateusz Jablonski
585caab757 refactor: extract common logic from wddm and drm product helpers
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-16 11:17:18 +02:00
Mateusz Jablonski
80afda1ac9 refactor: extract common logic of setting kmd notify properties
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-15 17:58:34 +02:00
Mateusz Jablonski
a2fb4da91d fix: correct fallback path when creating topology map in Wddm
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-15 16:52:47 +02:00
Mateusz Jablonski
789a008470 fix: setup proper preemption surface size when forcing builtin SIP
when getting SIP kernel from IGC, setup related surface size based on IGC data

Related-To: NEO-8188
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-15 15:56:24 +02:00
shubham kumar
e78c8edcf3 refactor: prework for adding eu stall support on windows
Related-To: NEO-9492

Signed-off-by: shubham kumar <shubham.kumar@intel.com>
2024-07-12 16:27:24 +02:00
Mateusz Jablonski
e39994f525 fix: setup slm size based on gt system info when not set in capability table
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-09 15:21:35 +02:00
Compute-Runtime-Validation
991640f558 Revert "fix: update slm size in capability table based on gt system info"
This reverts commit 47e064a686.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-07-09 03:31:42 +02:00
Mateusz Jablonski
47e064a686 fix: update slm size in capability table based on gt system info
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-08 09:35:25 +02:00
Compute-Runtime-Validation
c679e7df30 Revert "fix: update slm size in capability table based on gt system info"
This reverts commit e624a4b0ab.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-07-06 03:40:49 +02:00
Mateusz Jablonski
e624a4b0ab fix: update slm size in capability table based on gt system info
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-05 14:25:33 +02:00
Filip Hazubski
6992cb8aeb fix: Add experimental debug toggle to force 2M local memory size alignment
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2024-06-27 15:21:35 +02:00