Commit Graph

1093 Commits

Author SHA1 Message Date
Kamil Kopryk
aef760b6b0 feature: add tbx support for host functions
TBX requires a write-memory call after the driver changes a mapped
allocation.
Host functions use bytes mapped from tagAllocation.

Host function data update has 2 steps:
* update the mapped data in the driver
* write memory so Tbx can see the data

The tag allocation can be pulled (downloadAllocation),
e.g. while waiting, at the same time as the host function worker thread
updates the data.
In that scenario, a concurrent downloadAllocation call could revert
the updated mapped data.

I've added a lock to prevent concurrent downloadAllocation calls
from overlapping the two-step TBX host function data update.
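
The race described above can be illustrated with a minimal sketch. Everything here is hypothetical and not taken from the driver: the class MirroredTagAllocation, its two byte buffers standing in for driver-mapped memory and simulator memory, and the method names other than downloadAllocation. It only shows the idea of holding one lock across both update steps so that a download cannot run between them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a tag allocation that exists twice:
// once as driver-mapped bytes and once as simulator (TBX) memory.
class MirroredTagAllocation {
  public:
    explicit MirroredTagAllocation(std::size_t size) : mapped(size, 0), simulator(size, 0) {}

    // Two-step host function data update, done under a single lock.
    void updateHostFunctionData(std::size_t offset, std::uint64_t value) {
        std::lock_guard<std::mutex> lock(mtx);
        assert(offset + sizeof(value) <= mapped.size());
        // Step 1: update the mapped data on the driver side.
        std::memcpy(mapped.data() + offset, &value, sizeof(value));
        // Step 2: "write memory" so the simulator can see the data.
        std::memcpy(simulator.data() + offset, mapped.data() + offset, sizeof(value));
    }

    // Pull simulator memory back into the mapped bytes. Without the lock,
    // this could run between step 1 and step 2 and revert step 1.
    void downloadAllocation() {
        std::lock_guard<std::mutex> lock(mtx);
        mapped = simulator;
    }

    std::uint64_t readMapped(std::size_t offset) const {
        std::lock_guard<std::mutex> lock(mtx);
        assert(offset + sizeof(std::uint64_t) <= mapped.size());
        std::uint64_t value = 0;
        std::memcpy(&value, mapped.data() + offset, sizeof(value));
        return value;
    }

  private:
    mutable std::mutex mtx;
    std::vector<std::uint8_t> mapped;
    std::vector<std::uint8_t> simulator;
};
```

Because both methods take the same mutex, a download observes either the state before the update or the state after both steps, never the intermediate one.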

Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-12-16 09:55:51 +01:00
Mateusz Jablonski
0c2c1df1d4 fix: correct setting run alone flag for aub csr
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-12-15 16:16:07 +01:00
Mateusz Jablonski
814afc90fe feature: Add initial support for CRI
Related-To: NEO-16649

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-12-15 12:12:42 +01:00
Compute-Runtime-Validation
12a683bb07 Revert "feature: Add initial support for CRI"
This reverts commit d71f93c89b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-12-15 06:29:31 +01:00
Mateusz Jablonski
d71f93c89b feature: Add initial support for CRI
Related-To: NEO-16649

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-12-12 22:34:38 +01:00
Szymon Morek
c265bc692f refactor: add infrastructure for setting L1 flush mode
Related-To: NEO-15936

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-12-12 14:16:04 +01:00
Kamil Kopryk
6da5f98324 fix: implicit scaling for hostFunctions
Program StoreData once, using the base address
with the workloadPartitionOffset bool enabled.
Program SemaphoreWait for each partition, since work on each
tile must be synchronized.
The HostFunction worker waits for the HostFunctionId on all tiles,
using the partition offset for each partition.
HostFunction completion clears the HostFunctionId
for each partition using the partition offset.
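
A rough sketch of the per-partition addressing described above, under stated assumptions: the class PartitionedIdSlots, the 32-bit slot width, and the use of 0 as the cleared value are all hypothetical and chosen only for illustration. Each partition's ID slot sits at partition * partitionOffset from the base.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical model: one host function ID slot per partition (tile),
// each located partitionOffset bytes after the previous one.
class PartitionedIdSlots {
  public:
    PartitionedIdSlots(std::uint32_t partitionCount, std::size_t partitionOffset)
        : partitionCount(partitionCount), partitionOffset(partitionOffset),
          memory(partitionCount * partitionOffset, 0) {
        assert(partitionOffset >= sizeof(std::uint32_t));
    }

    std::uint32_t load(std::uint32_t partition) const {
        assert(partition < partitionCount);
        std::uint32_t value = 0;
        std::memcpy(&value, memory.data() + partition * partitionOffset, sizeof(value));
        return value;
    }

    void store(std::uint32_t partition, std::uint32_t value) {
        assert(partition < partitionCount);
        std::memcpy(memory.data() + partition * partitionOffset, &value, sizeof(value));
    }

    // Worker side: the host function is ready only when every tile wrote the ID.
    bool isReadyOnAllPartitions(std::uint32_t id) const {
        for (std::uint32_t p = 0; p < partitionCount; ++p) {
            if (load(p) != id) {
                return false;
            }
        }
        return true;
    }

    // Completion side: clear the ID for each partition (0 is an assumed cleared value).
    void clearOnAllPartitions() {
        for (std::uint32_t p = 0; p < partitionCount; ++p) {
            store(p, 0);
        }
    }

  private:
    std::uint32_t partitionCount;
    std::size_t partitionOffset;
    std::vector<std::uint8_t> memory;
};
```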

Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-12-12 10:48:01 +01:00
Kamil Kopryk
46f40eb793 refactor: remove experimental out of order host functions
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-12-11 13:22:44 +01:00
Mateusz Hoppe
00b4219adb refactor: defer hwContext creation for aubs and tbx
- create HardwareContext when osContext is set up and initialized

Related-To: NEO-16666

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2025-12-10 19:20:28 +01:00
Jaroslaw Warchulski
60376bd98a refactor: cleanup includes
Signed-off-by: Jaroslaw Warchulski <jaroslaw.warchulski@intel.com>
2025-12-10 09:33:04 +01:00
Szymon Morek
f7a87f1509 fix: properly flush device cache during host sync on event
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-12-09 10:44:28 +01:00
Naklicki, Mateusz
2c3b6a8760 feature: add 64-bit semaphore command
Related-To: NEO-15636

Signed-off-by: Naklicki, Mateusz <mateusz.naklicki@intel.com>
2025-12-08 13:59:29 +01:00
Jaroslaw Warchulski
33e25b260e refactor: do not include gmm_lib.h in gmm.h
Signed-off-by: Jaroslaw Warchulski <jaroslaw.warchulski@intel.com>
2025-12-08 12:52:02 +01:00
Mateusz Jablonski
4f5d1f1175 feature: add stream properties for xe3p specific fields
Related-To: NEO-16649

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-12-08 09:23:23 +01:00
Kamil Kopryk
fefc1f6a36 refactor: move logic to dedicated functions
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-12-08 08:05:32 +01:00
Maciej Bielski
147bd894ec refactor: use PRINT_STRING macro for most diagnostics
Related-To: NEO-14742
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2025-11-28 13:28:29 +01:00
Michal Mrozek
68d01f398f refactor: remove not needed code
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2025-11-28 13:08:13 +01:00
Jaroslaw Warchulski
cc79a136c9 refactor: do not use C headers
Signed-off-by: Jaroslaw Warchulski <jaroslaw.warchulski@intel.com>
2025-11-25 12:07:50 +01:00
Szymon Morek
861ea7200d performance: increase heap size to 4MB on OCL
Related-To: NEO-16348

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-11-25 09:58:16 +01:00
Kamil Kopryk
56b30d1803 feature: redesign host function workers
Each host function gets a unique ID within a CSR.
One MI store writes the ID to signal that the host function is ready,
and one MI semaphore wait waits for the ID to be cleared.
Bit 0 of the ID serves as the pending/completed flag:
host function IDs start at 1 and are incremented by 2,
so every ID always has bit 0 set.
This is required because a semaphore wait can wait on 4 bytes only.
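
The ID scheme above can be sketched as follows. This is only an illustration: the names HostFunctionIdGenerator and isPending are hypothetical, and the 32-bit ID type and the cleared value of 0 are assumptions inferred from the 4-byte semaphore remark, not confirmed details of the implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical generator: IDs start at 1 and advance by 2,
// so every issued ID is odd and has bit 0 set.
class HostFunctionIdGenerator {
  public:
    std::uint32_t next() {
        // Guard against wrapping around to an even value.
        assert(nextId < std::numeric_limits<std::uint32_t>::max() - 1);
        const std::uint32_t id = nextId;
        nextId += 2;
        return id;
    }

  private:
    std::uint32_t nextId = 1;
};

// Bit 0 set means a host function is pending; a cleared tag (assumed to be 0)
// means it completed, which is what the semaphore wait polls for.
constexpr bool isPending(std::uint32_t tagValue) {
    return (tagValue & 1u) != 0;
}
```

With this scheme a single 4-byte compare is enough: any odd value in the tag means "pending", and clearing the tag releases the waiter.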

Adjust command buffer programming and patching logic to use IDs.

Add a hostFunction callable class with an invoke method;
it stores the required information about the callback.

Add a host function streamer, which stores all host function data
for a given CSR.
All user-provided host functions are stored in an unordered map
keyed by host function ID.

Add a host function scheduler and a thread pool, under a debug flag.
The single-threaded scheduler loops over all registered host function streamers
and dispatches ready-to-execute host functions to the thread pool.

Allow out-of-order host function execution for OOQ, under a debug flag.
Each host function has a bool isInOrder flag that indicates whether it can be
executed out of order. In this mode, the ID tag is cleared immediately,
so the semaphore wait unblocks before the host function executes.

Remove the CV- and atomics-based host function worker implementation.

Rename classes

Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-25 08:05:41 +01:00
Mateusz Jablonski
a22817200f refactor: add wrapper for max gfx core
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-11-24 21:35:38 +01:00
Igor Venevtsev
7f593bd295 fix: use condition variables instead of busy waits in worker threads
Resolves: NEO-16085, GSD-11678, HSD-14025819208
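
For readers unfamiliar with the technique named in the title, here is a generic sketch of a worker thread that sleeps on a condition variable instead of spinning. It is not the code from this commit; the class ConditionVariableWorker and its members are hypothetical.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

// Hypothetical worker: blocks on a condition variable until there is a task
// or a stop request, rather than polling in a busy loop.
class ConditionVariableWorker {
  public:
    ConditionVariableWorker() : thread([this] { run(); }) {}

    ~ConditionVariableWorker() {
        {
            std::lock_guard<std::mutex> lock(mtx);
            stop = true;
        }
        cv.notify_one();
        thread.join(); // remaining queued tasks are drained before exit
    }

    void submit(std::function<void()> task) {
        assert(task && "empty task");
        {
            std::lock_guard<std::mutex> lock(mtx);
            tasks.push(std::move(task));
        }
        cv.notify_one();
    }

  private:
    void run() {
        std::unique_lock<std::mutex> lock(mtx);
        for (;;) {
            // Sleeps without consuming CPU; the predicate handles spurious wakeups.
            cv.wait(lock, [this] { return stop || !tasks.empty(); });
            if (tasks.empty()) {
                return; // stop requested and nothing left to run
            }
            auto task = std::move(tasks.front());
            tasks.pop();
            lock.unlock(); // run the task without holding the lock
            task();
            lock.lock();
        }
    }

    std::mutex mtx;
    std::condition_variable cv;
    std::queue<std::function<void()>> tasks;
    bool stop = false;
    std::thread thread; // declared last so the other members exist before it starts
};
```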

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2025-11-21 08:48:45 +01:00
Kamil Kopryk
c3e98e346a refactor: mark host functions classes as final
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-18 14:59:02 +01:00
Mateusz Jablonski
7660b29bbb fix: reduce types for tagSize and tagCount within TagAllocator
Related-To: NEO-16444

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-11-17 10:47:51 +01:00
Jakub Nowacki
be34c1ac86 performance: move instead of copy
Related-To: NEO-15630

Signed-off-by: Jakub Nowacki <jakub.nowacki@intel.com>
2025-11-14 16:30:05 +01:00
Compute-Runtime-Validation
ff27bb12d1 Revert "fix: use condition variables instead of busy waits in worker threads"
This reverts commit 4406889b39.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-11-14 15:55:47 +01:00
Jaroslaw Warchulski
0afcec950e refactor: cleanup includes
Signed-off-by: Jaroslaw Warchulski <jaroslaw.warchulski@intel.com>
2025-11-14 11:22:46 +01:00
Mateusz Hoppe
91fe2ec380 refactor: remove not needed debug flag AppendAubStreamContextFlags
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2025-11-12 16:06:09 +01:00
Kamil Kopryk
129249f022 refactor: correct typo
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-06 15:15:08 +01:00
Kamil Kopryk
8757ecf2f7 refactor: reuse tag allocation for host function data
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-05 14:51:20 +01:00
Lukasz Jobczyk
01885fe362 fix: proper lock order when reinitialize context
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-11-05 09:51:37 +01:00
Lukasz Jobczyk
498f62d7a0 fix: Reset direct submission when reinitialize context
Resolves: HSD-15018564496
Related-To: NEO-16651

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-11-04 13:32:36 +01:00
Kamil Kopryk
f84a5fbee9 feature: add host functions workers
* add common host function worker interface
* add worker as a single thread per csr with 3 modes
* add logic for waiting on internal tag, check gpu hang
* if tag is in pending state, read callback data, run callback
and signal completion
* threads will exit the work loop once a stop request
is issued in finish
* add multithreaded unit tests

Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-03 12:11:17 +01:00
Igor Venevtsev
4406889b39 fix: use condition variables instead of busy waits in worker threads
Resolves: NEO-16085, GSD-11678, HSD-14025819208

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2025-10-31 15:28:54 +01:00
Compute-Runtime-Validation
b7d1c32edd Revert "fix: use condition variables instead of busy waits in worker threads"
This reverts commit 1f6039676f.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-10-24 10:55:27 +02:00
Igor Venevtsev
1f6039676f fix: use condition variables instead of busy waits in worker threads
Resolves: NEO-16085, GSD-11678, HSD-14025819208

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2025-10-21 17:37:00 +02:00
Mateusz Jablonski
6f83f699d7 fix: unify expect memory functions
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-10-17 14:30:38 +02:00
Fabian Zwoliński
6102280f71 fix: add missing writeMemory for pooled global surface
Related-To: HSD-18043489182, HSD-18043476772
Signed-off-by: Fabian Zwoliński <fabian.zwolinski@intel.com>
2025-10-17 14:26:54 +02:00
Szymon Morek
64b79723cc performance: enable cmd buffers reuse without DC flush
Related-To: NEO-16348

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-10-17 14:26:37 +02:00
Szymon Morek
c78c1515de performance: reuse cmd buffer without dc flush
Related-To: NEO-16348

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-10-16 16:26:54 +02:00
Mateusz Jablonski
35f6dc12b8 refactor: remove not needed code
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-10-15 16:19:04 +02:00
Lukasz Jobczyk
ce1c5d747b fix: fix data race issue
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-10-13 14:11:28 +02:00
Lukasz Jobczyk
6515e422e9 refactor: move eviction container to residency controller
Related-To: NEO-13315

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2025-10-13 08:41:34 +02:00
Compute-Runtime-Validation
244dd9b0b4 Revert "fix: use condition variables instead of busy waits in worker threads"
This reverts commit db0b4a616c.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-10-11 03:52:05 +02:00
Compute-Runtime-Validation
2eb8928ec5 Revert "performance: increase heap size to 4MB"
This reverts commit f41bb3517a.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2025-10-10 22:23:23 +02:00
Igor Venevtsev
db0b4a616c fix: use condition variables instead of busy waits in worker threads
Resolves: NEO-16085, GSD-11678, HSD-14025819208

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2025-10-10 21:42:02 +02:00
Mateusz Jablonski
d53ac208fc refactor: remove not needed code
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-10-10 19:31:00 +02:00
Mateusz Jablonski
1918c5e9da refactor: add helper to create uint64 bitmask
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2025-10-10 12:54:50 +02:00
Jack Myers
f06bb256c7 refactor: sba type helper
Signed-off-by: Jack Myers <jack.myers@intel.com>
2025-10-10 11:36:36 +02:00
Szymon Morek
f41bb3517a performance: increase heap size to 4MB
Related-To: NEO-16348

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2025-10-09 13:03:53 +02:00