feature: extend external CB alloc capabilities

Related-To: NEO-13971

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
Bartosz Dunajski 2025-02-05 13:09:40 +00:00 committed by Compute-Runtime-Automation
parent 2451727d6a
commit ff67943d06
2 changed files with 26 additions and 1 deletions


@ -11,6 +11,7 @@ SPDX-License-Identifier: MIT
* [Overview](#Overview)
* [Creation](#Creation)
* [External storage](#External-storage)
* [Aggregated event](#Aggregated-event)
* [Obtaining counter memory and value](#Obtaining-counter-memory-and-value)
* [IPC sharing](#IPC-sharing)
* [Regular command list](#Regular-command-list)
@ -98,6 +99,19 @@ User may optionally specify externally managed counter allocation and value. Thi
- User is responsible for updating both memory locations to >= `completionValue` to signal Event completion
- Signaling such event, replaces the state (as described previously)
# Aggregated event
An aggregated event is a special use case for CB Events. It can be signaled from multiple append calls, but waiting on it requires only one memory compare operation.
It is created by passing `zex_counter_based_event_external_storage_properties_t` as an extension (through `pNext`) of `zex_counter_based_event_desc_t`.
**Requirements:**
- This extension cannot be combined with the [External storage](#External-storage) extension
- User must ensure residency of the device allocation (`deviceAddress`). It must be accessible by the GPU
- Driver will use `deviceAddress` for host synchronization, treating it as a USM allocation. It must be accessible by the CPU
- Signaling such an event does not replace its state (as described previously). It can be passed to multiple append calls, and each append atomically increments the storage by `incrementValue` on the GPU
- Using an aggregated event as a dependency requires only one memory compare operation against the final value: `*deviceAddress` >= `completionValue`
- Device storage is under the user's control. It may be reset manually if needed
- Profiling is not possible if producers originate on different GPUs (different timestamp domains)
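
The counter semantics listed above can be illustrated with a small host-only sketch. It is not driver code: the structure is a local copy of the one added in this commit, `ze_structure_type_t` is replaced by a stand-in typedef, and the helpers `make_aggregated_props`, `simulate_signal` and `is_completed` are hypothetical names introduced only for illustration. The sketch also assumes the counter starts at 0 and that every producer signals exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the real ze_structure_type_t from the Level Zero headers. */
typedef uint32_t ze_structure_type_t;

/* Local copy of the structure introduced by this commit. */
typedef struct _zex_counter_based_event_external_storage_properties_t {
    ze_structure_type_t stype;
    const void *pNext;
    uint64_t *deviceAddress;
    uint64_t incrementValue;
    uint64_t completionValue;
} zex_counter_based_event_external_storage_properties_t;

/* Hypothetical helper: fill the properties so that the event completes
 * once `producerCount` appends have signaled it.
 * Assumption: the storage was reset to 0 by the user beforehand. */
static zex_counter_based_event_external_storage_properties_t
make_aggregated_props(uint64_t *storage, uint64_t incrementValue, uint64_t producerCount) {
    assert(storage != NULL && incrementValue != 0);
    zex_counter_based_event_external_storage_properties_t props = {0};
    props.stype = 0; /* placeholder: the real stype value is defined by the driver headers */
    props.pNext = NULL;
    props.deviceAddress = storage;
    props.incrementValue = incrementValue;
    props.completionValue = incrementValue * producerCount;
    return props;
}

/* Host-side model of one signal. On a real device this is an atomic add on the GPU. */
static void simulate_signal(const zex_counter_based_event_external_storage_properties_t *props) {
    *props->deviceAddress += props->incrementValue;
}

/* The single compare a waiter needs: completed once the counter reaches completionValue. */
static bool is_completed(const zex_counter_based_event_external_storage_properties_t *props) {
    return *props->deviceAddress >= props->completionValue;
}
```

With three producers and `incrementValue` of 2, the model reports completion only after the third signal, when the counter reaches 6. In real use the filled structure would be chained through `pNext` of `zex_counter_based_event_desc_t`, and the increments would come from the GPU rather than from `simulate_signal`.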
# Obtaining counter memory and value
User may obtain counter memory location and value, for example to wait for completion outside the L0 Driver.
If Event state is replaced by new append call or `zeCommandQueueExecuteCommandLists` that signals such Event, below API must be called again to obtain new data.

View File

@ -1,5 +1,5 @@
/*
* Copyright (C) 2022-2024 Intel Corporation
* Copyright (C) 2022-2025 Intel Corporation
*
* SPDX-License-Identifier: MIT
*
@ -209,6 +209,17 @@ typedef struct _zex_counter_based_event_external_sync_alloc_properties_t {
uint64_t completionValue; ///< [in] completion value for external synchronization allocation
} zex_counter_based_event_external_sync_alloc_properties_t;
///////////////////////////////////////////////////////////////////////////////
/// @brief Initial Counter Based Event synchronization parameters. This structure may be
/// passed as pNext member of ::zex_counter_based_event_desc_t.
typedef struct _zex_counter_based_event_external_storage_properties_t {
ze_structure_type_t stype; ///< [in] type of this structure
const void *pNext; ///< [in][optional] must be null or a pointer to an extension-specific
uint64_t *deviceAddress; ///< [in] device address that would be updated with atomic_add upon signaling of this event, must be device USM memory
uint64_t incrementValue; ///< [in] value which would be atomically added upon each completion
uint64_t completionValue; ///< [in] final completion value, when value under deviceAddress is equal or greater than this value then event is considered as completed
} zex_counter_based_event_external_storage_properties_t;
#if defined(__cplusplus)
} // extern "C"
#endif