2025-10-31 10:59:39 +00:00
|
|
|
/*
|
|
|
|
|
* Copyright (C) 2025 Intel Corporation
|
|
|
|
|
*
|
|
|
|
|
* SPDX-License-Identifier: MIT
|
|
|
|
|
*
|
|
|
|
|
*/
|
|
|
|
|
|
|
|
|
|
#pragma once
|
|
|
|
|
|
feature: redesign host function workers
Each host function gets its unique ID within a CSR,
uses 1 mi store to write ID - to signal that host function is ready,
and 1 mi semaphore wait will wait for the ID to be cleared,
Use 0th bit from ID as pending/completed flag,
host function ID is incremented by 2, and starts with 1.
So each ID will always have 0bit set.
This is a must have since semaphore wait can wait for 4 bytes only.
Adjust command buffer programming and patching logic to IDs.
Add hostFunction callable class - using invoke method,
which stores required information about callback.
Add host function streamer - stores all host function data
for a given CSR.
All user provided host functions are stored in unordered map,
where key is host function ID.
Add host function scheduler, and a thread pool - under debug flag
Single threaded scheduler loops over all registered host function streamers,
dispatch ready to execute host functions to thread pool.
Allow for out of order host functions execution for OOQ - under debug flag,
each host function has bool isInOrder flag which indicates if it can be
executed Out Of Order - in this mode, ID tag will be cleared immediately,
so semaphore wait will unblock before the host function execution.
Remove Host Function worker CV and atomics based implementation.
Rename classes
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-24 20:35:14 +00:00
|
|
|
#include "shared/source/command_stream/host_function_interface.h"
|
2025-10-31 10:59:39 +00:00
|
|
|
#include "shared/source/helpers/non_copyable_or_moveable.h"
|
|
|
|
|
|
|
|
|
|
#include <functional>
|
|
|
|
|
#include <memory>
|
|
|
|
|
#include <mutex>
|
|
|
|
|
#include <stop_token>
|
|
|
|
|
#include <thread>
|
|
|
|
|
|
|
|
|
|
namespace NEO {
|
|
|
|
|
|
|
|
|
|
class GraphicsAllocation;
|
feature: redesign host function workers
Each host function gets its unique ID within a CSR,
uses 1 mi store to write ID - to signal that host function is ready,
and 1 mi semaphore wait will wait for the ID to be cleared,
Use 0th bit from ID as pending/completed flag,
host function ID is incremented by 2, and starts with 1.
So each ID will always have 0bit set.
This is a must have since semaphore wait can wait for 4 bytes only.
Adjust command buffer programming and patching logic to IDs.
Add hostFunction callable class - using invoke method,
which stores required information about callback.
Add host function streamer - stores all host function data
for a given CSR.
All user provided host functions are stored in unordered map,
where key is host function ID.
Add host function scheduler, and a thread pool - under debug flag
Single threaded scheduler loops over all registered host function streamers,
dispatch ready to execute host functions to thread pool.
Allow for out of order host functions execution for OOQ - under debug flag,
each host function has bool isInOrder flag which indicates if it can be
executed Out Of Order - in this mode, ID tag will be cleared immediately,
so semaphore wait will unblock before the host function execution.
Remove Host Function worker CV and atomics based implementation.
Rename classes
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-24 20:35:14 +00:00
|
|
|
class HostFunctionStreamer;
|
|
|
|
|
struct HostFunction;
|
2025-10-31 10:59:39 +00:00
|
|
|
|
feature: redesign host function workers
Each host function gets its unique ID within a CSR,
uses 1 mi store to write ID - to signal that host function is ready,
and 1 mi semaphore wait will wait for the ID to be cleared,
Use 0th bit from ID as pending/completed flag,
host function ID is incremented by 2, and starts with 1.
So each ID will always have 0bit set.
This is a must have since semaphore wait can wait for 4 bytes only.
Adjust command buffer programming and patching logic to IDs.
Add hostFunction callable class - using invoke method,
which stores required information about callback.
Add host function streamer - stores all host function data
for a given CSR.
All user provided host functions are stored in unordered map,
where key is host function ID.
Add host function scheduler, and a thread pool - under debug flag
Single threaded scheduler loops over all registered host function streamers,
dispatch ready to execute host functions to thread pool.
Allow for out of order host functions execution for OOQ - under debug flag,
each host function has bool isInOrder flag which indicates if it can be
executed Out Of Order - in this mode, ID tag will be cleared immediately,
so semaphore wait will unblock before the host function execution.
Remove Host Function worker CV and atomics based implementation.
Rename classes
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-24 20:35:14 +00:00
|
|
|
class HostFunctionSingleWorker : public HostFunctionWorker {
|
2025-10-31 10:59:39 +00:00
|
|
|
public:
|
feature: redesign host function workers
Each host function gets its unique ID within a CSR,
uses 1 mi store to write ID - to signal that host function is ready,
and 1 mi semaphore wait will wait for the ID to be cleared,
Use 0th bit from ID as pending/completed flag,
host function ID is incremented by 2, and starts with 1.
So each ID will always have 0bit set.
This is a must have since semaphore wait can wait for 4 bytes only.
Adjust command buffer programming and patching logic to IDs.
Add hostFunction callable class - using invoke method,
which stores required information about callback.
Add host function streamer - stores all host function data
for a given CSR.
All user provided host functions are stored in unordered map,
where key is host function ID.
Add host function scheduler, and a thread pool - under debug flag
Single threaded scheduler loops over all registered host function streamers,
dispatch ready to execute host functions to thread pool.
Allow for out of order host functions execution for OOQ - under debug flag,
each host function has bool isInOrder flag which indicates if it can be
executed Out Of Order - in this mode, ID tag will be cleared immediately,
so semaphore wait will unblock before the host function execution.
Remove Host Function worker CV and atomics based implementation.
Rename classes
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-24 20:35:14 +00:00
|
|
|
explicit HostFunctionSingleWorker(bool skipHostFunctionExecution);
|
|
|
|
|
~HostFunctionSingleWorker() override = 0;
|
2025-10-31 10:59:39 +00:00
|
|
|
|
feature: redesign host function workers
Each host function gets its unique ID within a CSR,
uses 1 mi store to write ID - to signal that host function is ready,
and 1 mi semaphore wait will wait for the ID to be cleared,
Use 0th bit from ID as pending/completed flag,
host function ID is incremented by 2, and starts with 1.
So each ID will always have 0bit set.
This is a must have since semaphore wait can wait for 4 bytes only.
Adjust command buffer programming and patching logic to IDs.
Add hostFunction callable class - using invoke method,
which stores required information about callback.
Add host function streamer - stores all host function data
for a given CSR.
All user provided host functions are stored in unordered map,
where key is host function ID.
Add host function scheduler, and a thread pool - under debug flag
Single threaded scheduler loops over all registered host function streamers,
dispatch ready to execute host functions to thread pool.
Allow for out of order host functions execution for OOQ - under debug flag,
each host function has bool isInOrder flag which indicates if it can be
executed Out Of Order - in this mode, ID tag will be cleared immediately,
so semaphore wait will unblock before the host function execution.
Remove Host Function worker CV and atomics based implementation.
Rename classes
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-24 20:35:14 +00:00
|
|
|
void start(HostFunctionStreamer *streamer) override = 0;
|
|
|
|
|
void finish() override = 0;
|
|
|
|
|
void submit(uint32_t nHostFunctions) noexcept override = 0;
|
2025-10-31 10:59:39 +00:00
|
|
|
|
|
|
|
|
protected:
|
feature: redesign host function workers
Each host function gets its unique ID within a CSR,
uses 1 mi store to write ID - to signal that host function is ready,
and 1 mi semaphore wait will wait for the ID to be cleared,
Use 0th bit from ID as pending/completed flag,
host function ID is incremented by 2, and starts with 1.
So each ID will always have 0bit set.
This is a must have since semaphore wait can wait for 4 bytes only.
Adjust command buffer programming and patching logic to IDs.
Add hostFunction callable class - using invoke method,
which stores required information about callback.
Add host function streamer - stores all host function data
for a given CSR.
All user provided host functions are stored in unordered map,
where key is host function ID.
Add host function scheduler, and a thread pool - under debug flag
Single threaded scheduler loops over all registered host function streamers,
dispatch ready to execute host functions to thread pool.
Allow for out of order host functions execution for OOQ - under debug flag,
each host function has bool isInOrder flag which indicates if it can be
executed Out Of Order - in this mode, ID tag will be cleared immediately,
so semaphore wait will unblock before the host function execution.
Remove Host Function worker CV and atomics based implementation.
Rename classes
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-24 20:35:14 +00:00
|
|
|
MOCKABLE_VIRTUAL void processNextHostFunction(std::stop_token st) noexcept;
|
|
|
|
|
bool waitUntilHostFunctionIsReady(std::stop_token st) noexcept;
|
|
|
|
|
HostFunctionStreamer *streamer = nullptr;
|
2025-10-31 10:59:39 +00:00
|
|
|
};
|
|
|
|
|
|
feature: redesign host function workers
Each host function gets its unique ID within a CSR,
uses 1 mi store to write ID - to signal that host function is ready,
and 1 mi semaphore wait will wait for the ID to be cleared,
Use 0th bit from ID as pending/completed flag,
host function ID is incremented by 2, and starts with 1.
So each ID will always have 0bit set.
This is a must have since semaphore wait can wait for 4 bytes only.
Adjust command buffer programming and patching logic to IDs.
Add hostFunction callable class - using invoke method,
which stores required information about callback.
Add host function streamer - stores all host function data
for a given CSR.
All user provided host functions are stored in unordered map,
where key is host function ID.
Add host function scheduler, and a thread pool - under debug flag
Single threaded scheduler loops over all registered host function streamers,
dispatch ready to execute host functions to thread pool.
Allow for out of order host functions execution for OOQ - under debug flag,
each host function has bool isInOrder flag which indicates if it can be
executed Out Of Order - in this mode, ID tag will be cleared immediately,
so semaphore wait will unblock before the host function execution.
Remove Host Function worker CV and atomics based implementation.
Rename classes
Related-To: NEO-14577
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2025-11-24 20:35:14 +00:00
|
|
|
static_assert(NonCopyableAndNonMovable<HostFunctionSingleWorker>);
|
2025-10-31 10:59:39 +00:00
|
|
|
|
|
|
|
|
} // namespace NEO
|