mirror of
https://github.com/intel/compute-runtime.git
synced 2026-01-03 14:55:24 +08:00
feature: Optimize intra-module kernel ISA allocations
So far, a separate page has been allocated for each kernel's ISA within `KernelImmutableData::initialize()`. The ISA blocks are often much smaller than a 64k page, which leads to poor memory utilization and was even observed to cause a device OOM error when a single module has several kernels.

Improve the situation by reusing the parent allocation (owned by the module instance) for modules whose kernel ISAs fit together within a single 64k page. This improves memory utilization at the single-module level.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
committed by Compute-Runtime-Automation
parent 1b7e178b25
commit c348831470
@@ -77,7 +77,7 @@ void EncodeDispatchKernel<Family>::encode(CommandContainer &container, EncodeDis
     {
         auto alloc = args.dispatchInterface->getIsaAllocation();
         UNRECOVERABLE_IF(nullptr == alloc);
-        auto offset = alloc->getGpuAddressToPatch();
+        auto offset = alloc->getGpuAddressToPatch() + args.dispatchInterface->getIsaOffsetInParentAllocation();
         idd.setKernelStartPointer(offset);
         idd.setKernelStartPointerHigh(0u);
     }
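
The hunk above adds a per-kernel offset to the address of the shared allocation. The following is a minimal, standalone sketch of that idea and is not code from compute-runtime: `pageSize` follows the 64k figure in the commit message, while `isaAlignment`, `packIsaOffsets` and `kernelStartAddress` are hypothetical names and values chosen only for illustration. It assumes kernel ISAs are placed back to back at aligned offsets inside one parent page, and that modules whose ISAs do not fit keep separate allocations.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical constants for illustration; not taken from compute-runtime.
constexpr std::size_t pageSize = 64 * 1024; // 64k page, as in the commit message
constexpr std::size_t isaAlignment = 64;    // assumed per-kernel ISA alignment

// Round value up to the next multiple of alignment.
constexpr std::size_t alignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// Compute each kernel's offset inside one shared parent page.
// Returns std::nullopt when the ISAs do not fit together in a single page.
std::optional<std::vector<std::size_t>> packIsaOffsets(const std::vector<std::size_t> &isaSizes) {
    std::vector<std::size_t> offsets;
    std::size_t cursor = 0;
    for (std::size_t size : isaSizes) {
        cursor = alignUp(cursor, isaAlignment);
        if (size > pageSize - cursor) {
            return std::nullopt; // would overflow the shared page
        }
        offsets.push_back(cursor);
        cursor += size;
    }
    return offsets;
}

// Mirrors the changed line in the diff: parent GPU address plus the kernel's offset.
constexpr std::uint64_t kernelStartAddress(std::uint64_t parentGpuAddress, std::size_t isaOffset) {
    return parentGpuAddress + static_cast<std::uint64_t>(isaOffset);
}
```

With ISA sizes of 100, 200 and 4096 bytes, this sketch places the kernels at offsets 0, 128 and 384, so all three share one page instead of occupying three. A kernel whose offset is 0 gets the same start address as before the change.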