Align local memory allocations of tag types to 2MB when
2MB alignment is enabled via the product helper
(is2MBLocalMemAlignmentEnabled flag).
Refactored the allocateGraphicsMemoryInDevicePool function to improve
readability and maintainability. Simplified the logic for
determining base size and final alignment by reducing redundant code.
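The size rounding described above can be sketched as follows. This is a minimal illustration, not the driver code: size2MB, alignUpPow2 and finalTagAllocationSize are hypothetical names, the boolean parameter stands in for the is2MBLocalMemAlignmentEnabled query, and only the size (not the address) alignment is shown.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical constant and helper names; not taken from the driver sources.
constexpr std::size_t size2MB = 2u * 1024u * 1024u;

// Rounds `value` up to the nearest multiple of `alignment`.
// `alignment` must be a non-zero power of two.
constexpr std::size_t alignUpPow2(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Picks the final size of a tag allocation: rounded up to 2MB when the
// (assumed) flag is set, otherwise rounded up to `defaultAlignment`.
inline std::size_t finalTagAllocationSize(std::size_t baseSize,
                                          std::size_t defaultAlignment,
                                          bool is2MBAlignmentEnabled) {
    const std::size_t alignment = is2MBAlignmentEnabled ? size2MB : defaultAlignment;
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return alignUpPow2(baseSize, alignment);
}
```

Computing the alignment once and rounding in a single place is one way to avoid duplicating the base-size and alignment logic across branches.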
Related-To: NEO-12287
Signed-off-by: Fabian Zwoliński <fabian.zwolinski@intel.com>
For an indirect kernel launch, some payload arguments are patched just
before the walker command. This change disables prefetch, performs a
batch buffer start to the next bytes, and then re-enables prefetch. All
these operations are performed between MI_STORE_REGISTER_MEM and
COMPUTE_WALKER.
Related-To: NEO-14584
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Created the EncodePostSync template struct to organize various post-sync
variables/functions from EncodeDispatchKernel.
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
If some operations require a ULLS light stop, execute them under a
mutex together with the ULLS stop to ensure that no other thread starts
ULLS.
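A minimal sketch of the idea, assuming a single mutex guards both starting and stopping. UllsController and its members are hypothetical stand-ins and do not mirror the driver's direct submission classes.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-in for a direct-submission (ULLS) controller.
class UllsController {
  public:
    // Starting ULLS takes the same mutex as stopAndRun().
    void start() {
        std::lock_guard<std::mutex> lock(mtx);
        running = true;
    }

    // Stops ULLS and runs `operation` while still holding the mutex, so a
    // concurrent start() has to wait until the operation has finished.
    // The operation receives the current running state (always false here).
    template <typename Operation>
    void stopAndRun(Operation &&operation) {
        std::lock_guard<std::mutex> lock(mtx);
        running = false;
        operation(running);
    }

    bool isRunning() {
        std::lock_guard<std::mutex> lock(mtx);
        return running;
    }

  private:
    std::mutex mtx;
    bool running = false;
};
```

The operation must not call back into the controller, because the mutex in this sketch is not recursive.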
Related-To: NEO-14406, NEO-13922
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
When trimming old allocations in USM reuse, start from the largest
allocations.
This reduces memory usage more quickly once the max hold time is hit.
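The ordering can be sketched as below. CachedAllocation, trimOldAllocations and the per-pass maxToFree budget are assumptions made for illustration; the budget is there only to show why visiting the largest entries first frees more memory per pass.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using Clock = std::chrono::steady_clock;

// Hypothetical record for an allocation held in a reuse cache.
struct CachedAllocation {
    std::size_t size;
    Clock::time_point saveTime;
};

// Frees up to `maxToFree` cached allocations that were held longer than
// `maxHoldTime`, visiting the largest allocations first.
// Returns the number of bytes released.
inline std::size_t trimOldAllocations(std::vector<CachedAllocation> &cache,
                                      Clock::time_point now,
                                      Clock::duration maxHoldTime,
                                      std::size_t maxToFree) {
    assert(maxToFree > 0);
    // Sort in descending size order so expired large entries go first.
    std::sort(cache.begin(), cache.end(),
              [](const CachedAllocation &a, const CachedAllocation &b) { return a.size > b.size; });

    std::size_t releasedBytes = 0;
    std::size_t freed = 0;
    for (auto it = cache.begin(); it != cache.end() && freed < maxToFree;) {
        if (now - it->saveTime > maxHoldTime) {
            releasedBytes += it->size;
            it = cache.erase(it);
            ++freed;
        } else {
            ++it;
        }
    }
    return releasedBytes;
}
```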
Related-To: NEO-6893, NEO-14429
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
The real allocation size should be used to properly apply limits and allow
more USM reuse hits.
Related-To: NEO-6893
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Related-To: NEO-14538
Copying a 2D region is valid for a 3D image.
The current mip map checks do not consider that.
This change correctly checks for a mip-mapped 3D image.
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
Save svmData when putting an allocation into reuse, instead of searching for it each time.
Change UNRECOVERABLE_IF to DEBUG_BREAK_IF.
Related-To: NEO-6893
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Related-To: NEO-14538
If the user passes a slice pitch that is larger than the region
to copy, do not overwrite memory that lies beyond the region but
within that slice pitch.
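A minimal host-side sketch of the intended behaviour, assuming a plain byte copy. copyRegion and its parameters are hypothetical and are not the driver's copy path; the point is that only rowBytes per row are written, so padding inside the slice pitch stays untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies `slices` x `rows` rows of `rowBytes` bytes each.
// The pitches give the byte distance between consecutive rows and slices.
inline void copyRegion(unsigned char *dst, const unsigned char *src,
                       std::size_t rowBytes, std::size_t rows, std::size_t slices,
                       std::size_t dstRowPitch, std::size_t dstSlicePitch,
                       std::size_t srcRowPitch, std::size_t srcSlicePitch) {
    assert(dstRowPitch >= rowBytes && srcRowPitch >= rowBytes);
    assert(dstSlicePitch >= rows * dstRowPitch);
    assert(srcSlicePitch >= rows * srcRowPitch);
    for (std::size_t z = 0; z < slices; ++z) {
        for (std::size_t y = 0; y < rows; ++y) {
            // Write exactly rowBytes; bytes between the end of the region
            // and the next row or slice are left as they were.
            std::memcpy(dst + z * dstSlicePitch + y * dstRowPitch,
                        src + z * srcSlicePitch + y * srcRowPitch,
                        rowBytes);
        }
    }
}
```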
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
Pass hwInfo to isHeaplessModeEnabled and isForceBindlessRequired functions.
Related-To: NEO-14526
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
The patch applies to Level Zero.
Only allocations < 2MB will be fetched from the pool.
Allocations are shared and reused within a given device.
Additionally, I added a new debug flag to control the allocator:
EnableTimestampPoolAllocator
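The size threshold can be sketched as below. TimestampPool, poolCapacity and the enabled constructor argument (standing in for the debug flag) are hypothetical; releasing and reusing entries is omitted, and the real allocator may be organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

// Assumed pool capacity for this sketch.
constexpr std::size_t poolCapacity = 2u * 1024u * 1024u;

// Hypothetical per-device pool that hands out offsets from one chunk.
class TimestampPool {
  public:
    explicit TimestampPool(bool enabled) : enabled(enabled) {}

    // Returns an offset into the pool, or std::nullopt when the caller
    // should fall back to a regular allocation.
    std::optional<std::size_t> allocate(std::size_t size) {
        // Only requests smaller than 2MB are served from the pool.
        if (!enabled || size == 0 || size >= poolCapacity) {
            return std::nullopt;
        }
        if (size > poolCapacity - used) {
            return std::nullopt; // pool exhausted
        }
        const std::size_t offset = used;
        used += size;
        assert(used <= poolCapacity);
        return offset;
    }

  private:
    bool enabled;
    std::size_t used = 0;
};
```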
Related-To: NEO-12287
Signed-off-by: Fabian Zwoliński <fabian.zwolinski@intel.com>
Related-To: NEO-14056
No need to explicitly wait on Windows KMD during make resident, as it has
a while loop that waits anyway. The KMD wait affects the API
overhead of zeCommandQueueExecuteCommandLists on some platforms (MTL and ARL).
Signed-off-by: Chandio, Bibrak Qamar <bibrak.qamar.chandio@intel.com>
The lock on the CSR must be taken before the lock on the residency controller
to prevent incorrect lock order. The CSR may be locked in waitOnCpu called from
trimToBudget, which may lead to deadlocks.
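A minimal sketch of the ordering, assuming the CSR lock is recursive so that it can be re-acquired on the same thread. The struct names and counters are hypothetical stand-ins, not the driver's classes.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for the two objects that own the locks.
struct CommandStreamReceiver {
    std::recursive_mutex mtx; // assumed recursive for this sketch
    int waits = 0;
};

struct ResidencyController {
    std::mutex mtx;
    int trims = 0;
};

// Takes the CSR lock on its own, as a wait helper might.
inline void waitOnCpu(CommandStreamReceiver &csr) {
    std::lock_guard<std::recursive_mutex> lock(csr.mtx);
    ++csr.waits;
}

// Always locks the CSR first and the residency controller second, so every
// path that needs both locks acquires them in the same order.
inline void trimToBudget(CommandStreamReceiver &csr, ResidencyController &controller) {
    std::unique_lock<std::recursive_mutex> csrLock(csr.mtx);
    std::unique_lock<std::mutex> residencyLock(controller.mtx);
    waitOnCpu(csr); // re-acquires the CSR lock already held by this thread
    ++controller.trims;
}
```

If the residency lock were taken first, another thread that holds the CSR lock and waits for the residency lock could block this thread inside waitOnCpu.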
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
Instead of printf, use a macro that flushes after printf.
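Such a macro could look like the sketch below. PRINTF_AND_FLUSH is a hypothetical name and not necessarily the macro used in the sources.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical macro: prints like printf, then flushes stdout so the output
// is visible immediately even if the process aborts shortly afterwards.
// The expression evaluates to the result of fflush (0 on success).
#define PRINTF_AND_FLUSH(...) (std::printf(__VA_ARGS__), std::fflush(stdout))
```

A call such as PRINTF_AND_FLUSH("value: %d\n", 42); is used exactly like printf.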
Related-To: HSD-14024170600
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>