value depends on CCS count:
- single CCS mode (default) - no limitations
- two CCS mode - 25% available
- four CCS mode - 12.5% available
Related-To: NEO-8377
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
- For calculating number of threads per workgroup, treat simd 1 as it
was simd 32
- Correct logic of calculating space for per thread data for simd 1
- Minor: unit tests refactor
- Corrected naming
Related-To: NEO-8261
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
For all of the devices, preferredPlatformName is initialized with
nullptr by default and platform name will be initialized to driver's default
platform name, at the moment this is "Intel(R) OpenCL Graphics".
When Platform is initialized and preferredPlatformName is not nullptr then
Platform name will be set to the value stored in preferredPlatformName.
Add ENABLE_LEGACY_PLATFORM_NAME AIL enum related to added functionality.
Move PlatformInfo to NEO namespace.
Related-To: HSD-22018809561
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
Products for which zebin has been set as default format in OCLOC:
- ICELAKE_LP
- TIGERLAKE_LP
- ROCKETLAKE
- ALDERLAKE_S
- ALDERLAKE_P
- ALDERLAKE_N
The default format does not override `--format` parameter.
Related-To: NEO-8334
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
- introduce 2 reuse pools to bindlessHeapHelper
- one pool stores slots for reuse, second pool stores released slots
- stateCacheDirty flags keep track of state cache - when pools are
switched - flags are set indicating flushing caches is needed after
old slots have been reused for new allocations
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- program surface states for redescribed images correctly. Image copy
to/from memory are using redescribed surface states,
- refactor state base address programming - program address and size
together, set max size at the beginning due to lack of Enable flag
- set GpuBase in WddmAllocation when external heap is used
- return max ssh required size from kernelInfo or based on stateful args
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- allocate SSH in cmdContainer when scratch allocation used with
private heaps
- scratch SurfaceStates are addressed relative to
SurfaceStateBaseAddress and have to be placed on SSH
- remove not used SCRATCH_SSH heap type from bindelssHeapHelper
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- store surface state info for bindless addressing in graphics
allocation
- remove map in BindlessHeapsHelper - bindlessInfo is constant for
the lifetime of an allocation
- program bindless offsets and surface states for images when used in
bindless kernel
- handle ouf of memory on surface state heap - return error
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- Preserve releases on CMake level.
- Instead of generating builtins per platform, generate them per-release
(+ correct naming accordingly).
- Stop using revisions in builtin compilation logic path, as they are
already embedded in release (device ip).
- Remove platform names & revisions from names for generated files
(related to builtins).
- Remove unnecessary code, refactor ULT logic.
Related-To: NEO-7783
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
The previous name "pageSize2Mb" defined in
shared/source/helpers/constant.h is inconsistent to other variable,
i.e. pageSize64k.
Furthermore, it's a bit misleading because the page size is defined in
Megabytes (MB), not in Megabits (Mb).
Related-to: NEO-7695
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
- also use helper when checking that is simd1 to have same flow
Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
- program global bindless ssba when external allocator used (
UseExternalAllocatorForSshAndDsh)
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups
Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups
Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
With direct submission disabled this resulted in waitTimeout long enough
that kmdWait fallback was rarely used.
This caused more CPU spin time.
Related-To: GSD-3612
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>