- surfaceStateSize is in pages, but the bindless size needs to be programmed in
surface state units
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
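A minimal sketch of the unit conversion described in the commit above. The page size, the surface state entry size and the helper name are assumptions chosen for illustration; the commit does not state them and the real values are hardware-specific.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed constants, for illustration only (not taken from the commit).
constexpr std::size_t assumedPageSize = 4096;       // bytes per page (assumption)
constexpr std::size_t assumedSurfaceStateSize = 64; // bytes per surface state entry (assumption)

// Hypothetical helper: converts a size given in pages into a count of
// surface state entries, which is the unit the bindless size is programmed in.
constexpr std::uint64_t pagesToSurfaceStateUnits(std::uint64_t sizeInPages) {
    return sizeInPages * assumedPageSize / assumedSurfaceStateSize;
}

static_assert(pagesToSurfaceStateUnits(1) == 64, "one assumed page holds 64 assumed entries");
```

Passing the page count through unconverted would understate the heap size by a factor of 64 under these assumed sizes.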
- create heapAssigner per root device in memory manager to allow per
device config
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- check releaseHelper support when selecting bindless mode; if not
disabled, prefer bindless mode in L0 API
- bindless mode can be forced with DebugVariable: UseBindlessMode
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
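The selection order described in the commit above could look roughly like the sketch below. The struct, the function name and the tri-state encoding of the UseBindlessMode debug variable (-1 not set, 0 disabled, 1 forced) are assumptions, not the runtime's actual interface.

```cpp
#include <cassert>

// Hypothetical stand-in for the capability query answered by the release helper.
struct ReleaseSupport {
    bool bindlessAddressingSupported;
};

// Assumed encoding of the UseBindlessMode debug variable:
// -1 = not set, 0 = force disabled, 1 = force enabled.
constexpr int debugVariableNotSet = -1;

// Hypothetical selector: the debug variable wins when set; otherwise
// bindless mode is preferred whenever the release supports it.
bool selectBindlessMode(const ReleaseSupport &release, int useBindlessModeFlag) {
    if (useBindlessModeFlag != debugVariableNotSet) {
        return useBindlessModeFlag == 1;
    }
    return release.bindlessAddressingSupported;
}
```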
When a class defines a copy/move constructor, the corresponding assignment
operator(s) should be defined or deleted as well.
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
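The rule from the commit above, illustrated on a hypothetical `Buffer` class (not a type from the code base): the move constructor is paired with a move assignment operator, and the unsupported copy operations are deleted as a pair.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

// Hypothetical owning type used only to illustrate the rule.
class Buffer {
  public:
    explicit Buffer(std::size_t count) : size(count), data(new int[count]()) {}
    ~Buffer() { delete[] data; }

    // A move constructor is defined, so move assignment is defined too.
    Buffer(Buffer &&other) noexcept : size(other.size), data(other.data) {
        other.size = 0;
        other.data = nullptr;
    }
    Buffer &operator=(Buffer &&other) noexcept {
        if (this != &other) {
            delete[] data; // release the currently owned storage first
            size = other.size;
            data = other.data;
            other.size = 0;
            other.data = nullptr;
        }
        return *this;
    }

    // Copying is not supported: delete the constructor and the assignment together.
    Buffer(const Buffer &) = delete;
    Buffer &operator=(const Buffer &) = delete;

    std::size_t getSize() const { return size; }

  private:
    std::size_t size;
    int *data;
};

static_assert(!std::is_copy_constructible<Buffer>::value, "copy ctor is deleted");
static_assert(!std::is_copy_assignable<Buffer>::value, "copy assignment is deleted");
static_assert(std::is_move_assignable<Buffer>::value, "move assignment is defined");
```

Leaving one half of a pair implicit is how a type ends up movable by construction but not by assignment, or copyable by accident.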
Return the proper subdevice count in the GfxCoreHelper path. The
previous method ignored the ReturnSubDevicesAsApiDevices flag.
Related-To: LOCI-4859
Signed-off-by: Latif, Raiyan <raiyan.latif@intel.com>
- SPECIAL_SSH is used for debug surface SurfaceState which must be
located at bindless offset zero
- limit size of external front window
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Added a getDefaultDeviceHierarchy call that describes the default device
hierarchy for a gfx core. Refactored L0 and OCL paths to use this
value by default and override it when the user sets the
ZE_FLAT_DEVICE_HIERARCHY environment variable or the
ReturnSubDevicesAsApiDevices debug key.
Updated ReturnSubDevicesAsApiDevices to force COMPOSITE device hierarchy
when set to 0.
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
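A sketch of the override order described in the commit above: start from the per-core default, then apply the environment variable and the debug key. The enum, the function name, the precedence of the debug key over the environment variable, and the meaning of debug key values other than 0 are assumptions; only "0 forces COMPOSITE" comes from the commit.

```cpp
#include <cassert>
#include <cstring>

enum class DeviceHierarchyMode { composite, flat, combined };

// Hypothetical resolver. envValue is the ZE_FLAT_DEVICE_HIERARCHY string or
// nullptr when unset; returnSubDevicesAsApiDevices is the debug key value,
// with -1 assumed to mean "not set" and 1 assumed to mean FLAT.
DeviceHierarchyMode resolveDeviceHierarchy(DeviceHierarchyMode coreDefault,
                                           const char *envValue,
                                           int returnSubDevicesAsApiDevices) {
    DeviceHierarchyMode mode = coreDefault;
    if (envValue != nullptr) {
        if (std::strcmp(envValue, "COMPOSITE") == 0) {
            mode = DeviceHierarchyMode::composite;
        } else if (std::strcmp(envValue, "FLAT") == 0) {
            mode = DeviceHierarchyMode::flat;
        } else if (std::strcmp(envValue, "COMBINED") == 0) {
            mode = DeviceHierarchyMode::combined;
        } // unrecognized strings keep the per-core default
    }
    if (returnSubDevicesAsApiDevices == 0) {
        mode = DeviceHierarchyMode::composite; // stated in the commit
    } else if (returnSubDevicesAsApiDevices == 1) {
        mode = DeviceHierarchyMode::flat; // assumption
    }
    return mode;
}
```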
- program debugSurface's SurfaceState at the beginning of Bindless Surface
State Heap - SPECIAL_SSH
- ensure SPECIAL_SSH is resident
Related-To: NEO-7063
Signed-off-by: Hoppe, Mateusz <mateusz.hoppe@intel.com>
- When calculating the number of threads per workgroup for SIMD 1, return
the local work size (each software thread should be mapped onto a whole hardware
thread).
- Correct the logic for calculating space for per-thread data for SIMD 1.
- Minor: unit tests refactor.
- Corrected naming.
Related-To: NEO-8261
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
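The SIMD 1 rule from the commit above, as a minimal sketch. The function name and signature are hypothetical, and the non-SIMD-1 path uses a plain ceiling division rather than the runtime's actual code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of hardware threads needed for one workgroup.
std::uint32_t getThreadsPerWorkgroup(std::uint32_t simdSize, std::uint32_t localWorkSize) {
    assert(simdSize > 0);
    if (simdSize == 1) {
        // SIMD 1: every work item occupies a whole hardware thread.
        return localWorkSize;
    }
    // Wider SIMD: each hardware thread covers up to simdSize work items.
    return (localWorkSize + simdSize - 1) / simdSize;
}
```

The ceiling division would give the same result for SIMD 1; the branch is spelled out to make the intent explicit and to guard against implementations that derive a shift amount from the SIMD width.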
Add support for different timestamp packet counts per gfx family.
Change all packet counts to 1 except for xe-hpc.
Related-To: NEO-8154
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Until now, a separate page has been allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. The ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause a device OOM error if a single module has
several kernels.
Improve the situation by reusing the parent allocation (owned by the
module instance) for modules whose kernel ISAs can fit together within
a single 64k page. This improves memory utilization at the
single-module level.
Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
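A sketch of the fit check implied by the commit above: sum the kernel ISA sizes and compare against one 64k page, then hand out offsets inside the shared allocation. The per-kernel alignment value and all function names are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t isaPageSize = 64 * 1024;  // 64k page, as named in the commit
constexpr std::size_t assumedIsaAlignment = 64; // hypothetical per-kernel alignment

constexpr std::size_t alignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) / alignment * alignment;
}

// True when all ISAs, each padded to the assumed alignment, fit in one page
// and can therefore share the parent allocation.
bool kernelIsasFitInSinglePage(const std::vector<std::size_t> &isaSizes) {
    std::size_t total = 0;
    for (std::size_t size : isaSizes) {
        total += alignUp(size, assumedIsaAlignment);
    }
    return total <= isaPageSize;
}

// Offset of each kernel's ISA inside the shared parent allocation.
std::vector<std::size_t> computeIsaOffsets(const std::vector<std::size_t> &isaSizes) {
    std::vector<std::size_t> offsets;
    offsets.reserve(isaSizes.size());
    std::size_t offset = 0;
    for (std::size_t size : isaSizes) {
        offsets.push_back(offset);
        offset += alignUp(size, assumedIsaAlignment);
    }
    return offsets;
}
```

With one page per kernel, three kernels of about 1 KiB each would occupy three 64k pages; packed this way they share one.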