- if binding table entries are used in bindless kernel, program Binding
table
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- check releaseHelper support when selecting bindless mode, if not
disabled, prefer bindless mode in L0 API
- bindless mode can be forced with DebugVariable: UseBindlessMode
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Use debug flag PrintKernelDispatchParameters to print params used in
thread group dispatch size heuristic when encoding kernel dispatch.
Related-To: NEO-6989
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.
Improve the situation by reusing the parent allocation (owned by the
module instance) for modules, which kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.
Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.
Improve the situation by reusing the parent allocation (owned by the
module instance) for modules, which kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.
Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
denorm support is controlled by IGC, we should just set zero by default
Related-To: NEO-8059
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
- allow bindless kernels to execute
- bindless addressing kernels are using private heaps mode
- do not differentiate bindful and bindless surface state base addresses
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
denorm support is controlled by IGC, we should just set zero by default
Related-To: NEO-8059
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
Related-To: NEO-6075
Ngen binaries contain stateful information, however they are
not used in isa on Pvc. Therefore, we can just ignore them.
- getGlobalBindlessHeapConfiguration() should be used to choose global
alloctor for SSH
- remove not needed and incorrect unit tests
- remove not needed branches
- bindless mode controls bindless compilation only
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- when debugging is enabled, assert() in gpu kernel will trigger SW
exception
Related-To: NEO-5753
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Related-To: LOCI-4332
- Signal non-timestamp Walkers with in-order CL value
- Event host synchronization based on CL signal value
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
- share same code between csr and cmd container to get default heap size
- share handling of debug flag to change heap size
- share platform level surface heap size between csr and command list
- refactor heap size files
- put heap size constant and function into namespace
- command list surface heap size increased to 2MB for xehp+ to match csr
- command list increased surface heap size only for sba tracking
- sba tracking heap consumption increased due to different reset policy
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- This change gets level one cache policy from cached values instead
of calling virtual methods
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- add estimation parameter for interface descriptor data count
- add to the heap estimation alignment parameter for dynamic and surface heaps
- extend encode interface and implementations to allow child heaps
Related-To: NEO-5055
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
This change is intended to be used in immediate command lists that are
using flush task functionality.
With this change all immediate command list using the same csr will consume
shared allocations for dsh and ssh heaps. This will decrease number of SBA
commands dispatched when multiple command lists coexists and dispatch kernels.
With this change new SBA command should be dispatched only when current heap
allocation is exhausted.
Functionality is currently disabled and available under debug key.
Functionality will be enabled by default for all immediate command lists
with flush task functionality enabled.
Related-To: NEO-7142
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
Allow pausing execution before and after enqueuing kernel
using the PauseOnEnqueue and PauseOnGpuMode debug flags.
Related-To: NEO-6570
Signed-off-by: Naklicki, Mateusz <mateusz.naklicki@intel.com>