Expose copy engines in parent device with implicit scaling

When using implicit scaling, expose the copy engines from
sub-device 0 in the root device. This facilitates the
programming models of the layers above.

Related-To: NEO-6815

Signed-off-by: Jaime Arteaga <jaime.a.arteaga.molina@intel.com>
Jaime Arteaga
2022-03-24 07:44:40 +00:00
committed by Compute-Runtime-Automation
parent 7a324051ef
commit 3c3dab8fe0
5 changed files with 461 additions and 13 deletions

@@ -36,7 +36,12 @@ When doing allocations in implicit scaling mode, driver *colors* an allocation a
 When scheduling a kernel for execution, driver distributes the kernel workgroups among the available tiles. Default mechanism is called *Static Partitioning*, where the workgroups are evenly distributed among tiles. For instance, in a 2-tile system, half of the workgroups go to tile 0, and the other half to tile 1.
-The number of CCSs, or compute engines, currently available with implicit scaing on the root device is one. This is because with implicit scaling the driver automatically uses all the EUs available in the device, so no other CCSs are exposed. Even though only one CCS is exposed, multiple kernels submitted to the root device using implicit scaling may execute concurrently on PVC, depending on EU availability. On XeHP_SDV, they may be serialized. See [Limitations](#Limitations) section below.
+The number of CCSs, or compute engines, currently available with implicit scaling on the root device is one. This is because with implicit scaling the driver automatically uses all the EUs available in the device, so no other CCSs are exposed. Even though only one CCS is exposed, multiple kernels submitted to the root device using implicit scaling may execute concurrently on PVC, depending on EU availability. On XeHP_SDV, they may be serialized. See [Limitations](#Limitations) section below.
+No implicit scaling support is available for BCSs. Considering that, two models are followed in terms of discovery of copy engines:
+* In Level Zero, the copy engines from sub-device 0 are exposed also in the root device. This to align the engine model on both the implicit and the non-implicit-scaling scenarios.
+* In OpenCL, copy engines are not exposed in the root device.
 Since implicit scaling is only done for EUs, which are associated only with kernels submitted to CCS, BCSs are currently not being exposed and access to them are done through sub-device handles.
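
The engine discovery rules described in the documentation change above can be sketched as a small standalone model. This is only an illustration under stated assumptions, not the driver's implementation: the types `SubDevice`, `EngineList`, `Api` and the function `rootDeviceEngines` are hypothetical names invented here, and the per-tile engine counts are made up. In a real Level Zero application, engines are usually discovered by querying command queue groups with `zeDeviceGetCommandQueueGroupProperties`.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model types for illustration; not the driver's actual classes.
enum class Api { LevelZero, OpenCL };

struct SubDevice {
    uint32_t computeEngines; // CCSs on this tile
    uint32_t copyEngines;    // BCSs on this tile
};

struct EngineList {
    uint32_t computeEngines = 0;
    uint32_t copyEngines = 0;
};

// Engines visible on the root device when implicit scaling is enabled,
// following the rules stated in the documentation text above.
EngineList rootDeviceEngines(const std::vector<SubDevice> &subDevices, Api api) {
    EngineList engines;
    if (subDevices.empty()) {
        return engines; // nothing to expose
    }
    // Only one CCS is exposed: the driver spreads work over all tiles itself.
    engines.computeEngines = 1;
    // Level Zero: mirror the copy engines of sub-device 0 in the root device.
    // OpenCL: no copy engines are exposed in the root device.
    if (api == Api::LevelZero) {
        engines.copyEngines = subDevices[0].copyEngines;
    }
    return engines;
}
```

With two tiles that each have several CCSs, the model reports a single compute engine on the root device for both APIs, and copy engines only for Level Zero, taken from tile 0 regardless of what the other tiles have.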