When generating work group sizes, first try enforcing decreasing
order X >= Y >= Z. If the generated work group size X * Y * Z is
smaller than half the kernel's SIMD size, generate again without
enforcing the decreasing order.
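
The two-pass behaviour can be sketched as follows. This is only an
illustration, not the driver's actual code: the names Size3,
pickWorkGroupSize and generateWorkGroupSize, the brute-force search
over divisors of the global size, and the maxWorkGroupSize parameter
are all assumptions made for the example.

```cpp
#include <array>
#include <cstddef>

// Hypothetical type and function names, used only for this sketch.
using Size3 = std::array<std::size_t, 3>;

// Returns the candidate with the largest X * Y * Z such that each dimension
// divides the matching global size and the total fits in maxWorkGroupSize.
// With enforceOrder set, only candidates with X >= Y >= Z are considered.
Size3 pickWorkGroupSize(const Size3 &globalSize, std::size_t maxWorkGroupSize, bool enforceOrder) {
    Size3 best{1, 1, 1};
    std::size_t bestTotal = 1;
    for (std::size_t x = 1; x <= globalSize[0] && x <= maxWorkGroupSize; ++x) {
        if (globalSize[0] % x != 0) continue;
        for (std::size_t y = 1; y <= globalSize[1] && x * y <= maxWorkGroupSize; ++y) {
            if (globalSize[1] % y != 0) continue;
            if (enforceOrder && y > x) break; // would violate X >= Y
            for (std::size_t z = 1; z <= globalSize[2] && x * y * z <= maxWorkGroupSize; ++z) {
                if (globalSize[2] % z != 0) continue;
                if (enforceOrder && z > y) break; // would violate Y >= Z
                if (x * y * z > bestTotal) {
                    best = {x, y, z};
                    bestTotal = x * y * z;
                }
            }
        }
    }
    return best;
}

// First pass enforces X >= Y >= Z. If the result is smaller than half the
// SIMD size, the second pass drops the ordering constraint.
Size3 generateWorkGroupSize(const Size3 &globalSize, std::size_t maxWorkGroupSize, std::size_t simdSize) {
    const Size3 ordered = pickWorkGroupSize(globalSize, maxWorkGroupSize, true);
    const std::size_t total = ordered[0] * ordered[1] * ordered[2];
    if (total * 2 < simdSize) {
        return pickWorkGroupSize(globalSize, maxWorkGroupSize, false);
    }
    return ordered;
}
```

For a global size of 1 x 1 x 64, the ordered pass can only produce
1 x 1 x 1, so the sketch falls back to the unordered pass.
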
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
This change fixes a problem with memory locality.
When calculating the work group size, do not take into account
work group sizes that have a larger number of elements in
higher dimensions, namely Y > X or Z > Y.
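
As a rough illustration of the rule above, the filter amounts to a
single ordering check per candidate. The names Size3,
keepsLowerDimensionsLargest and filterCandidates are hypothetical and
exist only for this sketch; the real candidate generation is not
shown.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical type and function names, used only for this sketch.
using Size3 = std::array<std::size_t, 3>;

// True when no higher dimension has more elements than a lower one,
// that is, when neither Y > X nor Z > Y holds.
bool keepsLowerDimensionsLargest(const Size3 &workGroupSize) {
    return workGroupSize[0] >= workGroupSize[1] && workGroupSize[1] >= workGroupSize[2];
}

// Drops every candidate with Y > X or Z > Y, preserving input order.
std::vector<Size3> filterCandidates(const std::vector<Size3> &candidates) {
    std::vector<Size3> kept;
    for (const Size3 &candidate : candidates) {
        if (keepsLowerDimensionsLargest(candidate)) {
            kept.push_back(candidate);
        }
    }
    return kept;
}
```
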
Related-To: NEO-5719
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>