fix: Use proper value about hw local id generations

- remove useless flag ForceNumberOfThreadsInGpgpuThreadGroup
- add new flag "RemoveRestrictionsOnNumberOfThreadsInGpgpuThreadGroup"
to restore old path without restrictions about number of threads in
thread group
- fix forwarding information about hw local ids generations to
calculate numOfThreadsInThreadGroup correctly

Related-To: NEO-7952, NEO-7982
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
This commit is contained in:
Cencelewska, Katarzyna
2023-06-23 14:26:00 +00:00
committed by Compute-Runtime-Automation
parent aea5f435db
commit 68d81c82a7
9 changed files with 31 additions and 19 deletions

View File

@@ -339,7 +339,7 @@ ze_result_t KernelImp::setGroupSize(uint32_t groupSizeX, uint32_t groupSizeY,
auto &rootDeviceEnvironment = module->getDevice()->getNEODevice()->getRootDeviceEnvironment();
auto &gfxCoreHelper = rootDeviceEnvironment.getHelper<NEO::GfxCoreHelper>();
this->numThreadsPerThreadGroup = gfxCoreHelper.calculateNumThreadsPerThreadGroup(
simdSize, static_cast<uint32_t>(itemsInGroup), grfSize, kernelRequiresGenerationOfLocalIdsByRuntime);
simdSize, static_cast<uint32_t>(itemsInGroup), grfSize, !kernelRequiresGenerationOfLocalIdsByRuntime);
if (kernelRequiresGenerationOfLocalIdsByRuntime) {
auto grfSize = this->module->getDevice()->getHwInfo().capabilityTable.grfSize;