Commit Graph

12 Commits

Author SHA1 Message Date
Fabian Zwolinski
cbce863dc2 refactor: Rename member variables to camelCase 3/n
Additionally enable clang-tidy check for member variables

Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-04-28 16:01:14 +02:00
Zbigniew Zdanowicz
4c7bc2ca98 [feature, perf] add alogrithm to chain command buffers in container
This feature is part of performance improvement to dispatch and start
command buffers as primary batch buffers.
When exhausted command buffer is closed, then reserve exact space for chained
batch buffer start and bind it to the next command buffer.
When closing command buffer, then save ending pointer and
reserve aligned space.

Related-To: NEO-7807

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-04-05 15:49:01 +02:00
Rafal Maziejuk
b9828b543e feature: adjust maxWorkGroupSize value
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-03-28 15:19:52 +02:00
Mateusz Jablonski
5610eae710 refactor: fix typo Barrierl -> Barrier
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-21 15:58:24 +01:00
Filip Hazubski
0bee81c0c0 refactor: Move isLinearStoragePreferred function from gfx to product helper
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2023-03-15 18:51:59 +01:00
Mateusz Jablonski
340f932ca2 refactor: move GfxCoreHelper::getExtensions to CompilerProductHelper
Related-To: NEO-7800
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-14 13:56:19 +01:00
Raiyan Latif
d5c909c9f9 Fix calculation of number of Ray-Tracing stacks
MaxDualSubSlicesSupported is filled inside GT_SYSTEM_INFO
structure when querying the KMD appropriately with the
number of enabled DualSubSlices. However we need to find
the highest index of the last enabled DualSubSlice.

For proper allocation of thread scratch space, allocation
has to be done based on native die config (including unfused
or non-enabled DualSubSlices). Since HW doesn't provide us a
way to know the exact native die config, in SW we need to
allocate RT stacks with enough size based on the last used
DualSubSlice.

The IsDynamicallyPopulated field in GT_SYSTEM_INFO is used to
indicate if system details are populated either via Fuse reg.
or hard-coded. Based on this field's value, we calcuate the
numRtStacks appropriately.

Related-To: LOCI-3954

Signed-off-by: Raiyan Latif <raiyan.latif@intel.com>
2023-03-13 10:48:10 +01:00
Compute-Runtime-Validation
678e47de2d Revert "Adjust maxWorkGroupSize value"
This reverts commit f7685a93e4.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-02-21 14:45:36 +01:00
Rafal Maziejuk
f7685a93e4 Adjust maxWorkGroupSize value
Related-To: NEO-7357

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
2023-02-17 09:34:15 +01:00
Maciej Bielski
2778043d67 fix(l0): check for largeGRF when computing maxWorkGroupSize
Sizing context (PVC):
When using LargeGRF (a.k.a GRF256) there are only 4 HW threads per EU
(instead of default 8). Together with SIMD16 that means that there can
be max 64 work-items per EU. With 8 EU per subslice this gives 512
work-items on a single subslice. For correct intra-WG synchronization
all its WIs must be executed on the same subslice (to access the same
SLM, where the synchronization primitives are stored). Thus, with SIMD16
and LargeGRF the work-group size must not exceed 512 (PVC example).

So far `maxWorkGroupSize` is taken solely from a DeviceInfo structure
both in `ModuleTranslationUnit::processUnpackedBinary()` and
`ModuleImp::initialize()`. This method does not take kernel parameters
(LargeGRF) into account. It allows to submit a kernel using LargeGRF
with SIMD16 with the work-group size set to 1024. That leads to a hang.

Fix the `.maxWorkGroupSize` computation so that it takes the kernel
parameters into consideration.

Add new (for discrete platforms >= XeHP) and adapt existing tests, fix
cosmetics by the way.

Similar check for OCL:
https://github.com/intel/compute-runtime/blob/master/opencl/source/comma
nd_queue/enqueue_kernel.h#L130

Related-To: NEO-7684
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-02-08 11:20:52 +01:00
Dominik Dabek
8da362afae fix(l0): do not memcpy on cpu if need unlock ptr
Do not use cpu memory copy on windows if need to unlock locked ptr.

Related-To: NEO-7553

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-02-02 10:41:39 +01:00
Kamil Kopryk
2484c7ceb2 refactor: rename hw_helper files to gfx_core_helper files
Related-To: NEO-6853
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-02-01 19:37:51 +01:00