compute-runtime

mirror of https://github.com/intel/compute-runtime.git synced 2026-01-04 07:14:10 +08:00

Author	SHA1	Message	Date
Jitendra Sharma	9818ef61a5	feature: Report correct GRF register count Based on Large GRF enabled or not, report correct GRF register. Related-To: NEO-6788 Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>	2023-09-04 11:42:48 +02:00
Compute-Runtime-Validation	154530ad23	Revert "feature: Report correct GRF register count" This reverts commit `8eb3fe222e`. Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>	2023-09-01 15:12:57 +02:00
Jitendra Sharma	8eb3fe222e	feature: Report correct GRF register count Based on Large GRF enabled or not, report correct GRF register. Related-To: NEO-6788 Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>	2023-08-31 18:48:29 +02:00
Mateusz Jablonski	27e459dfd0	fix: add missing cache flushes on MTL and later integrated GPUs hdc pipeline / untyped dataport cache flushes were applied only on discrete GPUs Related-To: GSD-5085 Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>	2023-07-28 19:30:13 +02:00
Kacper Nowak	b908203001	fix: Compile built-ins per release - Preserve releases on CMake level. - Instead of generating builtins per platform, generate them per-release (+ correct naming accordingly). - Stop using revisions in builtin compilation logic path, as they are already embedded in release (device ip). - Remove platform names & revisions from names for generated files (related to builtins). - Remove unnecessary code, refactor ULT logic. Related-To: NEO-7783 Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>	2023-07-11 16:02:36 +02:00
Cencelewska, Katarzyna	0d7aefe66b	fix: Unify logic calculating threads per work group part 1 Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>	2023-06-29 10:43:22 +02:00
Cencelewska, Katarzyna	68d81c82a7	fix: Use proper value about hw local id generations - remove useless flag ForceNumberOfThreadsInGpgpuThreadGroup - add new flag "RemoveRestrictionsOnNumberOfThreadsInGpgpuThreadGroup" to restore old path without restrictions about number of threads in thread group - fix forwarding information about hw local ids generations to calculate numOfThreadsInThreadGroup correctly Related-To: NEO-7952, NEO-7982 Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>	2023-06-26 16:35:42 +02:00
Cencelewska, Katarzyna	7cb3278eb3	fix: add function to calculate number of threads per tg Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>	2023-06-13 14:02:24 +02:00
Cencelewska, Katarzyna	d2436a8231	fix: add limitations for setting gmm flag Cacheable - move isCachingOnCpuAvailable to product helper - isCachingOnCpuAvailable should return false on mtl - if wsl, skip checking method from product helper Related-To: NEO-7194 Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>	2023-05-30 17:04:57 +02:00
Mateusz Jablonski	61055478d4	fix: adjust scope of disable L3 for debug WA Related-To: HSD-1609398399 Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>	2023-05-30 14:23:16 +02:00
Filip Hazubski	d234bc970d	refactor: Move getMaxNumSamplers function to ProductHelper Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>	2023-05-18 09:25:07 +02:00
Cencelewska, Katarzyna	5f22e9eaca	fix: don't set Cacheable on xe_hp and later Related-To: NEO-7194 Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>	2023-05-18 09:17:32 +02:00
Milczarek, Slawomir	66eb1c9c0a	refactor: Add helpers to control kmd migration support on PVC platform This commit keeps KMD migration still disabled by default on PVC platform. Related-To: NEO-6465 Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>	2023-05-15 13:51:19 +02:00
Fabian Zwolinski	cbce863dc2	refactor: Rename member variables to camelCase 3/n Additionally enable clang-tidy check for member variables Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>	2023-04-28 16:01:14 +02:00
Zbigniew Zdanowicz	4c7bc2ca98	[feature, perf] add algorithm to chain command buffers in container This feature is part of performance improvement to dispatch and start command buffers as primary batch buffers. When exhausted command buffer is closed, then reserve exact space for chained batch buffer start and bind it to the next command buffer. When closing command buffer, then save ending pointer and reserve aligned space. Related-To: NEO-7807 Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>	2023-04-05 15:49:01 +02:00
Rafal Maziejuk	b9828b543e	feature: adjust maxWorkGroupSize value Related-To: NEO-7357 Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>	2023-03-28 15:19:52 +02:00
Mateusz Jablonski	5610eae710	refactor: fix typo Barrierl -> Barrier Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>	2023-03-21 15:58:24 +01:00
Filip Hazubski	0bee81c0c0	refactor: Move isLinearStoragePreferred function from gfx to product helper Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>	2023-03-15 18:51:59 +01:00
Mateusz Jablonski	340f932ca2	refactor: move GfxCoreHelper::getExtensions to CompilerProductHelper Related-To: NEO-7800 Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>	2023-03-14 13:56:19 +01:00
Raiyan Latif	d5c909c9f9	Fix calculation of number of Ray-Tracing stacks MaxDualSubSlicesSupported is filled inside GT_SYSTEM_INFO structure when querying the KMD appropriately with the number of enabled DualSubSlices. However we need to find the highest index of the last enabled DualSubSlice. For proper allocation of thread scratch space, allocation has to be done based on native die config (including unfused or non-enabled DualSubSlices). Since HW doesn't provide us a way to know the exact native die config, in SW we need to allocate RT stacks with enough size based on the last used DualSubSlice. The IsDynamicallyPopulated field in GT_SYSTEM_INFO is used to indicate if system details are populated either via Fuse reg. or hard-coded. Based on this field's value, we calculate the numRtStacks appropriately. Related-To: LOCI-3954 Signed-off-by: Raiyan Latif <raiyan.latif@intel.com>	2023-03-13 10:48:10 +01:00
Compute-Runtime-Validation	678e47de2d	Revert "Adjust maxWorkGroupSize value" This reverts commit `f7685a93e4`. Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>	2023-02-21 14:45:36 +01:00
Rafal Maziejuk	f7685a93e4	Adjust maxWorkGroupSize value Related-To: NEO-7357 Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>	2023-02-17 09:34:15 +01:00
Maciej Bielski	2778043d67	fix(l0): check for largeGRF when computing maxWorkGroupSize Sizing context (PVC): When using LargeGRF (a.k.a GRF256) there are only 4 HW threads per EU (instead of default 8). Together with SIMD16 that means that there can be max 64 work-items per EU. With 8 EU per subslice this gives 512 work-items on a single subslice. For correct intra-WG synchronization all its WIs must be executed on the same subslice (to access the same SLM, where the synchronization primitives are stored). Thus, with SIMD16 and LargeGRF the work-group size must not exceed 512 (PVC example). So far `maxWorkGroupSize` is taken solely from a DeviceInfo structure both in `ModuleTranslationUnit::processUnpackedBinary()` and `ModuleImp::initialize()`. This method does not take kernel parameters (LargeGRF) into account. It allows to submit a kernel using LargeGRF with SIMD16 with the work-group size set to 1024. That leads to a hang. Fix the `.maxWorkGroupSize` computation so that it takes the kernel parameters into consideration. Add new (for discrete platforms >= XeHP) and adapt existing tests, fix cosmetics by the way. Similar check for OCL: https://github.com/intel/compute-runtime/blob/master/opencl/source/command_queue/enqueue_kernel.h#L130 Related-To: NEO-7684 Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>	2023-02-08 11:20:52 +01:00
Dominik Dabek	8da362afae	fix(l0): do not memcpy on cpu if need unlock ptr Do not use cpu memory copy on windows if need to unlock locked ptr. Related-To: NEO-7553 Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>	2023-02-02 10:41:39 +01:00
Kamil Kopryk	2484c7ceb2	refactor: rename hw_helper files to gfx_core_helper files Related-To: NEO-6853 Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>	2023-02-01 19:37:51 +01:00

25 Commits
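
The sizing argument in commit `2778043d67` (4 HW threads per EU with LargeGRF instead of 8, SIMD16, 8 EUs per subslice, hence at most 512 work-items per work-group) can be sketched as the small calculation below. This is only an illustration of the arithmetic in that commit message, not the code the commit adds: `SubsliceConfig`, its field names, and `maxWorkGroupSizeForKernel` are hypothetical, and the device-wide cap of 1024 is assumed from the work-group size the message says caused the hang.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical description of one subslice; the names are illustrative and
// do not come from compute-runtime. Values follow the PVC example in the
// commit message of 2778043d67.
struct SubsliceConfig {
    uint32_t euPerSubslice = 8;             // "8 EU per subslice"
    uint32_t defaultThreadsPerEu = 8;       // "default 8" HW threads per EU
    uint32_t largeGrfThreadsPerEu = 4;      // "only 4 HW threads per EU" with LargeGRF
    uint32_t deviceMaxWorkGroupSize = 1024; // assumed device-wide limit
};

// Largest work-group that still fits on a single subslice, so that all of
// its work-items can share the same SLM for synchronization.
constexpr uint32_t maxWorkGroupSizeForKernel(const SubsliceConfig &cfg,
                                             uint32_t simdSize,
                                             bool largeGrf) {
    const uint32_t threadsPerEu =
        largeGrf ? cfg.largeGrfThreadsPerEu : cfg.defaultThreadsPerEu;
    const uint32_t workItemsPerEu = threadsPerEu * simdSize;
    const uint32_t workItemsPerSubslice = workItemsPerEu * cfg.euPerSubslice;
    // Never report more than the device-wide limit.
    return std::min(workItemsPerSubslice, cfg.deviceMaxWorkGroupSize);
}

// The case described in the commit message: SIMD16 with LargeGRF.
static_assert(maxWorkGroupSizeForKernel(SubsliceConfig{}, 16, true) == 512,
              "4 threads * 16 lanes * 8 EUs = 512 work-items");
```

With these assumed values, a SIMD16 kernel without LargeGRF still gets the full 1024, which matches the message's point that only the LargeGRF case needs the lower limit.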