Commit Graph

1096 Commits

Author SHA1 Message Date
Mateusz Jablonski 4dfa12c8eb fix: add mechanism to detect gpu timestamp overflows
unify naming CpuGpu to GpuCpu

Related-To: NEO-8394
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-19 16:31:06 +02:00
Lukasz Jobczyk 750b5ba89a fix: flush necessary caches when dispatch pipe control
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-10-19 15:09:31 +02:00
Mateusz Hoppe 5a4fa180d6 feature: control bindless compilation mode based on release
- check releaseHelper support when selecting bindless mode, if not
disabled, prefer bindless mode in L0 API
- bindless mode can be forced with DebugVariable: UseBindlessMode

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-10-19 10:11:11 +02:00
John Falkowski f156a74f54 fix: split chunking prefetch flags
Related-To: NEO-9120

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2023-10-18 19:20:42 +02:00
Mateusz Jablonski a420e34b10 fix: explicitly remove assign operators when not needed
when class defines copy/move ctor then corresponding assign operator(s)
should be defined or deleted

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-12 10:25:15 +02:00
Mateusz Jablonski 3fdcf049bf fix: set default device hierarchy to composite for all platforms except xe hpc
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-12 10:09:50 +02:00
Latif, Raiyan dee5ecfdf3 fix: ReturnSubDevicesAsApiDevices flag being ignored
Proper subdevice count being returned now in GfxCoreHelper
path, as previous method ignored the usage of the
ReturnSubDevicesAsApiDevices flag.

Related-To: LOCI-4859

Signed-off-by: Latif, Raiyan <raiyan.latif@intel.com>
2023-10-10 17:05:00 +02:00
Mateusz Jablonski 420f273a6c fix: don't wait on condition in unit tests
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-10 15:14:30 +02:00
Mateusz Hoppe c2d69e5857 feature: allocate SPECIAL_SSH heap in front window from EXTERNAL heap
- SPECIAL_SSH is used for debug surface SurfaceState which must be
located at bindless offset zero
- limit size of external front window

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-10-09 14:54:39 +02:00
Filip Hazubski 0c8a514349 fix: Switch default device hierarchy to FLAT
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2023-10-06 15:24:50 +02:00
Filip Hazubski 08e92d154f fix: Add getDefaultDeviceHierarchy call to GfxCoreHelper
Added getDefaultDeviceHierarchy call that describes default device
hierarchy for a gfx core. Refactored L0 and OCL paths to use this
value by default and override this value when user sets
ZE_FLAT_DEVICE_HIERARCHY environment variable or
ReturnSubDevicesAsApiDevices debug key.

Updated ReturnSubDevicesAsApiDevices to force COMPOSITE device hierarchy
when set to 0.

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2023-10-06 12:32:41 +02:00
Mateusz Jablonski 382fc952f2 refactor: add NonAssignableClass to define classes without assign operator
Related-To: NEO-9038
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-05 15:16:58 +02:00
Mateusz Jablonski 712ba60452 fix: add unrecoverable to avoid nullptr access
Related-To: NEO-9038
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-05 13:20:21 +02:00
Mateusz Jablonski 6d259ac4b7 fix: add unrecoverable to avoid out of bound access
Related-To: NEO-9038
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-05 12:03:28 +02:00
Mateusz Jablonski ad2701ad26 fix: add unrecoverable to avoid out of bound access
Related-To: NEO-9038
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-04 15:59:18 +02:00
John Falkowski 56f05303c9 feature: add support for zeMemGet/SetAtomicAccessAttributeExp
Resolves: NEO-8219

Signed-off-by: John Falkowski <john.falkowski@intel.com>
2023-10-02 15:59:17 +02:00
Hoppe, Mateusz 5c565efe28 feature: bindless global heap with debugger
- program debugSurface's SurfaceState at the beginning of Bindless Surface
State Heap - SPECIAL_SSH
- ensure SPECIAL_SSH is resident

Related-To: NEO-7063

Signed-off-by: Hoppe, Mateusz <mateusz.hoppe@intel.com>
2023-09-29 13:13:46 +02:00
Mateusz Jablonski a033df33ff fix: remove preferSmallWorkgroupSizeForKernel method
Related-To: HSD-18033866078
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-29 11:55:09 +02:00
Mateusz Jablonski 3a21b3b228 refactor: remove not needed code
Related-To: NEO-7527

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-28 07:52:39 +02:00
Mateusz Jablonski 5f846d8a13 refactor: remove not needed code
Related-To: NEO-7527
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-27 18:17:04 +02:00
Mateusz Jablonski 03874b8815 refactor: remove not needed code
Related-To: NEO-7527
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-27 17:45:54 +02:00
Mateusz Jablonski 09044dfbaa refactor: remove not needed code
Related-To: NEO-7527

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-27 14:35:49 +02:00
Kacper Nowak 160303924d refactor: Correct logic for SIMD1
- For calculating number of threads per workgroup, for SIMD 1, return
local work size (each software thread should be mapped into a whole hardware
thread).
- Correct logic of calculating space for per thread data for SIMD 1.
- Minor: unit tests refactor.
- Corrected naming.
Related-To: NEO-8261
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
2023-09-26 15:28:37 +02:00
Dominik Dabek eebf2bbd26 performance(ocl): timestamp packet count per gfx
Add support for different timestamp packet counts per gfx family.
Change all packet counts to 1 except for xe-hpc.

Related-To: NEO-8154

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-09-25 20:34:58 +02:00
Maciej Plewka 8658fdb04e fix: Use stack vec for api specific prefix
Related-To: NEO-8388, GSD-6296

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-09-21 16:10:54 +02:00
Maciej Bielski 97e7cda912 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules, which kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-21 13:55:45 +02:00
Katarzyna Cencelewska d7d46a9fc5 refactor: use initialized variable in getHighestEnabledDualSubSlice
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2023-09-20 14:49:56 +02:00
Mateusz Jablonski b1808f7830 fix: correct suggested number of work groups for concurrent kernels on PVC
value depends on CCS count:
- single CCS mode (default) - 50% available
- two CCS mode - 25% available
- four CCS mode - 12.5% available

Related-To: NEO-8377
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-20 13:40:22 +02:00
Mateusz Hoppe 69f5ca6345 feature: bindless addressing - flush state cache after reusing SS slot
- when Surface State is reused for new resource, State Cache needs to be
invalidated

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-09-20 12:53:32 +02:00
Dunajski, Bartosz d3d5da1f72 feature: initial 64b in-order CmdList support
Related-To: NEO-7966

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-09-19 15:23:37 +02:00
Compute-Runtime-Validation 913a926fd4 Revert "feature: Optimize intra-module kernel ISA allocations"
This reverts commit c348831470.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-09-19 14:16:05 +02:00
Maciej Bielski c348831470 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules, which kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-19 12:05:09 +02:00
Compute-Runtime-Validation 73731d3be5 Revert "fix: correct suggested number of work groups for concurrent kernels o...
This reverts commit 6fc673b0fe.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-09-15 04:21:58 +02:00
Mateusz Jablonski 6fc673b0fe fix: correct suggested number of work groups for concurrent kernels on PVC
value depends on CCS count:
- single CCS mode (default) - no limitations
- two CCS mode - 25% available
- four CCS mode - 12.5% available

Related-To: NEO-8377
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-14 16:13:54 +02:00
Mateusz Jablonski 2f7c33c1fd refactor: move xe hpg specific appendBlitCommandsBlockCopy to xe hpg file
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-13 10:49:28 +02:00
Dunajski, Bartosz 7562842a58 refactor: remove LogicalStateHelper
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-09-13 10:29:53 +02:00
Compute-Runtime-Validation 413365a7bf Revert "fix: Correct logic for SIMD1"
This reverts commit fc099ead2e.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-09-13 08:23:59 +02:00
Kacper Nowak fc099ead2e fix: Correct logic for SIMD1
- For calculating number of threads per workgroup, treat simd 1 as it
  was simd 32
- Correct logic of calculating space for per thread data for simd 1
- Minor: unit tests refactor
- Corrected naming
Related-To: NEO-8261
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
2023-09-13 07:03:12 +02:00
Mrozek, Michal d9f938f3db refactor: remove not needed code
Signed-off-by: Mrozek, Michal <michal.mrozek@intel.com>
2023-09-12 14:25:04 +02:00
Dunajski, Bartosz 6648065703 feature: add indirect semaphore mode
Related-To: NEO-8242

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-09-12 13:15:51 +02:00
Dunajski, Bartosz 2a6be2fccd feature: update conditional bb start to use qword data
Related-To: NEO-8242

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-09-12 11:24:28 +02:00
Dunajski, Bartosz def3f2e9ad refactor: improve semaphore programming
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-09-12 11:24:11 +02:00
Filip Hazubski d7db6ac467 feature: Add preferredPlatformName field to RuntimeCapabilityTable
For all of the devices, preferredPlatformName is initialized with
nullptr by default and platform name will be initialized to driver's default
platform name, at the moment this is "Intel(R) OpenCL Graphics".

When Platform is initialized and preferredPlatformName is not nullptr then
Platform name will be set to the value stored in preferredPlatformName.

Add ENABLE_LEGACY_PLATFORM_NAME AIL enum related to added functionality.

Move PlatformInfo to NEO namespace.

Related-To: HSD-22018809561

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2023-09-12 11:07:14 +02:00
Mateusz Jablonski c851896482 refactor: move XeHpg specific setExtraAllocationData definition to Xe Hpg file
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-11 13:32:00 +02:00
Compute-Runtime-Validation 1579c69316 Revert "performance: allocate timestamp packet tag buffer in local mem on DG2"
This reverts commit 819908ec94.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-09-11 12:07:26 +02:00
Fabian Zwolinski b2ba1fbecf fix: enforce zebin format by default in Ocloc for ICL/TGL/RKL/ADL
Products for which zebin has been set as default format in OCLOC:
- ICELAKE_LP
- TIGERLAKE_LP
- ROCKETLAKE
- ALDERLAKE_S
- ALDERLAKE_P
- ALDERLAKE_N

The default format does not override `--format` parameter.

Related-To: NEO-8334
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-09-08 16:16:18 +02:00
Fabian Zwolinski 10675134e1 feature: Add process safety to Windows compiler cache
Related-To: NEO-8092

Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
Co-authored-by: Diedrich, Kamil <kamil.diedrich@intel.com>
2023-09-06 15:34:15 +02:00
Maciej Plewka 3b3e17e738 performance: Use vector for private allocs to reuse
Related-To: HSD-18033105655, HSD-18033153203

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-09-04 13:34:38 +02:00
Mateusz Jablonski 91b26277a4 feature: add method to adjust hw info for igc
Related-To: NEO-8203

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-04 11:58:13 +02:00
Jitendra Sharma 9818ef61a5 feature: Report correct GRF register count
Based on Large GRF enabled or not, report correct GRF
register.

Related-To: NEO-6788
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
2023-09-04 11:42:48 +02:00