Commit Graph

151 Commits

Author SHA1 Message Date
Mateusz Hoppe 4c49a08017 feature: add inline samplers bindless addressing support
- inline samplers in bindless addressing mode requires bindless offset
passed in cross thread data

Related-To: NEO-11748

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-06-24 13:02:08 +02:00
Bartosz Dunajski 692def2c79 feature: region group barrier allocation support
Related-To: NEO-11031

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-06-03 18:34:54 +02:00
Bartosz Dunajski cc8b5ee972 feature: add support for new ze_bin wg barrier token
Related-To: NEO-11031

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-05-31 11:04:50 +02:00
Dominik Dabek ae8c7589dc refactor: move implicit arg has indirect access
Move implicit arg has indirect access boolean to kernelAttributes

Related-To: NEO-11396

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-05-15 13:11:04 +02:00
Dominik Dabek fd47030ad6 fix: use igc indirect detection v3
Update to use igc indirect detection v3. Fix for not detecting indirects
passed as implicit arguments.

Related-To: NEO-11396

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-05-15 06:35:42 +02:00
Mateusz Hoppe f86d4220a5 feature: add bindless samplers support to level zero
- samplers using bindless adressing require patching bindless offsets to
sampler states on kernel's cross thread data

Related-To: NEO-10505

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-03-29 08:07:28 +01:00
Katarzyna Cencelewska da7b03dd15 fix: to always use grfs count in calculateNumThreadsPerThreadGroup
grf size != grf count

Related-To: GSD-8437
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2024-03-22 11:03:18 +01:00
Katarzyna Cencelewska dd1d52259e refactor: add param rootDeviceEnvironment to calculateNumThreadsPerThreadGroup
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2024-03-21 22:25:14 +01:00
Mrozek, Michal f71f6d2b72 refactor: remove not needed code
Signed-off-by: Mrozek, Michal <michal.mrozek@intel.com>
2024-03-08 18:18:55 +01:00
Kacper Nowak 999ec9b2ca refactor: Unify logic for getting atomic FP caps 1/n
- Separate logic for fp16/32/46 caps.
- Add aggregated constexprs for local & global caps of given type
- Pass arguments by reference
- Add hwInfo as argument for future refactors
- Add static_asserts in L0 to ensure there is no mismatch between
internal/external caps
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
2024-02-07 11:39:36 +01:00
Mateusz Jablonski a697a3f718 refactor: create new members for storing spill and private memory in scratch
rename private scratch space into scratch space slot 1 as it can be generic

Related-To: NEO-9944
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-01-23 12:42:25 +01:00
Compute-Runtime-Validation f9f9035b95 Revert "refactor: create new members for storing spill and private memory in ...
This reverts commit 87eb5f554a.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-01-23 09:13:00 +01:00
Mateusz Jablonski 87eb5f554a refactor: create new members for storing spill and private memory in scratch
rename private scratch space into scratch space slot 1 as it can be generic

Related-To: NEO-9944
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-01-22 19:48:48 +01:00
Fabian Zwolinski 903e581b5f fix: add support for bindless implicit args
Support for:
global_base and const_base in bindless addressing mode.

Related-To: NEO-9855
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2024-01-12 01:27:17 +01:00
Dunajski, Bartosz f7eb961435 refactor: validate template type in isUndefinedOffset helper
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-12-21 10:29:04 +01:00
Mateusz Jablonski 138fb65401 refactor: correct naming of enum class constants 11/n
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-19 14:52:57 +01:00
Dunajski, Bartosz c612a86d28 feature: initial support for new zeinfo args
Related-To: NEO-8070

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-12-19 10:04:14 +01:00
Mateusz Jablonski dd1b9d6abc refactor: correct naming of enum class constants 8/n
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-19 08:18:18 +01:00
Dunajski, Bartosz d99104d5bf refactor: improve ImplicitArg struct handling
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-12-18 15:19:00 +01:00
Mateusz Jablonski 895519db38 refactor: correct naming of NEOImageType enum values
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-12 11:15:28 +01:00
Mateusz Jablonski 83006521bc refactor: correct naming of internal fp atomic ext flags
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-04 19:02:53 +01:00
Mateusz Jablonski c3d3a4db1f refactor: correct variable naming
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-04 13:45:53 +01:00
Mateusz Jablonski c3ac7b78bd refactor: correct variable naming
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-12-01 02:18:46 +01:00
Mateusz Jablonski c9664e6bad refactor: rename global debug manager to debugManager
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-11-30 13:00:59 +01:00
Kamil Kopryk ae607502a0 feature: Add indirect data and scratch pointer to zeinfo
Related-To: NEO-7621
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-11-03 12:01:58 +01:00
Mateusz Jablonski fd7c750cf7 fix: ensure local variable address is not exposed outside of function
Related-To: NEO-9038
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-06 15:59:16 +02:00
Mateusz Jablonski 9337911742 fix: add self-assign check in operator=
Related-To: NEO-9038
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-10-03 08:09:16 +02:00
Mateusz Jablonski 2a78a00855 fix: correct passing string in populateArgMetadata
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-09-27 18:00:33 +02:00
Maciej Bielski 97e7cda912 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules, which kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-21 13:55:45 +02:00
Compute-Runtime-Validation 913a926fd4 Revert "feature: Optimize intra-module kernel ISA allocations"
This reverts commit c348831470.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-09-19 14:16:05 +02:00
Maciej Bielski c348831470 feature: Optimize intra-module kernel ISA allocations
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several keys.

Improve the situation by reusing the parent allocation (owned by the
module instance) for modules, which kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-09-19 12:05:09 +02:00
Mateusz Jablonski 00e24c0069 performance: leave StackVec::onStackMemRawBytes uninitialized
this memory shouldn't be accessed before resize

Resolves: HSD-18032826534

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-08-25 11:40:38 +02:00
Cencelewska, Katarzyna aa0beb8191 fix: Unify logic calculating threads per work group part 4
- also use helper when checking that is simd1 to have same flow

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-07 15:34:59 +02:00
Cencelewska, Katarzyna 61f701aba5 fix: Unify logic calculating threads per work group part 3
Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-04 15:27:44 +02:00
Cencelewska, Katarzyna 2e17c21728 fix: Unify logic calculating threads per work group part 2
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-04 10:34:02 +02:00
Compute-Runtime-Validation 39740da9d1 Revert "fix: Unify logic calculating threads per work group part 2"
This reverts commit 1e8a53bd53.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-07-02 07:09:14 +02:00
Cencelewska, Katarzyna 1e8a53bd53 fix: Unify logic calculating threads per work group part 2
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-30 14:16:08 +02:00
Mateusz Hoppe 111b112729 feature: add assertBufferPtr to ImplicitArgs
Related-To: NEO-5753, NEO-8078

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-06-20 20:43:57 +02:00
Mateusz Hoppe 313fb84fda feature: bindless addressing mode support
- allow bindless kernels to execute
- bindless addressing kernels are using private heaps mode
- do not differentiate bindful and bindless surface state base addresses

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-06-19 12:41:03 +02:00
Cencelewska, Katarzyna 7cb3278eb3 fix: add function to calculate number of threads per tg
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-13 14:02:24 +02:00
Mateusz Hoppe 8bc1fb1251 refactor: add function checking bindless addressing
- simplify logic to check addressing mode of a kernel

Related-To: NEO-7063

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-06-12 14:42:18 +02:00
Mateusz Hoppe 646c8985e8 refactor: store number of stateful args in KernelDescriptor
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-06-12 14:15:43 +02:00
Kamil Kopryk 6a0f7afd64 feature: verify stateful information only when binary is generated by IGC
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>

Related-To: NEO-6075

Ngen binaries contain stateful information, however they are
not used in isa on Pvc. Therefore, we can just ignore them.
2023-06-12 11:45:41 +02:00
Mateusz Jablonski 31f32cc16e fix implicit args: generate local ids as for grf size 32
Related-To: IGC-6936

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-04-07 11:37:07 +02:00
Mateusz Jablonski dd39b822d3 feature implicit args: patch rt dispatch global array in implicit args buffer
handle has_rtcalls in kernels and functions in zebin

Related-To: NEO-7818
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-03-28 12:31:38 +02:00
Mateusz Hoppe 0204761add feature: gpu assert implementation
- allocate assert buffer when kernel has assert
- track assert kernels in cmdlists and cmdqueues
- check and print assert at sync calls: cmdqueue synchronize(), fence
synchronize(), event hostSynchronize(), synchronous imm cmdlists
append()

Related-To: NEO-5753

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-03-15 19:22:09 +01:00
Dominik Dabek 69a16fd3ed feature: check indirect access for kernel
Do not make indirect allocations resident if kernel does not use
indirect access.
For both level zero and opencl.
Currently disabled by default, enable with debug flag
DetectIndirectAccessInKernel

Related-To: NEO-7712

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-03-08 16:58:26 +01:00
Michal Mrozek c77d954900 [perf] simplify setting constant buffer and improve performance
- no need to count parameters
- remove unrecoverable which requires fetching additional fields.

Related-To: NEO-5170

Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2023-03-02 13:09:52 +01:00
Compute-Runtime-Validation 4a369ad88d Revert "feature: check indirect access for kernel"
This reverts commit 075c96267d.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-02-24 03:48:22 +01:00
Dominik Dabek 075c96267d feature: check indirect access for kernel
Do not make indirect allocations resident if kernel does not use
indirect access.
Enable for both level zero and opencl.

Related-To: NEO-7712

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-02-23 12:38:53 +01:00