Commit Graph

4499 Commits

Author SHA1 Message Date
Dominik Dabek
622a3ed89c performance(ocl): flag to not dcFlush on no event
If waitForBarrier is not passed outEvent then do
dcFlush on the next synchronize call.

Related-To: NEO-8147

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-07-18 15:38:54 +02:00
Artur Harasimiuk
faa8907344 refactor: remove unused code
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2023-07-18 14:52:43 +02:00
Mateusz Hoppe
776917159d feature: cl_khr_external_memory - return platform info
- return supported memory handle types when querying
CL_PLATFORM_EXTERNAL_MEMORY_IMPORT_HANDLE_TYPES_KHR

Related-To: NEO-6757

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-07-18 14:21:20 +02:00
Mateusz Jablonski
33e0eabe91 performance(ocl): verify tracingInProgress after checking for tracing support
Related-To: NEO-7958
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-07-18 11:42:30 +02:00
Mateusz Jablonski
f363463e2d test: move memory manager tests from OCL to shared
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-07-17 14:53:16 +02:00
Dunajski, Bartosz
815b37bf3a performance: allow waiting for OOQ timestamps in clEnqueueWaitForEvents
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-07-14 11:33:10 +02:00
Katarzyna Cencelewska
d74bba95c4 fix: use proper gpu ptr when 32 bit
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2023-07-14 11:00:40 +02:00
Dunajski, Bartosz
712e059ace performance: check completion alloc only once when waiting for Event
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-07-14 08:32:50 +02:00
Maciej Bielski
c7a971a28f feature: add optional onChunkFree callback to AbstractBuffersPool
Instances returned by `getAllocationsVector()` in some cases cannot be
freed (in the `malloc/new` sense) until the `drain()` function invokes
`allocInUse()` on them. Plus, the `chunksToFree` container operates on
pairs `{offset, size}`, not pointers, so such pair cannot be used to
release allocations either.

Provide an optional callback, which can be implemented by the custom
pool derived from `AbstractBuffersPool`. This callback can be used, for
example, to perform actual release of an allocation related to the
currently processed chunk.

Additionally, provide the `drain()` and `tryFreeFromPoolBuffer()`
functions with pool-independent versions and keep the previous versions
as defaults (for allocators with a single pool). The new versions allow
reusing the code for cases when allocator has multiple pools.

In both cases, there was no such needs so far but it arose when working
on `IsaBuffersAllocator`. The latter is coming with future commits, but
the shared code modifications are extracted as an independent step.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-07-13 17:26:51 +02:00
Mateusz Hoppe
9fd7f9cf05 fix: set ImplicitArgs size to size of defined fields
Resolves: NEO-8169

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-07-12 21:30:32 +02:00
Zbigniew Zdanowicz
1c0285a156 fix: correct alignment of per thread scratch size
Related-To: NEO-5288

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-07-12 12:31:47 +02:00
Maciej Plewka
3a9a835692 fix: encode options in elf file
Resolves: NEO-8035

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2023-07-11 16:31:52 +02:00
Dunajski, Bartosz
40c7b2842f fix: check engines completion before releasing deferred TSP nodes
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>

Related-To: NEO-8146
2023-07-11 16:11:51 +02:00
Kacper Nowak
b908203001 fix: Compile built-ins per release
- Preserve releases on CMake level.
- Instead of generating builtins per platform, generate them per-release
(+ correct naming accordingly).
- Stop using revisions in builtin compilation logic path, as they are
already embedded in release (device ip).
- Remove platform names & revisions from names for generated files
(related to builtins).
- Remove unnecessary code, refactor ULT logic.

Related-To: NEO-7783
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
2023-07-11 16:02:36 +02:00
Andrzej Ratajewski
dc0796c2a1 feature: Add cl_khr_spirv_linkonce_odr to supported extensions
Related-To: NEO-8165
Signed-off-by: Andrzej Ratajewski <andrzej.ratajewski@intel.com>
2023-07-11 13:19:55 +02:00
Zbigniew Zdanowicz
3f7269d401 fix: make sip state programing once for all level zero command queues
Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-07-11 11:34:21 +02:00
Young Jin Yoon
81822e3716 refactor: rename pageSize2Mb to pageSize2M
The previous name "pageSize2Mb" defined in
shared/source/helpers/constant.h is inconsistent to other variable,
i.e. pageSize64k.

Furthermore, it's a bit misleading because the page size is defined in
Megabytes (MB), not in Megabits (Mb).

Related-to: NEO-7695
Signed-off-by: Young Jin Yoon <young.jin.yoon@intel.com>
2023-07-10 20:12:09 +02:00
Cencelewska, Katarzyna
aa0beb8191 fix: Unify logic calculating threads per work group part 4
- also use helper when checking that is simd1 to have same flow

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-07 15:34:59 +02:00
Kamil Kopryk
dcd0bd4c30 fix: correct black box test
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2023-07-07 13:07:09 +02:00
Dunajski, Bartosz
00bae2c827 fix: add missing nullptr check
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-07-07 11:34:49 +02:00
Compute-Runtime-Validation
9c7950cd22 Revert "feature: add optional onChunkFree callback to AbstractBuffersPool"
This reverts commit b7ecf99abb.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-07-07 04:31:30 +02:00
Dunajski, Bartosz
d96cf5846a fix: dont allocate TSP for OOQ without Event
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-07-06 15:12:22 +02:00
Zbigniew Zdanowicz
c892b8c6f3 fix: remove redundant check
Related-To: NEO-7828

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2023-07-06 14:58:18 +02:00
Maciej Bielski
b7ecf99abb feature: add optional onChunkFree callback to AbstractBuffersPool
Instances returned by `getAllocationsVector()` in some cases cannot be
freed (in the `malloc/new` sense) until the `drain()` function invokes
`allocInUse()` on them. Plus, the `chunksToFree` container operates on
pairs `{offset, size}`, not pointers, so such pair cannot be used to
release allocations either.

Provide an optional callback, which can be implemented by the custom
pool derived from `AbstractBuffersPool`. This callback can be used, for
example, to perform actual release of an allocation related to the
currently processed chunk.

Additionally, provide the `drain()` and `tryFreeFromPoolBuffer()`
functions with pool-independent versions and keep the previous versions
as defaults (for allocators with a single pool). The new versions allow
reusing the code for cases when allocator has multiple pools.

In both cases, there was no such needs so far but it arose when working
on `IsaBuffersAllocator`. The latter is coming with future commits, but
the shared code modifications are extracted as an independent step.

Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2023-07-06 10:38:55 +02:00
Lukasz Jobczyk
e70f441f52 fix: Idle gpu before invalidate aux table
Related-To: NEO-8067

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-07-05 13:51:27 +02:00
Mateusz Hoppe
eb2225e623 Revert "fix: no longer append .bin to binary name when "-output_no_suffix" ...
This reverts commit df62888efc.

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2023-07-05 12:21:55 +02:00
Cencelewska, Katarzyna
61f701aba5 fix: Unify logic calculating threads per work group part 3
Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-04 15:27:44 +02:00
Cencelewska, Katarzyna
2e17c21728 fix: Unify logic calculating threads per work group part 2
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-07-04 10:34:02 +02:00
Mateusz Jablonski
30c5d8a681 fix: pass gmm helper to getDumpSurfaceInfo function
gmm may not exist for buffer allocation

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-07-03 11:59:52 +02:00
Compute-Runtime-Validation
39740da9d1 Revert "fix: Unify logic calculating threads per work group part 2"
This reverts commit 1e8a53bd53.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2023-07-02 07:09:14 +02:00
Cencelewska, Katarzyna
1e8a53bd53 fix: Unify logic calculating threads per work group part 2
- use calculateNumThreadsPerThreadGroup instead of getThreadsPerWG to
have same flow and proper values of threads per work groups

Related-To: NEO-8087
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-30 14:16:08 +02:00
Artur Harasimiuk
3c4d921a80 refactor: remove not used code
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2023-06-30 13:39:25 +02:00
Daria Hinz
a6fee8994d feature: Adding kernel sizes validation in ocloc
This commit adds a validation layer in ocloc,
which is designed to check if the data read from
the binary file does not exceed the size of the section.

Related-To: NEO-8062
Signed-off-by: Daria Hinz <daria.hinz@intel.com>
2023-06-30 10:39:31 +02:00
Dominik Dabek
10ac167bdc fix(ocl): do not multiply kmdNotify waitTimeout
With direct submission disabled this resulted in waitTimeout long enough
that kmdWait fallback was rarely used.
This caused more CPU spin time.

Related-To: GSD-3612

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2023-06-29 16:06:28 +02:00
Filip Hazubski
df62888efc fix: no longer append .bin to binary name when "-output_no_suffix" is passed
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2023-06-29 14:10:25 +02:00
Artur Harasimiuk
cf73ab0df3 refactor: remove not used code
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2023-06-29 13:04:23 +02:00
Cencelewska, Katarzyna
0d7aefe66b fix: Unify logic calculating threads per work group part 1
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-29 10:43:22 +02:00
Jaroslaw Chodor
023fe38448 fix: Use correct dimensions for UV plane
For image view mapped directly to UV plane,
the dimensions should 2 times smaller than
dimensions of the source image.
(1 raw UV pair maps to 2x2 block of original image)

Related-To: NEO-7936

Signed-off-by: Jaroslaw Chodor <jaroslaw.chodor@intel.com>
2023-06-28 23:34:50 +02:00
Dunajski, Bartosz
ecb415bf62 feature: reenable RelaxedOrdering
Related-To: NEO-7458

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-06-28 12:20:17 +02:00
Igor Venevtsev
c2c622d695 fix: stop direct submission on platform destruction
Related-To: NEO-8072

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2023-06-28 08:41:31 +02:00
Cencelewska, Katarzyna
68d81c82a7 fix: Use proper value about hw local id generations
- remove useless flag ForceNumberOfThreadsInGpgpuThreadGroup
- add new flag "RemoveRestrictionsOnNumberOfThreadsInGpgpuThreadGroup"
to restore old path without restrictions about number of threads in
thread group
- fix forwarding information about hw local ids generations to
calculate numOfThreadsInThreadGroup correctly

Related-To: NEO-7952, NEO-7982
Signed-off-by: Cencelewska, Katarzyna <katarzyna.cencelewska@intel.com>
2023-06-26 16:35:42 +02:00
Dunajski, Bartosz
2b5e475db9 refactor: use hex values to print TSP usage
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-06-23 11:22:10 +02:00
Mateusz Jablonski
2d01bdec81 fix: change denorm mode in IDD to FlushToZero
denorm support is controlled by IGC, we should just set zero by default

Related-To: NEO-8059
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2023-06-23 09:28:32 +02:00
Kacper Nowak
12f597bc72 fix(zebin): corrections related to IntelGT notes + bump ZEInfo version
- Value correction: IntelGTSectionType::ProductConfig to 6, add new type
IntelGTSectionType::vISAAbiVersion = 5 - currently ignored by the
runtime
- For zebin manipulator: allow to extract PRODUCT_FAMILY from AOT
productConfig - required by IGA wrapper for binary encoding/decoding +
add tests
- Bump ZEInfo version to the latest: 1.32
Related-To: IGC-6300
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
2023-06-22 15:41:01 +02:00
Dunajski, Bartosz
b004a27e4e refactor: Debug flag to print TSP usage
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-06-22 14:47:39 +02:00
Lukasz Jobczyk
0bc5eead84 fix: Remove not needed BCS split helper
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2023-06-22 14:36:22 +02:00
Igor Venevtsev
74082703ed feature: enable small buffers allocator optimization on MTL
Resolves: NEO-8082

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2023-06-22 11:06:32 +02:00
Fabian Zwolinski
99d0823e8f fix: Append extra extensions when FP64 emulation is enabled
Related-To: NEO-7611
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2023-06-22 08:38:53 +02:00
Wawiorko, Grzegorz
45187fe714 fix: Update CL_DEVICE_LATEST_CONFORMANCE_VERSION_PASSED value
Signed-off-by: Wawiorko, Grzegorz <grzegorz.wawiorko@intel.com>
2023-06-21 10:31:47 +02:00
Dunajski, Bartosz
46e8c3f5dd fix: reenable RelaxedOrdering for OCL
Related-To: NEO-7458

Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2023-06-20 13:05:25 +02:00