1411 Commits

Author SHA1 Message Date
Compute-Runtime-Validation 4f31b569e4 Revert "Correct IMAGE1D_BUFFER width size calculation in BCS"
This reverts commit 3490b489ad.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-03-12 20:02:55 +01:00
Rafal Maziejuk 3490b489ad Correct IMAGE1D_BUFFER width size calculation in BCS
Buffer's default bytesPerPixel value always equals 1 and as
IMAGE1D_BUFFER is originally an image, X coordinate needs to be
multiplied by bytesPerPixel in both copySize and (src/dst)Size.

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
Related-To: NEO-6134
2022-03-11 09:34:40 +01:00
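The width correction described in this entry (later reverted by 4f31b569e4, listed above) can be sketched as follows. This is a minimal illustration under stated assumptions, not the driver's code: `Vec3` and `scaleWidthToBytes` are hypothetical names, and the only behaviour taken from the commit message is that the X coordinate is multiplied by bytesPerPixel.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 3-component size type: {x, y, z}.
using Vec3 = std::array<std::size_t, 3>;

// Hypothetical helper: a buffer is addressed in bytes (bytesPerPixel == 1),
// while an IMAGE1D_BUFFER width is given in pixels, so only the X component
// is converted from pixels to bytes. Y and Z are left untouched.
Vec3 scaleWidthToBytes(Vec3 sizeInPixels, std::size_t bytesPerPixel) {
    assert(bytesPerPixel > 0);
    sizeInPixels[0] *= bytesPerPixel;
    return sizeInPixels;
}
```

Per the commit message, the same scaling would apply to copySize and to both the source and destination sizes.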
Lukasz Jobczyk c8ba97e492 Restore gpgpu csr's mutex lock in the enqueue blit
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-03-10 13:36:46 +01:00
Bartosz Dunajski b8d5fac10f Add missing lock in MapOperationsHandler
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2022-03-10 13:17:46 +01:00
Kamil Kopryk 038d1d54fa Correct xe_hpc tests
Related-To: NEO-6631


Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2022-03-09 09:21:30 +01:00
Michal Mrozek cd15c82eab Do not prefer copy engine for local to local transfers.
Execution Units are faster.
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2022-03-08 15:42:58 +01:00
Dominik Dabek d5fedf90c5 Fix for svm pointer arg caching
Previous version could cause segfaults.

Related-To: NEO-6737

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-03-08 12:13:15 +01:00
Filip Hazubski 80b520bc9b Change ThreadArbitrationPolicy enum type to int32_t
Change ThreadArbitrationPolicy::NotPresent value to -1
Update initial values to ThreadArbitrationPolicy::NotPresent

Related-To: NEO-6728

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2022-03-07 20:04:24 +01:00
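A rough sketch of what the enum change in this entry amounts to. Only the int32_t underlying type and `NotPresent = -1` come from the commit message; the other enumerators, `StateSketch`, and `isPolicyProgrammed` are hypothetical and exist only to make the example self-contained.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Underlying type and NotPresent value follow the commit message;
// AgeBased and RoundRobin are illustrative placeholders.
enum ThreadArbitrationPolicy : std::int32_t {
    AgeBased = 0,
    RoundRobin = 1,
    NotPresent = -1
};

static_assert(std::is_same<std::underlying_type<ThreadArbitrationPolicy>::type,
                           std::int32_t>::value,
              "enum is expected to be backed by int32_t");

// Hypothetical state holder whose initial value is NotPresent.
struct StateSketch {
    std::int32_t threadArbitrationPolicy = ThreadArbitrationPolicy::NotPresent;
};

// Hypothetical query: a policy counts as programmed once it differs from NotPresent.
bool isPolicyProgrammed(const StateSketch &state) {
    return state.threadArbitrationPolicy != ThreadArbitrationPolicy::NotPresent;
}
```

A signed sentinel of -1 keeps "not set" distinct from every valid non-negative policy value.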
Lukasz Jobczyk f91ae9d59c Add multithread enqueue blit OOQ test
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-03-07 11:17:27 +01:00
Krystian Chmielewski 0ccce5a6d7 Zebin: set kernel barriers based on ext funcs
This change allows for modifying kernel's barrier count
based on called external functions metadata passed
via zeInfo section in zebin.

Added parsing external functions metadata.
Added resolving external functions call graph.
Added updating kernel barriers based on called external functions.
Added support for L0 dynamic link.

Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
2022-03-04 14:21:50 +01:00
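One way to picture the call-graph step in this entry is a traversal that collects the barrier requirement of every external function reachable from a kernel. The sketch below is an assumption-heavy illustration: `FunctionInfo`, `CallGraph`, and `resolveBarrierCount` are hypothetical names, and taking the maximum barrier count over reachable functions is an assumed policy, not something the commit message states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical per-function metadata, as it might be parsed from zeInfo.
struct FunctionInfo {
    std::uint8_t barrierCount = 0;
    std::vector<std::string> callees;
};

using CallGraph = std::unordered_map<std::string, FunctionInfo>;

// Walks everything reachable from `root` (iterative DFS, cycle-safe) and
// returns the largest barrier count seen. Unknown symbols are skipped.
std::uint8_t resolveBarrierCount(const CallGraph &graph, const std::string &root) {
    std::uint8_t result = 0;
    std::unordered_set<std::string> visited;
    std::vector<std::string> pending{root};
    while (!pending.empty()) {
        const std::string name = pending.back();
        pending.pop_back();
        if (!visited.insert(name).second) {
            continue; // already processed, e.g. recursion in the call graph
        }
        const auto it = graph.find(name);
        if (it == graph.end()) {
            continue;
        }
        result = std::max(result, it->second.barrierCount);
        for (const auto &callee : it->second.callees) {
            pending.push_back(callee);
        }
    }
    return result;
}
```

The visited set matters because external functions may call each other recursively.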
Compute-Runtime-Validation e526cc470b Revert "Add multithread enqueue blit OOQ test"
This reverts commit 0919cad885.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-03-03 16:06:15 +01:00
Lukasz Jobczyk 999c6424a4 While enqueue blit do not flush gpgpu if already flushed
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-03-03 13:01:57 +01:00
Patryk Wrobel f193efec2f Remove additional memory allocations for surfaces container
In constructor of CommandComputeKernel we had been doing multiple allocations
of memory on heap due to lack of call to std::vector copy-constructor or reserve
member function.

Furthermore, in production code there is only one place, where we create objects
of this type and we redundantly copy the local variable, which could be moved.

This change:
- ensures that constructor of CommandComputeKernel performs single allocation
in the worst case; in the best case, it does not allocate memory due to usage
of std::move on input parameter
- steals the memory of the local variable in place of usage of the constructor
to remove redundant copying and memory allocations
- uses reserve() method to reduce the number of allocations during creation
of this local variable

Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
2022-03-03 12:07:36 +01:00
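The allocation pattern described in this entry can be sketched with standard C++ alone. `Surface`, `CommandSketch`, and `makeCommand` are hypothetical stand-ins for the real types; the point is only the combination of reserve() on the local vector and std::move into the constructor.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a surface entry.
struct Surface {
    int id;
};

// Hypothetical stand-in for CommandComputeKernel: takes the container by
// value and moves it into the member, so an rvalue argument costs no allocation.
class CommandSketch {
  public:
    explicit CommandSketch(std::vector<Surface> input) : surfaces(std::move(input)) {}
    const std::vector<Surface> &getSurfaces() const { return surfaces; }

  private:
    std::vector<Surface> surfaces;
};

CommandSketch makeCommand(std::size_t count) {
    std::vector<Surface> local;
    local.reserve(count); // one allocation up front instead of repeated growth
    for (std::size_t i = 0; i < count; ++i) {
        local.push_back(Surface{static_cast<int>(i)});
    }
    // Hand the buffer over instead of copying it.
    return CommandSketch(std::move(local));
}
```

If the caller passes an lvalue without std::move, the by-value parameter makes exactly one copy, which corresponds to the "single allocation in the worst case" mentioned above.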
Dominik Dabek 6556d9a510 Improve caching in clSetKernelArgSVMPointer 2/n
Update allocIdMemoryManagerCounter on cache hit

Related-To: NEO-6737

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-03-02 18:10:52 +01:00
Dominik Dabek 7ab86d44d6 Improve caching in clSetKernelArgSVMPointer
Check allocId earlier and also reuse if allocationsCounter did not
change from last call.

Related-To: NEO-6737

Co-authored-by: Michal Mrozek <michal.mrozek@intel.com>

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-03-02 15:56:21 +01:00
Rafal Maziejuk 385c60948e Treat IMAGE1D_BUFFER type as buffer in BCS
This type of image needs to be treated as buffer in order to
allow width to be greater than 16383.

Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
Related-To: NEO-6134
2022-03-02 15:41:15 +01:00
Lukasz Jobczyk 0919cad885 Add multithread enqueue blit OOQ test
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-03-02 14:18:58 +01:00
Grzegorz Choinski b41f088fe9 rename neo_test_kernels to kernels_bin
Related-To: NEO-6172
Signed-off-by: Grzegorz Choinski <grzegorz.choinski@intel.com>
2022-03-02 13:46:26 +01:00
Michal Mrozek bfacd14b61 Remove not needed code.
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2022-03-02 13:10:15 +01:00
Szymon Morek 107db3a372 Add surfaceId variable to VASurface
Related-To: NEO-6693

Currently, if clCreateFromVA and clEnqueueAcquireVA
are called from different scopes (i.e. the surfaceID
passed to clCreateFromVA is destroyed by the time
clEnqueueAcquireVA is called), the enqueue results in
undefined behaviour. This PR fixes that.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2022-03-02 12:17:49 +01:00
Lukasz Jobczyk ea574d9b39 Optimize enqueue blit mutex
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-03-01 14:43:29 +01:00
Dominik Dabek 9bc364e7a7 Fix for clSetKernelArgSVMPointer optimization
Related-To: NEO-6737

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-03-01 12:56:04 +01:00
Lukasz Jobczyk 3c30e1b02b Add AssignBCSAtEnqueue debug flag
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-03-01 12:43:36 +01:00
Patryk Wrobel 05bf7a4315 Detect GPU hang in AsyncEventsHandler
This change introduces detection of GPU hangs
in asynchronous events handler. ULTs have also
been added to cover the new code.

Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
2022-03-01 11:41:12 +01:00
Dominik Dabek b9d8d8c0fd Optimize setKernelArgSVMPointer
If same pointer is already set, we don't need to set it again.

Related-To: NEO-6737

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-03-01 09:12:13 +01:00
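The early-out described in this entry can be sketched as a small per-argument cache. `ArgSlot` and `setArgSvmPointer` are hypothetical names; the real function also has to handle the allocation behind the pointer, which the later "Improve caching in clSetKernelArgSVMPointer" entries address with allocId checks.

```cpp
#include <cassert>

// Hypothetical per-argument cache entry.
struct ArgSlot {
    const void *lastPointer = nullptr;
    bool isSet = false;
    unsigned patchCount = 0; // counts how often the argument was actually re-patched
};

// Returns true when the argument was (re)programmed, false on a cache hit.
bool setArgSvmPointer(ArgSlot &slot, const void *pointer) {
    if (slot.isSet && slot.lastPointer == pointer) {
        return false; // same pointer already set, nothing to do
    }
    slot.lastPointer = pointer;
    slot.isSet = true;
    ++slot.patchCount;
    return true;
}
```

The separate `isSet` flag keeps a first call with a null pointer from being mistaken for a cache hit.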
Lukasz Jobczyk 090bfb9642 Reuse kernel allocation
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-02-28 19:26:19 +01:00
Patryk Wrobel 0ecc7c5e3b Detect GPU hangs in clFinish
This change introduces detection of GPU hangs in
clFinish function as well as unit tests to cover
the new code.

Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
2022-02-28 19:07:36 +01:00
Konstanty Misiak cf1bc3a2ba Disable EU fusion based on kernel properties from compiler
Related-To: NEO-6633

Signed-off-by: Konstanty Misiak <konstanty.misiak@intel.com>
2022-02-28 18:50:38 +01:00
Szymon Morek 205571999e Propagate VA syncSurface failure to API call
Currently, if the syncSurface method fails, the driver
still returns CL_SUCCESS. This PR fixes that.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2022-02-28 18:34:13 +01:00
Mateusz Jablonski 82e3b10c5a Fix typo
Related-To: NEO-5081
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-02-25 18:10:41 +01:00
Mateusz Jablonski a2386ad216 Correct programming of implicit args on pre-XeHp platforms
On pre-XeHp platforms implicit args aren't at the beginning of indirect data,
GPU address of implicit args buffer is programmed within cross thread data

Related-To: NEO-5081, IGC-4710
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-02-24 20:52:04 +01:00
Lukasz Jobczyk 0634aa3f1b Create resource with given address
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-02-24 17:06:19 +01:00
Patryk Wrobel 7f729b7f89 Detect GPU hang in clWaitForEvents
This change:
- moves NEO::WaitStatus to a separate file
- enables detection of GPU hang in clWaitForEvents
- adjusts most of blocking calls in CommandStreamReceiver to return WaitStatus
- adds ULTs to cover the new code

Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
2022-02-23 13:33:09 +01:00
Mateusz Jablonski ea6f089e17 Unify implicit args programming across APIs
Related-To: NEO-5081
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-02-23 11:52:47 +01:00
Mateusz Jablonski aae7858ed9 CMake: define enable core files only once
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-02-18 11:50:09 +01:00
Kacper Nowak b62675290f Refactor source level debugger notification in OCL. [2/2]
Refactor source level debugger notification about debug data in OCL
(build/link path).
- Share common code
- Remove unnecessary function(s)
- Zebin-related ULTs refactor

Related-To: NEO-6644
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
2022-02-17 21:40:05 +01:00
Mateusz Jablonski 4f71aaf595 Handle SIMD-1 scenario when programming local ids for implicit args
according to implicit args design for SIMD-1 local ids are one-by-one

Resolves: NEO-6692
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-02-17 17:48:54 +01:00
Kacper Nowak cd9cc53159 Correct setting usesStringMap flag in printf
This commit fixes setting usesStringMap flag for printf, taking into
account using indirect functions in legacy (non-zebinary) path. It also
adds new field to kernelDescriptor, specifying the binary type
(legacy/zebin).

Related-To: NEO-6604
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
2022-02-15 15:39:10 +01:00
Compute-Runtime-Validation c5c3e865f0 Revert "Fail build program on PVC with stateful accesses"
This reverts commit 9466113cef.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-02-14 18:55:14 +01:00
Kamil Kopryk 9466113cef Fail build program on PVC with stateful accesses
Related-To: NEO-6075

After this change the driver will fail clBuildProgram/zeModuleCreate API calls
whenever stateful access is discovered on PVC.
This is required because allocations greater than 4GB
will not work in this case.
If the user still wants to use stateful addressing mode, the
-cl-opt-smaller-than-4GB-buffers-only / -ze-opt-smaller-than-4GB-buffers-only
build option should be passed, but then the user cannot use
buffers greater than 4GB.


Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2022-02-14 13:44:22 +01:00
Katarzyna Cencelewska bd3e296278 Remove unused functions and struct
getProfilingTimerResolution, getParentObjectCounts,
ObjectCounts

Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2022-02-14 09:53:35 +01:00
Kacper Nowak 741ee49c9a Refactor source level debugger notification in OCL. [1/n]
Refactor source level debugger notification in OCL path - in build()
cl_program method. It fixes confusing debug data creation (mixed
legacy/zebin path) and thus incorrect kernel notification about debug
data.

Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
Related-To: NEO-6644
2022-02-10 17:47:05 +01:00
Kacper Nowak 1390af6efe Make usesStringMap flag independent of implicit args requirements
This commit removes the part of the condition that requires the
requiresImplicitArgs flag to be set in the kernel descriptor in order to
set the usesStringMap flag.

Related-To: NEO-6604
Signed-off-by: Kacper Nowak <kacper.nowak@intel.com>
2022-02-09 09:51:01 +01:00
Krystian Chmielewski d49c5d6185 OCL: Set target device product family
In OCL, the product family of the target device is not set,
which causes target device validation to fail in the
ZEBin path.
This change adds a function that sets all
necessary fields based on the provided hardware info.

Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
2022-02-08 16:49:28 +01:00
Krzysztof Gibala e518a8f3f9 Add debug flag ForceExtendedUSMBufferSize
Forces an extended buffer size by adding the given number of pages
(the flag value times pageSize) when the debug flag is >=1 in:
- clHostMemAllocINTEL
- clDeviceMemAllocINTEL
- clSharedMemAllocINTEL

Usage:
ForceExtendedUSMBufferSize=2
size += (2 * pageSize)

Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2022-02-07 11:44:31 +01:00
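The size adjustment from the usage note above can be written out as a small function. `applyForceExtendedUsmBufferSize` is a hypothetical name and the flag is passed in as a plain integer; only the rule "when the flag is >=1, add flag * pageSize" comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns the requested size extended by `flagValue`
// pages when the debug flag is >= 1, and the unchanged size otherwise.
std::size_t applyForceExtendedUsmBufferSize(std::size_t size,
                                            std::int32_t flagValue,
                                            std::size_t pageSize) {
    assert(pageSize > 0);
    if (flagValue >= 1) {
        size += static_cast<std::size_t>(flagValue) * pageSize;
    }
    return size;
}
```

With ForceExtendedUSMBufferSize=2 and a 4096-byte page, a 100-byte request becomes 100 + 2 * 4096 bytes.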
Bartosz Dunajski 4b0d986876 Move AllocationType enum out of GraphicsAllocation class
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2022-02-04 17:49:09 +01:00
Mateusz Jablonski b697d75695 Correct dimension order in local ids generated for implicit args
when local ids are generated by HW, use same dim order for runtime generation
move common logic to separated file

Related-To: NEO-5081
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-02-04 12:46:59 +01:00
Compute-Runtime-Validation 6f62a784e1 Revert "Check IndirectStatelessCount from igc"
This reverts commit 5e62df4f8e.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-02-04 12:15:37 +01:00
Katarzyna Cencelewska 3d0c065183 Remove device enqueue part 16
-delete old unused flags

Related-To: NEO-6559
Signed-off-by: Katarzyna Cencelewska <katarzyna.cencelewska@intel.com>
2022-02-03 19:38:41 +01:00
Lukasz Jobczyk 9ff1307b4b Fix optimize timestamp packet dependencies
-program barrier after global fence allocation is programmed
-do not double barrier timestamp in blit enqueue
-flush GPGPU while submitting to BCS when barrier requested

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-02-03 16:27:09 +01:00