A buffer's default bytesPerPixel value is always 1. Because
IMAGE1D_BUFFER is originally an image, its X coordinate needs to be
multiplied by bytesPerPixel (implied by the image format)
in both copySize and (src/dst)Size.
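
A minimal sketch of the scaling described above. The helper name
scaleXToBytes and the use of std::array are assumptions made for
illustration; they are not taken from the driver sources.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical helper (not a NEO function): converts the X component of a
// size or region from pixels to bytes, leaving Y and Z untouched.
std::array<std::size_t, 3> scaleXToBytes(std::array<std::size_t, 3> sizeInPixels,
                                         std::size_t bytesPerPixel) {
    assert(bytesPerPixel > 0);
    sizeInPixels[0] *= bytesPerPixel;
    return sizeInPixels;
}
```

With a 4-byte pixel format, a copySize of {16, 1, 1} pixels becomes
{64, 1, 1} bytes; the same scaling would apply to (src/dst)Size.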
Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
Related-To: NEO-6134
This change introduces detection of GPU hangs in blocking
calls to enqueueHandler() function. Moreover, usages of
this function template have been revised and adjusted to
check the exit code. Furthermore, enqueueBlit() and
dispatchBcsOrGpgpuEnqueue() functions now return a value.
ULTs have been added to cover new cases.
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
Related-To: NEO-6681
Calculate the maximum debug surface size for each platform based on
the number of available threads and the debug surface layout.
Related-To: NEO-6676
Signed-off-by: Jitendra Sharma <jitendra.sharma@intel.com>
This change introduces detection of GPU hangs
in CommandMapUnmap::submit() as well as in Event::submitCommand().
ULTs have been added to cover the new code.
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
A buffer's default bytesPerPixel value is always 1. Because
IMAGE1D_BUFFER is originally an image, its X coordinate needs to be
multiplied by bytesPerPixel in both copySize and (src/dst)Size.
Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
Related-To: NEO-6134
Change ThreadArbitrationPolicy::NotPresent value to -1.
Update initial values to ThreadArbitrationPolicy::NotPresent.
Related-To: NEO-6728
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
This change allows for modifying kernel's barrier count
based on called external functions metadata passed
via zeInfo section in zebin.
Added parsing external functions metadata.
Added resolving external functions call graph.
Added updating kernel barriers based on called external functions.
Added support for L0 dynamic link.
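
A rough sketch of how a kernel's barrier count could be derived from
the call graph. It assumes the resolved value is the maximum barrier
count over the kernel and every external function it reaches; the
FunctionInfo and CallGraph types and the resolveBarrierCount name are
hypothetical and do not mirror the zeInfo parser.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <unordered_set>
#include <vector>

// Hypothetical metadata for one kernel or external function.
struct FunctionInfo {
    std::uint8_t barrierCount = 0;
    std::vector<std::string> callees;
};

using CallGraph = std::unordered_map<std::string, FunctionInfo>;

// Depth-first walk; the visited set keeps recursive call chains from looping.
std::uint8_t resolveBarrierCount(const CallGraph &graph, const std::string &name,
                                 std::unordered_set<std::string> &visited) {
    const auto it = graph.find(name);
    if (it == graph.end() || !visited.insert(name).second) {
        return 0; // unknown or already visited: contributes nothing to the max
    }
    std::uint8_t result = it->second.barrierCount;
    for (const auto &callee : it->second.callees) {
        result = std::max(result, resolveBarrierCount(graph, callee, visited));
    }
    return result;
}

// Convenience overload that starts a fresh traversal from a kernel.
std::uint8_t resolveBarrierCount(const CallGraph &graph, const std::string &kernelName) {
    std::unordered_set<std::string> visited;
    return resolveBarrierCount(graph, kernelName, visited);
}
```

Taking the maximum rather than a sum is an assumption of this sketch.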
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
In the constructor of CommandComputeKernel we performed multiple heap
allocations because neither the std::vector copy constructor nor the reserve()
member function was used.
Furthermore, production code creates objects of this type in only one place,
where the local variable is redundantly copied although it could be moved.
This change:
- ensures that constructor of CommandComputeKernel performs single allocation
in the worst case; in the best case, it does not allocate memory due to usage
of std::move on input parameter
- steals the memory of the local variable in place of usage of the constructor
to remove redundant copying and memory allocations
- uses reserve() method to reduce the number of allocations during creation
of this local variable
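
The pattern can be sketched roughly as follows. KernelCommandSketch,
Surface and buildCommand are hypothetical stand-ins, not the real
CommandComputeKernel interface.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

struct Surface {}; // placeholder type for illustration

class KernelCommandSketch {
  public:
    // Taking the vector by value lets callers either copy (one allocation)
    // or move (no allocation) into the parameter, which is then moved again.
    explicit KernelCommandSketch(std::vector<Surface *> inputSurfaces)
        : surfaces(std::move(inputSurfaces)) {}

    std::size_t surfaceCount() const { return surfaces.size(); }
    Surface *const *data() const { return surfaces.data(); }

  private:
    std::vector<Surface *> surfaces;
};

// Builds the local vector with a single up-front allocation and hands its
// storage over to the command instead of copying it.
KernelCommandSketch buildCommand(Surface *surface, std::size_t count) {
    assert(surface != nullptr);
    std::vector<Surface *> local;
    local.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        local.push_back(surface);
    }
    return KernelCommandSketch(std::move(local));
}
```

Passing an lvalue without std::move would still work here, at the cost
of one copy into the by-value parameter.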
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
Check allocId earlier and also reuse if allocationsCounter did not
change from last call.
Related-To: NEO-6737
Co-authored-by: Michal Mrozek <michal.mrozek@intel.com>
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
This type of image needs to be treated as buffer in order to
allow width to be greater than 16383.
Signed-off-by: Rafal Maziejuk <rafal.maziejuk@intel.com>
Related-To: NEO-6134
Related-To: NEO-6693
Currently, if clCreateFromVA and clEnqueueAcquireVA
are called from different scopes (i.e. the surfaceID
passed to clCreateFromVA is already destroyed when
clEnqueueAcquireVA is called), the enqueue results in
undefined behaviour. This PR fixes that.
Signed-off-by: Szymon Morek <szymon.morek@intel.com>
This change introduces detection of GPU hangs
in asynchronous events handler. ULTs have also
been added to cover the new code.
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
This change introduces detection of GPU hangs in
clFinish function as well as unit tests to cover
the new code.
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
On pre-XeHp platforms implicit args aren't at the beginning of indirect data;
the GPU address of the implicit args buffer is programmed within cross thread data.
Related-To: NEO-5081, IGC-4710
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
This change:
- moves NEO::WaitStatus to a separate file
- enables detection of GPU hang in clWaitForEvents
- adjusts most of blocking calls in CommandStreamReceiver to return WaitStatus
- adds ULTs to cover the new code
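
A simplified sketch of the idea: a blocking wait reports a status, and
the API layer turns a detected hang into an error code. The enumerator
names, the polling callbacks and the choice of -5 (the numeric value of
CL_OUT_OF_RESOURCES) are assumptions of this sketch, not the actual
NEO::WaitStatus definition.

```cpp
#include <functional>

// Hypothetical stand-in for a wait status type.
enum class WaitStatusSketch { notReady, ready, gpuHang };

constexpr int sketchSuccess = 0;         // same value as CL_SUCCESS
constexpr int sketchOutOfResources = -5; // same value as CL_OUT_OF_RESOURCES

// Polls until the work completes, a hang is reported, or the poll budget ends.
WaitStatusSketch waitForCompletion(const std::function<bool()> &isCompleted,
                                   const std::function<bool()> &isGpuHung,
                                   int maxPolls) {
    for (int i = 0; i < maxPolls; ++i) {
        if (isCompleted()) {
            return WaitStatusSketch::ready;
        }
        if (isGpuHung()) {
            return WaitStatusSketch::gpuHang;
        }
    }
    return WaitStatusSketch::notReady;
}

// Maps the wait result to an API-level return code.
int toApiResult(WaitStatusSketch status) {
    return status == WaitStatusSketch::gpuHang ? sketchOutOfResources : sketchSuccess;
}
```

Callers of a blocking function then check the returned status instead
of assuming that the wait always succeeds.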
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>