This is fixed reupload of this commit after auto revert
With this commit OpenCL will track if external host memory is used from
few threads and will secure to update task count in all threads before
destroing allocation.
Resolves: NEO-6807
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
With this commit OpenCL will track if external host memory is used from
few threads and will secure to update task count in all threads before
destroing allocation.
Resolves: NEO-6807
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
- add debugger support to imm cmd lists
- add debugger support to flushTask
Related-To: NEO-6845
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
- helper sets all SbaAddresses for debugger in
EncodeStateBaseAddress<GfxFamily>::setSbaAddressesForDebugger()
- change DebuggerL0::captureStateBaseAddress() to take
LinearStream
- move getSbaTrackingCommandsSize() to Debugger class
Related-To: NEO-6845
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
Stack vector will not cause dynamic allocations in most circumstances
ie. number of root device indices not more than 16
Related-To: NEO-6837
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
Remove parameter requiredThreadArbitrationPolicy
from PreambleHelper::programPreamble function.
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
Logic related to programming non coherent and thread arbitration policy for
gens 9 and 11 has been moved to EncodeComputeMode object, where similar
logic for gens gen12lp and newer is located.
Functions PreambleHelper::programThreadArbitration and
PreambleHelper::getThreadArbitrationCommandsSize have been removed.
Redundant setForceNonCoherent call has been removed from XE HPG
Related-To: NEO-6728
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
Change ThreadArbitrationPolicy::NotPresent value to -1
Update initial values to ThreadArbitrationPolicy::NotPresent
Related-To: NEO-6728
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
Setting one of bitfields requires read from local memory which is very slow.
This is not needed for devices that have coherent L3.
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
Some devices do not need dcFlush.
Setting it prevents further optimization of pipe controls which
are not needed.
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
This change:
- moves NEO::WaitStatus to a separate file
- enables detection of GPU hang in clWaitForEvents
- adjusts most of blocking calls in CommandStreamReceiver to return WaitStatus
- adds ULTs to cover the new code
Related-To: NEO-6681
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
-program barrier after global fence allocation is programmed
-do not double barrier timestamp in blit enqueue
-flush GPGPU while submitting to BCS when barrier requested
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
This change uses DRM_IOCTL_I915_GET_RESET_STATS to detect
GPU hangs. When such situation is encountered, then
zeCommandQueueSynchronize returns ZE_RESULT_ERROR_DEVICE_LOST.
Related-To: NEO-5313
Signed-off-by: Patryk Wrobel <patryk.wrobel@intel.com>
For non-kernel submission, TAM was incorrectly reprogrammed to default
mode. Correct programming should reuse value from previous submission.
Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
Rename MemorySynchronizationCommands::isDcFlushAllowed
to MemorySynchronizationCommands::getDcFlushEnable
Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
Related-To: NEO-6432
This change applies WA that always programs all fields in SCM for
gen12lp. Also for those platforms Force Non-Coherent is set to 0x2.
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>