Remove not needed includes from command_queue.h
Change-Id: I45963bf005471bd7716d55471474299a15e27b62
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
EnqueueOperation<GfxFamily>::getTotalSizeRequiredCS was ULTs-only.
Replace with the real CS size estimation from getCommandStream.
Change-Id: I4d15d342eb5edff6511acc9c80e13e9cc92d81ac
Updating files modified in 2018 only. Older files remain with old style
copyright header
Change-Id: Ic99f2e190ad74b4b7f2bd79dd7b9fa5fbe36ec92
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
- if Tag allocation was not yet made resident then makeCoherent will fetch
uninitialized data, causing finish to return earlier then it should
causing later synchronization issues.
- simplify rest of method, remove redundant code.
Change-Id: I7bfcbd9f2d7170f41473a97f51856d82671b6638
- Tag allocator: reference count tracking
- Obtain tag by command queue and pass to Event if exist during enqueue
- Handle Timestamp Packet lifetime on Event and CmdQueue destruction
Change-Id: I9a5969830ea7a9d729e6f70519d8c28ff70fcf06
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
This commit adds notifications to enqueue read buffer and image calls
and setters/getters to mark/check if an allocation is dumpable.
Change-Id: I123f24752d2a86abcf934e0d404f4e0ecf1729cc
- Apply layout for images only when Z size is equal to 1
- Fix generating local ids for local workgroup size
when any size is not power of 2
- Revert commit c53c09da45
Change-Id: Ie745782fafce2facbd877e3e33e4ba347cb2b09e
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
- Command Stream Receiver should be used instead for locking.
- Remove not needed synchronization in clSetUserEventStatus
Change-Id: I17050dc70cb0be03b2003043a9666ba8df1a83c9
- Mark Kernel for aux translation
- Initial implementation of dispatchAuxTranslation for future use
Change-Id: Ifca1c9a893876eecc5678cdc4f564b2bfcae959a
- replace createGraphicsAllocationWithRequiredBitness with more general
methodallocateGraphicsMemoryInPreferredPool based on passed
AllocationData
- proper flags for allocation selected based on AllocationType
- remove allocateGraphicsMemory(size_t size, size_t alignment)
and use allocateGraphicsMemory(size_t size) instead where default
alignment is sufficient, otherwise use full options version:
allocateGraphicsMemory(size_t size, size_t alignment,
bool forcePin, bool uncacheable)
Change-Id: I2da891f372ee181253cb840568a61b33c0d71fc9
- change GraphicsAllocatoin::AllocationType to scoped enumeration
so that ALLOCATION_TYPE_ prefix in every enum value can be removed
- all accesses are typed (example AllocationType::IMAGE)
- Rename allocationType to AllocationUsage to eliminate confusion
with multiple AllocationType enums / types
Change-Id: I16003297ecfcb0aaa5779ad00706c5d983914bbe
This commit adds a capability to selectively enable/disable AUB capture,
i.e. by toggling the registry key from the outside or specifying the filter
with a kernel name and/or kernel start index and kernel end index.
Change-Id: Ib5d39c21863fbc4a95aa73c949b9779ff993de0f
- This allows applications to force the N:1 aggregation by creating out
of order queue.
- That switches csr to N:1 submission model where commands from multiple
command streams may be aggregated.
- That forces scenarios returning an event to be aggregated as well.
Change-Id: I8fd8d7f88bb2665234ee90870133120b206710a8
-This is required to enable N:1 submission model.
-If heaps are coming from different command queues that always
mean that STATE_BASE_ADDRESS needs to be reloaded
-In order to not emit any non pipelined state in CSR, this change
moves the ownership of IndirectHeap to one centralized place which is
CommandStreamReceiver
-This way when there are submissions from multiple command queues then
they reuse the same heaps, therefore preventing SBA reload
Change-Id: I5caf5dc5cb05d7a2d8766883d9bc51c29062e980
- ThkWrapper had uninitialized mFunc member, setting it
to nullptr
- D3DSurface could dereference null image pointer,
adding validateUpdateData method in SharingHandler
that may return CL_INVALID_MEM_OBJECT if memObject is invalid
Change-Id: Iaa4499bcea47baca156c9d28be4c93ba4f0e1ebb
We don't need BuiltinDispatchInfoBuilder in every place where built ins
are used. specifically in .cpp files generated from kernel binary.
Change-Id: Ie739951cdc93873993f78ad14cee656122af51fd
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
- Move indirect heap to internal allocator domain.
- Add logic in getIndirectHeap to allocate with proper API depending on
heap type
- Add State base Address programming, reflecting that now Indirect Object
Heap is placed in 4GB domain.
- For AddPatchInfoCommentsForAUBDump mode , keep all heaps in non 4GB mode.
Change-Id: I6862f6a249e444d0d6cfe7e499a10d43f284553e
- Internal allocations may now coexists with non internal on reusable list.
- Caller now specifies if internal allocation is needed.
- If criteria are not met , then allocation is not returned.
Change-Id: I7da3a4f944768b7c8a873e44fd47248f1d76bf9e
There are differences in qPitch programming between Gen8 vs Gen9+
devices and this requires special operation when image is zero-copy.
For Gen8 qPitch is distance in rows while Gen9+ it is in pixels.
Minimum value of qPitch is 4 and this causes slicePitch = 4*rowPitch on
Gen8.
To allow zero-copy we have to tell what is correct value rowPitch which
should equal to slicePitch.
Change-Id: I58dea004e3c7f9f4dfabd154d02749c15b6b0246
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
When inheriting task count from parent events,
don't take into account externally synchronized events
Change-Id: I52d861e482669a18e2aca499c813716bb4951b74
- change the way we handle blocked commands.
- instead of allocating CPU pointer and populating it with commands, create
real IndirectHeap that may be later submitted to the GPU
- that removes a lot of copy operations that were happening on submit time
- for device enqueue, this requires dsh & shh to be passed directly to the
underlying commands, in that scenario device queue buffers are not used
Change-Id: I1124a8edbb46777ea7f7d3a5946f302e7fdf9665
* adding support for map/unmap
* adding support for origin/region validation with mipmaps
* fixing slices returned in map/unmap
* removing ambiguity around mipLevel naming
* enabling cl_khr_mipmap_image in current shape
* enabling cl_khr_mipmap_image_writes in current shape
* fixing CompileProgramWithReraFlag test
Change-Id: I0c9d83028c5c376f638e45151755fd2c7d0fb0ab
- While estimating the required size of Indirect Object Heap we were not
handling properly the lack of local ids case
- In such case we should allocate one GRF per HW thread that will be unused
Change-Id: Ibcd359e431e3ffd9d55628ac7cf7eeefad72e7ba
- cpu virtual address was used instead of gpu va
- this caused incorrect behaviour of TBX server when
special heap allocator assigning GPU addresses was used
Change-Id: I2328cf2441be797311fd6a3c7b331b0fff79d4fc
Any CPU related updates such as clEnqueueMapBuffer or similar
need to trigger a re-dump of memory prior to the next clEnqueue call.
Change-Id: I7b31e559278e92ff55b6ebab8ef4190caef1ebc0