Commit Graph

327 Commits

Author SHA1 Message Date
Maciej Dziuban
b91c14f70e Delete Device::getBuiltIns()
Change-Id: I9d1968dfb2ba4a56020fd17152119add726106e1
Signed-off-by: Maciej Dziuban <maciej.dziuban@intel.com>
2018-08-22 16:54:53 +02:00
Mateusz Jablonski
6286f245a1 Fix generation local ids for image layout with local workgroup size 12x12x1
Change-Id: Ib723b132b570d8cfb3f72f32ddadde869607c354
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2018-08-22 14:32:07 +02:00
Maciej Dziuban
e0e48203d2 Move BuiltIns to ExecutionEnvironment
Change-Id: Ib2a1b82cc7858c898bb32820aad106a01d1325ad
Signed-off-by: Maciej Dziuban <maciej.dziuban@intel.com>
2018-08-21 23:15:47 +02:00
Dunajski, Bartosz
931b462ee1 Disable NonAux to Aux translation for Parent Kernel
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
Change-Id: I863608fe3652e7e777a1e841d79b5b56e7362a3f
2018-08-21 15:12:25 +02:00
Mateusz Jablonski
7afba8d50b Cleanup after adding new local ids layout for images
- Apply layout for images only when Z size is equal to 1
- Fix generating local ids for local workgroup size
  when any size is not power of 2
- Revert commit c53c09da45

Change-Id: Ie745782fafce2facbd877e3e33e4ba347cb2b09e
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2018-08-21 11:27:30 +02:00
Dunajski, Bartosz
044255e9bd Pick Main Kernel for LWS and numWG in dispatchWalker()
Change-Id: I4fd0746ec77890ceacbf333966bb00a4ea99b186
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2018-08-20 13:51:35 +02:00
Dunajski, Bartosz
c7a49666d5 Refactor querying Main and Parent Kernel from MultiDispatchInfo
Change-Id: I723d91f2f445bc7af1bcb0de46f8ac07837f3449
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2018-08-17 13:51:32 +02:00
Mrozek, Michal
c53c09da45 Limit local work sizes where local ids limit is applied.
Change-Id: Id9a84d6a7d4530344771f48fd278cff9ab2dd927
2018-08-16 12:34:09 +02:00
Mateusz Jablonski
47f3dad619 Apply (2/4)x4x1 layout when generating local ids for kernel with images only
- For SIMD8 apply 2x4x1 layout
- For SIMD16/SIMD32 apply 4x4x1 layout

Change-Id: I31bceb49387011c66da5f96ad2a71125b96d4cda
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2018-08-14 12:22:20 +02:00
Dunajski, Bartosz
a5950500a3 Aux translation [4/n]: Lock BuiltIn Kernel + refactor BuiltIns locking
Change-Id: Ic7dc9b86a4aa5f93f1c4bcdf80b9598ecdff9713
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2018-08-14 10:56:16 +02:00
Lukasz Towarek
95e28faca0 Fix division by zero in enqueueKernel
Change-Id: I8e7d3db39805133a5af545e65a94fb19433a2a41
2018-08-14 09:02:17 +02:00
Dunajski, Bartosz
6ca84c278a Aux translation [3/n]: Dispatch AuxTranslation builtin when required
Change-Id: I9bd0294de7980ac01ebb3c2d696eba6fd6a456ec
2018-08-13 12:15:30 +02:00
Dunajski, Bartosz
b4f53fdfa7 Pick applicable buffers for aux translation
Change-Id: I60a28cd9e0dec61120b1ae5c42dfe0cb852eb387
2018-08-08 09:23:51 +02:00
Chodor, Jaroslaw
c10d0d79f5 Workgroup walk order
Change-Id: Id02db6a383e21dc17be64655e7f51a84103b2e0b
2018-08-07 13:54:10 +02:00
Mrozek, Michal
d80dbb1ae0 Do not take ownership on device.
- Command Stream Receiver should be used instead for locking.
- Remove not needed synchronization in clSetUserEventStatus

Change-Id: I17050dc70cb0be03b2003043a9666ba8df1a83c9
2018-08-07 09:29:50 +02:00
Dunajski, Bartosz
ec6f0f9f86 Aux translation [1/n]
- Mark Kernel for aux translation
- Initial implementation of dispatchAuxTranslation for future use

Change-Id: Ifca1c9a893876eecc5678cdc4f564b2bfcae959a
2018-08-07 09:07:25 +02:00
Mateusz Jablonski
9ae4f390d1 Remove command queue, completion stamp and device from mem obj
Remove setCompletionStamp function from Surface

Change-Id: I25f3040a91892495e55cb4924f1538276de6264e
2018-08-01 16:17:13 +02:00
Mrozek, Michal
f60847b64e Pass device to flushTask.
- do not obtain it from memory manager

Change-Id: Icc7c03dc925c69ec5932c5812151ac28dc34d20d
2018-08-01 14:11:06 +02:00
Dunajski, Bartosz
239ebf9eab Improve AllocationType operations: dont do bit operations on enums
Change-Id: Ie70ca9e2a93ec80b1cd655bad622db9e12abb7f7
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2018-07-20 09:12:58 +02:00
Mrozek, Michal
2a1feaffe9 Initialize stack variables.
Change-Id: I3fda7163cb649fe4754c46d83c641921593e1823
2018-07-19 16:19:17 +02:00
Mrozek, Michal
1dc50172d3 Remove redundant casts.
Change-Id: I9bfe4615ad0eea739ad0a780426abc007163c961
2018-07-17 19:02:36 +02:00
Hoppe, Mateusz
55a045ebe1 Refactor graphics memory allocation scheme
- replace createGraphicsAllocationWithRequiredBitness with more general
methodallocateGraphicsMemoryInPreferredPool based on passed
 AllocationData
- proper flags for allocation selected based on AllocationType

- remove allocateGraphicsMemory(size_t size, size_t alignment)
and use allocateGraphicsMemory(size_t size) instead where default
alignment is sufficient, otherwise use full options version:
allocateGraphicsMemory(size_t size, size_t alignment,
 bool forcePin, bool uncacheable)

Change-Id: I2da891f372ee181253cb840568a61b33c0d71fc9
2018-07-11 15:48:05 +02:00
Hoppe, Mateusz
684b1d75ba Refactor GraphicsAllocation::AllocationType and allocationType enums
- change GraphicsAllocatoin::AllocationType to scoped enumeration
so that ALLOCATION_TYPE_ prefix in every enum value can be removed
- all accesses are typed (example AllocationType::IMAGE)
- Rename allocationType to AllocationUsage to eliminate confusion
with multiple AllocationType enums / types

Change-Id: I16003297ecfcb0aaa5779ad00706c5d983914bbe
2018-07-06 13:00:08 +02:00
Milczarek, Slawomir
eb1b5ded9c Add support for AUB subcapture (filter and toggle modes)
This commit adds a capability to selectively enable/disable AUB capture,
i.e. by toggling the registry key from the outside or specifying the filter
with a kernel name and/or kernel start index and kernel end index.

Change-Id: Ib5d39c21863fbc4a95aa73c949b9779ff993de0f
2018-06-15 13:02:27 +02:00
Artur Harasimiuk
75ab0c6fe1 Switch clang-format to 6.0
Change-Id: Id96d1f47fb3d479d10d1022f1259dc030a148192
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2018-06-14 09:45:00 +02:00
Zdunowski, Piotr
0cc10e47cc Use device instead of context when programing surface state.
Change-Id: I67615036d373cf905762a43a92562bf3d84854a5
2018-06-11 17:20:11 +02:00
Woloszyn, Wojciech
8a488ad52f Fix reported row/slicePitch for mip-maps
- use information from gmm correctly
- modify computation on gen8

Change-Id: Iaefcc20ce9436ef70cd2f4bc36654932c4b5af49
2018-05-22 10:36:54 +02:00
Zdanowicz, Zbigniew
33fee15711 Change interface to pass Interface Descriptor Data as pointer
Change-Id: I0f33109b800a7607206954bb1e5cb0826290e6f3
2018-05-11 14:35:53 +02:00
Mrozek, Michal
34ff5852eb Add capability to csr to allow N:1 aggregation when ooq is created.
- This allows applications to force the N:1 aggregation by creating out
of order queue.
- That switches csr to N:1 submission model where commands from multiple
command streams may be aggregated.
- That forces scenarios returning an event to be aggregated as well.

Change-Id: I8fd8d7f88bb2665234ee90870133120b206710a8
2018-04-26 15:41:20 +02:00
Mrozek, Michal
8d2df3c332 Move indirect heaps from command queues to csr.
-This is required to enable N:1 submission model.
-If heaps are coming from different command queues that always
mean that STATE_BASE_ADDRESS needs to be reloaded
-In order to not emit any non pipelined state in CSR, this change
moves the ownership of IndirectHeap to one centralized place which is
CommandStreamReceiver
-This way when there are submissions from multiple command queues then
they reuse the same heaps, therefore preventing SBA reload

Change-Id: I5caf5dc5cb05d7a2d8766883d9bc51c29062e980
2018-04-26 14:05:40 +02:00
Pawel Wilma
a0c044e6d2 Extend batch buffer flattening in AubCSR to BatchedDispatch mode
- batch buffer flatening in batched mode
    - added MI_USER_INTERRUPT command
    - added GUC Work Queue Item

Change-Id: I35142da34b30d3006bb4ffc1521db7f6ebe68ebc
2018-04-26 12:45:02 +02:00
Zdanowicz, Zbigniew
d94b853c32 Separate HW commands from class declaration and interface
Change-Id: Ia49098171f4b7814c42a35686354713a322c9df7
2018-04-25 14:03:00 +02:00
Mrozek, Michal
ce8c44cae3 Add check for local work group size in clEnqueueNDRangeKernel call.
- Incoming local work group size cannot exceed device capabilities.

Change-Id: I89a7503155c71443e3ebc630debb5d5b466c6cb5
2018-04-20 08:16:16 +02:00
Hoppe, Mateusz
83160213f0 Fix problems in thkWrapper and SharingHandler
- ThkWrapper had uninitialized mFunc member, setting it
to nullptr

- D3DSurface could dereference null image pointer,
adding validateUpdateData method in SharingHandler
that may return CL_INVALID_MEM_OBJECT if memObject is invalid

Change-Id: Iaa4499bcea47baca156c9d28be4c93ba4f0e1ebb
2018-04-19 15:04:38 +02:00
Artur Harasimiuk
75d497a9a9 separate BuiltinDispatchInfoBuilder from built_ins.h
We don't need BuiltinDispatchInfoBuilder in every place where built ins
are used. specifically in .cpp files generated from kernel binary.

Change-Id: Ie739951cdc93873993f78ad14cee656122af51fd
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2018-04-19 12:32:13 +02:00
Mrozek, Michal
d900bdffc6 [33/n] Internal 4GB allocator.
- Move indirect heap to internal allocator domain.
- Add logic in getIndirectHeap to allocate with proper API depending on
heap type
- Add State base Address programming, reflecting that now Indirect Object
Heap is placed in 4GB domain.
- For AddPatchInfoCommentsForAUBDump mode , keep all heaps in non 4GB mode.

Change-Id: I6862f6a249e444d0d6cfe7e499a10d43f284553e
2018-04-19 08:13:48 +02:00
Mrozek, Michal
8583c68c8c [29/n] Internal 4GB allocator.
- Internal allocations may now coexists with non internal on reusable list.
- Caller now specifies if internal allocation is needed.
- If criteria are not met , then allocation is not returned.

Change-Id: I7da3a4f944768b7c8a873e44fd47248f1d76bf9e
2018-04-17 06:42:56 +02:00
Mrozek, Michal
87b8b6e261 [28/n] Internal 4GB allocator.
Avoid default parameter to getIndirectHeap.

Change-Id: I105ceaa4b5e9b23ce8dc96631410b9535e5a44e0
2018-04-16 17:56:49 +02:00
Artur Harasimiuk
cb064abb04 fix mapImage for 1D_ARRAY
There are differences in qPitch programming between Gen8 vs Gen9+
devices and this requires special operation when image is zero-copy.

For Gen8 qPitch is distance in rows while Gen9+ it is in pixels.
Minimum value of qPitch is 4 and this causes slicePitch = 4*rowPitch on
Gen8.

To allow zero-copy we have to tell what is correct value rowPitch which
should equal to slicePitch.

Change-Id: I58dea004e3c7f9f4dfabd154d02749c15b6b0246
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
2018-04-16 16:13:51 +02:00
Zdanowicz, Zbigniew
e51cb6bd0b Separate struct EnqueueOperation declaration and implementation
Change-Id: I537660867a1c98f957280237c14b7a1554fce3db
2018-04-10 16:36:48 +02:00
Chodor, Jaroslaw
6bf4135def Fix for externally synchronized events
When inheriting task count from parent events,
don't take into account externally synchronized events

Change-Id: I52d861e482669a18e2aca499c813716bb4951b74
2018-04-09 12:12:58 +02:00
Mrozek, Michal
ffa9b097f5 [26/n] Internal 4GB allocator.
- change the way we handle blocked commands.
- instead of allocating CPU pointer and populating it with commands, create
real IndirectHeap that may be later submitted to the GPU
- that removes a lot of copy operations that were happening on submit time
- for device enqueue, this requires dsh & shh to be passed directly to the
underlying commands, in that scenario device queue buffers are not used

Change-Id: I1124a8edbb46777ea7f7d3a5946f302e7fdf9665
2018-04-09 10:47:37 +02:00
Chodor, Jaroslaw
0a97dfbb2f [1/n] Mipmap support
* adding support for map/unmap
* adding support for origin/region validation with mipmaps
* fixing slices returned in map/unmap
* removing ambiguity around mipLevel naming
* enabling cl_khr_mipmap_image in current shape
* enabling cl_khr_mipmap_image_writes in current shape

* fixing CompileProgramWithReraFlag test

Change-Id: I0c9d83028c5c376f638e45151755fd2c7d0fb0ab
2018-04-05 01:09:27 +02:00
Zdanowicz, Zbigniew
b6b92ae808 Create GpgpuWalkerHelper class
Change-Id: Ia9aa7b816356aff57234b46ea3509b6bd9b7f14b
2018-04-04 16:42:16 +02:00
Mrozek, Michal
cbcf77ae49 Fix out of bound problem while estimating Indirect Object Heap.
- While estimating the required size of Indirect Object Heap we were not
handling properly the lack of local ids case
- In such case we should allocate one GRF per HW thread that will be unused

Change-Id: Ibcd359e431e3ffd9d55628ac7cf7eeefad72e7ba
2018-04-03 17:45:28 +02:00
Hoppe, Mateusz
4703417813 Use correct virtual addresses in TBX CSR makeCoherent method
- cpu virtual address was used instead of gpu va
- this caused incorrect behaviour of TBX server when
special heap allocator assigning GPU addresses was used

Change-Id: I2328cf2441be797311fd6a3c7b331b0fff79d4fc
2018-04-03 15:54:07 +02:00
Milczarek, Slawomir
b56289a507 User space AUBs capable of memory re-dumps on CPU-side memory modifications.
Any CPU related updates such as clEnqueueMapBuffer or similar
need to trigger a re-dump of memory prior to the next clEnqueue call.

Change-Id: I7b31e559278e92ff55b6ebab8ef4190caef1ebc0
2018-04-03 15:40:29 +02:00
Mrozek, Michal
e4c25f11de [25/n] Internal 4GB allocator.
- Do not obtain pattern allocation from reusable pool.
- This is due to the fact that it may contain allocations from internal
heap, which cannot be used for arguments declared as kernel argument.

Change-Id: I6c73445c409edc4ce25f8d8eba966f512dfd6cc9
2018-03-30 14:59:11 +02:00
Mrozek, Michal
5dc0a7c731 Remove default value for dispatchWalker parameter.
Change-Id: I0676a353a4364339664edc416e36da37a345a4f6
2018-03-30 12:57:42 +02:00
Mrozek, Michal
de315db953 [24/n] Internal 4GB allocator.
- Refactor tests for better maintenance
- Remove duplicated code.

Change-Id: I154cad43610497d2e1cabf99217820735d3868cd
2018-03-30 09:12:08 +02:00