- Command Stream Receiver should be used for locking instead.
- Remove unneeded synchronization in clSetUserEventStatus
Change-Id: I17050dc70cb0be03b2003043a9666ba8df1a83c9
- Mark Kernel for aux translation
- Initial implementation of dispatchAuxTranslation for future use
Change-Id: Ifca1c9a893876eecc5678cdc4f564b2bfcae959a
This commit adds the capability to selectively enable/disable AUB capture,
e.g. by toggling the registry key externally or by specifying a filter
with a kernel name and/or a kernel start index and kernel end index.
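A minimal sketch of such a filter check, using illustrative names only
(AubCaptureFilter and shouldCaptureKernel are not the actual driver symbols):

    #include <cstdint>
    #include <string>

    // Illustrative sketch; names do not match the actual driver code.
    struct AubCaptureFilter {
        std::string kernelName;             // empty matches any kernel name
        uint32_t kernelStartIdx = 0;        // first kernel (by enqueue order) to capture
        uint32_t kernelEndIdx = UINT32_MAX; // last kernel to capture
    };

    bool shouldCaptureKernel(const AubCaptureFilter &filter,
                             const std::string &name, uint32_t kernelIdx) {
        bool nameMatches = filter.kernelName.empty() || filter.kernelName == name;
        bool indexInRange = kernelIdx >= filter.kernelStartIdx &&
                            kernelIdx <= filter.kernelEndIdx;
        return nameMatches && indexInRange;
    }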
Change-Id: Ib5d39c21863fbc4a95aa73c949b9779ff993de0f
- This allows applications to force N:1 aggregation by creating an
out-of-order queue (see the snippet below).
- That switches the CSR to an N:1 submission model where commands from
multiple command streams may be aggregated.
- That forces scenarios returning an event to be aggregated as well.
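On the application side this only requires requesting an out-of-order queue;
a minimal OpenCL 2.0 host snippet (error handling omitted):

    #include <CL/cl.h>

    cl_command_queue createOutOfOrderQueue(cl_context context, cl_device_id device) {
        // Out-of-order execution is what allows the driver to aggregate
        // commands from multiple command streams into one submission (N:1).
        cl_queue_properties props[] = {
            CL_QUEUE_PROPERTIES, CL_QUEUE_OUT_OF_ORDER_EXEC_MODE_ENABLE, 0};
        cl_int err = CL_SUCCESS;
        return clCreateCommandQueueWithProperties(context, device, props, &err);
    }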
Change-Id: I8fd8d7f88bb2665234ee90870133120b206710a8
We don't need BuiltinDispatchInfoBuilder in every place where built-ins
are used, specifically in .cpp files generated from kernel binaries.
Change-Id: Ie739951cdc93873993f78ad14cee656122af51fd
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
When inheriting the task count from parent events,
do not take externally synchronized events into account (sketch below).
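Roughly, the rule looks as follows (a simplified stand-in for the parent
events; not the actual event classes):

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Simplified stand-in; real parent events carry much more state.
    struct ParentEvent {
        bool externallySynchronized;
        uint32_t taskCount;
    };

    uint32_t inheritTaskCount(const std::vector<ParentEvent> &parents) {
        uint32_t inherited = 0;
        for (const auto &parent : parents) {
            if (parent.externallySynchronized) {
                continue; // externally synchronized events contribute no task count
            }
            inherited = std::max(inherited, parent.taskCount);
        }
        return inherited;
    }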
Change-Id: I52d861e482669a18e2aca499c813716bb4951b74
- Change the way we handle blocked commands.
- Instead of allocating a CPU pointer and populating it with commands, create
a real IndirectHeap that may later be submitted to the GPU.
- That removes a lot of copy operations that were happening at submit time.
- For device enqueue, this requires dsh & ssh to be passed directly to the
underlying commands; in that scenario device queue buffers are not used.
Change-Id: I1124a8edbb46777ea7f7d3a5946f302e7fdf9665
- This removes Instruction Heap allocation from the enqueue path.
- The blocked path is handled as well.
- The heap is no longer allocated on demand; it is bound to kernelInfo.
Change-Id: I54545beceed3404ee0330a8bac2b0934944cac30
- Switch to the internal heap for kernel ISA allocations.
- Remove IH from various functions.
- Remove IHState from CSR; IH is never dirty.
- ISA is no longer copied on enqueue calls.
Change-Id: I0099cf2a9ebab6192ea03a74dd35f7da963fd5a5
- Make sure that the blocks' ISA is made resident in both the blocked &
non-blocked paths.
- Fix a bug where the private surface was not made resident in the blocked
path.
Change-Id: Ie564595b176b94ecc7c79d7efeae20598c5874fb
- This is to make sure those functions are not called when gtpin is not used
(see the sketch below).
- This avoids CPU instruction cache pollution.
- Our enqueue path needs to be as thin as possible; even with this small
change there is a visible gain in ULT execution time.
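One way to picture the guard (made-up symbol names; the actual GT-Pin hooks
and flag differ):

    // Illustrative only; the flag and callback names are assumptions.
    extern bool gtpinIsAttached;       // set when a GT-Pin client initializes
    void gtpinNotifyKernelSubmitted(); // placeholder for the real notification

    inline void notifyGtpinOnEnqueue() {
        if (!gtpinIsAttached) {
            return; // common enqueue path stays free of GT-Pin work
        }
        gtpinNotifyKernelSubmitted();
    }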
Change-Id: I44cc2144754cda95ca1fe058184cd8a151b8d35c
- Add a KmdNotifyProperties struct to the CapabilityTable that can be extended
by upcoming KmdNotify-related optimizations.
- Add a quick KMD sleep optimization that is called from the async events
handler.
- The optimization checks taskCount in a busy loop with a much smaller delay
than the basic version of the KMD Notify optimization (sketch below).
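The shape of that quick-sleep wait, as a sketch with assumed knobs (the delay
and iteration count parameters are illustrative, not the tuned defaults):

    #include <atomic>
    #include <chrono>
    #include <cstdint>
    #include <thread>

    // Illustrative sketch: poll the hardware tag with a short sleep before
    // falling back to the regular (longer) KMD Notify wait.
    bool quickKmdSleepWait(const std::atomic<uint32_t> &hwTag,
                           uint32_t taskCountToWait,
                           std::chrono::microseconds quickDelay,
                           uint32_t maxIterations) {
        for (uint32_t i = 0; i < maxIterations; ++i) {
            if (hwTag.load() >= taskCountToWait) {
                return true; // GPU already reached the task count
            }
            std::this_thread::sleep_for(quickDelay); // much smaller than the basic delay
        }
        return false; // caller falls back to the basic KMD Notify path
    }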
Change-Id: I60c851c59895f0cf9de1e1f21e755a8b4c2fe900
- Read-only memory cannot be used for allocation;
OSes cannot create a graphics allocation for such memory.
- If memory allocation fails for a host_ptr passed
to enqueueWrite calls, then try making a new allocation
and copy the host_ptr on the CPU (see the sketch below).
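Roughly, the fallback looks like this (illustrative names; tryWrapHostPtr
stands in for the real zero-copy allocation attempt):

    #include <cstdlib>
    #include <cstring>

    // Illustrative stand-ins for the real memory-manager interfaces.
    struct GraphicsAllocation {
        void *cpuPtr;
    };
    // Zero-copy attempt; assumed to return nullptr when the OS rejects the
    // host pointer, e.g. because it points to read-only memory.
    GraphicsAllocation *tryWrapHostPtr(const void *hostPtr, size_t size);

    GraphicsAllocation *obtainWriteAllocation(const void *hostPtr, size_t size) {
        if (GraphicsAllocation *alloc = tryWrapHostPtr(hostPtr, size)) {
            return alloc; // zero-copy path succeeded
        }
        // Fallback: make a fresh allocation and copy the host data on the CPU.
        auto *alloc = new GraphicsAllocation{std::malloc(size)};
        std::memcpy(alloc->cpuPtr, hostPtr, size);
        return alloc;
    }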
Change-Id: I415a4673ae1319ea8f77e53bd8fba7489fe85218
- Due to use cases where one shared buffer may be mapped to multiple CL
buffers, we need to flush DC between enqueues.
Change-Id: I05d7f844afe31d52a0004f5e2e5efa776f9dadbe
- For an in-order queue, the application can have fine-grained completion
granularity.
- For an out-of-order queue, the application wants to execute workloads
concurrently.
- This change disables pipe control nooping for in-order queue calls when an
event is returned.
Change-Id: Iaeaf677f768f7434b2efa1842b50653ab80777ad
- This change enables multiple independent command queues to execute
concurrently without stalling pipe controls in between.
- This change removes L3 flushes between kernels.
- Dependencies between commands are resolved via the task level mechanism.
- Out-of-order queues do not change the task level between submissions.
- In-order queues increase the task level between submissions.
- Whenever the task level changes, a pipe control with a CS stall is emitted
between GPGPU_WALKERs (see the sketch below).
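A much simplified sketch of the task level bookkeeping (illustrative types;
the real CSR logic also handles flushing and multi-queue state):

    #include <cstdint>

    void emitPipeControlWithCsStall(); // placeholder for the real command emission

    struct QueueState {
        bool outOfOrder;
        uint32_t taskLevel;
    };

    void beforeWalkerSubmission(QueueState &queue, uint32_t &csrTaskLevel) {
        if (!queue.outOfOrder) {
            ++queue.taskLevel; // in-order queues bump the task level per submission
        }
        if (queue.taskLevel != csrTaskLevel) {
            // A task level change marks a dependency boundary between GPGPU_WALKERs.
            emitPipeControlWithCsStall();
            csrTaskLevel = queue.taskLevel;
        }
    }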
Change-Id: I558653b296424e4775d060df3072e2a50684b715
- Call waitForTaskCountAndCleanAllocationList with the latest flushed task
count to reflect what was actually sent to the HW.
- Refactor cleanAllocationList into waitForTaskCountAndCleanAllocationList.
Change-Id: I5301185c5fce212e39eb017b952b43c279559cf4