Commit Graph

63 Commits

Author SHA1 Message Date
Mrozek, Michal
7644209288 Add debug flag to dump dispatch parameters.
- Also refactor debug manager tests , they now check for default value
in igdrcl.config file
- There is no need to write dedicated tests now , so I remove them.

Change-Id: Ib338ca05b6059302c29469c673239e7886dc4b9b
2018-03-16 11:13:35 +01:00
Dunajski, Bartosz
c0d3eade30 Disable wait timeout when flushStamp is 0
Change-Id: I416ace1f8c1a3e5aa91d9bc2425a4faa77e2fbe7
2018-03-15 15:44:17 +01:00
Mrozek, Michal
93fc48339b [15/n] Internal 4GB allocator.
- Make resident on kernel ISA for blocked and non blocked path.

Change-Id: I1fc4948f1abb73c6f7028ae15dccad820101b8dc
2018-03-14 15:04:30 +01:00
Hoppe, Mateusz
a1a20a3b34 Service read only memory passed as host_ptr
- read only memory cannot be used for allocation,
Oses cannot create graphics alocation for such memory
- if memory allocation fails for host_ptr passed
to enqueueWrite calls, then try doing new allocation
and copy host_ptr on cpu

Change-Id: I415a4673ae1319ea8f77e53bd8fba7489fe85218
2018-03-14 13:16:36 +01:00
Mrozek, Michal
8254d6a081 Ensure that submissions are flushed prior to csr destruction.
Change-Id: Ie04de561d3d295f40f55a19f01274d873d259abd
2018-03-12 12:54:47 +01:00
Dunajski, Bartosz
23c1c4fea6 clEnqueueMapImage origin and region usage fixes
- Return error on origin > 0 or region > 1 when its not allowed
- For 1Darray, array region and origin are stored on 2nd position.
  For 2Darray, its on 3rd postion
- Fix map offset for 1Darray image
- Fix CPU data transfer for 1Darray image

Change-Id: Id35ba5f54f117e7af318ca7e6e03c1fc942ce729
2018-03-08 08:54:48 +01:00
Dunajski, Bartosz
0659ee0896 Set valid origin on clEnqueueMapImage ULTs
Change-Id: I67853b7f7d7f7d4bc5475330715490e188a42b6d
2018-03-07 15:24:37 +01:00
Mateusz Jablonski
0afd7a9ec1 Cmake refactor part 8
igdrcl_tests: define sources in subdirectories A-C

Change-Id: Iad8e4e866c4b0b8ccf679313e46ef6f0e5deac50
2018-03-06 20:53:43 +01:00
Mrozek, Michal
1602fa5a88 [7/n] Internal 4GB allocator
- rename getBase to getCpuBase
- change some test names accordingly.

Change-Id: I6fb2e4714298250147ea7766a916d7f5d62edc54
2018-03-05 22:16:14 +01:00
Dunajski, Bartosz
1fce275542 Remove forced DC flush and disabled out of order execution for shared objects
Change-Id: I0de86c3d5af488a347e83858f5dddbac2ef53c17
2018-03-05 09:45:18 +01:00
Zdanowicz, Zbigniew
533afe472a Program preemption mode in Interface Descriptor Data
Change-Id: I7fce731d71dd0b6dc8505ebfe45d24c65898a08b
2018-03-05 09:36:53 +01:00
Mrozek, Michal
cd747b7b8c Change notify delay to use microseconds.
- Microseconds offer better precision.
- Some workloads require threshold less then 1 millisecond to work
efficiently.

Change-Id: I1a565049340fb6eeebe5c0a61ededae9959daca8
2018-02-27 09:10:49 +01:00
Mrozek, Michal
3da9df23a9 Flush DC in case shared objects are used.
- Due to use cases where one shared buffer may be mapped to multiple CL
buffers we need to flush DC between enqueues.

Change-Id: I05d7f844afe31d52a0004f5e2e5efa776f9dadbe
2018-02-26 15:51:06 +01:00
Dunajski, Bartosz
dd44a87d5f Map/unmap enqueue fixes [6/n]: Support multiple map operations
- Dont make cpu/gpu writes on read-only unmap
- Read/Write on limited map range only
- Overlaps checks for non read-only maps
- Fixed cmd type on returned event

Change-Id: I98ca542e8d369d2426a87279f86cadb0bf3db299
2018-02-23 10:45:06 +01:00
Dunajski, Bartosz
b4f79e036f Map/unmap enqueue fixes [5/n]: Unify offset calculation
Change-Id: I53eafe89532d43c5cf5139ed3fac0a87619dc7a3
2018-02-21 20:12:52 +01:00
Dunajski, Bartosz
f6825252fc Map/unmap enqueue fixes [4/n]: Return slice/row pitch
When queue is blocked on non-blocking call, map operation is added to
waitlist dependencies. Returning slice/row pitch for map image was skipped

Change-Id: I46f97590315e7aee7fbbfbdb615f383cdb666307
2018-02-20 14:30:35 +01:00
Zdanowicz, Zbigniew
86bb715b95 HostPtr surface makeResident must be called once
Change-Id: I9cb04e3affdd8b8634466621b50326a088ecdcf9
2018-02-16 11:11:37 +01:00
Dunajski, Bartosz
e0ca78ccea Map/unmap enqueue fixes [3/n]: Map params inconsistency
- Introducing MapInfo struct which will be used as container for multiple
  map operations
- Unified mapped offset and size for Buffers and Images
- Fixed incorrect map params for CPU and GPU path
- Missing API level checks


Change-Id: Ib4077c9e2c0c333b131ffd5ccbc4a1404920eb5b
2018-02-16 08:28:29 +01:00
Mrozek, Michal
acb044dce3 Fix DC flush programming in non concurrent scenarios.
-If out of order flag was disabled then pipe control was not having dc flush.
-This could led to a batch buffer that doesn't end with dc flush.
-This change adds differentiation between pipe controls that may be erased and
pipe controls that are used as a part of epilogue command

Change-Id: Ic9c970c75c89ff524a0e40506eff6dd097760145
2018-02-15 09:42:11 +01:00
Mrozek, Michal
2d0af9d4a4 Make sure that local workgroup size is properly passed for IOH estimation.
Change-Id: I0ad5da4fffd1575f64d44803ce8eb4a6a0ab1532
2018-02-15 07:57:39 +01:00
Zdanowicz, Zbigniew
45dedb37f3 For HostPtr surfaces of enqueue calls use GPU address
Change-Id: I67bf5076d23d43438f5e82c5cb6cbd3b9ed2f152
2018-02-14 15:44:27 +01:00
Mrozek, Michal
d563059c14 Remove redundant code from flushWaitList.
Change-Id: Iab4cb856ce324a785b052b8638ef23aef43c9bc9
2018-02-13 10:40:33 +01:00
Mrozek, Michal
b5dab07aa2 Do not allow out of order execution for shared objects.
Change-Id: I2dbbd8f09485bd894774eb2c4548326475a41221
2018-02-12 10:36:23 +01:00
Dunajski, Bartosz
72b78d15ee Map/unmap enqueue fixes [1/n]: Unify Buffer and Image paths
Change-Id: I59bf18072c15367ff6caec5dbdc1350ea2d93281
2018-02-09 17:35:03 +01:00
Mrozek, Michal
6bb83fb95a Do not noop pipe controls if call is returning event on IOQ.
-For in order queue application can have fine grain granularity of completion
-For out of order queue application wants to execute workloads concurrently
-This change disables pipe control nooping for ioq calls when event returned.

Change-Id: Iaeaf677f768f7434b2efa1842b50653ab80777ad
2018-02-09 11:57:44 +01:00
Hoppe, Mateusz
012b8bd73c Adding initial PreemptionMode::Initial
- account for initial setting (when set mode was equal to initial(Disabled))
estimate size in cmdStreamCS, program MMIO

Change-Id: Ice218ae986583c8f3bab4f4f6979e38f03e30d7e
2018-02-08 16:21:52 +01:00
mplewka
4db1e3af6a Check zeroCopy flag for r/w images/buffers
Change-Id: I7047ae8458bdf3528d6014137522a37561d15ab6
2018-02-08 13:55:44 +01:00
Mateusz Jablonski
ea021f8d69 Cmake refactor part 1: fix dependencies with including os_inc.h
Remove some not needed includes

Change-Id: I158ad663ccfcec4822e3768df9d05090c5e096f9
2018-02-08 09:40:40 +01:00
Mrozek, Michal
d8f2142faa Enable out of order execution for all submissions.
- This change enabled multiple independent command queues to execute
concurrently without stalling pipe controls in between
- This change removes L3 flushes between kernels
- Dependencies between commands are resolved via task level mechanism
- Out of order queues are not changing task level between submissions
- In order queues are increasing task level between submissions
- Whenever task level changes there is pipe control with cs stall emitted
between GPGPU_WALKERs

Change-Id: I558653b296424e4775d060df3072e2a50684b715
2018-02-08 08:22:04 +01:00
mplewka
21c1dce943 Enable zero copy for enqueueImage r/w with hints
Change-Id: I6d4379b4bebaca162f859ea790f6a77475f7e94e
2018-02-06 19:00:15 +01:00
Mrozek, Michal
28758fc336 [2/n] Optimize CPU code
-Do not inc/dec reference count for flush stamps while used only for
update
-FlushStamp doesn't need to be atomic,replace with atomic bool flag
to prevent usage while uniinitialized
-Clean not needed private new

Change-Id: Idad2b318f988de1e7af7642047c67f931e9772aa
2018-02-05 11:02:17 +01:00
Dunajski, Bartosz
80eefc79f3 Fix normalizing factor for SNORM formats
Change-Id: I4febe3a557762b94c0c4445015c948d45a4390d2
2018-02-02 16:10:48 +01:00
Mrozek, Michal
e35a066f79 Change the instruction heap size to be at least 512KB.
- Instruction heap is currently heavily used as every kernel copies ISA into
 it.
- It dries out very fast and each change to new heap requires whole pipeline
drain that prevents concurrency
- Problem is even larger when sip kernel is used as it limits the total heap
size
- In order to maximize heap re-use and to limit the count of pipeline drains
this change introduces new minimal size for instruction heap 512 KB.

Change-Id: Ic54e9ef4448b1d35dab01b084ee1d59b509642cb
2018-02-01 13:10:39 +01:00
Mrozek, Michal
37c7e27276 Fix heap size programming.
- In various scenarios code was not programming the max heap size correctly
- It was possible for SSH to overcome the limit
- Size was programmed smaller then it really was, which resulted in smaller
reuse, which led to SBA reprogramming which led to lower performance in ooq
scenarios
- This change fixes the heap size programming by always utilizing full
allocation size and always limiting SSH at proper value

Change-Id: Ib703d2b0709ed8227a293def3a454bf1bb516dfd
2018-01-31 17:35:32 +01:00
mplewka
377fc8d20b Enable zero copy for enqueueReadBufferRect with hint
Change-Id: I4e7d89edfcff2674e7c163d70ad974d3464bf64f
2018-01-25 13:17:59 +01:00
mplewka
251de14ee6 Enable zero copy for enqueueWriteBufferRect with hint
Change-Id: I411f00b98056307906c02d34e793cefe460735ba
2018-01-25 11:48:10 +01:00
Mrozek, Michal
274c8084a3 For devices with small HW thread count, limit the available pool of LWS.
Change-Id: Ib3c0fea3e0422dae3bc93b891aab087ad597776e
2018-01-24 14:30:39 +01:00
mplewka
2c2bbbcdbb Add support for zero-copy r/w buffer
Change-Id: Ie9f3f2211d107eb338bd97692d36e9c7d7a0feab
2018-01-22 09:40:51 +01:00
Mateusz Jablonski
13ac81f465 Change pipeline select programing
- Program one PS with gpgpu selection and media sampler
- Program PS only when media sampler requirement changed
  or when preamble was not sent

Change-Id: I85ba3f74087733e79d048e120aeb8b4b04796e00
2018-01-18 14:39:47 +01:00
Zdunowski, Piotr
5e7eccefe5 Improve error handling for shared objects.
Change-Id: I86fccb26cbf327b49c1b4992eeb3d25e52d3bced
2018-01-17 21:32:36 +01:00
Chodor, Jaroslaw
044fd1ab81 Fixing IntDescr programing for blocked cmd and MT
Fixing InterfaceDescriptor programming for
blocked commands when MidThread preemption is
enabled
Additionally, fixing couple of tests that block
global preemption enabling in ULTs

Change-Id: I454c9608f8606f23d7446785ac24c7c7d8701ae0
2018-01-17 12:19:07 +01:00
Mrozek, Michal
41f0ac3019 Check if we do not access outside of array.
Change-Id: I3357b745d36398ad52777054f64a7915278c0463
2018-01-17 09:33:57 +01:00
Zdanowicz, Zbigniew
602474f868 Command streamers should use device default engine type
Change-Id: I7286f15ba78001729ea489a43576d96f109d44f0
2018-01-16 22:37:44 +01:00
Mateusz Jablonski
be6f211910 Add pipeline select mask bits getter in preamble helper
Change-Id: I783c911ad69916a979e58256a8705d22a86f6a41
2018-01-16 16:51:17 +01:00
Mrozek, Michal
dd601ff73a Utilize shortened version of optimal HW thread count in nx4 scenarios.
- also clean early return if simd size = 0

Change-Id: I9b01df091ab6dd6a3066d1a8762c7fb1530c2804
2018-01-16 14:47:07 +01:00
Mrozek, Michal
ee250be942 Fix num thread per slice computation.
- It should use thread count not EU count.
- change variable name to reflect that we work on sublices.
- fix test description, add missing test
- change hasBarrier variable to be boolean

Change-Id: I627bdf17b661d2f9b5eb3d8cd6ca53eba5d46b81
2018-01-16 13:06:31 +01:00
Mrozek, Michal
8ee2c54a50 Disable squared algorithm.
Change-Id: Ibecbd75b97596e56efc92445f46a4f2a4768a351
2018-01-16 11:20:26 +01:00
Mrozek, Michal
af77720f9c Fix resource destruction scheme on device closure.
- Call waitForTaskCountAndCleanAllocationList with latest flushed task count
to reflect what was actually sent to HW.

- refactor cleanAllocationList to waitForTaskCountAndCleanAllocationList

Change-Id: I5301185c5fce212e39eb017b952b43c279559cf4
2018-01-15 18:45:48 +01:00
Mrozek, Michal
7640201585 Allow squared algorithm to work together with base one.
Change-Id: I9087957bb427a422b1be632f6375c96b8f91a492
2018-01-12 12:05:04 +01:00
Chodor, Jaroslaw
d290955a57 Preemption - SIP command programming
Change-Id: I4c7c805a77a9decb8f13d39055bfb2590209ca3e
2018-01-10 16:43:29 +01:00