- Switch to internal heap for kernel ISA allocations.
- Remove IH from various functions
- Remove IHState from CSR; IH is never dirty
- ISA is no longer copied on enqueue calls.
Change-Id: I0099cf2a9ebab6192ea03a74dd35f7da963fd5a5
- Measure time between wait calls. If the delay is exceeded, use QuickKmdSleep
- Kmd Notify helper functions
- Refactor overriding from debug variables
- Refactor Kmd Notify tests
Change-Id: I123c31f492d98fd304184f99ee0bf7d733d06f04
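A minimal sketch of the "measure time between wait calls" idea above, assuming a caller-supplied timestamp in microseconds. The type names, field names, and threshold values are hypothetical and are not taken from the driver.

```cpp
#include <cstdint>

// Hypothetical settings; names and units are illustrative only.
struct KmdNotifySettings {
    int64_t basicTimeoutUs;       // regular KMD notify delay
    int64_t quickSleepTimeoutUs;  // much shorter QuickKmdSleep delay
    int64_t maxGapBetweenWaitsUs; // gap after which quick sleep is selected
};

class WaitTimeoutSelector {
  public:
    explicit WaitTimeoutSelector(KmdNotifySettings s) : settings(s) {}

    // Returns the timeout to use for the wait call issued at time nowUs.
    int64_t selectTimeoutUs(int64_t nowUs) {
        bool gapExceeded = hasPreviousWait &&
                           (nowUs - lastWaitUs) > settings.maxGapBetweenWaitsUs;
        lastWaitUs = nowUs;
        hasPreviousWait = true;
        return gapExceeded ? settings.quickSleepTimeoutUs
                           : settings.basicTimeoutUs;
    }

  private:
    KmdNotifySettings settings;
    int64_t lastWaitUs = 0;
    bool hasPreviousWait = false;
};
```

The first wait call always gets the basic timeout in this sketch, because there is no earlier call to measure against.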
- KmdNotifyProperties struct for CapabilityTable that can be extended by
incoming KmdNotify-related optimizations
- Quick KMD sleep optimization that is called from the async events handler
- The optimization checks taskCount in a busy loop with a much smaller
delay than the basic version of the KMD Notify optimization
Change-Id: I60c851c59895f0cf9de1e1f21e755a8b4c2fe900
- Also refactor debug manager tests; they now check for the default value
in the igdrcl.config file
- There is no need to write dedicated tests now, so they are removed.
Change-Id: Ib338ca05b6059302c29469c673239e7886dc4b9b
- Read-only memory cannot be used for allocation;
OSes cannot create a graphics allocation for such memory
- If memory allocation fails for the host_ptr passed
to enqueueWrite calls, try a new allocation
and copy the host_ptr contents on the CPU
Change-Id: I415a4673ae1319ea8f77e53bd8fba7489fe85218
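The fallback described above can be sketched as follows. The callback `tryWrapHostPtr` is a hypothetical stand-in for the OS-specific step that creates a graphics allocation from user memory. It is assumed to return false for memory the OS cannot wrap, such as read-only pages. None of these names come from the driver.

```cpp
#include <cstddef>
#include <cstring>
#include <functional>
#include <memory>
#include <utility>

// Result of preparing the source of a write: either the original pointer,
// or a driver-owned copy when the original memory could not be wrapped.
struct StagingResult {
    const void *sourcePtr;
    std::unique_ptr<unsigned char[]> ownedCopy;
};

inline StagingResult prepareWriteSource(
    const void *hostPtr, std::size_t size,
    const std::function<bool(const void *, std::size_t)> &tryWrapHostPtr) {
    if (tryWrapHostPtr(hostPtr, size)) {
        return {hostPtr, nullptr}; // allocation created directly from host_ptr
    }
    // Fallback: make a new allocation and copy the data on the CPU.
    std::unique_ptr<unsigned char[]> copy(new unsigned char[size]);
    std::memcpy(copy.get(), hostPtr, size);
    const void *copyPtr = copy.get();
    return {copyPtr, std::move(copy)};
}
```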
- Return error on origin > 0 or region > 1 when it's not allowed
- For 1Darray, array region and origin are stored on the 2nd position.
For 2Darray, it's on the 3rd position
- Fix map offset for 1Darray image
- Fix CPU data transfer for 1Darray image
Change-Id: Id35ba5f54f117e7af318ca7e6e03c1fc942ce729
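A small sketch of the origin/region rule above, assuming the usual three-element origin and region arrays. The enum and function names are hypothetical. The check mirrors the wording of the message (origin > 0 or region > 1 in unused positions is an error) and is not a full API-level validation.

```cpp
#include <cstddef>

enum class ImageType { Image1D, Image1DArray, Image2D, Image2DArray, Image3D };

// Number of leading origin/region entries that may carry real values.
inline std::size_t usedEntries(ImageType type) {
    switch (type) {
    case ImageType::Image1D:
        return 1;
    case ImageType::Image1DArray: // array index lives in position 2 (index 1)
    case ImageType::Image2D:
        return 2;
    default: // Image2DArray (array index in position 3, index 2) and Image3D
        return 3;
    }
}

// Position of the array origin/region entry, or -1 for non-array images.
inline int arrayEntryIndex(ImageType type) {
    if (type == ImageType::Image1DArray) return 1;
    if (type == ImageType::Image2DArray) return 2;
    return -1;
}

// Unused positions must keep origin == 0 and region <= 1.
inline bool isOriginAndRegionValid(ImageType type, const std::size_t origin[3],
                                   const std::size_t region[3]) {
    for (std::size_t i = usedEntries(type); i < 3; ++i) {
        if (origin[i] > 0 || region[i] > 1) return false;
    }
    return true;
}
```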
- Microseconds offer better precision.
- Some workloads require a threshold of less than 1 millisecond to work
efficiently.
Change-Id: I1a565049340fb6eeebe5c0a61ededae9959daca8
- Due to use cases where one shared buffer may be mapped to multiple CL
buffers we need to flush DC between enqueues.
Change-Id: I05d7f844afe31d52a0004f5e2e5efa776f9dadbe
- Don't make CPU/GPU writes on read-only unmap
- Read/Write on limited map range only
- Overlap checks for non-read-only maps
- Fixed cmd type on returned event
Change-Id: I98ca542e8d369d2426a87279f86cadb0bf3db299
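The overlap check for non-read-only maps can be illustrated with a one-dimensional range test. This is a sketch under assumptions: the struct and function names are hypothetical, and it treats an overlap as a conflict whenever at least one of the two maps is writable, which is one possible reading of the message.

```cpp
#include <cstddef>
#include <vector>

struct MappedRange {
    std::size_t offset;
    std::size_t size;
    bool readOnly;
};

// Half-open ranges [offset, offset + size) overlap if each starts before
// the other ends.
inline bool rangesOverlap(const MappedRange &a, const MappedRange &b) {
    return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

// Assumption: two read-only maps may overlap freely; any other overlapping
// pair is rejected.
inline bool canAddMap(const std::vector<MappedRange> &existing,
                      const MappedRange &candidate) {
    for (const auto &mapped : existing) {
        bool anyWritable = !mapped.readOnly || !candidate.readOnly;
        if (anyWritable && rangesOverlap(mapped, candidate)) return false;
    }
    return true;
}
```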
When the queue is blocked on a non-blocking call, the map operation is added
to the waitlist dependencies. Returning slice/row pitch for map image was
skipped.
Change-Id: I46f97590315e7aee7fbbfbdb615f383cdb666307
- Introducing MapInfo struct which will be used as container for multiple
map operations
- Unified mapped offset and size for Buffers and Images
- Fixed incorrect map params for CPU and GPU path
- Missing API level checks
Change-Id: Ib4077c9e2c0c333b131ffd5ccbc4a1404920eb5b
-If the out of order flag was disabled, the pipe control did not have a dc flush.
-This could lead to a batch buffer that doesn't end with a dc flush.
-This change differentiates between pipe controls that may be erased and
pipe controls that are used as a part of the epilogue command
Change-Id: Ic9c970c75c89ff524a0e40506eff6dd097760145
-For an in order queue the application can have fine-grained completion granularity
-For an out of order queue the application wants to execute workloads concurrently
-This change disables pipe control nooping for ioq calls when an event is returned.
Change-Id: Iaeaf677f768f7434b2efa1842b50653ab80777ad
- Account for the initial setting (when the set mode was equal to the
initial one, Disabled): estimate size in cmdStreamCS, program MMIO
Change-Id: Ice218ae986583c8f3bab4f4f6979e38f03e30d7e
- This change enables multiple independent command queues to execute
concurrently without stalling pipe controls in between
- This change removes L3 flushes between kernels
- Dependencies between commands are resolved via task level mechanism
- Out of order queues are not changing task level between submissions
- In order queues are increasing task level between submissions
- Whenever task level changes there is pipe control with cs stall emitted
between GPGPU_WALKERs
Change-Id: I558653b296424e4775d060df3072e2a50684b715
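A toy model of the task level mechanism described above. The struct names and the counter are hypothetical. The sketch only shows when a stalling pipe control would be emitted between walkers, not how commands are actually programmed.

```cpp
#include <cstdint>

struct QueueSketch {
    bool outOfOrder;
    uint32_t taskLevel;
};

// Stand-in for the command stream receiver: it remembers the task level of
// the previous submission and counts the stalling pipe controls it emits.
struct CsrSketch {
    uint32_t lastSentTaskLevel = 0;
    uint32_t stallingPipeControls = 0;

    void submit(QueueSketch &queue) {
        if (queue.taskLevel != lastSentTaskLevel) {
            ++stallingPipeControls; // pipe control with cs stall goes here
            lastSentTaskLevel = queue.taskLevel;
        }
        // ... GPGPU_WALKER would be programmed here ...
        if (!queue.outOfOrder) {
            ++queue.taskLevel; // in order queues advance the level
        }
    }
};
```

In this model three submissions from an out of order queue emit no stalls, while three submissions from an in order queue emit two, one between each pair of walkers.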
-Do not inc/dec reference count for flush stamps while used only for
update
-FlushStamp doesn't need to be atomic, replace with atomic bool flag
to prevent usage while uninitialized
-Clean up unneeded private new
Change-Id: Idad2b318f988de1e7af7642047c67f931e9772aa
- Instruction heap is currently heavily used as every kernel copies ISA into
it.
- It runs out very fast and each change to a new heap requires a whole pipeline
drain that prevents concurrency
- The problem is even larger when the sip kernel is used, as it limits the
total heap size
- In order to maximize heap re-use and to limit the count of pipeline drains,
this change introduces a new minimum size for the instruction heap: 512 KB.
Change-Id: Ic54e9ef4448b1d35dab01b084ee1d59b509642cb
- In various scenarios code was not programming the max heap size correctly
- It was possible for SSH to exceed the limit
- Size was programmed smaller than it really was, which resulted in less
reuse, which led to SBA reprogramming, which led to lower performance in ooq
scenarios
- This change fixes the heap size programming by always utilizing full
allocation size and always limiting SSH at proper value
Change-Id: Ib703d2b0709ed8227a293def3a454bf1bb516dfd
- Program one PS with gpgpu selection and media sampler
- Program PS only when media sampler requirement changed
or when preamble was not sent
Change-Id: I85ba3f74087733e79d048e120aeb8b4b04796e00
Fix InterfaceDescriptor programming for
blocked commands when MidThread preemption is
enabled
Additionally, fix a couple of tests that block
enabling global preemption in ULTs
Change-Id: I454c9608f8606f23d7446785ac24c7c7d8701ae0
- It should use thread count, not EU count.
- Change variable name to reflect that we work on subslices.
- Fix test description, add missing test
- Change hasBarrier variable to be boolean
Change-Id: I627bdf17b661d2f9b5eb3d8cd6ca53eba5d46b81
- Call waitForTaskCountAndCleanAllocationList with latest flushed task count
to reflect what was actually sent to HW.
- Refactor cleanAllocationList to waitForTaskCountAndCleanAllocationList
Change-Id: I5301185c5fce212e39eb017b952b43c279559cf4
- Prevents destruction of MemObj while it may still be in use.
- Add UNRECOVERABLE to check whether object is deleted while having
dependencies, fix all problems in tests due to that fact.
- Fix special queue setting, clean interfaces.
Change-Id: I2a467e80df00ea1650decdcfa6866acf10b441f8
- When command queue is blocked, all heaps are being stored in temporary
allocations, command buffers are being pre-programmed, heaps are being set
on those temporary allocations with the assumption that all heaps start with
offset 0.
- The problem was that when the actual submission happened, all those temporary
heaps were just appended to the command queue heaps, so when something was
already there, the new content was copied right after it. All state was then
incorrect, because the offsets were no longer valid and pointed to the wrong
location.
- This change releases command queue heaps when a blocked command is being
submitted, to make sure they will be programmed with the proper offset in a
newly allocated command queue heap.
Change-Id: I3e30be13caf4df8621ddb18f8448ffaf0f1278d1
- Because the device mutex was taken to prevent concurrent access to the
image, there was a problem when another thread was doing a readImage call.
That thread took the readImage kernel mutex first and then tried to acquire
the device mutex, which was held by the other thread doing a mapImage call.
- In the current code the device mutex is not taken to service a mapImage
call; instead the image is guarded by its own mutex.
Change-Id: Ic4c5a019708d7ec5b240bc5b08c5a65173827392
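A minimal sketch of guarding an image with its own mutex, as described above, so that servicing a map call never needs a device-wide lock. The class and member names are hypothetical and the storage is a plain byte vector standing in for the real image allocation.

```cpp
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical image object: map bookkeeping is protected by a mutex owned
// by the image itself, so no device-level lock is involved in map/unmap.
class ImageSketch {
  public:
    explicit ImageSketch(std::size_t sizeInBytes) : storage(sizeInBytes) {}

    unsigned char *map() {
        std::lock_guard<std::mutex> lock(mapMutex);
        ++activeMaps;
        return storage.data();
    }

    void unmap() {
        std::lock_guard<std::mutex> lock(mapMutex);
        if (activeMaps > 0) --activeMaps;
    }

    std::size_t getActiveMaps() {
        std::lock_guard<std::mutex> lock(mapMutex);
        return activeMaps;
    }

  private:
    std::mutex mapMutex;
    std::vector<unsigned char> storage;
    std::size_t activeMaps = 0;
};
```

Because each thread takes at most this one lock on the map path, the lock-order inversion between the kernel mutex and the device mutex cannot occur there.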
- Fix tests that were triggering the UNRECOVERABLE scenario
- Change UNRECOVERABLE to DEBUG_BREAK in some places
Change-Id: I479baac4941b485af9ea81a61a1a03d2f3f42e6a
- This causes an event tree update if the virtual event is holding commands or
callbacks
- That causes a race with other threads that may be updating the tree