When inheriting task count from parent events,
don't take into account externally synchronized events
Change-Id: I52d861e482669a18e2aca499c813716bb4951b74
- change the way we handle blocked commands.
- instead of allocating CPU pointer and populating it with commands, create
real IndirectHeap that may be later submitted to the GPU
- that removes a lot of copy operations that were happening on submit time
- for device enqueue, this requires dsh & shh to be passed directly to the
underlying commands, in that scenario device queue buffers are not used
Change-Id: I1124a8edbb46777ea7f7d3a5946f302e7fdf9665
* adding support for map/unmap
* adding support for origin/region validation with mipmaps
* fixing slices returned in map/unmap
* removing ambiguity around mipLevel naming
* enabling cl_khr_mipmap_image in current shape
* enabling cl_khr_mipmap_image_writes in current shape
* fixing CompileProgramWithReraFlag test
Change-Id: I0c9d83028c5c376f638e45151755fd2c7d0fb0ab
- While estimating the required size of Indirect Object Heap we were not
handling properly the lack of local ids case
- In such case we should allocate one GRF per HW thread that will be unused
Change-Id: Ibcd359e431e3ffd9d55628ac7cf7eeefad72e7ba
- cpu virtual address was used instead of gpu va
- this caused incorrect behaviour of TBX server when
special heap allocator assigning GPU addresses was used
Change-Id: I2328cf2441be797311fd6a3c7b331b0fff79d4fc
Any CPU related updates such as clEnqueueMapBuffer or similar
need to trigger a re-dump of memory prior to the next clEnqueue call.
Change-Id: I7b31e559278e92ff55b6ebab8ef4190caef1ebc0
- Do not obtain pattern allocation from reusable pool.
- This is due to the fact that it may contain allocations from internal
heap, which cannot be used for arguments declared as kernel argument.
Change-Id: I6c73445c409edc4ce25f8d8eba966f512dfd6cc9
- This removes Instruction Heap allocation from enqueue path
- Blocked path is handled as well
- Heap is no longer allocated on demand it is bind to kernelInfo.
Change-Id: I54545beceed3404ee0330a8bac2b0934944cac30
- Switch to internal heap for kernel ISA allocations.
- remove IH from various functions
- remove IHState from CSR , IH is never dirty
- ISA is no longer copied on enqueue calls.
Change-Id: I0099cf2a9ebab6192ea03a74dd35f7da963fd5a5
- Make sure that blocks ISA is made resident
- both blocked & non blocked path
- fix a bug where private surface was not made resident in blocked path.
Change-Id: Ie564595b176b94ecc7c79d7efeae20598c5874fb
-This is to make sure those functions are not called when gtpin is not used
-This preserves CPU instruction cache pollution.
-Our enqueue path needs to be as thin as possible, even with this small change
there is visible gain in ULT execution time.
Change-Id: I44cc2144754cda95ca1fe058184cd8a151b8d35c
- KmdNotifyProperties struct for CapabilityTable that can be extended by
incoming KmdNotify related optimizations
- Quick KMD sleep optimization that is called from async events handler
- Optimization makes a taskCount check in busy loop with much smaller
delay than basic version of KMD Notify optimization
Change-Id: I60c851c59895f0cf9de1e1f21e755a8b4c2fe900
- Previously only taskCount was updated
- This improves KMD notify usage for Events handled asynchronously
Change-Id: I283982890579254033557de0e1cef2239c0035e2
- Also refactor debug manager tests , they now check for default value
in igdrcl.config file
- There is no need to write dedicated tests now , so I remove them.
Change-Id: Ib338ca05b6059302c29469c673239e7886dc4b9b
- read only memory cannot be used for allocation,
Oses cannot create graphics alocation for such memory
- if memory allocation fails for host_ptr passed
to enqueueWrite calls, then try doing new allocation
and copy host_ptr on cpu
Change-Id: I415a4673ae1319ea8f77e53bd8fba7489fe85218
Add macro to add all subdirectories
Add macro to create project source tree based on target sources
Small cleanup runtime/CMakeLists.txt
Change-Id: I9b99145c544f648c4c3fe7421752d0c5d9504edf
- Due to use cases where one shared buffer may be mapped to multiple CL
buffers we need to flush DC between enqueues.
Change-Id: I05d7f844afe31d52a0004f5e2e5efa776f9dadbe
- Dont make cpu/gpu writes on read-only unmap
- Read/Write on limited map range only
- Overlaps checks for non read-only maps
- Fixed cmd type on returned event
Change-Id: I98ca542e8d369d2426a87279f86cadb0bf3db299
When queue is blocked on non-blocking call, map operation is added to
waitlist dependencies. Returning slice/row pitch for map image was skipped
Change-Id: I46f97590315e7aee7fbbfbdb615f383cdb666307
- Introducing MapInfo struct which will be used as container for multiple
map operations
- Unified mapped offset and size for Buffers and Images
- Fixed incorrect map params for CPU and GPU path
- Missing API level checks
Change-Id: Ib4077c9e2c0c333b131ffd5ccbc4a1404920eb5b
- Curently each non-zerocopy CPU operation on map/unmap make a full copy
using hostPtr
- This commit adds functionality to select specific range of copy
- Multiple mapping with different size is not supported yet,
so copy will be made on full range for now. This is for future usage.
Change-Id: I7652e85482ba6fffb2474169447baf9b080dcd1e