- allow creating image 2d from non NV12 image 2d
- validate image descriptor and format when create image from image
Change-Id: Ie7887e75f1450fc723dc1d1ae9ff5639d88835fc
- change the way we handle blocked commands.
- instead of allocating CPU pointer and populating it with commands, create
real IndirectHeap that may be later submitted to the GPU
- that removes a lot of copy operations that were happening on submit time
- for device enqueue, this requires dsh & shh to be passed directly to the
underlying commands, in that scenario device queue buffers are not used
Change-Id: I1124a8edbb46777ea7f7d3a5946f302e7fdf9665
* adding support for map/unmap
* adding support for origin/region validation with mipmaps
* fixing slices returned in map/unmap
* removing ambiguity around mipLevel naming
* enabling cl_khr_mipmap_image in current shape
* enabling cl_khr_mipmap_image_writes in current shape
* fixing CompileProgramWithReraFlag test
Change-Id: I0c9d83028c5c376f638e45151755fd2c7d0fb0ab
- createContextOsProperties is not needed anymore
- replace invalid context property value
0x200D should not be used as an invalid context property value,
as it may become a valid property in the future
Change-Id: I569433b0f37bbce083f0d64ecf1dc80ff83bfb46
- While estimating the required size of Indirect Object Heap we were not
handling properly the lack of local ids case
- In such case we should allocate one GRF per HW thread that will be unused
Change-Id: Ibcd359e431e3ffd9d55628ac7cf7eeefad72e7ba
- cpu virtual address was used instead of gpu va
- this caused incorrect behaviour of TBX server when
special heap allocator assigning GPU addresses was used
Change-Id: I2328cf2441be797311fd6a3c7b331b0fff79d4fc
Any CPU related updates such as clEnqueueMapBuffer or similar
need to trigger a re-dump of memory prior to the next clEnqueue call.
Change-Id: I7b31e559278e92ff55b6ebab8ef4190caef1ebc0
For image with defined sharingHandler test:
- enqueueAcquireSharedObjects
- enqueueReleaseSharedObjects
Change-Id: I8835e4a4aa06a08e57dc207b168810162e44445c
call to FCL can be costly. we don't need this when kernel source is
simple and does not contain '#include'. In this case we can compute hash
directly based on kernel source.
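A minimal sketch of the shortcut described above. The helper names `needsFrontEnd` and `hashSource` are hypothetical, and `std::hash` is only a stand-in for whatever hash the driver actually uses:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Hypothetical helper: true when the source pulls in other files, so hashing
// the raw text alone would miss changes in the included content.
bool needsFrontEnd(const std::string &kernelSource) {
    return kernelSource.find("#include") != std::string::npos;
}

// Hypothetical helper: hash the raw source text directly.
// std::hash is an assumption made for this illustration only.
std::size_t hashSource(const std::string &kernelSource) {
    return std::hash<std::string>{}(kernelSource);
}
```

A caller would use `hashSource` only when `needsFrontEnd` returns false and fall back to the costly path otherwise.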
Change-Id: I0455be57d9ee13919a53c145e3feeb00a113d71e
- Do not obtain pattern allocation from reusable pool.
- This is due to the fact that it may contain allocations from internal
heap, which cannot be used for arguments declared as kernel argument.
Change-Id: I6c73445c409edc4ce25f8d8eba966f512dfd6cc9
- Move Windows HardwareInfo configuration from DeviceFactory to HwInfoConfig
- Add ULTs for HwInfoConfig on Windows
Change-Id: I9b84bbe60ca9f2ad4ddc3119bc8cb88331a7d154
- remove unused MemoryManagementFixture.
MemoryLeaks are tracked using MemoryLeakListener, no need to duplicate
with Fixture.
MMF should be used when you need to inject memory allocation failure
Change-Id: I95bcaa7051acf540c5b015c5489ed6a6fc38ee8e
- Align SIP kernel & STATE_SIP programming.
- on Linux address may be non 0
- on Windows address is expected to be always 0
Change-Id: I385ed59ef652382f3f17d1afe55f6050d07ed1f4
use preprocessor sequence to convert define value to string:
#define q(a) #a
#define tostr(b) q(b)
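The reason for the two-level sequence can be illustrated as follows. `EXAMPLE_VERSION` is a hypothetical define used only for this sketch:

```cpp
#include <cassert>
#include <cstring>

#define q(a) #a
#define tostr(b) q(b)

// Hypothetical define used only for this illustration.
#define EXAMPLE_VERSION 42

// q() stringifies its argument exactly as written, so the macro name itself
// ends up in the string.
const char *direct = q(EXAMPLE_VERSION);

// tostr() expands its argument first and only then hands it to q(), so the
// value of the define ends up in the string.
const char *expanded = tostr(EXAMPLE_VERSION);
```

Here `direct` holds the text `EXAMPLE_VERSION`, while `expanded` holds `42`.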
Change-Id: If0a9ccfcc543523309be4995c05125bf8fbf2081
- This removes Instruction Heap allocation from enqueue path
- Blocked path is handled as well
- Heap is no longer allocated on demand, it is bound to kernelInfo.
Change-Id: I54545beceed3404ee0330a8bac2b0934944cac30
- Switch to internal heap for kernel ISA allocations.
- remove IH from various functions
- remove IHState from CSR , IH is never dirty
- ISA is no longer copied on enqueue calls.
Change-Id: I0099cf2a9ebab6192ea03a74dd35f7da963fd5a5
- Allocator now uses uint64_t instead of void*.
- This is due to the fact that it is required to work on 64 bit addresses
in 32 bit dll.
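A minimal sketch of the idea, assuming a hypothetical bump allocator named `RangeAllocator` (the real allocator is more involved):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical allocator that hands out GPU addresses as plain integers.
// std::uint64_t keeps all 64 bits even in a 32-bit build, where a void*
// is only 32 bits wide.
class RangeAllocator {
  public:
    RangeAllocator(std::uint64_t base, std::uint64_t size)
        : next(base), end(base + size) {}

    // Returns the start of the reserved range, or 0 when nothing fits.
    std::uint64_t allocate(std::uint64_t size) {
        if (size == 0 || size > end - next) {
            return 0;
        }
        std::uint64_t address = next;
        next += size;
        return address;
    }

  private:
    std::uint64_t next;
    std::uint64_t end;
};
```

With a base above 4 GB, the returned addresses would be truncated if they were stored in a 32-bit pointer.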
Change-Id: Ia715ea7913efc95a2974aff8dff390203d8125a8
- do not store fragment in map until hostPointerValidation
is done
- set pointers to nullptr after delete in cleanOsHandles
Change-Id: I0bf99c3215c4b91ce059bb4e94716671c49f1946
- if no options are passed try to read options
and internal options from text files.
- refactor code for reading from file.
Change-Id: I608c70f3afe77a4e4845fe1c96cc9d31464c6122
- Make sure that blocks ISA is made resident
- both blocked & non blocked path
- fix a bug where private surface was not made resident in blocked path.
Change-Id: Ie564595b176b94ecc7c79d7efeae20598c5874fb
Add macro to simplify iteration for gens, platforms and test configs
Common usage:
1. Write macro "macro_for_each_platform", you can use variables:
GEN_TYPE, GEN_TYPE_LOWER, PLATFORM_IT, PLATFORM_IT_LOWER
2. Write macro "macro_for_each_gen", you can use variables:
GEN_TYPE, GEN_TYPE_LOWER
3. In macro "macro_for_each_gen" call "apply_macro_for_each_platform"
4. Call "apply_macro_for_each_gen" with gen type (SUPPORTED/TESTED)
When needed iterate over test configurations:
1. Write macro "macro_for_each_test_config", you can use variables from
parent macro and SLICES, SUBSLICES and EU_PER_SS
2. In macro "macro_for_each_platform" call "apply_macro_for_each_test_config"
with specified type (AUB_TESTS/MT_TESTS/UNIT_TESTS)
Change-Id: Icd537f409a224a1ffade1874065f8fee66189350
- limit the amount of atomic reads
- change debug callback code to not test atomic
- change logic of seeing if we are within limits.
Change-Id: I67e6cf0558f2db60a50bf3ecdf7287d345bf50ae
-This is to make sure those functions are not called when gtpin is not used
-This prevents CPU instruction cache pollution.
-Our enqueue path needs to be as thin as possible, even with this small change
there is visible gain in ULT execution time.
Change-Id: I44cc2144754cda95ca1fe058184cd8a151b8d35c
- Don't check if resource is shared
- Check only gmm allocation capability
- Add missing support for 3d textures
Change-Id: I989533549087db74d5c238d639055462d5fea604
- Extract hw_info_config.h from Linux directory
- Extract enabling HwInfoConfig from Linux directory
- Create dummy implementations for HwInfoConfig on Windows
Change-Id: Ic9c7525ba9d9b654f238fb661cdbb3eecc421e29
- Measure time between wait calls. If delay is exceeded use QuickKmdSleep
- Kmd Notify helper functions
- Refactor overriding from debug variables
- Refactor Kmd Notify tests
Change-Id: I123c31f492d98fd304184f99ee0bf7d733d06f04
- simplify os agnostic memory manager
- remove pointer map
- move cpuPtr allocate logic to graphics allocation
- do not release tag allocation while injecting memory manager
- remove not needed ref count from Memory Allocation
Change-Id: I6ad81ee919c9cde939bc754a9dfc2db7568397d2
- KmdNotifyProperties struct for CapabilityTable that can be extended by
incoming KmdNotify related optimizations
- Quick KMD sleep optimization that is called from async events handler
- Optimization makes a taskCount check in busy loop with much smaller
delay than basic version of KMD Notify optimization
Change-Id: I60c851c59895f0cf9de1e1f21e755a8b4c2fe900
- when pinning fails with EFAULT due to read-only memory
used for allocation (BO), mark the allocated fragments
to be freed, as cpu copy will be used.
- prevent possible leaks
Change-Id: I200ba276da5e3a8557df28fe2e411ef30d69a86a
- generation temporarily disabled for gen8 platforms only
- unit tests using the pregenerated kernel modified accordingly
Change-Id: I304a796836c823d222e60c44a78fc7f4b03b8a73
- Previously only taskCount was updated
- This improves KMD notify usage for Events handled asynchronously
Change-Id: I283982890579254033557de0e1cef2239c0035e2
- adding kernel debug option to build program
- program tests refactor
- pregenerated debug kernel for ULTs
Change-Id: I00152639148fd48c4f709dc7cd9c46392df567c8
igdrcl_tests: define gen specific sources in subdirectories
libult: append gen specific sources needed to link hw tests
Change-Id: I72505729f1ff27439cd43904688de9c2cfbe080f
- Also refactor debug manager tests , they now check for default value
in igdrcl.config file
- There is no need to write dedicated tests now , so I remove them.
Change-Id: Ib338ca05b6059302c29469c673239e7886dc4b9b
This commit introduces a software controlled HW Tag
in the configuration of AUB CSR in standalone mode
(i.e. with no execution on real HW).
Change-Id: Ic470957d58e6568b13dda3d61cb230498d8f2691
- when fragment is already allocated, allocating is
skipped and index is incremented,
separate index must be used for allocated bos array
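The indexing issue can be sketched as follows. `Fragment` and `collectNewlyAllocated` are hypothetical names, not the driver's structures:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fragment record; the real driver structures differ.
struct Fragment {
    int id;
    bool alreadyAllocated;
};

// Collects ids of fragments allocated by this call. The loop index `i` walks
// every fragment, while `allocatedIndex` advances only when a slot is filled,
// so skipped fragments leave no holes in the output array.
std::size_t collectNewlyAllocated(const std::vector<Fragment> &fragments,
                                  std::vector<int> &allocated) {
    std::size_t allocatedIndex = 0;
    allocated.assign(fragments.size(), -1);
    for (std::size_t i = 0; i < fragments.size(); i++) {
        if (fragments[i].alreadyAllocated) {
            continue; // nothing to allocate for this fragment
        }
        allocated[allocatedIndex++] = fragments[i].id;
    }
    return allocatedIndex;
}
```

Reusing `i` for the output array would leave an empty slot for every skipped fragment.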
Change-Id: I856a99ba4ebdad5375829a43d721c7e1490b18d3
Cleanup unit_tests/CMakeLists.txt
Move shared sources to libult
define linux test projects in linux subdirectory
Change-Id: I0da18c79e6581412a04ddfb3795750db862ad95c
- read only memory cannot be used for allocation,
OSes cannot create graphics allocation for such memory
- if memory allocation fails for host_ptr passed
to enqueueWrite calls, then try doing new allocation
and copy host_ptr on cpu
Change-Id: I415a4673ae1319ea8f77e53bd8fba7489fe85218
cleanup unit_tests/aub_tests/CMakeLists.txt
cleanup unit_tests/elflib/CMakeLists.txt
cleanup unit_tests/libult/CMakeLists.txt
cleanup unit_tests/tbx/CMakeLists.txt
partially cleanup unit_tests/CMakeLists.txt
solution source tree changes:
- make test projects folder as variable
- make platform specific targets folder as variable
- move platform specific targets to "test platforms" folder
Change-Id: Iff7da009e13c3ac9e5af76325be32e5056e8cd7b
- new patch token
- program debug compilation flag
- sip kernel new methods for querying bti and debug
surface size
Change-Id: Icaddd15f269c4b76efdf926f2e346aa61cbaae02
size of structure can vary. Create single point of conversion to extract
required data and store them in Neo specific structures.
Change-Id: I822ec633014aa7394cbd626ecbc275e32e61cf60
- This ensures each kernel has ISH set up after it is created.
- refactor freeBlockPrivateSurfaces to freeBlockResources, this is to properly
clean allocations for blocks
- Add method cleanCurrentKernelInfo to avoid code duplication in KernelInfo
cleanup
Change-Id: I01f155d434579fe5ce2675bc4e89b04628ef8158
- Return error on origin > 0 or region > 1 when it's not allowed
- For 1Darray, array region and origin are stored on 2nd position.
For 2Darray, it's on 3rd position
- Fix map offset for 1Darray image
- Fix CPU data transfer for 1Darray image
Change-Id: Id35ba5f54f117e7af318ca7e6e03c1fc942ce729
New debug option FlattenBatchBufferForAUBDump has been added. When set it
modifies AUB dump in such way that commands from main and chained batch
buffer are dumped as single allocation. Commands from chained batch buffer are
dumped directly after commands from main batch buffer without
MI_BATCH_BUFFER_START. This feature also requires ImmediateDispatch mode which
can be forced using debug option CsrDispatchMode = 1.
Change-Id: I730760791693a748e7f4e1463ce8e7af94287b93
- remove unused cmake code
- small cleanup around scheduler compilation
- remove misleading message related to compiler copying
message is generated before copy_if_different operation and may be
incorrect when such copy doesn't happen
Change-Id: Ia419d1ea26e9149b4282dc4883ddda0232ffd3f4
- New registry flags can be used for applications that want to dump driver
diagnostics without using any additional tools
- When flag is on , context is being created with driver diagnostics and
hint level is being set to debug variable
- If application is already using driver diagnostics the hint level is
overwritten
Change-Id: I9912c0a7e8f23adb8372997144e5b75f9cc05b1d
do all transform and conversion in enumAdapters and return HardwareInfo.
the ADAPTER_INFO structure may vary and SkuInfoTransfer is responsible
to copy/deduce required flags, it can be done as a part of enumAdapter.
Change-Id: Iad6fd5f7094f591a0175025c9ec33a96e55ebdc9
This commit eliminates redundancy in calling processResidency() for AUB CSR
twice in the HW CSR with AUB dump configuration.
Change-Id: Ib49c80fa9d81a495dfb7261ff76e0b9b1422e42d
This commit fixes the issue with image contents writes
in the configuration of CSR HW with AUB dump.
Change-Id: Id0c4f36d4f9eee5175267384d42cb75bf41062f3
-Do not create allocator 32 bit with every DRM memory manager
-This is not needed for apps that do not use this.
-Add allocation of allocator to setForce32BitAddressing
Change-Id: I836b60f6b74eecf678cc9d56851797d0db176107
- depending on argument different parameter size may be returned. we
shouldn't check this at the beginning of file but after checking
parameter name.
- check retVal in profiling ULTs
Change-Id: I18a80545111d6efffd0a176340b3c3234f53af08
- Add new entry point in memory manager for internal allocations.
- Route to allocate32BitGraphicsMemory
- Add new enum to control memory region
- Change mm to memoryManager
Change-Id: I2ee069aa9baf7f69f652022e026569ec4fdb9d77
- Microseconds offer better precision.
- Some workloads require threshold less than 1 millisecond to work
efficiently.
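The precision argument can be illustrated with `std::chrono`. The 500 microsecond threshold is a hypothetical value chosen for this sketch:

```cpp
#include <cassert>
#include <chrono>

// A hypothetical sub-millisecond threshold.
constexpr std::chrono::microseconds thresholdUs{500};

// Converting to whole milliseconds truncates toward zero, so the threshold
// can no longer be represented.
constexpr auto thresholdMs =
    std::chrono::duration_cast<std::chrono::milliseconds>(thresholdUs);
```

`thresholdMs.count()` is 0 here, while `thresholdUs.count()` keeps the full value of 500.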
Change-Id: I1a565049340fb6eeebe5c0a61ededae9959daca8
- Due to use cases where one shared buffer may be mapped to multiple CL
buffers we need to flush DC between enqueues.
Change-Id: I05d7f844afe31d52a0004f5e2e5efa776f9dadbe
- Do not open file twice, loadDataFromFile checks if file is successfully
opened and returns 0 if not.
Change-Id: I8ca73b281ea13033746f8203f482d9af7a2739b7
- Don't make cpu/gpu writes on read-only unmap
- Read/Write on limited map range only
- Overlaps checks for non read-only maps
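An overlap check of this kind could look like the sketch below. `MappedRange` and both functions are hypothetical, and the rule that two read-only maps may overlap is an assumption drawn from the wording above:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of one mapped range.
struct MappedRange {
    std::size_t offset;
    std::size_t size;
    bool readOnly;
};

// Half-open interval test: [offset, offset + size) ranges that only touch
// at a boundary do not overlap.
bool overlaps(const MappedRange &a, const MappedRange &b) {
    return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}

// Assumption: overlapping maps conflict unless both are read-only.
bool conflicts(const MappedRange &a, const MappedRange &b) {
    if (a.readOnly && b.readOnly) {
        return false;
    }
    return overlaps(a, b);
}
```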
- Fixed cmd type on returned event
Change-Id: I98ca542e8d369d2426a87279f86cadb0bf3db299
igdrcl_mocks should be single source for any Neo mock. Because this is
static library, only required compilation unit will be included in
resulting binary
Change-Id: I53019bf8cd86072ccb2be40e82c5136bd50ee15f
When queue is blocked on non-blocking call, map operation is added to
waitlist dependencies. Returning slice/row pitch for map image was skipped
Change-Id: I46f97590315e7aee7fbbfbdb615f383cdb666307
Linking is required for igdrcl_dll target only. Not needed for static
library. This reduces scope of targets where library is required.
Change-Id: Ie48ce1f299ef9d4e484081fe87254869c72ca042
This commit enables AUB dumps in scenarios with images with no host ptr
when resource lock is required to get CPU address and dump image contents.
Change-Id: I996efc5f520d0ac7b470870f7b4eeb9d2ef7b25b
-There was a precision problem with timestamp calculation, all math was using
integers which are not very precise in overflow scenarios
-Change the logic to use doubles and cast back to uint64_t at the end.
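A minimal sketch of the difference, assuming hypothetical helper names and a tick-to-nanosecond conversion (the driver's actual formula may differ):

```cpp
#include <cassert>
#include <cstdint>

// Integer-only variant: ticks * 1e9 wraps around for large tick counts
// before the division can bring the value back into range.
std::uint64_t ticksToNsInteger(std::uint64_t ticks, std::uint64_t frequencyHz) {
    return ticks * 1000000000ull / frequencyHz;
}

// Double-based variant: do the math in floating point and cast back to an
// integer only at the end.
std::uint64_t ticksToNsDouble(std::uint64_t ticks, std::uint64_t frequencyHz) {
    double nsPerTick = 1000000000.0 / static_cast<double>(frequencyHz);
    return static_cast<std::uint64_t>(static_cast<double>(ticks) * nsPerTick);
}
```

For a tick count of 2^40 at a hypothetical 12 MHz timer, the integer variant wraps and returns a value orders of magnitude too small, while the double variant stays close to the expected result.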
Change-Id: Ia08d504a90a43df7330f398af966535ed944650d
remove not needed global variable from mock device and refactor gmm
context initialization for ULTs
Change-Id: I594938a7df7dfaaf7a3cf73f8a13ad85a7b58401
- Introducing MapInfo struct which will be used as container for multiple
map operations
- Unified mapped offset and size for Buffers and Images
- Fixed incorrect map params for CPU and GPU path
- Missing API level checks
Change-Id: Ib4077c9e2c0c333b131ffd5ccbc4a1404920eb5b
-If out of order flag was disabled then pipe control was not having dc flush.
-This could lead to a batch buffer that doesn't end with dc flush.
-This change adds differentiation between pipe controls that may be erased and
pipe controls that are used as a part of epilogue command
Change-Id: Ic9c970c75c89ff524a0e40506eff6dd097760145
RT must override engineType at DeviceFactory, since Wddm CSR uses HardwareInfo
at its ctor.
AUB tests must override engineType at Device ctor since they bypass
DeviceFactory.
Change-Id: I73e4066e9b16aed0410fe39a82726d3baea2e67f
because of deferred deletion some variables are dereferenced after
leaving test scope. this causes invalid stack memory accesses reported
by GCC 7.
Change-Id: I183be8ec3c815a41a75a1f71635d9afb560c7457
- Currently each non-zerocopy CPU operation on map/unmap makes a full copy
using hostPtr
- This commit adds functionality to select specific range of copy
- Multiple mapping with different size is not supported yet,
so copy will be made on full range for now. This is for future usage.
Change-Id: I7652e85482ba6fffb2474169447baf9b080dcd1e
-Do not flush dc for every command in batched mode
-Do that only in immediate mode
-For commands that need DC do not noop pipe controls
-Ensure that each command buffer in batching mode ends with dc flush.
Change-Id: I3cd9d1831c19b69c66092687922f20df7e330245
AUB tests do not use DeviceFactory class to create Device objects but still
need to have a functionality to override default engine type
Change-Id: I6841cb0a9c5726ac4308c742c78cf7a61829f168
-For in order queue application can have fine grain granularity of completion
-For out of order queue application wants to execute workloads concurrently
-This change disables pipe control nooping for ioq calls when event returned.
Change-Id: Iaeaf677f768f7434b2efa1842b50653ab80777ad
- account for initial setting (when set mode was equal to initial(Disabled))
estimate size in cmdStreamCS, program MMIO
Change-Id: Ice218ae986583c8f3bab4f4f6979e38f03e30d7e
mapGPUVA will fail when allocation is still in deferred deleter and using
the same base pointer to map, while there is no reserveGPUVA for SVM range.
In that case driver should drain deleter and retry mapGPUVA call
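The drain-and-retry pattern can be sketched as follows. `mapWithRetry` is a hypothetical wrapper; `tryMap` stands in for the mapGPUVA call and `drainDeleter` for draining the deferred deleter:

```cpp
#include <cassert>
#include <functional>

// Hypothetical wrapper around a mapping call that may fail while a deferred
// deletion still holds the same address range.
bool mapWithRetry(const std::function<bool()> &tryMap,
                  const std::function<void()> &drainDeleter) {
    if (tryMap()) {
        return true;
    }
    // First attempt failed: finish pending deletions, then try exactly once more.
    drainDeleter();
    return tryMap();
}
```

The retry happens only once, so a failure unrelated to the deferred deleter is still reported to the caller.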
Change-Id: I4ded7d79e0cd935ec62d7fae785d66570c847535
- This change enables multiple independent command queues to execute
concurrently without stalling pipe controls in between
- This change removes L3 flushes between kernels
- Dependencies between commands are resolved via task level mechanism
- Out of order queues are not changing task level between submissions
- In order queues are increasing task level between submissions
- Whenever task level changes there is pipe control with cs stall emitted
between GPGPU_WALKERs
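The task level rule can be modelled with a small sketch. `QueueModel` and `needsStallingPipeControl` are hypothetical and greatly simplified compared to the driver:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a command queue's task level bookkeeping.
class QueueModel {
  public:
    explicit QueueModel(bool outOfOrder) : outOfOrder(outOfOrder) {}

    // Returns the task level assigned to this submission. In-order queues
    // bump the level afterwards, out-of-order queues keep it unchanged.
    std::uint32_t submit() {
        std::uint32_t level = taskLevel;
        if (!outOfOrder) {
            taskLevel++;
        }
        return level;
    }

  private:
    bool outOfOrder;
    std::uint32_t taskLevel = 0;
};

// A pipe control with cs stall is needed only when the task level changes
// between two consecutive walkers.
bool needsStallingPipeControl(std::uint32_t previousLevel, std::uint32_t currentLevel) {
    return previousLevel != currentLevel;
}
```

Two submissions to an in-order queue get different levels and are separated by a stall, while two submissions to an out-of-order queue share a level and may run concurrently.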
Change-Id: I558653b296424e4775d060df3072e2a50684b715
- Tag should be updated only as a part of epilogue.
- Level change should only emit pipe control with cs stall
Change-Id: I6e04f794641818b0d046523776d3ce87aec9f606