- Change dirty state helpers to work on IndirectHeaps.
- Instead of comparing size in bytes and CPU pointers, compare the GPU base
address and the size of the heap in pages.
- This removes the need for a dirty flag for heaps that come from the 4GB
allocator.
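The comparison described above could look roughly like the sketch below. The names HeapView, HeapDirtyStateSketch, bytesToPages and the 4096-byte page size are assumptions made for illustration, not the driver's actual types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for an indirect heap; only the fields compared here.
struct HeapView {
    uint64_t gpuBaseAddress; // GPU address programmed as the heap base
    size_t sizeInPages;      // heap size expressed in pages
};

// Assumed page size, used only to convert a byte size into pages.
constexpr size_t pageSize = 4096;

inline size_t bytesToPages(size_t sizeInBytes) {
    assert(sizeInBytes % pageSize == 0); // this sketch expects page-aligned sizes
    return sizeInBytes / pageSize;
}

// Remembers the last programmed heap state and reports whether it changed.
class HeapDirtyStateSketch {
  public:
    bool updateAndCheck(const HeapView &heap) {
        const bool dirty = heap.gpuBaseAddress != gpuBaseAddress ||
                           heap.sizeInPages != sizeInPages;
        if (dirty) {
            gpuBaseAddress = heap.gpuBaseAddress;
            sizeInPages = heap.sizeInPages;
        }
        return dirty;
    }

  private:
    uint64_t gpuBaseAddress = 0;
    size_t sizeInPages = 0;
};
```

Under the assumption that every heap taken from the 4GB allocator reports the same base and the same size in pages, two such heaps compare equal and the state is never reported as dirty.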
Change-Id: I0ff81e3c0945b32e4f872a100cd10b332b27ed24
- notifySourceCode, notifyKernelDebugData, notifyDeviceDestruction
- added processDebugData method in Program
- change options when SLD is active
- add space at the beginning of extension list options
Change-Id: Iac1e52f849544dbfda62407e112cde83fa94e3ad
- For every command buffer that we submit, pass it to gem close worker.
- Gem close worker will do asynchronous cleanup if this resource is meant to
be destroyed.
- If the resource is not meant to be destroyed, we will call the wait IOCTL for
this batch buffer.
- This will result in bumping up GPU clocks and better performance.
Change-Id: If9f181e411d7748573f31682e875a97c5355abe5
- Only Wddm object owns Gdi
- Don't pass Gdi object to constructor
- Move Wddm related files to new directory
Change-Id: Iadd26634c7692db760d7d3367211c32d2c2c8121
- Add L3UltHelper to be able to tell if L3 config is programmable
- Run L3 config kernel tests according to its output
Change-Id: I55b76e2da325d28f62b0bde20250b68f02154ae2
- Use local gmmClientContext instead of pGMMGlobalContext
- ResourceInfo and PTmanager creation from gmmClientContext
- Mock Gmm context creation in Wddm to have only one instance per run
Change-Id: I67e015c57f0ab5524564760fd9a849615615697f
- add source level debugger to device
- load isDebuggerActive function from library
- rename interface to sourceLevelDebuggerInterface in SLD
- add DebugData to KernelInfo with kernel debug data
Change-Id: I2643ee633f8dc5c97e8bbdc9d4e7977ddcbf440d
- Rename misnamed test function
- Adjust 2 tests, so they use CSR size getters instead of hardcoded values
- Move getSizeRequiredPreambleCS() into CommandStreamReceiverHw class
- Improve PreambleHelper size estimating
Change-Id: I3f292d50e08f3d10d190c9f8722e1f0498481154
- This is currently the longest group of tests, with 2k tests which execute
for a second in a Debug64 build on Windows
- Every test_p in this fixture corresponds to ~200 tests.
- Aggregate multiple tests into one to do verification in one shot.
- apply unique_ptr
- remove string creation and propagation
- This effectively removes ~1k tests from the suite while keeping the same
testing functionality.
Change-Id: I19003b38c193073db90dd58724e96b821fd16aea
- This allows applications to force the N:1 aggregation by creating an
out-of-order queue.
- That switches csr to N:1 submission model where commands from multiple
command streams may be aggregated.
- That forces scenarios returning an event to be aggregated as well.
Change-Id: I8fd8d7f88bb2665234ee90870133120b206710a8
- This is required to enable the N:1 submission model.
- If heaps come from different command queues, that always means that
STATE_BASE_ADDRESS needs to be reloaded.
- In order to not emit any non-pipelined state in CSR, this change
moves the ownership of IndirectHeap to one centralized place, which is
CommandStreamReceiver.
- This way, submissions from multiple command queues reuse the same heaps,
which prevents an SBA reload.
Change-Id: I5caf5dc5cb05d7a2d8766883d9bc51c29062e980
- add defines to command line
- remove most occurrences of include "config.h"
Change-Id: I19d65d83c895fc6143d319d057a50e5ae3e78830
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
use this variable in tests as it is set once in main.cpp
create function to get binary kernel filename
Change-Id: Ibf7b4c2d390caefda4a5d7fc4667006e7f2edde8
- use full type specification and remove casts in MemoryManager
- remove TagAllocatorBase, which is no longer used
- make TagAllocator profiling/instrumentation agnostic
- unify UnlimitedTagCount and make part of TagAllocator
Change-Id: I7b5b1ed83aa5e1f0839f611db0530d7e062a3c25
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
- interface to debugger dynamic library
- code is included when source level debugger header exists,
otherwise implementation is excluded from build
- unit tests do not load the real dynamic library,
instead a test version (DebuggerLibrary) of OsLibrary is used.
Change-Id: Id3229c77963352e8001043ee41b7d48c6b180a59
- ThkWrapper had uninitialized mFunc member, setting it
to nullptr
- D3DSurface could dereference null image pointer,
adding validateUpdateData method in SharingHandler
that may return CL_INVALID_MEM_OBJECT if memObject is invalid
Change-Id: Iaa4499bcea47baca156c9d28be4c93ba4f0e1ebb
We don't need BuiltinDispatchInfoBuilder in every place where built-ins
are used, specifically in .cpp files generated from kernel binaries.
Change-Id: Ie739951cdc93873993f78ad14cee656122af51fd
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
- There was a wrong cast in the Graphics Allocation constructor, resulting
in wrong GPU address generation in some sporadic scenarios.
- The problem appears in 32-bit applications where a void* address is cast to
a uint64_t value: a C-style cast sign-extends the address, so its top bit is
propagated into the higher bits and a wrong value is constructed.
0xf0000000 is cast to 0xfffffffff0000000 while it should be cast to
0x00000000f0000000
- added special cast function for further use.
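A minimal sketch of such a cast function is shown below. The name pointerToUint64 is hypothetical and is not necessarily what the driver uses. The widenSigned and widenUnsigned helpers are illustration only: they reproduce the two widening behaviours on a 32-bit value so the difference is visible on a 64-bit host as well.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: widen a pointer to 64 bits without sign extension.
// Going through the unsigned uintptr_t guarantees zero extension on 32-bit builds.
inline uint64_t pointerToUint64(const void *ptr) {
    return static_cast<uint64_t>(reinterpret_cast<uintptr_t>(ptr));
}

// Widening through a signed 32-bit type copies the top bit into the upper half.
inline uint64_t widenSigned(uint32_t address) {
    return static_cast<uint64_t>(static_cast<int32_t>(address));
}

// Widening an unsigned 32-bit value fills the upper half with zeros.
inline uint64_t widenUnsigned(uint32_t address) {
    return static_cast<uint64_t>(address);
}

// Small self-check that documents the values quoted in the message.
inline void checkWidening() {
    assert(widenUnsigned(0xf0000000u) == 0x00000000f0000000ull);
    assert(widenSigned(0xf0000000u) == 0xfffffffff0000000ull);
}
```

Addresses below 0x80000000 are unaffected by either path, which would be consistent with the bug showing up only sporadically.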
Change-Id: I56d53a8e13e17cbacd127566442eea3f6a089977
- Move indirect heap to internal allocator domain.
- Add logic in getIndirectHeap to allocate with proper API depending on
heap type
- Add State base Address programming, reflecting that now Indirect Object
Heap is placed in 4GB domain.
- For AddPatchInfoCommentsForAUBDump mode, keep all heaps in non 4GB mode.
Change-Id: I6862f6a249e444d0d6cfe7e499a10d43f284553e
- generate debug data to .dbg file in cloc
- generate debug kernel for ults with "-g" option
in addition to "-cl-kernel-debug-enable"
- append "-g" option for compilation and build of
programs with kernel debugging enabled to make
compiler generate debug data
Change-Id: I09401f84be6e09da167194a44d1b9a7f2bfb622d
- Add Indirect Heap function that will be used to program State Base Address.
- This is to allow indirect heaps to work in 2 modes: either the heap serves
as the whole indirect allocation, or offsets in 4GB space are used.
Change-Id: Ic4ca1e907c1b30d2f98dc39e8ab945ce35ed6ad0
- Internal allocations may now coexist with non-internal ones on the reusable list.
- Caller now specifies if internal allocation is needed.
- If the criteria are not met, then no allocation is returned.
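The lookup rule could be sketched as follows. Allocation, ReusableAllocationsSketch and the size-based match are assumptions for illustration; the real reusable list may apply additional criteria.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <memory>
#include <utility>

// Hypothetical allocation record: size plus the "internal" domain flag.
struct Allocation {
    size_t size;
    bool internal;
};

class ReusableAllocationsSketch {
  public:
    void push(std::unique_ptr<Allocation> allocation) {
        assert(allocation != nullptr);
        allocations.push_back(std::move(allocation));
    }

    // Returns the first allocation that is large enough and matches the
    // requested domain; returns nullptr when no entry meets both criteria.
    std::unique_ptr<Allocation> obtain(size_t requiredSize, bool internalRequired) {
        for (auto it = allocations.begin(); it != allocations.end(); ++it) {
            if ((*it)->size >= requiredSize && (*it)->internal == internalRequired) {
                std::unique_ptr<Allocation> result = std::move(*it);
                allocations.erase(it);
                return result;
            }
        }
        return nullptr;
    }

  private:
    std::list<std::unique_ptr<Allocation>> allocations;
};
```

A caller asking for a non-internal allocation never receives an internal one, even if it is the only entry large enough.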
Change-Id: I7da3a4f944768b7c8a873e44fd47248f1d76bf9e
- Allow indirect heap to work in 2 modes:
The first mode is when it is used as an allocation from the 4GB allocator.
In such a scenario the driver returns the offset from the base of the allocator region.
The second mode is the legacy mode, which is used by device enqueue. This
results in the heap CPU base address being programmed in State Base Address
commands, and during programming a heap offset base of 0 is returned.
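The two modes can be pictured with the sketch below. IndirectHeapSketch and its member names are hypothetical, and a single heapAddress field stands in for whichever heap base address is programmed in legacy mode.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical view of an indirect heap that can work in either mode.
struct IndirectHeapSketch {
    uint64_t heapAddress;   // address of the heap allocation itself
    uint64_t allocatorBase; // base of the 4GB allocator region (4GB mode only)
    bool from4GBAllocator;  // true: 4GB allocator mode, false: legacy mode

    // Base to program in the State Base Address command.
    uint64_t baseForStateBaseAddress() const {
        return from4GBAllocator ? allocatorBase : heapAddress;
    }

    // Offset of the heap start relative to the programmed base.
    uint64_t heapOffsetBase() const {
        if (!from4GBAllocator) {
            return 0; // legacy mode: the heap itself is the base
        }
        assert(heapAddress >= allocatorBase);
        return heapAddress - allocatorBase;
    }
};
```

In both modes the programmed base plus the returned offset gives back the heap address, so callers can add the offset without knowing which mode is active.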
Change-Id: Ica098f3278b6b6ed5036b4c5ab7461dc61d8ee86
There are differences in qPitch programming between Gen8 and Gen9+
devices, and this requires special handling when an image is zero-copy.
On Gen8, qPitch is a distance in rows, while on Gen9+ it is in pixels.
The minimum value of qPitch is 4, and this causes slicePitch = 4*rowPitch on
Gen8.
To allow zero-copy we have to report the correct rowPitch value, which
should be equal to slicePitch.
Change-Id: I58dea004e3c7f9f4dfabd154d02749c15b6b0246
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
Refactoring in ULTs around preemption:
- refactoring ULTs to not fail with default preemption mode
- fixing ULT memory leaks observed after enabling preemption
- mocking getSipKernel in ULTs (to minimize ULT execution time)
Change-Id: I194b56173d7cb23aae94eeeca60051759c817e10
- program SIP_STATE when either MidThread preemption is enabled
or kernel debugging is active
- device creates correct sip based on preemption mode and
active kernel debugging
Change-Id: I3e43b66ad00d24c2389fa4fc766dd47044b6af80
these tests should be executed after unit_tests target is complete to
ensure everything is ready in environment and to avoid sporadic failures
Change-Id: Ib9f9fdb9f4135441d17761c8dbee0868f1be404b
Signed-off-by: Artur Harasimiuk <artur.harasimiuk@intel.com>
- This is to improve battery usage while waiting in busy loop on CPU
- New Kmd Notify helper to maintain dynamic parameters
- Ask OS about battery status on longer waits
- Pick different timeout when using battery and optimization is disabled
Change-Id: I5f9c8c5a9c635652aac27c707f2b55933947a7fb
There is a problem with Clang 4.0 and Debug builds when bit-field
initialization is used. Depending on the structure size, some bits may
still be set.
This bit field comes from an external component, so we don't have full
control over it. Using memset to clear the structure is a workaround.
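The workaround pattern is roughly the following. ExternalFlags is a made-up structure standing in for the external component's bit field, and makeFlags is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Made-up stand-in for a bit-field structure owned by an external component.
struct ExternalFlags {
    uint32_t cacheable : 1;
    uint32_t tiled : 1;
    uint32_t reserved : 30;
};

// Clear every byte of the structure first, then set only the needed fields,
// instead of relying on bit-field initialization to zero the remaining bits.
inline ExternalFlags makeFlags(bool cacheable) {
    ExternalFlags flags;
    std::memset(&flags, 0, sizeof(flags));
    assert(flags.tiled == 0u && flags.reserved == 0u);
    flags.cacheable = cacheable ? 1u : 0u;
    return flags;
}
```

Because memset works on the raw bytes, it does not depend on how the compiler lowers bit-field initialization, which is the part reported as unreliable here.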
Change-Id: I35062517107fde37e503f1bf8909db856d566254
- On gen8 devices we are not using index to control caching, but we program
caches directly
- In such case we need to rely on values reported from GMM instead of using
Kernel Mocs indexes.
Change-Id: I6c030847509d8f39f63ac98ebd3ebd0b0907e625
- Decision about timeout enabling and value moved out of CSR
- Timeout multiplier is no longer Linux specific
Change-Id: I6858fe2f811ef13802b95e0470e310210a9dea8b
When inheriting task count from parent events,
don't take into account externally synchronized events
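One possible reading of this rule is sketched below. It assumes that inheriting means taking the maximum task count over the parents; ParentEvent and inheritTaskCount are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical parent event: its task count and whether it is
// synchronized externally (outside of the driver).
struct ParentEvent {
    uint32_t taskCount;
    bool externallySynchronized;
};

// Assumption: the inherited task count is the maximum over all parents,
// with externally synchronized parents skipped.
inline uint32_t inheritTaskCount(uint32_t current, const std::vector<ParentEvent> &parents) {
    uint32_t result = current;
    for (const ParentEvent &parent : parents) {
        if (parent.externallySynchronized) {
            continue; // its task count is not meaningful for this event
        }
        result = std::max(result, parent.taskCount);
    }
    assert(result >= current); // the inherited value never moves backwards
    return result;
}
```

Without the skip, a single externally synchronized parent carrying a large placeholder task count would dominate the result.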
Change-Id: I52d861e482669a18e2aca499c813716bb4951b74
- allow creating image 2d from non NV12 image 2d
- validate image descriptor and format when creating an image from an image
Change-Id: Ie7887e75f1450fc723dc1d1ae9ff5639d88835fc
- change the way we handle blocked commands.
- instead of allocating CPU pointer and populating it with commands, create
real IndirectHeap that may be later submitted to the GPU
- that removes a lot of copy operations that were happening on submit time
- for device enqueue, this requires dsh & shh to be passed directly to the
underlying commands, in that scenario device queue buffers are not used
Change-Id: I1124a8edbb46777ea7f7d3a5946f302e7fdf9665
* adding support for map/unmap
* adding support for origin/region validation with mipmaps
* fixing slices returned in map/unmap
* removing ambiguity around mipLevel naming
* enabling cl_khr_mipmap_image in current shape
* enabling cl_khr_mipmap_image_writes in current shape
* fixing CompileProgramWithReraFlag test
Change-Id: I0c9d83028c5c376f638e45151755fd2c7d0fb0ab
- createContextOsProperties is not needed anymore
- replace invalid context property value
0x200D should not be used as an invalid context property value,
as it may be used in the future as a valid property
Change-Id: I569433b0f37bbce083f0d64ecf1dc80ff83bfb46
- While estimating the required size of Indirect Object Heap, we did not
properly handle the case where there are no local ids
- In such case we should allocate one GRF per HW thread that will be unused
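A rough sketch of the corrected estimate follows. The function name, its parameters and the way local ids map to GRFs are assumptions; only the "at least one GRF per HW thread" rule comes from the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical estimate of the per-thread data placed in the Indirect Object Heap.
// localIdGrfsPerThread is 0 when the kernel uses no local ids.
inline size_t estimatePerThreadDataSize(size_t numHwThreads, size_t grfSize,
                                        size_t localIdGrfsPerThread) {
    assert(grfSize > 0);
    // Even without local ids, reserve one (unused) GRF per HW thread.
    const size_t grfsPerThread = std::max<size_t>(localIdGrfsPerThread, 1);
    return numHwThreads * grfsPerThread * grfSize;
}
```

For example, with 8 HW threads and an assumed 32-byte GRF, a kernel without local ids would still reserve 256 bytes.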
Change-Id: Ibcd359e431e3ffd9d55628ac7cf7eeefad72e7ba
- cpu virtual address was used instead of gpu va
- this caused incorrect behaviour of TBX server when
special heap allocator assigning GPU addresses was used
Change-Id: I2328cf2441be797311fd6a3c7b331b0fff79d4fc
Any CPU related updates such as clEnqueueMapBuffer or similar
need to trigger a re-dump of memory prior to the next clEnqueue call.
Change-Id: I7b31e559278e92ff55b6ebab8ef4190caef1ebc0
For image with defined sharingHandler test:
- enqueueAcquireSharedObjects
- enqueueReleaseSharedObjects
Change-Id: I8835e4a4aa06a08e57dc207b168810162e44445c
A call to FCL can be costly. We don't need it when the kernel source is
simple and does not contain '#include'. In this case we can compute the hash
directly from the kernel source.
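The shortcut might be sketched as below. std::hash stands in for whatever hash the driver really uses, and expensiveFrontend is a hypothetical callback representing the costly FCL call.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// True when the source has no '#include', so it can be hashed as-is.
inline bool canHashSourceDirectly(const std::string &source) {
    return source.find("#include") == std::string::npos;
}

// expensiveFrontend is a hypothetical callback for the costly path; it is
// only invoked when the source contains '#include'.
inline size_t computeKernelHash(
    const std::string &source,
    const std::function<std::string(const std::string &)> &expensiveFrontend) {
    if (canHashSourceDirectly(source)) {
        return std::hash<std::string>{}(source);
    }
    assert(expensiveFrontend);
    return std::hash<std::string>{}(expensiveFrontend(source));
}
```

The plain substring check is deliberately conservative: any occurrence of '#include', even inside a comment, sends the source down the slow path.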
Change-Id: I0455be57d9ee13919a53c145e3feeb00a113d71e
- Do not obtain pattern allocation from reusable pool.
- This is due to the fact that it may contain allocations from internal
heap, which cannot be used for arguments declared as kernel argument.
Change-Id: I6c73445c409edc4ce25f8d8eba966f512dfd6cc9
- Move Windows HardwareInfo configuration from DeviceFactory to HwInfoConfig
- Add ULTs for HwInfoConfig on Windows
Change-Id: I9b84bbe60ca9f2ad4ddc3119bc8cb88331a7d154
- remove unused MemoryManagementFixture.
Memory leaks are tracked using MemoryLeakListener, so there is no need to
duplicate this with the fixture.
MMF should be used when you need to inject memory allocation failure
Change-Id: I95bcaa7051acf540c5b015c5489ed6a6fc38ee8e
- Align SIP kernel & STATE_SIP programming.
- on Linux address may be non 0
- on Windows address is expected to be always 0
Change-Id: I385ed59ef652382f3f17d1afe55f6050d07ed1f4
use preprocessor sequence to convert define value to string:
#define q(a) #a
#define tostr(b) q(b)
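The two-level macro matters because # stringifies its argument without expanding it first. A small illustration follows; DRIVER_VERSION is a made-up define used only for this example.

```cpp
#include <cassert>
#include <cstring>

#define q(a) #a
#define tostr(b) q(b)

// Made-up define for this example only.
#define DRIVER_VERSION 1234

// tostr expands its argument before q stringifies it.
inline const char *expandedValue() { return tostr(DRIVER_VERSION); }

// q alone stringifies the macro name itself.
inline const char *unexpandedName() { return q(DRIVER_VERSION); }

inline void checkStringify() {
    assert(std::strcmp(expandedValue(), "1234") == 0);
    assert(std::strcmp(unexpandedName(), "DRIVER_VERSION") == 0);
}
```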
Change-Id: If0a9ccfcc543523309be4995c05125bf8fbf2081
- This removes Instruction Heap allocation from enqueue path
- Blocked path is handled as well
- Heap is no longer allocated on demand; it is bound to kernelInfo.
Change-Id: I54545beceed3404ee0330a8bac2b0934944cac30
- Switch to internal heap for kernel ISA allocations.
- remove IH from various functions
- remove IHState from CSR, IH is never dirty
- ISA is no longer copied on enqueue calls.
Change-Id: I0099cf2a9ebab6192ea03a74dd35f7da963fd5a5
- Allocator now uses uint64_t instead of void*.
- This is due to the fact that it is required to work on 64 bit addresses
in 32 bit dll.
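To illustrate why the address type matters, here is a minimal bump-style range allocator working purely on uint64_t. RangeAllocatorSketch is a hypothetical class and much simpler than a real heap allocator.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical allocator over a GPU address range. Addresses are plain
// uint64_t values, so the range may lie above 4GB even in a 32-bit build,
// where a void* could only hold 32 bits.
class RangeAllocatorSketch {
  public:
    RangeAllocatorSketch(uint64_t base, uint64_t size) : next(base), end(base + size) {
        assert(base != 0);   // 0 is reserved as the failure value
        assert(end >= base); // the range must not wrap around
    }

    // Returns the start of the allocated block, or 0 when it does not fit.
    uint64_t allocate(uint64_t size) {
        if (size == 0 || size > end - next) {
            return 0;
        }
        const uint64_t address = next;
        next += size;
        return address;
    }

  private:
    uint64_t next;
    uint64_t end;
};
```

With void* as the address type, the 0x100000000 base used in the checks for this sketch could not even be represented in a 32-bit dll.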
Change-Id: Ia715ea7913efc95a2974aff8dff390203d8125a8