Add support for different timestamp packet counts per gfx family.
Change all packet counts to 1 except for xe-hpc.
Related-To: NEO-8154
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
CmdList can be released before Event. In this case, GfxAllocation
destruction must be deferred.
Related-To: NEO-7966
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
config.file should not be created manually by the user.
In a scenario where the user manually creates an empty config.file,
reading data from this file fails
because the file is empty.
Such a scenario completely blocks cache creation
until the user manually deletes the empty config file.
This patch fixes the freeze by automatically deleting the config file
if the read fails with the ERROR_HANDLE_EOF error.
The patch applies to Windows only.
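
The recovery step can be sketched in portable C++ as follows. This is only
an illustration, not the driver code: the patch itself reacts to the Windows
ERROR_HANDLE_EOF error, while this sketch treats a short read (including an
empty file) as the equivalent condition. The names `readConfigOrRecover` and
`ConfigReadResult`, and the 64-bit payload, are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical result type, not taken from the driver sources.
enum class ConfigReadResult { Ok, RemovedUnreadableFile, Missing };

// Tries to read one 64-bit value from the config file. A short read
// (for example from an empty file) stands in for ERROR_HANDLE_EOF:
// the file is deleted so that a later attempt can recreate it.
ConfigReadResult readConfigOrRecover(const fs::path &configPath, std::uint64_t &valueOut) {
    assert(!configPath.empty());
    std::error_code ec;
    if (!fs::exists(configPath, ec)) {
        return ConfigReadResult::Missing;
    }
    std::ifstream file(configPath, std::ios::binary);
    file.read(reinterpret_cast<char *>(&valueOut), sizeof(valueOut));
    if (file.gcount() == static_cast<std::streamsize>(sizeof(valueOut))) {
        return ConfigReadResult::Ok;
    }
    file.close();               // release the handle before deleting
    fs::remove(configPath, ec); // best effort; errors are ignored here
    return ConfigReadResult::RemovedUnreadableFile;
}
```

With this shape, the caller can treat `RemovedUnreadableFile` the same way as
`Missing` and create a fresh config on the next attempt.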
Related-To: NEO-8092
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
- this change handles level zero immediate command lists on copy engine
- monitor fence will be dispatched for blocking calls
- asynchronous mode will dispatch monitor fence only on host synchronization
Related-To: NEO-8395
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
In L0 it's not possible to track object relations. For example, a CmdList
may be removed before an Event.
In such a case, the Event needs to safely skip the unregister call without
accessing the CmdList/CmdQueue object.
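
A minimal sketch of the "skip unregister when the owner is gone" idea is
shown below. It uses `std::weak_ptr` as a stand-in technique, which is an
assumption made for illustration only: the commit does not say how the
driver detects the removed CmdList, and the `CmdList` and `Event` classes
here are hypothetical simplifications, not the L0 types.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <unordered_set>

class Event;

// Hypothetical stand-in for a command list that tracks registered events.
class CmdList {
  public:
    void registerEvent(const Event *event) { events.insert(event); }
    void unregisterEvent(const Event *event) { events.erase(event); }
    std::size_t registeredCount() const { return events.size(); }

  private:
    std::unordered_set<const Event *> events;
};

class Event {
  public:
    void attach(const std::shared_ptr<CmdList> &cmdList) {
        assert(cmdList != nullptr);
        cmdList->registerEvent(this);
        owner = cmdList; // non-owning reference
    }

    // Returns true if unregister was called, false if it was skipped
    // because the command list no longer exists.
    bool detach() {
        std::shared_ptr<CmdList> cmdList = owner.lock();
        owner.reset();
        if (!cmdList) {
            return false; // CmdList already released: do not touch it
        }
        cmdList->unregisterEvent(this);
        return true;
    }

    ~Event() { detach(); }

  private:
    std::weak_ptr<CmdList> owner;
};
```

The point of the sketch is only that the Event never dereferences a released
CmdList; any mechanism that gives the Event this knowledge would serve.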
Related-To: NEO-8884
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
So far, there is a separate page allocated for each kernel's ISA within
`KernelImmutableData::initialize()`. Apparently the ISA blocks are often
much smaller than a 64k page, which leads to poor memory utilization and
was even observed to cause the device OOM error if a single module has
several kernels.
Improve the situation by reusing the parent allocation (owned by the
module instance) for modules whose kernel ISAs can fit together within
a single 64k page. This improves the memory utilization on a single
module level.
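
The fit check described above can be sketched as a small packing routine.
This is a hedged illustration, not the code from
`KernelImmutableData::initialize()`: the function name
`packIsaIntoSinglePage` and the ISA alignment parameter are assumptions,
and only the 64k page size comes from the description.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

constexpr std::size_t pageSize64k = 64 * 1024;

// Rounds value up to a multiple of alignment (alignment is a power of two).
constexpr std::size_t alignUp(std::size_t value, std::size_t alignment) {
    return (value + alignment - 1) & ~(alignment - 1);
}

// Returns the offset of each kernel ISA inside one shared 64k page,
// or std::nullopt when the ISAs do not fit together.
std::optional<std::vector<std::size_t>> packIsaIntoSinglePage(const std::vector<std::size_t> &isaSizes,
                                                              std::size_t isaAlignment) {
    assert(isaAlignment != 0 && (isaAlignment & (isaAlignment - 1)) == 0);
    std::vector<std::size_t> offsets;
    offsets.reserve(isaSizes.size());
    std::size_t cursor = 0;
    for (std::size_t size : isaSizes) {
        const std::size_t offset = alignUp(cursor, isaAlignment);
        if (offset > pageSize64k || size > pageSize64k - offset) {
            return std::nullopt; // caller falls back to per-kernel allocations
        }
        offsets.push_back(offset);
        cursor = offset + size;
    }
    return offsets;
}
```

For example, with an assumed alignment of 256 bytes, ISAs of 1000, 2000 and
300 bytes land at offsets 0, 1024 and 3072 of the shared page, while two
40000-byte ISAs do not fit and keep separate allocations.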
Related-To: NEO-7788
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
Changes:
- replaced registry keys with environment variables
for cl_cache in OCL
- added compiler cache helpers
- implemented support for new env vars on Windows
- added tests
New env vars mechanism works as follows:
If `PERSISTENT_CACHE` is set,
driver checks if `NEO_CACHE_DIR` is set.
If `NEO_CACHE_DIR` is not set,
driver uses `%LocalAppData%\NEO\neo_compiler_cache`
as `cl_cache` destination folder.
If `NEO_CACHE_DIR` is not set and `%LocalAppData%`
path could not be obtained,
compiler cache is disabled.
In the current Windows implementation,
special characters in the folder path are not supported.
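
The lookup order described above can be sketched as below. This is an
illustration under stated assumptions, not the driver implementation: the
function `resolveClCacheDir` and the injected `EnvLookup` callback are
hypothetical, `%LocalAppData%` is modeled as a plain lookup by name, any
value counts as "set", and the cache is assumed to be disabled when
`PERSISTENT_CACHE` is not set.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <string>

// Hypothetical lookup callback: returns the value of a variable if it is set.
using EnvLookup = std::function<std::optional<std::string>(const std::string &)>;

// Returns the cl_cache destination folder, or std::nullopt when the
// compiler cache is disabled.
std::optional<std::string> resolveClCacheDir(const EnvLookup &getEnv) {
    assert(static_cast<bool>(getEnv));
    if (!getEnv("PERSISTENT_CACHE")) {
        return std::nullopt; // assumption: cache stays off when not requested
    }
    if (auto cacheDir = getEnv("NEO_CACHE_DIR")) {
        return cacheDir; // explicit directory wins
    }
    if (auto localAppData = getEnv("LocalAppData")) {
        return *localAppData + "\\NEO\\neo_compiler_cache"; // default location
    }
    return std::nullopt; // no usable path: compiler cache is disabled
}
```

Passing the lookup as a callback keeps the decision logic independent of how
the values are actually obtained on a given platform.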
Related-To: NEO-8092
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
Fix bug introduced in neo 27314 - splitBarrierRequired was set for all
commands, but it should be set only for cl_command_barrier.
Related-To: NEO-8147
Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
value depends on CCS count:
- single CCS mode (default) - 50% available
- two CCS mode - 25% available
- four CCS mode - 12.5% available
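
The three listed values happen to follow 50% divided by the CCS count. A
small sketch of that arithmetic is given below; the function name
`availablePercentForCcsCount` is hypothetical, and the formula is only an
observation that matches the three modes above, not a statement about other
CCS counts.

```cpp
#include <cassert>

// Returns the available percentage for the CCS modes listed above.
// Only the 1, 2 and 4 CCS modes are covered by the description.
double availablePercentForCcsCount(unsigned ccsCount) {
    assert(ccsCount == 1 || ccsCount == 2 || ccsCount == 4);
    return 50.0 / static_cast<double>(ccsCount);
}
```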
Related-To: NEO-8377
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
- unify Linux and Windows default settings
- unify override default code
- correct size estimation when fence is required
- call virtual function once for both estimation and dispatch
Related-To: NEO-8395
Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
- when Surface State is reused for new resource, State Cache needs to be
invalidated
Related-To: NEO-7063
Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>