3060 Commits

Author SHA1 Message Date
Krzysztof Gibala
2e7c90e58f Add debug flag to enable specific PIPE_CONTROL fields
FlushSpecificCache value for each field:

dcFlushEnable                        0b000000000001
renderTargetCacheFlushEnable         0b000000000010
instructionCacheInvalidateEnable     0b000000000100
textureCacheInvalidationEnable       0b000000001000
pipeControlFlushEnable               0b000000010000
vfCacheInvalidationEnable            0b000000100000
constantCacheInvalidationEnable      0b000001000000
stateCacheInvalidationEnable         0b000010000000
tlbInvalidation                      0b000100000000
hdcPipelineFlush                     0b001000000000
unTypedDataPortCacheFlush            0b010000000000
compressionControlSurfaceCcsFlush    0b100000000000

Multiple caches can be selected at once by combining their bits, for example:

constantCacheInvalidationEnable +
textureCacheInvalidationEnable +
vfCacheInvalidationEnable            = 0b000001101000

Related-To: NEO-6049
Signed-off-by: Krzysztof Gibala <krzysztof.gibala@intel.com>
2022-09-28 11:17:03 +02:00
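The bit values listed in commit 2e7c90e58f combine with a bitwise OR. A minimal C++ sketch of that arithmetic follows. Only the bit positions come from the commit message; the enum name, the helper isSet and the constant exampleMask are hypothetical and are not taken from the NEO sources.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions copied from the commit message of 2e7c90e58f.
// The enum name itself is an assumption made for this illustration.
enum FlushSpecificCacheBit : uint32_t {
    dcFlushEnable                     = 1u << 0,
    renderTargetCacheFlushEnable      = 1u << 1,
    instructionCacheInvalidateEnable  = 1u << 2,
    textureCacheInvalidationEnable    = 1u << 3,
    pipeControlFlushEnable            = 1u << 4,
    vfCacheInvalidationEnable         = 1u << 5,
    constantCacheInvalidationEnable   = 1u << 6,
    stateCacheInvalidationEnable      = 1u << 7,
    tlbInvalidation                   = 1u << 8,
    hdcPipelineFlush                  = 1u << 9,
    unTypedDataPortCacheFlush         = 1u << 10,
    compressionControlSurfaceCcsFlush = 1u << 11,
};

// Hypothetical helper: true when the given field is selected in flagValue.
constexpr bool isSet(uint32_t flagValue, FlushSpecificCacheBit bit) {
    return (flagValue & bit) != 0;
}

// The combination used as the example in the commit message.
constexpr uint32_t exampleMask = constantCacheInvalidationEnable |
                                 textureCacheInvalidationEnable |
                                 vfCacheInvalidationEnable;

static_assert(exampleMask == 0b000001101000, "matches the value in the commit message");
```

The combined value is 104 in decimal, which is the form a numeric debug setting would most likely take.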
Jim Snow
eaa4965ae8 Allocate RTDispatchGlobals as unboxed array
Previously we used an array-of-pointers approach, but using an
array-of-structures is in some ways simpler.

We also split out the RTStack as a separate allocation.

Related-To: LOCI-2966

Signed-off-by: Jim Snow <jim.m.snow@intel.com>
2022-09-28 03:42:14 +02:00
Dominik Dabek
d8b7d56160 Copy host ptr on cpu if possible in clCreateBuffer
Use a CPU copy through a locked pointer when possible, because this is
faster than copying on the GPU.
Limited to buffers of at most 64 KB.

Related-To: NEO-7332

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2022-09-27 17:54:06 +02:00
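The size rule described in commit d8b7d56160 can be sketched as a simple predicate. This is an illustration only, not the code from the commit: the names cpuCopyThreshold, shouldCopyOnCpu and canLockAllocation are hypothetical, and the real driver may apply further conditions before choosing the CPU path.

```cpp
#include <cassert>
#include <cstddef>

// 64 KB limit taken from the commit message; the constant name is an assumption.
constexpr std::size_t cpuCopyThreshold = 64 * 1024;

// Hypothetical predicate: choose the CPU path only when the allocation can be
// locked for CPU access and the buffer is small enough.
constexpr bool shouldCopyOnCpu(std::size_t bufferSize, bool canLockAllocation) {
    return canLockAllocation && bufferSize <= cpuCopyThreshold;
}

static_assert(shouldCopyOnCpu(4096, true), "small lockable buffer uses the CPU copy");
static_assert(!shouldCopyOnCpu(1024 * 1024, true), "large buffer stays on the GPU path");
```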
Szymon Morek
7ded401615 [L0][XE_HPC]Perform memcpy on CPU by default
Related-To: NEO-7237

Enable copy on CPU by default.
This commit also replaces the barrierCounter counter with a bool,
barrierCalled.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2022-09-27 17:32:56 +02:00
Krystian Chmielewski
596e9f815c 32bit zebin support
This commit adds support for 32 bit zebinary in NEO runtime and in
ocloc validate.

Resolves: NEO-7288

Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
2022-09-27 11:12:05 +02:00
Spruit, Neil R
b5b9c3500f Support for L0 to read Device LUID from the WDDM driver using EXT Properties
- Added support for reading the LUID of the given device used in
Windows WDDM.
- Added initial support for passing back a NodeMask of 1.

Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
2022-09-27 08:18:50 +02:00
Compute-Runtime-Validation
d7eacc0280 Revert "Support for L0 to read Device LUID from the WDDM driver using EXT Pro...
This reverts commit af3dd2859b.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-09-27 06:20:55 +02:00
Spruit, Neil R
af3dd2859b Support for L0 to read Device LUID from the WDDM driver using EXT Properties
- Added support for reading the LUID of the given device used in
Windows WDDM.
- Added initial support for passing back a NodeMask of 1.

Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
2022-09-26 19:05:05 +02:00
Zbigniew Zdanowicz
f0888fece2 Rename command list tracking debug flag and variables
This change reflects the exact nature of the debug variable and what the
code is actually doing.

Related-To: NEO-7187

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2022-09-26 18:59:39 +02:00
Michal Mrozek
2fbc1f652b Choose alignment as next power of 2 for HEAP_EXTENDED allocations.
This way we will get as big pages as possible without leftovers.

Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2022-09-26 16:42:59 +02:00
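One reading of "next power of 2" in commit 2fbc1f652b is rounding the allocation size up to the nearest power of two and using that as the alignment. The sketch below shows only that rounding step, under that assumption. The function name nextPowerOfTwo is hypothetical and the commit may compute the alignment differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: smallest power of two that is >= size.
// A size of 0 maps to 1; sizes above 2^63 have no 64-bit answer and return 0.
constexpr uint64_t nextPowerOfTwo(uint64_t size) {
    constexpr uint64_t highestPowerOfTwo = 1ull << 63;
    if (size > highestPowerOfTwo) {
        return 0; // no 64-bit power of two is large enough
    }
    uint64_t result = 1;
    while (result < size) {
        result <<= 1; // double until the candidate covers the requested size
    }
    return result;
}

// A 3 MB request would be aligned to 4 MB under this reading.
static_assert(nextPowerOfTwo(3ull * 1024 * 1024) == 4ull * 1024 * 1024, "3 MB rounds up to 4 MB");
```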
Mateusz Hoppe
7ff258fc92 L0Debug - Enable attaching to Root or Subdevices
- enable tile attach mode by default
- both root device and subdevice may be attached to

Related-To: NEO-7347

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-09-26 16:03:54 +02:00
Zbigniew Zdanowicz
57d35c8932 Add state compute mode tracking
Related-To: NEO-5019

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2022-09-26 14:36:37 +02:00
Szymon Morek
ec04de61a7 [L0][XE_HPC]Perform memcpy on CPU for non-usm ptrs
Related-To: NEO-7237

If the size is small enough, it is more efficient to
perform the copy through a locked ptr on the CPU.
This change also introduces an experimental flag to
enable this.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2022-09-26 13:20:40 +02:00
Bellekallu Rajkiran
7f8e9378b6 Adjust ccs on reinit
Parse and adjust the ccs count on reset so that the initial
environment is restored.

Related-To: LOCI-3435

Signed-off-by: Bellekallu Rajkiran <bellekallu.rajkiran@intel.com>
2022-09-26 11:24:21 +02:00
Zbigniew Zdanowicz
a95ab1d16b Share pipeline select state updates between regular and immediate command lists
Related-To: NEO-5019

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2022-09-26 11:14:53 +02:00
Luzynski, Sebastian Jozef
bac85ddb25 Move kernel globals from SVM to USM device
With this change, a module's data sections will be allocated in the USM
device pool instead of SVM or USM shared.

Signed-off-by: Luzynski, Sebastian Jozef <sebastian.jozef.luzynski@intel.com>
2022-09-23 23:06:15 +02:00
Compute-Runtime-Validation
f5575a1370 Revert "Remove fallback path for PAT index programming"
This reverts commit faf8d51f6d.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-09-23 20:46:31 +02:00
Dunajski, Bartosz
6175a3e785 Debug flag to force stateless mocs encryption bit
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2022-09-23 15:19:26 +02:00
Zbigniew Zdanowicz
5986a7199a Share front end state updates between regular and immediate command lists
Related-To: NEO-5019

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2022-09-23 09:46:35 +02:00
Zbigniew Zdanowicz
e960802e33 Add pipeline select state tracking
This optimization removes pipeline select from the command list preamble
and presents it to the command queue for any necessary state update.
The code is disabled by default and available under a debug key.

Related-To: NEO-5019

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2022-09-23 08:21:00 +02:00
Compute-Runtime-Validation
7aecea534f Revert "Default L0 Function & Global Symbols with fallback build for SPIRv"
This reverts commit 88b7a4f82d.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-09-23 07:07:04 +02:00
Dunajski, Bartosz
98db084b59 Debug flag to append api module build options
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2022-09-22 16:03:58 +02:00
Dunajski, Bartosz
b2001bf265 L0: GRF mode debug flags support
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2022-09-22 15:27:31 +02:00
Yates, Brandon
7dc36ca422 L0 Win Debugger - fix slice mapping bug
Related-to: LOCI-3429
Signed-off-by: Yates, Brandon <brandon.yates@intel.com>
2022-09-22 14:40:13 +02:00
Fabian Zwolinski
645600d141 Return error when there is no memory to evict
We want to return an error code to the application instead of aborting
when we are not able to make more memory resident.

Related-To: NEO-7289
Signed-off-by: Fabian Zwolinski <fabian.zwolinski@intel.com>
2022-09-22 14:26:55 +02:00
Mateusz Jablonski
501873d0e0 Add virtual keyword to IoctlHelper methods for frequency files
Related-To: NEO-7300

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-09-22 14:13:03 +02:00
Maciej Plewka
1458602efc Store indirect residency at command queue level
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2022-09-22 14:07:19 +02:00
Zbigniew Zdanowicz
81f2d04f5a correct and unify programming of front end disable overdispatch property support
Related-To: NEO-5019

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2022-09-22 13:13:38 +02:00
Naklicki, Mateusz
ec3668fc18 Add initialization method to ioctl helpers
Signed-off-by: Naklicki, Mateusz <mateusz.naklicki@intel.com>
2022-09-22 11:55:59 +02:00
Krystian Chmielewski
311b0b0020 Create input for linker during zebin decoding
Remove code duplication. Parsing zebin elf for relocations and symbols
is moved to decodeSingleDeviceBinary.

Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
2022-09-22 11:12:39 +02:00
Neil R Spruit
88b7a4f82d Default L0 Function & Global Symbols with fallback build for SPIRv
- Enabled default generation of Program & Global Symbols by IGC when
building L0 Modules, with the ability to fall back to the previous
behavior through build failure checks.

- Enabled selective disabling of default program or global symbol
generation through debug variables.

Signed-off-by: Neil R Spruit <neil.r.spruit@intel.com>
2022-09-22 02:40:51 +02:00
Mateusz Jablonski
9bde277184 Read frequency from file system based on drm version
Related-To: NEO-7300
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-09-21 13:28:18 +02:00
Igor Venevtsev
43676ed02a L0Debug Win: initialize device with empty submission
For a debuggable context, the device should additionally be
initialized by issuing an early empty submission.

Signed-off-by: Igor Venevtsev <igor.venevtsev@intel.com>
2022-09-21 12:02:34 +02:00
Maciej Bielski
56cb1f757b programStateBaseAddress: improve code reuse
Another step towards cleaner callers of
StateBaseAddressHelper<>::programStateBaseAddress.

Export programming state base address into a separate function to
improve code reuse and reduce copy-pasted fragments, which make code
modifications or maintenance more and more difficult over time. Use
specialization for gen-specific variations.

Related-To: NEO-6774
Signed-off-by: Maciej Bielski <maciej.bielski@intel.com>
2022-09-21 11:54:57 +02:00
Michal Mrozek
bddf8c7dbc Do not make resident something that is already resident.
Move checks to upper layers.
100ns gain in ZE_AFFINITY_MASK=0.0 PrintDebugSettings=1
./api_overhead_benchmark_l0 --test=ExecuteCommandListImmediate --api=l0
--UseProfiling=0 --CallsCount=1 --MeasureCompletionTime=0
--useBarrierSynchronization=0 --KernelExecutionTime=1 --iterations=1000

Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2022-09-21 10:59:35 +02:00
Dunajski, Bartosz
faf8d51f6d Remove fallback path for PAT index programming
Signed-off-by: Dunajski, Bartosz <bartosz.dunajski@intel.com>
2022-09-21 10:46:43 +02:00
Lukasz Jobczyk
efac290ba3 Do not use selector copy engine
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2022-09-20 21:49:00 +02:00
Mateusz Hoppe
92893a5101 L0Debug - add support for mirrored isa heaps
- allow tileInstanced ISA while debugging

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2022-09-20 19:32:00 +02:00
Compute-Runtime-Validation
643e21631c Revert "Store indirect residency at command queue level"
This reverts commit ffad5c6c09.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-09-20 18:12:12 +02:00
Mateusz Jablonski
99d63facb5 Clarify meaning of ForceDeviceId debug flag
This flag can be used only to override the device id in AUB/TBX mode.

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-09-20 13:15:15 +02:00
Mateusz Jablonski
cfe51ff2ba Remove not used isSimulation functions
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2022-09-20 11:01:55 +02:00
Kamil Kopryk
17d87a4c69 Add RemoveUserFenceInCmdlistResetAndDestroy debug flag
Related-To: NEO-7156
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2022-09-19 22:35:53 +02:00
Michal Mrozek
3d5e34f727 Reduce the size of masks to 4.
32 is not required.

Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2022-09-19 21:53:40 +02:00
Maciej Plewka
ffad5c6c09 Store indirect residency at command queue level
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>

Related-To: NEO-7211
2022-09-19 17:01:20 +02:00
Michal Mrozek
fc9352cfcb Optimize binding process.
- Do not iterate when all devices are parsed
- Early continue if given device not present in context

200ns (+10%) in the scenario below from compute-benchmarks:
ZE_AFFINITY_MASK=0.0 PrintDebugSettings=1 ./api_overhead_benchmark_l0
--test=ExecuteCommandListImmediate --api=l0 --UseProfiling=0
--CallsCount=1 --MeasureCompletionTime=0 --useBarrierSynchronization=0
--KernelExecutionTime=1 --iterations=1000

Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2022-09-19 16:47:34 +02:00
Milczarek, Slawomir
0192e8038f Check for GPU hang in path with wait for timestamps
Related-To: NEO-6868

Signed-off-by: Milczarek, Slawomir <slawomir.milczarek@intel.com>
2022-09-19 15:01:46 +02:00
Zbigniew Zdanowicz
8eaa9d690e add tracking of the state of pipeline select for command lists and queues
This change prepares the infrastructure for pipeline select handling in
command lists and queues, with the aim of reducing the number of commands
dispatched.
State is synchronized between flush-task immediate and regular command lists.
The next step is to add the optimization itself, which disables the legacy
hw command dispatch algorithm.
This change also corrects ADL-P support for systolic mode changes.

Related-To: NEO-5019

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2022-09-19 11:57:34 +02:00
Compute-Runtime-Validation
45c8124d8f Revert "Move kernel globals from SVM to USM device"
This reverts commit 706a5a7a8c.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2022-09-18 12:49:44 +02:00
Krystian Chmielewski
b7a780868a Prepare OCL tests for switch to zebin
Signed-off-by: Krystian Chmielewski <krystian.chmielewski@intel.com>
2022-09-16 15:33:26 +02:00
Sebastian Luzynski
706a5a7a8c Move kernel globals from SVM to USM device
With this change, a module's data sections will be allocated in the USM
device pool instead of SVM or USM shared.

Signed-off-by: Sebastian Luzynski <sebastian.jozef.luzynski@intel.com>
2022-09-15 16:50:12 +02:00