Commit Graph

2418 Commits

Author SHA1 Message Date
Bartosz Dunajski
696b02bfd3 fix: improve TBX downloading after L0 Event sync
Related-To: HSD-18038498579

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-08-23 10:42:17 +02:00
Compute-Runtime-Validation
4b01058706 Revert "performance: Defer special queue init to first use"
This reverts commit 25bb3c87ad.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-08-23 09:09:17 +02:00
Lukasz Jobczyk
c1a5fb089b performance: Add copy buffer rect middle builtin
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-08-22 10:30:17 +02:00
Lukasz Jobczyk
25bb3c87ad performance: Defer special queue init to first use
Resolves: NEO-12332

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-08-22 10:07:44 +02:00
Compute-Runtime-Validation
d8ea5516b2 Revert "performance: Add copy buffer rect middle builtin"
This reverts commit bbb44c7a4d.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-08-22 04:41:42 +02:00
Mateusz Jablonski
7ac41615cd fix: create thread with function pointer
don't create async thread in neo shared tests

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-08-21 18:02:37 +02:00
Lukasz Jobczyk
bbb44c7a4d performance: Add copy buffer rect middle builtin
Resolves: NEO-12132

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-08-21 14:41:44 +02:00
Mateusz Jablonski
f617093a6a fix: add missing nullptr check before accessing ail helper
Fixes: #755
Fixes: #754
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-08-20 16:58:53 +02:00
Mateusz Jablonski
579af57161 refactor: don't call OsLibrary::load directly, use function pointer
this allows mocking this call in ULT

Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-08-20 08:59:26 +02:00
Compute-Runtime-Validation
9b652f4a34 Revert "feature: Improving information transfer about the copy engine"
This reverts commit 17ffdff4f1.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-08-15 22:06:31 +02:00
Andrzej Koska
17ffdff4f1 feature: Improving information transfer about the copy engine
Related-To: NEO-11934

Signed-off-by: Andrzej Koska <andrzej.koska@intel.com>
2024-08-14 11:28:29 +02:00
Maciej Plewka
27488a8315 fix: Add option to use divergent paths for barriers
Related-To: NEO-12159, IGC-9846, HSD-16024235164, HSD-15014782386

Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-08-12 12:15:54 +02:00
Bartosz Dunajski
d76ac1d1de fix: scratch controller residency
Related-To: HSD-18039519400

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-08-09 14:47:56 +02:00
Kamil Kopryk
775b14a7f6 fix: add ioh alignment in heapless
Related-To: NEO-11871

Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2024-08-09 12:20:00 +02:00
Dominik Dabek
ad229377b9 fix: disable indirect detection if any stack calls
Don't know if kernels will be initialized in the order needed to check
for indirect accesses in stack calls.

Remove now unused functionPointerWithIndirectAccessExists and reading
this value from zebin.

Related-To: NEO-12235

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-08-07 14:48:58 +02:00
Kamil Kopryk
38a194eee6 fix: scratch address from implicit args in ocl
Related-To: NEO-12237
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2024-08-07 09:40:27 +02:00
Michal Mrozek
d52ca080bd Revert "performance: improve pool handling"
This reverts commit a3c3b6533a.

Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-08-06 13:04:02 +02:00
Kamil Kopryk
2a9bcdeb83 refactor: pass outImplicitArgs to patchImplicitArgs function
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2024-08-05 17:31:47 +02:00
Fabian Zwoliński
674c4a15ad fix: use correct gpu address when bindless heaps helper is enabled
Related-To: NEO-7063
Signed-off-by: Fabian Zwoliński <fabian.zwolinski@intel.com>
2024-08-05 15:09:57 +02:00
Bartosz Dunajski
ec34656e0e fix: debug flag to defer first device submission
Related-To: HSD-18039343751

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-08-02 09:31:25 +02:00
Mateusz Hoppe
0196a0f72f refactor(ocl): internal linker version script with OpenCL versions
- use the same map as in:
https://github.com/KhronosGroup/OpenCL-ICD-Loader/blob/main/
loader/linux/icd_exports.map

this allows to skip loader and link directly with libigdrcl.so

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-07-31 20:18:05 +02:00
Michal Mrozek
47009cec90 refactor: remove not needed code
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-07-31 17:07:56 +02:00
Zbigniew Zdanowicz
b33fe6ccf1 feature: adding flag to block dispatch implicit scaling commands
- this feature is part of making compute walker command view
- compute walker is programed for implicit scaling but not dispatched
- together with new flag, comes the refactor to reduce number of arguments

Related-To: NEO-11972

Signed-off-by: Zbigniew Zdanowicz <zbigniew.zdanowicz@intel.com>
2024-07-31 14:24:27 +02:00
Fabian Zwoliński
b1a50104a8 fix: include dynamic SLM in clGetKernelWorkGroupInfo and zeKernelGetProperties
Current implementation only takes static slmInlineSize into account.
With this change we also include dynamic SLM passed as a kernel arguments.

Related-To: NEO-5761
Signed-off-by: Fabian Zwoliński <fabian.zwolinski@intel.com>
2024-07-30 17:59:45 +02:00
Michal Mrozek
e668b4965c performance: demote unrecoverable to debug_break
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-07-30 16:51:00 +02:00
Kamil Kopryk
65fcbff55c refactor: Simplify code
Signed-off-by: Kamil Kopryk <kamil.kopryk@intel.com>
2024-07-29 14:26:51 +02:00
Compute-Runtime-Validation
b1bc4f4cad Revert "fix: Add missing fp64 extensions in caps initialization"
This reverts commit 9a486dd5a1.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-07-26 14:47:02 +02:00
Szymon Morek
ace883ca55 performance: don't flush gpgpu if not required
Related-To: NEO-12124

If queue is OOQ and there are no cross-engine dependencies,
don't flush CCS before submitting copy on BCS.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-26 06:49:45 +02:00
Maciej Plewka
1cd00b5b89 fix: use per product cache line size to align heaps
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-07-24 17:29:20 +02:00
Maciej Plewka
afee8814ef refactor: get ioh alignment from static function
Signed-off-by: Maciej Plewka <maciej.plewka@intel.com>
2024-07-24 14:43:31 +02:00
Szymon Morek
a7fbc90ebd fix: re-enable staging buffer copy when ccs is busy
Related-To: NEO-11501

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-22 18:14:46 +02:00
Szymon Morek
39ec7facee performance: use BCS for transfers if CCS is busy
Related-To: NEO-11501

Also, if device is iGPU, don't use staging buffers
in that case.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-22 15:36:26 +02:00
Szymon Morek
6a11e8a077 fix: revert changes around zero-copy
Related-To: NEO-12018

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-19 12:29:18 +02:00
Mateusz Hoppe
b6299b8a21 feature: add support for HP copy engine context
- add support for contect group with HP copy engine
- choose HP copy engine when available

Related-To: NEO-11983

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-07-19 12:23:03 +02:00
Filip Hazubski
f8867ac3ac fix: Minor code improvements
Add explicit pointer checks to CommandQueue::blitEnqueueAllowed.

Explicitly check result of getDeviceArgValueIdx during ocloc compile.

Explicitly remove unused StagingBufferManager functions.
Move chunkCopyFunc by reference.

Signed-off-by: Filip Hazubski <filip.hazubski@intel.com>
2024-07-19 09:24:37 +02:00
Compute-Runtime-Validation
7ad15639fc Revert "feature: add support for HP copy engine context"
This reverts commit 3fbcbcaef2.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-07-18 21:02:14 +02:00
Szymon Morek
33ab962121 fix: adjust compression hint usage for ocl buffers
Related-To: NEO-11989

Also, use zero-copy on lnl

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-18 18:24:48 +02:00
Michal Mrozek
20d6910b66 performance: move usm pool init to first alloc call
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-07-18 16:07:22 +02:00
Mateusz Hoppe
3fbcbcaef2 feature: add support for HP copy engine context
- add support for contect group with HP copy engine
- choose HP copy engine when available

Related-To: NEO-11983

Signed-off-by: Mateusz Hoppe <mateusz.hoppe@intel.com>
2024-07-18 16:07:07 +02:00
Dominik Dabek
c1c9ac634b performance(ocl): enable host usm alloc recycle
Enable at threshold of 2% system memory.

Related-To: NEO-6893

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-07-17 19:33:56 +02:00
Mateusz Jablonski
8a60742a8d fix: correct reported num subslices per slice in fused config
Related-To: NEO-12073
Signed-off-by: Mateusz Jablonski <mateusz.jablonski@intel.com>
2024-07-17 17:17:28 +02:00
Dominik Dabek
fc9de71feb fix(ocl): finish in release ogl object if needed
Finish cache flushes before exiting api call if releasing displayable
ogl object or dcflush is mitigated.

Related-To: NEO-11694

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-07-16 11:21:32 +02:00
Compute-Runtime-Validation
9a6403f3bc Revert "refactor: Add dc flush mitigation infrastructure"
This reverts commit d6076941a8.

Signed-off-by: Compute-Runtime-Validation <compute-runtime-validation@intel.com>
2024-07-15 11:47:30 +02:00
Lukasz Jobczyk
d6076941a8 refactor: Add dc flush mitigation infrastructure
Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-07-12 14:45:51 +02:00
Bartosz Dunajski
e188de2489 fix: initialize page tables before access for TSP allocation in TBX mode
Related-To: NEO-8340

Signed-off-by: Bartosz Dunajski <bartosz.dunajski@intel.com>
2024-07-11 16:03:19 +02:00
Lukasz Jobczyk
b0a5f2cced fix: Stop direct submission before signal GL event
Related-To: NEO-10556

Signed-off-by: Lukasz Jobczyk <lukasz.jobczyk@intel.com>
2024-07-11 13:45:42 +02:00
Dominik Dabek
bdeccab7aa fix: bcs enqueue after marker properly waits
For an example sequence of:
IOQ_1 -> enqueue copy, enqueue marker with waitlist (out event)
IOQ_2 -> enqueue marker with waitlist (event), enqueue copy

Add missing synchronization between the enqueue copies

Related-To: NEO-11694

Signed-off-by: Dominik Dabek <dominik.dabek@intel.com>
2024-07-11 10:36:18 +02:00
Michal Mrozek
05eb4e7a0d performance: add debug flag to disable l1 flush
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-07-11 10:09:46 +02:00
Szymon Morek
dbd96372be performance: adjust staging buffer usage
Related-To: NEO-11928

Don't copy through staging buffer if dst usm allocation
was not used before and transfer would be splitted.
Also, don't use staging buffers for mapped ocl buffers.

Signed-off-by: Szymon Morek <szymon.morek@intel.com>
2024-07-10 10:19:18 +02:00
Michal Mrozek
4cabc9e4d2 performance: remove not needed code.
events are already created with queued state.
Signed-off-by: Michal Mrozek <michal.mrozek@intel.com>
2024-07-10 07:35:05 +02:00