Switch builds to use LLVM 16. Updated the documentation to treat LLVM 16 as default.
Refreshed parts of buildIGC.sh regarding supported versions. Fixed a bug when setting a variable in buildIGC.sh to a default value.
Force enabled exceptions for VC. This is a workaround while we're investigating why they're disabled.
Cleaned up dead code that's related to patch token binary format deprecation. Removed unused code, adjusted some comments.
Most of these changes are related to previous commits that deprecated the format in VC and OCL.
Some parts are still to be refactored; this change doesn't cover all patch token code.
TraceRayInline is lowered to stores to the RTStack.
So, if there was a shader with the following pattern:
```cpp
RayQuery q;
q.TraceRayInline();
while(q.Proceed())
{
...
}
```
We would turn it into:
```cpp
RayQuery q;
q.TraceRayInline();
RQCheck();
while(q.Proceed())
{
...
}
RQRelease();
```
But what we want is:
```cpp
RayQuery q;
RQCheck();
q.TraceRayInline();
while(q.Proceed())
{
...
}
RQRelease();
```
intrinsics
This change adds the capability to enable the use of automatic immediate
offset for 2d block intrinsics. Currently, the pass requires the env var
`allowImmOff2DBlockFuncs` to be turned on manually, since I have not yet done
adequate testing, but enabling the feature is now trivial.
I also refactored the code, to make future optimizations easier to
implement; it is now easy to introduce new set intrinsics for other
parameters of the payload (not enabled now by default), and should be
easier to add the "increment" mode functionality of the
LSC2DBlockSetAddrPayloadField intrinsics.
This change should not affect compiler behaviour. Once one of the
features that this change facilitates is found to be advantageous, it
can be enabled.
This pass is designed to remove phantom dependencies in a loop.
When the value we get from a PHI is a vector that is completely overwritten
inside the loop body and has no other uses, replace this value with undef.
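The loop-carried dependency described above can be illustrated at the source level. This is only a sketch in plain C++, not the pass itself: the names `sumCarried` and `sumLocal` are hypothetical, and the real transformation operates on LLVM IR PHI nodes rather than on C++ variables.

```cpp
#include <array>
#include <cstddef>

using Vec4 = std::array<float, 4>;

// Loop-carried form: `v` lives across iterations (a PHI in the loop header),
// yet every lane is overwritten before it is read, so the carried value is dead.
float sumCarried(const float* data, std::size_t iterations) {
    Vec4 v{};  // initial value flows into the loop header
    float total = 0.0f;
    for (std::size_t i = 0; i < iterations; ++i) {
        for (std::size_t lane = 0; lane < 4; ++lane)
            v[lane] = data[i * 4 + lane];  // full overwrite, old v is never read
        total += v[0] + v[1] + v[2] + v[3];
    }
    return total;
}

// Form with the dependency cut: the vector starts each iteration with no
// defined value (the role undef plays in IR) and nothing crosses the back edge.
float sumLocal(const float* data, std::size_t iterations) {
    float total = 0.0f;
    for (std::size_t i = 0; i < iterations; ++i) {
        Vec4 v;  // intentionally uninitialized; all lanes are written below
        for (std::size_t lane = 0; lane < 4; ++lane)
            v[lane] = data[i * 4 + lane];
        total += v[0] + v[1] + v[2] + v[3];
    }
    return total;
}
```

Both functions compute the same result; the second simply makes it explicit that no value needs to survive from one iteration to the next.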
2D block decomposition in the Decompose2DBlockFuncs pass was not performed in some test cases because the LLVM isLoopInvariant() checks only that the args of the instruction are outside the loop. Moving the pass after an LICM call makes this check behave correctly.
Payload hoisting was not performed in some cases because interfering intrinsics disrupted the implicit control flow. However, we can treat the payload as speculatable when we are sure it is never changed in the loop body, which fixes this issue.
I also added an IGC flag for easier testing of this feature.
Remove unnecessary TFloat32 to Float conversion, since TFloat32 is already valid Float
Update F -> TF32 conversion API to use floats to match dpas interfaces and avoid unnecessary float->int conversions
Update F -> TF32 conversion API to match naming convention used in cl_intel_bfloat16_conversions (khronos.org)
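To show why a TF32 value needs no conversion back to float: TF32 keeps the float32 sign bit and 8 exponent bits but only the top 10 mantissa bits, so a TF32 value stored in 32 bits is an ordinary float whose low 13 mantissa bits are zero. The sketch below is not the IGC conversion API; `truncateToTF32` is a hypothetical helper, and it truncates for brevity, whereas the actual conversion may use a different rounding mode.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (not the IGC API): float in, float out, mirroring the
// idea of a float-typed conversion interface.
// Assumption: simple truncation; real hardware or builtins may round instead.
float truncateToTF32(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));   // reinterpret without UB
    bits &= 0xFFFFE000u;                        // clear the low 13 mantissa bits
    float result;
    std::memcpy(&result, &bits, sizeof(result));
    return result;                              // still a valid float32 value
}
```

Because the result is already a valid float, passing it through the helper a second time leaves it unchanged.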
The warning message displayed upon VLA detection is confusing for
people not familiar with IGC flag usage. This commit makes it clearer
and adds a link to documentation on flag usage.
When a pointer to SVM is not aligned to 4 bytes, Neo aligns it and stores
the difference between the unaligned and aligned pointers in the `bufferOffset`
argument, so that IGC can correctly access the memory. Neo must align
the pointer since hardware expects the surface state base address to be
aligned to 4 bytes.
This change prevents `bufferOffset` argument usage from being optimized out
when IGC cannot guarantee that a pointer is 4-byte aligned.
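The alignment arithmetic described above can be sketched as follows. This is an illustration only, not Neo or IGC code: `AlignedBinding` and `alignForSurfaceState` are hypothetical names, and the 32-bit offset type is an assumption.

```cpp
#include <cstdint>

// Hypothetical illustration of the runtime-side split of an SVM pointer.
struct AlignedBinding {
    std::uintptr_t base;         // 4-byte aligned address for the surface state
    std::uint32_t bufferOffset;  // bytes to add back when accessing memory
};

AlignedBinding alignForSurfaceState(std::uintptr_t ptr) {
    constexpr std::uintptr_t kAlignment = 4;
    const std::uintptr_t base = ptr & ~(kAlignment - 1);  // round down to 4 bytes
    return {base, static_cast<std::uint32_t>(ptr - base)};
}
```

For an already aligned pointer the offset is zero, which is the only case in which dropping the `bufferOffset` usage is safe.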
Add support for ".uniform" modifier in ifcall instruction.
When IGC marks an ifcall as uniform, VISA skips the special EU
fusion call WA when -fusedCallWA = 2.
The flag ByPassAllocaSizeHeuristic was unavailable in the release driver
version, which made debugging difficult. This commit switches the
flag's release mode setting to true.