OpenCL represents built-in variables like `get_global_id` with the generic
type `size_t`, which translates to i64. This change adds a new
optimization that narrows the built-in calculation to i32 when a use of
the built-in carries an assumption indicating that the value fits in the
i32 range.
The LLVM IR that IGC receives from the LLVM 16-based SPIR-V Reader
contains OpenCL/SPIR-V builtins represented as TargetExtTy types.
Unfortunately, Clang 16 does not emit TargetExtTy and hence the modules
coming from Clang and SPIR-V Reader are not compatible and cannot be
linked together. The workaround is to retype TargetExtTy types
as pointers in the correct address space. This approach works since the
mangling/OpenCL builtin call resolution is already done by the SPIR-V
Reader and IGC does not need to work on TargetExtTy types directly.
Such retyping also ensures that all the current pointer-based
optimizations continue to work.
This patch extends the retyping beyond just function arguments and
return types. It now also retypes TargetExtTy used in:
- local variables (alloca instructions)
- loads and stores of TargetExtTy values
- struct types containing TargetExtTy fields
- function attributes (byval, sret, byref)
The retyping is done in a single pass over the module.
This commit changes STB_TranslateOutputArgs to use data structures
instead of pointers to arrays of char. This is done for the purpose of
preventing memory leaks.
For `pOutput` field, llvm::SmallVector is used, as it works with
ZEBinaryBuilder::getBinaryObject(llvm::raw_pwrite_stream).
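The ownership change described above can be sketched roughly as follows. This is a minimal illustration, not the actual STB_TranslateOutputArgs definition: the struct name, field names, and the `appendBytes` helper are hypothetical, and `std::vector<char>` stands in for `llvm::SmallVector` so the sketch compiles without LLVM headers.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the output-args struct. The containers own
// their storage, so nothing has to be freed manually by the caller.
struct TranslateOutputSketch {
    std::vector<char> output;  // binary payload (SmallVector in the real code)
    std::string errorString;   // build log text
};

// Appends `size` bytes to the owned buffer, similar to a stream write.
inline void appendBytes(TranslateOutputSketch &out, const char *data,
                        std::size_t size) {
    out.output.insert(out.output.end(), data, data + size);
}
```

Because the buffers are released when the struct goes out of scope, early returns on error paths no longer need matching `delete[]` calls.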
When handling annotations with opaque pointers, a single getOperand() call
on the annotation struct is enough, because there is no intermediate
instruction (e.g. a bitcast) to look through as there is with typed pointers.
With opaque pointers we cannot deduce the pointee type for a `Prefetch` call that takes a `ptr`.
We need this information to create the appropriate builtins. To obtain it,
we can use `llvm::demangle()` and recover the type from the demangled name.
This change is compatible with both typed and opaque pointers.
Change args to const ref where makes sense.
Put std::move where makes sense
Apply rule of three
Change usage of unique_ptr and ptrs to unique_ptr to use just shared_ptr
Update comment in CodeSinking
Use saved boolean value instead of calling method over and over.
In igc_ocl_translation_ctx_impl there is handling of -igc_opts passed to
ocloc. However, the implementation does not handle malformed input: it
enters an infinite loop when no single quotes are provided or when the
closing single quote is missing.
This commit adds handling of these cases.
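A minimal sketch of quote handling that cannot loop forever is shown below. It is not the code from the commit: the function name `extractIgcOpts` and the option values in the usage note are hypothetical. It only assumes that the value of -igc_opts is wrapped in single quotes, as described above.

```cpp
#include <cstddef>
#include <string>

// Extracts the text between the single quotes that follow "-igc_opts".
// Returns false (leaving `value` untouched) when the flag is absent, when
// there is no opening quote, or when the closing quote is missing.
inline bool extractIgcOpts(const std::string &options, std::string &value) {
    const std::string flag = "-igc_opts";
    const std::size_t flagPos = options.find(flag);
    if (flagPos == std::string::npos)
        return false;

    const std::size_t open = options.find('\'', flagPos + flag.size());
    if (open == std::string::npos)
        return false;  // no single quotes at all

    const std::size_t close = options.find('\'', open + 1);
    if (close == std::string::npos)
        return false;  // second single quote is missing

    value = options.substr(open + 1, close - open - 1);
    return true;
}
```

For example, `extractIgcOpts("-igc_opts 'A=1,B=0'", v)` stores `A=1,B=0` in `v`, while `extractIgcOpts("-igc_opts 'A=1", v)` returns false instead of searching indefinitely.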
PromoteBools wrongly skipped the RAUW operation on function calls
returning ptr after promoting their signature and adding their users to
the promotion queue. As a result, those users were cleaned up and
subsequent users received undef values. The resulting dead code was then
eliminated, which caused wrong test results. In short, the pass skipped
RAUW when it was necessary.
In cases like these:
```llvm
@call = call ptr @some-function(i1 true)
@bitcast = bitcast ptr @call to ptr
@ptrtoint = ptrtoint ptr @bitcast to i64
```
Pass output before the fix:
```llvm
@0 = call ptr @some-function(i8 1)
@ptrtoint = ptrtoint ptr undef to i64
```
Proper pass behaviour:
```llvm
@0 = call ptr @some-function(i8 1)
@bitcast = bitcast ptr @0 to ptr
@ptrtoint = ptrtoint ptr @bitcast to i64
```
Currently `handleCacheControlINTELForPrefetch` requires the type size to convert the call correctly,
but with opaque pointers that size is not available and cannot be recovered from the pointer itself.
The "OpUntypedPrefetch" extension is expected to support opaque pointers in such cases.
Until the status of "OpUntypedPrefetch" is known, this change skips the whole prefetch conversion
whenever an opaque pointer type is involved.
zeinfo now contains information on whether a kernel/function has printf
calls and function pointer calls. This allows NEO to create printf_buffer
only when it is actually used.
In cases where we have no local casts to generics and we allocate
private memory in global space, we can replace GenericCastToPtrExplicit
with a simple address space cast.
In LegalizeFunctionSignatures, don't call `getFunction()`, which
returns the parent function. Add an LLVM 15+ path that works
with opaque pointers alongside a legacy LLVM 14 path.
In PromoteBools:
- Call `getType()` on load instruction - calling `getType()` on src
returns an opaque pointer.
- Use getValueType() in promoteGlobalVariable to work with
opaque pointers.
The metadata node !kernel_arg_base_type must mirror !kernel_arg_type for
OpenCL builtin types (e.g. image1d_t). Unfortunately, this is
inconsistent with LLVM 16-based Common Clang.
This patch ensures that every OpenCL builtin type (*_t) listed in
!kernel_arg_type is also present in !kernel_arg_base_type at the same
position.
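The mirroring rule can be sketched on plain string lists as follows. This is only an illustration under assumptions: the real patch operates on LLVM metadata nodes, the function names are hypothetical, and treating every name ending in `_t` as an OpenCL builtin type is a simplification of the `*_t` pattern mentioned above.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Simplified check: treats any name ending in "_t" as an OpenCL builtin type.
inline bool isOpenCLBuiltinTypeName(const std::string &name) {
    return name.size() > 2 && name.compare(name.size() - 2, 2, "_t") == 0;
}

// For each position holding a builtin type in argTypes, copies that name
// into the same position of baseTypes. Other positions are left unchanged.
inline void mirrorBuiltinTypes(const std::vector<std::string> &argTypes,
                               std::vector<std::string> &baseTypes) {
    const std::size_t n = std::min(argTypes.size(), baseTypes.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (isOpenCLBuiltinTypeName(argTypes[i]))
            baseTypes[i] = argTypes[i];
    }
}
```

Non-builtin entries such as `float*` keep whatever base type they already had, so only the positions that must match are rewritten.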
Move the PromoteToPredicatedMemoryAccess pass to
the optimization stage of the compiler.
This lets standard LLVM passes optimize the IR before
the predication pass is applied.
Change the pass to support scalarized loads and stores.
Add a new pass to hoist conversion operations
to the common dominator to unblock the predication pass.
Fix generation of predicated stores when the address is uniform
and the stored value is not.
Cleaned up dead code that's related to patch token binary format deprecation. Removed unused code, adjusted some comments.
Most of these changes are related to previous commits that deprecated the format in VC and OCL.
Some parts are still to be refactored; this change does not cover all patch token code.
Updated the IGC to handle the new SPIR-V extension decoration IDs for HostAccessINTEL. The code now supports both the old (6147) and new (6190) enum values, ensuring backward compatibility while accommodating the latest extension updates.
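Accepting both enum values can be sketched as a small predicate. The constant and function names below are hypothetical and not taken from the IGC sources; only the numeric values 6147 and 6190 come from the description above.

```cpp
#include <cstdint>

// Decoration IDs for HostAccessINTEL: the value used by older producers
// and the value assigned by the updated extension.
constexpr std::uint32_t kHostAccessINTELOld = 6147;
constexpr std::uint32_t kHostAccessINTELNew = 6190;

// Returns true when the decoration ID is either form of HostAccessINTEL.
constexpr bool isHostAccessINTEL(std::uint32_t decorationId) {
    return decorationId == kHostAccessINTELOld ||
           decorationId == kHostAccessINTELNew;
}
```

Checking both values in one place keeps modules produced by older translators working while newer ones use the updated ID.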
Removed ZEBinaryBuilder::addKernelDebugEnv and its usages, as it was obsolete. The debug_env section in ze_info is meant to be empty as of this moment. Also, removed obsolete program debug functionality that was based on the patch token binary format (debugDataSize is always 0).
This change adds a new field to the INTELGT section of ZeBinary
that marks the version of the IAB layout.
The runtime needs it to detect non-backwards-compatible
changes to the layout.
While OpenCL builtin types need retyping to correctly link with
the Clang 16-generated BIF module, this is unnecessary for any other builtin
types (e.g. JointMatrix types).
This patch excludes JointMatrix types from retyping. Other non-OpenCL
types will be excluded in the next patches.
Refactor OCL code paths that depend on enableZEBinary() to assume the zebin format is always supported and used.
The OpenCLPrintfResolution/basic.ll test was targeting older platforms using the patch token format, so it had to be adjusted for differences in pointer sizes. Specifically, pointers to printf string literals are 64-bit instead of 32-bit.
Some dead code will be left over after merging; it will be removed in a later PR.
Clang 16 still lowers OpenCL/SPIR-V built-ins as ptr to opaque structs,
while SPIR-V Reader uses TargetExtTy values. This helper transparently
converts any value-typed TargetExtTy function parameter to an opaque
pointer, rewrites call-sites, and updates metadata, ensuring the two IR
dialects link cleanly. The builtin function resolution is already done
earlier by SPIR-V Reader.
Now that the NEO version has been updated to include the
zebin feature for LscStoresWithNonDefaultL1Cache, we can remove the
workaround that relied on the driver info control check.