The old handleStoreInst/loadEltsFromVecAlloca assume a 1:1 lane mapping and equal sizes between the user value and the promoted vector element type. This is insufficient for mixed widths (e.g. <4 x i8> and <... x i32>), for cross-lane accesses created by the new byte-offset GEP lowering, and for pointers under opaque pointers (bitcasts between pointers and non-pointers are illegal).

With the changes:

1) Stores (handleStoreInst and storeEltsToVecAlloca) normalize the source (scalar or vector) to a single integer of NeedBits = N * DstBits using ptrtoint/bitcast, split that integer into K = ceil(NeedBits / SrcBits) chunks, bitcast/inttoptr each chunk back to the promoted lane type, and insert the chunks into K consecutive lanes starting at the scalarized index.

2) Loads (handleLoadInst and loadEltsFromVecAlloca) read K promoted lanes starting at the scalarized index, convert each lane to iSrcBits, pack the lanes into i(K*SrcBits), truncate to i(NeedBits), then expand to the requested scalar or <N x DstScalarTy>, using inttoptr for pointer results.

The simple (old) path is still present: if SrcBits == DstBits, just emit extractelement with casts (if needed). All paths do a single load of the promoted vector followed by extractelement/insertelement, and stores add only a single store back.

With these changes, the LLVM IR emitted from LowerGEPForPrivMem will look different: instead of plain bitcasts, there are now ptrtoint/inttoptr instructions plus additional packing/splitting logic. For the simple (old) load path, the new implementation should emit essentially the same pattern (potentially skipping bitcasts). The additional integer/bitcast instruction sequences should be easily foldable. Memory traffic is unchanged (still one vector load/store). Overall register pressure should be similar; the pass still eliminates GEPs and avoids private/scratch accesses.
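The split/pack arithmetic described above can be illustrated with a small host-side model. This is only a sketch, not IGC code. The names lanesNeeded, storeBits and loadBits are hypothetical, the promoted vector is modeled as a std::vector of lanes, and laneBits stands for the promoted lane width (SrcBits in the description). The model assumes that the lowest bits go into the first lane (little-endian chunk order), that unused high bits of the last lane are zero-filled, and that NeedBits is at most 64. None of these assumptions is confirmed by the description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: mask covering the low `bits` bits of a 64-bit value.
inline uint64_t lowMask(unsigned bits) {
    return bits >= 64 ? ~0ULL : ((1ULL << bits) - 1);
}

// K = ceil(NeedBits / laneBits): number of promoted lanes an access touches.
inline unsigned lanesNeeded(unsigned needBits, unsigned laneBits) {
    assert(laneBits > 0 && laneBits <= 32);
    return (needBits + laneBits - 1) / laneBits;
}

// Store model: `value` is the source already normalized to one integer of
// needBits bits. It is split into K chunks of laneBits each and written to
// K consecutive lanes starting at `index` (assumed: low bits first).
inline void storeBits(std::vector<uint32_t>& lanes, unsigned laneBits,
                      std::size_t index, uint64_t value, unsigned needBits) {
    assert(needBits > 0 && needBits <= 64);
    const unsigned k = lanesNeeded(needBits, laneBits);
    assert(index + k <= lanes.size());
    value &= lowMask(needBits);  // assumed: unused high bits become zero
    for (unsigned i = 0; i < k; ++i) {
        const uint64_t chunk = (value >> (i * laneBits)) & lowMask(laneBits);
        lanes[index + i] = static_cast<uint32_t>(chunk);
    }
}

// Load model: read K lanes starting at `index`, pack them into one wide
// integer, then truncate to needBits.
inline uint64_t loadBits(const std::vector<uint32_t>& lanes, unsigned laneBits,
                         std::size_t index, unsigned needBits) {
    assert(needBits > 0 && needBits <= 64);
    const unsigned k = lanesNeeded(needBits, laneBits);
    assert(index + k <= lanes.size());
    uint64_t packed = 0;
    for (unsigned i = 0; i < k; ++i) {
        const uint64_t lane = lanes[index + i] & lowMask(laneBits);
        packed |= lane << (i * laneBits);
    }
    return packed & lowMask(needBits);
}
```

In this model, storing a 32-bit value into a vector of 8-bit lanes touches K = 4 lanes, and a later 16-bit load starting at lane 1 reads across lanes 1 and 2. That is the kind of cross-lane access a 1:1 lane mapping cannot express.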
Intel® Graphics Compiler for OpenCL™
Introduction
The Intel® Graphics Compiler for OpenCL™ is an LLVM-based compiler for OpenCL™ targeting Intel® graphics hardware architecture.
Please visit the Intel® Graphics Compute Runtime repository for more information about the Intel® open-source compute stack: https://github.com/intel/compute-runtime
License
The Intel® Graphics Compute Runtime for OpenCL™ is distributed under the MIT License.
For detailed terms, you can access the full License at:
https://opensource.org/licenses/MIT
Dependencies
- LLVM Project - https://github.com/llvm/llvm-project
- OpenCL Clang - https://github.com/intel/opencl-clang
- SPIRV-LLVM Translator - https://github.com/KhronosGroup/SPIRV-LLVM-Translator
- VC Intrinsics - https://github.com/intel/vc-intrinsics
Supported Linux versions
IGC is continuously built and tested on the following 64-bit Linux operating systems:
- Ubuntu 24.04
- Ubuntu 22.04
Documentation
More documentation is available in the documentation directory.
Supported Platforms
- Intel® Xe2
- Intel® Xe
- Intel® Gen12 graphics
- Intel® Gen11 graphics
- Intel® Gen9 graphics
No code changes may be introduced that would regress support for any currently supported hardware. All contributions must ensure continued compatibility and functionality across all supported hardware platforms. Failure to maintain hardware compatibility may result in the rejection or reversion of the contribution.
Any deliberate modification or removal of hardware support will be transparently communicated in the release notes.
Only API options are considered a stable interface. Debug parameters, environment variables, and internal data structures are not considered part of the interface and may be changed or removed at any time.
How to provide feedback
If you have any feedback or questions, please open an issue through the GitHub issue tracker: https://github.com/intel/intel-graphics-compiler/issues.
How to contribute
Create a pull request on github.com with your changes. Ensure that your modifications build without errors. A maintainer will get in touch with you if there are any inquiries or concerns.