intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-12 18:27:07 +08:00

Srinivasa Ravi 7d0865122e [clang][NVPTX] Add support for mixed-precision FP arithmetic (#168359)

This change adds support for mixed-precision floating-point
arithmetic for `f16` and `bf16`. From `sm_100` onwards, the following patterns:
```
%fh = fpext half %h to float
%resfh = fp-operation(%fh, ...)
...
%fb = fpext bfloat %b to float
%resfb = fp-operation(%fb, ...)
```
where the fp-operation can be any of:
- `fadd`
- `fsub`
- `llvm.fma.f32`
- `llvm.nvvm.add.*` or `llvm.nvvm.fma.*`

are lowered to the corresponding mixed-precision instructions, which
combine the conversion and the operation into one instruction.
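
To illustrate why folding the `fpext` into the operation preserves the result, here is a minimal Python sketch. It is a numeric model only, not the NVPTX lowering: the helper names (`half_bits_to_float`, `round_to_f32`, `unfused_add`, `fused_add`) are hypothetical, and it assumes round-to-nearest with no `.ftz` or `.sat` modifiers. The point it shows is that every finite `f16` value is exactly representable in `f32`, so the separate conversion step adds no rounding of its own.

```python
import struct


def half_bits_to_float(bits: int) -> float:
    """Decode an IEEE-754 binary16 bit pattern (models `fpext half ... to float`)."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


def round_to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def unfused_add(h_bits: int, f: float) -> float:
    """Two steps: widen the half operand to f32, then perform an f32 add."""
    fh = round_to_f32(half_bits_to_float(h_bits))  # widening; the rounding is a no-op
    return round_to_f32(fh + round_to_f32(f))


def fused_add(h_bits: int, f: float) -> float:
    """Hypothetical model of one mixed-precision add that takes the half directly."""
    return round_to_f32(half_bits_to_float(h_bits) + round_to_f32(f))
```

For example, `fused_add(0x3C00, 0.5)` evaluates to `1.5`, because `0x3C00` is the `f16` encoding of `1.0`. The sketch only covers values whose sum stays within `f32` range.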

This also adds the following intrinsics, which complete support for
all variants of the floating-point `add`/`fma` operations so that the
corresponding mixed-precision instructions can be supported:
- `llvm.nvvm.add.(rn/rz/rm/rp){.ftz}.sat.f`
- `llvm.nvvm.fma.(rn/rz/rm/rp){.ftz}.sat.f`

We lower `fneg` followed by one of the above addition
intrinsics to the corresponding `sub` instruction.
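
The `fneg` fold relies on the identity `a + (-b) == a - b`. A minimal Python sketch of that identity on `f32` values follows. It assumes the negated value is the second operand and models only round-to-nearest without `.ftz` or `.sat`. The function names are hypothetical and do not correspond to any LLVM API.

```python
import struct


def f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def add_f32(a: float, b: float) -> float:
    """Model of an f32 add with round-to-nearest."""
    return f32(f32(a) + f32(b))


def sub_f32(a: float, b: float) -> float:
    """Model of an f32 sub with round-to-nearest."""
    return f32(f32(a) - f32(b))


def fneg_then_add(a: float, b: float) -> float:
    """The unfolded pattern: negate b, then add."""
    return add_f32(a, -b)
```

For instance, `fneg_then_add(1.5, 0.25)` and `sub_f32(1.5, 0.25)` both give `1.25`.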

Tests are added in `fp-arith-sat.ll`, `fp-fold-sub.ll`, and
`builtins-nvptx.c` for the newly added intrinsics and builtins,
and in `mixed-precision-fp.ll` for the mixed-precision instructions.

PTX spec reference for mixed precision instructions:
https://docs.nvidia.com/cuda/parallel-thread-execution/#mixed-precision-floating-point-instructions

2025-12-15 16:28:23 +05:30

bindings

…

cmake

…

docs

[clang] Implement gcc_struct attribute on Itanium targets (#71148)

2025-12-12 19:22:40 +02:00

examples

…

include

[clang][NVPTX] Add support for mixed-precision FP arithmetic (#168359)

2025-12-15 16:28:23 +05:30

lib

[Clang][x86]: allow PCLMULQDQ intrinsics to be used in constexpr (#169214)

2025-12-15 10:27:17 +00:00

runtime

…

test

[clang][NVPTX] Add support for mixed-precision FP arithmetic (#168359)

2025-12-15 16:28:23 +05:30

tools

[clang][DependencyScanning] Move driver-command logic for by-name scanning into DependencyScanningTool (#171238)

2025-12-13 04:44:26 +05:30

unittests

[Clang] Add TimeTraceScope to Sema::CheckConstraintSatisfaction (#170264)

2025-12-11 15:02:41 +01:00

utils

[ARM] Introduce intrinsics for MVE fp-converts under strict-fp. (#170686)

2025-12-14 12:12:45 +00:00

www

…

.clang-format

…

.clang-tidy

…

.gitignore

…

AreaTeamMembers.txt

…

CMakeLists.txt

…

INSTALL.txt

…

LICENSE.TXT

…

Maintainers.rst

…

NOTES.txt

…

README.md

…

README.md

C language Family Front-end

Welcome to Clang.

This is a compiler front-end for the C family of languages (C, C++ and Objective-C) which is built as part of the LLVM compiler infrastructure project.

Unlike many other compiler frontends, Clang is useful for a number of things beyond just compiling code: we intend for Clang to be host to a number of different source-level tools. One example of this is the Clang Static Analyzer.

If you're interested in more (including how to build Clang) it is best to read the relevant websites. Here are some pointers:

- Information on Clang: http://clang.llvm.org/
- Building and using Clang: http://clang.llvm.org/get_started.html
- Clang Static Analyzer: http://clang-analyzer.llvm.org/
- Information on the LLVM project: http://llvm.org/

If you have questions or comments about Clang, a great place to discuss them is on the Clang forums:

- Clang Frontend - LLVM Discussion Forums

If you find a bug in Clang, please file it in the LLVM bug tracker:

- https://github.com/llvm/llvm-project/issues