This replaces SPS transparent conversion for pointers. Transparent
conversion only applies to argument/return types, not nested types. We
want to be able to serialize / deserialize structs containing pointers.
We may need to replace this in the near future with a new SPSPointer tag
type, since SPSExecutorAddr is meant to be serialization for pure
addresses, and pointers may carry other information (e.g. tag bits), but
we can do that in a follow-up commit.
This is a follow-up PR of #162699.
In this PR we clean up the CAPI and Python bindings for the MLIR rewrite
part by:
- removing all manually defined `wrap`/`unwrap` functions;
- removing the unused nanobind-defined Python class `RewritePattern`.
We check the non-dependent constraints early with empty template
arguments when we build a nested requirement. Therefore we cannot assume
a non-empty MLTAL within the Checker.
No release note because this is a regression on trunk.
## Summary
This PR adds three new members to the LLVM Qualification Working Group
member table in the documentation.
## Changes
- **Jorge Pinto Sousa** (Critical Techworks)
- **José Rui Simões** (Critical Software)
- **Zaky Hermawan** (Individual contributor)
## Background
These new members have been nominated and approved through the working
group's established nomination process as outlined in the QualGroup
documentation. They meet the membership criteria for individuals with
relevant experience in qualification-related efforts.
## Testing
- [x] Documentation builds successfully
- [x] Member table formatting is correct
- [x] All links and handles are properly formatted
## Related Links
- [LLVM Qualification Working Group Documentation](https://llvm.org/docs/QualGroup.html)
This PR adds support for defining custom **`RewritePattern`**
implementations directly in the Python bindings.
Previously, users could define similar patterns using the PDL dialect’s
bindings. However, for more complex patterns, this often required
writing multiple Python callbacks as PDL native constraints or rewrite
functions, which made the overall logic less intuitive—though it could
be more performant than a pure Python implementation (especially for
simple patterns).
With this change, we introduce an additional, straightforward way to
define patterns purely in Python, complementing the existing PDL-based
approach.
### Example
```python
def to_muli(op, rewriter):
    with rewriter.ip:
        new_op = arith.muli(op.operands[0], op.operands[1], loc=op.location)
    rewriter.replace_op(op, new_op.owner)

with Context():
    patterns = RewritePatternSet()
    patterns.add(arith.AddIOp, to_muli)  # a pattern that rewrites arith.addi to arith.muli
    frozen = patterns.freeze()

    module = ...
    apply_patterns_and_fold_greedily(module, frozen)
```
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
This patch adds support for fpext/fptrunc operations.
I noticed that finite-only semantics are not supported by the current
representation of constant FP ranges. It should be okay for now, as we
don't expose these types in the IR.
Based on top of #157211.
`FNEG` and `FABS` must preserve signalling NaNs, meaning they should not
convert to f32 to perform the operation. Instead legalize to `XOR` and
`AND`.
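
As a rough illustration of why bitwise legalization preserves signalling NaNs, the sketch below applies the `XOR` and `AND` forms to raw 16-bit encodings. It assumes the IEEE half-precision layout (sign bit `0x8000`, quiet bit `0x0200`); the helper names are hypothetical and this is not the backend code.

```python
import struct

SIGN_MASK = 0x8000   # sign bit of a 16-bit float encoding
EXP_MASK = 0x7C00    # exponent field (assumption: IEEE half layout)
MANT_MASK = 0x03FF   # mantissa field
QUIET_BIT = 0x0200   # top mantissa bit; clear means signalling NaN


def fneg_bits(bits: int) -> int:
    """Negate by flipping only the sign bit (the XOR form)."""
    return bits ^ SIGN_MASK


def fabs_bits(bits: int) -> int:
    """Absolute value by clearing only the sign bit (the AND form)."""
    return bits & (~SIGN_MASK & 0xFFFF)


def is_signaling_nan(bits: int) -> bool:
    """True if the encoding is a NaN with the quiet bit clear."""
    exp_all_ones = (bits & EXP_MASK) == EXP_MASK
    return exp_all_ones and (bits & MANT_MASK) != 0 and (bits & QUIET_BIT) == 0


def half_from_bits(bits: int) -> float:
    """Decode a 16-bit pattern as an IEEE half, for checking ordinary values."""
    return struct.unpack("<e", struct.pack("<H", bits))[0]


SNAN = 0x7D00  # exponent all ones, quiet bit clear, non-zero mantissa

print(hex(fneg_bits(SNAN)), is_signaling_nan(fneg_bits(SNAN)))
print(hex(fabs_bits(0xFD00)), is_signaling_nan(fabs_bits(0xFD00)))
print(half_from_bits(fneg_bits(0x3C00)))  # 0x3C00 encodes 1.0
```

Because only the sign bit is touched, the NaN payload and the quiet bit pass through unchanged, which a round trip through an f32 arithmetic operation would not guarantee.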
Fixes almost all of #104915
Summary:
Without slab reclaiming this interface is much simpler and it can speed
up cases with a lot of churn. Basically, it trades memory for performance.
Consider the following transform:
```
C = binop float A, nnan OOp
D = select ninf, i1 cond, float C, float A
->
E = select ninf, i1 cond, float OOp, float Identity
F = binop float A, E
```
We cannot propagate ninf from the original select, because OOp may be
inf, and the flag only guarantees that FalseVal (op OOp) is never
infinity.
Examples: -inf + +inf = NaN, -inf - -inf = NaN, 0 * inf = NaN
Specifically, if the original select has both ninf and nnan, we can
safely propagate the flag.
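
The three examples above can be checked with ordinary IEEE double arithmetic. The snippet below is only an illustration in Python, not part of the patch: it shows that an infinite operand can yield a result that is NaN rather than infinity, so a "result is not infinity" guarantee says nothing about the operand.

```python
import math

inf = math.inf

# Each expression has at least one infinite operand.
examples = {
    "-inf + +inf": -inf + inf,
    "-inf - -inf": -inf - (-inf),
    "0 * inf": 0.0 * inf,
}

for expr, value in examples.items():
    # The result is NaN, which is not an infinity.
    print(f"{expr}: isnan={math.isnan(value)}, isinf={math.isinf(value)}")
```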
Alive2:
+ fadd: https://alive2.llvm.org/ce/z/TWfktv
+ fsub: https://alive2.llvm.org/ce/z/RAsjJb
+ fmul: https://alive2.llvm.org/ce/z/8eg4ND
Closes https://github.com/llvm/llvm-project/issues/161634.
The template argument returned should be relative to the partial
specialization, which would correspond to the partial template parameter
list.
Unfortunately we don't save this anywhere in the AST, and would
otherwise need to deduce them again.
Simply avoid providing this argument for now, until we make it
available.
This fixes regressions which were never released, so there are no
release notes.
Fixes #162770
Fixes #162855
This patch moves the `preserve-bc-uselistorder` and
`preserve-ll-uselistorder` options out of individual tools (opt, llvm-as,
llvm-dis, llvm-link, llvm-extract) and makes them global defaults for
AsmWriter and BitcodeWriter.
These options are useful when we use `-print-*` options to dump LLVM IR.
Python-defined passes have been merged into the main branch for some
time now. I believe adding a corresponding section in the documentation
will help more users learn about this feature and understand how to use
it.
This PR adds such a section to the docs of Python bindings, summarizing
the feature and providing an example.
The global raw_null_ostream singleton returned by llvm::nulls() is
marked as InternalBuffer rather than Unbuffered, causing it to
allocate a buffer when first written to. In multithreaded environments,
multiple threads can simultaneously trigger buffer allocation via
SetBuffered(), leading to race conditions on the buffer pointer
fields (OutBufCur, OutBufEnd).
For example:
raw_ostream::write(const char *Ptr, size_t Size)
  -> raw_ostream::SetBuffered()
  -> raw_ostream::SetBufferSize(size_t Size)
  -> raw_ostream::SetBufferAndMode(char *BufferStart, size_t Size,
                                   BufferKind Mode)
This can manifest as heap corruption when multiple threads write to the
null stream concurrently, as the buffer pointers will become corrupted
during the race.
The fix is to explicitly pass Unbuffered=true to the raw_pwrite_stream
constructor, ensuring the null stream never allocates a buffer and
all writes go directly to the no-op write_impl().
For example, this can fix multithreaded applications using MCELFStreamer
where getCommentOS() returns the shared nulls() singleton.
Lots of the code/structure was based off `AMDGPUCombinerHelper`,
`AMDGPUPreLegalizerCombiner`, etc.
Tasks completed:
- Create new `SPIRVCombinerHelper` inheriting from `CombinerHelper`
- Move combiner logic in `SPIRVPreLegalizerCombiner` to helper methods
in `SPIRVCombinerHelper`
- Update `SPIRVPreLegalizerCombiner` to use the new helper class
- Simplify `applySPIRVDistance` code
This change fixes a bug in SROA where the tree-structured merge
optimization was incorrectly applied when the size of the stored value
was not a multiple of the new allocated element type size. The original
change is https://github.com/llvm/llvm-project/pull/152793. A simple
repro would be
```
define <1 x i32> @foo(<1 x i16> %a, <1 x i16> %b) {
entry:
  %alloca = alloca [1 x i32]
  %ptr0 = getelementptr inbounds [2 x i16], ptr %alloca, i32 0, i32 0
  store <1 x i16> %a, ptr %ptr0
  %ptr1 = getelementptr inbounds [2 x i16], ptr %alloca, i32 0, i32 1
  store <1 x i16> %b, ptr %ptr1
  %result = load <1 x i32>, ptr %alloca
  ret <1 x i32> %result
}
```
Currently, this will lead to a compile-time crash.
In this change, we will skip the tree-structured merge for this case and
fall back to normal SROA.
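
The condition can be pictured with a small Python sketch. The function name is hypothetical and this is not the SROA code; it assumes that in the repro the stored values are 16 bits wide and the new element type is `i32`.

```python
def is_multiple_of_element_size(store_bits: int, element_bits: int) -> bool:
    """Hypothetical check: the stored value must cover a whole number of elements."""
    return element_bits > 0 and store_bits % element_bits == 0


# Repro above (assumption): <1 x i16> stores against an i32 element type.
print(is_multiple_of_element_size(16, 32))  # not a multiple, so skip the merge

# A store that covers two whole 32-bit elements would pass the check.
print(is_multiple_of_element_size(64, 32))
```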
This commit aims to align SimpleNativeMemoryMap::FinalizeRequest::Segment with
llvm::orc::tpctypes::SegFinalizeRequest. This will simplify construction of a
new LLVM JITLinkMemoryManager that's capable of using SimpleNativeMemoryMap as
a backend.
There are four uses of BoolGOption, and all of them are essentially debug
info feature flags, which I believe should not control the enablement or
disablement of all debug info emission. `OPT_g_group` is used to control
the debug info level here:
https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Clang.cpp#L4387
This doesn't cause any test failures, and seems like the right behavior
for all four flags:
* -g[no-]flag-base
* -g[no-]inline-line-tables
* -g[no-]key-instructions
* -g[no-]structor-decl-linkage-names
None of these, even in the positive form, should enable debug info
emission.
Fixes #162747
…#162740)"
This reverts commit 6010df0402.
This apparently fails some sanitizer test as reported here:
https://github.com/llvm/llvm-project/pull/162740
It isn't really clear where this happens, but I am reverting because it
is a Friday and I have no idea if I'll be able to repro this anytime
soon, let alone fix it.
This PR introduces a `MathToXeVM` pass, which implements support for the
`afn` fastmath flag for SPIRV/XeVM targets. It takes supported `Math`
ops with the `afn` flag and converts them to function calls to OpenCL
`native_` intrinsics.
These intrinsic functions are supported by the SPIRV backend, and are
automatically converted to `OpExtInst` calls to `native_` ops from the
OpenCL SPIRV ext. inst. set when outputting to SPIRV/XeVM.
Note:
- This pass also supports converting `arith.divf` to native equivalents.
There is an option provided in the pass to turn this behavior off.
- This pass preserves fastmath flags, but these flags are currently
ignored by the SPIRV backend. Thus, in order to generate SPIRV that
truly preserves fastmath flags, support needs to be added to the SPIRV
backend.
Add the `analyzeRtStrideCandidate` function. In future commits we're
going to add the capability to widen strided loads to it. So, in this
commit, we move the size / type checks into it, since it can possibly
change the size / type of the load.
Still seeing a failure on a CI bot with this test that
I cannot reproduce locally, but luckily this one CI bot is giving
me trace output from the tests. Turn on the unwind log when tracing
is enabled; it might give a better hint about what's up with this test
failure.
determineFileName was confusing regarding namespaces. The comment and
conditional were both misleading. Now, we just check against a static
global namespace USR to make a file use "index", or just use the name.
Also, ignore the test results since they almost always fail. This allows
us to simplify the build process and skip uploading and downloading the
build and source directories which are huge.
Fixes #161754
When the GVN pass calls `PerformLoadPRE` or `processNonLocalLoad` it can
invoke the `SSAUpdater`, which adds a phi node for our token-like type.
If we check whether the load is of a token-like type in `processLoad`,
we can cover both cases. Without that check, GVN will use the
SSAUpdater to insert a phi node to reduce duplicate resource.getpointer
calls.
Because an earlier commit (01c0a8409a) made the verifier error with
`PHI nodes cannot have token type!`, this test case will fail today if
we try to perform this load optimization:
https://godbolt.org/z/xM69fY8zM
This will impact clang as well, because `isTokenLikeTy` also checks for
`isTokenTy`. Clang is likely also failing validation with token types
but just doesn't have a test case, because the validator would error if
it were in a phi node.
As a followup to the previous AST changes, the next step is to generate
the proper expressions in Sema. This patch does so for +,*,&,|,^ by
modeling them as compound operators.
This also causes the legality of some expressions to change, as these
have to be legal operations, but were previously unchecked, so there are
some test changes.
This does not yet generate any CIR; that will happen in the next patch.
Better test coverage.
The round-tripping test makes sure that if we ever change
`llvm::dwarf::toDW_Lang` or `llvm::dwarf::toDW_LName`, we don't break
the `LanguageDescription` API.
The round-tripping test found an incorrect version check in
`llvm::dwarf::toDW_Lang`, which I corrected as part of this PR (see the
table at the bottom of https://dwarfstd.org/languages-v6.html for
reference).
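
To make the round-trip property concrete, here is a minimal Python sketch. It is not the LLVM implementation: the table is a toy subset with names and versions that should be checked against the DWARF language table linked above, and the function names are hypothetical.

```python
# Toy subset (assumption): legacy DW_LANG name -> (DW_LNAME name, version).
LANG_TO_LNAME = {
    "DW_LANG_C_plus_plus_11": ("DW_LNAME_C_plus_plus", 201103),
    "DW_LANG_C_plus_plus_14": ("DW_LNAME_C_plus_plus", 201402),
    "DW_LANG_C11": ("DW_LNAME_C", 201112),
}


def to_lname(lang):
    """Map a legacy language code to its (name, version) pair."""
    return LANG_TO_LNAME[lang]


def to_lang(lname, version):
    """Pick the newest legacy code whose version does not exceed `version`."""
    candidates = [
        (v, lang)
        for lang, (name, v) in LANG_TO_LNAME.items()
        if name == lname and v <= version
    ]
    if not candidates:
        return None
    return max(candidates)[1]


def round_trips(lang):
    """True if converting to (name, version) and back yields the same code."""
    return to_lang(*to_lname(lang)) == lang


for lang in LANG_TO_LNAME:
    print(lang, round_trips(lang))
```

An off-by-one or wrong threshold in the version comparison inside `to_lang` is exactly the kind of mistake such a check would surface.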
Lately, I've been using 'getBaseOriginalType' in ArraySectionExpr
incorrectly: it gets the base-most element type, when in reality, I
want a single level of indirection. This patch corrects the handful of
uses that I had for it.