Friendlier wrapper for transform.foreach.
To facilitate that friendliness, makes it so that OpResult.owner returns
the relevant OpView instead of Operation. For good measure, also changes
Value.owner to return OpView instead of Operation, thereby ensuring
consistency. That is, makes it is so that all op-returning .owner
accessors return OpView (and thereby give access to all goodies
available on registered OpViews.)
Reland of #171544 due to fixup for integration test.
Friendlier wrapper for `transform.foreach`.
To facilitate that friendliness, makes it so that `OpResult.owner`
returns the relevant `OpView` instead of `Operation`. For good measure,
also changes `Value.owner` to return `OpView` instead of `Operation`,
thereby ensuring consistency. That is, makes it is so that all
op-returning `.owner` accessors return `OpView` (and thereby give access
to all goodies available on registered `OpView`s.)
This bug was introduced by #108323, where the loc and ip were not
properly set. It may lead to errors when the operations are not linearly
asserted to the IR.
Following a series of refactorings, MLIR Python bindings would crash if
a
dialect object requiring a context defined using
mlir_attribute/type_subclass
was constructed outside of the `ir.Context` context manager. The type
caster
for `MlirContext` would try using `ir.Context.current` when the default
`None`
value was provided to the `get`, which would also just return `None`.
The
caster would then attempt to obtain the MLIR capsule for that `None`,
fail,
but access it anyway without checking, leading to a C++ assertion
failure or
segfault.
Guard against this case in nanobind adaptors. Also emit a warning to the
user
to clarify expectations, as the default message confusingly says that
`None` is
accepted as context and then fails with a type error. Using Python C API
is
currently recommended by nanobind in this case since the surrounding
function
must be marked `noexcept`.
The corresponding test is in the PDL dialect since it is where I first
observed
the behavior. Core types are not using the `mlir_type_subclass`
mechanism and
are immune to the problem, so cannot be used for checking.
This PR enables the MLIR execution engine to dump object file as PIC
code, which is needed when the object file is later bundled into a dynamic
shared library.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Generates type stubs like
```python
class RegionSequence(Sequence[Region]):
def __add__(self, arg: RegionSequence, /) -> list[Region]: ...
def __iter__(self) -> RegionIterator:
"""Returns an iterator over the regions in the sequence."""
```
There were two bugs lurking in mlir.ir.loc_tracebacks():
1) The default None parameter was not handled correctly (passed to a
C++ function that expects ints.
2) The `yield` was incorrectly indented meaning loc_tracebacks()
could not be nested (a "generator didn't yield" exception would be
raised).
Added testing of loc_tracebacks by replacing the custom contextmanager
in the auto_location.py test with the loc_tracebacks() API.
Had to harden the test to line number differences.
---------
Co-authored-by: James Molloy <jmolloy@google.com>
Disallow implicit casting, which is surprising, and, IME, usually
indicative of copy-paste errors.
Because the initial value must be a scalar, I don't expect this to
affect any data movement.
The current implementation of the WMMA intrinsic ops as they are defined
in the ROCDL tablegen is incorrect. They represent as operands what
should be attributes such as `clamp`, `opsel`, `signA/signB`. This
change performs a refactoring to bring it in line with what we expect.
---------
Signed-off-by: Muzammiluddin Syed <muzasyed@amd.com>
This makes it similar to `mlir::TypedValue` in the MLIR C++ API and
allows users to be more specific about the values they produce or
accept.
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
This PR exposes `linalg::inferContractionDims(ArrayRef<AffineMap>)` to
Python, allowing users to infer contraction dimensions (batch/m/n/k)
directly from a list of affine maps without needing an operation.
---------
Signed-off-by: Bangtian Liu <liubangtian@gmail.com>
The C++ index switch op has utilities for `getCaseBlock(int i)` and
`getDefaultBlock()`, so these have been added.
Optional body builder args have been added: one for the default case and
one for the switch cases.
Updates the derived Op-classes for the main transform ops to have all
the arguments, etc, from the auto-generated classes. Additionally
updates and adds missing snake_case wrappers for the derived classes
which shadow the snake_case wrappers of the auto-generated classes,
which were hitherto exposed alongside the derived classes.
Adds the first XeGPU transform op, `xegpu.set_desc_layout`, which attachs a `xegpu.layout` attribute to the descriptor that a `xegpu.create_nd_tdesc` op returns.
Currently ExecutionEngine tries to dump all functions declared in the
module, even those which are "external" (i.e., linked/loaded at
runtime). E.g.
```mlir
func.func private @printF32(f32)
func.func @supported_arg_types(%arg0: i32, %arg1: f32) {
call @printF32(%arg1) : (f32) -> ()
return
}
```
fails with
```
Could not compile printF32:
Symbols not found: [ __mlir_printF32 ]
Program aborted due to an unhandled Error:
Symbols not found: [ __mlir_printF32 ]
```
even though `printF32` can be provided at final build time (i.e., when
the object file is linked to some executable or shlib). E.g, if our own
`libmlir_c_runner_utils` is linked.
So just skip functions which have no bodies during dump (i.e., are decls
without defns).
https://github.com/llvm/llvm-project/pull/157930 changed `nb::object
getOwner()` to `PyOpView getOwner()` which implicitly constructs the
generic OpView against from a (possibly) concrete OpView. This PR fixes
that.
Add builders on the Python side that match builders in the C++ side, add tests for launching GPU kernels and regions, and correct some small documentation mistakes. This reflects the API decisions already made in the func dialect's Python bindings and makes use of the GPU dialect's bindings work more similar to C++ interface.
By allowing `transform.smt.constrain_params`'s region to yield SMT-vars,
op instances can declare relationships, through constraints, on incoming
params-as-SMT-vars and outgoing SMT-vars-as-params. This makes it
possible to declare that computations on params should be performed.
The semantics are that the yielded SMT-vars should be from any valid
satisfying assignment/model of the constraints in the region.
This test passed locally because I had a python environment with the
`python` command available, but I should have used the `%PYTHON` lit
command substitution instead. Fixes buildbot failures from #163620.
Adds initial support for Python bindings to the OpenACC dialect.
* The bindings do not provide any niceties yet, just the barebones
exposure of the dialect to Python. Construction of OpenACC ops is
therefore verbose and somewhat inconvenient, as evidenced by the test.
* The test only constructs one module, but I attempted to use enough
operations to be meaningful. It does not test all the ops exposed, but
does contain a realistic example of a memcpy idiom.
The func dialect provides a more pythonic interface for constructing
operations, but the gpu dialect does not; this is the first PR to
provide the same conveniences for the gpu dialect, starting with the
gpu.func op.
Changes to linalg `structured.fuse` transform op:
* Adds an optional `use_forall` boolean argument which generates a tiled
`scf.forall` loop instead of `scf.for` loops.
* `tile_sizes` can now be any parameter or handle.
* `tile_interchange` can now be any parameter or handle.
* IR formatting changes from `transform.structured.fuse %0 [4, 8] ...`
to `transform.structured.fuse %0 tile_sizes [4, 8] ...`
- boolean arguments are now `UnitAttrs` and should be set via the op
attr-dict: `{apply_cleanup, use_forall}`
This is a follow-up PR for #162699.
Currently, in the function where we define rewrite patterns, the `op` we
receive is of type `ir.Operation` rather than a specific `OpView` type
(such as `arith.AddIOp`). This means we can’t conveniently access
certain parts of the operation — for example, we need to use
`op.operands[0]` instead of `op.lhs`. The following example code
illustrates this situation.
```python
def to_muli(op, rewriter):
# op is typed ir.Operation instead of arith.AddIOp
pass
patterns.add(arith.AddIOp, to_muli)
```
In this PR, we convert the operation to its corresponding `OpView`
subclass before invoking the rewrite pattern callback, making it much
easier to write patterns.
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
This PR adds support for defining custom **`RewritePattern`**
implementations directly in the Python bindings.
Previously, users could define similar patterns using the PDL dialect’s
bindings. However, for more complex patterns, this often required
writing multiple Python callbacks as PDL native constraints or rewrite
functions, which made the overall logic less intuitive—though it could
be more performant than a pure Python implementation (especially for
simple patterns).
With this change, we introduce an additional, straightforward way to
define patterns purely in Python, complementing the existing PDL-based
approach.
### Example
```python
def to_muli(op, rewriter):
with rewriter.ip:
new_op = arith.muli(op.operands[0], op.operands[1], loc=op.location)
rewriter.replace_op(op, new_op.owner)
with Context():
patterns = RewritePatternSet()
patterns.add(arith.AddIOp, to_muli) # a pattern that rewrites arith.addi to arith.muli
frozen = patterns.freeze()
module = ...
apply_patterns_and_fold_greedily(module, frozen)
```
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>
`PassManager::enableStatistics` seems currently missing in both C API
and Python bindings. So here we added them in this PR, which includes
the `PassDisplayMode` enum type and the `EnableStatistics` method.
In [#160520](https://github.com/llvm/llvm-project/pull/160520), we
discussed the current limitations of PDL rewriting in Python (see [this
comment](https://github.com/llvm/llvm-project/pull/160520#issuecomment-3332326184)).
At the moment, we cannot create new operations in PDL native (python)
rewrite functions because the `PatternRewriter` APIs are not exposed.
This PR introduces bindings to retrieve the insertion point of the
`PatternRewriter`, enabling users to create new operations within Python
rewrite functions. With this capability, more complex rewrites e.g. with
branching and loops that involve op creations become possible.
---------
Co-authored-by: Maksim Levental <maksim.levental@gmail.com>