The methods in `SideEffectUtils.h` (and their implementations in
`SideEffectUtils.cpp`) have intent similar to methods that already
exist in `SideEffectInterfaces.h`. Move the declarations (and
implementations) from `SideEffectUtils.h` (and `SideEffectUtils.cpp`)
into `SideEffectInterfaces.h` (and `SideEffectInterfaces.cpp`).
Also drop the `SideEffectInterface::hasNoEffect` method in favor of
`mlir::isMemoryEffectFree`, which actually recurses into the operation
instead of relying exclusively on the `hasRecursiveMemoryEffectTrait`.
Differential Revision: https://reviews.llvm.org/D137857
This patch fixes:
mlir/include/mlir/ExecutionEngine/SparseTensor/Storage.h:955:20:
error: unused variable 'sz' [-Werror,-Wunused-variable]
mlir/lib/Dialect/Linalg/TransformOps/LinalgTransformOps.cpp:1460:2:
error: extra ';' outside of a function is incompatible with C++98
[-Werror,-Wc++98-compat-extra-semi]
Modify the integration test to check number_of_entries and use it to
limit the number of sparse tensor values that are output.
Reviewed By: aartbik, Peiming
Differential Revision: https://reviews.llvm.org/D138046
This is a builder similar to the one for SCF::IfOp, which allows users to pass region builders to it. Refer to the builders for IfOp.
Reviewed By: tpopp
Differential Revision: https://reviews.llvm.org/D137709
`DeviceMappingAttrInterface` is implemented as a unifying mechanism for thread mapping. A code generator can use any attribute that implements this interface to lower `scf.foreach_thread` to device-specific code, and is free to choose its own mapping and interpretation.
Currently, the GPU transform dialect supports only `GPUThreadMapping` and `GPUBlockMapping`; however, other mappings should be supported as well. This change addresses that issue: it decouples the GPU transform dialect from `GPUThreadMapping` and `GPUBlockMapping`, so they can now work with any other mapping, as sketched below.
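For instance, a downstream code generator could tag the loop with its
own attribute (the #my_dialect.my_thread attribute below is
hypothetical, and the printed syntax is illustrative), as long as that
attribute implements `DeviceMappingAttrInterface`:
```
scf.foreach_thread (%i) in (%c128) {
  // the device-specific lowering decides how %i maps to hardware
  ...
} {mapping = [#my_dialect.my_thread<x>]}
```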
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D138020
Add a separate validation pass to check whether TOSA operations match
the specification for given requirements. Perform profile type
checking as the initial feature in the pass.
This is an optional pass that can be enabled via the command line, e.g.
$ mlir-opt --tosa-validate="profile=bi"
to validate against the base inference profile.
Description:
TOSA defines a variety of operator behaviors and requirements in the
specification. It is helpful to have a separate validation pass that
checks TOSA operations against the specification for given criteria,
and that also reduces the burden of dialect validation during
compilation.
TOSA supports three profiles of which two are for inference purposes.
The main inference profile supports both integer and floating-point
data types, but the base inference profile only supports integers.
In this initial PR, validate the operations against a given TOSA
profile, so that validation fails if a floating-point tensor is
present when the base inference profile is selected. Afterward, other
checks will be added to the pass as needed, e.g. validation of control
flow operators and custom operators.
The pass is expected to run at any point of the TOSA dialect
conversion/transformation pipeline, without depending on a particular
pass having run before it, so that it can be used to validate the
initial TOSA operations just converted from other dialects, the
intermediate forms, or the final TOSA output.
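As a sketch, the following input (hypothetical, not from the patch)
would be rejected under profile=bi because its floating-point tensors
require the main inference profile:
```
func.func @add_fp(%arg0: tensor<4xf32>, %arg1: tensor<4xf32>) -> tensor<4xf32> {
  // Rejected by --tosa-validate="profile=bi": f32 tensors are only
  // allowed in the main inference profile.
  %0 = "tosa.add"(%arg0, %arg1) : (tensor<4xf32>, tensor<4xf32>) -> tensor<4xf32>
  return %0 : tensor<4xf32>
}
```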
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D137279
In particular, this silences warnings from [-Wsign-compare].
Depends On D137681
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D137735
Systematically updates the SparseTensorRuntime to properly distinguish tensor-dimensions from storage-levels (and their associated ranks, shapes, sizes, indices, etc.). With a few exceptions, which are noted in the code, this ensures the runtime has all the **semantic** changes necessary to support non-permutations.
(Whereas **operationally**, since we're still using `std::vector<uint64_t>` to represent the mappings, there's no way to pass in any interesting non-permutations. Changing the representation to `std::function` will be done in a separate differential.)
Depends On D137680
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D137681
Since all output elements are known to be overwritten by construction, the fill is not required. This change makes the TOSA lowering consistent with the MHLO and Torch lowerings of concat, which do not emit the fill; see the sketch below.
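For a hypothetical concatenation of two 4-element tensors along
dimension 0 (value names are illustrative), the fill-free lowering
looks roughly like:
```
%empty = tensor.empty() : tensor<8xf32>
// the two insert_slice ops overwrite every element of %empty, so a
// linalg.fill of the destination would be redundant
%0 = tensor.insert_slice %lhs into %empty[0] [4] [1] : tensor<4xf32> into tensor<8xf32>
%1 = tensor.insert_slice %rhs into %0[4] [4] [1] : tensor<4xf32> into tensor<8xf32>
```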
Reviewed By: rsuderman
Differential Revision: https://reviews.llvm.org/D137967
Essentially, this patch changes the anchor point of the
`extract_strided_metadata(reshapeLikeOp)` pattern from
`extract_strided_metadata` to `reshapeLikeOp`.
In detail, this means that instead of replacing:
```
base, offset, sizes, strides =
extract_strided_metadata(reshapeLikeOp(src))
```
With
```
base, offset = extract_strided_metadata(src)
sizes = <some math>
strides = <some math>
```
We replace only the reshapeLikeOp part and connect it back with a
reinterpret_cast:
```
val = reshapeLikeOp(src)
```
=>
```
base, offset, ... = extract_strided_metadata(src)
sizes = <some math>
strides = <some math>
val = reinterpret_cast base, offset, sizes, strides
```
Differential Revision: https://reviews.llvm.org/D136386
Essentially, this patch changes the anchor point of the
`extract_strided_metadata(subview)` pattern from
`extract_strided_metadata` to `subview`.
In detail, this means that instead of replacing:
```
base, offset, sizes, strides = extract_strided_metadata(subview(src))
```
With
```
base, ... = extract_strided_metadata(src)
offset = <some math>
sizes = subSizes
strides = <some math>
```
We replace only the subview part and connect it back with a
reinterpret_cast:
```
val = subview(src)
```
=>
```
base, ... = extract_strided_metadata(src)
offset = <some math>
sizes = subSizes
strides = <some math>
val = reinterpret_cast base, offset, sizes, strides
```
Differential Revision: https://reviews.llvm.org/D135839
This adds a transformation to tile reduction operations into partial
reductions using scf.foreach_thread. This uses
PartialReductionOpInterface to create an operation that merges the
partial tiles, as sketched below.
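Conceptually (a sketch only; names and the exact form of the generated
IR are illustrative), a 1-D reduction tiled across 5 threads becomes:
```
// each thread reduces its own tile into its slice of an expanded
// accumulator
%partial = scf.foreach_thread (%iv) in (%c5) shared_outs(%o = %acc) {
  ...
}
// merge operation created through PartialReductionOpInterface
%result = <reduce %partial along the expanded dimension>
```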
Differential Revision: https://reviews.llvm.org/D137912
Prior to this patch, the canonicalization pattern that turns
`reinterpret_cast(extract_strided_metadata)` into a cast was only
applied when all the input operands of the `reinterpret_cast` are
exactly all the output results of the `extract_strided_metadata`.
This missed simplification opportunities where the operands hold the
same constant values but are not literally the same SSA values.
E.g., prior to this patch, a pattern of the form:
```
%base, %offset = extract_strided_metadata %source : memref<i16>
reinterpret_cast %base to offset:[0]
```
wouldn't have been simplified into a simple cast, because %offset is
not literally the same value object as the constant 0.
This patch teaches this pattern how to check whether the constant
values match what the results of the `extract_strided_metadata`
operation would hold.
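With this, the example above folds to a plain cast of %source (a
sketch; in this degenerate case the source and result types coincide,
so the cast itself can fold away as well):
```
%cast = memref.cast %source : memref<i16> to memref<i16>
```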
Differential Revision: https://reviews.llvm.org/D135736
Previously, the need for a dense permutation leaked into the thread_dim_mapping specification.
This revision allows using a sparse specification of the thread_dim_mapping; the proper completion / sorting is applied automatically.
In the process, the semantics of scf.foreach_thread are tightened to require a matching number of thread dimensions and mappings.
The relevant negative test is added.
Differential Revision: https://reviews.llvm.org/D137906
Expose `function-boundary-type-conversion` in `OneShotBufferizeOp`. To
reuse options between passes and transform operations, create a
`BufferizationEnums.td`.
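A hypothetical usage sketch (the enum attribute spelling below is an
assumption, not taken verbatim from the patch):
```
// assumed syntax: select identity layout maps at function boundaries
transform.bufferization.one_shot_bufferize %module
    {function_boundary_type_conversion =
        #bufferization.layout_map_option<IdentityLayoutMap>}
```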
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D137833
Refactor the rewriting of sparse_tensor.sort to support the implementation of
sparse_tensor.sort_coo.
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D137522
This commit adds automatic unpacking of `Value`s of type pdl::OperationType to the underlying single-result OpResult.
This allows mixing single-value, attribute, and multi-value pdl::Operation tile sizes and num threads in TileToForeachThreadOp, as sketched below.
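For example (hypothetical transform IR; the matcher name is made up
and the exact syntax may differ):
```
// %size is a handle to a single op whose result supplies a dynamic
// tile size; it can now be mixed freely with static sizes
%size = pdl_match @find_size_op in %arg1
%tiled = transform.structured.tile_to_foreach_thread_op %target tile_sizes [%size, 20]
```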
Differential Revision: https://reviews.llvm.org/D137896
This allows the MLIR transformer to see the command line options and
make decisions based on them. There is no change upstream, but my
use-case is to look at the entry point name and type to make sure I
can use them.
Differential Revision: https://reviews.llvm.org/D137861
This addresses the TODO previously left in the code and checks that the address of `dataIt` is properly aligned to the requested alignment.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D137855
We currently only have the SubElement interface API for attribute/type
replacement, but this suffers from several issues: it doesn't allow
caching across multiple replacements (a very common case), and it also
creates a somewhat awkward/limited API. The new AttrTypeReplacer class
allows for registering replacements using a much cleaner API, similar
to the TypeConverter class, removes a lot of manual interaction with
the sub element interfaces, and also better enables large scale
replacements.
Differential Revision: https://reviews.llvm.org/D137764
D137413 clarified `scf_foreach_thread` thread mapping nicely. `tile_to_foreach_thread_op` is one of the ops that generate `scf_foreach_thread`; however, its builders still take plain integer arrays.
This fixes a potential bug.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D137891
This is unnecessary if the generated operation type already matches
the type of the replaced value. Also use `OpFoldResult` to reduce the
number of cases in which the casts are needed.
Reviewed By: springerm, hanchung, antiagainst
Differential Revision: https://reviews.llvm.org/D137479
Permutation wasn't handled correctly. Add a test for the rewriting.
Extend an integration test to run with enable_runtime_library=false to
also test the rewriting.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D137845
tensor.empty op elimination is an optimization that brings IR into a more bufferization-friendly form. E.g.:
```
%0 = tensor.empty()
%1 = linalg.fill(%cst, %0) {inplace = [true]}
%2 = tensor.insert_slice %1 into %t[10][20][1]
```
Is rewritten to:
```
%0 = tensor.extract_slice %t[10][20][1]
%1 = linalg.fill(%cst, %0) {inplace = [true]}
%2 = tensor.insert_slice %1 into %t[10][20][1]
```
This optimization used to operate on bufferization.alloc_tensor ops. That was not correct, because the documentation of bufferization.alloc_tensor says that it always bufferizes to an allocation. Instead, this optimization should operate on tensor.empty ops, which can then be lowered to bufferization.alloc_tensor ops if they don't get eliminated, as sketched below.
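A minimal sketch of that lowering (assuming the upstream
-empty-tensor-to-alloc-tensor pass; the shape is illustrative):
```
// before: a tensor.empty that was not eliminated
%0 = tensor.empty() : tensor<20xf32>
// after the pass: guaranteed to bufferize to an allocation
%0 = bufferization.alloc_tensor() : tensor<20xf32>
```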
Differential Revision: https://reviews.llvm.org/D137162