Commit Graph

408 Commits

Author SHA1 Message Date
Andrzej Warzyński
42944da5ba [mlir][vector] Group re-order patterns together (#102856)
Group all patterns that re-order vector.transpose and vector.broadcast
Ops (*) under `populateSinkVectorOpsPatterns`. These patterns are
normally used to "sink" redundant Vector Ops, hence grouping together.
Example:

```mlir
%at = vector.transpose %a, [1, 0]: vector<4x2xf32> to vector<2x4xf32>
%bt = vector.transpose %b, [1, 0]: vector<4x2xf32> to vector<2x4xf32>
%r = arith.addf %at, %bt : vector<2x4xf32>
```
would get converted to:
```mlir
%0 = arith.addf %a, %b : vector<4x2xf32>
%r = vector.transpose %0, [1, 0] : vector<2x4xf32>
```

This patch also moves all tests for these patterns so that all of them
are:
  * run under one test-flag: `test-vector-sink-patterns`,
  * located in one file: "vector-sink.mlir".

To facilitate this change:
  * `-test-sink-vector-broadcast` is renamed as
    `test-vector-sink-patterns`,
  * "sink-vector-broadcast.mlir" is renamed as "vector-sink.mlir",
  * tests for `ReorderCastOpsOnBroadcast` and
    `ReorderElementwiseOpsOnTranspose` patterns are moved from
    "vector-reduce-to-contract.mlir" to "vector-sink.mlir",
  * `ReorderElementwiseOpsOnTranspose` patterns are removed from
    `populateVectorReductionToContractPatterns` and added to (newly
    created) `populateSinkVectorOpsPatterns`,
  * `ReorderCastOpsOnBroadcast` patterns are removed from
    `populateVectorReductionToContractPatterns` - these are already
    present in `populateSinkVectorOpsPatterns`.

This should allow us better layering and more straightforward testing.
For the latter, the goal is to be able to easily identify which pattern
a particular test is exercising (especially when it's a specific
pattern).

NOTES FOR DOWNSTREAM USERS

In order to preserve the current functionality, please make sure to add
  * `populateSinkVectorOpsPatterns`,

wherever you are using `populateVectorReductionToContractPatterns`.
Also, rename `populateSinkVectorBroadcastPatterns` as
`populateSinkVectorOpsPatterns`.

(*) I didn't notice any other re-order patterns.
2024-08-16 16:53:53 +01:00
Andrzej Warzyński
efe3db2124 [mlir][vector] Add tests for populateSinkVectorBroadcastPatterns (1/n) (#102286)
Adds tests for scalable vectors in:
  * sink-vector-broadcast.mlir

This test file excercises patterns grouped under
`populateSinkVectorBroadcastPatterns`, which includes:
  * `ReorderElementwiseOpsOnBroadcast`,
  * `ReorderCastOpsOnBroadcast`.

Right now there are only tests for the former. However, I've noticed
that "vector-reduce-to-contract.mlir" contains tests for the latter and
I've left a few TODOs to group these tests back together in one file.

Additionally, added some helpful `notifyMatchFailure` messages in
`ReorderElementwiseOpsOnBroadcast`.
2024-08-14 17:28:55 +01:00
Bangtian Liu
b5e47d2e40 [mlir][vector] Add extra check on distribute types to avoid crashes (#102952)
This PR addresses the issue detailed in
https://github.com/iree-org/iree/issues/17948.

The problem occurs when distributed types are set to NULL, leading to
compilation crashes.

---------

Signed-off-by: Bangtian Liu <liubangtian@gmail.com>
2024-08-14 08:47:38 -07:00
Benjamin Maxwell
5f26497da7 [mlir][vector] Use DenseI64ArrayAttr in vector.multi_reduction (#102637)
This prevents some unnecessary conversions to/from int64_t and
IntegerAttr.
2024-08-10 14:10:24 +01:00
Benjamin Maxwell
9b06e25e73 [mlir][vector] Add mask elimination transform (#99314)
This adds a new transform `eliminateVectorMasks()` which aims at
removing scalable `vector.create_masks` that will be all-true at
runtime. It attempts to do this by simply pattern-matching the mask
operands (similar to some canonicalizations), if that does not lead to
an answer (is all-true? yes/no), then value bounds analysis will be used
to find the lower bound of the unknown operands. If the lower bound is
>= to the corresponding mask vector type dim, then that dimension of the
mask is all true.

Note that the pattern matching prevents expensive value-bounds analysis
in cases where the mask won't be all true.

For example:
```mlir
%mask = vector.create_mask %dynamicValue, %c2 : vector<8x4xi1>
```
From looking at `%c2` we can tell this is not going to be an all-true
mask, so we don't need to run the value-bounds analysis for
`%dynamicValue` (and can exit the transform early).

Note: Eliminating create_masks here means replacing them with all-true
constants (which will then lead to the masks folding away).
2024-08-09 10:51:49 +01:00
Han-Chung Wang
201da87c3f [mlir][vector] Handle corner cases in DropUnitDimsFromTransposeOp. (#102518)
da8778e499
breaks the lowering of vector.transpose that all the dimensions are unit
dimensions. The revision fixes the issue and adds a test.

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2024-08-08 17:29:09 -07:00
Benjamin Maxwell
da492d4387 [mlir][vector] Fix return of DropUnitDimsFromTransposeOp pattern (#102478)
This accidentally returned `failure()` (rather than `success()`) when it
applied.
2024-08-08 16:31:11 +01:00
Andrzej Warzyński
22a130220c [mlir][vector] Add more tests for ConvertVectorToLLVM (1/n) (#101936)
Adds tests with scalable vectors for the Vector-To-LLVM conversion pass.
Covers the following Ops:
  * vector.bitcast
  * vector.broadcast

Note, this has uncovered some missing logic in `BroadcastOpLowering`.
This PR fixes the most basic cases where the scalable flags were dropped
and the generated code was incorrect. Also, the conditions in
`vector::isBroadcastableTo` are relaxed to allow cases like this:
```mlir
%0 = vector.broadcast %arg0 : vector<1xf32> to vector<[4]xf32>
```

The `BroadcastOpLowering` pattern is effectively disabled for scalable
vectors in more complex cases where an SCF loop would be required to
loop over the scalable dims, e.g.:
```mlir
 %0 = vector.broadcast %arg0 : vector<[4]x1x2xf32> to vector<[4]x3x2xf32>
```

These cases are marked as "Stretch not at start" in the code. In those
cases, support for scalable vectors is left as a TODO.
2024-08-08 15:57:36 +01:00
Benjamin Maxwell
da8778e499 [mlir][vector] Add pattern to drop unit dims from vector.transpose (#102017)
Example:

BEFORE:
```mlir
%transpose = vector.transpose %vector, [3, 0, 1, 2]
  : vector<1x1x4x[4]xf32> to vector<[4]x1x1x4xf32>
```

AFTER:
```mlir
%dropDims = vector.shape_cast %vector
  : vector<1x1x4x[4]xf32> to vector<4x[4]xf32>
%transpose = vector.transpose %0, [1, 0]
  : vector<4x[4]xf32> to vector<[4]x4xf32>
%restoreDims = vector.shape_cast %transpose
  : vector<[4]x4xf32> to vector<[4]x1x1x4xf32>
```
2024-08-08 11:42:27 +01:00
Andrzej Warzyński
cb89457ff8 [nlir][vector] Constrain ContractionOpToMatmulOpLowering (#102225)
Disables `ContractionOpToMatmulOpLowering` for scalable vectors. This
pattern is meant to enable lowering to `llvm.matrix.multiply` - I'm not
aware of any use of that in the context of scalable vectors.
2024-08-07 10:18:14 +01:00
Kazu Hirata
5262865aac [mlir] Construct SmallVector with ArrayRef (NFC) (#101896) 2024-08-04 11:43:05 -07:00
Benjamin Maxwell
b4444dca47 [mlir][vector] Use DenseI64ArrayAttr for shuffle masks (#101163)
Follow on from #100997. This again removes from boilerplate conversions
to/from IntegerAttr and int64_t (otherwise, this is a NFC).
2024-07-30 15:00:14 +01:00
Benjamin Maxwell
0d9b439408 [mlir][vector] Use DenseI64ArrayAttr for constant_mask dim sizes (#100997)
This prevents a bunch of boilerplate conversions to/from IntegerAttrs
and int64_ts. Other than that this is a NFC.
2024-07-29 18:08:37 +01:00
Andrzej Warzyński
1f5807eb35 [mlir][vector][nfc] Simplify in_bounds attr update (#100334)
Since the `in_bounds` attribute is mandatory, there's no need for logic
like this (`readOp.getInBounds()` is guaranteed to return a non-empty
ArrayRef):

```cpp
ArrayAttr inBoundsAttr = readOp.getInBounds()
    ? rewriter.getArrayAttr( readOp.getInBoundsAttr().getValue().drop_back(dimsToDrop))
    : ArrayAttr();
```

Instead, we can do this:
```cpp
ArrayAttr inBoundsAttr = rewriter.getArrayAttr(
    readOp.getInBoundsAttr().getValue().drop_back(dimsToDrop));
```

This is a small follow-up for #97049 - this change should've been
included there.
2024-07-24 14:40:19 +01:00
Hugo Trachino
de61875e9d [MLIR][Vector] Generalize DropUnitDimFromElementwiseOps to non leading / trailing dimensions. (#98455)
Generalizes DropUnitDimFromElementwiseOps to support inner unit
dimensions.
This change stems from improving lowering of contractionOps for Arm SME.
Where we end up with inner unit dimensions on MulOp, BroadcastOp and
TransposeOp, preventing the generation of outerproducts.
discussed
[here](https://discourse.llvm.org/t/on-improving-arm-sme-lowering-resilience-in-mlir/78543/17?u=nujaa).

Fix after : https://github.com/llvm/llvm-project/pull/97652 showed an
unhandled edge case when all dimensions are one. The generated target
VectorType would be `vector<f32>` which is apparently not supported by
the mulf.
In case all dimensions are dropped, the target vectorType is
vector<1xf32>

---------

Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>
2024-07-17 10:22:25 +01:00
Andrzej Warzyński
2ee5586ac7 [mlir][vector] Make the in_bounds attribute mandatory (#97049)
At the moment, the in_bounds attribute has two confusing/contradicting
properties:
  1. It is both optional _and_ has an effective default-value.
  2. The default value is "out-of-bounds" for non-broadcast dims, and
     "in-bounds" for broadcast dims.

(see the `isDimInBounds` vector interface method for an example of this
"default" behaviour [1]).

This PR aims to clarify the logic surrounding the `in_bounds` attribute
by:
  * making the attribute mandatory (i.e. it is always present),
  * always setting the default value to "out of bounds" (that's
    consistent with the current behaviour for the most common cases).

#### Broadcast dimensions in tests

As per [2], the broadcast dimensions requires the corresponding
`in_bounds` attribute to be `true`:
```
  vector.transfer_read op requires broadcast dimensions to be in-bounds
```

The changes in this PR mean that we can no longer rely on the
default value in cases like the following (dim 0 is a broadcast dim):
```mlir
  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>
```

Instead, the broadcast dimension has to explicitly be marked as "in
bounds:

```mlir
  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {in_bounds = [true, false], permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>
```

All tests with broadcast dims are updated accordingly.

#### Changes in "SuperVectorize.cpp" and "Vectorization.cpp"

The following patterns in "Vectorization.cpp" are updated to explicitly
set the `in_bounds` attribute to `false`:
* `LinalgCopyVTRForwardingPattern` and `LinalgCopyVTWForwardingPattern`

Also, `vectorizeAffineLoad` (from "SuperVectorize.cpp") and
`vectorizeAsLinalgGeneric` (from "Vectorization.cpp") are updated to
make sure that xfer Ops created by these hooks set the dimension
corresponding to broadcast dims as "in bounds". Otherwise, the Op
verifier would complain

Note that there is no mechanism to verify whether the corresponding
memory access are indeed in bounds. Still, this is consistent with the
current behaviour where the broadcast dim would be implicitly assumed
to be "in bounds".

[1]
4145ad2bac/mlir/include/mlir/Interfaces/VectorInterfaces.td (L243-L246)
[2]
https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop
2024-07-16 16:49:52 +01:00
Andrzej Warzyński
6479a5a438 [mlir][vector] Restrict DropInnerMostUnitDimsTransfer{Read|Write} (#96218)
Restrict `DropInnerMostUnitDimsTransfer{Read|Write}` so that it fails
when one of the indices to be dropped could be != 0 and "out of bounds":

```mlir
func.func @negative_example(%arg0: memref<16x1xf32>, %arg1: vector<8x1xf32>, %idx_1: index, %idx_2: index) {
  vector.transfer_write %arg1, %arg0[%idx_1, %idx_2] {in_bounds = [true, false]} : vector<8x1xf32>, memref<16x1xf32>
  return
}
```

This is an edge case that could represent an out-of-bounds access,
though that will depend on the actual value of %i. Importantly, without
this change it would be transformed as follows:

```mlir
func.func @negative_example(%arg0: memref<16x1xf32>, %arg1: vector<8x1xf32>, %arg2: index, %arg3: index) {
  %subview = memref.subview %arg0[0, 0] [16, 1] [1, 1] : memref<16x1xf32> to memref<16xf32, strided<[1]>>
  %0 = vector.shape_cast %arg1 : vector<8x1xf32> to vector<8xf32>
  vector.transfer_write %0, %subview[%arg2] {in_bounds = [true]} : vector<8xf32>, memref<16xf32, strided<[1]>>
  return
}
```

This is incorrect - `%idx_2` is ignored and the "out of bounds" flags is
not propagated. Hence the extra restriction to avoid such cases.

NOTE: This is a follow-up for: #94904
2024-07-12 09:32:01 +01:00
Han-Chung Wang
eaabd762bd Revert "[MLIR][Vector] Generalize DropUnitDimFromElementwiseOps to non leading / trailing dimensions." (#97652)
Reverts llvm/llvm-project#92934 because it breaks some lowering. To
repro: `mlir-opt -test-vector-transfer-flatten-patterns ~/repro.mlir`

```mlir
func.func @unit_dim_folding(%arg0: vector<1x1xf32>) -> vector<1x1xf32> {
  %cst = arith.constant dense<0.000000e+00> : vector<1x1xf32>
  %0 = arith.mulf %arg0, %cst : vector<1x1xf32>
  return %0 : vector<1x1xf32>
}
```
2024-07-03 16:03:41 -07:00
Ramkumar Ramachandra
db791b278a mlir/LogicalResult: move into llvm (#97309)
This patch is part of a project to move the Presburger library into
LLVM.
2024-07-02 10:42:33 +01:00
Stanley Winata
ac1e22f305 [mlir][vector] Generalize folding of ext-contractionOp to other types. (#96593)
Many state of the art models and quantization operations are now
directly working on vector.contract on integers.

This commit enables generalizes ext-contraction folding S.T we can emit
more performant vector.contracts on codegen pipelines.

Signed-off-by: Stanley Winata <stanley.winata@amd.com>
2024-06-25 09:29:43 -07:00
Mubashar Ahmad
137a7451f4 [mlir][vector] Support n-D vectors in i8 to i4 trunci emulation (#94946)
Previously, this only supported 1-D vectors via vector.shuffle, with
the new vector.deinterleave this can be updated to support n-D vectors.
2024-06-24 10:24:28 +01:00
Cullen Rhodes
9931ee61d9 [mlir][vector] Fix FlattenGather for scalable vectors (#96074)
This pattern flattens vector.gather ops by unrolling the outermost
dimension for rank > 2 vectors. There's two issues with this pattern for
scalable vectors:

  1. The unrolling doesn't take vscale into account. A constraint is
     added to disable this pattern for vectors with leading scalable
     dims.
  2. The scalable dims are dropped when creating the new gather. Fixed
     by propagating the flags.

Depends on #96049.
2024-06-24 08:36:06 +01:00
Artem Kroviakov
74a105ad80 [mlir][vector] Use notifyMatchFailure instead of assert in VectorLinearize (#93590)
As it was [suggested](https://github.com/llvm/llvm-project/pull/92370#discussion_r1617592942), the `assert` is replaced by `notifyMatchFailure` for improved consistency.
2024-06-21 09:36:56 -05:00
Benjamin Maxwell
dc5d541081 [mlir][vector] Support scalable vectors when unrolling vector.bitcast (#94197)
Follow up to #94064.
2024-06-21 14:38:19 +01:00
Hugo Trachino
9f0aa05bfb [mlir][vector] Add ElementwiseToOuterproduct (#93664)
1D multi-reduction are lowered to arith which can prevent some
optimisations. I propose `ElementwiseToOuterproduct` matching a series of
ops to generate `vector.outerproduct`.
As part of some `ElementwiseToVectorOpsPatterns`, it could allow to fuse
other elementwiseOps to vector dialect.
Originally discussed
https://discourse.llvm.org/t/on-improving-arm-sme-lowering-resilience-in-mlir/78543/24.

quote @MacDue
```
%lhsBcast = vector.broadcast %lhsCast : vector<[4]xf32> to vector<[4]x[4]xf32>
%lhsT = vector.transpose %lhsBcast, [1, 0] : vector<[4]x[4]xf32> to vector<[4]x[4]xf32>
%rhsBcast = vector.broadcast %rhs : vector<[4]xf32> to vector<[4]x[4]xf32>
%mul = arith.mulf %lhsT, %rhsBcast : vector<[4]x[4]xf32>
```

Can be rewritten as:

```
%mul = vector.outerproduct $lhs, $rhs : vector<[4]xf32>, vector<[4]xf32>
```

---------

Co-authored-by: Han-Chung Wang <hanhan0912@gmail.com>
2024-06-21 13:34:37 +01:00
Andrzej Warzyński
34de7fd428 [mlir][vector] Refactor vector-transfer-flatten.mlir (nfc) (1/n) (#95743)
The main goal of this and subsequent PRs is to unify and categorize
tests in:
  * vector-transfer-flatten.mlir
  
This should make it easier to identify the edge cases being tested (and
how they differ), remove duplicates and to add tests for scalable
vectors.

The main contributions of this PR:
  * split tests that covered `xfer_read` + `xfer_write` into separate
tests (majority of the existing tests check _one_ xfer Op at a time),
  * organise tests for `xfer_read` and `xfer_write` into separate
    groups (separate with a big bold comment).

Note, all tests (i.e. test cases) are preserved and some new tests are
added. Deletions that you will see in `git diff` correspond to
`xfer_write` and `xfer_read` Ops being extracted to separate functions
(so that there's one xfer Op per function). In particular, the number of
test functions has grown from 26 to 30.

In addition, this PR unifies the tests so that:
  * input variable names are consistent (e.g. make sure that the input
    memref is always `arg`)
  * CHECK lines use similar indentations
  * 2 x tabs are always used for function arguments, 1 x tab for
    function body

Finally, changes in "VectorTransferOpTransforms.cpp" are merely meant to
unify comments and logic between
  * `FlattenContiguousRowMajorTransferWritePattern` and
  * `FlattenContiguousRowMajorTransferReadPattern`.
2024-06-21 10:55:45 +01:00
Andrzej Warzyński
c65fb32ddd [mlir][vector] Update tests for collapse 3/n (nfc) (#94906)
The main goal of this PR (and subsequent PRs), is to add more tests with
scalable vectors to:
  * vector-transfer-collapse-inner-most-dims.mlir

There's quite a few cases to consider, hence this is split into multiple
PRs. In this PR, the very first test for `vector.transfer_write` is
complemented with all the possible combinations:
  * scalable (rather than fixed) unit trailing dim,
  * dynamic (rather than static) trailing dim in the source memref.

To this end, the following tests:
  * `@leading_scalable_dimension_transfer_write`
    `@trailing_scalable_one_dim_transfer_write`

are replaced with:
  * `@drop_two_inner_most_dim_scalable_inner_dim` and
    `@negative_scalable_unit_dim`,

respectively. In addition:
  * "_for_transfer_write" is removed from function names (to reduce
    noise).

In addition, to maintain consistency between the tests for `xfer_read`
and `xfer_write`, 2 negative tests for `xfer_read` are also renamed.
This is to follow the suggestion made during the review of this PR.

Extra comments in "VectorTransforms.cpp" are added to better
document the limitations related to scalable vectors and which tests
added here excercise.

This is a follow-up for: #94490 and #94604

NOTE: This PR is limited to tests for `vector.transfer_write`.
2024-06-20 17:45:03 +01:00
Hugo Trachino
2c06fb8999 [MLIR][Vector] Generalize DropUnitDimFromElementwiseOps to non leading / trailing dimensions. (#92934)
Generalizes `DropUnitDimFromElementwiseOps` to support inner unit
dimensions.
This change stems from improving lowering of contractionOps for Arm SME.
Where we end up with inner unit dimensions on MulOp, BroadcastOp and
TransposeOp, preventing the generation of outerproducts.
discussed
[here](https://discourse.llvm.org/t/on-improving-arm-sme-lowering-resilience-in-mlir/78543/17?u=nujaa).

---------

Co-authored-by: Benjamin Maxwell <macdue@dueutil.tech>
2024-06-20 10:43:23 +01:00
Cullen Rhodes
cc145f4053 [mlir][vector] Disable Gather1DToConditionalLoads for scalable vectors (#96049)
Pattern scalarizes vector.gather operations and is incorrect for
scalable vectors.
2024-06-20 08:07:43 +01:00
Hugo Trachino
74941d053c [MLIR][Vector] Implement XferOp To {Load|Store}Lowering as MaskableOpRewritePattern (#92892)
Implements `TransferReadToVectorLoadLowering` and
`TransferWriteToVectorStoreLowering` as a `MaskableOpRewritePattern`.
Allowing to exit gracefully when run on an xferOp located inside a
`vector::MaskOp` instead of breaking because the pattern generated
multiple ops in the MaskOp with `error: 'vector.mask' op expects only
one operation to mask`.

Split of https://github.com/llvm/llvm-project/pull/90835
2024-06-18 14:24:03 +01:00
Andrzej Warzyński
77db8b08c8 [mlir][vector] Restrict DropInnerMostUnitDimsTransferRead (#94904)
Restrict `DropInnerMostUnitDimsTransferRead` so that it fails when one
of the indices to be dropped could be != 0, e.g.

```mlir
func.func @negative_example(%A: memref<16x1xf32>, %i:index, %j:index) -> (vector<8x1xf32>) {
  %f0 = arith.constant 0.0 : f32
  %1 = vector.transfer_read %A[%i, %j], %f0 : memref<16x1xf32>, vector<8x1xf32>
  return %1 : vector<8x1xf32>
}
```

This is an edge case that could represent an out-of-bounds access,
though that will depend on the actual value of `%j`. Importantly,
_without this change_ it would be transformed as follows:
```mlir
  func.func @negative_example(%arg0: memref<16x1xf32>, %arg1: index, %arg2: index) -> vector<8x1xf32> {
    %cst = arith.constant 0.000000e+00 : f32
    %subview = memref.subview %arg0[0, 0] [16, 1] [1, 1] : memref<16x1xf32> to memref<16xf32, strided<[1]>>
    %0 = vector.transfer_read %subview[%arg1], %cst : memref<16xf32, strided<[1]>>, vector<8xf32>
    %1 = vector.shape_cast %0 : vector<8xf32> to vector<8x1xf32>
    return %1 : vector<8x1xf32>
  }
```

This is incorrect - `%arg2` is ignored. Hence the extra restriction to
avoid such cases.

NOTE: This PR is limited to tests for `vector.transfer_read`.
2024-06-12 16:13:51 +01:00
Hugo Trachino
0170498a7d [MLIR][Vector] Implement TransferOpReduceRank as MaskableOpRewritePattern (#92426)
Implements `TransferOpReduceRank` as a `MaskableOpRewritePattern`.
Allowing to exit gracefully when run on a `vector::transfer_read`
located inside a `vector::MaskOp` instead of generating  `error: 'vector.mask'
op expects only one operation to mask` because the
pattern generated multiple ops inside the MaskOp.

Split of https://github.com/llvm/llvm-project/pull/90835
2024-06-12 09:00:59 +01:00
Ramkumar Ramachandra
0fb216fb2f mlir/MathExtras: consolidate with llvm/MathExtras (#95087)
This patch is part of a project to move the Presburger library into
LLVM.
2024-06-11 23:00:02 +01:00
Mubashar Ahmad
b87a80d4eb [mlir][vector] Add n-d deinterleave lowering (#94237)
This patch implements the lowering for vector
deinterleave for vector of n-dimensions. Process
involves unrolling the n-d vector to a series
of one-dimensional vectors. The deinterleave
operation is then used on these vectors.

From:
```
%0, %1 = vector.deinterleave %a : vector<2x8xi8> -> vector<2x4xi8>
```

To:
```
%cst = arith.constant dense<0> : vector<2x4xi32>
%0 = vector.extract %arg0[0] : vector<8xi32> from vector<2x8xi32>
%res1, %res2 = vector.deinterleave %0 : vector<8xi32> -> vector<4xi32>
%1 = vector.insert %res1, %cst [0] : vector<4xi32> into vector<2x4xi32>
%2 = vector.insert %res2, %cst [0] : vector<4xi32> into vector<2x4xi32>
%3 = vector.extract %arg0[1] : vector<8xi32> from vector<2x8xi32>
%res1_0, %res2_1 = vector.deinterleave %3 : vector<8xi32> -> vector<4xi32>
%4 = vector.insert %res1_0, %1 [1] : vector<4xi32> into vector<2x4xi32>
%5 = vector.insert %res2_1, %2 [1] : vector<4xi32> into vector<2x4xi32>
...etc.
```
2024-06-07 10:57:00 +01:00
Han-Chung Wang
53ddc87454 [mlir][vector] Improve flattening vector.transfer_write ops. (#94051)
We can flatten the transfer ops even when the collapsed indices are not
zeros. We can compute it. It is already supported in
vector.transfer_read cases. The revision refactors the logic and reuse
it in transfer_write cases.
2024-06-05 15:07:22 -07:00
Han-Chung Wang
0ea1271ee1 [mlir][vector] Add support for unrolling vector.bitcast ops. (#94064)
The revision unrolls vector.bitcast like:

```mlir
%0 = vector.bitcast %arg0 : vector<2x4xi32> to vector<2x2xi64>
```

to

```mlir
%cst = arith.constant dense<0> : vector<2x2xi64>
%0 = vector.extract %arg0[0] : vector<4xi32> from vector<2x4xi32>
%1 = vector.bitcast %0 : vector<4xi32> to vector<2xi64>
%2 = vector.insert %1, %cst [0] : vector<2xi64> into vector<2x2xi64>
%3 = vector.extract %arg0[1] : vector<4xi32> from vector<2x4xi32>
%4 = vector.bitcast %3 : vector<4xi32> to vector<2xi64>
%5 = vector.insert %4, %2 [1] : vector<2xi64> into vector<2x2xi64>
```

The scalable vector is not supported because of the limitation of
`vector::createUnrollIterator`. The targetRank could mismatch the final
rank during unrolling; there is no direct way to query what the final
rank is from the object.
2024-06-03 16:39:52 -07:00
Kunwar Grover
debdbeda15 [mlir] Remove dialect specific bufferization passes (Reland) (#93535)
These passes have been depreciated for a long time and replaced by
one-shot bufferization. These passes are also unsafe because they do not
check for read-after-write conflicts.

Relands https://github.com/llvm/llvm-project/pull/93488 which failed on
buildbot. Fixes the failure by updating integration tests to use
one-shot-bufferize instead.
2024-05-28 20:04:27 +01:00
Artem Kroviakov
01fbc5658c [mlir][vector] Add support for linearizing Insert VectorOp in VectorLinearize (#92370)
Building on top of
[#88204](https://github.com/llvm/llvm-project/pull/88204), this PR adds
support for converting `vector.insert` into an equivalent
`vector.shuffle` operation that operates on linearized (1-D) vectors.
2024-05-28 14:54:37 +02:00
Kunwar Grover
39848d0a98 Revert "[mlir] Remove dialect specific bufferization passes" (#93528)
Reverts llvm/llvm-project#93488

Buildbot failure:
https://lab.llvm.org/buildbot/#/builders/220/builds/39911
2024-05-28 11:21:34 +01:00
Kunwar Grover
2fc5106437 [mlir] Remove dialect specific bufferization passes (#93488)
These passes have been depreciated for a long time and replaced by
one-shot bufferization. These passes are also unsafe because they do not
check for read-after-write conflicts.
2024-05-28 11:12:58 +01:00
Jakub Kuderski
714aee31e1 [mlir][vector] Add result type to interleave assembly format (#93392)
This is to make it more obvious for what the result type is, especially
with some less trivial cases like 0-d inputs resulting in 1-d inputs or
interaction with scalable vector types. Note that `vector.deinterleave`
uses the same format with explicit result type.

Also improve examples and clean up surrounding code.
2024-05-27 11:03:36 -04:00
Hugo Trachino
fdd245ad85 [MLIR][Vector] Implement transferXXPermutationLowering as MaskableOpRewritePattern (#91987)
* Implements `TransferWritePermutationLowering`,
`TransferReadPermutationLowering` and
`TransferWriteNonPermutationLowering` as a MaskableOpRewritePattern.
Allowing to exit gracefully when such use of a xferOp is inside a
`vector::MaskOp`
* Updates MaskableOpRewritePattern to handle MemRefs and buffer
semantics providing empty `Value()` as a return value for
`matchAndRewriteMaskableOp` now represents successful rewriting without
value to replace the original op.

Split of https://github.com/llvm/llvm-project/pull/90835
2024-05-20 21:46:41 +02:00
Benjamin Maxwell
e01ff8238c [mlir][vector] Fix scalability issues in drop innermost unit dims transfer patterns (#92402)
Previously, these rewrites would drop scalable dimensions and treated
`[1]` (scalable one dim) as a unit dimension. This patch propagates
scalable dimensions and ensures `[1]` is not treated as a unit
dimension.
2024-05-17 17:13:59 +01:00
Benjamin Maxwell
90d2f8c630 [mlir][vector] Teach TransferOptimization to look through trivial aliases (#87805)
This allows `TransferOptimization` to eliminate and forward stores that
are to trivial aliases (rather than just to identical memref values).

A trivial aliases is (currently) defined as:

  1. A `memref.cast`
  2. A `memref.subview` with a zero offset and unit strides
  3. A chain of 1 and 2
2024-05-16 10:53:14 +01:00
Andrzej Warzyński
cbd72cb0de [mlir][vector] Split TransposeOpLowering into 2 patterns (#91935)
Splits `TransposeOpLowering` into two patterns:
1. `Transpose2DWithUnitDimToShapeCast` - rewrites 2D `vector.transpose`
    as `vector.shape_cast` (there has to be at least one unit dim),
  2. `TransposeOpLowering` - the original pattern without the part
    extracted into `Transpose2DWithUnitDimToShapeCast`.

The rationale behind the split:
  * the output generated by `Transpose2DWithUnitDimToShapeCast` doesn't
    really match the intended output from `TransposeOpLowering` as
    documented in the source file - it doesn't make much sense to keep
    it embedded inside `TransposeOpLowering`,
* `Transpose2DWithUnitDimToShapeCast` _does_ work for scalable vectors,
    `TransposeOpLowering` _does_ not.
2024-05-14 12:08:57 +01:00
Benoit Jacob
a1d43c14d8 [mlir][vector] Add Vector-dialect interleave-to-shuffle pattern, enable in VectorToSPIRV (#92012)
This is the second attempt at merging #91800, which bounced due to a
linker error apparently caused by an undeclared dependency.
`MLIRVectorToSPIRV` needed to depend on `MLIRVectorTransforms`. In fact
that was a preexisting issue already flagged by the tool in
https://discourse.llvm.org/t/ninja-can-now-check-for-missing-cmake-dependencies-on-generated-files/74344.

Context: https://github.com/iree-org/iree/issues/17346.

Test IREE integrate showing it's fixing the problem it's intended to
fix, i.e. it allows IREE to drop its local revert of
https://github.com/llvm/llvm-project/pull/89131:

https://github.com/iree-org/iree/pull/17359

This is added to VectorToSPIRV because SPIRV doesn't currently handle
`vector.interleave` (see motivating context above).

This is limited to 1D, non-scalable vectors.
2024-05-13 15:36:28 -04:00
Benoit Jacob
5df01ed79c Revert "[mlir][vector] Add Vector-dialect interleave-to-shuffle pattern, enable in VectorToSPIRV" (#92006)
Reverts llvm/llvm-project#91800

Reason: https://lab.llvm.org/buildbot/#/builders/268/builds/13935
2024-05-13 13:50:27 -04:00
Benoit Jacob
cf40c93b5b [mlir][vector] Add Vector-dialect interleave-to-shuffle pattern, enable in VectorToSPIRV (#91800)
Context: https://github.com/iree-org/iree/issues/17346.

Test IREE integrate showing it's fixing the problem it's intended to
fix, i.e. it allows IREE to drop its local revert of
https://github.com/llvm/llvm-project/pull/89131:

https://github.com/iree-org/iree/pull/17359

This is added to VectorToSPIRV because SPIRV doesn't currently handle
`vector.interleave` (see motivating context above).

This is limited to 1D, non-scalable vectors.
2024-05-13 13:20:30 -04:00
Andrzej Warzyński
0bacffbbfc [mlir][vector] Update tests/patterns for vector.transpose (#91359)
Pretty much all logic that we have today for lowering vector.transpose
assumes fixed length vectors (it's done via vector.shuffle that don't
support scalable vectors). This patch updates related tests and patterns
to capture and document this limitation more explicitly.

Note that `vector.transpose` is a valid operation in the context of
scalable vectors, but we are yet to implement the missing lowerings.

Summary of changes:
* `@transpose_nx8x2xf32` is renamed as `@transpose_scalabl`e
  and moved near other tests using `lowering_strategy = "shuffle_1d"
  (to avoid duplicating TD sequences)
* tests specific to X86  (`avx2_lowering_strategy = true`) are moved to
  a dedicated file (to separate generic tests from target-specific
  tests)
* `@transpose10_nx4xnx1xf32` duplicated `@transpose10_4xnx1xf32` and was
  deleted (the latter is renamed as `@transpose10_4x1xf32_scalable` to
  match its fixed-width counterpart: `@transpose10_4x1xf32`)
2024-05-13 08:13:51 +01:00
Benoit Jacob
2083e97e88 Fix VectorEmulateNarrowType asserting on scalar type vs vector type. (#91613) 2024-05-09 12:42:45 -04:00