mirror of
https://github.com/intel/llvm.git
synced 2026-01-26 12:26:52 +08:00
1D multi-reductions are lowered to arith ops, which can prevent some optimisations.

I propose `ElementwiseToOuterproduct`, a pattern that matches a series of ops and generates `vector.outerproduct`. As part of a set of `ElementwiseToVectorOpsPatterns`, it could allow other elementwise ops to be fused into vector dialect ops.

Originally discussed at https://discourse.llvm.org/t/on-improving-arm-sme-lowering-resilience-in-mlir/78543/24.

Quoting @MacDue:
```
%lhsBcast = vector.broadcast %lhsCast : vector<[4]xf32> to vector<[4]x[4]xf32>
%lhsT = vector.transpose %lhsBcast, [1, 0] : vector<[4]x[4]xf32> to vector<[4]x[4]xf32>
%rhsBcast = vector.broadcast %rhs : vector<[4]xf32> to vector<[4]x[4]xf32>
%mul = arith.mulf %lhsT, %rhsBcast : vector<[4]x[4]xf32>
```
Can be rewritten as:
```
%mul = vector.outerproduct %lhsCast, %rhs : vector<[4]xf32>, vector<[4]xf32>
```

---------

Co-authored-by: Han-Chung Wang <hanhan0912@gmail.com>
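
As a sanity check on the rewrite quoted above, the following is a minimal pure-Python sketch of why the broadcast/transpose/multiply sequence computes the same values as an outer product. It is not the MLIR pattern itself: the helper names (`broadcast`, `transpose`, `mulf`, `outerproduct`, `elementwise_form`) are hypothetical stand-ins for the corresponding ops, and scalable vectors such as `vector<[4]xf32>` are modelled as fixed-length lists, which is an assumption made only for illustration.

```python
def broadcast(v, n):
    """Model a 1-D to 2-D vector.broadcast: result[i][j] = v[j]."""
    return [list(v) for _ in range(n)]


def transpose(m):
    """Model vector.transpose with permutation [1, 0]."""
    return [list(col) for col in zip(*m)]


def mulf(a, b):
    """Model an elementwise arith.mulf on two 2-D vectors of equal shape."""
    return [[x * y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def outerproduct(lhs, rhs):
    """Model vector.outerproduct: result[i][j] = lhs[i] * rhs[j]."""
    return [[x * y for y in rhs] for x in lhs]


def elementwise_form(lhs, rhs):
    """The op sequence the pattern matches, for equal-length inputs."""
    n = len(lhs)
    lhs_t = transpose(broadcast(lhs, n))  # lhs_t[i][j] = lhs[i]
    rhs_bcast = broadcast(rhs, n)         # rhs_bcast[i][j] = rhs[j]
    return mulf(lhs_t, rhs_bcast)         # result[i][j] = lhs[i] * rhs[j]


lhs = [1.0, 2.0, 3.0, 4.0]
rhs = [10.0, 20.0, 30.0, 40.0]
assert elementwise_form(lhs, rhs) == outerproduct(lhs, rhs)
```

The transpose is what turns the broadcast left-hand side from "one copy per row" into "one value per row", which is why the elementwise multiply then matches the outer product entry by entry.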