Commit Graph

8690 Commits

Author SHA1 Message Date
Rahul Kayaith
67a910bbff [mlir][python] Remove PythonAttr mapping functionality
This functionality has been replaced by TypeCasters (see D151840)

depends on D154468

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D154469
2023-07-18 12:21:28 -04:00
Razvan Lupusoru
7496177d98 [openacc] Add dialect definition for acc declare
A declare directive is used to specify the creation of a visible device
copy of a variable for the duration of the implicit data region as it
relates to the scope in which the variable is declared.

In order to support this, the following new operations were added:
1) `acc.global_ctor` and `acc.global_dtor`. These are used whenever the
declare directive applies to a global.
2) `acc.declare_enter` and `acc.declare_exit`. These operations are
modeled similarly to `acc.enter_data` and `acc.exit_data`. The reason
they are not modeled like `acc.data` is so that these operations can be
used both for globals and regions like functions.
3) `acc.declare_device_resident` and `acc.declare_link`. These
operations are modeled in a manner consistent with previously defined
data entry operation model.

The `acc.getdeviceptr` was generalized so that it can be used with
acc.declare_exit.

Reviewed By: clementval, vzakhari

Differential Revision: https://reviews.llvm.org/D155322
2023-07-18 08:11:06 -07:00
iambrj
3dd9931c0f [MLIR][Presburger] Implement domain and range restriction for PresburgerRelation
This patch implements domain and range restriction for PresburgerRelation

Reviewed By: Groverkss

Differential Revision: https://reviews.llvm.org/D154798
2023-07-18 19:12:12 +05:30
Matthias Springer
db393288ff [mlir][NVGPU][transform] Add create_async_groups transform op
This transform looks for suitable vector transfers from global memory to shared memory and converts them to async device copies.

Differential Revision: https://reviews.llvm.org/D155569
2023-07-18 14:36:41 +02:00
Alex Zinenko
4a6b31b8d8 [mlir] NFC: untangle SCF Patterns.h and Transforms.h
These two headers both contained a strange mix of definitions related to
both patterns and non-pattern transforms. Put patterns and "populate"
functions into Patterns.h and standalone transforms into Transforms.h.

Depends On: D155223

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155454
2023-07-18 11:27:36 +00:00
Andrzej Warzynski
3fa5ee67ba [mlir][ArmSME] Introduce custom TypeConverter for ArmSME
At the moment, SME-to-LLVM lowerings rely entirely on
`LLVMTypeConverter`. This patch introduces a dedicated `TypeConverter`
that inherits from `LLVMTypeConverter` (it will also be used when
lowering ArmSME Ops to LLVM).

The new type converter merely disables lowerings for `VectorType` to
prevent 2-d scalable vectors (common in the context of ArmSME), e.g.

   `vector<[16]x[16]xi8>`,

entering the LLVM Type converter. LLVM does not support arrays of
scalable vectors and hence the need for specialisation. In the case of
SME such types are effectively eliminated when emitting LLVM IR
intrinsics for SME.

Differential Revision: https://reviews.llvm.org/D155365
2023-07-18 09:35:32 +00:00
Cullen Rhodes
fb54fec726 [mlir][ArmSME] Implement tile allocation
This patch adds a pass '-allocate-sme-tiles' to the ArmSME dialect that
implements allocation of SME ZA tiles.

It does this at the 'func.func' op level by replacing
'arm_sme.get_tile_id' ops with 'arith.constant' ops that represent the
tile number. The tiles in use in a given function are tracked by an
integer function attribute 'arm_sme.tiles_in_use' that is a 16-bit tile
mask with a bit for each 128-bit element tile (ZA0.Q-ZA15.Q), the
smallest ZA tile granule. This is initialized on the first
'arm_sme.get_tile_id' rewrite and updated on each subsequent rewrite.
Mixing of different element tile types is supported.

Section B2.3.2 of the SME spec [1] describes how the 128-bit element
tiles overlap with other element tiles.

Depends on D154941

[1] https://developer.arm.com/documentation/ddi0616/aa

Reviewed By: awarzynski

Differential Revision: https://reviews.llvm.org/D154955
2023-07-18 08:46:40 +00:00
Andrzej Warzynski
447bb5bee4 [mlir][ArmSME] Introduce new lowering layer (Vector -> ArmSME)
At the moment, the lowering from the Vector dialect to SME looks like
this:

  * Vector --> SME LLVM IR intrinsics

This patch introduces a new lowering layer between the Vector dialect
and the Arm SME extension:

  * Vector --> ArmSME dialect (custom Ops) --> SME LLVM IR intrinsics.

This is motivated by 2 considerations:
1. Storing `ZA` to memory (e.g. `vector.transfer_write`) requires an
   `scf.for` loop over all rows of `ZA`. Similar logic will apply to
   "load to ZA from memory". This is a rather complex transformation and
   a custom Op seems justified.
2. As discussed in [1], we need to prevent the LLVM type converter from
   having to convert types unsupported in LLVM, e.g.
   `vector<[16]x[16]xi8>`. A dedicated abstraction layer with custom Ops
   opens a path to some fine tuning (e.g. custom type converters) that
   will allow us to avoid this.

To facilitate this change, two new custom SME Op are introduced:

  * `TileStoreOp`, and
  * `ZeroOp`.

Note that no new functionality is added - these Ops merely model what's
already supported. In particular, the following tile size is assumed
(dimension and element size are fixed):

  * `vector<[16]x[16]xi8>`

The new lowering layer is introduced via a conversion pass between the
Vector and the SME dialects. You can use the `-convert-vector-to-sme`
flag to run it. The following function:
```
func.func @example(%arg0 : memref<?x?xi8>) {
  // (...)
  %cst = arith.constant dense<0> : vector<[16]x[16]xi8>
  vector.transfer_write %cst, %arg0 : vector<[16]x[16]xi8>, memref<?x?xi8>
  return
}
```
would be lowered to:
```
  func.func @example(%arg0: memref<?x?xi8>) {
    // (...)
    %0 = arm_sme.zero : vector<[16]x[16]xi8>
    arm_sme.tile_store %arg0[%c0, %c0], %0 : memref<?x?xi8>, vector<[16]x[16]xi8>
    return
  }
```

Later, a mechanism will be introduced to guarantee that `arm_sme.zero`
and `arm_sme.tile_store` operate on the same virtual tile. For `i8`
elements this is not required as there is only one tile.

In order to lower the above output to LLVM, use
  * `-convert-vector-to-llvm="enable-arm-sme"`.

[1] https://github.com/openxla/iree/issues/14294

Reviewed By: WanderAway

Differential Revision: https://reviews.llvm.org/D154867
2023-07-18 08:04:59 +00:00
Cullen Rhodes
6ff9761a69 [mlir][ArmSME] Add custom get_tile_id and cast ops
This patch adds three new custom ops to the ArmSME dialect:

  * arm_sme.get_tile_id - returns a scalar integer representing an SME
    "virtual tile" that is not in use.
  * arm_sme.cast_tile_to_vector - casts from a tile id to a 2-d scalable
    vector type, which represents an SME "virtual tile".
  * arm_sme.cast_vector_to_tile - casts from a 2-d scalable vector type,
    which represents an SME "virtual tile", to a tile id.

The 'arm_sme.get_tile_id' op currently only supports tile 0, a follow-up
patch will implement proper tile allocation. A further follow-up patch
will demonstrate load/store to/from ZA using these ops.

See the op descriptions for further details and examples.

Thanks to @paulwalker-arm and @awarzynski for helping drive this.

Reviewed By: awarzynski, dcaballe

Differential Revision: https://reviews.llvm.org/D154941
2023-07-18 07:41:45 +00:00
Martin Erhart
d582562188 [mlir][bufferization] Add DeallocOp
The dealloc operation deallocates each of the given memrefs if there is no alias
to that memref in the list of retained memrefs and the corresponding
condition value is set. This condition can be used to indicate and pass on
ownership of memref values (or in other words, the responsibility of
deallocating that memref). If two memrefs alias each other, only one will be
deallocated to avoid double free situations.

The memrefs to be deallocated must be the originally allocated memrefs,
however, the memrefs to be retained may be arbitrary memrefs.

Returns a list of conditions corresponding to the list of memrefs which
indicates the new ownerships, i.e., if the memref was deallocated the
ownership was dropped (set to 'false') and otherwise will be the same as the
input condition.

Differential Revision: https://reviews.llvm.org/D155467
2023-07-18 07:32:49 +00:00
Matthias Springer
9f808f6e2f [mlir][vector][NFC] Drop get...AttrStrName helper functions
These functions are not needed. They are auto-generated from the `.td` files.

Differential Revision: https://reviews.llvm.org/D155483
2023-07-17 18:16:08 +02:00
Guray Ozen
baba13e9a1 [mlir][nvvm] Delete backslash
Delete the backslash. It was there to compile tablegen file. It looks like space also works fine.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D155474
2023-07-17 17:56:52 +02:00
Matthias Springer
0e8c68c301 [mlir][Interfaces] Fix DestinationStyleOpInterface for vector ops
This revision fixes `hasTensorSemantics` and `hasBufferSemantics` for vector transfer ops, which may have a vector operand. `VectorType` implements `ShapedType` and such operands do not affect whether an op has tensor or buffer semantics. Also implement `DestinationStyleOpInterface` on `TransferReadOp` so that `hasTensorSemantics`/`hasBufferSemantics` can be called. (The op has no inits, but this makes it symmetric to `TransferWriteOp`.)

Differential Revision: https://reviews.llvm.org/D155469
2023-07-17 17:40:18 +02:00
Guray Ozen
28555793b1 [mlir][nvvm] Add cp.async.bulk.tensor.shared.cluster.global
This work introduce `cp.async.bulk.tensor.shared.cluster.global` in NVVM dialect that executes load using TMA.

Depends on D155056

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155060
2023-07-17 17:10:39 +02:00
Adam Paszke
fbfff1caff [MLIR][CAPI] Add C API dialect registration methods for Arith, Math, MemRef and Vector dialects
Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D155450
2023-07-17 14:45:49 +00:00
Matthias Springer
b1d2687501 [mlir][IR] Remove duplicate isLastMemrefDimUnitStride functions
This function is duplicated in various dialects.

Differential Revision: https://reviews.llvm.org/D155462
2023-07-17 16:31:04 +02:00
Alex Zinenko
371366ce27 [mlir][nvgpu] add simple pipelining for shared memory copies
Add a simple transform operation to the NVGPU extension that performs
software pipelining of copies to shared memory. The functionality is
extremely minimalistic in this version and only supports copies from
global to shared memory inside an `scf.for` loop with either
`vector.transfer` or `nvgpu.device_async_copy` operations when
pipelining preconditions are already satisfied in the IR. This is the
minimally useful version that uses the more general loop pipeliner in an
NVGPU-specific way. Further extensions and orthogonalizations will be
necessary.

This required a change to the loop pipeliner itself to properly
propagate errors should the predicate generator fail.

This is loosely inspired from the vesion in IREE, but has less unsafe
assumptions and more principled way of communicating decisions.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155223
2023-07-17 14:29:12 +00:00
Matthias Springer
a4f4d82c35 [mlir][NVGPU][NFC] Clean up code structure
* Move passes to `Transforms` directory.
* Add `Utils.h` (will be utilized in a subsequent change).

Differential Revision: https://reviews.llvm.org/D155427
2023-07-17 14:15:42 +02:00
Matthias Gehre
0ebb050311 [MLIR] [TOSA]: Move reshape(reshape(x)) -> reshape(x) from canonicalization to fold
reshape(reshape(x)) -> reshape(x) can be directly written as a fold instead of a canonicalization,
to help other passes cleanup while they work.

This initially broke ReshapeConverterExpand/Collapse, which relies on creating foldable reshapes and a carefully crafted
benefit priority of patterns.
I turned this into a single pattern on reshapes, which does expand and/or collapse as needed in one go.

Differential Revision: https://reviews.llvm.org/D155266
2023-07-17 10:14:37 +02:00
Matthias Springer
98770ecd76 [mlir][bufferization] Add buffer_loop_hoisting transform op
This op hoists buffer allocation from loops.

Differential Revision: https://reviews.llvm.org/D155289
2023-07-14 17:09:38 +02:00
Nicolas Vasilache
5e877caf4d [mlir] Add an IntNEQValue predicate
Differential Revision: https://reviews.llvm.org/D155298
2023-07-14 16:57:04 +02:00
Nicolas Vasilache
9e54d5e778 [mlir] NFC - Basic improvements to IndexingUtils (product and sum) 2023-07-14 16:41:31 +02:00
Nicolas Vasilache
ed68282942 Revert "[mlir][memref] NFC - Move utility function declaration from IR/MemRef.h to Utils/MemRefUtils.h"
This reverts commit 8b161e9772.

This creates cyclic dependencies that cannot be easily untangled for now.
2023-07-14 16:31:54 +02:00
Matthias Springer
fd5cda3393 [mlir][vector][NFC] Minor VectorTransferOpInterface cleanup
* Rename functions with underscore to camel case.
* Return C++ bools of "in_bounds" values instead of an `ArrayAttr`.

Differential Revision: https://reviews.llvm.org/D155277
2023-07-14 15:41:21 +02:00
Markus Böck
9170fa5808 [mlir][LLVM] Convert access group metadata to using attributes instead of ops
Using MLIR attributes instead of metadata has many advantages:
* No indirection: Attributes can simply refer to each other seemlessly without having to use the indirection of `SymbolRefAttr`. This also gives us correctness by construction in a lot of places as well
* Multithreading safe: The Attribute infrastructure gives us thread-safety for free. Creating operations and inserting them into a block is not thread-safe. This is a major use case for e.g. the inliner in MLIR which runs in parallel
* Easier to create: There is no need for a builder or a metadata region

This patch therefore does exactly that. It leverages the new distinct attributes to create distinct access groups in a deterministic and threadsafe manner.

Differential Revision: https://reviews.llvm.org/D155285
2023-07-14 14:57:46 +02:00
Matthias Springer
1a5aa77f30 [mlir][linalg] BufferizeToAllocationOp: Add option to specify custom alloc op
Supported ops are "memref.alloc" and "memref.alloca".

Differential Revision: https://reviews.llvm.org/D155282
2023-07-14 13:39:05 +02:00
Matthias Springer
88f4292a16 [mlir][bufferization] OneShotBufferizeOp: Add options to use linalg.copy
This new option allows users to specify a custom memcpy op.

Differential Revision: https://reviews.llvm.org/D155280
2023-07-14 13:34:22 +02:00
Nicolas Vasilache
8b161e9772 [mlir][memref] NFC - Move utility function declaration from IR/MemRef.h to Utils/MemRefUtils.h 2023-07-14 11:24:22 +02:00
Nicolas Vasilache
0489cfe13d Revert "[RandomIRBuilder] Remove use of getNonOpaquePointerElementType() (NFC)"
This reverts commit afdb83b19c.

This was landed with a bad description.
2023-07-14 11:24:22 +02:00
Markus Böck
78d00a160f [mlir][LLVM] Convert alias metadata to using attributes instead of ops
Using MLIR attributes instead of metadata has many advantages:
* No indirection: Attributes can simply refer to each other seemlessly without having to use the indirection of `SymbolRefAttr`. This also gives us correctness by construction in a lot of places as well
* Multithreading save: The Attribute infrastructure gives us thread-safety for free. Creating operations and inserting them into a block is not thread-safe. This is a major use case for e.g. the inliner in MLIR which runs in parallel
* Easier to create: There is no need for a builder or a metadata region

This patch therefore does exactly that. It leverages the new distinct attributes to create distinct alias domains and scopes in a deterministic and threadsafe manner.

Differential Revision: https://reviews.llvm.org/D155159
2023-07-14 11:14:42 +02:00
Nikita Popov
afdb83b19c [RandomIRBuilder] Remove use of getNonOpaquePointerElementType() (NFC) 2023-07-14 11:09:01 +02:00
Hideto Ueno
cf40fde4ed [mlir] Don't emit forward declaration for user defined storage classes
Currently DefGen::emitDecl always emits forward declarations of storage classes even for user define ones, which makes it difficult to use template class directly in ODS. This patch changes `DefGen` not to emit forward decl when `genStorageClass` is false.

Original discussion: https://discourse.llvm.org/t/use-template-classes-as-user-defined-storage-classes/72015

Reviewed By: mehdi_amini, rriddle

Differential Revision: https://reviews.llvm.org/D155225
2023-07-13 21:14:48 -07:00
Hanhan Wang
8fc433f055 [mlir][MemRef] Move narrow type emulation common methods to MemRefUtils.
It also unifies the computation of StridedLayoutAttr. If the stride is
static known value, we can just use it.

Differential Revision: https://reviews.llvm.org/D155017
2023-07-13 14:43:21 -07:00
Guray Ozen
22a32f7d9c [mlir][gpu] Add dump-ptx option
When targeting NVIDIA GPUs, seeing the generated PTX is important. Currently, we don't have simple way to do it.

This work adds dump-ptx to gpu-to-cubin pass. One can use it like `gpu-to-cubin{chip=sm_90 features=+ptx80 dump-ptx}`.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155166
2023-07-13 21:14:57 +02:00
Nicolas Vasilache
39427a4fbb [mlir][Linalg] Fold/erase self-copy linalg.copy on buffers
Differential Revision: https://reviews.llvm.org/D155203
2023-07-13 16:38:02 +02:00
Jan Sjodin
45a9604417 [Flang][OpenMP][MLIR] Add early outlining pass for omp.target operations to flang
This patch implements an early outlining transform of omp.target operations in
flang. The pass is needed because optimizations may cross target op region
boundaries, but with the outlining the resulting functions only contain a
single omp.target op plus a func.return, so there should not be any opportunity
to optimize across region boundaries.

The patch also adds an interface to be able to store and retrieve the parent
function name of the original target operation. This is needed to be able to
create correct kernel function names when lowering to LLVM-IR.

Reviewed By: kiranchandramohan, domada

Differential Revision: https://reviews.llvm.org/D154879
2023-07-13 09:14:42 -04:00
Guray Ozen
eda52f3cd3 [mlir][nvvm] Add populate function (nfc)
This work adds populate function for the nvvm to llvm conversion pattern.

Reviewed By: kuhar

Differential Revision: https://reviews.llvm.org/D155189
2023-07-13 14:53:51 +02:00
Adam Paszke
c83318e3e0 [MLIR][Python] Implement pybind adapters for MlirBlock
Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D155092
2023-07-12 22:27:01 -07:00
Adam Paszke
86bc2e3ae9 [MLIR] Add a number of methods to the C API
Those include:
- mlirFuncSetArgAttr
- mlirOperationSetOperands
- mlirRegionTakeBody
- mlirBlockInsertArgument

Reviewed By: ftynse, jpienaar

Differential Revision: https://reviews.llvm.org/D155091
2023-07-12 22:10:03 -07:00
Hideto Ueno
d138c89148 [mlir] Forward arguments of pair in SubElementInterface::replaceImmediateSubElementsImpl
`SubElementInterface::replaceImmediateSubElementsImpl` specializes tuples so that
arguments are forwarded to type getter. However currently pairs are not supported even though
an example in documents uses a pair as a key type. This patch adds support for pairs as well.

Reviewed By: Mogball

Differential Revision: https://reviews.llvm.org/D155043
2023-07-12 22:07:27 -07:00
Jakub Kuderski
4ba61f5a30 [mlirv][spirv] Add KHR Cooperative Matrix type and extension
Start plumbing through support for the `SPV_KHR_cooperative_matrix`
extension: https://github.com/KhronosGroup/SPIRV-Registry/blob/master/extensions/KHR/SPV_KHR_cooperative_matrix.html.

Register the extension, add new coop matrix type, and add
`spirv.KHR.CooperativeMatrixLength` op to exercise it.

Make sure that mixing of the KHR and NV coop matrix extensions is not
allowed. Make cast verification more robust.

Reviewed By: antiagainst, qedawkins

Differential Revision: https://reviews.llvm.org/D154877
2023-07-12 21:11:08 -04:00
Peiming Liu
269c82d389 [mlir][sparse] introduce new 2:4 block sparsity level type.
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D155128
2023-07-12 23:33:53 +00:00
Tai Ly
d713a00270 [TOSA] Add level checks and remove Tensor1DTo4D
Remove Tosa_Tensor1Dto4D and Tosa_TensorUpto4D in the Tosa Dialect
and added level checks to TosaValidation pass to validate per spec.

Signed-off-by: Tai Ly <tai.ly@arm.com>
Change-Id: Icd32137e9f8051f99994cee9f388f20c1a840f4b

Reviewed By: eric-k256

Differential Revision: https://reviews.llvm.org/D154273
2023-07-12 16:56:44 +00:00
Matthias Springer
d3ddcfd448 [mlir][DialectUtils] Generalize extractFromI64ArrayAttr helper
Generalize `extractFromI64ArrayAttr` to `extractFromIntegerArrayAttr`, so that arbitrary integer/bool types can be extracted.

Differential Revision: https://reviews.llvm.org/D154974
2023-07-12 17:59:40 +02:00
Amanda Tang
47b0a9b931 [ODS] Extra Concrete Declarations and Definitions under Traits
Support extra concrete class declarations and definitions under NativeTrait that get injected into the class that specifies the trait. Extra declarations and definitions can be passed in as template arguments for NativeOpTraitNativeAttrTrait and NativeTypeTrait.

Usage examples of this feature include:

- Creating a wrapper Trait for authoring inferReturnTypes with the OpAdaptor by specifying necessary Op specific declarations and definitions directly in the trait
- Refactoring the InferTensorType trait

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D154731
2023-07-12 08:46:19 -07:00
Ingo Müller
ab86b8cef4 [mlir][linalg][transform] Fix printing of TileToForall in edge case.
The `static_(num_threads|tile_sizes)` attributes of this op are
`DefaultValuedOptionalAttr`s, so they can be constructed *without* such
an attribute. In other words, the following is a valid op (note the
absense of the `static_num_threads` attribute):

"builtin.module"() ({
  "transform.sequence"() <{failure_propagation_mode = 1 : i32, operand_segment_sizes = array<i32: 0, 0>}> ({
  ^bb0(%arg0: !pdl.operation, %arg1: !transform.op<"linalg.matmul">, %arg2: !transform.op<"linalg.elemwise_binary">):
    %0 = "transform.structured.match"(%arg0) <{ops = ["test.dummy"]}> : (!pdl.operation) -> !pdl.operation
    %1:2 = "transform.structured.tile_to_forall_op"(%arg1, %0) <{operand_segment_sizes = array<i32: 1, 0, 0, 0, 1>}> : (!transform.op<"linalg.matmul">, !pdl.operation) -> (!transform.op<"scf.forall">, !transform.op<"linalg.matmul">)
    "transform.yield"() : () -> ()
  }) : () -> ()
}) : () -> ()

However, the custom printing directive converted those to an `ArrayRef`,
which crashes if done on an empty `ArrayAttr`. This patch changes the
signature such that no automatic conversion takes place and extends the
test to test for existinnce of the attribute.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D155062
2023-07-12 13:30:15 +00:00
Guray Ozen
ffbca7e9f3 [mlir][nvvm] Change return type of std::string of getPtx of PtxBuilder
getPtx used to return `const char*`. It is not flexible when one needs to build string in the function. This work changes return type.

Reviewed By: springerm

Differential Revision: https://reviews.llvm.org/D155056
2023-07-12 14:59:54 +02:00
Adrian Kuegel
7724c4b5a9 [mlir] Apply ClangTidy fixes
The get() call is redundant.
2023-07-12 11:31:05 +02:00
Marius Brehler
a2426eb603 [mlir][emitc] Add div, mul and rem operators
This adds operations for binary multiplicative arithmetic operators to
EmitC. The input and output arguments for the remainder operator are
restricted to index (emitted as size_t), integers and the EmitC opaque
types (as the operator can be overloaded for a custom type). The
multiplication and division operator further support floating point
numbers.

Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D154846
2023-07-12 08:45:10 +02:00
yzhang93
9a7677d8ee [mlir] Narrow bitwidth emulation for vector.load
This patch is a following for the previous patch https://reviews.llvm.org/D151519.
With this patch, vector.load op with narrow bitwidth (e.g., i4) can be converted to
supported wider bitwidth (e.g., i8).

Reviewed By: hanchung

Differential Revision: https://reviews.llvm.org/D154178
2023-07-11 13:38:15 -07:00