Add an option that does not bufferize the targeted op itself, but just materializes a buffer for the destination operands. This is useful for partial bufferization of complex ops such as `scf.forall`, which need special handling (and an analysis of the region).
Differential Revision: https://reviews.llvm.org/D155946
Applying the canonicalizer and CSE in an interleaved fashion is useful after bufferization (and maybe other transforms) to fold away self copies.
Differential Revision: https://reviews.llvm.org/D155933
The Op creates a tensor map descriptor object representing a tiled memory region. The descriptor is used by Tensor Memory Access (TMA). The `tensor` is the source tensor to be tiled. The `boxDimensions` is the size of the tiled memory region in each dimension.
The pattern here lowers `tma.create.descriptor` to a runtime function call that eventually calls the CUDA Driver's `cuTensorMapEncodeTiled`. For more information see below:
https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TENSOR__MEMORY.html
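A sketch of intended usage (the exact assembly format and the descriptor type parameters here are illustrative assumptions, not verbatim from the patch):
```
// Create a TMA descriptor for 128x64 tiles of a global-memory tensor.
%desc = nvgpu.tma.create.descriptor %src box[%c128, %c64]
    : memref<*xf32> -> !nvgpu.tensormap.descriptor<
        tensor = memref<128x64xf32, 3>, swizzle = none,
        l2promo = none, oob = zero, interleave = none>
```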
Depends on D155453
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155680
RegionBranchOpInterface did not allow an operation with regions to
specify itself as a successor. This implied that control
is always transferred to a region before being transferred back to the
parent op. Since a region can only transfer control back to the
parent op from a terminator, this transitively implied that the first
block of any region of an op implementing RegionBranchOpInterface is
always executed until the terminator can transfer the control flow
back. This is trivially false for any conditional-like operation that
may or may not execute the region, as well as for loop-like operations
that may not execute the body.
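For instance, the region of an `scf.if` without an `else` may never execute, so control can transfer from the parent op directly back to itself:
```
// The "then" region is skipped entirely when %cond is false, so the
// parent op must be allowed to list itself as a successor.
scf.if %cond {
  "test.side_effecting_op"() : () -> ()
}
```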
Remove the restriction from the interface description and update the
only transform that relied on it.
See
https://discourse.llvm.org/t/rfc-region-control-flow-interfaces-should-encode-region-not-executed-correctly/72103.
Depends On: https://reviews.llvm.org/D155757
Reviewed By: Mogball, springerm
Differential Revision: https://reviews.llvm.org/D155822
Initial implementations of dense dataflow analyses feature special cases
for operations that have region- or call-based control flow by
leveraging the corresponding interfaces. This is not necessarily
sufficient, as these operations may influence the dataflow state by
themselves as well as through the control flow. For example,
`linalg.generic` and similar operations have region-based control flow
and memory effects of their own, so any memory-related analysis such as
last-writer requires processing `linalg.generic` directly instead of, or
in addition to, the region-based flow.
Provide hooks to customize the processing of operations with region-
and call-based control flow in forward and backward dense dataflow
analyses. These hooks are triggered when control flow is transferred
between the "main" operation, i.e. the call or the region owner, and
another region. Such an approach allows the analyses to update the
lattice before and/or after the regions. In the `linalg.generic`
example, the reads from memory are interpreted as happening before the
body region and the writes to memory as happening after
the body region. Using these hooks in a generic analysis may require
introducing additional interfaces, but for now assume that specific
analyses have special cases for the (rare) operations with call- and
region-based control flow that need additional processing.
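As a concrete illustration of the motivating case, consider `linalg.generic` on memrefs (a minimal sketch): the op itself reads and writes memory in addition to transferring control to its body region.
```
#id = affine_map<(d0) -> (d0)>
// The op reads %A (interpreted as happening before the body executes)
// and writes %B (interpreted as happening after the body executes),
// on top of its region-based control flow.
linalg.generic {indexing_maps = [#id, #id],
                iterator_types = ["parallel"]}
    ins(%A : memref<8xf32>) outs(%B : memref<8xf32>) {
^bb0(%a: f32, %b: f32):
  linalg.yield %a : f32
}
```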
Reviewed By: Mogball, phisiart
Differential Revision: https://reviews.llvm.org/D155757
This work adds the `nvgpu.tma.async.load` Op that requests a TMA load asynchronously using an mbarrier object.
It also creates the `nvgpu.tma.descriptor` type. The type is supposed to be created by the `cuTensorMapEncodeTiled` CUDA Driver API.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155453
Linalg structured ops do not implement control flow in the way expected
by RegionBranchOpInterface, and the interface implementation isn't
actually used anywhere. The presence of this interface without a correct
implementation is confusing for, e.g., dataflow analyses.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D155841
Using properties currently requires at the very least implementing four methods/code snippets:
* `convertToAttribute`
* `convertFromAttribute`
* `writeToMlirBytecode`
* `readFromMlirBytecode`
This makes replacing attributes with properties harder than it has to be: attributes by default do not require immediately defining a custom bytecode encoding.
This patch therefore adds opt-in implementations of `writeToMlirBytecode` and `readFromMlirBytecode` that work with the default implementations of `convertToAttribute` and `convertFromAttribute`. They are provided by `defvar`s in `OpBase.td` and can be used by adding:
```
let writeToMlirBytecode = writeMlirBytecodeWithConvertToAttribute;
let readFromMlirBytecode = readMlirBytecodeUsingConvertFromAttribute;
```
to one's TableGen definition.
While this bytecode encoding is almost certainly not ideal for a given property, it allows more incremental use of properties and getting something sane working before optimizing the bytecode format.
Differential Revision: https://reviews.llvm.org/D155286
Author `inferReturnTypeComponents` methods with the op adaptor by using the `InferShapedTypeOpAdaptor`.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D155243
Wave Matrix Multiply Accumulate (WMMA) is the instruction to accelerate
matrix multiplication on RDNA3 architectures. LLVM already provides a
set of intrinsics to generate wmma instructions. This change uses those
intrinsics to enable the feature in MLIR.
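For illustration, a use of the new op might look like the following (the op name and vector shapes follow AMDGPU dialect conventions and are assumptions, not verbatim from this patch):
```
// 16x16x16 WMMA: f16 operands with an f16 accumulator (sketch).
%d = amdgpu.wmma %a * %b + %c
    : vector<16xf16>, vector<16xf16>, vector<8xf16>
```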
Reviewed By: krzysz00
Differential Revision: https://reviews.llvm.org/D152451
Continue the work outlined in D155747 and split the main SPIR-V ops
implementation file into a few smaller and quicker to compile files.
This organization matches the op definition organization in `.td` files.
In this patch, extract atomic, cast/conversion, and group op
implementation into separate files.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D155777
This gives the region a more meaningful name. The topic came up in a
discussion on https://reviews.llvm.org/D155435, where the name `region`
would have led to a situation where a convenience accessor called
`region` (after the ODS name) would have returned a Block.
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D155810
This patch modifies the construction of the `OpenMPIRBuilder` in MLIR to
initialize the `IsGPU` flag using target triple information passed down from
the Flang frontend. If not present, it will default to `false`.
This replicates the behavior currently implemented in Clang, where the
`CodeGenModule::createOpenMPRuntime()` method creates a different
`CGOpenMPRuntime` instance depending on the target triple, which in turn has an
effect on the `IsGPU` flag of the `OpenMPIRBuilderConfig` object.
Differential Revision: https://reviews.llvm.org/D151903
Currently, inlining a function with a `noalias` parameter leads to a large loss of optimization potential as the `noalias` parameter, an important hint for alias analysis, is lost completely.
This patch fixes this with the same approach as LLVM by annotating all users of the `noalias` parameter with appropriate alias and noalias scope lists.
The implementation done here is not as sophisticated as LLVM's, which has more infrastructure related to escaping and captured pointers, but it should work in the majority of important cases.
Any deficiency can be addressed in future patches.
Related LLVM code: 27ade4b554/llvm/lib/Transforms/Utils/InlineFunction.cpp (L1090)
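A rough sketch of the effect in the LLVM dialect (the scope attribute contents are elided; the annotation shape is an assumption based on the dialect's alias metadata support):
```
// Before inlining: the callee's pointer parameter is marked noalias.
llvm.func @callee(%p: !llvm.ptr {llvm.noalias}) {
  %v = llvm.load %p : !llvm.ptr -> f32
  llvm.return
}
// After inlining into a caller, accesses derived from %p carry alias
// scope lists so the noalias fact survives (scope contents elided):
//   %v = llvm.load %p {alias_scopes = [...], noalias_scopes = [...]}
//       : !llvm.ptr -> f32
```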
Differential Revision: https://reviews.llvm.org/D155712
This work adds two Ops:
`mbarrier.arrive.expect_tx` performs an expect-tx arrive on an `mbarrier.barrier` and returns an `mbarrier.barrier.token`.
`mbarrier.try_wait.parity` waits on an `mbarrier.barrier` and an `mbarrier.barrier.token`.
`mbarrier.arrive.expect_tx` is one of the requirements for enabling H100 TMA support.
Depends on D154074 D154076 D154059 D154060
Reviewed By: qcolombet
Differential Revision: https://reviews.llvm.org/D154094
This revision adds a branch weight op interface for the call / branch
operations that support branch weights. It can be used in the LLVM IR
import and export to simplify the branch weight conversion. An
additional mapping between call operations and instructions ensures
the actual conversion can be done in the module translation itself,
rather than in the dialect translation interface. It also has the
benefit that downstream users can amend custom metadata to the call
operation during the export to LLVM IR.
Reviewed By: zero9178, definelicht
Differential Revision: https://reviews.llvm.org/D155702
Also update the documentation of `Operation::fold`, which did not take into account in-place foldings.
Differential Revision: https://reviews.llvm.org/D155691
Commutative ops were previously folded with a special rule in `OperationFolder`. This change turns the folding into a proper `OpTrait` folder.
Differential Revision: https://reviews.llvm.org/D155687
The main op implementation file for SPIR-V grew past 5k LOC. This makes it
take a long time to compile and index with LSPs like clangd.
Pull out the first few SPIR-V extension ops into their own `.cpp` files,
just like we do with `.td` op definitions. This includes the
KHR/NV/Intel coop matrix and the integer dot product extensions.
I plan to further split this in future revisions.
Reviewed By: antiagainst
Differential Revision: https://reviews.llvm.org/D155747
TL;DR the following API functions have been merged
```
void populateFoldUnitExtentDimsViaReshapesPatterns(RewritePatternSet &patterns);
void populateFoldUnitExtentDimsViaSlicesPatterns(RewritePatternSet &patterns);
```
into
```
void populateFoldUnitExtentDimsPatterns(RewritePatternSet &patterns,
ControlDropUnitDims &options);
```
To use the previous functionality use
```
ControlDropUnitDims options;
// By default options.rankReductionStrategy is
// ControlDropUnitDims::RankReductionStrategy::ReassociativeReshape.
populateFoldUnitExtentDimsPatterns(patterns, options);
```
and
```
ControlDropUnitDims options;
options.rankReductionStrategy = ControlDropUnitDims::RankReductionStrategy::ExtractInsertSlice;
populateFoldUnitExtentDimsPatterns(patterns, options);
```
This pass is quite old and needed to be updated based on the current
approach to transformations in Linalg:
- Instead of two patterns, one to just remove loop dimensions that are
unit extent (and using 0 in the indexing maps), and another to drop
the unit extents in the operand shapes, combine them into a single
transformation. This avoids creating an intermediate step with
indexing maps having 0s in the domain expressions.
- Expose the core transformation as a utility function and add a
pattern that calls this transformation.
This is a mostly NFC change, apart from the API change and dropping
the patterns/tests that only dropped the loops that are unit extents.
Differential Revision: https://reviews.llvm.org/D155518
Allow the init and combiner regions to have more
arguments to pass information.
Reviewed By: razvanlupusoru
Differential Revision: https://reviews.llvm.org/D155656
For variables in declare clauses, their producing operation should be
marked with the data clause for ease of lookup and consistency
verification. Thus, add an attribute that can be used for this purpose,
plus verification that the declare data operation matches the declare
data clause on the variable.
Reviewed By: clementval
Differential Revision: https://reviews.llvm.org/D155640
The current representation of TBAA is the very last in-tree user of the `llvm.metadata` operation.
Using ops to model metadata has a few disadvantages:
* Building a graph has to be done through some weakly typed indirection mechanism such as `SymbolRefAttr`
* Creating the metadata has to be done through a builder within a metadata op.
* It is not multithreading safe as operation insertion into the same block is not thread-safe
This patch therefore converts TBAA metadata into an attribute representation, in a similar manner as it has been done for alias groups and access groups in previous patches.
This additionally has the large benefit of giving us more "correctness by construction" as it makes things like cycles in a TBAA graph, or references to an incorrectly typed metadata node impossible.
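A small sketch of the attribute-based encoding (the parameter values are illustrative):
```
// TBAA graph expressed as attributes instead of llvm.metadata ops.
#root = #llvm.tbaa_root<id = "Simple C/C++ TBAA">
#int_desc = #llvm.tbaa_type_desc<id = "int", members = {<#root, 0>}>
#int_tag = #llvm.tbaa_tag<base_type = #int_desc,
                          access_type = #int_desc, offset = 0>

llvm.func @read(%p: !llvm.ptr) -> i32 {
  %v = llvm.load %p {tbaa = [#int_tag]} : !llvm.ptr -> i32
  llvm.return %v : i32
}
```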
Differential Revision: https://reviews.llvm.org/D155444
Adds a generic lowering that supports all cases of `bufferization.dealloc`
and one specialized, more efficient lowering for the simple case. Using
a helper function with for loops in the general case keeps the size of
the lowered code in O(|num_dealloc_memrefs| + |num_retain_memrefs|).
Depends on D155467
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D155468
This work introduces the `wgmma.fence.aligned`, `wgmma.commit.group.sync.aligned`, and `wgmma.wait.group.sync.aligned` Ops. They are used to synchronize warpgroup-level matrix multiply-accumulate instructions, also known as WGMMA.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155676
This is quite the niche intrinsic, whose whole purpose is to be able to essentially split an SSA value to be able to attach additional information to the new value.
It therefore really acts like a noop that just passes through its argument. It interestingly does not have any support in CodeGen and is therefore required to also be deleted by any pass creating it.
Differential Revision: https://reviews.llvm.org/D155678
`setListener` is dangerous because an already registered listener may accidentally be overwritten/replaced. (A `ForwardingListener` must be used in such cases.) This change updates a few trivial call sites of `setListener`, where no forwarding listener is needed.
Differential Revision: https://reviews.llvm.org/D155599
This is specified in the spec, but we just never really needed it. This
allows for users of the LSP libraries to inspect information about the
client that is connected to the server.
Differential Revision: https://reviews.llvm.org/D155566
Use EnumAttr instead of I64EnumAttr for the DataClause. This makes
tests more readable, since one can now use the format
`#acc<data_clause acc_copyin>` instead of just a number.
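Schematically (the values are illustrative):
```
// Before: dataClause = 1 : i64
// After:  dataClause = #acc<data_clause acc_copyin>
```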
Reviewed By: clementval, vzakhari
Differential Revision: https://reviews.llvm.org/D155605
Author `inferReturnTypes` methods with the op adaptor by using the `InferTypeOpAdaptor`.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D155115
This patch adds an interface, named AggregatedOpInterface, that decomposes
complex operations into simpler ones.
For now, make the interface specific to Linalg because although the concept
is general, the way to materialize it needs some maturing.
Use that interface with the softmax operator.
Differential Revision: https://reviews.llvm.org/D154363
This functionality has been replaced by TypeCasters (see D151840)
Depends on D154468
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D154469
A declare directive is used to specify the creation of a visible device
copy of a variable for the duration of the implicit data region as it
relates to the scope in which the variable is declared.
In order to support this, the following new operations were added:
1) `acc.global_ctor` and `acc.global_dtor`. These are used whenever the
declare directive applies to a global.
2) `acc.declare_enter` and `acc.declare_exit`. These operations are
modeled similarly to `acc.enter_data` and `acc.exit_data`. The reason
they are not modeled like `acc.data` is so that these operations can be
used both for globals and regions like functions.
3) `acc.declare_device_resident` and `acc.declare_link`. These
operations are modeled in a manner consistent with the previously
defined data entry operation model.
The `acc.getdeviceptr` operation was generalized so that it can be used
with `acc.declare_exit`.
Reviewed By: clementval, vzakhari
Differential Revision: https://reviews.llvm.org/D155322
This patch implements domain and range restriction for `PresburgerRelation`.
Reviewed By: Groverkss
Differential Revision: https://reviews.llvm.org/D154798
This transform looks for suitable vector transfers from global memory to shared memory and converts them to async device copies.
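Schematically, and as a sketch only (abbreviated op forms, not actual output from the transform):
```
// Before: global -> shared staging expressed as vector transfers.
%v = vector.transfer_read %global[%i, %j], %pad
    : memref<1024x1024xf32>, vector<4xf32>
vector.transfer_write %v, %shared[%i, %j]
    : vector<4xf32>, memref<128x32xf32, 3>

// After: an asynchronous device copy with explicit completion.
%token = nvgpu.device_async_copy %global[%i, %j], %shared[%i, %j], 4
    : memref<1024x1024xf32> to memref<128x32xf32, 3>
%group = nvgpu.device_async_create_group %token
nvgpu.device_async_wait %group
```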
Differential Revision: https://reviews.llvm.org/D155569
These two headers both contained a strange mix of definitions related to
both patterns and non-pattern transforms. Put patterns and "populate"
functions into Patterns.h and standalone transforms into Transforms.h.
Depends On: D155223
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155454
At the moment, SME-to-LLVM lowerings rely entirely on
`LLVMTypeConverter`. This patch introduces a dedicated `TypeConverter`
that inherits from `LLVMTypeConverter` (it will also be used when
lowering ArmSME Ops to LLVM).
The new type converter merely disables lowerings for `VectorType` to
prevent 2-d scalable vectors (common in the context of ArmSME), e.g.
`vector<[16]x[16]xi8>`,
from entering the LLVM type converter. LLVM does not support arrays of
scalable vectors, hence the need for specialisation. In the case of
SME, such types are effectively eliminated when emitting LLVM IR
intrinsics for SME.
Differential Revision: https://reviews.llvm.org/D155365
This patch adds a pass '-allocate-sme-tiles' to the ArmSME dialect that
implements allocation of SME ZA tiles.
It does this at the 'func.func' op level by replacing
'arm_sme.get_tile_id' ops with 'arith.constant' ops that represent the
tile number. The tiles in use in a given function are tracked by an
integer function attribute 'arm_sme.tiles_in_use' that is a 16-bit tile
mask with a bit for each 128-bit element tile (ZA0.Q-ZA15.Q), the
smallest ZA tile granule. This is initialized on the first
'arm_sme.get_tile_id' rewrite and updated on each subsequent rewrite.
Mixing of different element tile types is supported.
Section B2.3.2 of the SME spec [1] describes how the 128-bit element
tiles overlap with other element tiles.
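Schematically (a sketch; the tile-id type and the attribute rendering are illustrative):
```
// Before: an abstract "virtual tile" id.
%tile_id = arm_sme.get_tile_id : i8
// After -allocate-sme-tiles: a concrete tile number; the enclosing
// func.func carries the updated arm_sme.tiles_in_use mask attribute.
%tile_id = arith.constant 0 : i8
```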
Depends on D154941
[1] https://developer.arm.com/documentation/ddi0616/aa
Reviewed By: awarzynski
Differential Revision: https://reviews.llvm.org/D154955
At the moment, the lowering from the Vector dialect to SME looks like
this:
* Vector --> SME LLVM IR intrinsics
This patch introduces a new lowering layer between the Vector dialect
and the Arm SME extension:
* Vector --> ArmSME dialect (custom Ops) --> SME LLVM IR intrinsics.
This is motivated by 2 considerations:
1. Storing `ZA` to memory (e.g. `vector.transfer_write`) requires an
`scf.for` loop over all rows of `ZA`. Similar logic will apply to
"load to ZA from memory". This is a rather complex transformation and
a custom Op seems justified.
2. As discussed in [1], we need to prevent the LLVM type converter from
having to convert types unsupported in LLVM, e.g.
`vector<[16]x[16]xi8>`. A dedicated abstraction layer with custom Ops
opens a path to some fine tuning (e.g. custom type converters) that
will allow us to avoid this.
To facilitate this change, two new custom SME Ops are introduced:
* `TileStoreOp`, and
* `ZeroOp`.
Note that no new functionality is added - these Ops merely model what's
already supported. In particular, the following tile size is assumed
(dimension and element size are fixed):
* `vector<[16]x[16]xi8>`
The new lowering layer is introduced via a conversion pass between the
Vector and the SME dialects. You can use the `-convert-vector-to-sme`
flag to run it. The following function:
```
func.func @example(%arg0 : memref<?x?xi8>) {
// (...)
%cst = arith.constant dense<0> : vector<[16]x[16]xi8>
vector.transfer_write %cst, %arg0 : vector<[16]x[16]xi8>, memref<?x?xi8>
return
}
```
would be lowered to:
```
func.func @example(%arg0: memref<?x?xi8>) {
// (...)
%0 = arm_sme.zero : vector<[16]x[16]xi8>
arm_sme.tile_store %arg0[%c0, %c0], %0 : memref<?x?xi8>, vector<[16]x[16]xi8>
return
}
```
Later, a mechanism will be introduced to guarantee that `arm_sme.zero`
and `arm_sme.tile_store` operate on the same virtual tile. For `i8`
elements this is not required as there is only one tile.
In order to lower the above output to LLVM, use
* `-convert-vector-to-llvm="enable-arm-sme"`.
[1] https://github.com/openxla/iree/issues/14294
Reviewed By: WanderAway
Differential Revision: https://reviews.llvm.org/D154867
This patch adds three new custom ops to the ArmSME dialect:
* arm_sme.get_tile_id - returns a scalar integer representing an SME
"virtual tile" that is not in use.
* arm_sme.cast_tile_to_vector - casts from a tile id to a 2-d scalable
vector type, which represents an SME "virtual tile".
* arm_sme.cast_vector_to_tile - casts from a 2-d scalable vector type,
which represents an SME "virtual tile", to a tile id.
The 'arm_sme.get_tile_id' op currently only supports tile 0, a follow-up
patch will implement proper tile allocation. A further follow-up patch
will demonstrate load/store to/from ZA using these ops.
See the op descriptions for further details and examples.
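A sketch of how the three ops compose (the types are illustrative):
```
// Acquire an unused virtual tile and view it as a 2-d scalable vector.
%tile_id = arm_sme.get_tile_id : i8
%tile = arm_sme.cast_tile_to_vector %tile_id : i8 to vector<[16]x[16]xi8>
// ... use %tile ...
// Recover the tile id from the vector when needed.
%id = arm_sme.cast_vector_to_tile %tile : vector<[16]x[16]xi8> to i8
```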
Thanks to @paulwalker-arm and @awarzynski for helping drive this.
Reviewed By: awarzynski, dcaballe
Differential Revision: https://reviews.llvm.org/D154941
The dealloc operation deallocates each of the given memrefs if there is no alias
to that memref in the list of retained memrefs and the corresponding
condition value is set. This condition can be used to indicate and pass on
ownership of memref values (or in other words, the responsibility of
deallocating that memref). If two memrefs alias each other, only one will be
deallocated to avoid double free situations.
The memrefs to be deallocated must be the originally allocated memrefs,
however, the memrefs to be retained may be arbitrary memrefs.
Returns a list of conditions corresponding to the list of memrefs, which
indicates the new ownership: if a memref was deallocated, its
ownership is dropped (set to 'false'); otherwise, it is the same as the
input condition.
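A sketch per the description above (the assembly format is abbreviated):
```
// Deallocate %a0 and %a1 (guarded by %cond0 and %cond1) unless they
// alias a retained memref; results are the updated ownership bits.
%own:2 = bufferization.dealloc (%a0, %a1 : memref<2xf32>, memref<4xi32>)
             if (%cond0, %cond1)
             retain (%r0, %r1 : memref<2xf32>, memref<4xi32>)
```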
Differential Revision: https://reviews.llvm.org/D155467
Delete the backslash. It was there to make the TableGen file compile, but it looks like a space also works fine.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D155474
This revision fixes `hasTensorSemantics` and `hasBufferSemantics` for vector transfer ops, which may have a vector operand. `VectorType` implements `ShapedType` and such operands do not affect whether an op has tensor or buffer semantics. Also implement `DestinationStyleOpInterface` on `TransferReadOp` so that `hasTensorSemantics`/`hasBufferSemantics` can be called. (The op has no inits, but this makes it symmetric to `TransferWriteOp`.)
Differential Revision: https://reviews.llvm.org/D155469