intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-22 15:41:35 +08:00

Author	SHA1	Message	Date
Shilei Tian	6bd74fd65f	Revert commits for kernel environment This reverts commits for kernel environments as they cause issues in AMD BB.	2023-07-23 23:32:31 -04:00
Shilei Tian	ae4292abe4	[MLIR][OpenMP] Remove local_unnamed_addr from check line	2023-07-23 18:51:02 -04:00
Shilei Tian	c7df940184	[MLIR][OpenMP] Fix wrong check lines in mlir/test/Target/LLVMIR/omptarget-region-device-llvm.mlir	2023-07-23 18:45:44 -04:00
Shilei Tian	c5c8040390	[OpenMP] Introduce kernel environment This patch introduces a per-kernel environment. Previously, flags such as execution mode were set through global variables with names like `__kernel_name_exec_mode`. They are accessible on the host by reading the corresponding global variable, but not from the device. Besides, some assumptions, such as no nested parallelism, are not made on a per-kernel basis, preventing us from applying per-kernel optimizations in the device runtime. This is a combination and refinement of patch series D116908, D116909, and D116910. Depend on D155886. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D142569	2023-07-23 18:36:01 -04:00
Andrzej Warzynski	5da317a79e	[mlir][docs][nfc] Fix markdown link	2023-07-23 23:05:47 +02:00
Daniil Dudkin	8be07adfb4	[mlir][LLVM] Introduce reduction intrinsics for minimum/maximum This patch adds support for the reduction intrinsics for floating point minimum and maximum that have been added to LLVM by https://reviews.llvm.org/D152370. Related to: #63969 Reviewed By: dcaballe Differential Revision: https://reviews.llvm.org/D155869	2023-07-22 16:25:32 +03:00
Fabian Mora	4538347fb2	[mlir][rocdl] Change the translation of `GridDimOp` to `__ockl_get_num_groups` Currently, `ROCDL::GridDimOp` is being translated to `__ockl_get_global_size`, however to match the meaning of `gpu.grid_dim` it should instead be translated to `__ockl_get_num_groups`. This change would also make it agree with the meaning of `gridDim.x` in HIP, see: https://github.com/ROCm-Developer-Tools/hipamd/blob/develop/include/hip/amd_detail/amd_hip_runtime.h#L257 Difference between the functions: ``` __ockl_get_global_size = blockDim * numBlocks __ockl_get_num_groups = numBlocks ``` Reviewed By: krzysz00 Differential Revision: https://reviews.llvm.org/D156009	2023-07-22 11:33:11 +00:00
Matthias Springer	0318123f67	[mlir][Transforms] GreedyPatternRewriteDriver: Fix `changed` parameter `changed` was not updated correctly when it was already set to "true" before calling `applyPatternsAndFoldGreedily`. Differential Revision: https://reviews.llvm.org/D155934	2023-07-22 08:31:24 +02:00
Matthias Springer	c2d5d348a8	[mlir][transform] Add `transform.apply_dce` op Add a transform that eliminates dead operations. This is useful after certain transforms (such as fusion) that create/clone new IR but leave the original IR in place. Differential Revision: https://reviews.llvm.org/D155954	2023-07-22 08:25:02 +02:00
Mehdi Amini	5f1a388a11	Fix crash in ODS backend for Type/Attr when an incorrect construct is used for Type/Attr Instead of crashing, try to print a useful error message.	2023-07-21 22:06:02 -07:00
TatWai Chong	a5f0b237be	[mlir][tosa][fix] Add proper type checking trait for tosa mul When operating on integer-type tensors, tosa elementwise multiplication requires the element type of the result to be a 32-bit integer rather than the same type as the inputs. Change-Id: Ifd3d7ebd879be5c6b2c8e23aa6d7ef41f39c6d41 Reviewed By: mgehre-amd Differential Revision: https://reviews.llvm.org/D154988	2023-07-21 23:29:05 +00:00
Jakub Kuderski	eaa4bc6557	[mlir][arith] Add canon pattern for chained `arith.muli` @benvanik reported this as missing. Reviewed By: Mogball Differential Revision: https://reviews.llvm.org/D155907	2023-07-21 18:20:31 -04:00
Rafael Ubal Tena	b2d76a063d	TOSA-to-Linalg lowering for element-wise ops - Wrote complete documentation for the `Broadcastable` op trait. This is mostly meant as a thorough description of its previous behavior, with the exception of minor feature updates. - Restricted legality criteria for a `Broadcastable` op in order to simplify current and future lowering passes and increase efficiency of code generated by those passes. New restrictions are: 1) A dynamic dimension in an inferred result is not compatible with a static dimension in the actual result. 2) Broadcast semantics are restricted to input operands and not supported between inferred and actual result shapes. - Implemented TOSA-to-Linalg lowering support for unary, binary, and ternary element-wise ops. This support is complete for all legal cases described in the `Broadcastable` trait documentation. - Added unit tests for `tosa.abs`, `tosa.add`, and `tosa.select` as examples of unary, binary, and ternary ops. Reviewed By: eric-k256 Differential Revision: https://reviews.llvm.org/D153291	2023-07-21 22:08:33 +00:00
Jacques Pienaar	863e8123df	[mlir] Move attr -> properties to not require Operation This allows for converting before/without an Operation is created. Differential Revision: https://reviews.llvm.org/D155996	2023-07-21 14:54:43 -07:00
Srishti Srivastava	de826ea35d	[MLIR][ANALYSIS] Add liveness analysis utility This commit adds a utility to implement liveness analysis using the sparse backward data-flow analysis framework. Theoretically, liveness analysis assigns liveness to each (value, program point) pair in the program and it is thus a dense analysis. However, since values are immutable in MLIR, a sparse analysis, which will assign liveness to each value in the program, suffices here. Liveness analysis has many applications. It can be used to avoid the computation of extraneous operations that have no effect on the memory or the final output of a program. It can also be used to optimize register allocation. Both of these applications help achieve one very important goal: reducing runtime. A value is considered "live" iff it: (1) has memory effects OR (2) is returned by a public function OR (3) is used to compute a value of type (1) or (2). It is also to be noted that a value could be of multiple types (1/2/3) at the same time. A value "has memory effects" iff it: (1.a) is an operand of an op with memory effects OR (1.b) is a non-forwarded branch operand and a block where its op could take the control has an op with memory effects. A value `A` is said to be "used to compute" value `B` iff `B` cannot be computed in the absence of `A`. Thus, in this implementation, we say that value `A` is used to compute value `B` iff: (3.a) `B` is a result of an op with operand `A` OR (3.b) `A` is used to compute some value `C` and `C` is used to compute `B`. --- It is important to note that there already exists an MLIR liveness utility here: llvm-project/mlir/include/mlir/Analysis/Liveness.h. So, what is the need for this new liveness analysis utility being added by this commit? That need is explained as follows: The similarity between these two utilities is that both use the fixpoint iteration method to converge to the final result of liveness. 
And, both have the same theoretical understanding of liveness as well. However, the main difference between (a) the existing utility and (b) the added utility is the "scope of the analysis". (a) is restricted to analysing each block independently while (b) analyses blocks together, i.e., it looks at how the control flows from one block to the other, how a caller calls a callee, etc. The restriction in the former implies that some potentially non-live values could be marked live and thus the full potential of liveness analysis will not be realised. This can be understood using the example below: ``` 1 func.func private @private_dead_return_value_removal_0() -> (i32, i32) { 2 %0 = arith.constant 0 : i32 3 %1 = arith.addi %0, %0 : i32 4 return %0, %1 : i32, i32 5 } 6 func.func @public_dead_return_value_removal_0() -> (i32) { 7 %0:2 = func.call @private_dead_return_value_removal_0() : () -> (i32, i32) 8 return %0#0 : i32 9 } ``` Here, if we just restrict our analysis to a per-block basis like (a), we will say that the %1 on line 3 is live because it is computed and then returned outside its block by the function. But, if we perform a backward data-flow analysis like (b) does, we will say that %0#1 of line 7 is not live because it isn't returned by the public function and thus, %1 of line 3 is also not live. So, while (a) will be unable to suggest any IR optimizations, (b) can enable this IR to convert to:- ``` 1 func.func private @private_dead_return_value_removal_0() -> i32 { 2 %0 = arith.constant 0 : i32 3 return %0 : i32 4 } 5 func.func @public_dead_return_value_removal_0() -> i32 { 6 %0 = call @private_dead_return_value_removal_0() : () -> i32 7 return %0 : i32 8 } ``` One operation was removed and one unnecessary return value of the function was removed and the function signature was modified. This is an optimization that (b) can enable but (a) cannot. Such optimizations can help remove a lot of extraneous computations that are currently being done. 
Signed-off-by: Srishti Srivastava <srishtisrivastava.ai@gmail.com> Reviewed By: matthiaskramm, jcai19 Differential Revision: https://reviews.llvm.org/D153779	2023-07-21 13:29:14 -07:00
Uday Bondhugula	b36de52c98	NFC. Move remaining affine/memref test cases into respective dialect dirs Move a bunch of lingering test cases from test/Transforms/ into test/Dialect/Affine and MemRef. Differential Revision: https://reviews.llvm.org/D155855	2023-07-21 22:36:01 +05:30
Lorenzo Chelini	4fb25ca51c	[MLIR][Linalg] Preserve DPS when decomposing Softmax Preserve destination passing style (DPS) when decomposing `linalg.Softmax`; instead of creating a new empty, which may materialize as a new buffer after bufferization, use the result directly. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D155942	2023-07-21 18:03:26 +02:00
Guray Ozen	4622113820	[mlir][nvgpu] Set useDefaultAttributePrinterParser Differential Revision: https://reviews.llvm.org/D155959	2023-07-21 17:00:39 +02:00
Matthias Springer	440808faf6	[mlir][linalg] MapCopyToThreadsOp: Support tensor.pad Also return the generated loop op. Differential Revision: https://reviews.llvm.org/D155950	2023-07-21 15:51:46 +02:00
Matthias Springer	a5bba98a58	[mlir][linalg] BufferizeToAllocationOp: Add option to materialize buffers for operands Add an option that does not bufferize the targeted op itself, but just materializes a buffer for the destination operands. This is useful for partial bufferization of complex ops such as `scf.forall`, which need special handling (and an analysis of the region). Differential Revision: https://reviews.llvm.org/D155946	2023-07-21 15:29:59 +02:00
Matthias Springer	20245ed4de	[mlir][transform] Add `apply_cse` option to `transform.apply_patterns` op Applying the canonicalizer and CSE in an interleaved fashion is useful after bufferization (and maybe other transforms) to fold away self copies. Differential Revision: https://reviews.llvm.org/D155933	2023-07-21 15:13:56 +02:00
Jie Fu	3fd1790638	[mlir][nvgpu] Ignore -Wunused-function in NVGPUDialect.cpp (NFC) In file included from /Users/jiefu/llvm-project/mlir/lib/Dialect/NVGPU/IR/NVGPUDialect.cpp:363: /Users/jiefu/llvm-project/build-Release/tools/mlir/include/mlir/Dialect/NVGPU/IR/NVGPUAttrDefs.cpp.inc:22:36: error: unused function 'generatedAttributeParser' [-Werror,-Wunused-function] static ::mlir::OptionalParseResult generatedAttributeParser(::mlir::AsmParser &parser, ::llvm::StringRef *mnemonic, ::mlir::Type type, ::mlir::Attribute &value) { ^ /Users/jiefu/llvm-project/build-Release/tools/mlir/include/mlir/Dialect/NVGPU/IR/NVGPUAttrDefs.cpp.inc:46:30: error: unused function 'generatedAttributePrinter' [-Werror,-Wunused-function] static ::mlir::LogicalResult generatedAttributePrinter(::mlir::Attribute def, ::mlir::AsmPrinter &printer) { ^ 2 errors generated.	2023-07-21 20:50:48 +08:00
Matthias Springer	544f0e9161	[mlir] Fix build after D155680	2023-07-21 13:33:54 +02:00
Matthias Springer	ba745eea40	[mlir][bufferization] Remove cleanup pipeline from bufferization pass To keep the pass simple, users should apply cleanup passes manually when necessary. In particular, `-cse -canonicalize` are often desirable to fold away self-copies that are created by the bufferization. This addresses a comment in D120191. Differential Revision: https://reviews.llvm.org/D155923	2023-07-21 12:11:25 +02:00
Ingo Müller	8fd207fd0d	[mlir][transform][structured][python] Allow str arg in match_op_names. Allow the `names` argument in `MatchOp.match_op_names` to be of type `str` in addition to `Sequence[str]`. In this case, the argument is treated as a list with one name, i.e., it is possible to write `MatchOp.match_op_names(..., "test.dummy")` instead of `MatchOp.match_op_names(..., ["test.dummy"])`. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D155807	2023-07-21 09:36:55 +00:00
Ingo Müller	522831384f	[mlir][linalg][transform] Extend diagnostics of FuseIntoContainingOp. This patch extends the diagnostic output of `FuseIntoContainingOp` when it fails to find the next producer by also providing the location of the affected transform op. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D155803	2023-07-21 09:34:04 +00:00
Guray Ozen	e56d6745f7	[mlir][nvgpu] Add `tma.create.descriptor` to create tensor map descriptor The Op creates a tensor map descriptor object representing a tiled memory region. The descriptor is used by Tensor Memory Access (TMA). The `tensor` is the source tensor to be tiled. The `boxDimensions` is the size of the tiled memory region in each dimension. The pattern here lowers `tma.create.descriptor` to a runtime function call that eventually calls CUDA Driver's `cuTensorMapEncodeTiled`. For more information see below: https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TENSOR__MEMORY.html Depends on D155453 Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D155680	2023-07-21 11:33:04 +02:00
Andrzej Warzynski	5c9db62587	[mlir][test] Add missing LIT config for `mlir-cpu-config` + emulator Similarly to when using `lli`, make sure that when using `mlir-cpu-runner` with an emulator, a full path to `mlir-cpu-runner` is used. Otherwise `mlir-cpu-runner` won't be found and you will get the following error: ``` Error while loading mlir-cpu-runner: No such file or directory ``` This patch should fix: * https://lab.llvm.org/buildbot/#/builders/179 The breakage was originally introduced in https://reviews.llvm.org/D155405. Differential Revision: https://reviews.llvm.org/D155920	2023-07-21 09:20:03 +00:00
Alex Zinenko	8dbddb1718	[mlir] allow region branch spec from parent op to itself RegionBranchOpInterface did not allow the operation with regions to specify itself as a successor. Therefore, this implied that the control is always transferred to a region before being transferred back to the parent op. Since the region can only transfer the control back to the parent op from a terminator, this transitively implied that the first block of any region with a RegionBranchOpInterface is always executed until the terminator can transfer the control flow back. This is trivially false for any conditional-like operation that may or may not execute the region, as well as for loop-like operations that may not execute the body. Remove the restriction from the interface description and update the only transform that relied on it. See https://discourse.llvm.org/t/rfc-region-control-flow-interfaces-should-encode-region-not-executed-correctly/72103. Depends On: https://reviews.llvm.org/D155757 Reviewed By: Mogball, springerm Differential Revision: https://reviews.llvm.org/D155822	2023-07-21 09:16:56 +00:00
Alex Zinenko	5d8813dec6	[mlir] allow dense dataflow to customize call and region operations Initial implementations of dense dataflow analyses feature special cases for operations that have region- or call-based control flow by leveraging the corresponding interfaces. This is not necessarily sufficient as these operations may influence the dataflow state by themselves as well as through the control flow. For example, `linalg.generic` and similar operations have region-based control flow and their own memory effects, so any memory-related analyses such as last-writer require processing `linalg.generic` directly instead of, or in addition to, the region-based flow. Provide hooks to customize the processing of operations with region- and call-based control flow in forward and backward dense dataflow analysis. These hooks are triggered when control flow is transferred between the "main" operation, i.e. the call or the region owner, and another region. Such an approach allows the analyses to update the lattice before and/or after the regions. In the `linalg.generic` example, the reads from memory are interpreted as happening before the body region and the writes to memory are interpreted as happening after the body region. Using these hooks in generic analysis may require introducing additional interfaces, but for now assume that the specific analyses have special cases for the (rare) operations with call- and region-based control flow that need additional processing. Reviewed By: Mogball, phisiart Differential Revision: https://reviews.llvm.org/D155757	2023-07-21 09:16:03 +00:00
Guray Ozen	9dad32cb90	[mlir][nvgpu] Improve finding module Op for `mbarrier.create` The current transformation expects the module op to be two levels higher; however, that is not always the case. This work searches for the module op in a while loop. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D155825	2023-07-21 10:36:45 +02:00
Guray Ozen	70c2e0618a	[mlir][nvgpu] Add nvgpu.tma.async.load and nvgpu.tma.descriptor This work adds the `nvgpu.tma.async.load` Op that requests a TMA load asynchronously using an mbarrier object. It also creates the nvgpu.tma.descriptor type. The type is supposed to be created by the `cuTensorMapEncodeTiled` CUDA driver API. Reviewed By: nicolasvasilache Differential Revision: https://reviews.llvm.org/D155453	2023-07-21 10:23:25 +02:00
Alex Zinenko	2469cdd156	[mlir] remove RegionBranchOpInterface from linalg ops Linalg structure ops do not implement control flow in the way expected by RegionBranchOpInterface, and the interface implementation isn't actually used anywhere. The presence of this interface without correct implementation is confusing for, e.g., dataflow analyses. Reviewed By: springerm Differential Revision: https://reviews.llvm.org/D155841	2023-07-21 08:18:41 +00:00
Markus Böck	f117bbca04	[mlir] Add opt-in default property bytecode read and write implementation Using properties currently requires at the very least implementing four methods/code snippets: * `convertToAttribute` * `convertFromAttribute` * `writeToMlirBytecode` * `readFromMlirBytecode` This makes replacing attributes with properties harder than it has to be: Attributes by default do not require immediately defining custom bytecode encoding. This patch therefore adds opt-in implementations of `writeToMlirBytecode` and `readFromMlirBytecode` that work with the default implementations of `convertToAttribute` and `convertFromAttribute`. They are provided by `defvar`s in `OpBase.td` and can be used by adding: ``` let writeToMlirBytecode = writeMlirBytecodeWithConvertToAttribute; let readFromMlirBytecode = readMlirBytecodeUsingConvertFromAttribute; ``` to one's TableGen definition. While this bytecode encoding is almost certainly not ideal for a given property, it allows more incremental use of properties and getting something sane working before optimizing the bytecode format. Differential Revision: https://reviews.llvm.org/D155286	2023-07-21 08:03:26 +02:00
wren romano	889f4bf264	[mlir][sparse] Improve `DimLvlMapParser`'s handling of variable bindings This commit comprises a number of related changes: (1) Reintroduces the semantic distinction between `parseVarUsage` vs `parseVarBinding`, adds documentation explaining the distinction, and adds commentary to the one place that violates the desired/intended semantics. (2) Improves documentation/commentary about the forward-declaration of level-vars, and about the meaning of the `bool` parameter to `parseLvlSpec`. (3) Removes the `VarEnv::addVars` method, and instead has `DimLvlMapParser` handle the conversion issues directly. In particular, the parser now stores and maintains the `{dims,lvls}AndSymbols` arrays, thereby avoiding the O(n^2) behavior of scanning through the entire `VarEnv` for each `parse{Dim,Lvl}Spec` call. Unfortunately there still remains another source of O(n^2) behavior, namely: the `AsmParser::parseAffineExpr` method will copy the `DimLvlMapParser::{dims,lvls}AndSymbols` arrays into `AffineParser::dimsAndSymbols` on each `parse{Dim,Lvl}Spec` call; but fixing that would require extensive changes to `AffineParser` itself. Depends On D155532 Reviewed By: Peiming Differential Revision: https://reviews.llvm.org/D155533	2023-07-20 15:56:03 -07:00
varconst	81b4e7d2b0	[mlir][spirv] Extract more ops from the main implementation file. NFC. Continue the work outlined in D155747 and split the main SPIR-V ops implementation file into a few smaller and quicker to compile files. Move control flow and memory ops to their own implementation files. Create new `.cpp` files for tablegened code. After this change, `SPIRVOps.cpp` is 2k LoC long and takes a reasonable amount of time to compile. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D155883	2023-07-20 17:11:32 -04:00
Amanda Tang	057fc8e7d8	[ODS] Use Adaptor Trait for Shaped Type Inference Author inferReturnTypeComponents methods with the Op Adaptor by using the InferShapedTypeOpAdaptor. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D155243	2023-07-20 19:41:08 +00:00
Andrzej Warzynski	64466b777b	[mlir][test] Add emulator to the mlir-cpu-runner invocation In https://reviews.llvm.org/D146917, MLIR's LIT configuration was updated to allow us to use `mlir-cpu-runner` to run Arm SVE integration tests. That update broke the following buildbot that doesn't support SVE: https://lab.llvm.org/buildbot/#/builders/179/builds/6704 While that bot doesn't support SVE, it can run SVE tests under emulation. This patch makes sure that whenever an Arm emulator is set (via the `ARM_EMULATOR_EXECUTABLE` CMake variable), it is used to run both `lli` _and_ `mlir-cpu-runner`. I am sending this without a review as it's a rather trivial change and I want to quickly fix the spurious bot failure.	2023-07-20 19:08:23 +00:00
Giuseppe Rossini	4b3eaee270	[mlir][AMDGPU] Define wrappers for WMMA matrix ops Wave Matrix Multiply Accumulate (WMMA) is the instruction to accelerate matrix multiplication on RDNA3 architectures. LLVM already provides a set of intrinsics to generate wmma instructions. This change uses those intrinsics to enable the feature in MLIR. Reviewed By: krzysz00 Differential Revision: https://reviews.llvm.org/D152451	2023-07-20 18:38:35 +00:00
Jakub Kuderski	ab6827f2d4	[mlir][spirv] Extract Atomic/Cast/Group op implementation. NFC. Continue the work outlined in D155747 and split the main SPIR-V ops implementation file into a few smaller and quicker to compile files. This organization matches the op definition organization in `.td` files. In this patch, extract atomic, cast/conversion, and group op implementation into separate files. Reviewed By: antiagainst Differential Revision: https://reviews.llvm.org/D155777	2023-07-20 11:15:30 -04:00
Ingo Müller	4f30746ca0	[mlir][transform][python] Add extended ApplyPatternsOp. This patch adds a mixin for ApplyPatternsOp to _transform_ops_ext.py with syntactic sugar for constructing such ops. Curiously, the op did not have any constructors yet, probably because its tablegen definition said to skip the default builders. The new constructor is thus quite straightforward. The commit also adds a refined `region` property which returns the first block of the single region. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D155435	2023-07-20 14:20:50 +00:00
Ingo Müller	f62cb13fb2	[mlir][linalg][transform] Rename ApplyPatternsOp.{region => patterns}. This gives the region a more meaningful name. The topic came up in a discussion on https://reviews.llvm.org/D155435, where the name `region` would have led to a situation where a convenience accessor called `region` (after the ODS name) would have returned a Block. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D155810	2023-07-20 14:20:45 +00:00
Ingo Müller	5f4f9220f9	[mlir][transform][gpu][python] Add MapForallToBlocks mix-in. This patch adds a mix-in class for MapForallToBlocks with overloaded constructors. This makes it optional to provide the return type of the op, which defaults to `AnyOpType`. Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D155717	2023-07-20 14:20:40 +00:00
Sergio Afonso	40340cf91a	[MLIR][OpenMP][OMPIRBuilder] Use target triple to initialize `IsGPU` flag This patch modifies the construction of the `OpenMPIRBuilder` in MLIR to initialize the `IsGPU` flag using target triple information passed down from the Flang frontend. If not present, it will default to `false`. This replicates the behavior currently implemented in Clang, where the `CodeGenModule::createOpenMPRuntime()` method creates a different `CGOpenMPRuntime` instance depending on the target triple, which in turn has an effect on the `IsGPU` flag of the `OpenMPIRBuilderConfig` object. Differential Revision: https://reviews.llvm.org/D151903	2023-07-20 15:07:50 +01:00
Markus Böck	f9173c2958	[mlir][LLVM] Convert `noalias` parameters into alias scopes during inlining Currently, inlining a function with a `noalias` parameter leads to a large loss of optimization potential as the `noalias` parameter, an important hint for alias analysis, is lost completely. This patch fixes this with the same approach as LLVM by annotating all users of the `noalias` parameter with appropriate alias and noalias scope lists. The implementation done here is not as sophisticated as LLVM's, which has more infrastructure related to escaping and captured pointers, but should work in the majority of important cases. Any deficiency can be addressed in future patches. Related LLVM code: `27ade4b554/llvm/lib/Transforms/Utils/InlineFunction.cpp (L1090)` Differential Revision: https://reviews.llvm.org/D155712	2023-07-20 15:05:28 +02:00
Guray Ozen	836dbb8522	[mlir][nvgpu] Add `mbarrier.arrive.expect_tx` and `mbarrier.try_wait.parity` This work adds two Ops: `mbarrier.arrive.expect_tx` performs expect_tx `mbarrier.barrier` returns `mbarrier.barrier.token` `mbarrier.try_wait.parity` waits on `mbarrier.barrier` and `mbarrier.barrier.token` `mbarrier.arrive.expect_tx` is one of the requirements to enable H100 TMA support. Depends on D154074 D154076 D154059 D154060 Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D154094	2023-07-20 13:48:30 +02:00
Tobias Gysi	10fa27704b	[mlir][llvm] Add branch weight op interface This revision adds a branch weight op interface for the call / branch operations that support branch weights. It can be used in the LLVM IR import and export to simplify the branch weight conversion. An additional mapping between call operations and instructions ensures the actual conversion can be done in the module translation itself, rather than in the dialect translation interface. It also has the benefit that downstream users can amend custom metadata to the call operation during the export to LLVM IR. Reviewed By: zero9178, definelicht Differential Revision: https://reviews.llvm.org/D155702	2023-07-20 10:46:04 +00:00
Ivan Butygin	9dec3fd812	[mlir] Add `ub` dialect and `poison` op. Add new dialect boilerplate and `poison` op definition. Discussion: https://discourse.llvm.org/t/rfc-poison-semantics-for-mlir/66245/24 Differential Revision: https://reviews.llvm.org/D154248	2023-07-20 11:19:43 +02:00
Markus Böck	b82acf8a14	[mlir][LLVM] Handle access groups during inlining Handling access groups is luckily rather trivial: Any access groups from the call instruction are simply appended to any memory operations. This is similar to one of the steps when handling alias scopes. This patch nevertheless implements it as a separate function purely for readability purposes as it uses a different interface than alias scopes. Differential Revision: https://reviews.llvm.org/D155795	2023-07-20 10:45:15 +02:00
Matthias Springer	2137915137	[mlir] Remove some code duplication between `Builders.cpp` and `FoldUtils.cpp` Also update the documentation of `Operation::fold`, which did not take into account in-place foldings. Differential Revision: https://reviews.llvm.org/D155691	2023-07-20 10:27:14 +02:00
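
The liveness rules quoted in commit de826ea35d above (a value is live if it is returned by a public function, has memory effects, or is used to compute a live value) can be illustrated with a small worklist sketch. This is not the MLIR implementation: the function `live_values`, the dictionary encoding of the IR, and the `priv.`/`pub.` value names are all hypothetical and exist only to mirror the two-function example in that commit message.

```python
def live_values(operands_of, seeds, callee_return_of):
    """Backward liveness over a toy, hypothetical IR encoding.

    operands_of:      value -> operands of the op that defines it
    seeds:            values live by definition (returned by a public
                      function, or having memory effects)
    callee_return_of: call result -> value the callee returns in that slot
    """
    live = set()
    worklist = list(seeds)
    while worklist:
        value = worklist.pop()
        if value in live:
            continue
        live.add(value)
        # Rule (3.a): operands of the defining op are used to compute `value`.
        worklist.extend(operands_of.get(value, []))
        # Interprocedural step: a live call result makes the matching
        # return operand in the callee live as well.
        if value in callee_return_of:
            worklist.append(callee_return_of[value])
    return live


# Toy encoding of the example from the commit message.
operands_of = {
    "priv.%0": [],                      # arith.constant
    "priv.%1": ["priv.%0", "priv.%0"],  # arith.addi %0, %0
    "pub.%0#0": [],                     # first result of func.call
    "pub.%0#1": [],                     # second result of func.call
}
callee_return_of = {"pub.%0#0": "priv.%0", "pub.%0#1": "priv.%1"}
seeds = ["pub.%0#0"]                    # only %0#0 is returned publicly

live = live_values(operands_of, seeds, callee_return_of)
print(sorted(live))
```

In this toy run the `arith.addi` result `priv.%1` never enters the live set, which matches the commit's observation that the second return value and the addition can be removed.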

16466 Commits