This patch supports the R_RISCV_JAL relocation.
Moreover, it will fix the extractBits function's behavior as it extracts Size + 1 bits.
In the test ELF_jal.s:
Before:
```
Hi: 4294836480
extractBits(Hi, 12, 8): 480
```
After:
```
Hi: 4294836480
extractBits(Hi, 12, 8): 224
```
Reviewed By: StephenFan
Differential Revision: https://reviews.llvm.org/D117975
Add the passthru operand for
VMV_V_X_VL, VFMV_V_F_VL and SPLAT_VECTOR_SPLIT_I64_VL also.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D119688
Turns out the test was working by accident: we need to ensure
TSan instrumentation is not called from the fork() hook, otherwise the
tool will deadlock. Previously it worked because alloc_free_blocks() got
inlined into __tsan_test_only_on_fork(), but it cannot always be the
case.
Adding __attribute__((disable_sanitizer_instrumentation)) will prevent
TSan from instrumenting alloc_free_blocks().
Reviewed By: dvyukov
Differential Revision: https://reviews.llvm.org/D120050
In SPIR-V, resources are represented as global variables that
are bound to certain descriptor. SPIR-V requires those global
variables to be declared as aliased if multiple ones are bound
to the same slot. Such aliased decorations can cause issues
for transcompilers like SPIRV-Cross when converting to source
shading languages like MSL.
So this commit adds a pass to perform analysis of aliased
resources and see if we can unify them into one.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D119872
This patch added the MC layer support of Zfinx extension.
Authored-by: StephenFan
Co-Authored-by: Shao-Ce Sun
Reviewed By: asb
Differential Revision: https://reviews.llvm.org/D93298
After D118128 relaxed the heuristic to require only one EFLAGS generating operand, it now makes sense to avoid X86ISD::SMUL/UMULO duplication as well.
Differential Revision: https://reviews.llvm.org/D119578
The current ABD combine doesn't quite work for SVE because only a
single scalable vector per scalar integer type is legal (e.g. for
i32, <vscale x 4 x i32> is the only legal scalable vector type).
This patch extends the combine to also trigger for the cases when
operand extension must be retained.
Differential Revision: https://reviews.llvm.org/D115739
When doing SelectionDAG::ReplaceAllUsesOfValuesWith a worklist is
prepared containing all users that should be updated. Then we use
the RemoveNodeFromCSEMaps/AddModifiedNodeToCSEMaps helpers to handle
recursive CSE updates while doing the replacements.
This patch aims at solving a problem that could arise if the recursive
CSE updates would result in an SDNode present in the worklist is being
removed as a side-effect of morphing a prio user in the worklist.
To examplify such a scenario, imagine that we have these nodes in
the DAG
t12: i64 = add t8, t11
t13: i64 = add t12, t8
t14: i64 = add t11, t11
t15: i64 = add t14, t8
t16: i64 = sub t13, t15
and that the t8 uses should be replaced by t11. An initial worklist
(listing the users that should be morphed) could be [t12, t13, t15].
When updating t12 we get
t12: i64 = add t11, t11
which results in a CSE update that replaces t14 by t12, so we get
t15: i64 = add t12, t8
which results in a CSE update that replaces t13 by t12, so we get
t16: i64 = sub t12, t15
and then t13 is removed given that it was the last use of t13.
So when being done with the updates triggered by rewriting the use
of t8 in t12 the t13 node no longer exist. And we used to end up
hitting an assertion when continuing with the worklist aiming at
replacing the t8 uses in t13.
The solution is based on using a DAGUpdateListener, making sure that
we prune a user from the worklist if it is removed during the
recursive CSE updates.
The bug was found using an OOT target. I think the problem is quite
old, even if the particular intree target reproducer added in this
patch seem to pass when using LLVM 13.0.0.
Differential Revision: https://reviews.llvm.org/D119088
This would create a double free when a memref is passed twice to the
same op. This wasn't a problem at the time the pass was written but is
common since the introduction of scf.while.
There's a latent non-determinism that's triggered by the test, but this
change is messy enough as-is so I'll leave that for later.
Differential Revision: https://reviews.llvm.org/D120044
This also requires adjustment to code in AArch64ISelLowering so that
vector_extract is distributed over strict_fadd.
Differential Revision: https://reviews.llvm.org/D118489
Using an indexed version instead of a non-indexed version doesn't
change anything with regards to exceptions or rounding.
Differential Revision: https://reviews.llvm.org/D118487
These patterns don't change the fundamental instructions that are
used, just the variants that are used in order to remove some extra
MOVs.
Differential Revision: https://reviews.llvm.org/D118485
This consists of marking the various strict opcodes as legal, and
adjusting instruction selection patterns so that 'op' is 'any_op'.
FP16 and vector instructions additionally require some extra work in
lowering and legalization, so we can't set IsStrictFPEnabled just yet.
Also more work needs to be done for full strict fp support (marking
instructions that can raise exceptions as such, and modelling FPCR use
for controlling rounding).
Differential Revision: https://reviews.llvm.org/D114946
This patch adds support for the `-emit-llvm` option in the frontend
driver (i.e. `flang-new -fc1`). Similarly to Clang, `flang-new -fc1
-emit-llvm file.f` will generate a textual LLVM IR file.
Depends on D118985
Differential Revision: https://reviews.llvm.org/D119012
This code could be generalized to be type-independent, but for now
just ensure that the same type constraints are enforced with opaque
pointers as with typed pointers.
Our current strategy of computing ranges of SCEVUnknown Phis was to simply
compute the union of ranges of all its inputs. In order to avoid infinite recursion,
we mark Phis as pending and conservatively return full set for them. As result,
even simplest patterns of cycled phis always have a range of full set.
This patch makes this logic a bit smarter. We basically do the same, but instead
of taking inputs of single Phi we find its strongly connected component (SCC)
and compute the union of all inputs that come into this SCC from outside.
Processing entire SCC together has one more advantage: we can set range for all
of them at once, because the only thing that happens to them is the same value is
being passed between those Phis. So, despite we spend more time analyzing a
single Phi, overall we may save time by not processing other SCC members, so
amortized compile time spent should be approximately the same.
Differential Revision: https://reviews.llvm.org/D110620
Reviewed By: reames
Until now, overloads with a 64-bit atomic type argument were always
made available with `-fdeclare-opencl-builtins`. Ensure these
overloads are only available when both the `cl_khr_int64_base_atomics`
and `cl_khr_int64_extended_atomics` extensions have been enabled, as
required by the OpenCL specification.
Differential Revision: https://reviews.llvm.org/D119858
Current implementation of Check[HSDQ]Form predicates doesn’t handle virtual registers and therefore isn’t useful for pre-RA scheduling. Patch fixes this implementing two function predicates: CheckQForm for checking that instruction writes 128-bit NEON register and CheckFpOrNEON which checks that instruction writes FP register (any width). The latter supersedes Check[HSD]Form predicates which are not used individually.
OS Laboratory. Huawei Russian Research Institute. Saint-Petersburg
Reviewed By: dmgreen
Differential Revision: https://reviews.llvm.org/D114642
To make uses of the deprecated constructor easier to spot, and to
ensure that no new uses are introduced, rename it to
Address::deprecated().
While doing the rename, I've filled in element types in cases
where it was relatively obvious, but we're still left with 135
calls to the deprecated constructor.
The goal is support tail and mask policy in RVV builtins.
We focus on IR part first.
If the passthru operand is undef, we use tail agnostic, otherwise
use tail undisturbed.
Reviewed By: rogfer01
Differential Revision: https://reviews.llvm.org/D119686
This reverts commit 910a642c0a.
There are serious correctness issues with the current approach: __sync_*
routines which are not actually atomic should not be enabled by default.
I'll continue discussion on the review.
The MLIR parser allows regions to have an unnamed entry block.
Make this explicit in the language grammar.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D119950
This is some NFC (hopefully!) cleanup for performCommonVectorExtendCombine
and related methods, removing conditions that cannot occur and otherwise
cleaning up the code a little.