Factor out some stack allocation in a separate function. This patch
splits out the generic portion of a larger refactoring done as a part of
stack clash protection support.
The patch is almost, but not quite NFC. The only difference should
be that where we have adjacent allocation of stack space
for local SVE objects and non-local SVE objects the order
of `sub sp, ...` and `addvl sp, ...` instructions is reversed, because now
it's done with a single call to `emitFrameOffset` and it happens
add/subtract the fixed part before the scalable part, e.g.
addvl sp, sp, #-2
sub sp, sp, #16, lsl #12
sub sp, sp, #16
becomes
sub sp, sp, #16, lsl #12
sub sp, sp, #16
addvl sp, sp, #-2
Patch 3/3 of the transition step 1 described in
https://discourse.llvm.org/t/rfc-enabling-the-hlfir-lowering-by-default/72778/7
This patch changes bbc and flang-new driver to use HLFIR lowering by
default.
`-hlfir=false` can be used with bbc and `-flang-deprecated-no-hlfir`
with flang-new to get the previous default lowering behavior, but these
options will only be available for a limited period of time.
If any user needs these options to workaround bugs, they should open an
issue against flang in llvm github repo so that the regression can be
fixed in HLFIR.
This reverts commit 9a9933fae2.
The OCaml bindings were using `LLVMDIBuilderCreateStaticMemberType`,
causing the API change in `9a9933fae23249fbf6cf5b3c090e630f578b7f98`
to break buildbots that built the bindings. Revert until we figure out
whether to fixup the bindings or just not change the C-API
Due to a GCC bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=25940),
GCC doesn't consider extern "C" functions with the same name but
different namespace to be the same. As such, the default visibility
attribute (on a declaration outside the namespace) doesn't get applied
to the definition in the namespace and the symbol is not exported.
This came up as an ABI diff when switching between gcc and clang for
compiling compiler-rt.
FileCheck currently compiles a regular expression of the form
`Prefix1|Prefix2|...` and uses it to find the next prefix in the input.
If we had a fast regex implementation, this would be a useful thing to
do, as the regex implementation would be able to match multiple prefixes
more efficiently than a naive approach. However, with our actual regex
implementation, finding the prefixes basically becomes O(InputLen *
RegexLen * LargeConstantFactor), which is a lot worse than a simple
string search.
Replace the regex with StringRef::find(), and keeping track of the next
position of each prefix. There are various ways this could be improved
on, but it's already significantly faster that the previous approach.
For me, this improves check-llvm time from 138.5s to 132.5s, so by
around 4-5%.
For vector-interleaved-load-i16-stride-7.ll in particular, test time
drops from 5s to 2.5s.
This patch adds the LLVM-side infrastructure to implement DWARFv5 issue
161118.1: "DW_TAG for C++ static data members".
The clang-side of this patch will simply construct the DIDerivedType
with a different DW_TAG.
EvaluateAsConstantExpr() uses ::EvaluateInPlace() directly, which does
not use the new interpreter if requested. Do it here, which is the same
pattern we use in EvaluateAsInitializer.
https://github.com/llvm/llvm-project/pull/70958 adds registers R16-R31
(EGPR), this patch
1. Supports decoding of EGPR for instruction w/ REX2 prefix
2. Supports decoding of EGPR for instruction w/ EVEX prefix
For simplicity's sake, we
1. Simulate the REX prefix w/ the 1st payload of REX2
2. Simulate the REX2 prefix w/ the 2nd and 3rd payloads of EVEX
RFC:
https://discourse.llvm.org/t/rfc-design-for-apx-feature-egpr-and-ndd-support/73031/4
Explanations for some changes:
1. invalid-EVEX-R2.txt is deleted b/c `0x62 0xe1 0xff 0x08 0x79 0xc0` is
valid and decoded to `vcvtsd2usi %xmm0, %r16` now.
2. One line in x86-64-err.txt is removed b/c APX relaxes the limitation
of the 1st and 2nd payloads of EVEX prefix, so the error message changes
The first commit extends the capacity from the compiler infrastructure,
and the second commit continues the effort in #71140 to introduce tuple
types for bfloat16.
memref.atomic_rmw will fail to convert for memref types that have an offset because they do not have identity maps. This restriction is overly conservative, so this changes the restriction to only strided memref types.
suggested in review of https://github.com/llvm/llvm-project/pull/69553
This is actually not an NFC as isFunction() does not return false for
some "invalid" object, instead it returns the errors to its caller. But
since there is no such invalid object in the LIT tests, so no case
changes.
The R_LARCH_{ADD,SUB}6 relocation type are usually used by DwarfCFA to
calculate a tiny offset. They appear after binutils 2.41, with GAS
enabling relaxation by default.
This patch adds support for saving minidumps with the arm64
architecture. It also will cause unsupported architectures to emit an
error where before this patch it would emit a minidump with partial
information. This new code is tested by the arm64 windows buildbot that
was failing:
https://lab.llvm.org/buildbot/#/builders/219/builds/6868
This is needed following this PR:
https://github.com/llvm/llvm-project/pull/71772
This PR separates parts of the symbolizer markup
implementation that are Fuchsia OS specific. This
is in preparation of enabling symbolizer markup
in other OSs.
The tryMergeAdjacentSTG function tries to merge multiple
stg/st2g/stg_loop instructions. It doesn't verify the liveness of NZCV
flag before moving around STGloop which also alters NZCV flags. This was
not issue before the patch 5e612bc as these stack tag stores does not
alter the NZCV flags. But after the change, this merge function leads to
miscompilation because of control flow change in instructions. Added the
check to to see if the first instruction after insert point reads or
writes to NZCV flag and it's liveout state. This check happens after the
filling of merge list just before merge and bails out if necessary.
Using an op with a region cause some issue with unstructured code. This
patch make use of acc.declare_enter and acc.declare_exit to represent
the implicit declare region.
Adding this section ensures that the documentation generated by
mlir-tblgen for all the FIR operations are at the correct depth.
FIROperations are at depth 3, the new section is at depth 2. This fixes
the "Non-consecutive header level increase; H1 to H3" warning in
FIRLangRef.md.
The `ExprMutationAnalyzer`s matcher of `binaryOperator`s
that contained the variable expr, were previously narrowing the
variable to be type dependent, when the `binaryOperator` should
have been narrowed as dependent.
The variable we are trying to find mutations for does
not need to be the dependent type, the other operand of
the `binaryOperator` could be dependent.
Fixes#57297