Refactor TerminalState to make the code simpler. Move 'struct termios'
to a PImpl-style subclass. Add an RAII interface to automatically store
and restore the state.
Differential revision: https://reviews.llvm.org/D110721
References to fields inside anon structs contain an implicit children
for the container, which has the same SourceLocation with the field.
This was resulting in SelectionTree always picking the anon-struct rather than
the field as the selection.
This patch prevents that by claiming the range for the field early.
https://github.com/clangd/clangd/issues/877.
Differential Revision: https://reviews.llvm.org/D110825
This patch additional tests with i64 GEP indices for 32 bit pointers.
@mustalias_overflow_in_32_bit_add_mul_gep highlights a case where
BasicAA currently incorrectly determines noalias.
Modeled in Alive2 for 32 bit pointers: https://alive2.llvm.org/ce/z/HHjQgb
Modeled in Alive2 for 64 bit pointers: https://alive2.llvm.org/ce/z/DoWK2c
D104809 changed `buildTree_rec` to check for extract element instructions
with scalable types. However, if the extract is extended or truncated,
these changes do not apply and we assert later on in isShuffle(), which
attempts to cast the type of the extract to FixedVectorType.
Reviewed By: ABataev
Differential Revision: https://reviews.llvm.org/D110640
When replacing function calls, skip call instructions where the old
function is not the called function, but e.g. the old function is passed
as an argument.
This fixes a crash due to trying to construct invalid IR for the test
case.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D109759
This reverts commit 9f641c96cb.
The "python" command in gdb uses the python gdb is linked to,
not what "python" would give you if you used it directly in the shell.
This patch adds further support for vectorisation of loops that involve
selecting an integer value based on a previous comparison. Consider the
following C++ loop:
int r = a;
for (int i = 0; i < n; i++) {
if (src[i] > 3) {
r = b;
}
src[i] += 2;
}
We should be able to vectorise this loop because all we are doing is
selecting between two states - 'a' and 'b' - both of which are loop
invariant. This just involves building a vector of values that contain
either 'a' or 'b', where the final reduced value will be 'b' if any lane
contains 'b'.
The IR generated by clang typically looks like this:
%phi = phi i32 [ %a, %entry ], [ %phi.update, %for.body ]
...
%pred = icmp ugt i32 %val, i32 3
%phi.update = select i1 %pred, i32 %b, i32 %phi
We already detect min/max patterns, which also involve a select + cmp.
However, with the min/max patterns we are selecting loaded values (and
hence loop variant) in the loop. In addition we only support certain
cmp predicates. This patch adds a new pattern matching function
(isSelectCmpPattern) and new RecurKind enums - SelectICmp & SelectFCmp.
We only support selecting values that are integer and loop invariant,
however we can support any kind of compare - integer or float.
Tests have been added here:
Transforms/LoopVectorize/AArch64/sve-select-cmp.ll
Transforms/LoopVectorize/select-cmp-predicated.ll
Transforms/LoopVectorize/select-cmp.ll
Differential Revision: https://reviews.llvm.org/D108136
Some vectors require both widening and promotion for their legalization.
This case is not yet handled in getCopyToPartsVector and falls back
on scalarizing by default. BBecause scalable vectors can't easily be
scalarised, we need to implement this in two separate stages:
1. Widen the vector.
2. Promote the vector.
As part of this patch, PromoteIntRes_CONCAT_VECTORS also needed to be
made scalable aware. Instead of falling back on scalarizing the vector
(fixed-width only), each sub-part of the CONCAT vector is promoted,
and the operation is performed on the type with the widest element type,
finally truncating the result to the promoted result type.
Differential Revision: https://reviews.llvm.org/D110646
Move the big builder out of the td file to the cpp file.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: kiranchandramohan
Differential Revision: https://reviews.llvm.org/D110820
Revert "[flang][NFC] Add debug dump method to evaluate::Expr and semantics::Symbol"
This reverts commit b0e35fde21.
Revert "[flang] Add a wrapper for Fortran main program"
This reverts commit 2c1ce0755e.
Revert "[flang][NFC] Fix header comments in some runtime headers"
This reverts commit a63f57674d.
Follow up of https://reviews.llvm.org/D83397.
In folding, make pgmath usage conditional to C99 complex
support in C++. Disable warning in such case.
In lowering, use an empty class type to indicate C99 complex
type in runtime interface.
Add a unit test enforcing C99 complex can be processed
by FIR runtime interface builder.
Differential Revision: https://reviews.llvm.org/D110860
Add a C wrapper that calls the Fortran runtime initialization and
finalization routines as well as the compiled Fortran main program
_QQmain.
Place it in its own library to satisfy shared library builds since it
contains a C main function.
- cc7ac498f9 (diff-fa35a5efa62731fd2845e5e982eca9a2e36439783e11a4e4a463753c2160ec10R53)
- was created in flang/test/Examples/main.c in Eric's branch
The signatures for the PowerPC builtins lharx and
lbarx are incorrect, and causes issues when used in a function
that requires the return of the builtin to be promoted.
This patch fixes these signatures.
Differential revision: https://reviews.llvm.org/D110273
While these functions are only used in one location in upstream,
it has been reused in multiple downstreams. Restore this file to
a globally visibile location (outside of APInt.h) to eliminate
donwstream breakage and enable potential future reuse.
Additionally, this patch renames types and cleans up
clang-tidy issues.
Latest upstream llvm caused the kernel bpf selftest emitting the
following warnings:
In file included from progs/profiler3.c:6:
progs/profiler.inc.h:489:2: warning: loop not unrolled:
the optimizer was unable to perform the requested transformation;
the transformation might be disabled or specified as part of an unsupported
transformation ordering [-Wpass-failed=transform-warning]
for (int i = 0; i < MAX_PATH_DEPTH; i++) {
^
Further bisecting shows this SimplifyCFG patch ([1]) changed
the condition on how to fold branch to common dest. This caused
some unroll pragma is not honored in selftests/bpf.
The patch [1] test getUserCost() as the condition to
perform the certain basic block folding transformation.
For the above example, before the loop unroll pass, the control flow
looks like:
cond_block:
branch target: body_block, cleanup_block
body_block:
branch target: cleanup_block, end_block
end_block:
branch target: cleanup_block, end10_block
end10_block:
%add.ptr = getelementptr i8, i8* %payload.addr.0, i64 %call2
%inc = add nuw nsw i32 %i.0, 1
branch target: cond_block
In the above, %call2 is an unknown scalar.
Before patch [1], end10_block will be folded into end_block, forming
the code like
cond_block:
branch target: body_block, cleanup_block
body_block:
branch target: cleanup_block, end_block
end_block:
branch target: cleanup_block, cond_block
and the compiler is happy to perform unrolling.
With patch [1], getUserCost(), which calls getGEPCost(), which calls
isLegalAddressingMode() in TargetLoweringBase.cpp, considers IR
%add.ptr = getelementptr i8, i8* %payload.addr.0, i64 %call2
is free, so the above basic block folding transformation is not performed
and unrolling does not happen.
For BPF target, the IR
%add.ptr = getelementptr i8, i8* %payload.addr.0, i64 %call2
is not free and we don't have ld/st instruction address with 'r+r' mode.
This patch implemented a BPF hook for isLegalAddressingMode(), which is
identical to Mips isLegalAddressingMode() implementation where
the address pattern like 'r+r', 'r+r+i' or '2*r' are not allowed.
With testing kernel bpf selftests, all loop not unrolled warnings
are gone and all selftests run successfully.
[1] https://reviews.llvm.org/D108837
Differential Revision: https://reviews.llvm.org/D110789
As of e9564c3698, libcxx/gdb/gdb_pretty_printer_test.sh.cpp
fails locally for me because the REQUIRES check for host-has-gdb-with-python
uses python, which for me expands to python 2.7.18. This failure does not seem
to be caught on any upstream builders, potentially because they don't have gdb,
python, or a version of python that makes the test UNSUPPORTED (like python3).
This updates the check to use the python specified by the build (which should
be the python that runs this code), rather than just python.
Differential Revision: https://reviews.llvm.org/D110887
Previously for mem* intrinsics we only incremented the access count for
the first word in the range. However, after thinking it through I think
it makes more sense to record an access for every word in the range.
This better matches the behavior of inlined memory intrinsics, and also
allows better analysis of utilization at a future date.
Differential Revision: https://reviews.llvm.org/D110799
Helps debugging when working with symbol/expression issue. The dump
method is easy to call in the debugger.
Co-authored-by: Eric Schweitz <eschweitz@nvidia.com>
Differential Revision: https://reviews.llvm.org/D110856
This consists of 3 compressed instructions, c.not, c.neg, and c.zext.w.
I believe these have been picked up by the Zce effort using different
encodings. I don't think it makes sense to keep them in bitmanip. It
will eventually cause a conflict if/when Zce is implemented in llvm.
Differential Revision: https://reviews.llvm.org/D110871
When the ProcRef is Symbol is a SubprogramDetails, the interface is
the SubprogramDetails. Do not return nullptr.
Differential Revision: https://reviews.llvm.org/D110853
Add a FAQ entry about the names of openmp offloading components
and how they are searched for.
Reviewed By: jhuber6
Differential Revision: https://reviews.llvm.org/D109619
rG210d72e9d6b4a8e7633921d0bd7186fd3c7a2c8c moved the check from
builtin-config-ix to config-ix so that the check would be made even when
the builtins are not built. However, now the check is no longer made
when the builtins are built standalone which causes the builtins to fail
to build.
Add the check back to builtins-config-ix so that the check gets
performed both when the builtins are not built, and when they are built
standalone.
Reviewed By: smeenai
Differential Revision: https://reviews.llvm.org/D110879
Fixes 51982. Adds a missing CreatePointerCast and allocates a global in
the correct address space.
Test case derived from https://github.com/ROCm-Developer-Tools/aomp/\
blob/aomp-dev/test/smoke/nest_call_par2/nest_call_par2.c by deleting
parts while checking the assertion failure still occurred.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D110556
Use enum for execution mode.
This is partly a port from ROCm and partly a port from D110029. Attempted to
make the same choices as ROCm as far as comments etc go to reduce the merge
conflicts.
There is some cleanup warranted here - in particular I like the cuda patch
factoring out the comparisons into named variables - but I'd like to leave
that for a follow up patch, keeping this one minimal.
Reviewed By: carlo.bertolli
Differential Revision: https://reviews.llvm.org/D110845