Commit Graph

466087 Commits

Author SHA1 Message Date
Denis Revunov
ad4e0770ca [BOLT][Instrumentation] Put Allocator itself in shared memory by default
In the absence of the instrumentation-file-append-pid option, the
global allocator uses shared pages for allocation. However, since it is a
global variable, it gets COW'd after fork if instrumentation-sleep-time
is used, or if a process forks by itself. This means it hands out the same
pages to every process, which causes hash table corruption. Thus, if we
want shared pages, we need to put the allocator itself in a shared page,
which we do in this commit in __bolt_instr_setup.
I also added a couple of assertions to sanity-check the hash table.
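
The effect described above can be illustrated with a minimal sketch, assuming a POSIX system with anonymous shared mappings. `BumpAllocator` and `childAllocationIsVisibleToParent` are hypothetical names used only for illustration and are not BOLT's actual runtime types. The allocator state is constructed inside a `MAP_SHARED` page, so an allocation made by a forked child advances the same counter the parent sees:

```cpp
#include <atomic>
#include <cstddef>
#include <new>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical bump allocator: it only tracks the next free offset.
struct BumpAllocator {
  std::atomic<std::size_t> Next{0};
  // Returns the offset of the newly reserved block.
  std::size_t allocate(std::size_t Size) { return Next.fetch_add(Size); }
};

// Returns true if an allocation made in a forked child is visible to the
// parent, i.e. the allocator state itself lives in shared memory.
bool childAllocationIsVisibleToParent() {
  void *Page = mmap(nullptr, sizeof(BumpAllocator), PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_ANONYMOUS, -1, 0);
  if (Page == MAP_FAILED)
    return false;
  // Construct the allocator inside the shared page, not as a global.
  BumpAllocator *Alloc = new (Page) BumpAllocator();

  pid_t Pid = fork();
  if (Pid < 0) {
    munmap(Page, sizeof(BumpAllocator));
    return false;
  }
  if (Pid == 0) {
    Alloc->allocate(64); // child reserves offsets [0, 64)
    _exit(0);
  }
  waitpid(Pid, nullptr, 0);
  // With shared state the parent continues at offset 64. A copy-on-write
  // global would hand out offset 0 a second time.
  std::size_t ParentOffset = Alloc->allocate(64);
  munmap(Page, sizeof(BumpAllocator));
  return ParentOffset == 64;
}
```

If `BumpAllocator` were an ordinary global instead, the child's update would land in its private copy and both processes would receive offset 0, which is the kind of double hand-out the commit message blames for the corruption.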

Reviewed By: rafauler, Amir
Differential Revision: https://reviews.llvm.org/D153771
2023-06-30 01:03:52 +03:00
Denis Revunov
02c3724d43 [BOLT][Instrumentation] Don't share counters when using append-pid
The point of the append-pid option is to record separate profiles for
separate forks, which is impossible when counters are the same for
every process. It leads to a sum of all profiles in every file. In
addition, GlobalWriteProfileMutex, which is located in shared memory,
prevents some processes from dumping their data at all.

Reviewed By: rafauler, Amir
Differential Revision: https://reviews.llvm.org/D153771
2023-06-30 01:03:52 +03:00
Denis Revunov
8f7c53ef81 [BOLT][Instrumentation] Add mmap return value assertions
In the very rare case that an mmap call fails, we'll at least get a
message instead of a segfault.

Reviewed By: rafauler, Amir
Differential Revision: https://reviews.llvm.org/D154056
2023-06-30 01:03:52 +03:00
Denis Revunov
f0b45fba4b [BOLT][Instrumentation][NFC] define and use mmap flags
Reviewed By: rafauler, Amir
Differential Revision: https://reviews.llvm.org/D154056
2023-06-30 01:03:52 +03:00
Haojian Wu
4b47c6e018 Fix -Wunused-variable in release build. 2023-06-30 00:02:05 +02:00
Brendon Cahoon
853b2a84cb [AMDGPU] Reserve SGPR pair when long branches are present
Branch relaxation requires 2 additional SGPRs for AMDGPU to handle the
case when an indirect branch target is too far away. The register
scavenger may not find available registers, which causes a “did not find
scavenging index” assert to occur in assignRegToScavengingIndex.

In this patch, we estimate before register allocation whether an
indirect branch is likely to be needed, and reserve 2 SGPRs if the
branch distance is found to be above a threshold. The distance threshold
is an approximation as the exact code size and branch distance are
unknown prior to register allocation.

Patch by Corbin Robeck. Thanks!

Differential Review: https://reviews.llvm.org/D149775
2023-06-29 16:50:46 -05:00
varconst
b5270ba20d [libc++] Remove the legacy debug mode.
See https://discourse.llvm.org/t/rfc-removing-the-legacy-debug-mode-from-libc/71026

Reviewed By: #libc, Mordante, ldionne

Differential Revision: https://reviews.llvm.org/D153672
2023-06-29 14:49:51 -07:00
Yuanfang Chen
632dd6a4ca [Clang] Implements CTAD for aggregates P1816R0 and P2082R1
Differential Revision: https://reviews.llvm.org/D139837
2023-06-29 14:22:24 -07:00
philass
b287a4cbc4 [mlir] Remove self-include from BytecodeOpInterface.h (NFC)
Differential Revision: https://reviews.llvm.org/D153830
2023-06-29 14:14:23 -07:00
Louis Dionne
e7c63c0e90 [libc++] Stop using __builtin_assume in _LIBCPP_ASSERT
__builtin_assume can sometimes worsen code generation. For now, the
guideline seems to be to avoid adding assumptions without a clear
optimization intent. Since _LIBCPP_ASSERT is very general, we can't
have a clear optimization intent at this level, which makes
__builtin_assume the wrong tool for the job -- at least until
__builtin_assume is changed.

See https://discourse.llvm.org/t/llvm-assume-blocks-optimization/71609
for a discussion of this.

Differential Revision: https://reviews.llvm.org/D153968
2023-06-29 16:59:29 -04:00
Joseph Huber
b15ac1fd89 [libc] Enable the 'div' routines on the GPU
This patch simply enables the `div`, `ldiv`, and `lldiv` functions on
the GPU. This should be straightforward enough.
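
For readers unfamiliar with these routines, here is a small host-side sketch of what the standard `div` family computes. The wrapper name `divide` is hypothetical, and nothing here is specific to the GPU build of LLVM's libc:

```cpp
#include <cassert>
#include <cstdlib>

// Returns quotient and remainder in one call. The quotient truncates toward
// zero and the remainder takes the sign of the numerator.
std::div_t divide(int Numerator, int Denominator) {
  assert(Denominator != 0 && "division by zero is undefined");
  return std::div(Numerator, Denominator);
}
```

For example, `divide(-7, 2)` yields a quotient of -3 and a remainder of -1. `std::ldiv` and `std::lldiv` behave the same way for `long` and `long long`.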

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D154143
2023-06-29 15:42:46 -05:00
Johannes Doerfert
5186c2f9f8 [Attributor][NFC] Introduce IRP helpers for Attr and Arg handling 2023-06-29 13:32:06 -07:00
Johannes Doerfert
d6fa3b374f [Attributor] Remove now obsolete initialization code
With the helpers in place to judge AAs [1] we can remove the custom
rolled initialization checking code. This exposed a minor oversight in
the AAMemoryLocation where we did not check the IR before we gave up for
a declaration.

[1] d33bca840a
2023-06-29 13:32:06 -07:00
V Donaldson
2b12d8350e [flang] Block containing an interface
Name mangling may be invoked for an interface procedure contained in
a block in a context that does not have access to block ID mapping.
Procedures can't be defined inside a block, so name mangling doesn't
need a block map. Relax an assert to account for this.

    block
      interface
        subroutine ss(n) bind(c)
          integer :: n
        end subroutine
      end interface
      call ss(5)
    end block
    end
2023-06-29 13:25:45 -07:00
Ben Langmuir
1ede7b4749 [clang][modules] Avoid serializing all diag mappings in non-deterministic order
When writing a pcm, we serialize diagnostic mappings in order to
accurately reproduce the diagnostic environment inside any headers from
that module. However, the diagnostic state mapping table contains
entries for every diagnostic ID ever accessed, while we only want to
serialize the ones that are actually modified from their default value.
Further, we need to serialize them in a deterministic order.
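
The two requirements above (keep only modified entries, emit them in a stable order) can be sketched as follows. `Severity`, `Mapping`, and `collectModifiedMappings` are hypothetical stand-ins for illustration and do not mirror clang's actual data structures:

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-ins; clang's real types differ.
enum class Severity { Ignored, Warning, Error };

struct Mapping {
  Severity Current;
  Severity Default;
};

using Entry = std::pair<unsigned, Severity>;

// Keeps only the mappings that differ from their default and returns them
// sorted by diagnostic ID, so the result is independent of hash-table order.
std::vector<Entry>
collectModifiedMappings(const std::unordered_map<unsigned, Mapping> &Table) {
  std::vector<Entry> Out;
  for (const auto &KV : Table)
    if (KV.second.Current != KV.second.Default)
      Out.emplace_back(KV.first, KV.second.Current);
  std::sort(Out.begin(), Out.end(),
            [](const Entry &A, const Entry &B) { return A.first < B.first; });
  // Sanity check: the output is ordered by ID.
  assert(std::is_sorted(
      Out.begin(), Out.end(),
      [](const Entry &A, const Entry &B) { return A.first < B.first; }));
  return Out;
}
```

Sorting after filtering keeps the cost proportional to the number of modified entries rather than to every diagnostic ID that was ever looked up.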

rdar://111477511

Differential Revision: https://reviews.llvm.org/D154016
2023-06-29 13:17:24 -07:00
Daniel Kiss
92fbb602f3 [lld][AArch64] Add BTI landing pad to PLT entries when the symbol is exported.
With relative vtables, the caller jumps directly to the PLT entries in the shared object,
so a landing pad is needed for these entries.

Reproducer:
main.cpp
```
#include "v.hpp"
int main() {
    A* a = new B();
    a->do_something2();
    return 0;
}
```
v.hpp
```
struct A {
    virtual void do_something() = 0;
    virtual void do_something2();
};
struct B : public A {
    void do_something() override;
    void do_something2() override;
};
```
v.cpp
```
#include "v.hpp"
void A::do_something2() { }
void B::do_something() { }
void B::do_something2() { }
```
```
CC="clang++ --target=aarch64-unknown-linux-gnu -fuse-ld=lld -mbranch-protection=bti"
F=-fexperimental-relative-c++-abi-vtables
${=CC} $F -shared v.cpp -o v.so -z force-bti
${=CC} $F main.cpp -L./ v.so -Wl,-rpath=. -z force-bti
qemu-aarch64-static -L /usr/aarch64-linux-gnu -cpu max ./a.out
```
For v.so, the regular vtable entry is relocated by an R_AARCH64_ABS64 relocation referencing _ZN1B13do_something2Ev.
```
_ZTV1B:
.xword  _ZN1B13do_something2Ev
```
Using a relative vtable entry for a DSO has the downside of creating many PLT entries and making their addresses escape.
The relative vtable entry references a PLT entry _ZN1B13do_something2Ev@plt.
```
.L_ZTV1A.local:
        .word   (_ZN1A13do_something2Ev@PLT-.L_ZTV1A.local)-8
```

fixes: #63580

Reviewed By: peter.smith, MaskRay

Differential Revision: https://reviews.llvm.org/D153264
2023-06-29 22:17:17 +02:00
LLVM GN Syncbot
fcdc3c9775 [gn build] Port b4ff893877 2023-06-29 20:10:25 +00:00
Christopher Ferris
36ca9a2902 [scudo] Use getMonotonicTimeFast for tryLock.
In tryLock, the Precedence value is set using the fast time function
now. This should speed up tryLock calls slightly.

This should be okay even though the value is used as a kind of random
value in getTSDAndLockSlow. The fast time call still sets enough bits
to avoid getting the same TSD on every call.

Reviewed By: Chia-hungDuan

Differential Revision: https://reviews.llvm.org/D154039
2023-06-29 13:07:08 -07:00
Christian Trott
b4ff893877 [libc++][mdspan] Implement layout_left
This commit implements layout_left in support of C++23 mdspan
(https://wg21.link/p0009). layout_left is a layout mapping policy
whose index mapping corresponds to the memory layout of Fortran arrays.
Thus the left most index has stride-1 access, and the right most index
is associated with the largest stride.
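
As an illustration of the index mapping described above, the following sketch computes a column-major offset by hand. It does not use `std::mdspan`, and `layoutLeftOffset` is a hypothetical helper rather than part of libc++:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Offset of a multi-index under a column-major (layout_left-style) mapping:
//   offset = i0 + E0 * (i1 + E1 * (i2 + ...))
template <std::size_t Rank>
std::size_t layoutLeftOffset(const std::array<std::size_t, Rank> &Extents,
                             const std::array<std::size_t, Rank> &Index) {
  std::size_t Offset = 0;
  std::size_t Stride = 1; // the leftmost index has stride 1
  for (std::size_t R = 0; R < Rank; ++R) {
    assert(Index[R] < Extents[R] && "index out of bounds");
    Offset += Index[R] * Stride;
    Stride *= Extents[R]; // each index to the right gets a larger stride
  }
  return Offset;
}
```

For extents (2, 3), the index (1, 2) maps to 1 + 2 * 2 = 5, so stepping the leftmost index moves by one element while stepping the rightmost moves by a whole column.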

Co-authored-by: Damien L-G <dalg24@gmail.com>

Differential Revision: https://reviews.llvm.org/D153783
2023-06-29 14:01:08 -06:00
Fangrui Song
9979417d4d Revert D145226 "[mlir][Transforms][NFC] CSE: Add non-pass entry point"
This reverts commit 189033e6be.

This commit causes memory leak. See comments on D145226.
2023-06-29 12:53:31 -07:00
Johannes Doerfert
1221526681 [Attributor][FIX] Check AA preconditions
AAs often have preconditions, e.g., that the associated type is a
pointer type. If these do not hold, we do not need to bother creating
the AA. Best case, we invalidate it right away, worst case, we crash or
do something wrong (as happened in the issues below).

Fixes: https://github.com/llvm/llvm-project/issues/63553
Fixes: https://github.com/llvm/llvm-project/issues/63597
2023-06-29 12:32:45 -07:00
Johannes Doerfert
de88628ab9 [Attributor][FIX] Ensure AAAssumptionInfo properly reports change
I have no test as I just noticed the wrong change status reported by
update randomly.
2023-06-29 12:32:45 -07:00
Johannes Doerfert
d33bca840a [Attributor] Introduce helpers to judge AAs prior to creation
This is a partial cleanup to centralize the initialization and update
decisions for AAs, lifting the burden and boilerplate from users and
making it harder to accidentally perform unsound deductions.

The two static helpers show how we can lift the decisions to generate an
AA into the Attributor, avoiding trivial AAs that just cost us compile
time and maintenance code (to check for pre-conditions).
2023-06-29 12:32:45 -07:00
John Harrison
b9b0ab32f9 [lldb-vscode] Adjusting CreateSource to detect compiler generated frames.
Reviewed By: wallace

Differential Revision: https://reviews.llvm.org/D154026
2023-06-29 15:31:08 -04:00
Antonio Frighetto
a2ba4e8075 [ConstraintElimination] Handle solving-only ICMP_NE predicates
Simplification of non-equality predicates for solving constraint
systems is now supported by checking the validity of related
inequalities and equalities.

Differential Revision: https://reviews.llvm.org/D152684
2023-06-29 21:22:48 +02:00
Sam McCall
2f7d30dee8 [dataflow] fix compile on gcc7
Reported on https://reviews.llvm.org/D153674
This returned expression is move-eligible; this is a bug in old GCC.
2023-06-29 21:20:53 +02:00
Jennifer Yu
085845a2ac [OMP5.2] Initial support for doacross clause. 2023-06-29 11:58:17 -07:00
Yaxun (Sam) Liu
41a1625e07 [HIP] Fix version detection for old HIP-PATH
ROCm used to install components under individual directories,
e.g. HIP installed to /opt/rocm/hip and rocblas installed to
/opt/rocm/rocblas. ROCm has transitioned to a flat directory
structure where all components are installed to /opt/rocm.
HIP-PATH and --hip-path are supposed to be /opt/rocm, as
clang detects the HIP version from /opt/rocm/share/hip/version.
However, some existing HIP apps still use HIP-PATH=/opt/rocm/hip.
To avoid regressions, clang will also try to detect share/hip/version
under the parent directory of HIP-PATH or --hip-path.
This way, the detection will work for both new HIP-PATH and
old HIP-PATH.
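
The lookup order described above can be sketched as follows. This is only an illustration under stated assumptions: `findHipVersionFile` and the injected `Exists` callback are hypothetical, and the real clang driver code is organized differently.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>

// Returns the first existing candidate for share/hip/version, or an empty
// string. Exists is injected so the sketch never touches the real filesystem.
std::string
findHipVersionFile(const std::string &HipPath,
                   const std::function<bool(const std::string &)> &Exists) {
  assert(!HipPath.empty() && "expected a HIP path");
  const std::string Suffix = "/share/hip/version";

  // New flat layout: HipPath is e.g. /opt/rocm.
  std::string Candidate = HipPath + Suffix;
  if (Exists(Candidate))
    return Candidate;

  // Old layout: HipPath is e.g. /opt/rocm/hip, so try its parent directory.
  std::size_t Slash = HipPath.find_last_of('/');
  if (Slash != std::string::npos && Slash != 0) {
    Candidate = HipPath.substr(0, Slash) + Suffix;
    if (Exists(Candidate))
      return Candidate;
  }
  return "";
}
```

With a version file present only at /opt/rocm/share/hip/version, both `/opt/rocm` and `/opt/rocm/hip` resolve to that same file.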

Reviewed by: Artem Belevich

Differential Revision: https://reviews.llvm.org/D154077

Fixes: SWDEV-407757
2023-06-29 14:57:26 -04:00
Alexey Bataev
bb4e547a60 [SLP][NFC]Add a test for buildvector with reused scalars and
extractelements.
2023-06-29 11:52:12 -07:00
John Harrison
fc52f8dc6c [lldb-vscode] Prior to running the launchCommands during a launch request set the launch info so the configured launch information is accessible by the launch commands.
Reviewed By: wallace

Differential Revision: https://reviews.llvm.org/D154028
2023-06-29 14:50:19 -04:00
John Harrison
227b2180eb Creating a startDebugging reverse DAP request handler in lldb-vscode.
Adds support for a reverse DAP request to startDebugging. The new request can be used to launch child processes from lldb scripts. For example, it would be straightforward to configure a debug configuration for a server and a client, allowing you to launch both processes with a single debug configuration.

Reviewed By: wallace, ivanhernandez13

Differential Revision: https://reviews.llvm.org/D153447
2023-06-29 14:45:57 -04:00
Aart Bik
6b88c852b6 [mlir][sparse] Start migration to new surface syntax for STEA
We are in the process of migrating to a much improved surface syntax for the Sparse Tensor Encoding Attribute (STEA).

You can see a preview of this in the StableHLO RFC at

 https://github.com/openxla/stablehlo/blob/main/rfcs/20230210-sparsity.md

This design is courtesy of Wren Romano.

This initial revision
(1) Introduces the first version of a new parser written by Wren Romano
(2) Introduces a simple "migration plan" using NEW_SYNTAX on the STEA, which will allow us to test the new parser with new examples, as well as migrate existing examples over without the need to rewrite them all

This first "drop" merely provides the entry points to parse the new syntax. The parser is still under active development. For example, we need to address the "lookahead" issue when parsing the lvl spec (viz. do we see l0 = d0 or a direct d0). Another larger task is to actually implement "affine" parsing (since the MLIR affine parser is not accessible in other parts of the tree).

EXAMPLE:

Currently, CSR looks like

  #CSR = #sparse_tensor.encoding<{
    lvlTypes = ["dense","compressed"],
    dimToLvl = affine_map<(i,j) -> (i,j)>
  }>

but you can "force" the new parser with

  #CSR = #sparse_tensor.encoding<{
    NEW_SYNTAX =
    (d0, d1) -> (l0 = d0 : dense, l1 = d1 : compressed)
  }>

Reviewed By: Peiming

Differential Revision: https://reviews.llvm.org/D153997
2023-06-29 11:32:07 -07:00
Arthur Eubanks
e9a9fccc9d [gn build] Fix tablegen dependencies
The source_set needs to depend on Support so llvm-config files are generated first.
2023-06-29 11:27:58 -07:00
Joseph Huber
667c10353e [libc] Fix the implementation of exit on the GPU
The RPC calls all have delays associated with them. Currently the `exit`
function does an async send and immediately exits the GPU. This can have
the effect that the RPC server never sees the exit call and we continue.
This patch changes that to first sync with the server before continuing
to perform its exit. There is still a hazard here, where the kernel can
complete before the RPC call reads back its response, but this is simply
multi-threaded hazards. This change ensures that the server *will*
always exit some time after the GPU exits.

Reviewed By: JonChesterfield

Differential Revision: https://reviews.llvm.org/D154112
2023-06-29 13:22:23 -05:00
Aiden Grossman
46f42e2ee5 [llvm-exegesis] Change map address in memory annotation tests
Test failures have been reported by some LLVM developers regarding
the low value of the location where the memory is being mapped into
the virtual address space, as it causes problems with some default
configurations of vm.mmap_min_addr. This patch sets it to 2^20 (1048576)
to alleviate this issue, as most distros seem to use a default value of
65536.
2023-06-29 18:21:06 +00:00
wlei
444d2e1a54 [CSSPGO] Enable stale profile matching by default for CSSPGO
We tested stale profile matching on several of Meta's internal services, and all results are positive. For instance, in one service that refreshed its profile every one or two weeks, it consistently gave a 1~2% performance improvement. We also observed an instance where a trivial refactoring caused a 2% regression and the matching successfully recovered the whole regression. Therefore, we'd like to turn it on by default for CSSPGO.

Reviewed By: hoy, wenlei

Differential Revision: https://reviews.llvm.org/D154027
2023-06-29 11:18:51 -07:00
Slava Zakharin
7b4aa95d7c [flang][hlfir] Set/propagate 'unordered' attribute for elementals.
This patch adds 'unordered' attribute handling to the HLFIR elementals'
builders and fixes the attribute handling in lowering and transformations.

Depends on D154031, D154032

Reviewed By: jeanPerier, tblah

Differential Revision: https://reviews.llvm.org/D154035
2023-06-29 11:16:38 -07:00
Slava Zakharin
65379d40cf [flang][hlfir] Do not inline ordered elementals.
This patch just disables inlining of ordered hlfir.elemental operations.
Proving the safeness of inlining is left for future development.

Depends on D154032

Reviewed By: jeanPerier, tblah

Differential Revision: https://reviews.llvm.org/D154034
2023-06-29 11:16:38 -07:00
Noah Goldstein
13f16f4dea [InstCombine] Canonicalize (icmp eq/ne (and x, C), x) -> (icmp eq/ne (and x, ~C), 0)
This increases the likelihood that `x` is single-use and is typically
easier to analyze.
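
As a complement to the Alive2 proof linked below, the equivalence for the `eq` case can be checked exhaustively over 8-bit values with a small sketch. The function names are hypothetical, and this only demonstrates the identity; it is not how InstCombine implements the fold:

```cpp
#include <cstdint>

// Original form: (x & C) == x.
bool originalForm(std::uint8_t X, std::uint8_t C) { return (X & C) == X; }

// Canonical form: (x & ~C) == 0.
bool canonicalForm(std::uint8_t X, std::uint8_t C) {
  return (X & static_cast<std::uint8_t>(~C)) == 0;
}

// Compares the two forms for every pair of 8-bit values.
bool formsAgreeForAllBytes() {
  for (unsigned X = 0; X < 256; ++X)
    for (unsigned C = 0; C < 256; ++C)
      if (originalForm(static_cast<std::uint8_t>(X),
                       static_cast<std::uint8_t>(C)) !=
          canonicalForm(static_cast<std::uint8_t>(X),
                        static_cast<std::uint8_t>(C)))
        return false;
  return true;
}
```

Both forms state that `x` has no bits set outside of `C`; the `ne` case follows by negating both sides.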

Proofs: https://alive2.llvm.org/ce/z/8ZpS2W

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D154004
2023-06-29 13:14:37 -05:00
Mark de Wever
9595a18de1 [NFC][libc++] Use a better type_trait to show the intention. 2023-06-29 19:56:28 +02:00
Igor Kirillov
17bde328d6 [LV] Add mask support for vectorizing interleaved groups
This patch extends LoopVectorize to handle the vectorization of interleaved
memory accesses with scalable vectors when a mask is required and/or predicated
tail folding is enabled.

Differential Revision: https://reviews.llvm.org/D152258
2023-06-29 17:50:56 +00:00
Luke Lau
d0d864f6f4 [SLP] Explicitly pass AccessTy to getGEPCost
Building on D149889, this patch updates SLP to pass the vector type as
the AccessTy to getGEPCost.
This should have the effect of GEPs being costed more often instead
of being treated as foldable into the address mode and thus free, as
some architectures, notably RISC-V, do not have offset+reg addressing
modes for vector memory accesses.

Note that in SLP, GEPs are costed in two places: getPointersChainCost
and GetGEPCostDiff.

Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D153570
2023-06-29 18:42:24 +01:00
Luke Lau
2b28f8f044 [RISCV][SLP] Add tests for unprofitable SLP vectorization due to GEP. NFC
Reviewed By: ABataev

Differential Revision: https://reviews.llvm.org/D149888
2023-06-29 18:42:22 +01:00
Slava Zakharin
39e87db192 [flang][hlfir] Codegen for unordered elemental operations.
Depends on D154031, D154032

Reviewed By: jeanPerier, tblah

Differential Revision: https://reviews.llvm.org/D154033
2023-06-29 10:35:43 -07:00
Slava Zakharin
583168ee86 [flang][hlfir] Parse unordered attribute for elemental operations.
By default, `hlfir.elemental` and `hlfir.elemental_addr` must process
the elements in order. The `unordered` attribute may be set,
if it is safe to process the elements out of order.
This patch just adds parsing support for the new attribute.

Reviewed By: jeanPerier, tblah

Differential Revision: https://reviews.llvm.org/D154032
2023-06-29 10:35:43 -07:00
Slava Zakharin
5983b8b6d3 [flang][hlfir] Lower ordered elemental subroutine calls.
This patch sets `unordered` `fir.do_loop` attribute during lowering
of elemental subroutine calls to HLFIR, when it is safe to do so.
Proper handling of `hlfir.elemental` will be done in a separate patch.

Reviewed By: jeanPerier, tblah

Differential Revision: https://reviews.llvm.org/D154031
2023-06-29 10:35:43 -07:00
Sergei Barannikov
2348902268 [clang][CodeGen] Remove no-op EmitCastToVoidPtr (NFC)
Reviewed By: JOE1994

Differential Revision: https://reviews.llvm.org/D153694
2023-06-29 20:29:38 +03:00
Craig Topper
0e84eec745 [RISCV] Add a helper class for creating GPR register classes.
Reduces the amount of repeated template parameters for every class.

Reviewed By: asb

Differential Revision: https://reviews.llvm.org/D154042
2023-06-29 10:23:39 -07:00
Joseph Huber
968f65ae03 [OpenMP] Adjust using the NVPTX architecture detection tool
A previous patch by @arsenm adjusted these to find the `amdgpu-arch`
tool correctly if we do a `LLVM_ENABLE_PROJECTS` build. This patch
applies the same to the `nvptx-arch` tool to keep it consistent.

Reviewed By: tianshilei1992

Differential Revision: https://reviews.llvm.org/D154107
2023-06-29 12:14:44 -05:00
Scott Todd
c304be7cfd [mlir][docgen] Handle Windows line endings in doc generation.
The `printReindented` function searches for Unix style line endings (`\n`), but strings may have Windows style line endings (`\r\n`). Prior to this change, generated document sections could have extra indentation, which some markdown renderers interpret as code blocks rather than paragraphs.
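
A minimal sketch of line splitting that tolerates both styles of line ending is shown below. `splitLines` is a hypothetical helper for illustration and is not the actual change made to `printReindented`:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits Text into lines, accepting both "\n" and "\r\n" terminators.
std::vector<std::string> splitLines(const std::string &Text) {
  std::vector<std::string> Lines;
  std::size_t Start = 0;
  while (Start < Text.size()) {
    std::size_t End = Text.find('\n', Start);
    if (End == std::string::npos)
      End = Text.size();
    std::string Line = Text.substr(Start, End - Start);
    if (!Line.empty() && Line.back() == '\r')
      Line.pop_back(); // drop the '\r' left over from a Windows line ending
    Lines.push_back(Line);
    Start = End + 1;
  }
  return Lines;
}
```

Stripping the trailing `\r` before measuring or re-emitting indentation means the same input produces the same lines regardless of which platform wrote the file.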

Differential Revision: https://reviews.llvm.org/D153591
2023-06-29 09:56:49 -07:00