Commit Graph

502426 Commits

Author SHA1 Message Date
Florian Hahn
eea150c840 [VPlan] Include IV phi and backedge cost in VPlan cost computation.
In WebAssembly, non-zero costs are assigned to backedge and induction
phis, so make sure we include those costs in the VPlan-based cost model.

This fixes a downstream crash with WebAssembly after 242cc200cc
(https://github.com/llvm/llvm-project/pull/92555)
2024-06-20 20:44:17 +01:00
Stella Stamenova
dfe55a11cb [vscode-mlir] Bump the version of braces to 3.0.3 (#96137)
Version 3.0.2 of braces has a security vulnerability.
2024-06-20 12:40:46 -07:00
Nikolas Klauser
622d8b0bc2 [libc++] Remove <ostream> include from <chrono> (#96035) 2024-06-20 21:40:23 +02:00
Ahmed Bougacha
7c814c13d0 [clang] Define ptrauth_sign_constant builtin. (#93904)
This is a constant-expression equivalent to
ptrauth_sign_unauthenticated.  Its constant nature lets us guarantee
a non-attackable sequence is generated, unlike
ptrauth_sign_unauthenticated which we generally discourage using.

It being a constant also allows its usage in global initializers, though
requiring constant pointers and discriminators.

The value must be a constant expression of pointer type which evaluates
to a non-null pointer.

The key must be a constant expression of type ptrauth_key.
The extra data must be a constant expression of pointer or integer type;
if an integer, it will be coerced to ptrauth_extra_data_t.
The result will have the same type as the original value.

This can be used in constant expressions.

Co-authored-by: John McCall <rjmccall@apple.com>
2024-06-20 12:09:54 -07:00
Ahmed Bougacha
50b9193781 [clang] Define ptrauth_string_discriminator builtin. (#93903)
This exposes the ABI-stable hash function that allows computing a 16-bit
discriminator from a constant string.

This allows manually matching the implicit string discriminators
computed in the ABI (e.g., from mangled names for vtable pointer/entry
signing), as well as enabling the use of interesting discriminators when
manually annotating specific pointers with the __ptrauth qualifier.

The argument must be a string literal of char character type.  The
result has type ptrauth_extra_data_t.
The result value is never zero and always within range for both the
__ptrauth qualifier and ptrauth_blend_discriminator.
This can be used in constant expressions.

Co-authored-by: John McCall <rjmccall@apple.com>
2024-06-20 11:55:41 -07:00
Timm Bäder
99f5fcb0d1 [clang][Interp] Try to fix #embed on big-endian machines
Insert a cast to the proper value.
2024-06-20 20:29:27 +02:00
eddyz87
01ce74fe14 Revert "[DebugInfo][BPF] Add 'annotations' field for DIBasicType & DISubroutineType (#91422)" (#96172)

This reverts commit 3ca17443ef.

As reported in [1,2] the commit above causes CI failure for powerpc-aix
target.
There is also a performance regression reported in [3]. Reverting to
comply with the developer policy.

[1]
https://github.com/llvm/llvm-project/pull/91422#issuecomment-2179425473
[2] https://lab.llvm.org/buildbot/#/builders/64/builds/62
[3]
https://github.com/llvm/llvm-project/pull/91422#issuecomment-2175631443
2024-06-20 21:28:02 +03:00
Jonathan Peyton
88dae3d5d0 [OpenMP][libomp] Remove Perl in favor of Python (#95307)
* Removes all Perl scripts and modules
* Adds Python3 scripts which mimic the behavior of the Perl scripts
* Removes Perl from CMake; Adds Python3 requirement to CMake
* The check-instruction-set.pl script is Knights Corner specific. The
script is removed and not replicated with a corresponding Python3
script.

Relevant Discourse:

https://discourse.llvm.org/t/error-compiling-clang-with-offloading-support/79223/4

Fixes: https://github.com/llvm/llvm-project/issues/62289
2024-06-20 12:54:49 -05:00
Louis Dionne
67da89cf74 [libc++abi] Use target_compile_options to pass LIBCXXABI_ADDITIONAL_COMPILE_FLAGS (#96112)
We use target_compile_options to pass the libc++ variant of this flag,
so we should be consistent for libc++abi. This is actually not only a
matter of consistency: target_compile_options handles duplicate CMake
options in a certain way (it removes duplicates but has an escape hatch
using the "SHELL:" prefix), and it is important for both libc++ and
libc++abi options to be handled in the same way.
2024-06-20 13:52:20 -04:00
Mingming Liu
8d9db947b7 Reland "[ThinLTO] Populate declaration import status except for distributed ThinLTO under a default-off new option" (#95482)
Make `FunctionsToImportTy` an `unordered_map` rather than `DenseMap`.
Credit goes to jvoung@ for the 'DenseMap -> unordered_map' change. This
is a reland of https://github.com/llvm/llvm-project/pull/92718

* `DenseMap` allocates space for a large number of key/value pairs and
wastes space when the number of elements is small.
* While init bucket size is zero [1], it quickly allocates buckets for 64 elements [2]
when the number of elements is small (for example, 3 or 4 elements). The programmer
manual [3] also mentions it could waste space.
* Experiments show `FunctionsToImportTy.size()` is smaller than 4 for
multiple binaries with high indexing RAM usage. The `unordered_map` growth
factor is at most 2 in libc++ [4] for insert operations.
 
With this change, `ComputeCrossModuleImport` ram increase is smaller
than 0.5G on a couple of binaries with high indexing ram usage. A wider
range of (pre-release) tests pass.
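
As a rough illustration of the container choice discussed above, the sketch below builds a three-entry `std::unordered_map` and exposes its bucket count for inspection. The names `SmallImportMap`, `makeSmallImportMap` and `bucketsFor` are hypothetical and are not taken from the patch. The sketch does not measure `DenseMap`, and bucket counts are implementation-defined, so it only shows how one might check the numbers on a given standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for an import list that holds very few entries.
using SmallImportMap = std::unordered_map<std::string, int>;

// Builds a map with three entries, similar in size to the small maps
// described in the commit message.
inline SmallImportMap makeSmallImportMap() {
  SmallImportMap imports;
  imports.emplace("foo", 1);
  imports.emplace("bar", 2);
  imports.emplace("baz", 3);
  return imports;
}

// Bucket counts are implementation-defined, so query them rather than
// assuming a particular value.
inline std::size_t bucketsFor(const SmallImportMap &m) {
  return m.bucket_count();
}
```

Calling `bucketsFor(makeSmallImportMap())` on the target toolchain gives a concrete number to compare against the 64-element allocation mentioned for `DenseMap`.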

[1] ad79a14c9e/llvm/include/llvm/ADT/DenseMap.h (L431-L432) 
[2] ad79a14c9e/llvm/include/llvm/ADT/DenseMap.h (L849)
[3] https://llvm.org/docs/ProgrammersManual.html#llvm-adt-densemap-h
[4] ad79a14c9e/libcxx/include/__hash_table (L1525-L1526)

**Original commit message** 
The goal is to populate `declaration` import status if a new flag
`-import-declaration` is on.

* For in-process ThinLTO, the `declaration` status is visible to backend
`function-import` pass, so `FunctionImporter::importFunctions` should
read the import status and be no-op for declaration summaries.
Basically, the postlink pipeline is updated to keep its current behavior
(import definitions), but not updated to handle `declaration` summaries.
Two use cases ([better call-graph
sort](https://discourse.llvm.org/t/rfc-for-better-call-graph-sort-build-a-more-complete-call-graph-by-adding-more-indirect-call-edges/74029#support-cross-module-function-declaration-import-5)
or [cross-module
auto-init](https://github.com/llvm/llvm-project/pull/87597#discussion_r1556067195))
would use this bit differently.

* For distributed ThinLTO, the `declaration` status is not serialized to
bitcode. As discussed, https://github.com/llvm/llvm-project/pull/87600
will do this.
2024-06-20 10:50:31 -07:00
Alan Zhao
836703087d [BranchFolder] Fix missing debug info with tail merging (#94715)
`BranchFolder::TryTailMergeBlocks(...)` removes unconditional branch
instructions and then recreates them. However, this process loses debug
source location information from the previous branch instruction, even
if tail merging doesn't change IR. This patch preserves the debug
information from the removed instruction and inserts them into the
recreated instruction.

Fixes #94050
2024-06-20 10:48:18 -07:00
Jonas Devlieghere
5e9f247c06 [lldb] Make LanguageRuntime::GetTypeBitSize return an optional (NFC) (#96013)
Make LanguageRuntime::GetTypeBitSize return an optional. This should be
NFC, though the ObjCLanguageRuntime implementation is (possibly) more
defensive against returning 0.

I'm not sure if it's possible for both `m_ivar.size` and `m_ivar.offset`
to be zero. Previously, we'd return 0 and cache it, only to discard it
the next time we found it in the cache and recompute it.
The new code will avoid putting it in the cache in the first place.
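
The caching behaviour described above can be sketched as follows. This is a minimal illustration, not LLDB's code: `BitSizeCache`, `get` and `cachedEntries` are hypothetical names, and treating a computed size of 0 as "unknown" is an assumption based on the message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <unordered_map>

// Hypothetical stand-in for a per-type size cache; not LLDB's actual class.
class BitSizeCache {
public:
  // computeSize is invoked only on a cache miss.
  template <typename Fn>
  std::optional<std::uint64_t> get(int typeId, Fn computeSize) {
    auto it = cache_.find(typeId);
    if (it != cache_.end())
      return it->second;
    std::uint64_t size = computeSize(typeId);
    if (size == 0)
      return std::nullopt; // unknown size: report "no value", do not cache
    cache_.emplace(typeId, size);
    return size;
  }

  std::size_t cachedEntries() const { return cache_.size(); }

private:
  std::unordered_map<int, std::uint64_t> cache_;
};
```

Returning `std::nullopt` instead of 0 lets callers distinguish "no answer" from a real size, and keeps the cache free of entries that would be thrown away on the next lookup.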
2024-06-20 10:46:26 -07:00
Mital Ashok
482c41e992 [Clang] [Sema] Diagnose unknown std::initializer_list layout in SemaInit (#95580)
This checks if the layout of `std::initializer_list` is something Clang
can handle much earlier and deduplicates the checks in
CodeGen/CGExprAgg.cpp and AST/ExprConstant.cpp.

Also now diagnose `union initializer_list` (Fixes #95495), bit-field for
the size (Fixes a crash that would happen during codegen if it were
unnamed), base classes (that wouldn't be initialized) and polymorphic
classes (whose vtable pointer wouldn't be initialized).
2024-06-20 19:44:06 +02:00
Nikita Popov
6cea40400d [IR] Remove RepeatedPass (#96211)
This pass is not used in any pipeline, barely used in tests and not
really useful, so drop it. The only place where we "repeat" passes is
devirt repetition, and that is done using a separate pass.
2024-06-20 19:39:19 +02:00
OverMighty
1107575c95 [libc][math][c23] Add {getpayload,setpayload,setpayloadsig}f16 C23 math functions (#95159)
Part of #93566.
2024-06-20 13:33:34 -04:00
Adrian Prantl
f1edc0459a Reformat test (NFC) 2024-06-20 10:32:07 -07:00
Adrian Prantl
b8f0ca09b6 Factor out expression result error strings. 2024-06-20 10:32:06 -07:00
Adrian Prantl
f900644ae2 Refactor GetObjectDescription() to return llvm::Expected (NFC)
This is de facto an NFC change for Objective-C but will benefit the
Swift language plugin.
2024-06-20 10:32:06 -07:00
Adrian Prantl
d1bc75c0bc Convert ValueObject::Dump() to return llvm::Error() (NFCish)
This change by itself has no measurable effect on the LLDB
testsuite. I'm making it in preparation for threading through more
errors in the Swift language plugin.
2024-06-20 10:32:06 -07:00
Aaron Ballman
6bc71cdd32 [C99] Claim partial conformance to n448
This is the paper that added the 'restrict' keyword. Clang is
conforming to the letter of the standard's requirements, so it would be
defensible for us to claim full support instead. However, LLVM does not
currently support the optimization semantics with restricted local
variables or data members, only with restricted pointers declared in
function parameters. So we're only claiming partial support because we
don't yet take full advantage of what the feature allows.
2024-06-20 13:29:36 -04:00
Michael Buch
0ec567c370 Revert "[lldb][ObjC] Don't query objective-c runtime for decls in C++ contexts (#95963)"
This reverts commit dadf960607.

The commit caused `TestEarlyProcessLaunch.py` to fail on the
macOS bots.
2024-06-20 17:53:37 +01:00
Sam Clegg
afb3f7e0b7 [lld][WebAssembly] Handle stub symbol dependencies when an explicit import name is used (#80169) 2024-06-20 09:48:25 -07:00
Adrian Prantl
869f551760 [lldb] Give more time to test/API/multiple-debuggers
This test occasionally fails on two of the busiest CI bots (asan and
matrix), and we can't reproduce it locally. This leads to the
hypothesis that the test is timing out (in the sense of the number of
"join attempts" performed by this test's driver).

This commit doubles the number of iterations performed and also does
an NFC refactor of the main test loop so that it can be more easily
understood.
2024-06-20 09:45:58 -07:00
Andrzej Warzyński
c65fb32ddd [mlir][vector] Update tests for collapse 3/n (nfc) (#94906)
The main goal of this PR (and subsequent PRs), is to add more tests with
scalable vectors to:
  * vector-transfer-collapse-inner-most-dims.mlir

There's quite a few cases to consider, hence this is split into multiple
PRs. In this PR, the very first test for `vector.transfer_write` is
complemented with all the possible combinations:
  * scalable (rather than fixed) unit trailing dim,
  * dynamic (rather than static) trailing dim in the source memref.

To this end, the following tests:
  * `@leading_scalable_dimension_transfer_write` and
    `@trailing_scalable_one_dim_transfer_write`

are replaced with:
  * `@drop_two_inner_most_dim_scalable_inner_dim` and
    `@negative_scalable_unit_dim`,

respectively. In addition:
  * "_for_transfer_write" is removed from function names (to reduce
    noise).

In addition, to maintain consistency between the tests for `xfer_read`
and `xfer_write`, 2 negative tests for `xfer_read` are also renamed.
This is to follow the suggestion made during the review of this PR.

Extra comments in "VectorTransforms.cpp" are added to better
document the limitations related to scalable vectors and which of the
tests added here exercise them.

This is a follow-up for: #94490 and #94604

NOTE: This PR is limited to tests for `vector.transfer_write`.
2024-06-20 17:45:03 +01:00
Florian Hahn
242cc200cc Recommit "[VPlan] First step towards VPlan cost modeling. (#92555)"
This reverts commit 6f538f6a2d.

Extra tests for crashes discovered when building Chromium have been
added in fb86cb7ec1, 3be7312f81.

Original message:
This adds a new interface to compute the cost of recipes, VPBasicBlocks,
VPRegionBlocks and VPlan, initially falling back to the legacy cost model
for all recipes. Follow-up patches will gradually migrate recipes to
compute their own costs step-by-step.

It also adds getBestPlan function to LVP which computes the cost of all
VPlans and picks the most profitable one together with the most
profitable VF.

The VPlan selected by the VPlan cost model is executed and there is an
assert to catch cases where the VPlan cost model and the legacy cost
model disagree. Even though I checked a number of different build
configurations on AArch64 and X86, there may be some differences
that have been missed.

Additional discussions and context can be found in @arcbbb's
https://github.com/llvm/llvm-project/pull/67647 and
https://github.com/llvm/llvm-project/pull/67934 which is an earlier
version of the current PR.

PR: https://github.com/llvm/llvm-project/pull/92555
2024-06-20 17:32:52 +01:00
Florian Hahn
c07be08df5 [LV] Add tail folding test with scalarized store and wide header mask.
Add additional test with scalarized store which caused crashes with
earlier versions of https://github.com/llvm/llvm-project/pull/92555.
2024-06-20 17:24:59 +01:00
Daniel Otero
651d44d3da [clang] Fix missing installed header (#95979)
Since commit 8d468c132e, the header
`openmp_wrappers/complex` is hidden behind `openmp_wrappers/complex.h`
due to a bug in CMake[^1], so is not actually installed.

To test the issue, you can ask `ninja` to generate the file on your
build:

```
$ ninja lib/clang/19/include/openmp_wrappers/complex.h
[199/199] Copying clang's openmp_wrappers/complex.h...
$ ninja lib/clang/19/include/openmp_wrappers/complex
ninja: error: unknown target 'lib/clang/19/include/openmp_wrappers/complex', did you mean 'lib/clang/19/include/openmp_wrappers/complex.h'? 
```

Re-ordering the entries works around the issue. The other option is to
revert the cited commit, but I'm not sure which approach is preferred.

CC @etcwilde @jdoerfert 

[^1]: [Here](https://gitlab.kitware.com/cmake/cmake/-/issues/26058) is
the CMake report on the issue.
2024-06-20 09:19:54 -07:00
hdoc
af6acd7442 [Clang][Comments] Support for parsing headers in Doxygen \par commands (#91100)
### Background

Doxygen's `\par` command
([link](https://www.doxygen.nl/manual/commands.html#cmdpar)) has an
optional argument, which denotes the header of the paragraph started by
a given `\par` command.

In short, the paragraph command can be used with a heading, or without
one. The code block below shows both forms and how the current version
of LLVM/Clang parses this code:
```
$ cat test.cpp
/// \par User defined paragraph:
/// Contents of the paragraph.
///
/// \par
/// New paragraph under the same heading.
///
/// \par
/// A second paragraph.
class A {};

$ clang++ -cc1 -ast-dump -fcolor-diagnostics -std=c++20 test.cpp
`-CXXRecordDecl 0x1530f3a78 <test.cpp:11:1, col:10> col:7 class A definition
  |-FullComment 0x1530fea38 <line:2:4, line:9:23>
  | |-ParagraphComment 0x1530fe7e0 <line:2:4>
  | | `-TextComment 0x1530fe7b8 <col:4> Text=" "
  | |-BlockCommandComment 0x1530fe800 <col:5, line:3:30> Name="par"
  | | `-ParagraphComment 0x1530fe878 <line:2:9, line:3:30>
  | |   |-TextComment 0x1530fe828 <line:2:9, col:32> Text=" User defined paragraph:"
  | |   `-TextComment 0x1530fe848 <line:3:4, col:30> Text=" Contents of the paragraph."
  | |-ParagraphComment 0x1530fe8c0 <line:5:4>
  | | `-TextComment 0x1530fe898 <col:4> Text=" "
  | |-BlockCommandComment 0x1530fe8e0 <col:5, line:6:41> Name="par"
  | | `-ParagraphComment 0x1530fe930 <col:4, col:41>
  | |   `-TextComment 0x1530fe908 <col:4, col:41> Text=" New paragraph under the same heading."
  | |-ParagraphComment 0x1530fe978 <line:8:4>
  | | `-TextComment 0x1530fe950 <col:4> Text=" "
  | `-BlockCommandComment 0x1530fe998 <col:5, line:9:23> Name="par"
  |   `-ParagraphComment 0x1530fe9e8 <col:4, col:23>
  |     `-TextComment 0x1530fe9c0 <col:4, col:23> Text=" A second paragraph."
  `-CXXRecordDecl 0x1530f3bb0 <line:11:1, col:7> col:7 implicit class A
```

As we can see above, the optional paragraph heading (`"User defined
paragraph"`) is not an argument of the `\par` `BlockCommandComment`, but
instead a child `TextComment`.

For documentation generators like [hdoc](https://hdoc.io/), it would be
ideal if we could parse Doxygen documentation comments with these
semantics in mind. Currently that's not possible.

### Change

This change parses `\par` commands the way Doxygen parses them,
making an optional header available as an argument if it is present.
In addition:

- AST unit tests are defined to test this functionality when an argument
is present, isn't present, with additional spacing, etc.
- TableGen is updated with an `IsParCommand` to support this
functionality
- `lit` tests are updated where needed
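
To make the intended semantics concrete, here is a small standalone sketch that splits a `\par` line into an optional heading. It is only an illustration under stated assumptions: `ParCommand` and `parseParLine` are hypothetical names, and Clang's real comment parser and AST nodes work differently.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>

// Hypothetical result type; Clang's real comment AST is different.
struct ParCommand {
  std::optional<std::string> heading; // text after "\par" on the same line
};

// Parses a single line that begins with "\par". Returns nullopt if the
// line is not a \par command.
inline std::optional<ParCommand> parseParLine(const std::string &line) {
  const std::string tag = "\\par";
  if (line.compare(0, tag.size(), tag) != 0)
    return std::nullopt;
  // Reject longer commands such as "\param": the tag must be followed by
  // whitespace or the end of the line.
  if (line.size() > tag.size() && line[tag.size()] != ' ' &&
      line[tag.size()] != '\t')
    return std::nullopt;
  ParCommand cmd;
  std::size_t begin = line.find_first_not_of(" \t", tag.size());
  if (begin != std::string::npos) {
    std::size_t end = line.find_last_not_of(" \t");
    cmd.heading = line.substr(begin, end - begin + 1);
  }
  return cmd;
}
```

With this shape, `\par User defined paragraph:` yields a heading while a bare `\par` yields none, which mirrors the "argument present / not present" cases the unit tests are said to cover.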
2024-06-20 12:14:51 -04:00
Sander de Smalen
8e0cd7382a [AArch64] Consider runtime mode when deciding to use SVE for fixed-length vectors. (#96081)
This also fixes the case where an SVE div is incorrectly assumed to be
available in non-streaming mode with SME.
2024-06-20 17:08:14 +01:00
Nikita Popov
49ae2dcf36 [PassManager] Remove some unnecessary includes (NFC) (#96175)
SmallPtrSet.h and TimeProfiler.h are unused. CommandLine.h is only
needed for the UseNewDbgInfoFormat declare, which can be moved to the
places that need it.
2024-06-20 17:41:35 +02:00
Jon Roelofs
037a9a754a [llvm][AArch64] SVE2 is an optional feature in ARMv9.0a (#96007)
... so move it out of the `implied_features` list, and into the
`DefaultExts` list.
2024-06-20 08:31:23 -07:00
Zaara Syeda
898b8a42b5 [PPC] Add DwarfRegAlias for VSRPair (#95837)
Add DwarfRegAlias for VSRPair as it shares dwarfRegNum with the VR
registers.
2024-06-20 11:30:58 -04:00
Kazu Hirata
bed2eb64de [GenericDomTreeConstruction] Use SmallVector (NFC) (#96138)
The use of SmallVector here saves 4.7% of heap allocations during the
compilation of ConvertExpr.cpp.ii, a preprocessed version of
flang/lib/Lower/ConvertExpr.cpp.
2024-06-20 08:20:38 -07:00
Nick Desaulniers (paternity leave)
f1ce6a465d [libc][arm] implement a basic setjmp/longjmp (#93220)
Note: our baremetal arm configuration compiles this as
`--target=arm-none-eabi`, so this code is built in -marm mode. It could be
smaller with `--target=armv7-none-eabi -mthumb`. The assembler is valid ARMv5,
or THUMB2, but not THUMB(1).
2024-06-20 08:16:48 -07:00
Jonas Rickert
abad8455ab [mlir] Expose skipRegions option for Op printing in the C and Python bindings (#96150)
The MLIR C and Python bindings expose various methods from
`mlir::OpPrintingFlags`. This PR adds a binding for the `skipRegions`
method, which allows skipping the printing of regions when printing ops.
It also exposes this option as a parameter in the Python `get_asm` and
`print` methods.
2024-06-20 10:15:08 -05:00
Anton Sidorenko
d4bfc4a821 [RISCV][NFC] Cleanup SCR1 sched model (#96088)
Related to https://github.com/llvm/llvm-project/pull/95948
2024-06-20 18:14:09 +03:00
Shilei Tian
e3eb12cce9 [Clang][AMDGPU] Add a builtin for llvm.amdgcn.make.buffer.rsrc intrinsic (#95276)
Depends on https://github.com/llvm/llvm-project/pull/94830.
2024-06-20 11:01:54 -04:00
Alexandre Ganea
67226bad15 [Support] Vendor rpmalloc in-tree and use it for the Windows 64-bit release (#91862)
### Context

We have a longstanding performance issue on Windows, where to this day,
the default heap allocator is still lock-based. With the number of cores
increasing, building and using LLVM with the default Windows heap
allocator is sub-optimal. Notably, the ThinLTO link times with LLD are
extremely long, and increase proportionally with the number of cores in
the machine.

In
a6a37a2fcd,
I introduced the ability to build LLVM with several popular lock-free
allocators. Downstream users however have to build their own toolchain
with this option, and building an optimal toolchain is a bit tedious and
long. Additionally, LLVM is now integrated into Visual Studio, which
AFAIK re-distributes the vanilla LLVM binaries/installer. The point
being that many users are impacted and might not be aware of this
problem, or are unable to build a more optimal version of the toolchain.

The symptom before this PR is that most of the CPU time goes to the
kernel (darker blue) when linking with ThinLTO:


![16c_ryzen9_windows_heap](https://github.com/llvm/llvm-project/assets/37383324/86c3f6b9-6028-4c1a-ba60-a2fa3876fba7)

With this PR, most time is spent in user space (light blue):


![16c_ryzen9_rpmalloc](https://github.com/llvm/llvm-project/assets/37383324/646b88f3-5b6d-485d-a2e4-15b520bdaf5b)

On higher core count machines, before this PR, the CPU usage becomes
pretty much flat because of contention:

<img width="549" alt="VM_176_windows_heap"
src="https://github.com/llvm/llvm-project/assets/37383324/f27d5800-ee02-496d-a4e7-88177e0727f0">


With this PR, similarly, most CPU time is now used:

<img width="549" alt="VM_176_with_rpmalloc"
src="https://github.com/llvm/llvm-project/assets/37383324/7d4785dd-94a7-4f06-9b16-aaa4e2e505c8">

### Changes in this PR

The avenue I've taken here is to vendor/re-licence rpmalloc in-tree, and
use it when building the Windows 64-bit release. Given the permissive
rpmalloc licence, prior discussions with the LLVM foundation and
@lattner suggested this vendoring. Rpmalloc's author (@mjansson) kindly
agreed to ~~donate~~ re-licence the rpmalloc code in LLVM (please do
correct me if I misinterpreted our past communications).

I've chosen rpmalloc because it's small and gives the best value
overall. The source code is only 4 .c files. Rpmalloc is statically
replacing the weak CRT alloc symbols at link time, and has no dynamic
patching like mimalloc. As an alternative, there were several
unsuccessful attempts made by Russell Gallop to use SCUDO in the past,
please see thread in https://reviews.llvm.org/D86694. If later someone
comes up with a PR of similar performance that uses SCUDO, we could then
delete this vendored rpmalloc folder.

I've added a new cmake flag `LLVM_ENABLE_RPMALLOC` which essentially sets
`LLVM_INTEGRATED_CRT_ALLOC` to the in-tree rpmalloc source.

### Performance

The most obvious test is profiling a ThinLTO linking step with LLD. I've
used a Clang compilation as a testbed, i.e.
```
set OPTS=/GS- /D_ITERATOR_DEBUG_LEVEL=0 -Xclang -O3 -fstrict-aliasing -march=native -flto=thin -fwhole-program-vtables -fuse-ld=lld
cmake -G Ninja %ROOT%/llvm -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=TRUE -DLLVM_ENABLE_PROJECTS="clang" -DLLVM_ENABLE_PDB=ON -DLLVM_OPTIMIZED_TABLEGEN=ON -DCMAKE_C_COMPILER=clang-cl.exe -DCMAKE_CXX_COMPILER=clang-cl.exe -DCMAKE_LINKER=lld-link.exe -DLLVM_ENABLE_LLD=ON -DCMAKE_CXX_FLAGS="%OPTS%" -DCMAKE_C_FLAGS="%OPTS%" -DLLVM_ENABLE_LTO=THIN
```
I've profiled the linking step with no LTO cache, with Powershell, such
as:
```
Measure-Command { lld-link /nologo @CMakeFiles\clang.rsp /out:bin\clang.exe /implib:lib\clang.lib /pdb:bin\clang.pdb /version:0.0 /machine:x64 /STACK:10000000 /DEBUG /OPT:REF /OPT:ICF /INCREMENTAL:NO /subsystem:console /MANIFEST:EMBED,ID=1 }
```

Timings:

| Machine | Allocator | Time to link |
|--------|--------|--------|
| 16c/32t AMD Ryzen 9 5950X | Windows Heap | 10 min 38 sec |
|  | **Rpmalloc** | **4 min 11 sec** |
| 32c/64t AMD Ryzen Threadripper PRO 3975WX | Windows Heap | 23 min 29 sec |
|  | **Rpmalloc** | **2 min 11 sec** |
|  | **Rpmalloc + /threads:64** | **1 min 50 sec** |
| 176 vCPU (2 socket) Intel Xeon Platinum 8481C (fixed clock 2.7 GHz) | Windows Heap | 43 min 40 sec |
|  | **Rpmalloc** | **1 min 45 sec** |

This also improves the overall performance when building with clang-cl.
I've profiled a regular compilation of clang itself, i.e.:
```
set OPTS=/GS- /D_ITERATOR_DEBUG_LEVEL=0 /arch:AVX -fuse-ld=lld
cmake -G Ninja %ROOT%/llvm -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=TRUE -DLLVM_ENABLE_PROJECTS="clang;lld" -DLLVM_ENABLE_PDB=ON -DLLVM_OPTIMIZED_TABLEGEN=ON -DCMAKE_C_COMPILER=clang-cl.exe -DCMAKE_CXX_COMPILER=clang-cl.exe -DCMAKE_LINKER=lld-link.exe -DLLVM_ENABLE_LLD=ON -DCMAKE_CXX_FLAGS="%OPTS%" -DCMAKE_C_FLAGS="%OPTS%"
```
This saves approx. 30 sec when building on the Threadripper PRO 3975WX:
```
(default Windows Heap)
C:\src\git\llvm-project>hyperfine -r 5 -p "make_llvm.bat stage1_test2" "ninja clang -C stage1_test2"
Benchmark 1: ninja clang -C stage1_test2
  Time (mean ± σ):     392.716 s ±  3.830 s    [User: 17734.025 s, System: 1078.674 s]
  Range (min … max):   390.127 s … 399.449 s    5 runs

(rpmalloc)
C:\src\git\llvm-project>hyperfine -r 5 -p "make_llvm.bat stage1_test2" "ninja clang -C stage1_test2"
Benchmark 1: ninja clang -C stage1_test2
  Time (mean ± σ):     360.824 s ±  1.162 s    [User: 15148.637 s, System: 905.175 s]
  Range (min … max):   359.208 s … 362.288 s    5 runs
```
2024-06-20 10:54:02 -04:00
Balázs Kéri
2e515ed60a [clang] Move 'alpha.cplusplus.MisusedMovedObject' to 'cplusplus.Move' in documentation (NFC) (#95003)
The checker was renamed some time ago but the documentation was not
updated. The section is now just moved and renamed. The documentation is
still very simple and needs improvement.
2024-06-20 16:41:51 +02:00
Philip Reames
3e55ac94c7 [RISCV] Strength reduce mul by 2^N - 2^M (#88983)
This is a three instruction expansion, and does not depend on zba, so
most of the test changes are in base RV32/64I configurations.

With zba, this covers immediates such as 14, 28, 30, 56, 60, and 62, which
aren't covered by our other expansions.
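
The arithmetic identity behind this expansion is x * (2^N - 2^M) = (x << N) - (x << M), i.e. two shifts and one subtract. The sketch below checks that identity in plain C++. The helper name `mulByPow2Diff` is hypothetical, and this is not the backend code from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Multiplies x by (2^n - 2^m) using two shifts and one subtract.
// Requires n < 64 and m < 64. Unsigned arithmetic wraps modulo 2^64,
// so the result matches a plain 64-bit multiply even on overflow.
constexpr std::uint64_t mulByPow2Diff(std::uint64_t x, unsigned n, unsigned m) {
  return (x << n) - (x << m);
}

// 14 = 2^4 - 2^1 and 60 = 2^6 - 2^2, two of the immediates named above.
static_assert(mulByPow2Diff(3, 4, 1) == 3 * 14, "14 = 16 - 2");
static_assert(mulByPow2Diff(3, 6, 2) == 3 * 60, "60 = 64 - 4");
```

For example, `mulByPow2Diff(x, 5, 2)` computes `x * 28`, since 28 = 32 - 4.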
2024-06-20 07:36:48 -07:00
Timm Bäder
67f5312c41 [clang][Interp] Nested ThisExprs that don't refer to the frame this ptr
Use a series of ops in that case, getting us to the right declaration
field.
2024-06-20 16:34:34 +02:00
Farzon Lotfi
2ae6889d3f [SPIRV] Add trig function lowering (#95973)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

This is part 2 of 4 PRs. It sets the ground work for adding the
intrinsics.

Add SPIRV lowering for `acos`, `asin`, `atan`, `cosh`, `sinh`, and `tanh`:
https://github.com/llvm/llvm-project/issues/70079
https://github.com/llvm/llvm-project/issues/70080
https://github.com/llvm/llvm-project/issues/70081
https://github.com/llvm/llvm-project/issues/70083
https://github.com/llvm/llvm-project/issues/70084
https://github.com/llvm/llvm-project/issues/95966


There isn't any aarch64 change in this PR, but when you add a target
opcode it is visible in their validation tests.
2024-06-20 10:34:23 -04:00
Tomas Matheson
fa6d38d61a [AArch64][TargetParser] Split FMV and extensions (#92882)
FMV extensions are really just mappings from FMV feature names to lists
of backend features for codegen. Split them out into their own separate
file.
2024-06-20 15:33:21 +01:00
Aaron Ballman
40a0ad2af3 [C11] Claim conformance to WG14 N3159
This was a roll-up of defect reports that we already test elsewhere, so
no additional test coverage is needed.

DR345: tested by clang/test/C/drs/dr3xx.c
DR344: tested by clang/test/C/drs/dr3xx.c
DR343: tested by clang/test/C/drs/dr3xx.c
DR341: tested by clang/test/C/drs/dr3xx.c
DR340: tested by clang/test/C/drs/dr3xx.c
DR338: tested by clang/test/C/drs/dr338.c
DR336: N/A for the compiler
DR330: N/A for the compiler
DR329: N/A for the compiler
DR328: tested by clang/test/C/drs/dr3xx.c
DR327: editorial
DR326: N/A for the compiler
DR315: tested by clang/test/C/drs/dr3xx.c
2024-06-20 10:26:13 -04:00
Shan Huang
ace069d7c5 [DebugInfo][TailCallElim] Drop the debug location of AccRecInstrNew (#95742)
Fix #95731.
2024-06-20 22:03:56 +08:00
Aaron Ballman
b84323cca9 [C11] Remove N1350 and N1394 from the list of documents on the C status page
These papers added Annex K, which is a library component that Clang
doesn't need to do anything to support.
2024-06-20 10:02:22 -04:00
Nikita Popov
290a939fc3 [SCEV] Handle nusw/nuw gep flags for addrecs
Set the nw flag if the gep has any nowrap flags. Transfer the nuw
flag. Also set nuw for the nusw + nneg combination.
2024-06-20 15:59:42 +02:00
Thomas Preud'homme
93ce8e1087 Fix Android build failure in InferIntRangeCommon (#96154)
As of today, Android's libcxx is missing C++17's std::function's CTAD
added in e1eabcdfad. This causes
InferIntRangeCommon.cpp to fail to compile. This commit makes the
template parameter of std::function in that function explicit, therefore
avoiding CTAD. While LLVM/MLIR's requirement is C++17, the rest of the
code builds fine so hopefully this is acceptable.
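
A minimal sketch of the difference, assuming nothing about the actual code in InferIntRangeCommon.cpp: the names `applyTwice` and `ctadFreeExample` are hypothetical and only illustrate spelling out the `std::function` signature instead of relying on C++17 class template argument deduction.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper, for illustration only.
inline int applyTwice(const std::function<int(int)> &fn, int value) {
  return fn(fn(value));
}

inline int ctadFreeExample() {
  // With CTAD one could write:
  //   std::function addOne = [](int v) { return v + 1; };
  // which needs the deduction guides for std::function. Writing the
  // signature explicitly avoids that dependency.
  std::function<int(int)> addOne = [](int v) { return v + 1; };
  return applyTwice(addOne, 40);
}
```

The explicit form is slightly more verbose but compiles on standard libraries that do not yet ship the deduction guides.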
2024-06-20 14:44:09 +01:00
Mital Ashok
b608b223ab [Clang] [Sema] Ensure noexcept(typeid(E)) checks if E throws when needed (#95846)
3ad31e12cc changed it so that not all
potentially-evaluated `typeid`s were marked as potentially-throwing, but
I forgot to check the subexpression if the null check of the `typeid`
didn't potentially-throw. This adds that check.
2024-06-20 15:43:14 +02:00
Jay Foad
33c9331a92 [AMDGPU] Precommit a test for llvm.amdgcn.pops.exiting.wave.id 2024-06-20 14:27:35 +01:00