Reland `sigsetjmp` patches with build fixes.
We wrap every target replying on the epilogue library into conditional
checks.
---------
Co-authored-by: Petr Hosek <phosek@google.com>
We skip/xfail the other FrameFormat tests on Windows already. The number
of different buildbot configurations out there targetting Windows makes
it hard to make this test portable. If we want to test this on Windows
we should probably just make a dedicated test for it.
We had two ways of achieving the same thing. This commit removes
unary_builtin.inc in favour of the approach combining gentype.inc with
unary_def.inc.
There is no change to the codegen for any target.
We are adding new test cases to see the transformation impact due to #134979.
These are similar to previous commit #688c3ffb057a87b86c6c1e77040418adf511efbb.
Handling struct with two member and different vector factor.
---------
Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>
Currently if we try to create a VPInstructionWithType without a FMF via
VPBuilder::createNaryOp we will use the constructor that asserts
`assert(isFPMathOp() && "this op can't take fast-math flags");`.
This fixes it by checking if FMFs have a value, similar to the other
createNaryOp overloads.
This is needed by #129508
This introduces a new diagnostic, -Wc++-hidden-decl, which is grouped
under -Wc++-compat, that diagnoses declarations which are valid in C but
invalid in C++ due to the type being at the wrong scope. e.g.,
```
struct S {
struct T {
int x;
} t;
};
struct T t; // Valid C, invalid C++
```
This is implementing the other half of #21898
When given an invalid Objective-C extension, Clang would crash when
trying to emit the mangled name of the method to the JSON dump output.
Fixes#137320
When given an invalid Objective-C extension, Clang would crash when
trying to emit the mangled name of the method to the JSON dump output.
Fixes#137320
LegalizerHelper::reduceLoadStoreWidth does not work for non-byte-sized
types, because this would require (un)packing of bits across byte
boundaries.
Precommit tests: #134904
Suppress EGPR/NDD instructions for relocations to avoid APX relocation
types emitted. This is to keep backward compatibility with old version
of linkers without APX support. The use case is to try APX features with
LLVM + old built-in linker on RHEL9 OS which is expected to be EOL in
2032.
If there are APX relocation types, the old version of linkers would
raise "unsupported relocation type" error. Example:
```
$ llvm-mc -filetype=obj -o got.o -triple=x86_64-unknown-linux got.s
$ ld got.o -o got.exe
ld: got.o: unsupported relocation type 0x2b
...
$ cat got.s
...
movq foo@GOTPCREL(%rip), %r16
$ llvm-objdump -dr got.o
...
1: d5 48 8b 05 00 00 00 00 movq (%rip), %r16
0000000000000005: R_X86_64_CODE_4_GOTPCRELX foo-0x4
```
This introduces a new diagnostic group to diagnose implicit casts from
int to an enumeration type. In C, this is valid, but it is not
compatible with C++.
Additionally, this moves the "implicit conversion from enum type to
different enum type" diagnostic from `-Wenum-conversion` to a new group
`-Wimplicit-enum-enum-cast`, which is a more accurate home for it.
`-Wimplicit-enum-enum-cast` is also under `-Wimplicit-int-enum-cast`, as
it is the same incompatibility (the enumeration on the right-hand is
promoted to `int`, so it's an int -> enum conversion).
Fixes#37027
This commit adds an assert to `Operation::moveBefore` which ensures that
moving of operations without a parent block is detected early on. This
case otherwise runs into a segfault, as it's assumed that there is an
parent block.
Currently there is a fair amount of code duplication in various parts of
emitAction. This patch factors out commonality among CCAssignToReg,
CCAssignToRegAndStack, CCAssignToRegWithShadow, and CCAssignToStack.
`XXXGenCallingConv.inc` are identical before and after this patch for
all targets.
This is an alternative to #128506 which doesn't attempt to change the
codegen for fmin and fmax on their way to the CLC library.
The amdgcn and r600 custom definitions of fmin/fmax are now converted to
custom definitions of __clc_fmin and __clc_fmax.
For simplicity, the CLC library doesn't provide vector/scalar versions
of these builtins. The OpenCL layer wraps those up to the vector/vector
versions.
The only codegen change is that non-standard vector/scalar overloads of
fmin/fmax have been removed. We were currently (accidentally,
presumably) providing overloads with mixed elment types such as
fmin(double2, float), fmax(half4, double), etc. The only vector/scalar
overloads in the OpenCL spec are those with scalars of the same element
type as the vector in the first argument.
The Worklist used by computeMinimumValueSizes is wastefully a
Value-vector, when the top of the loop attempts to cast the popped value
to an Instruction anyway. Improve the algorithm by cutting this wasteful
work, refining the type of Worklist, Visited, and Roots from
Value-vectors to Instruction-vectors.
Currently shouldReduceLoadWidth is very relaxed about when loads can be
split to avoid extractions from the original full width load - resulting
in many cases where the number of memory operations notably increases,
replacing the cost of a extract_subvector for additional loads.
This patch adjusts the 256/512-bit vector load splitting metric to
detect cases where ANY use of the full width load is be used directly -
in which case we will now reuse that load for smaller types, unless we'd
need to extract an upper subvector / integer element - i.e. we now
correctly treat (extract_subvector cst, 0) as free.
We retain the existing logic of never splitting loads if all uses are
extract+stores but we improve this by peeking through bitcasts while
looking for extract_subvector/store chains.
This required a number of fixes - shouldReduceLoadWidth now needs to
peek through bitcasts UP the use-chain to find final users (limited to
hasOneUse cases to reduce complexity). It also exposed an issue in
isTargetCanonicalConstantNode which assumed that a load of vector
constant data would always extract, which is no longer the case.
#137595 changed the behaviour for SIMD on ARM to ensure it is enabled
and disabled correctly depending on the options used by the user. In
this, the functionality to disable all features that depend on SIMD was
added. However, there was a spelling error for the i8mm feature, which
caused the `+nosimd` command to fail.
This fixes the error, and allows the command to work again.
…dy.py
Currently, run_clang_tidy.py does not correctly display the list of
checks picked up from the top-level .clang-tidy file. The reason for
that is that we are passing an empty string as input file.
However, that's not how we are supposed to use clang-tidy to list
checks. Per
65eccb463d,
we simply should not pass any file at all - the internal code of
clang-tidy will pass a "dummy" file if that's the case and get the
.clang-tidy file from the current working directory.
Fixes#136659
Co-authored-by: Carlos Gálvez <carlos.galvez@zenseact.com>
Error if checks: verify whether the same length and type between input
list, output list, and control-flow blocks.
Level_checks: verify whether the nested depth exceeds MAX_NESTING.
This patch adds another frame-format variable (currently only
implemented in the CPlusPlus language plugin) that represents the
"suffix" of a function. The name is derived from the `DotSuffix` node of
LLVM's Itanium demangler.
For a function name such as `int foo() (.cold)`, the suffix would be
`(.cold)`.
LoopAccessAnalysis has code for handling function calls where the
function is marked with the 'convergent' attribute, but the test
coverage is insufficient. Fix this by adding a test showing the case of
no-runtime-checks adapted from LoopDistribute, and clean up the existing
test with runtime-checks. Also regenerate the test file with
UpdateTestChecks.
There is a bug in the computation and handling of MinBWs in the case of
VPWidenIntrinsicRecipe: a crash is observed in
VPlanTransforms::truncateToMinimalBitwidths due to a mismatch between
the number of recipes processed and the number of entries in MinBWs. Fix
handling of calls in llvm::computeMinimumValueSizes, and handle
VPWidenIntrinsicRecipe in truncateToMinimalBitwidths, fixing the bug.
Fixes#87407.
The outermost function doesn't have a This pointers, but inner calls
can. This still doesn't match the diagnostic output of the current
interpreter properly, but it's closer I think.
Add scheduling model for the MIPS i6400 and i6500, an in-order MIPS64R6
processor.
i6400 and i6500 share same instruction latencies.
CPU has following pipelines
- Two ALUs
- Multiply and Divide unit (MDU)
- Branch Unit (CTU)
- Load/Store Unit (LSU)
- Short Floating-point Unit and
- Long Floating-point Unit
Latency information is available at:
https://mips.com/wp-content/uploads/2025/03/MIPS_I6500-F_Data_Sheet_Rev1.10_3-17-2025.pdf
Some of the tests moved in #134354 do not currently pass on a mac. Disable
them.
The test for testing process priority also does not build on a mac. I'm not
moving it (back) to the linux, since it's "almost" fine -- we'd just need to
implement the RLIMIT_NICE logic differently. I'm not doing that now since
(apart from linux) there are no systems which are able to retrieve the process
priority.
In #130623 support was added for `+nosimd` in the clang driver.
Following this PR, it was discovered that, if NEON is disabled in the
command line, it did not disable features that depend on this, such as
Crypto or AES. To achieve this, This PR does the following:
- Ensure that disabling NEON (e.g., via +nosimd) also disables dependent
features like Crypto and AES.
- Update the driver to automatically enable NEON when enabling features
that require it (e.g., AES).
This fixes inconsistent behavior where features relying on NEON could be
enabled without NEON itself being active, or where disabling NEON left
dependent features incorrectly enabled.
We can use the `__is_nothrow_convertible` builtin unconditionally now,
which makes the implementation very simple, so there isn't much of a
need to keep a separate header around.
The function was always trying to dereference both the synthetic and
non-synthetic view of the object. This is wrong as the caller should be
able to determine which view of the object it wants to access, as is
done e.g. for child member access.
This patch removes the nonsynthetic->synthetic fallback, which is the
more surprising path, and fixes the callers to try both versions of the
object (when appropriate). I also snuck in simplification of the member
access code path because it was possible to use the same helper function
for that, and I wanted to be sure I understand the logic correctly.
I've left the synthetic->nonsynthetic fallback in place. I think we may
want to keep that one as we often have synthetic child providers for
pointer types. They usually don't provide an explicit dereference
operation but I think users would expect that a dereference operation on
those objects would work. What we may want to do is to try the
*synthetic* operation first in this case, so that the nonsynthetic case
is really a fallback.
---------
Co-authored-by: Ilia Kuklin <kuklin.iy@mail.ru>
As of #117683, Blocks are always enclosed in a function, so these checks
never fail. We can't change the signature of
`CalculateSymbolContextFunction` as it's an abstract function (and some
of its implementations can return nullptr), but we can create a
different function that we can call when we know we're dealing with a
block.
Reverts llvm/llvm-project#137408
This change broke `lldb/test/Shell/Unwind/split-machine-functions.test`.
The test binary has a symbol named `_Z3foov.cold` and the test expects
the backtrace to print the name of the cold part of the function like
this:
```
# SPLIT: frame #1: {{.*}}`foo() (.cold) +
```
but now it gets
```
frame #1: 0x000055555555514f split-machine-functions.test.tmp`foo() + 12
```