Summary:
We link these implicitly for OpenMP because it's the canonical
implementation of those C language features. This was inhereted by HIP
and ended up with these resolving functions. There are some useful
functions in here, but this can be problematic as it could potentially
override functions intended to be provided by the ROCm device libraries.
Additionally the HIP runtime does not currently provide the handling for
the RPC server required to run the host resources so those just
segfault.
This makes two changes:
* Only install ocaml docs if LLVM_BUILD_DOCS=ON.
* Add ocaml_doc target to ALL if LLVM_BUILD_DOCS=ON.
If LLVM_BUILD_DOCS=ON, this ensures that the docs are actually
built before an installation is attempted. For LLVM_BUILD_DOCS=OFF
(the default) this means that there is no attempt to install the
(non-built) ocaml docs anymore.
Fixes#154411.
Fixes#125437.
Fixes#108742.
This is in preparation for the new MultiMemRead packet discussed in the
RFC [1].
An alternative to using `qSupported` would be having clients send an
empty `MultiMemRead` packet. However, this is problematic because the
already-existing packet `M` is a prefix of `MultiMemRead`; an empty
reply would be ambiguous in this case. It is also risky that the stub
might interpret the `MultiMemRead` as a valid `M` packet.
Another advantage of `qSupported` is that this packet is already
exchanged, so parsing a new field is simpler than having to exchange one
extra packet.
[1]:
https://discourse.llvm.org/t/rfc-a-new-vectorized-memory-read-packet/88441
Originally `!unpredictable` could only appear on branches and switches,
but now it can also appear on selects. This change updates the LangRef
accordingly.
Motivated by this discussion:
https://github.com/llvm/llvm-project/pull/163208#discussion_r2426842999
We will soon want to emit language version codes into debug-info.
Instead of replicating the `LangOptions -> version code` mapping we
thought we'd try to share some of the logic with the Clang frontend.
This patch teaches `LangStandard` about language versions (currently just C++ and C).
Switch to the `.Cases({S0, S1, ...}, Value)` overload instead.
Update existing uses affected by the deprecation and the surrounding
code (for consistency).
This is a part of a larger cleanup of StringSwitch. The goal is to
eventually deprecate all manually-enumerated overloads of Cases with a
hardcoded number of case values (in favor of passing them via an
initializer list). You can find the full explanation here:
https://github.com/llvm/llvm-project/pull/163117.
Start small (6+ arguments) to keep the number of changes manageable.
This patch replaces LLVM_ATTRIBUTE_UNUSED with [[maybe_unused]]. Note
that this patch adjusts the placement of [[maybe_unused]] to comply
with the C++17 language.
This PR enhances the elementwise unrolling pattern to support higher
rank to lower rank unroll. The approach is to add leading unit dims to
lower rank targetShape to match the rank of original vector (because
ExtractStridedSlice requires same rank to extractSlices), extract slice,
reshape to targetShape's rank and perform the operation.
This PR fixes a bug in combineX86CloadCstore where an optimization was
being applied too broadly, causing incorrect code generation.
Without any assumptions about `X` this transformation is only valid when
`Y` is a non zero power of two/single-bit mask.
```cpp
// res, flags2 = sub 0, (and (xor X, -1), Y)
// cload/cstore ..., cond_ne, flag2
// ->
// res, flags2 = sub 0, (and X, Y)
// cload/cstore ..., cond_e, flag2
```
We can restrict the optimization to most important case, so only apply
when `llvm::isOneConstant(Op1.getOperand(1))`. It might be not trivial
to find code that creates a SelectionDag with other values of `Y`.
Basline test: https://github.com/llvm/llvm-project/pull/163354
- Add CLOSE map flag when USM is required.
- use_device_ptr: prevent implicitly expanding member operands.
- Fixes test offload/test/offloading/fortran/usm_map_close.f90.
This simply ignores 'delayed template parsing' when functions have late
parsed attributes, since these are not MSVC compatible anyway.
Besides ignoring the attribute, this would also cause a memory leak.
Switch to the `.Cases({S0, S1, ...}, Value)` overload instead, and the
manually-enumerated overloads with 6+ arguments are getting deprecated
in https://github.com/llvm/llvm-project/pull/163405.
This pre-commits API updates ahead of the deprecation to make potential
reverts cleaner. This was already reviewed in #163405.
When narrowing stores of a single-scalar, we currently use
ExtractLastElement, which extracts the last element across all parts.
This is not correct if the store's address is not uniform across all
parts. If it is only uniform-per-part, the last lane per part must be
extracted. Add a new ExtractLastLanePerPart opcode to handle this
correctly. Most transforms apply to both ExtractLastElement and
ExtractLastLanePerPart, with the only difference being their treatment
during unrolling.
Fixes https://github.com/llvm/llvm-project/issues/162498.
PR: https://github.com/llvm/llvm-project/pull/163056
Currently the "getter" option `-print-resource-dir` is visible when
typing `--help`, but the corresponding "setter" option `-resource-dir`
is not.
This option is useful when one is using clang on a non-standard
location, or when one is building a libtooling-based tool (e.g. IWYU)
based on a local clang build. In that case, we need to specify the
correct path to the resource directory for things to work.
Existing documentation already makes use of this option, for example
here:
https://clang.llvm.org/docs/StandardCPlusPlusModules.html#possible-issues-failed-to-find-system-headers
There is thus no reason to keep this option hidden from the help and
documentation.
---------
Co-authored-by: Carlos Gálvez <carlos.galvez@zenseact.com>
This aligns the simple types created by the native plugin with the ones
from DIA as well as LLVM and the original cvdump.
- A few type names weren't handled when creating the LLDB `Type` name
(e.g. `short`)
- 64-bit integers were created as `(u)int64_t` and are now created as
`(unsigned) long long` (matches DIA)
- 128-bit integers (only supported by clang-cl) weren't created as types
(they have `SimpleTypeKind::(U)Int128Oct`)
- All complex types had the same name - now they have `_Complex
<float-type>`
Some types like `SimpleTypeKind::Float48` can't be tested because they
can't be created in C++.
Whenever someone modifies an existing test that has `undef` in it the
github code formatter will complain so it's not easy to know if it's due
to a new or old use. I figured I may as well just do a simple sed
replace of undef with poison in all the tests to clean them up.
Hopefully it makes the contribution process a bit easier.
This PR adds a baseline test that exposes a bug in the current
`combineX86CloadCstore` optimization. The generated assembly
demonstrates incorrect behavior when the optimization is applied without
proper constraints.
Without any assumptions about `X` this transformation is only valid when
`Y` is a non zero power of two/single-bit mask.
```cpp
// res, flags2 = sub 0, (and (xor X, -1), Y)
// cload/cstore ..., cond_ne, flag2
// ->
// res, flags2 = sub 0, (and X, Y)
// cload/cstore ..., cond_e, flag2
```
In the provided test case, the value in `%al` is unknown at compile
time. If `%al` contains `0`, the optimization cannot be applied, because
`(and (xor X, -1), 0)` is not equal to `(and X, 0)`.
Fix: https://github.com/llvm/llvm-project/pull/163353
Support custom types (4/N): test that it is possible to customize memref
layout specification for custom operations and function boundaries.
This is purely a test setup (no API modifications) to ensure users are
able to pass information from tensors to memrefs within bufferization
process. To achieve this, a test pass is required (since bufferization
options have to be set manually). As there is already a
--test-one-shot-module-bufferize pass present, it is extended for the
purpose.
This updates the `MachineSMEABIPass` to find insertion points for state
changes (i.e., calls to ABI routines), where the NZCV register (status
flags) are not live.
It works by stepping backwards from where the state change is needed
until we find an instruction where NZCV is not live, a previous state
change, or a call sequence. We conservatively don't move into/over
calls, as they may require a different state before the start of the
call sequence.
The PSHUFB instruction shuffles bytes within each 128-bit lane: for each
control byte, if bit 7 is set, the output byte is zeroed; otherwise, the
low 4 bits select a source byte (0–15) from the same lane.
Note: _mm_shuffle_pi8 function had to change as __anyext128 had negative
indices which are invalid in constant expression context.
Fixes#156612
See the patch for details.
I tried to solve the defect left in previous refactorings
but found it was more complex. Add the comment to state
it more clearly.
I encountered a compilation crash issue, and after analysis, it was
caused by the AArch64PostCoalescerPass, see https://godbolt.org/z/vPeqeo5Pa.
When replacing the register, if the source register has undef flag, we
should propagate the flag to all uses of the destination register.
The register set information is stored as a singleton in
GetRegisterInfo_i386. However, other functions later access this
information assuming it is stored in GetSharedRegisterInfoVector. To
resolve this inconsistency, we remove the original construction logic
and instead initialize the singleton using llvm::call_once within the
appropriate function (GetSharedRegisterInfoVector_i386).