This is another step towards supporting DWARF5 checksums and inline
source code in LLDB. This is a reland of #85468 but without the
functional change of storing the support file from the line table (yet).
DWARF 5 allows separating filenames from .debug_line into a separate
.debug_line_str section. The strings are referenced relative to the
start of the .debug_line_str section. Previously, on COFF, the
relocation information instead caused offsets to be relocated to the
base address of the COFF file. This led to wrong offsets in linked
COFF (PE) files, which caused the debugger to be unable to find the
correct source files.
This patch fixes this problem by making the offsets relative to the
start of the .debug_line_str section instead. There should be no
changes for ELF files as everything seems to be working there.
A test is also added to ensure that the correct relocation entries are
emitted.
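As a rough illustration of the difference, the sketch below uses a made-up
layout. `sectionRelative` and `imageRelative` are hypothetical helpers, not
LLVM functions, and all addresses are invented. Only the section-relative
value can serve as an index into .debug_line_str.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical helper: offset of a string measured from the start of its
// section.
constexpr uint64_t sectionRelative(uint64_t stringAddr, uint64_t sectionStart) {
  return stringAddr - sectionStart;
}

// Hypothetical helper: offset of the same string measured from the image base.
constexpr uint64_t imageRelative(uint64_t stringAddr, uint64_t imageBase) {
  return stringAddr - imageBase;
}

// Made-up layout: the section starts 0x5000 bytes into the image and the
// string sits 0x10 bytes into the section.
constexpr uint64_t kImageBase = 0x140000000;
constexpr uint64_t kSectionStart = kImageBase + 0x5000;
constexpr uint64_t kStringAddr = kSectionStart + 0x10;

static_assert(sectionRelative(kStringAddr, kSectionStart) == 0x10,
              "usable as an index into the string section");
static_assert(imageRelative(kStringAddr, kImageBase) == 0x5010,
              "points past the intended string when used as a section index");
```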
This reapplies 272d1b44ef (from #85756),
which was reverted in
407937036f.
In the previous attempt, an empty CMAKE_INSTALL_PREFIX was handled by
quoting it, in d209d1340b. That made the
calls to cmake_path(ABSOLUTE_PATH) succeed, but the output paths of that
weren't actually absolute, which was required by file(RELATIVE_PATH).
Avoid this issue by constructing a non-empty base directory variable
to use for calculating the relative path.
I believe we can use XLen alignment as long as eliminateFrameIndex
limits the maximum folded offset to 2043. This way when we split
the load/store into two instructions we'll be able to add 4
without overflowing simm12.
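The arithmetic behind the 2043 limit can be checked with a small sketch.
`fitsSImm12` and `splitAccessFits` are hypothetical names, not LLVM functions,
and the sketch assumes the second half of the split access sits exactly 4
bytes after the first.

```c++
#include <cassert>
#include <cstdint>

// Hypothetical helper: true if `v` is encodable as a signed 12-bit immediate.
constexpr bool fitsSImm12(int64_t v) { return v >= -2048 && v <= 2047; }

// Hypothetical helper: a split access uses offsets `off` and `off + 4`, so
// both must be encodable.
constexpr bool splitAccessFits(int64_t off) {
  return fitsSImm12(off) && fitsSImm12(off + 4);
}

static_assert(splitAccessFits(2043), "2043 + 4 == 2047 still fits");
static_assert(!splitAccessFits(2044), "2044 + 4 == 2048 overflows");
```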
Allow targets to rely on TargetLowering::isGuaranteedNotToBeUndefOrPoisonForTargetNode to handle nodes for which canCreateUndefOrPoisonForTargetNode returns false and all operands are isGuaranteedNotToBeUndefOrPoison.
Targets can still perform this themselves for specific special case nodes (e.g. target shuffles).
Matches the fallback in SelectionDAG::isGuaranteedNotToBeUndefOrPoison
NFC.
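A minimal sketch of the fallback rule described above, using a toy node type
rather than the real SelectionDAG classes. The `Node` struct, its fields, and
the depth limit are assumptions made for illustration only.

```c++
#include <cassert>
#include <vector>

// Toy stand-in for a DAG node; this is not the real SDNode.
struct Node {
  bool canCreateUndefOrPoison = false; // hypothetical per-node property
  bool isKnownLeaf = false;            // e.g. a constant known to be well defined
  std::vector<const Node *> operands;
};

// Generic fallback: a node is safe if it cannot create undef/poison itself
// and every operand is safe. Unknown leaves are conservatively rejected.
bool isGuaranteedNotToBeUndefOrPoison(const Node &n, unsigned depth = 0) {
  constexpr unsigned kMaxDepth = 6; // arbitrary recursion limit for the sketch
  if (n.isKnownLeaf)
    return true;
  if (depth >= kMaxDepth || n.operands.empty())
    return false;
  if (n.canCreateUndefOrPoison)
    return false;
  for (const Node *op : n.operands)
    if (!isGuaranteedNotToBeUndefOrPoison(*op, depth + 1))
      return false;
  return true;
}
```

A target that needs different behaviour for a particular node (a shuffle, say)
would handle that node before falling back to this rule.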
gfx11_asm_vinterp.s already contained GFX12 run lines. Rename the
assembler and disassembler tests to be sorted based on real16 or fake16
instead of gfxip. Note, both GFX11 and GFX12 currently only have fake16
(fake16 in encoding, but not by name) upstream, so that is why the test
files have a -fake16 suffix.
One test input is changed, and that is the disassembler test for
unsupported bits in the instruction. It is now an input that is valid on
both GFX11 and GFX12. This was necessary because the size of the opcode
field changed.
This adds a new API built with the `ValueBoundsConstraintSet` to compute
the bounds of possibly scalable quantities. It uses knowledge of the
range of vscale (which is defined by the target architecture) to solve
for the bound as either a constant or an expression in terms of vscale.
The result is an `AffineMap` that will always take at most one
parameter, vscale, and returns a single result, which is the bound of
`value`.
The API is defined as follows:
```c++
FailureOr<ConstantOrScalableBound>
vector::ScalableValueBoundsConstraintSet::computeScalableBound(
Value value, std::optional<int64_t> dim,
unsigned vscaleMin, unsigned vscaleMax,
presburger::BoundType boundType,
bool closedUB = true,
StopConditionFn stopCondition = nullptr);
```
Note: `ConstantOrScalableBound` is a thin wrapper over the `AffineMap`
with a utility for converting the bound to a single quantity (i.e. a
size and scalable flag).
We believe this API could prove useful downstream in IREE (which uses a
similar analysis to hoist allocas, which currently fails for scalable
vectors).
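To make the "size and scalable flag" idea concrete, here is a toy model that
does not use MLIR at all. It assumes a bound of the form
`scale * vscale + offset`. `ToyBound`, `ToySize`, `toSize` and
`constantUpperBound` are hypothetical names for this sketch, not part of the
API above.

```c++
#include <cassert>
#include <cstdint>
#include <optional>

// Toy model of a bound of the form `scale * vscale + offset`.
struct ToyBound {
  int64_t scale;
  int64_t offset;
};

// A single quantity: a size plus a flag saying whether it scales with vscale.
struct ToySize {
  int64_t size;
  bool scalable;
};

// Collapse the bound to a single quantity when that is possible: either a
// plain constant, or a pure multiple of vscale. Mixed forms are rejected.
std::optional<ToySize> toSize(ToyBound b) {
  if (b.scale == 0)
    return ToySize{b.offset, false};
  if (b.offset == 0)
    return ToySize{b.scale, true};
  return std::nullopt;
}

// Largest value the bound can take for vscale in [vscaleMin, vscaleMax].
int64_t constantUpperBound(ToyBound b, unsigned vscaleMin, unsigned vscaleMax) {
  assert(vscaleMin <= vscaleMax && "empty vscale range");
  int64_t vs = b.scale >= 0 ? vscaleMax : vscaleMin;
  return b.scale * vs + b.offset;
}
```

For instance, a bound of `4 * vscale` collapses to size 4 with the scalable
flag set, and with vscale limited to [1, 16] its constant upper bound is 64.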
Make sure all float/double comparison intrinsics specify what happens
with a NaN input. Update some existing descriptions of comparison
results to make them all consistent.
Also replace "yields" with "returns" throughout.
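For readers unfamiliar with why NaN needs to be called out, the scalar C++
sketch below shows the two behaviours a comparison can have when an operand
is NaN. It uses ordinary `double` comparisons, not the intrinsics themselves,
and `orderedLess` and `notLess` are hypothetical names.

```c++
#include <cassert>
#include <cmath>
#include <limits>

// "Less than" in the ordered sense: false whenever either operand is NaN.
bool orderedLess(double a, double b) { return a < b; }

// "Not less than": true whenever either operand is NaN, because it is the
// negation of an ordered comparison.
bool notLess(double a, double b) { return !(a < b); }
```

With `nan = std::numeric_limits<double>::quiet_NaN()`, `orderedLess(nan, 1.0)`
is false while `notLess(nan, 1.0)` is true, so a description that only says
"compares for less than" leaves the NaN case ambiguous.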
The existing implementation of tosa-layerwise-constant-fold only works
for constant values backed by DenseElementsAttr. For constants which
hold DenseResourceAttrs, the folder will end up asserting at runtime, as
it assumes that the backing data can always be accessed through
ElementsAttr::getValues.
This change reworks the logic so that the types used to perform
folding are based on whether the ElementsAttr can be converted to a
range of that particular type.
---------
Co-authored-by: Spenser Bauman <sabauma@mathworks.com>
Co-authored-by: Tina Jung <tinamaria.jung@amd.com>
This patch contains slight modifications to the reverted PR #85258 to
avoid issues with constructs containing multiple reduction clauses,
uncovered by a test on the gfortran testsuite.
This reverts commit 9f80444c2e.
Add --skip-symbol and --skip-symbols options that allow skipping symbols
when executing other options that can change the symbol's name, binding
or visibility, similar to an existing option --keep-symbol that keeps a
symbol from being removed by other options.
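The intended effect can be sketched with a toy renaming step. This is not the
tool's implementation: `prefixSymbols` is a hypothetical function, and adding
a prefix merely stands in for any option that changes a symbol's name.

```c++
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model: add `prefix` to every symbol name unless the name is in `skip`,
// in which case the symbol is passed through unchanged.
std::vector<std::string> prefixSymbols(const std::vector<std::string> &names,
                                       const std::string &prefix,
                                       const std::set<std::string> &skip) {
  std::vector<std::string> out;
  out.reserve(names.size());
  for (const std::string &name : names)
    out.push_back(skip.count(name) ? name : prefix + name);
  return out;
}
```

Given the names `foo` and `bar`, the prefix `p_`, and `bar` in the skip set,
the result is `p_foo` and `bar`.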
Move 3 interface headers in `//mlir:IR` from `hdrs` to `srcs`.
Header files should not be added to multiple targets, but this is hard
to avoid because CMake is less strict with headers. But we should at
least avoid exposing them as headers by multiple targets because it
confuses tooling.
Add ones for every high bit that will be cleared.
This will allow us to evaluate variables that have their bits known to
see if they have no risk of overflow despite the shift amount being
greater than the difference between the two types.
When building the OpenMP runtime with libomptarget support, the runtimes
configure step needs to have a dependency on various tools, in
particular opt, so that cmake configure checks yield the correct
results.
This did not work correctly, as the dependencies were only added if the
OPENMP_ENABLE_LIBOMPTARGET was set - but that variable is only set by
the openmp/CMakeLists.txt file, which isn't even parsed during the
initial cmake run (in fact, it is only parsed when executing the
runtimes configure step itself, but then it is too late).
Fixed by just adding those dependencies always.
In addition, the list of dependencies collected in ${extra_deps},
including those required for OpenMP, was only actually used when
configuring runtimes for the default set of targets - when the user
specifies a non-default LLVM_RUNTIME_TARGETS, those extra dependencies
were ignored (with the exception of ${hdrgen_deps}).
Fixed by passing the full ${extra_deps} in this case as well.
Fixes: https://github.com/llvm/llvm-project/issues/85933
Commit a7d5f73a03 introduced an
error in a target_compile_definitions call for SystemZ, causing
the build to break. Fixed by adding the missing "PRIVATE".
`copy_n` has been used to allow constant evaluation of `char_traits`. We
now have `__constexpr_memmove`, which `copy_n` just forwards to. We can
call `__constexpr_memmove` directly, avoiding a bunch of instantiations.
This reduces the time it takes to include `<string>` from 321ms to
285ms.
This is compatible with MSVC: `-machine:arm64x` is essentially an alias
to `-machine:arm64ec`. To make a type library that exposes both native
and EC symbols, an additional `-defArm64Native` argument is needed in
both cases.
Refactor the logic that checks if a module contains mixed
absolute/non-lowered LDS GVs.
The check now happens later, when the "worklists" are formed. This is
because in some cases (OpenMP) we can have non-lowered GVs in a lowered
module, and this is normal because those GVs are just unused and removed
from the list at some point before the end of `getUsesOfLDSByFunction`.
Doing the check later ensures that if a mixed module is spotted, then
it's a _real_ mixed module that needs rejection, not a module containing
an intentionally ignored GV.
We currently always lower shuffle to the struct-returning variant. I saw
some cases where this survived all the way through PTX, resulting in
increased register usage. The easiest fix is to simply lower to the
single-result version when the predicate is unused.
We don't have a concat_vector shuffle kind and improveShuffleKindFromMask won't alter the base type to match it as InsertSubvector.
But since this is how X86 will lower concat_vector anyhow, just recognise it explicitly.
Another step for #67803