While merging profiles, some fields in the input header, e.g.
HashFunction, could be uninitialized leading to a UMR. Initialize merged
header with the first input header.
Fixes#109592
This fixes loop iteration count calculation if the step is
a negative value, where we should adjust the added
delta from `step-1` to `step+1` when doing the ceil div.
Remove flang/include/flang/Tools/CLOptions.inc - which was included as
is in - several places. Move the code in it to header and source files
which are used used in the "standard" way. Some minor cleanup such as
removing trailing whitespace and excessive newlines and reordering
entries alphabetically for files that were modified along the way.
Update the documentation that referenced CLOptions.inc.
The vector trip count must already be created when fixupIVUsers is
called. Don't pass the vector preheader there and delay retrieving the
vector loop header. This ensures we are re-using the already computed
trip count. Computing the trip count from scratch would not be correct,
as the IR may not be in a valid state yet.
This feature is supported via the newer option
`-fbasic-block-address-map`. Using the old option still works by
delegating to the newer option, while a warning is printed to show
deprecation.
When searching for an intrinsic name in a target specific slice of the
intrinsic name table, skip over the target prefix. For such cases,
currently the first loop iteration in `lookupLLVMIntrinsicByName` does
nothing (i.e., `Low` and `High` stay unchanged and it does not shrink
down the search window), so we can skip this useless first iteration by
skipping over the target prefix.
Validate that for target independent intrinsics the second dotted
component of their name (after the `llvm.`) does not match any existing
target names (for which atleast one intrinsic has been defined). Doing
so is invalid as LLVM will search for that intrinsic in that target's
intrinsic table and not find it, and conclude that its an unknown
intrinsic.
adds the flag `print_stats_on_exit` which mirrors nsan's same flag.
# Why?
Not only is this nice for the end users, this gives us a very trivial
way to test deduplication which is next up
Currently the style is something like:
```
RealtimeSanitizer exit stats:
Total error count: 488
```
We can lower f16/bf16 memory ops without promotion through the existing
custom lowering.
Some of the zero strided VP loads get combined to a VP splat, so we need
to also handle the lowering for that for f16/bf16 w/ zvfhmin/zvfbfmin.
This patch copies the lowering from ISD::SPLAT_VECTOR over to
lowerScalarSplat which is used by the VP splat lowering.
This patch implements sandboxir::Utils::memoryLocationGetOrNone() that calls
MemoryLocation::getOrNone() internally.
Ideally this would require a sandboxir::MemoryLocation, but this should
be good enough for now.
Set the mayLoad, mayStore, and hasSideEffects hints for NVPTX atomic
instructions. This prevents any optimizations (e.g. rematerialization)
from illegally duplicating them and generating broken code.
Some `FileManager` APIs still return `{File,Directory}Entry` instead of
the preferred `{File,Directory}EntryRef`. These are documented to be
deprecated, but don't have the attribute that warns on their usage. This
PR marks them as such with `LLVM_DEPRECATED()` and replaces their usage
with the recommended counterparts. NFCI.
Swap `!DisassembleZeroes` and `if (DumpARMELFData)` conditions so that
in the false DisassembleZeroes case (default), `...` will be printed for
long consecutive zeroes, even when a data mapping symbol is active.
This is especially useful for certain lld tests that insert a huge
padding within a code section. Without `...` the output will be huge.
Pull Request: https://github.com/llvm/llvm-project/pull/109553
The proprietary PS4 linker implicitly adds `=/target/lib` and `.` as
library search paths. This behaviour was added to the PS5 linker via a
downstream patch in LLD. This really belongs in the driver, instead.
This change adds the driver behaviour to allow removal of the downstream
patch in LLD.
There are no plans to update the PS4 linker behaviour in the analogous
way, so do not pass the same search paths to the PS4 linker.
SIE tracker: TOOLCHAIN-16704
The CI has been a complete mess for the past week, and the only thing
preventing it from being back is the Clang tidy checks. Disable them (as
a total hack) to get CI back.
Make modernize-use-nullptr matcher also match "NULL", but not "0", when
it appears on a substituted type of a template specialization.
Previously, any matches on a substituted type were excluded, but this
meant that a situation like the following is not diagnosed:
```c++
template <typename T>
struct X {
T val;
X() { val = NULL; } // should diagnose
};
```
When the user says `NULL`, we expect that the destination type is always
meant to be a pointer type, so this should be converted to `nullptr`. By
contrast, we do not propose changing a literal `0` in that case, which
appears as initializers of both pointer and integer specializations in
reasonable real code. (If `NULL` is used erroneously in such a
situation, it should be changed to `0` or `{}`.)
Once PR #107223 lands, ASan can be enabled on Solaris/SPARC. This patch
does just that. As on Solaris/x86, the dynamic ASan runtime lib needs to
be linked with `-z now` to avoid an `AsanInitInternal` cycle.
Tested on `sparcv9-sun-solaris2.11` and `sparc64-unknown-linux-gnu`.
An example of this is the -mpure-code option. Without a config file
being used, an error message will print `-mpure-code`. But if a config
file is used, the error message will print `-mexecute-only`.
And fix the diagnostics for __builtin_is_constant_evaluated(). We can be
in a non-constant context, but calling an immediate function always
makes the context constant for the duration of that call.
Follow-up from a previous patch that turned bind_c into an enum for
procedure attribute.
This patch carries the elemental and pure Fortran attribute into FIR so
that the optimizer can leverage that info in the future (I think debug
info may also need to know these aspects since DWARF has DW_AT_elemental
and DW_AT_pure nodes).
SIMPLE from F2023 will be translated once it is handled in the
front-end.
NON_RECURSIVE is only meaningful on func.func since we are not
guaranteed to know that aspect on the caller side (it is not part of
Fortran characteristics). There is a DW_AT_recursive DWARF node. I will
do it while dealing with func.func attributes.
Follow on to #109715
This better matches this same variable in asan, ubsan, hwasan, and nsan.
Shows the logical coupling, and describes them as "dynamic only" which
is their intent.
Patch adds basic support for non-power-of-2 number of elements in
operands. The patch still requires that this number addresses whole
registers.
Reviewers: RKSimon, preames
Reviewed By: preames
Pull Request: https://github.com/llvm/llvm-project/pull/107273
Set up these tests so these are marked as unsupported when targeting
z/OS. Most would already be unsupported if you ran lit on z/OS. However,
they also need to be unsupported if the default triple is z/OS.
https://github.com/llvm/llvm-project/pull/109485 tried to std::min
between size_t and uint64_t. size_t on 32 bit is 32 bits.
https://lab.llvm.org/buildbot/#/builders/18/builds/4430/steps/4/logs/stdio
Explicitly select the size_t template to fix this.
This will truncate one of the arguments but that's the count_requested.
If you're debugging from a 32 bit host and you asked it to read
> 32 bit range of memory from a 64 bit target, you weren't going
to have any success anyway.
The final result needs to be size_t to resize the vector with.
This follows in the spirit of 7d82c99403,
and extends the costing API for compares and selects to provide
information about the operands passed in an analogous manner. This
allows us to model the cost of materializing the vector constant, as
some select-of-constants are significantly more expensive than others
when you account for the cost of materializing the constants involved.
This is a stepping stone towards fixing
https://github.com/llvm/llvm-project/issues/109466. A separate SLP patch
will be required to utilize the new API.
Previously SLP vectorize supported clustered vectorization for loads
only. This patch adds support for "clustered" vectorization for other
instructions.
If the buildvector node contains "clusters", which can be vectorized
separately and then inserted into the resulting buildvector result, it
is better to do, since it may reduce the cost of the vector graph and
produce better vector code.
The patch does some analysis, if it is profitable to try to do this kind
of extra vectorization. It checks the scalar instructions and its
operands and tries to vectorize them only if they result in a better
graph.
Reviewers: RKSimon
Reviewed By: RKSimon
Pull Request: https://github.com/llvm/llvm-project/pull/108430