This PR adds new patterns to improve the generated vector code for the emulation of any conversion that has to go through an i4 -> i8 type extension (only signed extensions are supported for now). This will impact any i4 -> i8/i16/i32/i64 signed extensions as well as sitofp i4 -> f8/f16/f32/f64.
The asm code generated for the supported cases is significantly better after this PR for both x86 and aarch64.
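As an illustration of what an emulated i4 -> i8 signed extension involves, here is a minimal sketch that unpacks two i4 values from one byte using a shift-left followed by an arithmetic shift-right. This is one common technique, not necessarily the exact pattern this PR emits; the function name and the low/high nibble order are assumptions made for the example.

```python
def sext_i4_pair(byte: int) -> tuple[int, int]:
    """Sign-extend the two i4 values packed in one byte to signed 8-bit values."""

    def as_i8(x: int) -> int:
        # Reinterpret the low 8 bits as a two's-complement signed value.
        x &= 0xFF
        return x - 0x100 if x & 0x80 else x

    # Low nibble: shift left so its sign bit lands in bit 7, then shift back
    # arithmetically (Python's >> on negative ints is an arithmetic shift).
    low = as_i8(byte << 4) >> 4
    # High nibble: already in the top four bits, so one arithmetic shift suffices.
    high = as_i8(byte) >> 4
    return low, high
```

For example, the byte 0xF7 holds 7 in its low nibble and -1 in its high nibble.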
Summary:
FileEntry.Dir can be empty if debug info only contains relative path.
This caused an assertion failure when gsym segmentation is trying to
copy a file entry with empty dir. As the first entry of StringTable is
always empty (and is preserved), `StringOffsetMap` doesn't have key 0.
Hence, `find(0)` returns `End` and `operator->()` fails the assertion.
Test Plan:
./bin/llvm-lit -sv llvm/test/tools/llvm-gsymutil/X86/elf-empty-dir.yaml
A recent patch to allow pFUnit to compile softened the diagnostic about
indistinguishable specific procedures to a portability warning. It turns
out that this was overkill -- for specific procedures containing no
optional or unlimited polymorphic dummy data arguments, a diagnosis of
"indistinguishable" can still be a hard error.
So adjust the analysis to be tri-state: two procedures are either
definitely distinguishable, definitely indistinguishable without
optionals or unlimited polymorphics, or indeterminate. Emit errors as
before for the definitely indistinguishable cases; continue to emit
portability warnings for the indeterminate cases.
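The tri-state result and its mapping to diagnostics can be sketched as follows. All names here are hypothetical and chosen for illustration; they are not the identifiers used in the compiler.

```python
from enum import Enum
from typing import Optional


class Distinguishability(Enum):
    DISTINGUISHABLE = "distinguishable"
    INDISTINGUISHABLE = "indistinguishable"  # no optionals / unlimited polymorphics involved
    INDETERMINATE = "indeterminate"


def classify(rules_distinguish: bool,
             has_optional_or_unlimited_poly: bool) -> Distinguishability:
    # Hypothetical inputs: whether the standard's rules tell the two specifics
    # apart, and whether either has optional or unlimited polymorphic dummies.
    if rules_distinguish:
        return Distinguishability.DISTINGUISHABLE
    if has_optional_or_unlimited_poly:
        return Distinguishability.INDETERMINATE
    return Distinguishability.INDISTINGUISHABLE


def diagnostic_for(state: Distinguishability) -> Optional[str]:
    # Hard error only when indistinguishability is certain.
    if state is Distinguishability.INDISTINGUISHABLE:
        return "error"
    if state is Distinguishability.INDETERMINATE:
        return "portability warning"
    return None
```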
When this patch is merged, all but one of the dozen or so tests that I
disabled in llvm-test-suite can be re-enabled.
Some metadata needs to be updated/finalized before the binary context is
emitted into the binary. Add the interface and use it for Linux ORC
update invocation.
Summary:
Recent patches have implemented builtin versions of these functions.
This patch simply removes uses of inline assembly to hopefully improve
optimizations in this area.
Summary:
The previous patch missed the `U` prefix, which caused the mask to be
considered signed. This meant that conversions would incorrectly treat a
full mask as a negative number and break things.
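A small sketch of why the signedness matters, assuming a 32-bit mask that is later widened to 64 bits (the widths are an assumption for the example; the message does not state them):

```python
import ctypes

FULL_MASK = 0xFFFFFFFF  # assumed 32-bit mask with every bit set

# The same bit pattern read as signed and as unsigned.
as_signed = ctypes.c_int32(FULL_MASK).value
as_unsigned = ctypes.c_uint32(FULL_MASK).value

# Widening to 64 bits preserves the value, so the signed reading
# fills the upper 32 bits with copies of the sign bit.
widened_signed = ctypes.c_uint64(as_signed).value
widened_unsigned = ctypes.c_uint64(as_unsigned).value
```

Here the signed reading is negative and widens to a value with all 64 bits set, while the unsigned reading keeps only the low 32 bits set.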
The `acc.declare_action` attribute was sometimes misplaced, as reported in
#79770.
This patch updates the lowering code to place the
postAllocate/postDeallocate actions at the correct place.
When there are multiple USE statements for a particular module using
renaming, it is necessary to collect a set of all of the original
renaming targets before processing any of the USE statements that don't have
ONLY: clauses.
Currently, if there is a name in the module that can't be added to the
current scope -- due to a conflict with an internal or module
subprogram, or with a previously use-associated name -- the compiler
will emit a bogus error message even if that conflicting name appears on
a later USE statement of the same module as the target of a renaming.
The new regression test case added with this patch provides a motivating
example.
When a generic procedure interface, either declared or the result of
merging two use-associated generics, has two specific procedures
that are not distinguishable according to the rules in F'2023
subclause 15.4.3.4.5, emit a portability warning rather than a
hard error message. The rules in that subclause are not adequate
to detect pairs of specific procedures that admit an ambiguous
reference, as demonstrated by a case that arose in pFUnit. Further,
these distinguishability checks, even if sufficient to the task
of detecting pairs of specifics capable of ambiguous references,
should only apply to pairs where *every* reference would have to
be ambiguous -- and this can be and is validated at every reference
anyway. Last, only XLF enforces these incomplete and needless
distinguishability rules -- every other compiler seems to just
check that each procedure reference resolves to exactly one
specific procedure.
If the standard were to completely lose subclause 15.4.3.4.5 and
its related note (C.11.6) -- which admits that the rules are
incomplete! -- and simply require that each generic procedure
reference resolve unambiguously to exactly one specific, nobody
would miss them. This patch changes this compiler to give them
lip service when requested, but they are now otherwise ignored.
Don't rewrite 0*X to 0 if X is not scalar. Up until now this hasn't
shown up as a bug because a scalar 0 works in nearly all expressions
where an array would be expected. But not in all cases -- this bad
rewrite can cause generic procedure resolution to fail when it causes an
actual argument to have an unsupported rank.
The __ehdr_start symbol is added by the linker and points to the ELF
file headers, which can be very far away from text. Treat it as a large
symbol under the medium/large code models. Performance to access
__ehdr_start is almost certainly not important.
There are a couple of other symbols that the linker adds [1], but this
is the most relevant one that may be far away from text.
[1]
547c395b27/lld/ELF/Writer.cpp (L226)
Scudo grabs all allocator locks in a pthread_atfork before the fork, and releases them after. This allows malloc to be used in a fork child of a multithreaded process, which is expressly forbidden by the standard, but very widely used. For example, Android's init uses std::string after fork when spawning services in android::init::EnterNamespaces and other places.
Any lock that is necessary to serve an allocator call must be handled this way. Otherwise there is a possibility that the lock is held during the call to fork, which results in it being held forever in the child process, and the next operation that needs it deadlocks.
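The lock-around-fork pattern described above can be sketched with Python's `os.register_at_fork`, which plays a role similar to `pthread_atfork` on Unix. This is an analogue for illustration only, not Scudo's code; `allocator_lock` is a hypothetical stand-in for a single allocator lock.

```python
import os
import threading

# Hypothetical lock standing in for one allocator lock.
allocator_lock = threading.Lock()


def before_fork() -> None:
    # Take the lock so no other thread can hold it at the moment of fork.
    allocator_lock.acquire()


def after_fork() -> None:
    # Release in both parent and child so neither is left with a held lock.
    allocator_lock.release()


# os.register_at_fork is only available on Unix-like platforms.
if hasattr(os, "register_at_fork"):
    os.register_at_fork(
        before=before_fork,
        after_in_parent=after_fork,
        after_in_child=after_fork,
    )
```

A lock left out of this scheme could be held by another thread at fork time, and the child would then block forever on its first attempt to take it.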
When a real-valued reference to the MOD/MODULO intrinsic functions has
operands that are exact integers, use the fast exact integer algorithm
rather than calling std::fmod.
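A minimal sketch of the idea, assuming the usual Fortran semantics (MOD takes the sign of the first operand, MODULO the sign of the second). The function names are hypothetical and this is not the runtime's implementation, only an illustration of taking an exact integer path when both real operands hold integer values.

```python
import math


def real_mod(a: float, p: float) -> float:
    """MOD-style remainder: the result takes the sign of a (truncated division)."""
    if a.is_integer() and p.is_integer() and p != 0:
        ia, ip = int(a), int(p)
        r = abs(ia) % abs(ip)  # exact integer arithmetic
        return float(-r if ia < 0 else r)
    return math.fmod(a, p)  # general floating-point path


def real_modulo(a: float, p: float) -> float:
    """MODULO-style remainder: the result takes the sign of p (floored division)."""
    if a.is_integer() and p.is_integer() and p != 0:
        return float(int(a) % int(p))  # Python's % already floors
    r = math.fmod(a, p)
    # Adjust the truncated remainder when its sign disagrees with p.
    return r + p if r != 0 and (r < 0) != (p < 0) else r
```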
The runtime type information table generator couldn't correctly handle
a null pointer returned for an original (not instantiated) derived
type with kind parameters.
Fixes https://github.com/llvm/llvm-project/issues/79590.
Update loop interleaving count computation to address loops that require at least one scalar iteration in the epilogue loop. For this case, the available trip count for interleaving the loop is one less.
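The effect on the interleave count can be sketched as below. The function, its parameters, and the power-of-two clamp are assumptions made for illustration; only the "one less when a scalar epilogue iteration is required" adjustment comes from the description above.

```python
def interleave_count(trip_count: int, vf: int,
                     requires_scalar_epilogue: bool,
                     target_max_ic: int) -> int:
    # One iteration is reserved for the scalar epilogue when it is required.
    available = trip_count - 1 if requires_scalar_epilogue else trip_count
    # Number of full vector iterations that fit in the available trip count.
    vector_iters = available // vf
    limit = min(vector_iters, target_max_ic)
    # Assumed policy: pick the largest power of two not exceeding the limit.
    ic = 1
    while ic * 2 <= limit:
        ic *= 2
    return ic
```

With a trip count of 16 and a vector factor of 4, reserving one scalar iteration leaves only three full vector iterations, so the count drops from 4 to 2 under this sketch.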
The interaction between --warn-backrefs and --defsym was not tested, but if
a --defsym-created reference causes archive member extraction, it seems
reasonable to suppress the diagnostic, which was the behavior before #78944.
Update code from https://reviews.llvm.org/D138847
`buildTestVector` is a standard DFS (walking a reduced ordered binary
decision diagram). Avoid shouldCopyOffTestVectorFor{True,False}Path
complexity and redundant `Map[ID]` lookups.
`findIndependencePairs` unnecessarily uses four nested loops (n<=6) to
find independence pairs. Instead, enumerate the two execution vectors
and find the number of mismatches. This algorithm can be optimized using
the marking function technique described in _Efficient Test Coverage
Measurement for MC/DC, 2013_, but this may be overkill.
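The pairwise enumeration can be sketched as follows. This is a simplified illustration, not the code in the patch: the representation of an execution vector and the choice to treat a not-evaluated condition as matching anything are assumptions.

```python
from itertools import combinations
from typing import Optional

# One execution vector: a value per condition (True, False, or None when the
# condition was not evaluated because of short-circuiting) plus the outcome.
ExecVector = tuple[tuple[Optional[bool], ...], bool]


def find_independence_pairs(vectors: list[ExecVector]) -> dict[int, tuple[int, int]]:
    """Map each condition index to the first pair of vectors showing its independence."""
    pairs: dict[int, tuple[int, int]] = {}
    for (i, (conds_a, res_a)), (j, (conds_b, res_b)) in combinations(enumerate(vectors), 2):
        if res_a == res_b:
            continue  # an independence pair must change the outcome
        # Count conditions evaluated in both vectors whose values differ.
        mismatches = [k for k, (x, y) in enumerate(zip(conds_a, conds_b))
                      if x is not None and y is not None and x != y]
        if len(mismatches) == 1:
            pairs.setdefault(mismatches[0], (i, j))
    return pairs
```

For `a && b` with the vectors (F, -) -> F, (T, F) -> F, and (T, T) -> T, the first and third vectors form a pair for `a`, and the second and third form a pair for `b`.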
This patch removes the llvm:: prefix within llvm-exegesis where it is
not necessary. This covers most occurrences of the prefix within exegesis,
since exegesis is within the llvm namespace. This patch makes things more
consistent, as the vast majority of the code did not use the llvm::
prefix for anything.
The upload-artifact@v3 action is using Node 16, which is reaching EOL.
As a result, we are getting warnings prompting us to move our jobs over
to the latest version of upload-artifact.
Using dyn_cast allows us to use CastInst::getOperand instead of
Instruction::getOperand. This is more efficient since
CastInst::getOperand doesn't need to check how the operands are stored.
Instruction::getOperand has to consider HungOffUses.
demote the tree entry.
Need to check if all user nodes are marked for demotion before demoting
the node. Otherwise, some data info might be lost after vectorization.
Temporarily revert to unblock the CI bots; this is breaking the
-DLLVM_ENABLE_MODULES=On modules-style build. I've notified Ismail.
This reverts commit 888501bc63.
This patch contains a set of pre-commit tests for changing the loop interleaving count computation in a subsequent patch in order to address loops that need to execute at least a single scalar iteration in the epilogue.
The very beginning already talks about how to git clone the repo. The
section about checking out specific versions doesn't really belong in
GettingStarted and seems unnecessary.
In Sema in `BuildReturnStmt(...)`, when we try to determine if the type
is move eligible or copy elidable, we don't currently check whether the init
of the `VarDecl` contains errors. This can lead to a crash since
we may send a type that is not complete into `getTypeInfo(...)`, which
does not allow this.
This fixes: https://github.com/llvm/llvm-project/issues/63244
https://github.com/llvm/llvm-project/issues/79745