This patch adds tests for umax(x, 1u).
This patch fixes:
https://github.com/llvm/llvm-project/issues/60233
It turns out that commit 86b4d8645f on
Feb 8, 2023 already performs the instcombine transformation proposed
in the issue, so the issue requires no change on the codegen side.
In this file, most lines have no trailing spaces,
but some do. Remove the trailing spaces for
consistency.
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D146697
By making the 64-bit integer literals unsigned. Otherwise some of them
are unexpectedly sign-extended (and the compiler rightly diagnosed this
with warnings).
Initially added in D80506.
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D146667
On targets without ADDCARRY or ADDE, we need to emit a separate
SETCC to determine carry from the low half to the high half.
The high half is calculated by a series of ADDs.
When RHSLo and RHSHi are -1, without this patch, we get:
Hi = (add (add LHSHi, (setult Lo, LHSLo)), -1)
Whereas with the patch we get:
Hi = (sub LHSHi, (seteq LHSLo, 0))
If only RHSLo is -1, we can instead do (setne Lo, 0).
Similar to gcc: https://godbolt.org/z/M83f6rz39
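The equivalence for the all-ones case can be checked with a small model. This is only a sketch, assuming a 64-bit add split into 32-bit halves; the function names are hypothetical and are not part of the legalizer:

```python
MASK32 = 0xFFFFFFFF  # assumption: 32-bit halves of a 64-bit value

def add64_generic(lhs_lo, lhs_hi, rhs_lo, rhs_hi):
    """Expand a 64-bit add into halves, deriving the carry with a compare."""
    lo = (lhs_lo + rhs_lo) & MASK32
    carry = 1 if lo < lhs_lo else 0           # models (setult Lo, LHSLo)
    hi = (lhs_hi + carry + rhs_hi) & MASK32   # models the series of ADDs
    return lo, hi

def add64_minus_one(lhs_lo, lhs_hi):
    """Special case RHSLo == RHSHi == -1: Hi = LHSHi - (LHSLo == 0)."""
    lo = (lhs_lo + MASK32) & MASK32
    borrow = 1 if lhs_lo == 0 else 0          # models (seteq LHSLo, 0)
    hi = (lhs_hi - borrow) & MASK32           # models (sub LHSHi, ...)
    return lo, hi
```

Both functions return the same `(lo, hi)` pair whenever the right-hand side is -1, since subtracting one only borrows from the high half when the low half is zero.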
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D146635
Following up on the comments in https://reviews.llvm.org/D144108 this
patch refactors the im2col conversion patterns for `linalg.conv_2d_nhwc_hwcf`
and `linalg.conv_2d_nchw_fchw` convolutions to use gather semantics for the im2col
packing `linalg.generic`.
Follow-up work can include a similar pattern for depthwise convolutions
and a generalization of the patterns here to work with any `LinalgOp` as
well.
Differential Revision: https://reviews.llvm.org/D144678
Plumbing from the language level to the assume intrinsics with
separate_storage operand bundles.
Patch by David Goldblatt (davidtgoldblatt)
Differential Revision: https://reviews.llvm.org/D136515
Misc. cleanups for `WebAssemblyDebugValueManager`.
- Use `Register` for registers
- Simpler for loop iteration
- Rename a variable
- Reorder methods
- Reduce `SmallVector` size for `DBG_VALUE`s to 1; one def usually has
a single `DBG_VALUE` attached to it
- Add a few more lines of comments
Reviewed By: dschuff
Differential Revision: https://reviews.llvm.org/D146743
Sometimes the clang driver will receive a target triple where the
deployment version is too low to support the platform + arch. In those
cases, the compiler upgrades the final minOS which is what gets recorded
ultimately by the linker in LC_BUILD_VERSION. TextAPI should also reuse
this logic for capturing minOS in recorded TBDv5 files.
Reviewed By: ributzka
Differential Revision: https://reviews.llvm.org/D145690
It's possible to segfault in `DevirtModule::applyICallBranchFunnel` when
attempting to call `getCaller` on a call base that was erased in a prior
iteration. This can occur when attempting to find devirtualizable calls
via `findDevirtualizableCallsForTypeTest` if the vtable passed to
llvm.type.test is a global and not a local. The function works by taking
the first argument of the llvm.type.test call (which is a vtable),
iterating through all uses of it, and adding any relevant uses that
are calls associated with that intrinsic call to a vector. For most
cases where the vtable is actually a *local*, this wouldn't be an issue.
Take for example:
```
define i32 @fn(ptr %obj) #0 {
%vtable = load ptr, ptr %obj
%p = call i1 @llvm.type.test(ptr %vtable, metadata !"typeid2")
call void @llvm.assume(i1 %p)
%fptr = load ptr, ptr %vtable
%result = call i32 %fptr(ptr %obj, i32 1)
ret i32 %result
}
```
`findDevirtualizableCallsForTypeTest` will check the call base
`%result = call i32 %fptr(ptr %obj, i32 1)`, find that it is associated with a
virtualizable call from `%vtable`, find all loads for `%vtable`, and add
any instances where those load results are called to a vector. Now consider
the case where instead `%vtable` was the global itself rather than a
local:
```
define i32 @fn(ptr %obj) #0 {
%p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2")
call void @llvm.assume(i1 %p)
%fptr = load ptr, ptr @vtable
%result = call i32 %fptr(ptr %obj, i32 1)
ret i32 %result
}
```
`findDevirtualizableCallsForTypeTest` should work normally and add one
unique call instance to a vector. However, if there are multiple
instances where this same global is used for llvm.type.test, like with:
```
define i32 @fn(ptr %obj) #0 {
%p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2")
call void @llvm.assume(i1 %p)
%fptr = load ptr, ptr @vtable
%result = call i32 %fptr(ptr %obj, i32 1)
ret i32 %result
}
define i32 @fn2(ptr %obj) #0 {
%p = call i1 @llvm.type.test(ptr @vtable, metadata !"typeid2")
call void @llvm.assume(i1 %p)
%fptr = load ptr, ptr @vtable
%result = call i32 %fptr(ptr %obj, i32 1)
ret i32 %result
}
```
Then each call base `%result = call i32 %fptr(ptr %obj, i32 1)` will be
added to the vector twice. This is because for either call base
`%result = call i32 %fptr(ptr %obj, i32 1)`, we determine it is associated with
a virtualizable call from `@vtable`, and then we iterate through all the
uses of `@vtable`, which is used across multiple functions. So when
scanning the first `%result = call i32 %fptr(ptr %obj, i32 1)`,
both call bases are added to the vector, and when scanning the
second one, both call bases are added again, resulting in duplicate call
bases in the `CSInfo.CallSites` vector.
Note this is actually accounted for in every other place where WPD iterates
over CallSites. Everything else adds the call base
to the `OptimizedCalls` set and checks whether it's already in the set.
We can't reuse that particular set since it serves a different purpose:
marking which calls were devirtualized, which `applyICallBranchFunnel`
explicitly says it doesn't do. For this fix, we can just account for
duplicates with a map and do the actual replacements afterwards by
iterating over the map.
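The shape of the fix can be illustrated with a small model. This is a sketch only: the string call names and both helper functions are hypothetical stand-ins, not the WPD data structures:

```python
def collect_call_sites(type_tests, uses_of):
    """Model the scan: each type test on a vtable walks ALL uses of it."""
    call_sites = []
    for vtable in type_tests:
        call_sites.extend(uses_of[vtable])
    return call_sites

def apply_replacements(call_sites, make_replacement):
    """Collapse duplicates into a map first, then replace in a second pass."""
    pending = {}
    for call in call_sites:
        # A duplicate call site lands on the same key, so it is recorded once.
        pending.setdefault(call, make_replacement(call))
    # Replacements only happen after the scan, so nothing erased is revisited.
    return list(pending.values())

# Two functions each run a type test on the same global vtable.
uses_of = {"@vtable": ["fn:%result", "fn2:%result"]}
sites = collect_call_sites(["@vtable", "@vtable"], uses_of)
replaced = apply_replacements(sites, lambda call: "funnel(" + call + ")")
```

Here `sites` holds four entries (each call twice), while `replaced` holds one entry per unique call.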
Differential Revision: https://reviews.llvm.org/D146267
Summary:
The previous patch fixed how we handle emitting atomics for targeting
NVPTX directly. This is the only other file that really does that and
has atomics and I forgot to update it.
Since Clang 16.0.0 users can target the `NVPTX` architecture directly
via `--target=nvptx64-nvidia-cuda`. However, this does not set the
atomic inlining size correctly. This leads to spurious warnings and
emission of runtime atomics that are never implemented. This patch
ensures that we set this to the appropriate pointer width. This will
always be 64 in the future, as only `nvptx64` will be supported moving
forward.
Fixes: https://github.com/llvm/llvm-project/issues/61410
Reviewed By: tra
Differential Revision: https://reviews.llvm.org/D146750
In ee232506b8 I moved UnixSignal
initialization from lldbTarget to the various platform plugins. This
inadvertently broke lldb-server because lldb-server doesn't use
Platform plugins. lldb-server still needs to be able to create a
UnixSignals object for the host platform, so we add the relevant
platform plugin to lldb-server to make sure we always have a
HostPlatform.
Differential Revision: https://reviews.llvm.org/D146668
Add Timer and TimingManager, which provide a convenient way to measure the
execution time of code snippets. The output looks like:
```
-- Average Operation Time -- -- Name (# of Calls) --
          1747.2(ns)            popBatch (59)
            92.3(ns)            popBatchImpl (73)
           101.6(ns)              EmptyBatchProcess (5)
          2587.0(ns)            pushBlocksImpl (13)
```
Note that `EmptyBatchProcess` is nested under the timer `popBatchImpl`.
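How such a report could be produced is sketched below. This is a simplified model under stated assumptions: the `TimingReport` class, its methods, and the exact line format are hypothetical and do not reflect the actual Timer/TimingManager interface:

```python
class TimingReport:
    """Hypothetical model: accumulate durations per named timer and report averages."""

    def __init__(self):
        self.total_ns = {}   # timer name -> accumulated nanoseconds
        self.calls = {}      # timer name -> number of recorded calls
        self.depth = {}      # timer name -> nesting depth (0 = top level)

    def register(self, name, parent=None):
        self.total_ns[name] = 0
        self.calls[name] = 0
        self.depth[name] = 0 if parent is None else self.depth[parent] + 1

    def record(self, name, elapsed_ns):
        self.total_ns[name] += elapsed_ns
        self.calls[name] += 1

    def lines(self):
        out = []
        for name in self.total_ns:  # registration order
            avg = self.total_ns[name] / self.calls[name]
            indent = "  " * self.depth[name]  # nested timers are indented
            out.append(f"{avg:.1f}(ns) {indent}{name} ({self.calls[name]})")
        return out

report = TimingReport()
report.register("popBatchImpl")
report.register("EmptyBatchProcess", parent="popBatchImpl")
report.record("popBatchImpl", 100)
report.record("popBatchImpl", 200)
report.record("EmptyBatchProcess", 50)
```

Each line shows the average time per call followed by the timer name and call count, with nested timers indented under their parent.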
Reviewed By: cferris
Differential Revision: https://reviews.llvm.org/D143626
Vector dialect patterns have grown enormously in the past year to a point where they are now impenetrable.
Start reorganizing them towards finer-grained control.
Differential Revision: https://reviews.llvm.org/D146736
My recent change [1] extended the external-swift-debugging.cpp test, but
didn't account for PAC, under which function pointers aren't trivially
comparable. We could use `ptrauth_strip()`, but for the test it's easier
to just compare the symbol name.
[1] https://reviews.llvm.org/D146264
We don't need an explicit AND mask; we can use KnownBits to determine whether each element has (the same) single non-zero bit and shift that bit into the msb/signbit for MOVMSK to access directly.
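The idea can be modeled on plain integers. This is a sketch under the assumption that every element is either zero or exactly `1 << bit`; the helper names and the 8-bit element width are illustrative choices, not taken from the patch:

```python
def movmsk(elements, width=8):
    """Collect the sign bit of each element into a mask (element i -> bit i)."""
    mask = 0
    for i, x in enumerate(elements):
        mask |= ((x >> (width - 1)) & 1) << i
    return mask

def mask_via_shift(elements, bit, width=8):
    """Assumes each element is 0 or (1 << bit); no AND mask is applied."""
    shift = width - 1 - bit                      # moves `bit` into the sign bit
    lane_mask = (1 << width) - 1                 # keep results within one lane
    shifted = [(x << shift) & lane_mask for x in elements]
    return movmsk(shifted, width)

def mask_reference(elements):
    """Reference result: one bit per element that is non-zero."""
    return sum((1 if x != 0 else 0) << i for i, x in enumerate(elements))
```

For elements `[0, 4, 4, 0]` with `bit=2`, the shift turns each `4` into `0x80`, so the sign bits alone reproduce the reference mask.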
When fixing the test earlier, we missed the JSON case for NaN and INF,
so handle those the same as for non-JSON, by creating the string
dynamically.
Reviewed By: abhina.sreeskantharajan
Differential Revision: https://reviews.llvm.org/D146739
The trailing return type arrow checker verifies that a declaration is
being parsed; however, this isn't true inside macros.
It turns out the existence of the auto keyword is enough to make
sure that we're dealing with a trailing return type, and whether we're
in a declaration doesn't matter.
Fixes https://github.com/llvm/llvm-project/issues/47664
Reviewed By: HazardyKnusperkeks, owenpan
Differential Revision: https://reviews.llvm.org/D141811
Building with -DLIBCXX_ENABLE_THREADS=OFF -DLIBCXXABI_ENABLE_THREADS=OFF
(like e.g. for wasm) fails after D146228 because of a misplaced std
namespace begin/end.
Reviewed By: philnik, #libc
Differential Revision: https://reviews.llvm.org/D146682