Commit Graph

479128 Commits

Author SHA1 Message Date
Pete Lawrence
92d8a28cc6 [lldb] Part 2 of 2 - Refactor CommandObject::DoExecute(...) return void (not bool) (#69991)
[lldb] Part 2 of 2 - Refactor `CommandObject::DoExecute(...)` to return
`void` instead of ~~`bool`~~

Justifications:
- The code doesn't ultimately apply the `true`/`false` return values.
- The methods already pass around a `CommandReturnObject`, typically
with a `result` parameter.
- Each command return object already contains:
	- A more precise status
	- The error code(s) that apply to that status
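
A minimal sketch of the pattern these justifications describe. The types below (`ReturnStatus`, `ReturnObject`, `doExecute`) are hypothetical, simplified stand-ins and not LLDB's actual classes. The point is only that the outcome travels in the result object, so a `bool` return value would be redundant:

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-ins for illustration only.
enum class ReturnStatus { Invalid, Success, Failed };

struct ReturnObject {
  ReturnStatus status = ReturnStatus::Invalid;
  std::string error;

  // Record a failure together with its message.
  void setError(const std::string &message) {
    status = ReturnStatus::Failed;
    error = message;
  }
  bool succeeded() const { return status == ReturnStatus::Success; }
};

// Returns void: the caller inspects `result` instead of a bool.
void doExecute(const std::string &argument, ReturnObject &result) {
  if (argument.empty()) {
    result.setError("missing argument");
    return;
  }
  result.status = ReturnStatus::Success;
}
```

A caller would check `result.succeeded()` (or the status and error text) after the call returns.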

Part 1 refactors the `CommandObject::Execute(...)` method.
- See
[https://github.com/llvm/llvm-project/pull/69989](https://github.com/llvm/llvm-project/pull/69989)

rdar://117378957
2023-10-30 13:21:00 -07:00
Teresa Johnson
2446439f51 [MemProf] Handle profiles with missing column numbers (#70520)
Detect when we are matching a memprof profile with no column numbers,
and in that case treat all column numbers as 0 when matching. The
profiled binary might have been built with -gno-column-info, for
example.
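
A rough sketch of the matching idea, under stated assumptions: `Location`, `profileHasColumns`, and `locationsMatch` are hypothetical names, and detecting a column-less profile by "every column is 0" is an assumption made for illustration, not necessarily how the patch does it.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified source location used only for this sketch.
struct Location {
  std::uint32_t lineOffset;
  std::uint32_t column;
};

// Assumption: a profile is column-less when every recorded column is 0.
bool profileHasColumns(const std::vector<Location> &profile) {
  for (const Location &loc : profile)
    if (loc.column != 0)
      return true;
  return false;
}

// Compare a profile location against one from the code being compiled.
// Without column info in the profile, the compiled-side column is treated as 0.
bool locationsMatch(const Location &fromProfile, const Location &fromIR,
                    bool hasColumns) {
  std::uint32_t irColumn = hasColumns ? fromIR.column : 0;
  return fromProfile.lineOffset == fromIR.lineOffset &&
         fromProfile.column == irColumn;
}
```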
2023-10-30 13:19:37 -07:00
Philip Reames
cc6f9cf5a2 [RISCV] Add zbb coverage to test file [nfc] 2023-10-30 13:18:35 -07:00
Andrew Gozillon
68c384676c [Flang][MLIR][OpenMP] Temporarily re-add basic handling of uses in target regions to avoid gfortran test-suite regressions
This was a regression introduced by myself in:

 6a62707c04

where I too hastily removed the basic handling of implicit captures
we have currently. This will be superseded by all implicit captures
being added to target operations' map_info entries in a series of
patches landing soon. However, that is currently not the case, so we
must continue to do some basic handling of these captures for the time
being. This patch re-adds that behaviour to avoid regressions.

Unfortunately this means some test changes as well, since
getUsedValuesDefinedAbove grabs constants used outside
of the target region, which aren't handled particularly
well currently.
2023-10-30 15:10:12 -05:00
Shilei Tian
0d5b7dd25c [OpenMP] Add a test for D158802 (#70678)
In D158802 we honored the user's `thread_limit` value even with the
optimization introduced in D152014. This patch adds a simple test.
2023-10-30 15:59:05 -04:00
Joseph Huber
9e390a1408 [libc][Obvious] Fix missing semicolon in AMDGPU loader implementation
Summary:
Title
2023-10-30 14:58:46 -05:00
Jakub Kuderski
651d88e332 [mlir][vector] Update reduction kind docs. NFC. (#70673)
Update the documentation surrounding reduction kinds. Highlight
different min/max reduction kinds for signed/unsigned integers and
floats. Update IR examples.
2023-10-30 15:58:33 -04:00
Michael Maitland
093bc6b61a [RISCV] SiFive7 VLDS Sched should not depend on VL when stride is x0. (#70266)
When stride is x0, a strided load should behave like a unit stride load,
which uses the VLDE sched class.

---------

Co-authored-by: Wang Pengcheng <wangpengcheng.pp@bytedance.com>
2023-10-30 15:47:45 -04:00
Michael Maitland
04dd2ac03a [RISCV][GlobalISel] Select G_GLOBAL_VALUE (#70091)
G_GLOBAL_VALUE should be lowered into an absolute address if
`-codemodel=small` is used or into a PC-relative if `-codemodel=medium`
is used.

PR #68380 tried to create special instructions to do this, but I don't
see why we need to do that.
2023-10-30 15:46:36 -04:00
Vlad Serebrennikov
7c2ef38c36 [mlir][NFC] Use llvm::to_underlying in sparse tensor IR detail 2023-10-30 22:34:50 +03:00
Aiden Grossman
96410a6b14 Revert "[Github] Fetch all commits in PR for code formatting checks (#69766)"
This reverts commit 4aa12afb96.

This change introduced failures upon checking out the PR source code.
Pulling this out of tree while I investigate further.
2023-10-30 12:33:35 -07:00
Aiden Grossman
4aa12afb96 [Github] Fetch all commits in PR for code formatting checks (#69766)
This patch makes a couple of changes to the PR code formatting check:
- Moves the `changed-files` action to before the checkout to make sure
  that it pulls information from the GitHub API rather than by running
  `git diff`, to alleviate some performance problems.
- Checks out the head of the pull request instead of the base of the
  pull request to ensure that we have the PR commits inside the checkout.
- Adds an additional sparse checkout of the necessary LLVM tools to run
  the action, to alleviate security problems introduced by checking out
  the head of the pull request. Only code from the base of the pull
  request runs.
- Adjusts the commit references to be based on `HEAD`, as GitHub doesn't
  give exact commit SHAs for the first commit in the PR.
2023-10-30 12:23:51 -07:00
Philip Reames
3f2ed812f0 [InstCombine] Infer nneg on zext when forming from non-negative sext (#70706)
Builds on #67982 which recently introduced the nneg flag on a zext
instruction. InstCombine is one of our largest canonicalizers of zext
from non-negative sext instructions, so set the flag there.
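
As a small illustration of why a sign extension of a non-negative value can be treated as a zero extension, the sketch below compares the two on 8-bit inputs. The function names are hypothetical and this is plain C++ arithmetic, not InstCombine code:

```cpp
#include <cassert>
#include <cstdint>

// Sign-extend an 8-bit value to 32 bits.
std::int32_t signExtend(std::int8_t value) {
  return static_cast<std::int32_t>(value);
}

// Zero-extend the same 8 bits to 32 bits.
std::uint32_t zeroExtend(std::int8_t value) {
  return static_cast<std::uint8_t>(value);
}

// True when both extensions produce the same 32-bit pattern,
// which holds exactly when the input is non-negative.
bool extensionsAgree(std::int8_t value) {
  return static_cast<std::uint32_t>(signExtend(value)) == zeroExtend(value);
}
```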
2023-10-30 12:09:43 -07:00
Vlad Serebrennikov
d0caa4eef7 [ADT] Backport std::to_underlying from C++23 (#70681)
This patch backports a one-liner `std::to_underlying` that came with C++23. This is useful for refactoring unscoped enums into scoped enums, because the latter are not implicitly convertible to integer types.

I followed libc++ implementation, but I consider their testing too heavy for us, so I wrote a simpler set of tests.
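
For illustration, a helper of this kind can be sketched as below. The `sketch` namespace, the `Color` enum, and `colorIndex` are hypothetical and are not copied from the LLVM tree:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the backported helper.
namespace sketch {
template <typename Enum>
constexpr std::underlying_type_t<Enum> to_underlying(Enum E) {
  return static_cast<std::underlying_type_t<Enum>>(E);
}
} // namespace sketch

enum class Color : unsigned char { Red = 1, Green = 2 };

// Scoped enums need an explicit conversion before integer arithmetic.
constexpr unsigned colorIndex(Color C) {
  return sketch::to_underlying(C) - 1u;
}
```

The explicit call replaces the implicit conversion that an unscoped enum would have performed silently.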
2023-10-30 23:06:28 +04:00
Jan Kokemüller
134c915955 [libc++] Fix UB in <expected> related to "has value" flag (#68552) (#68733)
The calls to std::construct_at might overwrite the previously set
__has_value_ flag in the case where the flag is overlapping with
the actual value or error being stored (since we use [[no_unique_address]]).
To fix this issue, this patch ensures that we initialize the
__has_value_ flag after we call std::construct_at.

Fixes #68552
2023-10-30 14:56:03 -04:00
Igor Kirillov
849f963e31 [CodeGen] Improve ExpandMemCmp for more efficient non-register aligned sizes handling (#70469)
* Enhanced the logic of ExpandMemCmp pass to merge contiguous
  subsequences in LoadSequence, based on sizes allowed in
  `AllowedTailExpansions`.
* This enhancement seeks to minimize the number of basic blocks and
  produce optimized code when using memcmp with non-register aligned
  sizes.
* Enable this feature for AArch64 with memcmp sizes modulo 8 equal to
  3, 5, and 6.

Reapplication of #69942 after fixing a bug
2023-10-30 18:40:48 +00:00
Philip Reames
89564f0b69 Regenerate a set of auto-update tests [nfc]
To reduce the spurious test delta in an upcoming change.
2023-10-30 11:36:43 -07:00
Jon Chesterfield
896749aa0d [amdgpu][openmp] Avoiding writing to packet header twice (#70695)
I think it follows from the HSA spec that a write to the first byte is
deemed significant to the GPU in which case writing to the second short
and reading back from it later would be safe. However, the examples for
this all involve an atomic write to the first 32 bits and it seems a
credible risk that the occasional CI errors about invalid packets have
as their root cause that the firmware notices the early write to
packet->setup and treats that as a sign that the packet is ready to go.

That was overly paranoid; however, in passing I noticed the code in libc is
genuinely invalid. The memset writes a zero to the header byte, changing
it from type_invalid (1) to type_vendor (0), at which point the GPU is
free to read the 64 byte packet and interpret it as a vendor packet,
which is probably why libc CI periodically errors about invalid packets.

Also a drive by change to do the atomic store on a uint32_t
consistently. I'm not sure offhand what __atomic_store_n on a uint16_t*
and an int resolves to, seems better to be unambiguous there.
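
A hedged sketch of the publishing order described above: fill in the rest of the packet first, then make it visible with a single 32-bit atomic store. The `Packet` layout, the `publishPacket` name, and placing the header in the low 16 bits are assumptions for illustration; only the `type_invalid (1)` value comes from the message. `__atomic_store_n` is the GCC/Clang builtin mentioned in the text.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical 64-byte packet; the first 32 bits combine header and setup.
struct Packet {
  std::uint32_t headerAndSetup;
  std::uint8_t body[60];
};
static_assert(sizeof(Packet) == 64, "packet must be 64 bytes");

// Value quoted in the message; the slot keeps it until publication.
constexpr std::uint16_t kTypeInvalid = 1;

// Fill the body first, then publish header and setup with one 32-bit store.
// Nothing writes to the first word before this final store.
void publishPacket(Packet *packet, std::uint16_t header, std::uint16_t setup,
                   const std::uint8_t (&body)[60]) {
  std::memcpy(packet->body, body, sizeof(packet->body));
  std::uint32_t word = static_cast<std::uint32_t>(header) |
                       (static_cast<std::uint32_t>(setup) << 16);
  __atomic_store_n(&packet->headerAndSetup, word, __ATOMIC_RELEASE);
}
```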
2023-10-30 18:35:52 +00:00
Antonio Frighetto
9fe5700611 [AArch64] Add support for v8.4a ldapur/stlur
AArch64 backend now features v8.4a atomic Load-Acquire
RCpc and Store-Release register unscaled support.
2023-10-30 19:27:48 +01:00
Antonio Frighetto
a8799719f7 [AArch64] Introduce tests for PR67879 (NFC) 2023-10-30 19:27:48 +01:00
Simon Pilgrim
8094119376 [X86] IceLakeServer - ZMM FMA can only use Port0
Fix discrepancy from when this was forked from the SkylakeServer model

Confirmed with Agner + uops.info
2023-10-30 18:16:56 +00:00
Simon Pilgrim
1de5fe18d8 [MCA][X86] Add AVX512 FMA instruction test coverage 2023-10-30 18:16:56 +00:00
DaPorkchop_
b45236f133 [clang] Implement constexpr bit_cast for vectors (#66894)
This makes __builtin_bit_cast support converting to and from vector
types in a constexpr context.
2023-10-30 11:15:36 -07:00
Valentin Clement (バレンタイン クレメン)
0f8615f4dc [flang][openacc][openmp] Set correct location on atomic operations (#70680)
The location set on atomic operations in both OpenMP and OpenACC was
completely off. The real location needs to be created from the source
CharBlock of the parse tree node of the respective atomic statement.
This patch updates locations in lowering for atomic operations.
2023-10-30 10:35:43 -07:00
Nikita Popov
e46dd6fbc0 Revert "[InstCombine] Simplify and/or of icmp eq with op replacement (#70335)"
This reverts commit 1770a2e325.

Stage 2 llvm-tblgen crashes when generating X86GenAsmWriter.inc and
other files.
2023-10-30 18:33:03 +01:00
Craig Topper
9a7c26a399 [GISel] Restrict G_BSWAP to multiples of 16 bits. (#70245)
This is consistent with the IR verifier and SelectionDAG's getNode.

Update tests accordingly. I tried to keep some coverage of non-pow2 when
possible. X86 didn't like a G_UNMERGE_VALUES from s48 to 3 s16 that got
created when I tried s48.
2023-10-30 10:27:57 -07:00
Leandro Lupori
7358c26d6a [flang] Check for overflows in RESHAPE folding (#68342)
TotalElementCount() was modified to return std::optional<uint64_t>,
where std::nullopt means overflow occurred. Besides the additional
check in RESHAPE folding, all callers of TotalElementCount() were
changed, to also check for overflows.
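
A minimal sketch of an overflow-checked element count in this style. The function name and the `std::vector` shape argument are hypothetical and this is not flang's implementation:

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Multiply all extents; std::nullopt signals that the product overflowed.
std::optional<std::uint64_t>
totalElementCount(const std::vector<std::uint64_t> &shape) {
  // A zero extent makes the total zero regardless of the other extents.
  for (std::uint64_t extent : shape)
    if (extent == 0)
      return std::uint64_t{0};

  std::uint64_t count = 1;
  for (std::uint64_t extent : shape) {
    // Check before multiplying so the overflow never actually happens.
    if (count > std::numeric_limits<std::uint64_t>::max() / extent)
      return std::nullopt;
    count *= extent;
  }
  return count;
}
```

Callers then have to handle the empty optional explicitly instead of continuing with a wrapped-around count.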
2023-10-30 14:25:21 -03:00
Craig Topper
77e88db6b7 [RISCV][GISel] Add missing curly brace to test. NFC 2023-10-30 10:12:56 -07:00
Michael Buch
c3f7ca7810 [lldb][Test] TestDataFormatterLibcxxChrono.py: skip test on older clang versions (#70544)
These tests were failing on the LLDB public matrix build-bots for older
clang versions:
```
clang-7: warning: argument unused during compilation: '-nostdlib++' [-Wunused-command-line-argument]
error: invalid value 'c++20' in '-std=c++20'
note: use 'c++98' or 'c++03' for 'ISO C++ 1998 with amendments' standard
note: use 'gnu++98' or 'gnu++03' for 'ISO C++ 1998 with amendments and GNU extensions' standard
note: use 'c++11' for 'ISO C++ 2011 with amendments' standard
note: use 'gnu++11' for 'ISO C++ 2011 with amendments and GNU extensions' standard
note: use 'c++14' for 'ISO C++ 2014 with amendments' standard
note: use 'gnu++14' for 'ISO C++ 2014 with amendments and GNU extensions' standard
note: use 'c++17' for 'ISO C++ 2017 with amendments' standard
note: use 'gnu++17' for 'ISO C++ 2017 with amendments and GNU extensions' standard
note: use 'c++2a' for 'Working draft for ISO C++ 2020' standard
note: use 'gnu++2a' for 'Working draft for ISO C++ 2020 with GNU extensions' standard
make: *** [main.o] Error 1
```

The test fails because we try to compile it with `-std=c++20` (which is
required for std::chrono::{days,weeks,months,years}) on clang versions
that don't support the `-std=c++20` flag.

We could change the test to conditionally compile the C++20 parts of the
test based on the `-std=` flag and have two versions of the python
tests, one for the C++11 chrono features and one for the C++20 features.

This patch instead just disables the test on older clang versions
(because it's simpler and we don't really lose important coverage).
2023-10-30 17:08:25 +00:00
Nick Desaulniers
693941132e [docs] mention that DenseMap has a SmallDenseMap variant (#70677)
via https://github.com/llvm/llvm-project/pull/67699/files#r1375105711
2023-10-30 10:07:58 -07:00
Jake Egan
a1b4005bae [clang][Module] Mark test unsupported since objc doesn't have xcoff/g… (#70661)
…off support

Same as D135848. The newly added test fails with `fatal error: error in
backend: Objective-C support is unimplemented for object file format`.
2023-10-30 13:06:49 -04:00
Adrian Prantl
c42b640208 Fix the DEVELOPER_DIR computation (#70528)
The code was incorrectly going in the wrong direction by removing one
component instead of appending /Developer to it. Due to fallback
mechanisms in xcrun this never seemed to have caused any issues.
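
To make the direction of the fix concrete, here is a small sketch with hypothetical helper names. Treating the input as a plain string path is an assumption for illustration, not the actual LLDB code:

```cpp
#include <cassert>
#include <string>

// Intended behaviour: append the Developer component to the directory.
std::string developerDirFrom(const std::string &baseDir) {
  return baseDir + "/Developer";
}

// The mistaken direction: strip the last path component instead.
std::string parentDirOf(const std::string &dir) {
  std::string::size_type slash = dir.rfind('/');
  return slash == std::string::npos ? std::string() : dir.substr(0, slash);
}
```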
2023-10-30 10:00:40 -07:00
Craig Topper
284d136c4a [RISCV] Teach copyPhysReg to allow copies between GPR<->FPR32/FPR64 (#70525)
This is needed because GISel emits copies instead of bitcasts like
SelectionDAG.
2023-10-30 09:58:51 -07:00
Valentin Clement (バレンタイン クレメン)
f706837e2b [flang][mlir][openacc] Switch device_type representation to an enum (#70250)
Switch the representation from a scalar integer to an enumeration. The
parser transforms the string in the input to the correct enumeration.
2023-10-30 09:51:42 -07:00
Jay Foad
101008be83 [AMDGPU] CodeGen for 64-bit buffer atomic cmpswap intrinsics (#70475)
Implement codegen for:
llvm.amdgcn.raw.buffer.atomic.cmpswap.i64
llvm.amdgcn.raw.ptr.buffer.atomic.cmpswap.i64
llvm.amdgcn.struct.buffer.atomic.cmpswap.i64
llvm.amdgcn.struct.ptr.buffer.atomic.cmpswap.i64
2023-10-30 16:44:22 +00:00
Louis Dionne
3c5885535a [libc++][tests] Fix a few remaining instances of outdated static assertion regexes in our test suite (#70454)
This is a re-application of 166b3a8617, which was reverted in
fde1ecdec8 because it broke some tests.
2023-10-30 17:28:51 +01:00
Arthur Eubanks
f75370310c [X86] Print 'l' section flag for SHF_X86_64_LARGE (#70380)
When directly compiling to an object file we properly set the section
flag, but not when emitting assembly.
2023-10-30 09:24:18 -07:00
Timm Bäder
8a1719d3ed [clang][Interp][NFC] Use delegate() in VisitCXXBindTemporaryExpr 2023-10-30 17:20:27 +01:00
Z572
7de70e0f72 [Flang][OpenMP] Fix comments that should not be Sentinels on fixed format. (#68911)
Fixes #68653
2023-10-31 00:20:00 +08:00
Alan Phipps
f95b2f1acf Reland "[InstrProf][compiler-rt] Enable MC/DC Support in LLVM Source-based Code Coverage (1/3)"
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.

Differential Revision: https://reviews.llvm.org/D138846
2023-10-30 11:15:02 -05:00
Valentin Clement (バレンタイン クレメン)
dc8c2a7794 [flang][openacc][NFC] Add test for atomic with array ref (#70261)
After #69944, lowering of array refs in atomic operations works properly.
Add some lowering tests to catch regressions in the future.
2023-10-30 09:08:57 -07:00
LLVM GN Syncbot
3746f20b56 [gn build] Port 72e6c1c70d 2023-10-30 15:58:08 +00:00
Nick Desaulniers
d9b15b068d [CGExprConstant] stop calling into ConstExprEmitter for Reference type destinations (#70366)
Fixes a bug introduced by
commit b54294e2c9 ("[clang][ConstantEmitter] have
tryEmitPrivate[ForVarInit] try ConstExprEmitter fast-path first")

In the added test case, the QualType is a LValueReferenceType.

    LValueReferenceType 0x558412998d90 'const char (&)[41]'
    `-ParenType 0x558412998d30 'const char[41]' sugar
      `-ConstantArrayType 0x558412998cf0 'const char[41]' 41
        `-QualType 0x55841294c271 'const char' const
          `-BuiltinType 0x55841294c270 'char'

Fixes: #69979
2023-10-30 08:48:31 -07:00
Luke Lau
fecd11ba87 [RISCV] Remove old peephole declaration in RISCVISelDAGToDAG.h. NFC
It was removed in 72e6c1c70d
2023-10-30 15:45:54 +00:00
David Spickett
bb9dced2d3 [lldb][AArch64][Linux] Rename Is<ext>Enabled to Is<ext>Present (#70303)
For most register sets, if it was enabled this meant you could use it,
it was present in the process. There was no present but turned off
state. So "enabled" made sense.

Then ZA came along (and soon to be ZT0) where ZA can be present in the
hardware when you have SME, but ZA itself can be made inactive. This
means that "IsZAEnabled()" doesn't mean is it active, it means do you
have SME. Which is very confusing when we actually want to know if ZA is
active.

So instead say "IsZAPresent", to make these checks more specific. For
things that can't be made inactive, present will imply "active" as
they're never inactive.
2023-10-30 15:45:40 +00:00
tsitdikov
8bc4462bc1 Remove unused variable. (#70670)
All usages of the variable have been removed in
https://github.com/llvm/llvm-project/pull/68689, we now need to clean it
up.
2023-10-30 16:37:30 +01:00
Natalie Chouinard
f89b85996a [HLSL][SPIR-V] Fix clang driver lang target test (#70330)
This test has been failing since the SPIR-V backend started failing
explicitly on unsupported shader types. Switched this test to a compute
shader since it is currently the only type supported.
2023-10-30 11:36:38 -04:00
Timm Baeder
56dab2cb07 [clang][Interp] Fix truncateCast() (#69911)
The added test case used to fail because we converted the LHS to `-1`.
2023-10-30 16:27:47 +01:00
Jessica Del
849297c97d [AMDGPU][wmma] - Add tied wmma intrinsic (#69903)
These new intrinsics, `amdgcn_wmma_tied_f16_16x16x16_f16` and
`amdgcn_wmma_tied_bf16_16x16x16_bf16`, explicitly tie the destination
accumulator matrix to the input accumulator matrix.

The `wmma_f16` and `wmma_bf16` intrinsics only write to 16 bits of the
32-bit destination VGPRs. Which half is written is determined via the
`op_sel` argument. The other half of the destination registers remains
unchanged.

In some cases, however, we expect the destination to copy the other
halves from the input accumulator, for instance when packing two
separate accumulator matrices into one. In that case, the two matrices
are tied into the same registers, but separate halves. Then it is
important to copy the other matrix values to the new destination.
2023-10-30 16:23:49 +01:00
Luke Lau
72e6c1c70d [RISCV] Begin moving post-isel vector peepholes to a MF pass (#70342)
We currently have three postprocess peephole optimisations for vector
pseudos:

1) Masked pseudo with all ones mask -> unmasked pseudo
2) Merge vmerge pseudo into operand pseudo's mask
3) vmerge pseudo with all ones mask -> vmv.v.v pseudo

This patch aims to move these peepholes out of SelectionDAG and into a
separate RISCVFoldMasks MachineFunction pass.

There are a few motivations for doing this:

* The current SelectionDAG implementation operates on MachineSDNodes,
which are essentially MachineInstrs but require a bunch of logic to
reason about chain and glue operands. The RISCVII::has*Op helper
functions also don't exactly line up with the SDNode operands. Mutating
these pseudos and their operands in place becomes a good bit easier at
the MachineInstr level. For example, we would no longer need to check
for cycles in the DAG during performCombineVMergeAndVOps.

* Although it's further down the line, moving this code out of
SelectionDAG allows it to be reused by GlobalISel later on.

* In performCombineVMergeAndVOps, it may be possible to commute the
operands to enable folding in more cases (see
test/CodeGen/RISCV/rvv/vmadd-vp.ll). There is existing machinery to
commute operands in TII::commuteInstruction, but it's implemented on
MachineInstrs.

The pass runs straight after ISel, before any of the other machine SSA
optimization passes run. This is so that dead-mi-elimination can mop up
any vmsets that are no longer used (but if preferred we could try and
erase them from inside RISCVFoldMasks itself). This also means that
these peepholes are no longer run at codegen -O0, so this patch isn't
strictly NFC.

Only the performVMergeToVMv peephole is refactored in this patch; the
remaining two will be implemented later. And as noted by @preames, it
should be possible to move doPeepholeSExtW out of SelectionDAG as well.
2023-10-30 15:17:00 +00:00