[lldb] Part 2 of 2 - Refactor `CommandObject::DoExecute(...)` to return
`void` instead of `bool`
Justifications:
- The code never actually acts on the `true`/`false` return values.
- The methods already pass around a `CommandReturnObject`, typically
with a `result` parameter.
- Each command return object already contains:
- A more precise status
- The error code(s) that apply to that status
Part 1 refactors the `CommandObject::Execute(...)` method.
- See https://github.com/llvm/llvm-project/pull/69989
rdar://117378957
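For illustration, a minimal sketch of the shape of the change, with simplified stand-ins for the lldb types (`SetStatusSuccess` here abbreviates the real status-setting API):
```cpp
#include <string>
#include <utility>

// Simplified stand-in for lldb's CommandReturnObject.
struct CommandReturnObject {
  bool Succeeded = false;
  std::string Error;
  void SetStatusSuccess() { Succeeded = true; }
  void AppendError(std::string Msg) { Error = std::move(Msg); Succeeded = false; }
};

// Before, DoExecute returned bool *and* filled in `result`; callers
// ignored the bool. After the refactor it returns void and reports
// success or failure exclusively through `result`.
void DoExecute(CommandReturnObject &result) {
  bool failed = false; // ... perform the command's work ...
  if (failed) {
    result.AppendError("reason for the failure");
    return;
  }
  result.SetStatusSuccess();
}
```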
Detect when we are matching a memprof profile with no column numbers,
and in that case treat all column numbers as 0 when matching. The
profiled binary might have been built with -gno-column-info, for
example.
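A minimal sketch of the matching rule (hypothetical helper name, not the actual memprof code):
```cpp
#include <cstdint>

// A profile built without column info carries column 0 everywhere, so
// zero out the symbolized column before comparing frames against it.
uint32_t columnForMatching(uint32_t Column, bool ProfileHasColumns) {
  return ProfileHasColumns ? Column : 0;
}
```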
This was a regression I introduced in 6a62707c04, where I too hastily removed the basic handling of implicit captures we currently have. That handling will be superseded by all implicit captures being added to target operations' map_info entries in a soon-to-land series of patches; however, that is not yet the case, so we must continue to do some basic handling of these captures for the time being. This patch re-adds that behaviour to avoid regressions.
Unfortunately this also means some test changes, as getUsedValuesDefinedAbove grabs constants used outside of the target region, which aren't currently handled particularly well.
Update the documentation surrounding reduction kinds. Highlight
different min/max reduction kinds for signed/unsigned integers and
floats. Update IR examples.
When stride is x0, a strided load should behave like a unit stride load,
which uses the VLDE sched class.
---------
Co-authored-by: Wang Pengcheng <wangpengcheng.pp@bytedance.com>
G_GLOBAL_VALUE should be lowered into an absolute address if `-codemodel=small` is used, or into a PC-relative address if `-codemodel=medium` is used.
PR #68380 tried to create special instructions to do this, but I don't
see why we need to do that.
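A minimal sketch of the selection rule, assuming LLVM's CodeModel enum (illustrative, not the backend code):
```cpp
#include "llvm/Support/CodeGen.h"

enum class GlobalAddrForm { Absolute, PCRelative };

// Small code model: globals are reachable via absolute addresses.
// Medium: fall back to PC-relative addressing.
GlobalAddrForm selectGlobalAddrForm(llvm::CodeModel::Model CM) {
  return CM == llvm::CodeModel::Small ? GlobalAddrForm::Absolute
                                      : GlobalAddrForm::PCRelative;
}
```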
This reverts commit 4aa12afb96.
This change introduced failures upon checking out the PR source code.
Pulling this out of tree while I investigate further.
This patch makes a couple of changes to the PR code formatting check:
- Move the `changed-files` action to before the checkout so that it pulls information from the GitHub API rather than by running `git diff`, to alleviate some performance problems.
- Check out the head of the pull request instead of its base to ensure that the PR commits are inside the checkout.
- Add an additional sparse checkout of the LLVM tools necessary to run the action, to alleviate security problems introduced by checking out the head of the pull request. Only code from the base of the pull request runs.
- Adjust the commit references to be based on `HEAD`, as GitHub doesn't give exact commit SHAs for the first commit in the PR.
Builds on #67982, which recently introduced the nneg flag on zext instructions. InstCombine is one of our largest canonicalizers of non-negative sext instructions into zext, so set the flag there.
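For illustration, a hedged sketch of carrying the fact over when building the replacement (Instruction::setNonNeg is the flag setter; the surrounding logic is simplified):
```cpp
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// When the sext's operand is known non-negative, emit a zext tagged
// nneg so later passes keep the known-sign information.
Value *sextToZextNNeg(IRBuilderBase &B, Value *Op, Type *DestTy) {
  Value *Z = B.CreateZExt(Op, DestTy);
  if (auto *ZI = dyn_cast<Instruction>(Z))
    ZI->setNonNeg(); // operand proven non-negative
  return Z;
}
```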
This patch backports a one-liner, `std::to_underlying`, that came with C++23. It is useful for refactoring unscoped enums into scoped enums, because the latter are not implicitly convertible to integer types.
I followed the libc++ implementation, but I consider their testing too heavy for us, so I wrote a simpler set of tests.
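The backport is essentially the C++23 definition; a minimal sketch:
```cpp
#include <type_traits>

namespace llvm {
// Like C++23's std::to_underlying: the enumerator's integer value.
template <typename Enum>
constexpr std::underlying_type_t<Enum> to_underlying(Enum E) {
  return static_cast<std::underlying_type_t<Enum>>(E);
}
} // namespace llvm

// Scoped enums don't convert implicitly, so callers ask explicitly:
enum class Color : unsigned char { Red = 1, Green = 2 };
static_assert(llvm::to_underlying(Color::Green) == 2);
```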
The calls to std::construct_at might overwrite the previously set __has_value_ flag in the case where the flag overlaps with the actual value or error being stored (since we use [[no_unique_address]]). To fix this issue, this patch ensures that we initialize the __has_value_ flag after we call std::construct_at.
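An illustrative reduction of the failure mode (not the libc++ code): with [[no_unique_address]] the flag may live in the stored value's tail padding, and std::construct_at begins the lifetime of a whole new object there, so the flag must be written afterwards.
```cpp
#include <memory>

struct Payload { int x; char c; }; // tail padding a bool could occupy

struct Storage {
  [[no_unique_address]] Payload value_;
  bool has_value_; // may be placed inside value_'s padding bytes

  void emplace(const Payload &p) {
    // Setting has_value_ first would risk construct_at clobbering it.
    std::construct_at(std::addressof(value_), p);
    has_value_ = true; // safe: written after construction
  }
};
```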
Fixes #68552
* Enhanced the logic of the ExpandMemCmp pass to merge contiguous subsequences in LoadSequence, based on the sizes allowed in `AllowedTailExpansions`.
* This enhancement seeks to minimize the number of basic blocks and produce optimized code when using memcmp with non-register-aligned sizes.
* Enable this feature for AArch64 with memcmp sizes modulo 8 equal to 3, 5, and 6.
Reapplication of #69942 after fixing a bug
I think it follows from the HSA spec that a write to the first byte is deemed significant to the GPU, in which case writing to the second short and reading back from it later would be safe. However, the examples for this all involve an atomic write to the first 32 bits, and it seems a credible risk that the occasional CI errors about invalid packets have as their root cause that the firmware notices the early write to packet->setup and treats that as a sign that the packet is ready to go.
That was overly paranoid; however, in passing I noticed that the code in libc is genuinely invalid. The memset writes a zero to the header byte, changing it from type_invalid (1) to type_vendor (0), at which point the GPU is free to read the 64-byte packet and interpret it as a vendor packet, which is probably why libc CI periodically errors about invalid packets.
Also a drive-by change to do the atomic store on a uint32_t consistently. I'm not sure offhand what __atomic_store_n on a uint16_t* and an int resolves to; it seems better to be unambiguous there.
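A sketch of the unambiguous form (illustrative; assumes the AQL layout of a 16-bit header followed by a 16-bit setup field, with the 4-byte alignment that 64-byte-aligned packets provide):
```cpp
#include <cstdint>

// Publish header and setup in one 32-bit atomic release store so the
// GPU can never observe a packet whose header says "ready" before the
// rest of the packet is in place.
void publishPacket(uint16_t *header, uint16_t header_bits,
                   uint16_t setup_bits) {
  uint32_t word = (uint32_t)setup_bits << 16 | header_bits;
  __atomic_store_n((uint32_t *)header, word, __ATOMIC_RELEASE);
}
```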
The location set on atomic operations in both OpenMP and OpenACC was completely off. The real location needs to be created from the source CharBlock of the parse tree node of the respective atomic statement.
This patch updates locations in lowering for atomic operations.
This is consistent with the IR verifier and SelectionDAG's getNode.
Update tests accordingly. I tried to keep some coverage of non-pow2 types where possible. X86 didn't like a G_UNMERGE_VALUES from s48 to 3 x s16 that got created when I tried s48.
TotalElementCount() was modified to return std::optional<uint64_t>, where std::nullopt means an overflow occurred. Besides the additional check in RESHAPE folding, all callers of TotalElementCount() were changed to also check for overflows.
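A minimal sketch of the overflow-aware shape (illustrative, not the exact Flang code):
```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Multiply the extents, reporting overflow as std::nullopt instead of
// silently wrapping.
std::optional<std::uint64_t>
TotalElementCount(const std::vector<std::uint64_t> &shape) {
  std::uint64_t total = 1;
  for (std::uint64_t extent : shape) {
    if (extent != 0 && total > UINT64_MAX / extent)
      return std::nullopt; // element count would overflow uint64_t
    total *= extent;
  }
  return total;
}
```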
These tests were failing on the LLDB public matrix build-bots for older
clang versions:
```
clang-7: warning: argument unused during compilation: '-nostdlib++' [-Wunused-command-line-argument]
error: invalid value 'c++20' in '-std=c++20'
note: use 'c++98' or 'c++03' for 'ISO C++ 1998 with amendments' standard
note: use 'gnu++98' or 'gnu++03' for 'ISO C++ 1998 with amendments and GNU extensions' standard
note: use 'c++11' for 'ISO C++ 2011 with amendments' standard
note: use 'gnu++11' for 'ISO C++ 2011 with amendments and GNU extensions' standard
note: use 'c++14' for 'ISO C++ 2014 with amendments' standard
note: use 'gnu++14' for 'ISO C++ 2014 with amendments and GNU extensions' standard
note: use 'c++17' for 'ISO C++ 2017 with amendments' standard
note: use 'gnu++17' for 'ISO C++ 2017 with amendments and GNU extensions' standard
note: use 'c++2a' for 'Working draft for ISO C++ 2020' standard
note: use 'gnu++2a' for 'Working draft for ISO C++ 2020 with GNU extensions' standard
make: *** [main.o] Error 1
```
The test fails because we try to compile it with `-std=c++20` (which is
required for std::chrono::{days,weeks,months,years}) on clang versions
that don't support the `-std=c++20` flag.
We could change the test to conditionally compile the C++20 parts of the test based on the `-std=` flag and have two versions of the Python tests, one for the C++11 chrono features and one for the C++20 features.
This patch instead just disables the test on older clang versions
(because it's simpler and we don't really lose important coverage).
…off support
Same as D135848. The newly added test fails with `fatal error: error in
backend: Objective-C support is unimplemented for object file format`.
The code was going in the wrong direction: it removed one path component instead of appending /Developer to it. Due to fallback mechanisms in xcrun, this never seemed to have caused any issues.
Part 1 of 3. This includes the LLVM back-end processing and profile
reading/writing components. compiler-rt changes are included.
Differential Revision: https://reviews.llvm.org/D138846
For most register sets, "enabled" meant you could use it: it was present in the process, and there was no present-but-turned-off state, so "enabled" made sense.
Then ZA came along (and soon ZT0), where ZA can be present in the hardware when you have SME, but ZA itself can be made inactive. This means that "IsZAEnabled()" doesn't mean "is ZA active", it means "do you have SME", which is very confusing when we actually want to know whether ZA is active.
So instead say "IsZAPresent" to make these checks more specific. For things that can't be made inactive, present implies active, as they're never inactive.
This test has been failing since the SPIR-V backend started failing explicitly on unsupported shader types. Switched this test to a compute shader, since compute is currently the only supported type.
These new intrinsics, `amdgcn_wmma_tied_f16_16x16x16_f16` and `amdgcn_wmma_tied_bf16_16x16x16_bf16`, explicitly tie the destination accumulator matrix to the input accumulator matrix.
The `wmma_f16` and `wmma_bf16` intrinsics only write to 16 bits of the 32-bit destination VGPRs; which half is determined via the `op_sel` argument, and the other half of the destination registers remains unchanged.
In some cases, however, we expect the destination to copy the other halves from the input accumulator, for instance when packing two separate accumulator matrices into one. In that case, the two matrices are tied to the same registers but occupy separate halves, so it is important to copy the other matrix's values to the new destination.
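A plain-C++ illustration of the tied semantics (not the intrinsic itself): the computed 16-bit result lands in the half selected by op_sel, while the other half comes from the tied input accumulator.
```cpp
#include <cstdint>

// Combine one 16-bit result with the preserved half of the tied input.
uint32_t tiedLane(uint32_t tied_in, uint16_t result, bool op_sel_high) {
  return op_sel_high ? (uint32_t(result) << 16) | (tied_in & 0xFFFFu)
                     : (tied_in & 0xFFFF0000u) | result;
}
```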
We currently have three postprocess peephole optimisations for vector
pseudos:
1) Masked pseudo with all ones mask -> unmasked pseudo
2) Merge vmerge pseudo into operand pseudo's mask
3) vmerge pseudo with all ones mask -> vmv.v.v pseudo
This patch aims to move these peepholes out of SelectionDAG and into a
separate RISCVFoldMasks MachineFunction pass.
There are a few motivations for doing this:
* The current SelectionDAG implementation operates on MachineSDNodes,
which are essentially MachineInstrs but require a bunch of logic to
reason about chain and glue operands. The RISCVII::has*Op helper
functions also don't exactly line up with the SDNode operands. Mutating
these pseudos and their operands in place becomes a good bit easier at
the MachineInstr level. For example, we would no longer need to check
for cycles in the DAG during performCombineVMergeAndVOps.
* Although it's further down the line, moving this code out of
SelectionDAG allows it to be reused by GlobalISel later on.
* In performCombineVMergeAndVOps, it may be possible to commute the
operands to enable folding in more cases (see
test/CodeGen/RISCV/rvv/vmadd-vp.ll). There is existing machinery to
commute operands in TII::commuteInstruction, but it's implemented on
MachineInstrs.
The pass runs straight after ISel, before any of the other machine SSA
optimization passes run. This is so that dead-mi-elimination can mop up
any vmsets that are no longer used (but if preferred we could try and
erase them from inside RISCVFoldMasks itself). This also means that
these peepholes are no longer run at codegen -O0, so this patch isn't
strictly NFC.
Only the performVMergeToVMv peephole is refactored in this patch; the remaining two will be implemented later. And as noted by @preames, it should be possible to move doPeepholeSExtW out of SelectionDAG as well.
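For reference, a hedged sketch of the skeleton such a pass takes (names follow the description above; this is not the exact upstream code):
```cpp
#include "llvm/ADT/STLExtras.h"
#include "llvm/CodeGen/MachineFunctionPass.h"
using namespace llvm;

namespace {
// Post-ISel peephole pass: operating on MachineInstrs means no chain or
// glue bookkeeping and no DAG cycle checks are needed.
class RISCVFoldMasks : public MachineFunctionPass {
public:
  static char ID;
  RISCVFoldMasks() : MachineFunctionPass(ID) {}

  bool runOnMachineFunction(MachineFunction &MF) override {
    bool Changed = false;
    for (MachineBasicBlock &MBB : MF)
      for (MachineInstr &MI : make_early_inc_range(MBB))
        Changed |= convertVMergeToVMv(MI);
    return Changed;
  }

private:
  // Placeholder: the real peephole rewrites a vmerge whose mask is all
  // ones into a vmv.v.v in place.
  bool convertVMergeToVMv(MachineInstr &) { return false; }
};
} // namespace

char RISCVFoldMasks::ID = 0;
```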