
intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-26 03:56:16 +08:00

Author	SHA1	Message	Date
Steven Wu	e23dce6c97	[Support] Get correct number of physical cores on Apple Silicon Fix a bug that `computeHostNumPhysicalCores` is fallback to default unknown when building for Apple Silicon macs. rdar://80533675 Reviewed By: arphaman Differential Revision: https://reviews.llvm.org/D106012	2021-07-14 13:29:54 -07:00
Nicolas Vasilache	7b47de774f	[mlir] NFC - Add AffineMap::replace variant with dim/symbol inference	2021-07-14 20:29:12 +00:00
Philip Reames	b86ddfdb9a	Global variables with strong definitions cannot be freed With the current deref semantics, this is redundant - since we assume that anything which is dereferenceable (ever) can't be freed - but it becomes necessary for the deref-at-point semantics. Testing wise, this is covered by test/CodeGen/X86/hoist-invariant-load.ll when -use-dereferenceable-at-point-semantics is active. I didn't bother duplicating the command line since a) it's an in-development mode, and b) the change is pretty obvious.	2021-07-14 13:26:18 -07:00
Martin Storsjö	d37689e9ab	[libcxx] [test] Remove a LIBCXX-WINDOWS-FIXME in trivial_abi/unique_ptr_ret This is the same thing that was clarified in D105906 for weak_ptr_ret. Differential Revision: https://reviews.llvm.org/D105965	2021-07-14 23:20:11 +03:00
Philip Reames	e75a2dfe20	[tests] Stablize tests for possible change in deref semantics There's a potential change in dereferenceability attribute semantics in the nearish future. See llvm-dev thread "RFC: Decomposing deref(N) into deref(N) + nofree" and D99100 for context. This change simply adds appropriate attributes to tests to keep transform logic exercised under both old and new/proposed semantics. Note that for many of these cases, O3 would infer exactly these attributes on the test IR. This change handles the idiomatic pattern of a dereferenceable object being passed to a call which can not free that memory. There's a couple other tests which need more one-off attention, they'll be handled in another change.	2021-07-14 13:05:43 -07:00
Stanislav Mekhanoshin	76b7d3432e	[AMDGPU] Add TII::isIgnorableUse() to allow VOP rematerialization Any def of EXEC prevents rematerialization of any VOP instruction because of the physreg use. Create a callback to check if the physreg use can be ignored to allow rematerialization. Differential Revision: https://reviews.llvm.org/D105836	2021-07-14 13:03:58 -07:00
Alexey Bataev	ba2690b17b	[SLP][NFC]Fix variables names, NFC.	2021-07-14 12:43:45 -07:00
Fangrui Song	3bda1c4e22	[docs] Fix :option:`--file-header` reference in llvm-readelf.rst after D105532	2021-07-14 12:39:22 -07:00
Simon Pilgrim	4fd0addb68	[SLP] Fix case of variable name. NFCI.	2021-07-14 20:20:04 +01:00
Geoffrey Martin-Noble	8461995d35	[Bazel] Uniformly export all MLIR td files CMake would have no restrictions on this and the custom list is a pain to maintain. Reviewed By: jpienaar Differential Revision: https://reviews.llvm.org/D106003	2021-07-14 12:18:44 -07:00
Louis Dionne	850b57c5fb	[runtimes] Bring back TARGET_TRIPLE This commit reverts `5099e01568` and `77396bbc98`, which broke the build in various ways. I'm reverting until I can investigate, since that change appears to be way more subtle than it seemed.	2021-07-14 15:15:22 -04:00
Roman Lebedev	dfbfc277b2	[NFC] Drop redundant check prefixes in newly added test file	2021-07-14 22:14:36 +03:00
Nikita Popov	cd88a01cb8	[Attributes] Use single method to fetch type from AttributeSet (NFC) While it is nice to have separate methods in the public AttributeSet API, we can fetch the type from the internal AttributeSetNode using a generic API for all type attribute kinds.	2021-07-14 21:10:56 +02:00
Roman Lebedev	a4856c739c	[NFC][PhaseOrdering] Add test for the lack of CSE after SimplifyCFG (PR51092)	2021-07-14 22:07:38 +03:00
David Green	31b8f40006	[ARM] Move add(VMLALVA(A, X, Y), B) to VMLALVA(add(A, B), X, Y) For i64 reductions we currently try and convert add(VMLALV(X, Y), B) to VMLALVA(B, X, Y), incorporating the addition into the VMLALVA. If we have an add of an existing VMLALVA, this patch pushes the add up above the VMLALVA so that it may potentially be simplified further, for example being folded into another VMLALV. Differential Revision: https://reviews.llvm.org/D105686	2021-07-14 20:06:49 +01:00
Vitaly Buka	14362bf1b2	[scudo] Don't enabled MTE for small alignment Differential Revision: https://reviews.llvm.org/D105954	2021-07-14 12:04:16 -07:00
Mehdi Amini	fbab8e6f10	Remove uses of deprecated target AllPassesAndDialectsNoRegistration in Bazel (NFC) It was an alias for a long time.	2021-07-14 19:02:50 +00:00
Nikita Popov	5e4b33fe92	[Verifier] Improve incompatible attribute type check A couple of attributes had explicit checks for incompatibility with pointer types. However, this is already handled generically by the typeIncompatible() check. We can drop these after adding SwiftError to typeIncompatible(). However, the previous implementation of the check prints out all attributes that are incompatible with a given type, even though those attributes aren't actually used. This has the annoying result that the error message changes every time a new attribute is added to the list. Improve this by explicitly finding which attribute isn't compatible and printing just that.	2021-07-14 21:02:10 +02:00
Saleem Abdulrasool	9c2de23821	Demangle: correct swift_async demangling for Microsoft scheme The emission was corrected for the swift_async calling convention but the demangling support was not. This repairs the demangling support as well.	2021-07-14 11:43:44 -07:00
Eli Friedman	1e30bf8621	[SelectionDAG] Add an overload of getStepVector that assumes step 1. This is mostly a minor convenience, but the pattern seems frequent enough to be worthwhile (and we'll probably add more uses in the future). Differential Revision: https://reviews.llvm.org/D105850	2021-07-14 11:37:01 -07:00
Thomas Lively	970e090010	[WebAssembly] Codegen for v128.loadX_lane instructions Replace the experimental clang builtin and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50433. Differential Revision: https://reviews.llvm.org/D105950	2021-07-14 11:31:53 -07:00
Louis Dionne	5099e01568	[runtimes] Inherit the TARGET_TRIPLE that may be set by LLVM	2021-07-14 14:29:29 -04:00
Thomas Lively	122b0220fd	[WebAssembly] Remove datalayout strings from llc tests The data layout strings do not have any effect on llc tests and will become misleadingly out of date as we continue to update the canonical data layout, so remove them from the tests. Differential Revision: https://reviews.llvm.org/D105842	2021-07-14 11:17:08 -07:00
Fangrui Song	7de2173c2a	[ELF] --fortran-common: prefer STB_WEAK to COMMON The ELF specification says "The link editor honors the common definition and ignores the weak ones." GNU ld and our Symbol::compare follow this, but the --fortran-common code (D86142) made a mistake on the precedence. Fixes https://bugs.llvm.org/show_bug.cgi?id=51082 Reviewed By: peter.smith, sfertile Differential Revision: https://reviews.llvm.org/D105945	2021-07-14 10:18:30 -07:00
David Green	338314f9c2	[ARM] Lower v16i8 -> i64 VMLA reductions. MVE does not have a VMLALV instruction that can perform v16i8 -> i64 reductions, like it does for v8i16->i64 and v4i32->i64 reductions. That means that the pattern to create them will be split up by type legalization, creating a lot of instructions. This extends the patterns for matching i64 reductions a little to handle the v16i8->i64 case. We need to turn them into a pair of v8i16->i64 VMLALVs that each perform half of the reduction and are summed together (so the latter is a VMLALVA). The order of the lanes does not matter for the reduction so we generate a MVEEXT for the extension, that will either be folded into an extending load or can be optimized to a VREV/VMOVL. Some of the resulting codegen isn't optimal, but will be improved in a later patch. Differential Revision: https://reviews.llvm.org/D105680	2021-07-14 18:11:32 +01:00
Sanjay Patel	ca6e117d86	[InstCombine] reorder icmp with offset folds for better results This set of folds was added recently with: `c7b658aeb5` `0c400e8953` `40b752d28d` ...and I noted that this wasn't likely to fire in code derived from C/C++ source because of nsw in particular. But I didn't notice that I had placed the code above the no-wrap block of transforms. This is likely the cause of regressions noted from the previous commit because -- as shown in the test diffs -- we may have transformed into a compare with an arbitrary constant rather than a simpler signbit test.	2021-07-14 12:12:05 -04:00
Sanjay Patel	b155c871f2	[InstCombine] add tests for icmp with constant offset and no-wrap flags; NFC	2021-07-14 12:12:05 -04:00
Sander de Smalen	efaf3099c8	[LV] Print remark when loop cannot be vectorized due to invalid costs. This patch emits remarks for instructions that have invalid costs for a given set of vectorization factors. Some example output: t.c:4:19: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): load dst[i] = sinf(src[i]); ^ t.c:4:14: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1, vscale x 2, vscale x 4): call to llvm.sin.f32 dst[i] = sinf(src[i]); ^ t.c:4:12: remark: Instruction with invalid costs prevented vectorization at VF=(vscale x 1): store dst[i] = sinf(src[i]); ^ Reviewed By: fhahn, kmclaughlin Differential Revision: https://reviews.llvm.org/D105806	2021-07-14 17:11:33 +01:00
Matt Arsenault	47269da5d8	GlobalISel: Handle lowering non-power-of-2 extloads	2021-07-14 11:54:11 -04:00
Sander de Smalen	eac1670739	[CostModel][AArch64] Make loads/stores of <vscale x 1 x eltty> invalid. At the moment, <vscale x 1 x eltty> are not yet fully handled by the code-generator, so to avoid vectorizing loops with that VF, we mark the cost for these types as invalid. The reason for not adding a new "TTI::getMinimumScalableVF" is because the type is supposed to be a type that can be legalized. It partially is, although the support for these types need some more work. Reviewed By: paulwalker-arm, dmgreen Differential Revision: https://reviews.llvm.org/D103882	2021-07-14 16:44:22 +01:00
Aaron Ballman	aefd6c615c	Combine two diagnostics into one and correct grammar The anonymous and non-anonymous bit-field diagnostics are easily combined into one diagnostic. However, the diagnostic was missing a "the" that is present in the almost-identically worded warn_bitfield_width_exceeds_type_width diagnostic, hence the changes to test cases.	2021-07-14 11:43:28 -04:00
Jay Foad	372bb08252	[AMDGPU] Check llc-pipeline.ll with -match-full-lines -strict-whitespace This prevents breaking the indentation that shows the structure of the pass managers. Differential Revision: https://reviews.llvm.org/D105891	2021-07-14 16:33:50 +01:00
Alexey Bataev	2eb50baf05	[SLP]Workaround for InsertSubVector cost. The cost of the InsertSubvector shuffle kind cost is not complete and may end up with just extracts + inserts costs in many cases. Added a workaround to represent it as a generic PermuteSingleSrc, which is still pessimistic but better than InsertSubvector. Differential Revision: https://reviews.llvm.org/D105827	2021-07-14 07:54:24 -07:00
Louis Dionne	77396bbc98	[runtimes] NFCI: Drop intermediate CMake variable TARGET_TRIPLE We might as well use the various XXX_TARGET_TRIPLE variables directly.	2021-07-14 10:49:28 -04:00
Yitzhak Mandelbaum	93dc73b1e0	[Lexer] Fix bug in `makeFileCharRange` called on split tokens. When the end loc of the specified range is a split token, `makeFileCharRange` does not process it correctly. This patch adds proper support for split tokens. Differential Revision: https://reviews.llvm.org/D105365	2021-07-14 14:36:31 +00:00
Peixin Qiao	67002b5f20	[flang][OpenMP] Fix semantic check of test case in taskloop simd construct The following semantic check is removed in OpenMP Version 5.0: ``` Taskloop simd construct restrictions: No reduction clause can be specified. ``` Also fix several typos. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D105874	2021-07-14 10:34:17 -04:00
Jinsong Ji	fe52296a34	[AIX] Enable dollar sign as PC in inlineasm $ is used as PC for PowerPC inlineasm, ELF use it, enable it for AIX XCOFF as well. Reviewed By: #powerpc, amyk, nemanjai Differential Revision: https://reviews.llvm.org/D105956	2021-07-14 13:37:52 +00:00
Matthias Springer	b70dde522d	[mlir][linalg] Fix typo in ExtractSliceOfPadTensorSwapPattern Differential Revision: https://reviews.llvm.org/D105607	2021-07-14 22:30:32 +09:00
oToToT	56e6d4742e	[docs] Update CMake cross compiling guide link The CMake community Wiki has been moved to the [[ https://gitlab.kitware.com/cmake/community/wikis/home \| Kitware GitLab Instance ]]. Also, the original anchor for `Information how to set up various cross compiling toolchains` section might not work as expected. The original content is now being collapsed, so browser won't navigate to the right section directly. Hence, I think it might be better to provide the section name instead of `this section` with link to help readers find the right section by themselves. Reviewed By: void Differential Revision: https://reviews.llvm.org/D104996	2021-07-14 21:19:42 +08:00
Tim Northover	b18bda6791	ARM: reuse existing libcall global variable if possible. If we try to create a new GlobalVariable on each iteration, the Module will detect the name collision and "helpfully" rename later iterations by appending ".1" etc. But "___udivsi3.1" doesn't exist and we definitely don't want to try to call it. So instead check whether there's already a global with the right name in the module and use that if so.	2021-07-14 14:14:47 +01:00
Sanjay Patel	25ee55c0ba	[SLP] match logical and/or as reduction candidates This has been a work-in-progress for a long time...we finally have all of the pieces in place to handle vectorization of compare code as shown in: https://llvm.org/PR41312 To do this (see PhaseOrdering tests), we converted SimplifyCFG and InstCombine to the poison-safe (select) forms of the logic ops, so now we need to have SLP recognize those patterns and insert a freeze op to make a safe reduction: https://alive2.llvm.org/ce/z/NH54Ah We get the minimal patterns with this patch, but the PhaseOrdering tests show that we still need adjustments to get the ideal IR in some or all of the motivating cases. Differential Revision: https://reviews.llvm.org/D105730	2021-07-14 09:02:31 -04:00
Gabor Marton	bdf31471c7	[Analyzer][solver] Add dump methods for (dis)equality classes. This proved to be very useful during debugging. Differential Revision: https://reviews.llvm.org/D103967	2021-07-14 13:45:02 +02:00
Alexander Shaposhnikov	d21772fa21	[lld][MachO] Code cleanup Make use of ArgList::getLastArgValue. NFC. Test plan: make check-lld-macho Differential revision: https://reviews.llvm.org/D105452	2021-07-14 04:33:09 -07:00
Djordje Todorovic	df686842bc	[RemoveRedundantDebugValues] Add a Pass that removes redundant DBG_VALUEs This new MIR pass removes redundant DBG_VALUEs. After the register allocator is done, more precisely, after the Virtual Register Rewriter, we end up having duplicated DBG_VALUEs, since some virtual registers are being rewritten into the same physical register as some of existing DBG_VALUEs. Each DBG_VALUE should indicate (at least before the LiveDebugValues) variables assignment, but it is being clobbered for function parameters during the SelectionDAG since it generates new DBG_VALUEs after COPY instructions, even though the parameter has no assignment. For example, if we had a DBG_VALUE $regX as an entry debug value representing the parameter, and a COPY and after the COPY, DBG_VALUE $virt_reg, and after the virtregrewrite the $virt_reg gets rewritten into $regX, we'd end up having redundant DBG_VALUE. This breaks the definition of the DBG_VALUE since some analysis passes might be built on top of that premise..., and this patch tries to fix the MIR with the respect to that. This first patch performs backward scan, by trying to detect a sequence of consecutive DBG_VALUEs, and to remove all DBG_VALUEs describing one variable but the last one: For example: (1) DBG_VALUE $edi, !"var1", ... (2) DBG_VALUE $esi, !"var2", ... (3) DBG_VALUE $edi, !"var1", ... ... in this case, we can remove (1). By combining the forward scan that will be introduced in the next patch (from this stack), by inspecting the statistics, the RemoveRedundantDebugValues removes 15032 instructions by using gdb-7.11 as a testbed. Differential Revision: https://reviews.llvm.org/D89697 placeholder
Simon Pilgrim	d561b6fbdb	[InstCombine] Fold (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0)) (PR50183) (REAPPLIED) As discussed on PR50183, we already fold to prefer 'select-of-idx' vs 'select-of-gep': define <4 x i32>* @select0a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) { %gep0 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1 %gep1 = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a3 %sel = select i1 %a2, <4 x i32>* %gep0, <4 x i32>* %gep1 ret <4 x i32>* %sel } --> define <4 x i32>* @select1a(<4 x i32>* %a0, i64 %a1, i1 %a2, i64 %a3) { %sel = select i1 %a2, i64 %a1, i64 %a3 %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel ret <4 x i32>* %gep } This patch adds basic handling for the 'fallthrough' cases where the gep idx == 0 has been folded away to the base address: define <4 x i32>* @select0(<4 x i32>* %a0, i64 %a1, i1 %a2) { %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %a1 %sel = select i1 %a2, <4 x i32>* %a0, <4 x i32>* %gep ret <4 x i32>* %sel } --> define <4 x i32>* @select1(<4 x i32>* %a0, i64 %a1, i1 %a2) { %sel = select i1 %a2, i64 0, i64 %a1 %gep = getelementptr inbounds <4 x i32>, <4 x i32>* %a0, i64 %sel ret <4 x i32>* %gep } Reapplied with a fix for the bpf "-bpf-disable-avoid-speculation" tests Differential Revision: https://reviews.llvm.org/D105901	2021-07-14 12:21:01 +01:00
Chuanqi Xu	12d04ce956	[NFC] [Coroutines] Remove unused CoroFree	2021-07-14 19:13:12 +08:00
Bruce Mitchener	f7d931ac37	[lldb][docs] Remove mention of subversion. NFC. Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D103744	2021-07-14 11:04:07 +00:00
Simon Pilgrim	ee71c1bbcc	[X86] Implement smarter instruction lowering for FP_TO_UINT from f32/f64 to i32/i64 and vXf32/vXf64 to vXi32 for SSE2 and AVX2 by using the exact semantic of the CVTTPS2SI instruction. We know that "CVTTPS2SI" returns 0x80000000 for out of range inputs (and for FP_TO_UINT, negative float values are undefined). We can use this to make unsigned conversions from vXf32 to vXi32 more efficient, particularly on targets without blend using the following logic: small := CVTTPS2SI(x); fp_to_ui(x) := small \| (CVTTPS2SI(x - 2^31) & ARITHMETIC_RIGHT_SHIFT(small, 31)) Even on targets where "PBLENDVPS"/"PBLENDVB" exists, it is often a latency 2, low throughput instruction so this logic is applied there too (in particular for AVX2 also). It furthermore gets rid of one high latency floating point comparison in the previous lowering. @TomHender checked the correctness of this for all possible floats between -1 and 2^32 (both ends excluded). Original Patch by @TomHender (Tom Hender) Differential Revision: https://reviews.llvm.org/D89697	2021-07-14 12:03:49 +01:00
LLVM GN Syncbot	90e7f5d259	[gn build] Port `c08dabb0f4`	2021-07-14 10:49:08 +00:00
Simon Pilgrim	0722f3d0fa	Revert rGb803294cf78714303db2d3647291a2308347ef23 : "[InstCombine] Fold (select C, (gep Ptr, Idx), Ptr) -> (gep Ptr, (select C, Idx, 0)) (PR50183)" Missed some BPF test changes that need addressing	2021-07-14 11:48:37 +01:00
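
The FP_TO_UINT formula quoted in commit ee71c1bbcc above can be checked with a small scalar model. This is only an illustrative sketch, not the LLVM lowering itself: `cvttps2si` and `fp_to_ui` here are hypothetical Python helpers that model a single 32-bit lane, Python doubles stand in for f32 values, and the out-of-range result 0x80000000 is taken from the commit message.

```python
import math

INT32_MIN = -(1 << 31)
INT32_MAX = (1 << 31) - 1


def cvttps2si(x: float) -> int:
    """Hypothetical one-lane model: truncate toward zero to a signed 32-bit
    integer; NaN, infinities and out-of-range inputs give 0x80000000."""
    if math.isnan(x) or math.isinf(x):
        return INT32_MIN
    t = math.trunc(x)
    if t < INT32_MIN or t > INT32_MAX:
        return INT32_MIN
    return t


def fp_to_ui(x: float) -> int:
    """small | (CVTTPS2SI(x - 2^31) & ARITHMETIC_RIGHT_SHIFT(small, 31))."""
    small = cvttps2si(x)
    big = cvttps2si(x - 2.0 ** 31)
    # Python's >> on a negative int is an arithmetic shift, so the mask is
    # -1 (all ones) when small has its sign bit set and 0 otherwise.
    mask = small >> 31
    # Reduce to the unsigned 32-bit bit pattern for comparison.
    return (small | (big & mask)) & 0xFFFFFFFF


if __name__ == "__main__":
    for value in (0.0, 1.5, 2147483648.0, 3e9, 4294967040.0):
        print(value, fp_to_ui(value))
```

For inputs below 2^31 the mask is zero and the first conversion is used directly. For inputs in [2^31, 2^32) the first conversion yields 0x80000000, which supplies both the top bit and an all-ones mask that lets the second conversion fill in the low 31 bits.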


393605 Commits