intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-01-26 03:56:16 +08:00

Author	SHA1	Message	Date
Fraser Cormack	688d71ea88	[libclc] Track dependencies through dependency files (#86965) This commit fixes the problem of missing build dependencies between libclc source files and their various includes (namely headers and .inc files). We would like to do this with compiler-generated dependency files because then the dependencies are accurate and there are no false positives that would lead to unnecessary rebuilds. This is how regular C/C++ dependencies are usually tracked by CMake. Note that this variable is an internal API so is not guaranteed to work, but then again all of CMake's support for new languages (which we use for CLC/LL languages) is an internal API. On balance this change is probably worth it due to how minimally invasive it is. It should work with all supported compilers and CMake generators.	2024-03-28 20:29:25 +00:00
Jan Svoboda	44af53b22a	[clang][modules] Avoid calling expensive `SourceManager::translateFile()` (#86216) The `ASTWriter` algorithm for computing affecting module maps uses `SourceManager::translateFile()` to get a `FileID` from a `FileEntry`. This is slow (O(n)) since the function performs a linear walk over `SLocEntries` until it finds one with a matching `FileEntry`. This patch removes this use of `SourceManager::translateFile()` by tracking `FileID` instead of `FileEntry` in a couple of places in `ModuleMap`, giving `ASTWriter` the desired `FileID` directly. There are no changes required for clients that still want a `FileEntry` from `ModuleMap`: the existing APIs internally use `SourceManager` to perform the reverse `FileID` to `FileEntry` conversion in O(1).	2024-03-28 13:02:48 -07:00
Simon Pilgrim	5b06de7f99	[X86] Add isLogicOp helper to match ISD::AND/OR/XOR and X86ISD::ANDNP We could easily support the X86ISD 'float' variants of the logic ops as well, but we don't have good test coverage at the moment (they're mainly for SSE1 targets).	2024-03-28 19:39:17 +00:00
Philip Reames	346f49927f	[RISCV] Add add_like PatFrags to reduce number of required patterns [nfc] (#86983 ) This is an NFC prep patch for an upcoming change which is going to support or_is_add in a bunch more cases. Posting separately because tblgen is not particularly my strong suit.	2024-03-28 12:32:17 -07:00
Joshua Batista	6f10dccbab	[HLSL] Add validation for the -enable-16bit-types option (#85340 ) Previously, the clang compiler with the dxc driver would accept the -enable-16bit-types flag without checking to see if the required conditions are met for proper processing of the flag. Specifically, -enable-16bit-types requires a shader model of at least 6.2 and an HLSL version of at least 2021. This PR adds a validation check for these other options having the required values, and emits an error if these constraints are not met. Fixes #57876 --------- Co-authored-by: Damyan Pepper <damyanp@microsoft.com> Co-authored-by: Chris B <cbieneman@microsoft.com>	2024-03-28 12:13:48 -07:00
Christian Sigg	6e58efac16	[mlir][bazel] Export headers either from :Transforms or :TransformUtils (#86819 ) Split them according to their implementation. Ideally, header files should be used by only one target, but this is hard because CMake is less strict with headers (no layering check). But even with bazel, headers should only be exported once in the `hdrs` attribute. Other targets may use them in the `srcs` attribute to avoid circular dependencies.	2024-03-28 20:12:27 +01:00
Aaron Ballman	c2f3a11dbe	Remove an accidental duplicate C status page entry	2024-03-28 14:52:43 -04:00
Noah Goldstein	0e78655731	[LVI] Use m_AddLike instead of m_Add when matching simple condition We have more complete logic for handling `Add`, so try to use that logic for `or disjoint` (which can definitionally be treated as `add`). Closes #86058	2024-03-28 13:49:05 -05:00
Noah Goldstein	efa1544c2c	[LVI] Add tests for tracking `or disjoint` like add; NFC	2024-03-28 13:49:05 -05:00
Noah Goldstein	637421cb88	[ValueTracking] Tracking `or disjoint` conditions as `add` in Assumption/DomCondition Cache We can definitionally treat `or disjoint` as `add` anywhere. Closes #86302	2024-03-28 13:49:05 -05:00
Noah Goldstein	cba6df99e0	[ValueTracking] Add basic tests tracking `or disjoint` conditions as `add`; NFC	2024-03-28 13:49:04 -05:00
Aaron Ballman	8ee5a3fd1c	[C99] Claim conformance to WG14 N590 (#86985 ) This adds test coverage for implementation limits that were defined in WG14 N590. The original content of that paper is not available, so this actually tests against the limits as of C23.	2024-03-28 14:39:58 -04:00
Alastair Houghton	7be847e60f	[libc++abi] Disable forced_unwind4 test for musl. (#85096 ) This test won't pass on musl, but we should still run it for other Linux platforms. rdar://123436716	2024-03-28 14:37:45 -04:00
Jakub Kuderski	d61ec513c4	[mlir][spirv] Add IsInf/IsNan expansion for WebGPU (#86903 ) These non-finite math ops are supported by SPIR-V but not by WGSL. Assume finite floating point values and expand these ops into `false`. Previously, this worked by adding fast math flags during conversion from arith to spirv, but this got removed in https://github.com/llvm/llvm-project/pull/86578. Also do some misc cleanups in the surrounding code.	2024-03-28 14:13:04 -04:00
Haojian Wu	599027857e	[clang] Add invalid check in NormalizedConstraint::fromConstraintExpr. (#86943) This spot was overlooked in #86869; we should always check the invalid bit after constructing the `Sema::InstantiatingTemplate` RAII object.	2024-03-28 18:58:52 +01:00
Yijia Gu	da5d576026	[mlir] fix empty spaces in bazel file	2024-03-28 10:56:15 -07:00
Zaara Syeda	6582509daa	[AIX] Handle toc-data offset overflowing 16-bits (#80092 ) When the toc-data offset overflows the 16-bits, we can truncate the value to the 16-bit value as the linker will handle overflow through fixup code.	2024-03-28 13:55:13 -04:00
Oleksandr "Alex" Zinenko	0b790572b1	[mlir] propagate silenceable failures in transform.foreach_match (#86956 ) The original implementation was eagerly reporting silenceable failures from actions as definite failures. Since silenceable failures are intended for cases when the IR has not been irreversibly modified, it's okay to propagate them as silenceable failures of the parent op. Fixes #86834.	2024-03-28 18:52:10 +01:00
Yijia Gu	2af3b43642	[mlir] fix bazel error for MatchInterfaces dialect	2024-03-28 10:49:27 -07:00
Prashant Kumar	aa7ae1ba0b	[mlir][tensor] Fold producer linalg transpose with consumer unpack and vice versa (#86795) -- Adds folding of producer linalg transpose op with consumer unpack op, also adds folding of producer unpack op and consumer transpose op. -- Minor bug fixes w.r.t. the test cases.	2024-03-28 23:13:33 +05:30
Andrii Levitskiy	6dceea3cb2	[HLSL] prevent generation of wrong double intrinsics. (#86932) As in #86555, we should cover all non-double builtins. Closes #86818	2024-03-28 13:34:31 -04:00
Maksim Panchenko	35e7d458c9	[BOLT] Add rewriting support for Linux kernel __bug_table (#86908 ) Update instruction locations in the __bug_table section after new code is emitted. If an instruction with associated bug ID was deleted, overwrite its location with zero.	2024-03-28 10:30:27 -07:00
Justin Bogner	237572f2ff	[SPIR-V] Fix paths when copying spirv-dis and spirv-val on windows (#86876) We need `CMAKE_EXECUTABLE_SUFFIX` here so we get the paths right when they end in `.exe`.	2024-03-28 10:29:30 -07:00
Jonas Paulsson	16b7cc69ef	[SystemZ] Eliminate call sequence instructions early. (#77812 ) On SystemZ, the outgoing argument area which is big enough for all calls in the function is created once during the prolog, as opposed to adjusting the stack around each call. The call-sequence instructions are therefore not really useful any more than to compute the maximum call frame size, which has so far been done by PEI, but can just as well be done at an earlier point. This patch removes the mapping of the CallFrameSetupOpcode and CallFrameDestroyOpcode and instead computes the MaxCallFrameSize directly after instruction selection and then removes the ADJCALLSTACK pseudos. This removes the confusing pseudos and also avoids the problem of having to keep the call frame size accurate when creating new MBBs. This fixes #76618 which exposed the need to maintain the call frame size when splitting blocks (which was not done).	2024-03-28 18:26:38 +01:00
Xiangyang (Mark) Guo	1607e8212c	[InlineCost] Disable cost-benefit when sample based PGO is used (#86626) #66457 makes InlineCost use cost-benefit by default, which causes a 0.4-0.5% performance regression on multiple internal workloads. See the discussion at https://github.com/llvm/llvm-project/pull/66457. This pull request reverts it. Co-authored-by: helloguo <helloguo@meta.com>	2024-03-28 10:11:57 -07:00
OverMighty	6aee1f9d18	[libc][math][c23] Fix bounds checking and add FE_INVALID raising in {,u}fromfp{,x}* (#86892 ) See https://github.com/llvm/llvm-project/pull/86692#issuecomment-2024044889 and https://github.com/llvm/llvm-project/pull/86892#discussion_r1542276037. cc @lntue @nickdesaulniers	2024-03-28 13:08:14 -04:00
Balázs Kéri	8dcff10e9b	[clang][analyzer] Improve documentation of StreamChecker (NFC). (#83858 )	2024-03-28 18:04:35 +01:00
Peiming Liu	23ca8e654d	[NFC][mlir][tensor][transform] fix compilation warning. (#86977 )	2024-03-28 09:58:05 -07:00
Nick Desaulniers	ad97ee2531	[libc][support][FixedVector] add reverse iterator (#86732 ) Critically, we don't want to return an iterator to the end of the underlying cpp::array "store." Add a test to catch this issue. This will be used by __cxa_finalize to iterate backwards through a FixedVector. Link: #85651	2024-03-28 09:53:44 -07:00
Charlie Barto	423832421b	[asan][windows] Weak function interception support in instruction size decoder. (#86570 ) This makes it so we'll be able to decode the instructions used in the weak function stubs from https://github.com/llvm/llvm-project/pull/81677. This code doesn't technically require those changes. Co-authored-by: Amy Wishnousky <amyw@microsoft.com>	2024-03-28 09:52:25 -07:00
Kazu Hirata	706c1302f9	[Dialect] Fix a warning This patch fixes: mlir/lib/Dialect/Tensor/Transforms/MergeConsecutiveInsertExtractSlicePatterns.cpp:158:17: error: 'matchAndRewrite' overrides a member function but is not marked 'override' [-Werror,-Wsuggest-override]	2024-03-28 09:43:04 -07:00
Keith Smiley	39fe729502	[lld-macho] Ignore -no_warn_duplicate_libraries flag (#86303 ) This is a new ld64 flag (along with `-warn_duplicate_libraries`), where the warning is enabled by default, and it can be useful to ignore since it can be hard to dedup library flags across large builds. This doesn't ignore the enabling version since if someone manually passed that and lld didn't respect it, we probably want the user to know that.	2024-03-28 09:41:08 -07:00
Farzon Lotfi	36b86438d7	[DXIL] Implement pow lowering (#86733 ) closes #86179 - `DXILIntrinsicExpansion.cpp` - add the pow expansion to exp2(y*log2(x))	2024-03-28 12:32:28 -04:00
Nathan Gauër	0f61051f54	[clang][HLSL][SPIR-V] Add convergence intrinsics (#80680) HLSL has wave operations and other kinds of functions which require the control flow to either be converged or to respect certain constraints on where and how to re-converge. At the HLSL level, convergence is mostly obvious: the control flow is expected to re-converge at the end of a scope. Once translated to IR, HLSL scopes disappear. This means we need a way to communicate convergence restrictions down to the backend. For this, the SPIR-V backend uses convergence intrinsics. So this commit adds some code to generate convergence intrinsics when required. --------- Signed-off-by: Nathan Gauër <brioche@google.com>	2024-03-28 17:18:05 +01:00
Fangrui Song	2763353891	[Object,ELFType] Rename TargetEndianness to Endianness (#86604 ) `TargetEndianness` is long and unwieldy. "Target" in the name is confusing. Rename it to "Endianness". I cannot find noticeable out-of-tree users of `TargetEndianness`, but keep `TargetEndianness` to make this patch safer. `TargetEndianness` will be removed by a subsequent change.	2024-03-28 09:10:34 -07:00
Nick Desaulniers	7789ec067d	[libc] s/NULL/nullptr (#86867 ) Otherwise we need to pull in stddef.h for the declaration of NULL.	2024-03-28 08:43:11 -07:00
Craig Topper	152fcf6e77	[RISCV] Add validation of SPIMM for cm.push/pop. (#84989 ) This checks the immediate is a multiple of 16 bytes.	2024-03-28 08:38:18 -07:00
Craig Topper	f90813543b	[MCP] Use MachineInstr::all_defs instead of MachineInstr::defs in hasOverlappingMultipleDef. (#86889 ) defs does not return the defs for inline assembly. We need to use all_defs to find them. Fixes #86880.	2024-03-28 08:37:19 -07:00
Jerry Wu	f566b079f1	[MLIR] Add pattern to fold insert_slice of extract_slice (#86328) Fold the `tensor.insert_slice` of `tensor.extract_slice` into `tensor.extract_slice` when the `insert_slice` simply expands some unit dims dropped by the `extract_slice`.	2024-03-28 11:18:47 -04:00
Jonas Paulsson	94b5c118b3	[ISel] Move handling of atomic loads from SystemZ to DAGCombiner (NFC). (#86484 ) The folding of sign/zero extensions into an atomic load by specifying an extension type is not target specific, and therefore belongs in the DAGCombiner rather than in the SystemZ backend. - Handle atomic loads similarly to regular loads by adding AtomicLoadExtActions with set/get methods. - Move SystemZ extendAtomicLoad() to DagCombiner.cpp.	2024-03-28 16:14:35 +01:00
Fraser Cormack	e251f56a4d	[libclc] Make CMake messages better fit into LLVM (#86945 ) The libclc project is currently only properly supported as an external project. However, when trying to get it to also build in-tree, the CMake configuration messages it outputs stand out amongst the rest of the LLVM projects and sub-projects. This commit makes all messages clear that they belong to the libclc project, as well as turning them into 'STATUS' messages where appropriate.	2024-03-28 15:11:30 +00:00
martinboehme	ae280281ce	[clang][dataflow] Fix for value constructor in class derived from optional. (#86942 ) The constructor `Derived(int)` in the newly added test `ClassDerivedFromOptionalValueConstructor` is not a template, and this used to cause an assertion failure in `valueOrConversionHasValue()` because `F.getTemplateSpecializationArgs()` returns null. (This is modeled after the `MaybeAlign(Align Value)` constructor, which similarly causes an assertion failure in the analysis when assigning an `Align` to a `MaybeAlign`.) To fix this, we can simply look at the type of the destination type which we're constructing or assigning to (instead of the function template argument), and this not only fixes this specific case but actually simplifies the implementation. I've added some additional tests for the case of assigning to a nested optional because we didn't have coverage for these and I wanted to make sure I didn't break anything.	2024-03-28 16:05:11 +01:00
Andrzej Warzyński	d3aa92ed14	[mlir][vector] Add support for scalable vectors to VectorLinearize (#86786) Adds support for scalable vectors to patterns defined in VectorLinearize.cpp. Linearization is disabled in 2 notable cases: * vectors with more than 1 scalable dimension (we cannot represent vscale^2), * vectors initialised with arith.constant that's not a vector splat (such arith.constant Ops cannot be flattened).	2024-03-28 14:53:21 +00:00
Ed Maste	ffed554f2d	[libc++] Switch FreeBSD to C++26 (#86658 )	2024-03-28 10:52:50 -04:00
Andrzej Warzyński	d7753989ea	[mlir][linalg] Add e2e test for linalg.mmt4d + pack/unpack (#84964) This is a follow-up for #81790. This patch basically extends: * test/Integration/Dialect/Linalg/CPU/mmt4d.mlir with pack/unpack ops so that the overall computation is a matrix multiplication (as opposed to linalg.mmt4d). For comparison (and to make it easier to verify correctness), linalg.matmul is also included in the test.	2024-03-28 14:52:08 +00:00
Alexey Bataev	d7975c9d93	[SLP]Add better minbitwidth analysis for udiv/urem instructions. Adds improved bitwidth analysis for udiv/urem instructions. The analysis is based on similar version in InstCombiner. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/85928	2024-03-28 10:35:15 -04:00
Alfie Richards	ff870aeeb7	[ARM] Add reference to `ARMAsmParser` in `ARMOperand` (#86110 )	2024-03-28 14:06:40 +00:00
Yingwei Zheng	a515ea553f	[OCaml] Fix buildbot failure caused by `caa2258`. NFC. Closes #86944.	2024-03-28 22:00:04 +08:00
Akira Hatanaka	84780af4b0	[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923 ) To authenticate pointers, CodeGen needs access to the key and discriminators that were used to sign the pointer. That information is sometimes known from the context, but not always, which is why `Address` needs to hold that information. This patch adds methods and data members to `Address`, which will be needed in subsequent patches to authenticate signed pointers, and uses the newly added methods throughout CodeGen. Although this patch isn't strictly NFC as it causes CodeGen to use different code paths in some cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any changes in functionality as it doesn't add any information needed for authentication. In addition to the changes mentioned above, this patch introduces class `RawAddress`, which contains a pointer that we know is unsigned, and adds several new functions for creating `Address` and `LValue` objects. This reapplies `d9a685a9dd`, which was reverted because it broke ubsan bots. There seems to be a bug in coroutine code-gen, which is causing EmitTypeCheck to use the wrong alignment. For now, pass alignment zero to EmitTypeCheck so that it can compute the correct alignment based on the passed type (see function EmitCXXMemberOrOperatorMemberCallExpr).	2024-03-28 06:54:36 -07:00
Amy Kwan	a3efc53f16	[AIX][TLS] Produce a faster local-exec access sequence for the "aix-small-tls" global variable attribute (#83053 ) Similar to `3f46e5453d`, this patch allows the backend to produce a faster access sequence for the local-exec TLS model, where loading from the TOC can be avoided, for local-exec TLS variables that are annotated with the "aix-small-tls" attribute. The expectation is for local-exec TLS variables to be set with this attribute through PGO. Furthermore, the optimized access sequence is only generated for local-exec TLS variables annotated with "aix-small-tls", only if they are less than ~32KB in size.	2024-03-28 09:18:45 -04:00
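
Note on 36b86438d7 ([DXIL] Implement pow lowering): the commit message says pow is expanded to exp2(y*log2(x)). The sketch below only illustrates that identity on scalar doubles. It is not the code from `DXILIntrinsicExpansion.cpp`; the function name `powViaExp2` is hypothetical, and the restriction to x > 0 is an assumption of this sketch (log2 is not defined for non-positive inputs), not something the commit message states.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper, for illustration only (not an LLVM API).
// Computes x^y through the identity x^y = 2^(y * log2(x)),
// which is valid for x > 0.
double powViaExp2(double x, double y) {
    assert(x > 0.0 && "this sketch assumes a positive base");
    return std::exp2(y * std::log2(x));
}
```

For example, `powViaExp2(2.0, 10.0)` evaluates `exp2(10 * 1)`, which is 1024. Results for other inputs may differ from `std::pow` in the last few bits because of rounding in the intermediate product.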
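
Note on 0e78655731 and 637421cb88 (`or disjoint` treated as `add`): both commit messages rely on the fact that an `or` whose operands share no set bits produces the same value as an `add`, because no bit position can generate a carry. The sketch below shows that arithmetic fact on 32-bit unsigned integers. The helper names `haveNoCommonBits` and `orDisjoint` are hypothetical and are not LLVM APIs; in LLVM IR the property is carried by the `disjoint` flag on `or` rather than checked at run time.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, for illustration only (not LLVM APIs).

// True when a and b have no bit set in the same position.
constexpr bool haveNoCommonBits(std::uint32_t a, std::uint32_t b) {
    return (a & b) == 0;
}

// Bitwise OR of two operands that are asserted to be disjoint.
// Under that precondition the result equals a + b, since adding
// disjoint bit patterns never produces a carry.
std::uint32_t orDisjoint(std::uint32_t a, std::uint32_t b) {
    assert(haveNoCommonBits(a, b) && "operands must not share set bits");
    return a | b;
}
```

With disjoint operands such as 0xF0 and 0x0F, `orDisjoint` returns 0xFF, the same as their sum. With overlapping operands such as 6 and 3 the precondition fails, and indeed 6 | 3 is 7 while 6 + 3 is 9, which is why the equivalence only holds for the disjoint form.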

494233 Commits