intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-02-05 22:17:23 +08:00

Author	SHA1	Message	Date
Andrzej Warzynski	ec1981f4ed	[mlir][vector] Fix link in docs (nfc)	2024-08-06 19:29:39 +01:00
vporpo	36f0d64818	[SandboxIR] Implement AllocaInst (#102027) This patch implements sandboxir::AllocaInst which mirrors llvm::AllocaInst.	2024-08-06 11:24:13 -07:00
Valentin Clement (バレンタインクレメン)	a3ccaed3b9	[flang][cuda] Allocate local descriptor in managed memory (#102060) This patch adds entry points in the runtime to allocate descriptors in managed memory. These entry points currently only call `CUFAllocManaged` and `CUFFreeManaged` but could be more complicated in the future. `cuf.alloc` and `cuf.free` related to local descriptors are converted into runtime calls.	2024-08-06 11:17:11 -07:00
lntue	f133dd92f8	[libc][math] Improve the error analysis and accuracy for pow function. (#102098)	2024-08-06 14:08:36 -04:00
Kevin Frei	e77ac42bcc	[lldb][debuginfod] Fix the DebugInfoD PR that caused issues when working with stripped binaries (#99362) @walter-erquinigo found that the [PR with testing and a fix for DebugInfoD](https://github.com/llvm/llvm-project/pull/98344) caused an issue when working with stripped binaries. The issue is that when you're working with split-dwarf, there are 3 possible files: the stripped binary the user is debugging, the "only-keep-debug" or unstripped binary, plus the `.dwp` file. The debuginfod plugin should provide the unstripped/OKD binary. However, if the debuginfod plugin fails, the default symbol locator plugin will just return the stripped binary, which doesn't help. So, to address that, the SymbolVendorELF code checks to see if the SymbolLocator's ExecutableObjectFile request returned the same file, and bails if that's the case. You can see the specific diff as the second commit in the PR. I'm investigating adding a test: I can't quite get a simple repro, and I'm unwilling to make any additional changes to Makefile.rules in this diff, for Pavlovian reasons.	2024-08-06 11:06:04 -07:00
Joseph Huber	3983bf6040	[libc] Fix GPU argument vector writing `nullptr` to string Summary: The intention behind this code was to null terminate the `envp` string, but it accidentally went into the string data.	2024-08-06 13:03:06 -05:00
Natan-GabrielTiutiuIntel	5e6d5c01e0	[mlir] Add --list-passes option to mlir-opt (#100420) Currently, the only way to see the passes that were registered is by calling “mlir-opt --help”. However, for compilers with 500+ passes, the help message becomes too long and sometimes hard to understand. In this PR I add a new "--list-passes" option to mlir-opt, which can be used for printing only the registered passes, a feature that would be extremely useful.	2024-08-06 20:00:51 +02:00
Yingwei Zheng	07b29fc808	[ConstantRange] Improve `shlWithNoWrap` (#101800) Closes https://github.com/dtcxzyw/llvm-tools/issues/22.	2024-08-07 02:00:33 +08:00
Mark de Wever	4dee6411e0	[libc++] Implements LWG3130. (#101889) This adds addressof at the required places in [input.output]. Some of the new tests failed since string used operator& internally. These have been fixed too. Note the new fstream tests perform output to a basic_string instead of a double. Using a double requires the num_get specialization num_get<CharT, istreambuf_iterator<CharT, char_traits_operator_hijacker<CharT>>>. This facet is not present in the locale database so the conversion would fail due to a missing locale facet. Using basic_string avoids using the locale. As a drive-by, this fixes several bugs in the ofstream.cons tests. These tested ifstream instead of ofstream with an open mode. Implements: - LWG3130 [input.output] needs many addressof Closes #100246.	2024-08-06 19:47:56 +02:00
Sterling-Augustine	66f4e3f8db	[SandboxIR] Implement missing PHINode functions (#101734) replaceIncomingBlockWith and removeIncomingValueIf are both straightforward and done. I'll defer copyIncomingBlocks until a couple of other changes that also handle blocks go in.	2024-08-06 17:45:30 +00:00
Mark de Wever	642259a2f2	[libc++][chrono][test] Fixes bogus loops. (#101890) Changes the loop range to match similar tests and avoids zero iterations. The original motivation to reduce the number of iterations was to allow the test to be executed during constant evaluation. Fixes: https://github.com/llvm/llvm-project/issues/100502	2024-08-06 19:38:46 +02:00
gonzalobg	f55abd545d	[NVPTX] Add Volta Atomic SequentiallyConsistent Load and Store Operations (#98551) This PR builds on #98022. It adds support for Volta's SequentiallyConsistent Load and Store operations at system scope.	2024-08-06 10:32:51 -07:00
Alexis Engelke	85bf0a6b44	[CodeGen] Fix PreISelLowering not reporting changes (#102184) expandVectorPredication may change code, even if the intrinsic itself remains in the code. Report changes whenever such an intrinsic is encountered, because code could have been changed. Another follow-up fix for #101652 to fix expensive-checks-only failure.	2024-08-06 19:30:42 +02:00
Shilei Tian	31a999c1ad	[Clang][Doc] Fix an error in `OpenMPSupport.rst`	2024-08-06 13:28:35 -04:00
aaryanshukla	0395bf7636	[libc][math][c23] Add ffma{,l,f128} and fdiv{,l,f128} C23 math functions #101089 (#101253) - added all variations of ffma and fdiv - will add all new headers into yaml for next patch - only fsub is left then all basic operations for float is complete --------- Co-authored-by: OverMighty <its.overmighty@gmail.com>	2024-08-06 10:19:54 -07:00
Matt Arsenault	3e3ea54aad	AMDGPU: Add some leaf intrinsics to isAlwaysUniform (#101925) These would always be uniform anyway, but it shouldn't hurt to mark them as always uniform. This will help use TTI::isAlwaysUniform in place of proper uniformity analysis in trivial situations.	2024-08-06 21:09:04 +04:00
Steven Wu	c826c07481	[Test] Update clang/test/Modules/crash-vfs-include-pch.m (#102080) Avoid the driver error for mis-using a clang cc1 flag as driver flag in the crash test.	2024-08-06 10:06:48 -07:00
Yeoul Na	1119a08050	[BoundsSafety][NFC] Remove the unused parameter 'Decls' from 'Sema::CheckCountedByAttrOnField' (#102076) llvm::SmallVectorImpl<TypeCoupledDeclRefInfo> &Decls is a vector of declarations referred to by the argument of 'counted_by' attributes and fields. 'BuildCountAttributedArrayOrPointerType' had been made self-contained to produce the 'Decls' within itself to allow 'TreeTransform' to invoke the function without having to call 'Sema::CheckCountedByAttrOnField' again. Thus, 'Decls' produced by `Sema::CheckCountedByAttrOnField` is never used.	2024-08-06 10:04:25 -07:00
Shilei Tian	0c2ded6706	[NFC] Fix compile warning introduced in #99732	2024-08-06 12:45:29 -04:00
Krystian Stasiowski	55ea36002b	[Clang][Sema] Make UnresolvedLookupExprs in class scope explicit specializations instantiation dependent (#100392) A class member named by an expression in a member function that may instantiate to a static _or_ non-static member is represented by a `UnresolvedLookupExpr` in order to defer the implicit transformation to a class member access expression until instantiation. Since `ASTContext::getDecltypeType` only creates a `DecltypeType` that has a `DependentDecltypeType` as its canonical type when the operand is instantiation dependent, and since we do not transform types unless they are instantiation dependent, we need to mark the `UnresolvedLookupExpr` as instantiation dependent in order to correctly build a `DecltypeType` using the expression as its operand with a `DependentDecltypeType` canonical type. Fixes #99873.	2024-08-06 12:40:44 -04:00
Alexis Engelke	b7cd564fa3	[IR] Don't verify module flags on every access (#102153) `8b4306ce05` introduced validity checks for every module flag access, because the auto-upgrader uses named metadata before verifying the module. This causes overhead for all other accesses, and the check is, in fact, only needed at that single place. Change the upgrader to be careful when accessing module flags before the module is verified and remove the checks on all other occasions. There are two tangential optimizations included: first, when querying a specific flag, don't enumerate all other flags into a vector as well. Second, don't use a Twine for getNamedMetadata(), which has materialization overhead -- all call sites use simple strings that can be implicitly converted to a StringRef.	2024-08-06 18:33:26 +02:00
Alexis Engelke	1d2b6d9d4d	[Support] Use block numbers for DomTree construction (#101706) Similar to #101705, do the same optimization for dominator tree construction.	2024-08-06 18:23:01 +02:00
Jacek Caban	f949b03661	[llvm-readobj][NFC] Don't use startLine in a middle of a line in ObjDumper. (#102071)	2024-08-06 18:04:23 +02:00
Joseph Huber	8c6a6f1a70	[libc] Make RPC malloc implementation return 'nullptr' on alloc failure Summary: `malloc` is supposed to return `nullptr` if it fails, not exit with an error code.	2024-08-06 11:03:40 -05:00
Joshua Baehring	2336ef96b3	[scudo] Refactor store() and retrieve(). (#102024) store() and retrieve() have been refactored so that the scudo headers are abstracted away from cache operations.	2024-08-06 08:52:19 -07:00
Sirui Mu	2f28378317	[libc] Fix builds on Windows (#102162) This PR changes several places in the CMake scripts to make libc build on Windows. It adds the `errno` entrypoint to the Windows target. A mistake in the overlay build doc is also fixed. Tests still cannot be built on Windows because of the lack of osutils.	2024-08-06 08:38:05 -07:00
Krystian Stasiowski	b9183d0d0e	[Clang][Sema] Ensure that the selected candidate for a member function explicit specialization is more constrained than all others (#101721) The selection of the most constrained candidate for member function explicit specializations introduced in #88963 does not check whether the selected candidate is more constrained than all other candidates, which can result in ambiguities being undiagnosed. This patch addresses the issue.	2024-08-06 11:33:50 -04:00
Mike Rice	6250313291	[clang] Fix compile-time regression from attribute arg checking change (#101768) In `2acf77f987` code was added to use the 'full' name including syntax and scope. Instead of building up a large string for each name, add syntax and scope checks to the value expression in tablegen. There is already code to generate expressions for target specific attributes. This change refactors and adds to that code to include syntax and scope checks. The tablegen avoids generating the complicated expression unless there are two attributes using the same name, otherwise the case values will be as simple as before. Removes the currently unused attributeHasStrictIdentifierArgAtIndex function and the related tablegen.	2024-08-06 08:28:56 -07:00
Daniil Kovalev	15d4a84e79	[PAC][ELF][AArch64] Encode signed GOT flag in PAuth core info (#96159) Treat 8th bit of version value for llvm_linux platform as signed GOT flag. - clang: define `PointerAuthELFGOT` LangOption and set 8th bit of `aarch64-elf-pauthabi-version` LLVM module flag correspondingly; - llvm-readobj: print `PointerAuthELFGOT` or `!PointerAuthELFGOT` in version description of llvm_linux platform depending on whether the flag is set.	2024-08-06 18:24:01 +03:00
Slava Zakharin	9684c87d14	[flang][runtime] Fixed performance regression in CopyElement. (#102081) Polyhedron/capacita,protein and CPU2000/facerec,wupwise showed up to 60% regression on x86 after #101421. The memcpy loops of the toAt and fromAt arrays that are run to create the initial work item end up being encoded as 'rep mov', and they add noticeable overhead compared to the total amount of work. 'rep mov' is not the best choice for small-size memcpy (e.g. when the array rank is 1 or 2, it would be quite slow). Moreover, the rest of the stack-related setup is also noticeable for the simple cases. I added a shortcut for the simple copy case, and also got rid of the initial toAt/fromAt copies by allowing the CopyDescriptor to use the external subscript storages.	2024-08-06 08:23:21 -07:00
Kazu Hirata	b809671a41	[Serialization] Fix a warning This patch fixes: clang/lib/Serialization/ASTReader.cpp:11426:13: error: unused variable '_' [-Werror,-Wunused-variable]	2024-08-06 08:22:00 -07:00
Siu Chi Chan	048f350377	Move HIP fatbin sections farther away from .text This would avoid wasting relocation range to jump over the HIP fatbin sections and therefore alleviate relocation overflow pressure.	2024-08-06 15:17:59 +00:00
Mikhail R. Gadelha	92a01683cb	[libc] Enable more entrypoints for riscv (#102055) This patch enables more entrypoints for riscv. The changes to the test cases are introduced to support rv32 which has long double but doesn't have int128	2024-08-06 12:16:40 -03:00
Sharadh Rajaraman	bd576fe342	[clang][driver][clang-cl] Support `--precompile` and `-fmodule-*` options in Clang-CL (#98761) This PR is the first step in improving the situation for `clang-cl` detailed in [this LLVM Discourse thread](https://discourse.llvm.org/t/clang-cl-exe-support-for-c-modules/72257/28). There has been some work done in #89772. I believe this is somewhat orthogonal. This is a work-in-progress; the functionality has only been tested with the [basic 'Hello World' example](https://clang.llvm.org/docs/StandardCPlusPlusModules.html#quick-start), and proper test cases need to be written. I'd like some thoughts on this, thanks! Partially resolves #64118.	2024-08-06 23:05:55 +08:00
Alexey Bataev	3c3ea7e751	[SLP]Better sorting of cmp instructions by comparing type sizes. Currently the SLP vectorizer compares cmp instructions by the type id of the compared operands, which may fail for different integer types, for example, which have the same type id but different sizes. The patch adds comparison by type sizes to fix this. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/102132	2024-08-06 11:03:36 -04:00
Shilei Tian	cee594cf36	[Clang][Sema][OpenMP] Allow `num_teams` to accept multiple expressions (#99732) By the OpenMP standard, the `num_teams` clause can only accept one expression (for now). In this patch, we extend it to accept multiple expressions when it is used with the `target teams ompx_bare` construct. This allows launching a multi-dim grid, same as CUDA/HIP.	2024-08-06 10:55:15 -04:00
Andrey Timonin	f0178d881c	[NFC][stlextras] Delete repetition of are (#101977)	2024-08-06 10:28:21 -04:00
Alexis Engelke	a4837fe3c1	[CodeGen] Allow PreISel lowering to run without TM (#102150) Fixes #101652 after build bot failures where TM in the opt pass builder is nullptr.	2024-08-06 16:21:56 +02:00
Timm Baeder	e958456840	[clang][Interp] Ignore ObjCBoxedExpr subexpr... (#102136) ... if it can't be expressed as a constant initializer.	2024-08-06 16:19:56 +02:00
LLVM GN Syncbot	de5081c15a	[gn build] Port `9fb196b469`	2024-08-06 14:10:33 +00:00
Yeting Kuo	9fb196b469	[RISCV] Insert simple landing pad for taken address labels. (#91855) This patch implements simple landing pad labels ([pr]). When Zicfilp is enabled, this patch inserts `lpad 0` at the beginning of basic blocks that can be targeted by indirect jumps. This patch also supports the option riscv-landing-pad-label to let users set nonzero fixed labels. Using a nonzero fixed label forces setting t2 before indirect jumps. It's less portable but stricter than the original implementation. [pr]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/417	2024-08-06 22:04:48 +08:00
Yingwei Zheng	6def5170e8	[InstCombine] Fold `(X & Mask) == 0 ? TC : FC -> TC binop (X & Mask)` (#100437) Alive2: https://alive2.llvm.org/ce/z/d9wV7N	2024-08-06 22:04:24 +08:00
Nuno Lopes	2499978aae	Convert a couple of undef placeholders to poison [NFC]	2024-08-06 15:03:16 +01:00
Matt Arsenault	4f067dc467	TTI: Fix special casing vectorization costs of saturating add/sub (#97463)	2024-08-06 17:33:52 +04:00
Alexey Bataev	daf4a06e5c	[SLP]Try detect strided loads, if any pointer op require extraction. If any pointer operand of the non-consecutive loads is an instruction with a user that is not part of the current graph, and thus requires emission of an extractelement instruction, it is better to try to detect whether the load sequence can be represented as a strided load, so that extractelement instructions for the pointers are not required. Reviewers: preames, RKSimon, topperc Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/101668	2024-08-06 09:20:50 -04:00
Steven Wu	f9b69a378c	Revert "[CMake] Fold export_executable_symbols_* into function args. (#101741)" This reverts commit `5c56b46a32`. This breaks the lld build when using GENERATE_DRIVER.	2024-08-06 06:08:16 -07:00
Vladislav Dzhidzhoev	a0fa9a308d	[LLDB][test] Update Makefile.rules to support Windows host+Linux target (#99266) These changes aim to support cross-compilation builds on a Windows host for a Linux target for API tests execution. They're not final: changes will follow for refactoring and adjustments to make all tests pass. Chocolatey make is recommended since it is maintained better than the GnuWin32 make mentioned at https://lldb.llvm.org/resources/build.html#windows (the latest GnuWin32 release dates from 2010) and helps to avoid problems with building tests (for example, GnuWin32 make doesn't support long paths and there are some other failures with building for Linux with it). Co-authored-by: Pavel Labath <pavel@labath.sk>	2024-08-06 15:07:12 +02:00
Aaron Ballman	295e4f49ae	Correct a comment and update a return type; NFC These changes were inspired by a post-commit review comment: https://github.com/llvm/llvm-project/pull/97274#pullrequestreview-2220175564	2024-08-06 08:55:30 -04:00
Alexey Bataev	df0f31315e	[SLP][NFC]Update test checks.	2024-08-06 05:55:03 -07:00
OverMighty	936515c7a5	[libc][math][c23] Add exp2f16 C23 math function (#101217) Part of #95250.	2024-08-06 14:44:01 +02:00
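
The fold named in the title of commit 6def5170e8 (`(X & Mask) == 0 ? TC : FC -> TC binop (X & Mask)`) can be illustrated with a small standalone check. This is only a sketch of one reading of the pattern, not the InstCombine implementation: it assumes the binop is bitwise OR, that `Mask` is a single bit, and that `FC` equals `TC | Mask`. The function names `selectForm`, `foldedForm`, and `formsAgree` are hypothetical and do not exist in LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; none of these are LLVM APIs.

// Select form: (X & Mask) == 0 ? TC : FC, under the assumption FC == TC | Mask.
constexpr std::uint32_t selectForm(std::uint32_t x, std::uint32_t tc,
                                   std::uint32_t mask) {
  return (x & mask) == 0 ? tc : (tc | mask);
}

// Folded form: TC binop (X & Mask), with binop assumed to be bitwise OR.
constexpr std::uint32_t foldedForm(std::uint32_t x, std::uint32_t tc,
                                   std::uint32_t mask) {
  return tc | (x & mask);
}

// Compares both forms for all 8-bit X and TC and every single-bit mask
// within the low 8 bits. Returns true when they agree everywhere.
bool formsAgree() {
  for (std::uint32_t bit = 0; bit < 8; ++bit) {
    const std::uint32_t mask = 1u << bit;
    for (std::uint32_t x = 0; x < 256; ++x) {
      for (std::uint32_t tc = 0; tc < 256; ++tc) {
        if (selectForm(x, tc, mask) != foldedForm(x, tc, mask)) {
          return false;
        }
      }
    }
  }
  return true;
}

// Spot check at compile time: X = 0b1010, TC = 0b0001, Mask = 0b1000.
static_assert(selectForm(10u, 1u, 8u) == foldedForm(10u, 1u, 8u),
              "select form and folded form should match");
```

The single-bit assumption matters: when `Mask` has one bit set, `X & Mask` is either `0` or `Mask`, so the folded expression yields `TC` or `TC | Mask` exactly as the select does. The conditions the actual patch checks are described in the pull request and its Alive2 proof, not here.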

507580 Commits