intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-02-06 23:31:50 +08:00

Author	SHA1	Message	Date
Joshua Cranmer	bcad161db3	[Clang][SPIR-V] Emit target extension types for OpenCL types on SPIR-V. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D141008	2023-03-13 14:20:24 -04:00
Yaxun (Sam) Liu	37114036aa	[AMDGPU] Mark mbcnt as convergent since it depends on CFG. Otherwise some passes will try to merge them and cause incorrect results. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D145072	2023-03-02 11:56:32 -05:00
Nikita Popov	3d84f4268d	[Clang] Convert some tests to opaque pointers (NFC)	2023-02-17 09:49:03 +01:00
Diana Picus	819dfc338b	[AMDGPU] Autogenerate checks for several tests. NFCI	2023-02-16 10:54:34 +01:00
Matt Arsenault	647925648a	clang/OpenCL: Apply default attributes to enqueued blocks This was missing important environment context, like denormal-fp-math and target-features. Curiously this seems to be losing nounwind. Note this only fixes the actual invoke kernel. The invoke function is already setting the default attribute set for internal functions. However that is still buggy since it's not applying any use function attributes (it's also missing uniform-work-group-size). There seem to be too many different functions for setting attributes with inconsistent behavior. The Function overload of addDefaultFunctionAttributes seems to miss the target-cpu and target-features. The AttrBuilder one seems to miss optnone (but that seems to be disallowed on blocks anyway). Neither one calls setTargetAttributes, when it probably should. uniform-work-group-size is also set through AMDGPU code when it should be emitting generically as a language property. I also noticed update_cc_test_checks for attributes seem to not connect the captured attribute variables to the attributes at the end (although I think the numbers happen to work out correctly).	2023-01-30 15:03:15 -04:00
Matt Arsenault	d12ee4bf7c	clang/OpenCL: Extend tests for enqueued block attributes Baseline tests showing that enqueued blocks are not getting the correct attributes applied.	2023-01-30 15:03:15 -04:00
Matt Arsenault	00f6a7f02f	clang/OpenCL: Fix not setting convergent on block invoke kernels Yet another example how convergent not being the default is dangerous and backwards.	2023-01-30 15:03:14 -04:00
Stanislav Mekhanoshin	df0488369d	[AMDGPU] Split dot7 feature Differential Revision: https://reviews.llvm.org/D142507	2023-01-26 10:34:36 -08:00
Nikita Popov	eaea793d5e	[Clang] Convert some tests to opaque pointers (NFC) These are all tests that end up running SROA.	2023-01-26 11:33:19 +01:00
Stanislav Mekhanoshin	870b92977e	[AMDGPU] Split dot8 feature Differential Revision: https://reviews.llvm.org/D142407	2023-01-24 11:16:07 -08:00
Stanislav Mekhanoshin	4ab2246d48	[AMDGPU] Remove dot1 and dot6 features from clang for gfx11 These are unsupported. Differential Revision: https://reviews.llvm.org/D142493	2023-01-24 10:52:42 -08:00
Sven van Haastregt	1495210914	[OpenCL] Always add nounwind attribute for OpenCL Neither OpenCL nor C++ for OpenCL support exceptions, so add the `nounwind` attribute unconditionally for those languages. Differential Revision: https://reviews.llvm.org/D142033	2023-01-20 12:01:22 +00:00
Matt Arsenault	7f2f6eec3e	clang/OpenCL: Check calling convention in test update_cc_test_checks misses this, so make sure at least one block enqueue test manually checks the calling convention for the kernel.	2023-01-12 13:39:23 -05:00
Nikita Popov	02856565ac	[Clang] Emit noundef metadata next to range metadata To preserve the previous semantics after D141386, adjust places that currently emit !range metadata to also emit !noundef metadata. This retains range violation as immediate undefined behavior, rather than just poison. Differential Revision: https://reviews.llvm.org/D141494	2023-01-12 10:03:05 +01:00
Paul Walker	eae26b6640	[IRBuilder] Use canonical i64 type for insertelement index used by vector splats. Instcombine prefers this canonical form (see getPreferredVectorIndex), as does IRBuilder when passing the index as an integer so we may as well use the preferred form from creation. NOTE: All test changes are mechanical with nothing else expected beyond a change of index type from i32 to i64. Differential Revision: https://reviews.llvm.org/D140983	2023-01-11 14:08:06 +00:00
Matt Arsenault	f9559b1e30	clang: Convert test to generated checks and opaque pointers	2023-01-10 20:35:49 -05:00
Matt Arsenault	81849497b4	clang/AMDGPU: Remove flat-address-space from feature map This was only used for checking if is_shared/is_private were legal, which we're not bothering to do anymore. This is apparently visible to more than the target attribute (which seems to silently ignore unrecognized features), so this has the potential to break something (i.e. see the OpenMP test change)	2023-01-05 16:35:04 -05:00
Nikita Popov	aae20a7421	[CodeGenOpenCL] Convert some tests to opaque pointers (NFC)	2023-01-05 10:57:30 +01:00
Matt Arsenault	e630d9b299	AMDGPU/clang: Remove target features from address space test builtins It turns out we can codegen these on targets without flat addressing, although the runtime probably didn't put anything useful there. The proper diagnostic would be to disallow flat pointer uses or languages with them, not this one edge case. Allows removing one of the special cases requiring subtarget support in the device libraries.	2022-12-29 18:46:41 -05:00
Matt Arsenault	f4bcd7f598	AMDGPU/clang: Add builtins for llvm.amdgcn.ballot Use explicit _w32/_w64 suffixes for the wave size to be consistent with the existing other wave dependent intrinsics. Also start diagnosing trying to use both wave32 and wave64. I would have preferred to avoid the +wavefrontsize64 spam on targets where that's the only option, but avoiding this seems to be more work than I expected.	2022-12-29 17:58:55 -05:00
Roman Lebedev	96d3c82645	Revert "[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 3)" While the PPC little-endian miscompile did get addressed by https://reviews.llvm.org/D140046 the PPC big-endian bots are still unhappy. https://lab.llvm.org/buildbot/#/builders/93/builds/12560 This reverts commit 7bd358bcb4e358b4351c69e02ef76939e08acdc7.	2022-12-16 22:58:41 +03:00
Roman Lebedev	cfd594f8bb	[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 3) * This is a recommit of `3c4d2a0396`, * which was reverted in `25f01d593c`, because it exposed a miscompile in PPC backend, which was resolved in https://reviews.llvm.org/D140089 / `cb3f415cd2`. * which was a recommit of `cf624b23bc`, * which was reverted in `5cfc22cafe`, because the cut-off on the number of vector elements was not low enough, and it triggered both SDAG SDNode operand number assertions, and caused compile time explosions in some cases. Let's try with something really REALLY conservative first, just to get somewhere, and try to bump it later. FIXME: should this respect TTI reg width * num vec regs? Original commit message: Now, there's a big caveat here - these bytes are abstract bytes, not the i8 we have in LLVM, so strictly speaking this is not exactly legal, see e.g. https://github.com/AliveToolkit/alive2/issues/860 ^ the "bytes" "could" have been a pointer, and loading it as an integer inserts an implicit ptrtoint. But at the same time, InstCombine's `InstCombinerImpl::SimplifyAnyMemTransfer()` would expand a memtransfer of 1/2/4/8 bytes into integer-typed load+store, so this isn't exactly a new problem. Note that in memory, poison is byte-wise, so we really can't widen elements, but SROA seems to be inconsistent here. Fixes #59116.	2022-12-16 19:27:38 +03:00
Nikita Popov	9466b49171	[Clang] Convert various tests to opaque pointers (NFC) These were all tests where no manual fixup was required.	2022-12-12 17:11:46 +01:00
Johannes Doerfert	f6e3a89cc0	[AMDGPU] Annotate the intrinsics to be default and nocallback Differential Revision: https://reviews.llvm.org/D135155	2022-12-07 14:25:25 -08:00
Manuel Brito	481170cb55	[Clang][CodeGen] Use poison instead of undef for extra argument in __builtin_amdgcn_mov_dpp [NFC] Differential Revision: https://reviews.llvm.org/D138755	2022-12-06 12:40:33 +00:00
Ron Lieberman	ca856fff1c	Revert "enable code-object-version=5" very sorry wrong repo. This reverts commit `d882ba7aea`.	2022-11-29 15:21:09 -06:00
Ron Lieberman	d882ba7aea	enable code-object-version=5	2022-11-29 15:11:57 -06:00
Sven van Haastregt	b0e4897a1b	[OpenCL] Remove arm-integer-dot-product extension pragmas This extension only adds builtin functions and thus doesn't need to be included as an extension. Instead of a pragma, the builtin functions of the extension can be exposed through enabling preprocessor defines.	2022-11-29 13:26:50 +00:00
David Stuttard	7940888c59	[AMDGPU] Intrinsic to expose s_wait_event for export ready Differential Revision: https://reviews.llvm.org/D138216	2022-11-28 11:26:15 +00:00
Roman Lebedev	25f01d593c	Revert "[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 2)" TableGen is still getting miscompiled on PPC buildbots. Sent a mail with request for help. This reverts commit `3c4d2a0396`.	2022-11-27 00:00:06 +03:00
Roman Lebedev	3c4d2a0396	[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes (take 2) This is a recommit of `cf624b23bc`, which was reverted in `5cfc22cafe`, because the cut-off on the number of vector elements was not low enough, and it triggered both SDAG SDNode operand number assertions, and caused compile time explosions in some cases. Let's try with something really REALLY conservative first, just to get somewhere, and try to bump it (to 64/128) later. FIXME: should this respect TTI reg width * num vec regs? Original commit message: Now, there's a big caveat here - these bytes are abstract bytes, not the i8 we have in LLVM, so strictly speaking this is not exactly legal, see e.g. https://github.com/AliveToolkit/alive2/issues/860 ^ the "bytes" "could" have been a pointer, and loading it as an integer inserts an implicit ptrtoint. But at the same time, InstCombine's `InstCombinerImpl::SimplifyAnyMemTransfer()` would expand a memtransfer of 1/2/4/8 bytes into integer-typed load+store, so this isn't exactly a new problem. Note that in memory, poison is byte-wise, so we really can't widen elements, but SROA seems to be inconsistent here. Fixes #59116.	2022-11-26 23:19:15 +03:00
Alex Richardson	54ad4d2dd1	Drop redundant pipe to opt -instnamer in clang tests This used to be required, but the difference between asserts/!asserts builds no longer exists for %clang_cc1 (only for %clang), so they pass just fine without this flag.	2022-11-25 11:34:55 +00:00
Benjamin Kramer	5cfc22cafe	Revert "[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes" This reverts commit `cf624b23bc`. It triggers crashes in clang, see the comments on github on the original change.	2022-11-23 13:11:16 +01:00
Roman Lebedev	cf624b23bc	[SROA] `isVectorPromotionViable()`: memory intrinsics operate on vectors of bytes Now, there's a big caveat here - these bytes are abstract bytes, not the i8 we have in LLVM, so strictly speaking this is not exactly legal, see e.g. https://github.com/AliveToolkit/alive2/issues/860 ^ the "bytes" "could" have been a pointer, and loading it as an integer inserts an implicit ptrtoint. But at the same time, InstCombine's `InstCombinerImpl::SimplifyAnyMemTransfer()` would expand a memtransfer of 1/2/4/8 bytes into integer-typed load+store, so this isn't exactly a new problem. Note that in memory, poison is byte-wise, so we really can't widen elements, but SROA seems to be inconsistent here. Fixes #59116.	2022-11-23 02:38:25 +03:00
Arthur Eubanks	e564f5153f	[clang][test] Avoid UB in overload.cl	2022-11-13 14:02:24 -08:00
Zahira Ammarguellat	91628f0616	The handling of 'funsafe-math-optimizations' doesn't update the 'MathErrno' flag. But the driver checks for 'fno-math-errno' before passing 'funsafe-math-optimizations' to the FE. In GCC, the option 'funsafe-math-optimizations' doesn't affect the 'fmath-errno' flag. This patch aligns clang with GCC. '-ffast-math' sets the FPContract to 'fast'. But for 'funsafe-math-optimizations' the driver doesn't consider the FPContract when handling the option. Unfortunately there are places in the BE that interpret unsafe math mode as allowing FMA. This patch makes '-ffast-math' and 'funsafe-math-optimizations' behave similarly in regard to the setting of the FPContract. Differential Revision: https://reviews.llvm.org/D137578	2022-11-11 10:24:12 -05:00
Bjorn Pettersson	5f9a82683d	[clang][test] Use opt -passes=<name> instead of opt -name Updated the RUN line in several test cases to use the new PM syntax opt -passes=<pipeline> instead of the deprecated syntax opt -pass1 -pass2 This was not a complete cleanup in clang/test. But just a swipe using some simple search-and-replace. Mainly for RUN lines involving -mem2reg, -instnamer and -early-cse.	2022-11-08 12:15:42 +01:00
Nikita Popov	304f1d59ca	[IR] Switch everything to use memory attribute This switches everything to use the memory attribute proposed in https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly attributes are dropped. The readnone, readonly and writeonly attributes are restricted to parameters only. The old attributes are auto-upgraded both in bitcode and IR. The bitcode upgrade is a policy requirement that has to be retained indefinitely. The IR upgrade is mainly there so it's not necessary to update all tests using memory attributes in this patch, which is already large enough. We could drop that part after migrating tests, or retain it longer term, to make it easier to import IR from older LLVM versions. High-level Function/CallBase APIs like doesNotAccessMemory() or setDoesNotAccessMemory() are mapped transparently to the memory attribute. Code that directly manipulates attributes (e.g. via AttributeList) on the other hand needs to switch to working with the memory attribute instead. Differential Revision: https://reviews.llvm.org/D135780	2022-11-04 10:21:38 +01:00
Matt Arsenault	f59f116bd5	AMDGPU: Add __builtin_amdgcn_permlane64	2022-10-13 21:12:11 -07:00
Petar Avramovic	dcc756d03e	[AMDGPU] Pattern for flat atomic fadd f64 intrinsic with local addr Fix regression from clang opencl test in builtins-fp-atomics-gfx90a.cl test_flat_add_local_f64 caused by D130579 Revert `a3becb333d`. Differential Revision: https://reviews.llvm.org/D134568	2022-09-25 13:25:41 +02:00
Petar Avramovic	a3becb333d	[clang][AMDGPU] Temporarily disable clang atomic fadd test for gfx90a Test is broken by D130579. Temporarily disable to silence buildbot failures.	2022-09-23 21:49:16 +02:00
Stanislav Mekhanoshin	e540965915	[AMDGPU] Added __builtin_amdgcn_ds_bvh_stack_rtn Differential Revision: https://reviews.llvm.org/D133966	2022-09-16 02:42:09 -07:00
Fangrui Song	74742147ee	[test] Change cc1 -fvisibility to -fvisibility=	2022-09-02 12:36:44 -07:00
Muhammad Omair Javaid	18de7c6a3b	Revert "[InstCombine] Treat passing undef to noundef params as UB" This reverts commit `c911befaec`. It has broken LLDB Arm/AArch64 Linux buildbots. I don't really understand the underlying reason. Reverting for now to make the buildbots green. https://reviews.llvm.org/D133036	2022-09-02 16:09:50 +05:00
Arthur Eubanks	c911befaec	[InstCombine] Treat passing undef to noundef params as UB Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D133036	2022-09-01 15:16:45 -07:00
Anastasia Stulova	6b1a04529c	[OpenCL][SPIR-V] Test extern functions with a pointer arg. Added a test case that enhances coverage of opaque pointers particularly for the problematic case with extern functions for which there is no solution found for type recovery. Differential Revision: https://reviews.llvm.org/D130768	2022-09-01 10:22:47 +01:00
Yaxun (Sam) Liu	9f6cb3e9fd	[AMDGPU] Add builtin s_sendmsg_rtn Reviewed by: Brian Sumner, Artem Belevich Differential Revision: https://reviews.llvm.org/D132140 Fixes: SWDEV-352017	2022-08-22 18:29:23 -04:00
Austin Kerbow	b0f4678b90	[AMDGPU] Add iglp_opt builtin and MFMA GEMM Opt strategy Adds a builtin that serves as an optimization hint to apply specific optimized DAG mutations during scheduling. This also disables any other mutations or clustering that may interfere with the desired pipeline. The first optimization strategy that is added here is designed to improve the performance of small gemm kernels on gfx90a. Reviewed By: jrbyrnes Differential Revision: https://reviews.llvm.org/D132079	2022-08-19 15:38:36 -07:00
Austin Kerbow	f5b21680d1	[AMDGPU] Add amdgcn_sched_group_barrier builtin This builtin allows the creation of custom scheduling pipelines on a per-region basis. Like the sched_barrier builtin this is intended to be used either for testing, in situations where the default scheduler heuristics cannot be improved, or in critical kernels where users are trying to get performance that is close to handwritten assembly. Obviously using these builtins will require extra work from the kernel writer to maintain the desired behavior. The builtin can be used to create groups of instructions called "scheduling groups" where ordering between the groups is enforced by the scheduler. __builtin_amdgcn_sched_group_barrier takes three parameters. The first parameter is a mask that determines the types of instructions that you would like to synchronize around and add to a scheduling group. These instructions will be selected from the bottom up starting from the sched_group_barrier's location during instruction scheduling. The second parameter is the number of matching instructions that will be associated with this sched_group_barrier. The third parameter is an identifier which is used to describe what other sched_group_barriers should be synchronized with. Note that multiple sched_group_barriers must be added in order for them to be useful since they only synchronize with other sched_group_barriers. Only "scheduling groups" with a matching third parameter will have any enforced ordering between them. As an example, the code below tries to create a pipeline of 1 VMEM_READ instruction followed by 1 VALU instruction followed by 5 MFMA instructions... 
// 1 VMEM_READ
__builtin_amdgcn_sched_group_barrier(32, 1, 0)
// 1 VALU
__builtin_amdgcn_sched_group_barrier(2, 1, 0)
// 5 MFMA
__builtin_amdgcn_sched_group_barrier(8, 5, 0)
// 1 VMEM_READ
__builtin_amdgcn_sched_group_barrier(32, 1, 0)
// 3 VALU
__builtin_amdgcn_sched_group_barrier(2, 3, 0)
// 2 VMEM_WRITE
__builtin_amdgcn_sched_group_barrier(64, 2, 0)
Reviewed By: jrbyrnes Differential Revision: https://reviews.llvm.org/D128158	2022-07-28 10:43:14 -07:00
Aaron Ballman	7068aa9841	Strengthen -Wint-conversion to default to an error Clang has traditionally allowed C programs to implicitly convert integers to pointers and pointers to integers, despite it not being valid to do so except under special circumstances (like converting the integer 0, which is the null pointer constant, to a pointer). In C89, this would result in undefined behavior per 3.3.4, and in C99 this rule was strengthened to be a constraint violation instead. Constraint violations are most often handled as an error. This patch changes the warning to default to an error in all C modes (it is already an error in C++). This gives us better security posture by calling out potential programmer mistakes in code but still allows users who need this behavior to use -Wno-error=int-conversion to retain the warning behavior, or -Wno-int-conversion to silence the diagnostic entirely. Differential Revision: https://reviews.llvm.org/D129881	2022-07-22 15:24:54 -04:00

640 Commits