513451 Commits

Author SHA1 Message Date
Fraser Cormack
9f3728d157 [libclc] Fix installation w/ ENABLE_RUNTIME_SUBNORMAL (#109926)
The `ARCHIVE` artifact kind is not valid for `install(FILES ...)`.

Additionally, install wasn't resolving the target's `TARGET_FILE`
properly and was trying to find it in the top-level build directory, rather than
in the libclc binary directory. This is because our `TARGET_FILE`
properties were being set to relative paths. The CMake behaviour they
are trying to mimic - `$<TARGET_FILE:$tgt>` - provides an absolute path.

As such this patch updates instances where we set the `TARGET_FILE`
property to return an absolute path.
2024-09-30 10:48:30 +01:00
Matt Arsenault
81ba95cefe FastISel: Fix incorrectly using getPointerTy (#110465)
This was using the default address space instead of the
correct one.

Fixes #56055
2024-09-30 13:43:53 +04:00
Matt Arsenault
5883ad34d6 DAG: Handle vector legalization of minimumnum/maximumnum (#109779)
Follow the same patterns as the other min/max variants.
2024-09-30 13:43:35 +04:00
Abid Qadeer
d556e38fe8 [flang][debug] Support derived type components with box types. (#109424)
Our support for derived types uses `getTypeSizeAndAlignment` to
calculate the offset of the members. The `fir.box` was not supported in
that function. As a result, any member that required a descriptor was
not supported in the derived type.
    
We convert the type into an llvm type and then use the DataLayout to
calculate the size/offset of a member. There is no dependency on
`getTypeSizeAndAlignment` to get the size of the types.

There are 2 other changes in this PR:

1. The `recID` field is used to handle cases where a member
references its parent type.

2. A type cache is maintained to avoid duplication. It is also needed
for the circular reference case.


Fixes #108001.
2024-09-30 10:31:56 +01:00
Jay Foad
6f956e3117 [AMDGPU] Rename LocalMemorySize features to AddressableLocalMemorySize (#110242)
Change the names of the TableGen features to match the names used by
AMDGPUSubtarget. "Addressable" refers to the amount that can be accessed
by a single workgroup. Add some explanatory comments. NFC.
2024-09-30 10:29:31 +01:00
Abhishek Varma
b8c974f093 [MLIR][TilingInterface] Extend consumer fusion for multi-use of producer shared by terminator ops (#110105)
This commit extends consumer fusion to take place even if the
producer has multiple uses.
Besides the consumer op in question, the only other uses of the
producer that are allowed are in:
   1. scf.yield
   2. tensor.parallel_insert_slice

Signed-off-by: Abhishek Varma <abhvarma@amd.com>
2024-09-30 14:51:06 +05:30
Matt Arsenault
8e0daabe97 AMDGPU: Make a frame index test more realistic
We do not expect to see live carry-out outputs on these adds,
so add a dead flag. Split the test for the degenerate case. This
makes it more apparent that a regression in a future commit does not matter.
2024-09-30 13:02:56 +04:00
Petar Avramovic
83fe85115d AMDGPU: Fix inst-selection of large scratch offsets with sgpr base (#110256)
Use i32 for the offset instead of i16 so that it is not interpreted
as a negative 16-bit offset.
2024-09-30 10:44:59 +02:00
Viktoriia Bakalova
93eaa99289 [abi] [ItaniumMangle] Remove a test case that fails due to expected r… (#110467)
…edefinition failures.
2024-09-30 10:40:57 +02:00
Petar Avramovic
e9d12a6b45 AMDGPU: Add test for 16 bit unsigned scratch offsets (#110255)
A large scratch offset with the highest bit set was selected as negative,
because a negative offset has the same binary representation in 16 bits
as a large unsigned offset.
2024-09-30 10:39:17 +02:00
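The 16-bit ambiguity described in e9d12a6b45 and fixed in 83fe85115d can be illustrated with a minimal sketch. It is not the AMDGPU selection code: `asSigned16` and `asSigned32` are hypothetical helpers that only model how the same bits read as a signed 16-bit value versus a 32-bit value.

```cpp
#include <cstdint>

// Hypothetical helper: read the low 16 bits of an offset the way a signed
// 16-bit operand would be read (two's complement).
constexpr std::int32_t asSigned16(std::uint32_t offset) {
  const std::uint32_t low = offset & 0xFFFFu;
  return (low & 0x8000u) ? static_cast<std::int32_t>(low) - 0x10000
                         : static_cast<std::int32_t>(low);
}

// Hypothetical helper: keep the offset in a 32-bit type, so values up to
// 0x7FFFFFFF keep their unsigned meaning.
constexpr std::int32_t asSigned32(std::uint32_t offset) {
  return static_cast<std::int32_t>(offset);
}

// 0xFFF0 is both "65520 unsigned" and "-16 signed" in 16 bits.
static_assert(asSigned16(0xFFF0u) == -16, "16-bit view is negative");
static_assert(asSigned32(0xFFF0u) == 65520, "32-bit view stays positive");
```

Any offset with bit 15 set shows the same split, which is why widening the operand type removes the ambiguity.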
Caroline Concatto
f627c453db Fix test for PR#109680
The patch https://github.com/llvm/llvm-project/pull/109680 is failing
because of the test sme-callee-save-restore-pairs.ll.
This patch fixes the output of the test.
2024-09-30 08:27:22 +00:00
Nikita Popov
f445e39ab2 [SimplifyCFG] Use isWritableObject() API (#110127)
SimplifyCFG store speculation currently has some homegrown code to check
for a writable object, handling the alloca special case only.

Switch it to use the generic isWritableObject() API, which means that we
also support byval arguments, allocator return values, and writable
arguments.

I've adjusted isWritableObject() to also check for the noalias attribute
when handling writable. Otherwise, I don't think that we can generalize
from at-entry writability. This was not relevant for previous uses of
the function, because they'd already require noalias for other reasons
anyway.
2024-09-30 10:03:46 +02:00
Danial Klimkin
dd2792ac7d [bazel] Fix build past 6292f117c3 (#110459) 2024-09-30 09:46:41 +02:00
CarolineConcatto
a548eded70 [AArch64][SME]Check streaming mode when using SME2 instruction in fra… (#109680)
…me lowering

SME instructions can only be used in streaming mode. PTRUE for
predicated counter and the ld/st pair can be used when:
  sve2.1 is available, or
  sme2 is available in a function in streaming mode.
Previously, frame lowering only checked whether sme2 was available when
building the machine instruction.
This fix checks that sme2 is available and that the subtarget is in streaming mode.
2024-09-30 08:42:41 +01:00
Nikita Popov
f5c02dd06e [MemCpyOpt] Use EarliestEscapeInfo (#110280)
Pass EarliestEscapeInfo to BatchAA in MemCpyOpt. This allows memcpy
elimination in cases where one of the involved pointers is captured
after the relevant memcpy/call.
2024-09-30 09:35:54 +02:00
Mel Chen
f8373cb0f9 [LV] Reuse VPReplicateRecipe to handle scalar stores in exit block. (#106342)
This patch separates the computation of the final reduction result and
the intermediate stores of reduction.

---------

Co-authored-by: Florian Hahn <flo@fhahn.com>
2024-09-30 15:35:09 +08:00
Viktoriia Bakalova
147558e31c [clang][ItaniumMangle] Mangle friend function templates with a constr… (#110247)
…aint that depends on a template parameter from an enclosing template as
members of the enclosing class.

Such function templates should be considered member-like constrained
friends per [temp.friend]p9 and
https://github.com/itanium-cxx-abi/cxx-abi/issues/24#issuecomment-934977198.
2024-09-30 09:29:02 +02:00
Balázs Kéri
0d384fe978 [clang][analyzer] Move 'alpha.core.PointerSub' checker into 'security.PointerSub' (#107596) 2024-09-30 09:16:27 +02:00
Matt Arsenault
a87640c97e AMDGPU: Fix assertion on load of vector of pointers (#110436)
Fix InferAddressSpaces asserting on a load of a vector of flat
pointers.

Fixes #110433
2024-09-30 10:16:38 +04:00
Chuanqi Xu
af47038fb1 [clangd] [C++20] [Modules] Support code complete for C++20 modules (#110083)
As reported in https://github.com/ChuanqiXu9/clangd-for-modules/issues/9,
I was surprised to find that the C++20 modules support doesn't handle code
completion well.

After debugging, I found two problems:
(1) We forgot to call `adjustHeaderSearchOptions` in code completion. This
may be a simple oversight.
(2) In `CodeCompleteOptions::getClangCompleteOpts`, we may set
`LoadExternal` to false when an index is available. But we already support
modules with an index, so the two conflict. Given that modules are opt-in for
now, I think it makes sense to set `LoadExternal` to true when modules are
enabled.

This is a small fix and I hope it can land quickly.
2024-09-30 13:07:41 +08:00
Joshua Cao
0bc98349c8 [LICM] Use DomTreeUpdater version of SplitBlockPredecessors, nfc (#107190)
The DominatorTree version is marked for deprecation, so we use the
DomTreeUpdater version. We also update sinkRegion() to iterate over
basic blocks instead of DomTreeNodes. The loop body calls
SplitBlockPredecessors. The DTU version calls
DomTreeUpdater::apply_updates(), which may call DominatorTree::reset().
This invalidates the worklist of DomTreeNodes to iterate over.
2024-09-29 21:28:45 -07:00
bwlodarcz
6f3c15163f [SPIR-V] Fix of OpString separator in DI test (#110249)
Windows uses a different path separator than Unix-based OSes. One of
the tests in debug-compilation-unit.ll didn't have the Windows-supported '\\'
variant, which broke the test suite on that OS.
2024-09-29 21:08:36 -07:00
Helena Kotas
e20bf28987 [HLSL] Replace element_type* handles in HLSLExternalSemaSource with __hlsl_resource_t builtin type (#110079)
Replace `element_type*` handles in HLSLExternalSemaSource with
`__hlsl_resource_t` builtin type.

The handle used to be defined as `element_type*`, which was used by the
provisional subscript operator implementation. Now that the handle is
`__hlsl_resource_t`, the subscript placeholder implementation was updated
to add an `element_type* e;` field to the resource struct and return a
reference to that. This field is just a temporary workaround until
indexing is implemented properly in llvm/llvm-project#95956, at which
point the field will be removed. This seemed like a better solution than
disabling many of the existing tests that already use the `[]` operator.
One test has to be disabled nevertheless because of an error based on
interactions of const and template instantiation (a potential bug that can
be investigated once indexing is implemented the right way).

Fixes #84824
2024-09-29 20:41:54 -07:00
Lang Hames
6292f117c3 [ORC-RT] Rename sections_tracker.h to record_section_tracker.h.
This matches the type name defined in this header.
2024-09-30 13:20:02 +10:00
kadir çetinkaya
4ef77d61b2 [include-cleaner] Attach Header to AnalysisResults for missing headers (#110272)
Currently callers of analyze can't get detailed information about a
missing header, e.g. its resolved path. The only way to get at this is to
use the low-level walkUsed function, which is far more complicated than just
calling analyze.

This enables further analysis, e.g. when includes are spelled relative
to inner directories, the caller can still know their path relative to
the repository root.
2024-09-30 04:57:19 +02:00
Kazu Hirata
64f2bff12b [ReachingDefAnalysis] Turn MBBReachingDefsInfo into a proper class (NFC) (#110432)
I'm trying to speed up the reaching def analysis by changing the
underlying data structure.  Turning MBBReachingDefsInfo into a proper
class decouples the data structure and its users.  This patch does not
change the existing three-dimensional vector structure.

---------

Co-authored-by: Nikita Popov <github@npopov.com>
2024-09-29 19:37:53 -07:00
Yingwei Zheng
1efd1227b2 [InstCombine] Fold icmp eq/ne (X *nw Z), (Y *nw Z) -> icmp eq/ne Z, 0 when X != Y (#110413)
Alive2: https://alive2.llvm.org/ce/z/9oDP6K
I found this pattern in
04e75858d7/casadi/core/repmat.cpp (L70-L78).
2024-09-30 10:21:20 +08:00
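The fold in 1efd1227b2 can be sanity-checked by brute force on a tiny integer width. The sketch below is not the InstCombine implementation or the Alive2 proof: it assumes a hypothetical 4-bit signed type, models the no-wrap flag as "the product stays in range", and uses made-up helper names.

```cpp
#include <cstdint>

// Hypothetical 4-bit signed range, chosen so the exhaustive check is tiny.
constexpr int kMin = -8;
constexpr int kMax = 7;

// True when a * b fits the 4-bit signed range, i.e. the multiply would
// satisfy a no-signed-wrap flag in this model.
constexpr bool mulDoesNotWrap(int a, int b) {
  return a * b >= kMin && a * b <= kMax;
}

// For every X != Y where both multiplies do not wrap, checks that
// (X*Z == Y*Z) is equivalent to (Z == 0).
constexpr bool foldHoldsEverywhere() {
  for (int x = kMin; x <= kMax; ++x)
    for (int y = kMin; y <= kMax; ++y)
      for (int z = kMin; z <= kMax; ++z) {
        if (x == y || !mulDoesNotWrap(x, z) || !mulDoesNotWrap(y, z))
          continue;
        if ((x * z == y * z) != (z == 0))
          return false;
      }
  return true;
}

// Without the no-wrap guarantee the fold is unsound: in wrapping 8-bit
// arithmetic 0*2 and 128*2 both yield 0 although Z is non-zero.
constexpr bool wrappingCounterexample() {
  return static_cast<std::uint8_t>(0 * 2) ==
         static_cast<std::uint8_t>(128 * 2);
}

static_assert(foldHoldsEverywhere(), "fold holds under the no-wrap model");
static_assert(wrappingCounterexample(), "wrapping breaks the equivalence");
```

The second check shows why the pattern requires the no-wrap flags on both multiplies.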
Koakuma
dbad963a69 [SPARC] Align i128 to 16 bytes in SPARC datalayouts (#106951)
Align i128s to 16 bytes, following the example at
https://reviews.llvm.org/D86310.

clang already does this implicitly, but do it in backend code too for
the benefit of other frontends (see e.g
https://github.com/llvm/llvm-project/issues/102783 &
https://github.com/rust-lang/rust/issues/128950).
2024-09-30 08:32:33 +07:00
Longsheng Mou
129ade21bd [mlir][sparse] Replace getSparseTensorType with tryGetSparseTensorType (#109435)
This PR fixes a bug in `SparseTensorDimOpRewriter` when `tensor.dim` has
an unranked tensor type. To prevent crashes, we now use
`tryGetSparseTensorType` instead of `getSparseTensorType`. Fixes
#107807.
2024-09-30 09:16:55 +08:00
Fangrui Song
c490d349c5 [ELF] Pass Ctx & to Relocations 2024-09-29 16:15:32 -07:00
Fangrui Song
079b8327ec [ELF] Pass Ctx & to InputFiles and SyntheticSections 2024-09-29 16:06:47 -07:00
Fangrui Song
cc6c059dc1 [ELF] Pass Ctx & to Writer 2024-09-29 15:54:28 -07:00
Fangrui Song
17473182f5 [ELF] Pass Ctx & to MapFile 2024-09-29 15:39:52 -07:00
Fangrui Song
5c33424778 [ELF] Pass Ctx & to MarkLive 2024-09-29 15:32:16 -07:00
Fangrui Song
04e69ad727 [ELF] Pass Ctx & to Thunk 2024-09-29 15:20:01 -07:00
Fangrui Song
cf30e8e153 [ELF] Pass Ctx & to Thunk 2024-09-29 14:59:57 -07:00
Fangrui Song
877e49f2b8 [CSKY] Use MCRegister. NFC 2024-09-29 14:51:54 -07:00
Fangrui Song
bab5d5b6b0 [ELF] Pass Ctx & to ICF and SymbolTable 2024-09-29 14:45:00 -07:00
Craig Topper
2ab9233f4f [MSP430] Use MCRegister. NFC 2024-09-29 13:33:05 -07:00
Craig Topper
c38b5c81bb [LoongArch] Use MCRegister. NFC 2024-09-29 13:22:28 -07:00
Craig Topper
65c41da7d2 [Lanai] Use MCRegister. NFC 2024-09-29 13:22:18 -07:00
Craig Topper
886b2b258f [BPF] Use MCRegister. NFC 2024-09-29 13:22:11 -07:00
Florian Hahn
2c8836c899 [LV] Don't consider predicated insts as invariant unconditionally in CM.
Predicated instructions cannot be hoisted trivially, so don't treat them as
uniform values in the cost model.

This fixes a difference between legacy and VPlan-based cost model.

Fixes https://github.com/llvm/llvm-project/issues/110295.
2024-09-29 20:31:24 +01:00
Keith Smiley
51e0a997ca [test-release.sh] Fix sed encoding issues on macOS (#105989)
When using the BSD sed that ships with macOS on the object files for
comparison, every command would fail with

```
sed: RE error: illegal byte sequence
```

This was potentially fixed for an older version in
6c52b02e7d but even the commands in the
example there still have this error. You can repro this with any binary:

```
$ sed s/a/b/ /bin/ls >/dev/null
sed: RE error: illegal byte sequence
```

Setting LC_CTYPE no longer appears to solve the issue:

```
$ LC_CTYPE=C sed s/a/b/ /bin/ls >/dev/null
sed: RE error: illegal byte sequence
```

But this change with LC_ALL does:

```
$ LC_ALL=C sed s/a/b/ /bin/ls >/dev/null; echo $?
0
```

It seems like the difference here is that if you have LC_ALL set to
something else, LC_CTYPE does not override it. More info:
https://stackoverflow.com/a/23584470/902968
2024-09-29 12:21:08 -07:00
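The precedence noted in 51e0a997ca (an existing LC_ALL is not overridden by LC_CTYPE) follows the POSIX lookup order: LC_ALL first, then the category variable, then LANG. A minimal model of that order is sketched below. `effectiveCtype` is a hypothetical helper, not part of test-release.sh, and the "C" fallback is an assumption, since the real default locale is implementation-defined.

```cpp
#include <string_view>

// Hypothetical model of how the effective LC_CTYPE locale is chosen from
// the environment. Empty strings stand for "unset or empty".
constexpr std::string_view effectiveCtype(std::string_view lcAll,
                                          std::string_view lcCtype,
                                          std::string_view lang) {
  if (!lcAll.empty())
    return lcAll;    // LC_ALL wins over every category variable
  if (!lcCtype.empty())
    return lcCtype;  // consulted only when LC_ALL is unset
  if (!lang.empty())
    return lang;     // general fallback
  return "C";        // assumed default for this sketch
}

// With LC_ALL already set, LC_CTYPE=C has no effect.
static_assert(effectiveCtype("en_US.UTF-8", "C", "") == "en_US.UTF-8",
              "LC_ALL overrides LC_CTYPE");
// Setting LC_ALL=C forces the C locale regardless of the others.
static_assert(effectiveCtype("C", "en_US.UTF-8", "en_US.UTF-8") == "C",
              "LC_ALL=C wins");
```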
Timm Baeder
6f04e65c3c [clang][bytecode] Implement fixed-point shifts (#110429) 2024-09-29 20:56:17 +02:00
Kazu Hirata
76f2fa8163 [Transforms] Avoid repeated hash lookups (NFC) (#110400) 2024-09-29 08:55:03 -07:00
Kazu Hirata
19e5a529e8 [GlobalISel] Avoid repeated hash lookups (NFC) (#110399) 2024-09-29 08:54:42 -07:00
Kazu Hirata
a341820fef [LiveDebugValues] Simplify code with MapVector::insert_or_assign (NFC) (#110398)
Note that we must use insert_or_assign because operator[] would
require DbgValue to have the default constructor.
2024-09-29 08:54:06 -07:00
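The constraint mentioned in a341820fef can be shown with the standard library. The sketch below uses `std::map` as a stand-in for `MapVector` and a hypothetical `Value` struct in place of DbgValue, so it illustrates the general C++ rule and is not the LiveDebugValues code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical value type with no default constructor.
struct Value {
  explicit Value(int id) : id(id) {}
  int id;
};

// Inserts a key, overwrites it, and returns the id that ends up stored.
inline int insertThenOverwrite() {
  std::map<std::string, Value> values;
  // values["x"] = Value(1);  // would not compile: operator[] needs to
  //                          // default-construct a Value before assigning.
  values.insert_or_assign("x", Value(1));  // key absent, so this inserts
  values.insert_or_assign("x", Value(2));  // key present, so this assigns
  assert(values.size() == 1);              // still a single entry
  return values.at("x").id;
}
```

Calling `insertThenOverwrite()` returns 2, the id of the second value, with one lookup per call to `insert_or_assign`.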
Kazu Hirata
8a9e9a89f0 [Analysis] Avoid repeated hash lookups (NFC) (#110397) 2024-09-29 08:50:37 -07:00
Timm Baeder
5c811ccc4d [clang][bytecode] Implement more binary operators for fixed point types (#110423) 2024-09-29 17:36:17 +02:00