RISCVGenDAGISel.inc can call this before it checks the node type.
Ensure the type is scalar before wasting time to do the more
computationally expensive checks.
This also avoids an assertion if we hit a VMV_X_S instruction
which doesn't have a VL operand which vectorPseudoHasAllNBitUsers
expects.
MultiClass argument is not used any more since aa84326.
Besides, for maintainability, we should put the implementation of
qualifying name in one place (that is `QualifyName` function), so
`Scoper` is removed and we use `IsMC` to indicate that we are in a
multiclass.
There are already 3 issues about the broken state of
-fdelayed-template-parsing and C++20 modules:
- https://github.com/llvm/llvm-project/issues/61068
- https://github.com/llvm/llvm-project/issues/64810
- https://github.com/llvm/llvm-project/issues/65027
The problem is more complex than I thought. I am not sure how to fix it
properly now. Given the complexities and -fdelayed-template-parsing is
actually an extension to support old MS codes, I think it may make sense
to not enable the -fdelayed-template-parsing option by default with
C++20 modules to give more user friendly experience. Users who still
want -fdelayed-template-parsing can specify it explicitly.
Also according to https://learn.microsoft.com/en-us/cpp/build/reference/permissive-standards-conformance?view=msvc-170, MSVC actually defaults to -fno-delayed-template-parsing (/Zc:twoPhase-
with MSVC CLI) if using C++20. So we match the behavior with MSVC here to
not enable -fdelayed-template-parsing by default after C++20.
Add a new attribute `bufferization.manual_deallocation` that can be
attached to allocation and deallocation ops. Buffers that are allocated
with this attribute are assigned an ownership of "false". Such buffers
can be deallocated manually (e.g., with `memref.dealloc`) if the
deallocation op also has the attribute set. Previously, the
ownership-based buffer deallocation pass used to reject IR with existing
deallocation ops. This is no longer the case if such ops have this new
attribute.
This change is useful for the sparse compiler, which currently
deallocates the sparse tensor buffers by itself.
When linking an executable with a slightly larger executable,
ld.lld --call-graph-profile-sort=cdsort can be very slow (see #68638).
```
4.6% 20.7Mi .text.hot
3.5% 15.9Mi .text
3.4% 15.2Mi .text.unknown
```
Add cl option `cdsort-max-chain-size`, which is similar to
`ext-tsp-max-chain-size`, and set it to 128, to improve performance.
In `ld.lld @response.txt --threads=4 --call-graph-profile-sort=cdsort
--time-trace"
builds, the "Total Sort sections" time is measured as follows:
* -mllvm -cdsort-max-chain-size=64: 1.321813
* -mllvm -cdsort-max-chain-size=128: 2.030425
* -mllvm -cdsort-max-chain-size=256: 2.927684
* -mllvm -cdsort-max-chain-size=512: 5.493106
* unlimited: 9 minutes
The rest part takes 6.8s.
lldb/test/Shell/Breakpoint/breakpoint-command.test adds a python
command, to be executed when a breakpoint hits, that writes out a
number. It then runs, hits the breakpoint and checks that the number is
present exactly once.
The problem is that on some systems the test can be run in a filepath
that happens to contain the number (e.g. auto-generated directory
names). The number is then detected multiple times and the test fails.
This patch fixes the issue by using a string instead, particularly a
string with spaces, which is very unlikely to be auto-generated by any
system.
Fix for a crash reported in #64975.
The crash occurs in the cast located
[here](ece5dd101c/mlir/lib/Analysis/DataFlow/DeadCodeAnalysis.cpp (L262))
because `llvm.unreachable` doesn't implement
`RegionBranchTerminatorOpInterface`.
The crash is caused by `DeadCodeAnalysis` assuming that
`isa<RegionBranchOpInterface>(op->getParentOp())` implies
`isa<RegionBranchTerminatorOpInterface>(op)` in
`DeadCodeAnalysis::visit()`.
This patch tried to fix this by enabling the analysis to proceed
regardless of whether `op` is a `RegionBranchTerminatorOpInterface`.
The pass only requires that it can determine a uniquely identified
target at some offsets. Multiple relocations at the same offset are fine
otherwise and will be required when adding exception handling support
for RISC-V.
Add simplification for redundant trunc(zext A) pairs. Generally apply a
transform from D149903.
Depends on D159200.
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D159202
These files satisfy all of the following:
- misc-include-cleaner indicates that these files do not need
Endian.h.
- They do not mention "endian" anywhere.
- They do not include any *.inc or *.def, which could need
llvm::support::endian.
This patch adds "nice-to-have" feature in lit.
it prints the total number of discovered tests at the beginning. It is
covenient to see the total number of tests and avoid scrolling up to the
beginning of log.
Further, this patch also prints %ge of tests.
This patch fixes tests pointed by previous attempt of landing this
patch.
Reviewed By: RoboTux, jdenny-ornl
Co-authored-by: Madhur A <madhura@nvidia.com>
Due to an issue when lowering from scf to spirv as there was no
conversion pass for index to spirv, we are motivated to add a conversion
pass from the Index dialect to the SPIR-V dialect. Furthermore, we add
the new conversion patterns to the scf-to-spirv conversion.
Fixes https://github.com/llvm/llvm-project/issues/63713
---------
Co-authored-by: Jeremy Kun <jkun@google.com>