The LLVM Style Guide says the following about error and warning messages
[1]:
> [T]o match error message styles commonly produced by other tools,
> start the first sentence with a lowercase letter, and finish the last
> sentence without a period, if it would end in one otherwise.
I often provide this feedback during code review, but we still have a
bunch of places where we have inconsistent error messages, which bothers
me as a user. This PR identifies a handful of those places and updates
the messages to be consistent.
[1] https://llvm.org/docs/CodingStandards.html#error-and-warning-messages
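As a rough illustration of the rule quoted above, the sketch below rewrites a message into that style. The helper name `normalize_diagnostic` is hypothetical and is not part of LLVM. The heuristic for leaving acronyms alone is an assumption made for this example.

```python
def normalize_diagnostic(message: str) -> str:
    """Hypothetical helper: lowercase the first letter, drop a trailing period."""
    text = message.strip()
    # Only lowercase a leading capital when the next character is lowercase,
    # so that acronyms such as "LLVM" keep their casing (an assumption of
    # this sketch, not something the style guide spells out).
    if len(text) >= 2 and text[0].isupper() and text[1].islower():
        text = text[0].lower() + text[1:]
    # Drop a single trailing period, but leave an ellipsis alone.
    if text.endswith(".") and not text.endswith(".."):
        text = text[:-1]
    return text
```

For example, `normalize_diagnostic("Invalid argument.")` yields `invalid argument`, while a message that already follows the style is returned unchanged.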
Our downstream libclc adds a few more targets that customize build_flags
and opt_flags. Then in each customization block, MACRO_ARCH is defined
to be ${ARCH}.
Hoisting the MACRO_ARCH definition out of the if-else-end block avoids code
duplication. This also avoids potential errors when the MACRO_ARCH definition
is forgotten, e.g. in https://github.com/intel/llvm/pull/19971.
# Patch
Currently, in Server Mode (i.e. `--connection`), all debuggers are
destroyed when the **lldb-dap process** terminates. This causes logging
and release of resources to be delayed. This can also cause congestion
if multiple debuggers have the same destroy callbacks, which will fight
for the same resources (e.g. web requests) at the same time.
Instead, the debuggers can be destroyed as early as when the **debug
session** terminates. This way, logging and release of
resources can happen as soon as possible. Congestion can also be
naturally reduced, because it's unlikely that all debug sessions will
terminate at the same time.
# Tests
See PR #156231.
In order to help LLVM disambiguate accesses to the COMMON
block variables, this patch creates a TBAA sub-tree for each
COMMON block, and then places all variables belonging to this
COMMON block into this sub-tree. The structure looks like this:
```
common /blk/ a, b, c
"global data"
|
|- "blk_"
|
|- "blk_/bytes_0_to_3"
|- "blk_/bytes_4_to_7"
|- "blk_/bytes_8_to_11"
```
The TBAA tag for "a" is created in "blk_/bytes_0_to_3" root, etc.
The byte spans are created based on the `storage` information
provided by `fir.declare` (#155742).
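To make the naming in the diagram concrete, the sketch below derives the per-variable sub-tree names from byte offsets and sizes. The function name `byte_span_nodes` and the `(variable, offset, size)` input tuples are hypothetical; they stand in for the `storage` information and are not the actual flang data structures.

```python
def byte_span_nodes(block_name, members):
    """Map each variable to the name of its byte-span node.

    `members` is a list of (variable, byte_offset, byte_size) tuples, a
    hypothetical stand-in for the storage information of each variable.
    """
    # The diagram uses the block name with a trailing underscore as the
    # root of the block's sub-tree.
    root = f"{block_name}_"
    return {
        var: f"{root}/bytes_{offset}_to_{offset + size - 1}"
        for var, offset, size in members
    }


# Three 4-byte variables laid out back to back, as in `common /blk/ a, b, c`.
nodes = byte_span_nodes("blk", [("a", 0, 4), ("b", 4, 4), ("c", 8, 4)])
```

Variables whose byte spans do not overlap end up in different nodes, which is what allows their accesses to be told apart.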
See also:
https://discourse.llvm.org/t/rfc-flang-representation-for-objects-inside-physical-storage/88026
The existing `__ubsan_report_error` should be enough to solve this.
The ability to override on two levels may result in hard-to-debug bugs
when strong definitions of __ubsan_report_error and
__ubsan_handle_##name##_minimal_abort
are provided by unrelated components in the same binary.
With a single entry point, we will at least get a linker error.
Reverts llvm/llvm-project#154220
Remove from true16 mode another build_vector pattern which takes an i16
but places it in a VGPR_32. This stops isel from generating the illegal
"vgpr_32 = COPY vgpr_16".
ISel will use the vgpr16 build_vector pattern in true16 mode instead.
This patch introduces a new scripting affordance in lldb:
`ScriptedFrame`.
This allows users to produce mock stack frames in scripted threads and
scripted processes from a Python script.
With this change, a StackFrame can be synthesized from different sources:
- Either from a dictionary containing a load address and a frame index,
which is the legacy way.
- Or by creating a ScriptedFrame Python object.
One particularity of synthesizing stack frames from the ScriptedFrame
Python object is that these frames have an optional PC, meaning that
they don't have to report a valid PC and can act as shells that just
contain static information, like the frame function name, the list of
variables or registers, etc. They can also provide a symbol context.
rdar://157260006
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>
Print the source location of the instruction definition in a comment next
to the enum value for each instruction. To make this more readable,
change formatting of the instruction enums to be better aligned.
Example output:
```
VLD4qWB_register_Asm_8 = 573, // (ARMInstrNEON.td:8849)
VMOVD0 = 574, // (ARMInstrNEON.td:6337)
VMOVDcc = 575, // (ARMInstrVFP.td:2466)
VMOVHcc = 576, // (ARMInstrVFP.td:2474)
VMOVQ0 = 577, // (ARMInstrNEON.td:6341)
```
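One way such aligned rows could be produced is sketched below. The function `format_enum_rows` and its input tuples are hypothetical and do not reflect the actual TableGen emitter code; the column layout is an assumption for illustration.

```python
def format_enum_rows(rows):
    """Format (name, value, location) tuples as aligned enum entries."""
    # Pad every name to the longest one so the '=' signs line up, and
    # right-align values so the trailing comments start in the same column.
    name_width = max(len(name) for name, _, _ in rows)
    value_width = max(len(str(value)) for _, value, _ in rows)
    lines = []
    for name, value, location in rows:
        entry = f"{name.ljust(name_width)} = {str(value).rjust(value_width)},"
        lines.append(f"{entry} // ({location})")
    return lines


lines = format_enum_rows([
    ("VLD4qWB_register_Asm_8", 573, "ARMInstrNEON.td:8849"),
    ("VMOVD0", 574, "ARMInstrNEON.td:6337"),
    ("VMOVDcc", 575, "ARMInstrVFP.td:2466"),
])
```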
The arithmetic expansion requires fewer registers, and is often fewer
instructions. The critical path does increase by (up to) one
instruction.
This is a sub-case of the expansion we do without zicond, but restricted
specifically to the simm12 case. In the general case where the other
source is a register, using zicond is likely better. (Edit: While
technically true, this is a bit misleading: we do this in
combineSelectToBinOp which is also used in the zicond path, just further
down.)
Otherwise we end up running ninja without any targets specified which
just builds the rest of the default enabled targets. This shouldn't have
too much impact, but can involve building extra things that we don't
need.
This also makes the monolithic-windows.sh script consistent with the
monolithic-windows.sh script.
The branch weights capture probability. The probability has everything to do with the (SSA) value the condition is predicated on, and nothing to do with the position in the CFG.
This patch adds basic support for constant record initializer list
expressions. There are several limitations:
* No zero initialized padding bytes in C mode
* No bitfields
* No designated initializer lists
* Record alignments are not calculated yet
* ILEs of derived records don't work yet
* The constant attribute is not propagated to the backend, resulting in
non-constants being emitted in the LLVM IR
It was added in dd33f9cdef to describe
thread-local errno, but is no longer used in the codebase (with the
exception of a single integration test, but the llvm-libc-provided
`#define _Thread_local thread_local` is not needed there anyway, since
`_Thread_local` is a keyword from C11 onwards).
This patch enhances the GPU support documentation page (`support.html`)
by adding a new, detailed section for `math.h`. This new section
presents the results of the GPU math conformance tests, providing
quantitative data on the accuracy of the supported higher math
functions.
This adds support for zero-initialization during delegating constructor
processing.
Note, this also adds code to skip emitting constructors that are trivial
and default to match the classic codegen behavior. The incubator does
not skip these constructors, but I have found a case where this results
in a call to a default constructor that is never defined.
This patch consolidates updating loop metadata and profile info for both
the remainder and vector loops in a single place. This is NFC, modulo
consistently applying vectorization specific metadata also in the
experimental VPlan-native path.
Split off from https://github.com/llvm/llvm-project/pull/154510.
Expressions/references with 'bounds' are going to need to do
initialization significantly differently, so we need to have the
initializer and the declaration 'separate' in the future. This patch
splits the AST node into two, and normalizes them a bit.
Additionally, since this required significant work on the recipe
generation, this patch also does a bit of a refactor to improve
readability and future expansion, now that we have a good understanding
of how these are going to look.
The name is the most interesting part, and if you really need the number,
you can use the name to find the entry in the enum or use the first field
of the table row.
Add a bunch of mnemonics to the command options now that they're
highlighted in the help output. This uncovered two issues:
- We had an instance where we weren't applying the ANSI formatting.
- We had a place where we were now incorrectly computing the column
width.
Both are fixed by this PR.
Some passes synthesize functions, e.g. WPD, so we may need to indicate “this synthesized function’s entry count cannot be estimated at compile time” - akin to `branch_weights`.
Issue #147390
MS link.exe provides the `/sectionlayout:@` option to specify the order
of output sections at the granularity of individual sections. LLD/COFF
currently has no capability for user-controlled ordering of one
or more output sections (as LLD/COFF does not support linker scripts),
and this PR adds the option to align with MS link.exe.
The option accepts only a file that specifies the order of sections, one
per line. For example, the following `mylayout.txt` places the `.text`
section after all other sections while preserving the original relative
order of the remaining sections.
```
.data
.rdata
.pdata
.rsrc
.reloc
.text
```
```bash
echo 'int main() { return 0; }' > main.c
cl main.c /link /entry:main /sectionlayout:@mylayout.txt
llvm-readobj --sections main.exe
```
In architectures where pointers may contain metadata, such as arm64e,
the metadata may need to be cleaned prior to sending this pointer to be
used in expression evaluation generated code.
This patch is a step towards allowing consumers of pointers to decide
whether they want to keep or remove metadata, as opposed to discarding
metadata at the moment pointers are created. See #150537.
This was tested running the LLDB test suite on arm64e.
(The first attempt at this patch caused a failure in
TestScriptedProcessEmptyMemoryRegion.py. This test exercises a case
where IRMemoryMap uses host memory in its allocations; pointers to such
allocations should not be fixed, which is what the original patch failed
to account for.)
Commutable instructions can be reordered during tree building, and if
the parent node is not scheduled, its ScheduleData elements are
considered independent and the compiler does not look for reordered
operands. The scheduling of copyables needs to be cancelled in this case.
When partially or runtime unrolling loops with reductions, currently the
reductions are performed in-order in the loop, negating most benefits
from unrolling such loops.
This patch extends unrolling code-gen to keep a parallel reduction phi
per unrolled iteration and combine the final result after the loop.
For out-of-order CPUs, this allows executing multiple reduction chains
in parallel.
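The idea can be illustrated at source level with a small sketch. This is not the transformation's code, only an analogy: `sum_in_order` corresponds to a single reduction chain, and `sum_unrolled_parallel` keeps one accumulator per unrolled iteration and combines them after the loop. Both function names and the explicit tail handling are assumptions of this example.

```python
def sum_in_order(values):
    """Single reduction chain: every add depends on the previous one."""
    acc = 0
    for v in values:
        acc += v
    return acc


def sum_unrolled_parallel(values, unroll=4):
    """One accumulator per unrolled iteration, combined after the loop."""
    accs = [0] * unroll
    main = len(values) - len(values) % unroll
    for i in range(0, main, unroll):
        # The adds below are independent of each other, so an out-of-order
        # CPU could execute these reduction chains in parallel.
        for lane in range(unroll):
            accs[lane] += values[i + lane]
    # Combine the partial results once, after the loop.
    total = 0
    for acc in accs:
        total += acc
    # Leftover elements that do not fill a whole unrolled iteration.
    for v in values[main:]:
        total += v
    return total
```

Because integer addition is associative, both versions return the same result; only the dependency structure of the loop body differs.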
For now, the initial transformation is restricted to cases where we
unroll a small number of iterations (hard-coded to 4, but should maybe
be capped by TTI depending on the execution units), to avoid introducing
an excessive amount of parallel phis.
It also requires single block loops for now, where the unrolled
iterations are known to not exit the loop (either due to runtime
unrolling or partial unrolling). This ensures that the unrolled loop
will have a single basic block, with a single exit block where we can
place the final reduction value computation.
The initial implementation also only supports parallelizing loops with a
single reduction and only integer reductions. Those restrictions are
just to keep the initial implementation simpler, and can easily be
lifted as follow-ups.
With the corresponding TTI change to the AArch64 unrolling preferences, which
I will also share soon, this triggers in ~300 loops across a wide range of
workloads, including LLVM itself, ffmpeg, av1aom, sqlite, blender,
brotli, zstd and more.
PR: https://github.com/llvm/llvm-project/pull/149470
In the case you link libedit statically with a vendored sysroot, this
flag is also required. It should be harmless in the case you link it
dynamically since libedit already links libbsd otherwise.
Add RISCVVLOptimizer support for unit-stride, strided, and indexed
strided segmented stores. The biggest change was adding the capability
to look through INSERT_SUBREG, which was used for composing segmented
register class values.
Fix #149350
A couple of the Ubuntu CI bots failed to compile, saying that
MappingTraits was unqualified despite a
`using llvm::yaml::MappingTraits` earlier in the file.
For PR https://github.com/llvm/llvm-project/pull/153911
env -i is needed for some lit tests. The feature requires a minimal
amount of work to support and there is no easy way to rewrite the tests
that require it.
At least two tests need this:
1. clang/test/Driver/env.c
2. lldb/test/Shell/Host/TestCustomShell.test
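For context, `env -i` runs a command with an environment that contains only the variables given on the command line. The sketch below shows the same behavior using Python's `subprocess` module; it is an illustration of the semantics, not the lit implementation, and the helper name `child_environment` is hypothetical.

```python
import json
import subprocess
import sys


def child_environment(extra_env=None):
    """Run a child process with only `extra_env` set and return its environment."""
    child_code = "import json, os; print(json.dumps(dict(os.environ)))"
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        # Passing an explicit mapping replaces the inherited environment,
        # much like `env -i NAME=value command` does.
        env=dict(extra_env or {}),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Variables exported in the parent are not visible in the child unless they are passed explicitly, although the child runtime may add a few entries of its own.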
This PR is a reapply of
https://github.com/llvm/llvm-project/pull/154949, which failed one of
the sanitizer checks.
The issue was querying the `warpOp` results in `LoadDistribution` after
calling `moveRegionToNewWarpOpAndAppendReturns()`, which resulted in a
use-after-free. This PR solves the issue by moving the op query before the
call and is otherwise identical to the one linked above.
---------
Co-authored-by: Charitha Saumya <136391709+charithaintc@users.noreply.github.com>