These APIs are MachO-specific, and the interfaces are about to be
extended to support more MachO-specific behavior. For now it makes sense
to group them with other MachO-specific APIs in MachO.h.
Arm64EC indirect calls use a function, __os_arm64x_check_icall, which
has one obvious return value: x11, the function to call. However, it
also returns one other important value: x9, the final destination for
the emulator after the call. If the call targets x64 code, x9 is used
by the thunk.
Previously, we didn't model this, and it mostly worked because the
compiler usually doesn't modify x9 in the narrow window between the
check and the call. That said, it can happen in some cases; one
reliable way is to do an indirect tail-call with stack protectors
enabled. (You can also just get unlucky with register allocation, but
it's harder to write a testcase for that.)
This patch uses the cfguardtarget bundle to simplify the calling
convention handling, for reasons similar to why x64 uses it: modifying
arbitrary calls is difficult without a separate marking.
Fixes #167430.
There seem to be cases where the workflow status is completed but the
jobs have not completed. We need to gracefully handle these cases to
avoid a crash loop in the metrics container.
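A minimal sketch of that kind of guard, not the actual metrics code. It
assumes hypothetical dict-shaped job records whose `completed_at` field
is None while a job is still running; the real script may use different
types and field names.

```python
from datetime import datetime, timezone


def completed_job_durations(jobs):
    """Return {job name: duration in seconds} for finished jobs only.

    `jobs` is a hypothetical list of dicts with `name`, `started_at`,
    and `completed_at` keys (an assumption made for illustration).
    """
    durations = {}
    for job in jobs:
        # A workflow can report "completed" while a job has no end time
        # yet. Skip such jobs instead of raising, so a later pass can
        # pick them up once they have finished.
        if job.get("started_at") is None or job.get("completed_at") is None:
            continue
        elapsed = job["completed_at"] - job["started_at"]
        durations[job["name"]] = elapsed.total_seconds()
    return durations


start = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
end = datetime(2025, 1, 1, 12, 5, 0, tzinfo=timezone.utc)
sample_jobs = [
    {"name": "build", "started_at": start, "completed_at": end},
    {"name": "test", "started_at": start, "completed_at": None},
]
sample_durations = completed_job_durations(sample_jobs)
```

Skipping the unfinished job rather than raising is the design choice
that keeps a long-running collector from restarting in a loop.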
This reverts commit bde9062418.
This caused failures on Darwin that were not caught by upstream
buildbots. Reverting for now to give myself some time to fix.
The core LLVM library implements a specialization for
`ilist_node_base<true, void>`, which is used by other components. This
is needed to link properly when building LLVM as a library on Windows.
This effort is tracked in #109483.
These functions should be declared in `stdlib.h`, not `wchar.h`, as
confusing as it is. Move them to the proper header file and matching
directories in src/ and test/ trees.
This was discovered while testing the libc++ build against llvm-libc.
libc++ re-declares functions like mbtowc in the std namespace in the
`<cstdlib>` header, and then uses those functions in its locale
implementation.
This commit adds a new helper function that creates various mock objects
that can be used in dwarf expression testing. The optional register
value and memory contents are used to create MockProcessWithMemRead and
MockRegisterContext that can return expected memory contents and
register values.
This simplifies some tests by removing redundant code that creates these
objects in individual tests and consolidates the logic into one place.
We build the callsite graph by first adding nodes and edges for all
allocation contexts, then matching the interior callsite nodes onto
actual calls (IR or summary), which due to inlining may result in the
generation of new nodes representing the inlined context sequence. We
attempt to update edges correctly during this process, but in the case
of recursion it is not always possible to get this right.
Specifically, when creating new inlined sequence nodes for stack ids on
recursive cycles we can't always update correctly, because we have lost
the original ordering of the context.
This PR introduces a mechanism, guarded by the -memprof-top-n-important=
flag, to keep track of extra information for the largest N cold
contexts. Another flag, -memprof-fixup-important (enabled by default),
performs a more expensive fixup of the edges for those largest N cold
contexts, by saving and walking the original ordered list of stack ids
from the context.
This allows SDNodes to be validated against their expected type profiles
and reduces the number of changes required to add a new node.
The validation functionality has detected several issues, see
`PPCSelectionDAGInfo::verifyTargetNode()`.
Most of the nodes have a description in `*.td` files and were
successfully "imported". Those that don't have a description are listed
in the enum in `PPCSelectionDAGInfo.td`. These nodes are not validated.
Part of #119709.
Pull Request: https://github.com/llvm/llvm-project/pull/168108
This change adds the ACCImplicitRoutine pass which implements the
OpenACC specification for implicit routine directives (OpenACC 3.4 spec,
section 2.15.1).
According to the specification: "If no explicit routine directive
applies to a procedure whose definition appears in the program unit
being compiled, then the implementation applies an implicit routine
directive to that procedure if any of the following conditions holds:
The procedure is called or its address is accessed in a compute region."
The pass automatically generates `acc.routine` operations for functions
called within OpenACC compute constructs or within existing routine
functions that do not already have explicit routine directives. It
recursively applies implicit routine directives while avoiding infinite
recursion when dependencies form cycles.
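The recursive application with cycle avoidance can be sketched as a
worklist traversal. This is an illustrative model only, not the pass
itself: the call graph is a hypothetical dict of function names, and
the device-type bind clause handling is left out.

```python
def implicit_routines(call_graph, compute_region_callees, explicit_routines):
    """Return the functions that would receive an implicit routine.

    All three parameters are hypothetical inputs for this sketch:
    `call_graph` maps a function name to the names it calls,
    `compute_region_callees` lists functions called from compute
    constructs, and `explicit_routines` is the set of functions that
    already carry a routine directive.
    """
    implicit = set()
    visited = set()
    # Seed with calls made from compute regions and from existing routines.
    worklist = list(compute_region_callees)
    for routine in explicit_routines:
        worklist.extend(call_graph.get(routine, ()))
    while worklist:
        func = worklist.pop()
        if func in visited:
            continue  # already processed; this is what breaks cycles
        visited.add(func)
        if func not in explicit_routines:
            implicit.add(func)
        # Callees of a routine must themselves be routines.
        worklist.extend(call_graph.get(func, ()))
    return implicit


graph = {"a": ["b"], "b": ["a", "c"], "c": [], "r": ["d"], "d": []}
result = implicit_routines(graph, ["a"], {"r"})
```

Here `a` and `b` call each other, yet each is visited once because of
the `visited` set.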
Key features:
- Walks through all OpenACC compute constructs (parallel, kernels,
serial) to identify function calls
- Creates implicit `acc.routine` operations for functions without
explicit routine declarations
- Recursively processes existing `acc.routine` operations to handle
transitive dependencies
- Avoids infinite recursion through proper tracking of processed
routines
- Respects device-type specific bind clauses to skip routines bound to
different device types
Requirements:
- Function operations must implement `mlir::FunctionOpInterface` to be
identified and associated with routine directives.
- Call operations must implement `mlir::CallOpInterface` to detect
function calls and traverse the call graph.
- Optionally pre-register `acc::OpenACCSupport` if custom behavior is
needed for determining if a symbol use is valid within GPU regions (such
as functions which are already considered for offloading even
without `acc routine` markings)
Co-authored-by: delaram-talaashrafi<dtalaashrafi@nvidia.com>
During the initialization sequence in our tests, the first 'threads'
response should only be kept if the process is actually stopped;
otherwise we will have stale data.
In VSCode, during the debug session startup sequence, a 'threads'
request is made immediately after 'configurationDone'. This initial
request retrieves the main thread's name and id so the UI can be populated.
However, in our tests we do not want to cache this value unless the
process is actually stopped. We do need to make this initial request
because lldb-dap is caching the initial thread list during
configurationDone before the process is resumed. We need to make this
call to ensure the cached initial threads are purged.
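A minimal sketch of the caching rule described above. `FakeSession` and
its method are hypothetical stand-ins written for illustration, not the
real test harness; only the `body.threads` shape of the response follows
the Debug Adapter Protocol.

```python
class FakeSession:
    """Hypothetical stand-in for a DAP test session."""

    def __init__(self):
        # Cached thread list, or None when nothing trustworthy is cached.
        self.threads = None

    def handle_threads_response(self, response, process_stopped):
        # Keep the thread list only while the process is stopped. A
        # response received while the process is running may already be
        # stale, so drop any cached value instead.
        if process_stopped:
            self.threads = response["body"]["threads"]
        else:
            self.threads = None
        return self.threads


session = FakeSession()
reply = {"body": {"threads": [{"id": 1, "name": "main"}]}}
while_running = session.handle_threads_response(reply, process_stopped=False)
while_stopped = session.handle_threads_response(reply, process_stopped=True)
```

The request is still issued in both cases; only the decision to cache
the result depends on the process state.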
I noticed this in a CI job for another review
(https://github.com/llvm/llvm-project/actions/runs/19348261989/job/55353961798)
where the tests incorrectly failed to fetch the threads prior to
validating the thread names.
Starting in version 15, GCC emits a `.base64` directive instead of
`.string` or `.ascii` for char arrays of length `>= 3`.
See [this godbolt link](https://godbolt.org/z/ebhe3oenv) for an example.
This patch adds support for the .base64 directive to AsmParser.cpp, so
tools like `llvm-mc` can process the output of GCC more effectively.
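To illustrate what the directive carries, here is a small sketch that
decodes the payload of a `.base64` line back into the bytes a `.string`
or `.ascii` directive would have spelled out. The helper name is
hypothetical, it handles only a single quoted operand, and it is not
how AsmParser.cpp implements the directive.

```python
import base64


def decode_base64_directive(line):
    """Decode the quoted payload of a `.base64 "..."` line into bytes."""
    directive, _, operand = line.strip().partition(" ")
    if directive != ".base64":
        raise ValueError("not a .base64 directive: " + line)
    payload = operand.strip().strip('"')
    # validate=True rejects characters outside the base64 alphabet.
    return base64.b64decode(payload, validate=True)


# Round trip: encode a NUL-terminated char array, then decode it again.
data = b"hello\x00"
line = '\t.base64 "' + base64.b64encode(data).decode("ascii") + '"'
decoded = decode_base64_directive(line)
```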
This addresses #165499.
The motivation is to allow passes such as MachineLICM to hoist trivial
FMOV instructions out of loops, which previously did not happen even
when the RHS is a constant.
On most architectures, these expensive move instructions have a latency
of 2-6 cycles, and are certainly not as cheap as a 0-1 cycle move.
In `-Wunsafe-buffer-usage`, many safe pattern checks can benefit from
constant folding. This commit improves null-terminated pointer checks by
folding conditional expressions.
rdar://159374822
---------
Co-authored-by: Balázs Benics <benicsbalazs@gmail.com>
These are simply implemented as specializations of strtofloatingpoint
for double / long double and for wchar_t. The unit tests are copied from
the strtod / strtold ones.
Update VPlan to populate VPIRMetadata during VPInstruction construction
and use it when creating widened recipes, instead of constructing
VPIRMetadata from the underlying IR instruction each time.
This centralizes VPIRMetadata in VPInstructions and ensures metadata is
consistently available throughout VPlan transformations.
PR: https://github.com/llvm/llvm-project/pull/167253
We want to eliminate all .compile.fail.cpp tests since they are brittle:
these tests pass regardless of the specific compilation error, which
means that e.g. a missing include will render the test null.
This is not an exhaustive pass, just a few tests I stumbled upon.
Otherwise, we end up using whatever system-provided compiler runtime is
available, which doesn't work on macOS since compiler-rt is located
inside the toolchain path, which can't be found by default.
However, disable the tests for compiler-rt since those link against
the system C++ standard library while using the just-built libc++
headers, which is nonsensical and leads to undefined references on
macOS.
Replace addMetadata with setMetadata, which sets metadata, updating
existing entries or adding a new entry otherwise.
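The update-or-add behavior can be modeled as follows. This is a
hypothetical sketch over a plain list of (kind, node) pairs, with names
chosen for illustration; it is not the actual implementation.

```python
def set_metadata(entries, kind, node):
    """Update the entry for `kind` in place, or append a new one.

    `entries` is a hypothetical list of (kind, node) tuples.
    """
    for index, (existing_kind, _) in enumerate(entries):
        if existing_kind == kind:
            entries[index] = (kind, node)  # overwrite the existing entry
            return
    entries.append((kind, node))  # no entry for this kind yet


metadata = []
set_metadata(metadata, "tbaa", "A")
set_metadata(metadata, "noalias", "B")
set_metadata(metadata, "tbaa", "C")  # replaces "A" and keeps its position
```

Unlike a pure append, repeated calls with the same kind never produce
duplicate entries.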
This isn't strictly needed at the moment, but will be needed for
follow-up patches.
Currently there are no 32-bit presubmit builds for libc. This PR adds
a 32-bit build only (no tests) to check whether changes that land in
libc break 32-bit builds.
Co-authored-by: Aiden Grossman <aidengrossman@google.com>
This allows SDNodes to be validated against their expected type profiles
and reduces the number of changes required to add a new node.
Only one node, `GET_CCMASK`, is missing a description; the others were
successfully imported.
Part of #119709.
Pull Request: https://github.com/llvm/llvm-project/pull/168113
This patch adds the following device ASan hooks and guarded macros in
__clang_hip_libdevice_declares.h:
- Function Declarations
  - __asan_poison_memory_region
  - __asan_unpoison_memory_region
  - __asan_address_is_poisoned
  - __asan_region_is_poisoned
- Macros
  - ASAN_POISON_MEMORY_REGION
  - ASAN_UNPOISON_MEMORY_REGION
Adding structured types for the evaluate request handler.
This should be mostly a non-functional change. I did catch some spelling
mistakes in our tests ('variable' vs 'variables').