The driver-generated -cc1 command-lines for C++ named module inputs
introduce some command-line options which affect the canonical module
build command (and therefore the context hash).
This resets those options.
1. Fixed Android setjmp issue. The root cause is that TSan initializes
before longjmp_xor_key is set up. During __libc_init_vdso, a call to
strcmp triggers TSan initialization, which occurs before
__libc_init_setjmp_cookie. The solution is to call
InitializeLongjmpXorKey on the first use of longjmp_xor_key.
Additionally, correct LONG_JMP_SP_ENV_SLOT by following the bionic
source code.
2. Skip thr object range check on Android. On Android, thr is allocated
on the heap, causing the check to fail.
3. Disable intercepting clone on Android. pthread_create internally
calls clone. Disabling the interception of clone resolves the issue in
most scenarios.
4. Use a workaround to recover the thr pointer stored in
TLS_SLOT_SANITIZER slot, whose value was modified by Skia.
This PR resolves the NDK issue
https://github.com/android/ndk/issues/1041.
Test project: https://github.com/bytedance/android_tsan_sample/
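The "initialize on first use" approach from item 1 can be sketched roughly as follows. This is a minimal illustration, not the TSan runtime code: the globals, the `InitializeLongjmpXorKeySketch` and `GetLongjmpXorKey` names, and the fixed key value are all hypothetical stand-ins, and the sketch assumes the saved stack pointer is demangled by XOR-ing it with the key.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for runtime state; not the real TSan globals.
bool g_key_initialized = false;
std::uintptr_t g_longjmp_xor_key = 0;
int g_init_calls = 0;  // Only here so the sketch can show init runs once.

// Placeholder for the routine that derives the key. The real runtime
// obtains it from the platform; a fixed value is assumed here.
void InitializeLongjmpXorKeySketch() {
  g_longjmp_xor_key = 0x5a5a5a5au;
  ++g_init_calls;
}

// Accessor that initializes the key lazily, on first use, rather than
// relying on runtime startup having happened late enough.
std::uintptr_t GetLongjmpXorKey() {
  if (!g_key_initialized) {
    InitializeLongjmpXorKeySketch();
    g_key_initialized = true;
  }
  return g_longjmp_xor_key;
}

// Recover a stack pointer that was stored XOR-ed with the key.
std::uintptr_t DemangleSp(std::uintptr_t mangled) {
  return mangled ^ GetLongjmpXorKey();
}
```

Routing every read through the accessor means the key is valid no matter how early the runtime itself was initialized, at the cost of one branch per use.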
Add OnDiskDataAllocator, which is the data pool implementation inside an
OnDiskCAS that stores data in a single file. It is based on
MappedFileRegionArena and wrapped inside a CAS database file.
This adds the initial support for lowering the 'ctor' region of
cir.global operations to an init function which is called from a
TU-specific static initialization function.
This does not yet add an attribute to hold a list of global
initializers. That will be added in a future change.
Refactor and replace the explicit Imm `getImm*Encoding() | isU*Imm() |
isS*Imm()` functions with a generic one that takes the bit width as a
template parameter.
This is in preparation for a follow-up patch to implement `paddis`, which
takes a 32-bit pcrel Imm. The refactor avoids copying and pasting the
same set of functions again with only the bit length changed.
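As a rough illustration of the idea, a single pair of templates parameterized on the bit width can stand in for a family of per-width predicates. This is a sketch only: `isUImm<N>` and `isSImm<N>` below are free functions written for this example, and they are not the signatures used in the patch.

```cpp
#include <cassert>
#include <cstdint>

// True if Value fits in an N-bit unsigned immediate.
template <unsigned N> constexpr bool isUImm(std::uint64_t Value) {
  static_assert(N > 0 && N < 64, "bit width out of range for this sketch");
  return Value < (std::uint64_t{1} << N);
}

// True if Value fits in an N-bit signed (two's complement) immediate.
template <unsigned N> constexpr bool isSImm(std::int64_t Value) {
  static_assert(N > 0 && N < 64, "bit width out of range for this sketch");
  return Value >= -(std::int64_t{1} << (N - 1)) &&
         Value < (std::int64_t{1} << (N - 1));
}
```

With this shape, supporting a new immediate width is a matter of instantiating `isSImm<32>` or `isUImm<34>` rather than adding another hand-written function.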
Adds support for elementwise and aggregate splat casting of struct types
with bitfields. Replaces the existing flattening function, which
produced a list of GEPs representing a flattened object, with one that
produces a list of LValues representing a flattened object. The LValues
can be used by EmitStoreThroughLValue and EmitLoadOfLValue, ensuring
bitfields are properly loaded and stored. This also simplifies the code
in the elementwise and aggregate splat casting functions.
Closes #125986
Add a GPL notice to the GFortran test suite documentation and redirect
to the LICENSE file distributed with the test suite.
Co-authored-by: Cameron McInally <cmcinally@nvidia.com>
Fix crashes in the verifier of `transform.with_named_sequence` attribute
attached to a symbol table operation caused by it constructing a call
graph inside the symbol table. The call graph construction assumes calls
and callables, such as functions or named sequences, have been verified,
but it is not yet the case when the attribute verifier on the (parent)
symbol table operation runs. Trigger such verification manually before
constructing the call graph. This adds redundancy in verification, but
there is currently no mechanism to change the order of verification. In
performance-critical scenarios, verification can be disabled altogether.
Remove unnecessary verification from `transform::IncludeOp::getEffects`.
It was introduced along with the op definition as the op used to inspect
the body of callee, which assumed the body existed, to identify handle
consumption behavior. This later evolved into explicit argument
attributes on the callee, and the absence of such attributes is handled
gracefully without the need for verification, but the verification was
never removed. Keeping it in place would have caused infinite
recursion.
Fixes #159646.
Fixes #159734.
Fixes #159736.
This PR, which supersedes
https://github.com/llvm/llvm-project/pull/139943, extends the scenarios
where the 'norecurse' attribute can be inferred.
Currently, the 'norecurse' attribute is only inferred if all called
functions also have this attribute. This change introduces a new pass in
the LTO pipeline, run after Whole Program Devirtualization, to broaden
the inference criteria. The new pass inspects all functions in the
module and sets a flag if any functions are external or have their
addresses taken (while ignoring those already marked norecurse). This
flag is then used with the existing conditions to enable inference in
more cases.
This enhancement allows 'norecurse' to be applied in situations where a
function calls a recursive function, but is not part of the same
recursion chain.
For example, foo can now be marked 'norecurse' in the following
scenarios:
`foo -> callee1 -> callee2 -> callee2`
In this case, foo and callee1 can both be marked 'norecurse' because
they're not part of the callee2 recursion.
Similarly, foo can be marked 'norecurse' here:
`foo -> callee1 -> callee2 -> callee1`
Here, foo is not part of the callee1 -> callee2 -> callee1 recursion
chain, so it can be marked 'norecurse'.
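The two scenarios above can be modeled with a toy call graph. The sketch below is not the pass itself: `CallGraph`, `canReachItself`, and `canInferNoRecurse` are hypothetical names, and the whole-module flag is reduced to a single boolean. It treats a function as a candidate for 'norecurse' when no function in the module is external or address-taken and the function cannot reach itself through direct calls.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy call graph: caller name -> names of directly called functions.
using CallGraph = std::map<std::string, std::vector<std::string>>;

// Returns true if Fn can reach itself by following direct call edges.
bool canReachItself(const CallGraph &G, const std::string &Fn) {
  std::set<std::string> Visited;
  std::vector<std::string> Worklist;
  auto It = G.find(Fn);
  if (It != G.end())
    Worklist = It->second;
  while (!Worklist.empty()) {
    std::string Cur = Worklist.back();
    Worklist.pop_back();
    if (Cur == Fn)
      return true;
    if (!Visited.insert(Cur).second)
      continue;  // Already explored this callee.
    auto CurIt = G.find(Cur);
    if (CurIt != G.end())
      Worklist.insert(Worklist.end(), CurIt->second.begin(),
                      CurIt->second.end());
  }
  return false;
}

// Assumption: one module-wide flag summarizes "some function is external
// or has its address taken"; if set, nothing is inferred in this model.
bool canInferNoRecurse(const CallGraph &G, const std::string &Fn,
                       bool AnyExternalOrAddressTaken) {
  return !AnyExternalOrAddressTaken && !canReachItself(G, Fn);
}
```

For the graph `foo -> callee1 -> callee2 -> callee2`, this model accepts `foo` and `callee1` and rejects `callee2`; for `foo -> callee1 -> callee2 -> callee1`, it accepts only `foo`.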
Report __opencl_c_program_scope_global_variables and
__opencl_c_device_enqueue as supported. These 2.0 features are
supported but were missing from the extension map.
__opencl_c_atomic_scope_all_devices should also be reported, but
simply adding it to the map does not seem to work for some
reason.
The existing test for these macros was also broken, since it was
missing CL3.0 run lines, so add those.
Test added by #159366
This is causing objdump to crash more often than not on our 2-stage
SVE bots. Disabling it for now; I will investigate tomorrow.
Could be the changes in the PR, or a pre-existing codegen or
llvm-objdump problem.
This isn't what this is for. In the sense this hook is concerned with,
you can copy between AGPRs. This only changes some DAG scheduling
decisions; later passes are responsible for dealing with the bad
agpr-agpr handling.
PR #158641 introduced an issue where i128 accumulator types resulted
in a valid cost, because for a <2 x i128> type the code that
checks for unsupported type legalization would see a type action
of 'TypeSplitVector' which is supported, even though the legalised
type of <1 x i128> would require further scalarization.
This fixes https://github.com/llvm/llvm-project/issues/162009
Make findBaseObject() look through addrspacecast, so that
getAliaseeObject() works with an aliasee that uses an addrspacecast.
This fixes a crash during module summary index emission.
Fixes https://github.com/llvm/llvm-project/issues/161646.
These work the same as the other two (private and reduction) except that
the expression for the 'init' is a copy instead of a default/value init,
and in a separate region. This patch gets all of that correct, and
ensures we generate these as expected.
There is a little extra work to make sure that the bounds-loop
generation does 2 separate array index operations, otherwise this is
very much like the reduction implementation.
Fixes #152893.
An assert was raised when a constexpr virtual function was called from
a constexpr array element with -fexperimental-new-constant-interpreter
set.
Implement SymbolOpInterface on tosa.variable so that its declaration is
automatically inserted into its parent's SymbolTable.
Verifiers for tosa.variable_read/write can now look up the symbol and
guarantee it exists, and duplicate names are caught at creation time.
Previously this was done by walking the graph, which could be
inefficient.
Unfortunately, the Symbol trait expects to find a symbol name via a
hard-coded attribute name "sym_name". Therefore, "name" is renamed
to "sym_name" and a getName() wrapper is provided for backwards
compatibility.
This change also restricts tosa.variable declarations to ops that carry
a SymbolTable (e.g. modules), rather than allowing them to be placed
inside a func.func.
Note: EXT-VARIABLE is an experimental extension in the TOSA
specification, so is not subject to backwards compatibility guarantees.
Ninja is officially included among the preinstalled tools on the Windows
runners now.
This should reduce the risk of stray failures here; attempting to
install Ninja through Chocolatey has sometimes caused spurious
failures.
The registration of this callback handler was disabled for some reason.
Local testing did not bring up any issues when I enabled it.
Side effect: this silences the current warning about an unused function.
This commit implements the backend portion of the typed buffer counter
proposal described in
https://github.com/llvm/wg-hlsl/blob/main/proposals/0023-typed-buffer-counters.md.
This is the second part of the implementation, focusing on the LLVM IR
and SPIR-V backend.
Specifically, this commit implements the "LLVM IR Generation and Backend
Handling" section of the proposal. This includes:
- Adding the `llvm.spv.resource.counterhandlefromimplicitbinding` and
`llvm.spv.resource.counterhandlefrombinding` intrinsics.
- Implementing the selection of these intrinsics in the SPIRV backend to
generate the correct `OpVariable` and `OpDecorate` instructions for
the counter buffer.
- Handling `IncrementCounter` and `DecrementCounter` via a new
`llvm.spv.resource.updatecounter` intrinsic, which is lowered to
`OpAtomicIAdd`.
- Adding a new test file to verify the implementation.
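The counter update can be pictured on the host side as an atomic add with a signed delta. The sketch below is only an analogy in standard C++: `updateCounter` is a hypothetical helper, the `+1`/`-1` delta convention is an assumption made for this example, and the intrinsic's actual operands are not shown here. Like `OpAtomicIAdd`, `fetch_add` yields the value held before the addition.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Atomically adds Delta to Counter and returns the value it held before
// the addition (the same "original value" result OpAtomicIAdd produces).
std::int32_t updateCounter(std::atomic<std::int32_t> &Counter,
                           std::int32_t Delta) {
  return Counter.fetch_add(Delta);
}
```

Under this model an increment would pass a delta of `+1` and a decrement a delta of `-1`, so one operation covers both directions.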
Contributes to https://github.com/llvm/llvm-project/issues/137032
---------
Co-authored-by: Marcos Maronas <marcos.maronas@intel.com>
When using AppleClang the `clang` feature flag is not set, but the
compiler supports `-flax-vector-conversions=integer`. This adds another
`ADDITIONAL_COMPILE_FLAGS` for AppleClang to fix the CI.