Then use this in the Flang compiler for parsing the OpenMP declare
reduction.
This has no real functional change to the existing code, it's only
moving the declaration itself around.
A few tests has been updated, to reflect the new type names.
Currently, all derived types are initialized through `_FortranAInitialize`, which is functionally correct, but bears poor runtime performance. This patch falls back on global initialization for "simpler" derived types to speed up the initialization.
The semantic of pointer assignments inside FORALL requires evaluating
the targets (RHS) and pointer variables (LHS) of all iterations before
evaluating the assignments.
In practice, if the compiler can prove that the RHS and LHS evaluations
are not impacted by the assignments, the evaluation of the FORALL
assignment statement can be done in a single loop. However, if the
compiler cannot prove this, it needs to "save" the addresses of the
targets and/or the pointer descriptors of each iterations before doing
the assignments.
This patch implements the most common cases where there is no lower bound
spec, no bounds remapping, the LHS is not polymorphic, and the RHS is
not NULL.
The HLFIR operation used to represent assignments inside FORALL can be
used for pointer assignments to (the only difference being that the LHS
is a descriptor address).
The analysis for intrinsic assignment can be reused, with the
distinction that the RHS data is not read during the assignment.
The logic that is used to save LHS in intrinsic assignments inside
FORALL is extracted to be used for the RHS of pointer assignments when
needed (saving a descriptor value).
Pointer assignment LHS are just descriptor addresses and are saved as
int_ptr values.
This PR is a follow-up to #128655.
It adds another test to ensure deterministic ordering in `.mod` files
and includes related changes to prevent non-deterministic ordering
caused by iterating over a set ordered by heap pointers. This issue is
particularly noticeable when using Flang as a library and compiling the
same files multiple times.
The reduced test case is as minimal as possible. We were unable to
reproduce the issue with a smaller set of files.
Previously, nested scopes for implied DO loops with DATA statements were
disallowed, which meant that the following code couldn't compile due to
re-use of `j` loop variable name:
DATA (a(i),(b(i,j),j=1,3),(c(i,j),j=1,3),i=0,4)/
This change allows nested scopes implied DO loops, which allows the code
above to compile.
Tests modified to in accordance with this change:
Semantics/resolve40.f90, Semantics/symbol09.f90
The HLFIR inlining of MAXVAL kicks in at O1 and more when the argument
is an array component reference but the implementation did not account
for the rare cases where the array components have non default lower
bounds.
This patch fixes the issue by using `getElementAt` to compute the
element address.
Rename `indices` to `oneBasedIndices` for more clarity.
While checking if a type should be cached or not, we use
`getDerivedType` to peel outer layers and get to the base type. This
function did not peel the `fir.class` which caused the algorithm to
fail.
Fixes#128606.
As long as it is a host-only function, it cannot be referenced
by the flang-rt's ErfcScaled entry points. With the markup in place,
it is compiling properly by a CUDA compiler.
Two messages that complain about local variables mention that they don't
have the SAVE attribute; in both cases, it would be okay if they were
ALLOCATABLE instead. Clarify the messages.
Refine handling of NULL(...) in semantics to properly distinguish
NULL(), NULL(objectPointer), NULL(procPointer), and NULL(allocatable)
from each other in relevant contexts.
Add IsNullAllocatable() and IsNullPointerOrAllocatable() utility
functions. IsNullAllocatable() is true only for NULL(allocatable); it is
false for a bare NULL(), which can be detected independently with
IsBareNullPointer().
IsNullPointer() now returns false for NULL(allocatable).
ALLOCATED(NULL(allocatable)) now works, and folds to .FALSE.
These utilities were modified to accept const pointer arguments rather
than const references; I usually prefer this style when the result
should clearly be false for a null argument (in the C sense), and it
helped me find all of their use sites in the code.
I merged a patch yesterday
(https://github.com/llvm/llvm-project/pull/128980) that strengthened
error detection of indistinguishable specific procedures in a type-bound
generic procedure, and broke a couple of tests. Refine the check so that
it doesn't flag valid cases of overridden bindings, and add a thorough
test with all of the boundary cases that I can think of.
Since evaluate::Expr can show up in the parse tree in the semantic
analysis step, make it possible to dump its structure in the Semantics
module.
The Lower module depends on Semantics, so the code is still accessible
in it.
Define the intrinsic `CO_REDUCE` and add semantic checks.
A test was already present but was at `XFAIL`. It has been modified to
take new messages into the output.
Flang generates slower code for `CSHIFT(CSHIFT(PTR(:,:,I),sh1,1),sh2,2)`
pattern in facerec than other compilers. The first CSHIFT can be done
as two memcpy's wrapped in a loop for the second dimension.
This does require creating a temporary array, but it seems to be faster,
than the current hlfir.elemental inlining.
I started with modifying the new index computation in
hlfir.elemental inlining: the new arith.select approach does enable
some vectorization in LLVM, but on x86 it is using gathers/scatters
and does not give much speed-up.
I also experimented with LoopBoundSplitPass
and InductiveRangeCheckElimination for a simple (not chained) CSHIFT
case, but I could not adjust them to split the loop with a condition
on the value of the IV into two loops with disjoint iteration spaces.
I thought if I could do it, I would be able to keep the hlfir.elemental
inlining mostly untouched, and then adjust the hlfir.elemental inlining
heuristics for the facerec case.
Since I was not able to make these pass work for me, I added a special
case inlining for CSHIFT(ARRAY,SH,DIM=1) via hlfir.eval_in_mem.
If ARRAY is not statically known to have the contiguous leading
dimension, there is a dynamic check for contiguity, which allows
exposing it to LLVM and enabling the rewrite of the copy loops
into memcpys. This approach is stepping on the toes of LoopVersioning,
but it is helpful in facerec case.
I measured ~6% speed-up on grace, and ~4% on zen4.
The syntax with the object list following the memory-order clause has
been removed in OpenMP 5.2. Still, accept that syntax with versions >=
5.2, but treat it as deprecated (and emit a warning).
The DECLARE REDUCTION allows the initialization part to be either an
expression or a call to a subroutine.
This modifies the parsing and semantic analysis to allow the use of the
subroutine, in addition to the simple expression that was already
supported.
New tests in parser and semantics sections check that the generated
structure is as expected.
DECLARE REDUCTION lowering is not yet implemented, so will end in a
TODO. A new test with an init subroutine is added, that checks that this
variant also ends with a "Not yet implemented" message.
`-fd-lines-as-code` and `-fd-lines-as-comments` enables treatment for
lines beginning with `d` or `D` in fixed form sources.
Using these options in free form has no effect.
If the `-fd-lines-as-code` option is given they are treated as if the
first column contained a blank.
If the `-fd-lines-as-comments` option is given, they are treated as
comment lines.
After #127231, fir.coordinate_of should directly carry the field.
I updated the lowering and codegen tests in #12731, but not the FIR to
FIR tests, which is what this patch is cleaning up.
This patch updates fir.coordinate_op to carry the field index as
attributes instead of relying on getting it from the fir.field_index
operations defining its operands.
The rational is that FIR currently has a few operations that require
DAGs to be preserved in order to be able to do code generation. This is
the case of fir.coordinate_op, which requires its fir.field operand
producer to be visible.
This makes IR transformation harder/brittle, so I want to update FIR to
get rid if this.
Codegen/printer/parser of fir.coordinate_of and many tests need to be
updated after this change.
The code that checks for conflicts between type-bound defined I/O
generic procedures and non-type-bound defined I/O interfaces only works
when then procedures are defined in the same module as subroutines. It
doesn't catch conflicts when either are external procedures, procedure
pointers, dummy procedures, &c. Extend the checking to cover those cases
as well.
Fixes https://github.com/llvm/llvm-project/issues/128752.
…cific
When checking generic procedures for indistinguishable specific
procedures, don't neglect to include specific procedures from any
accessible instance of the generic procedure inherited from its parent
type..
Fixes https://github.com/llvm/llvm-project/issues/128760.
The definition of an array constructor doesn't preclude the use of
[character(:)::] or [character(*)::] directly, but there is language
elsewhere in the standard that restricts their use to specific contexts,
neither of which include explicitly typed array constructors.
Fixes https://github.com/llvm/llvm-project/issues/128755.
Enforce an obscure constraint from the standard: an abstract interface
is not allowed to have the same name as an intrinsic type keyword. I
suspect this is meant to prevent a declaration like "PROCEDURE(REAL),
POINTER :: P" from being ambiguous.
Fixes https://github.com/llvm/llvm-project/issues/128744.
A few bits of semantic checking need a variant of the
ResolveAssociations utility function that stops when hitting a construct
entity for a type or class guard. This is necessary for cases like the
bug below where the analysis is concerned with the type of the name in
context, rather than its shape or storage or whatever. So add a flag to
ResolveAssociations and GetAssociationRoot to make this happen, and use
it at the appropriate call sites.
Fixes https://github.com/llvm/llvm-project/issues/128608.
When checking for conflicts between type-bound generic defined I/O
procedures and non-type-bound defined I/O generic interfaces, don't
worry about conflicts where the type-bound generic interface is
inaccessible in the scope around the non-type-bound interface.
Fixes https://github.com/llvm/llvm-project/issues/126797.
…TENT
A dummy procedure pointer with no INTENT attribute may associate with an
actual argument that is the result of a reference to a function that
returns a procedure pointer, we think.
Fixes https://github.com/llvm/llvm-project/issues/126950.
A derived type with a component of the same name as the type is not
extensible... unless the extension occurs in another module where the
conflicting component is inaccessible.
Fixes https://github.com/llvm/llvm-project/issues/126114.
Modules read from module files must have their symbols tagged with the
ModFile flag to suppress all warnings messages that might be emitted for
their contents. (Actionable warnings will have been emitted when the
modules were originally compiled, so we don't want to repeat them later
when the modules are USE'd.) The module symbols of the additional
modules in hermetic module files were not being tagged with that flag;
fix.
The check that "v_list" be deferred shape is just wrong; there are no
deferred shape non-pointer non-allocatable dummy arguments in Fortran.
Correct to check for an assumed shape dummy argument. And de-split the
error messages that were split across multiple source lines, making them
much harder to find with grep.
Fixes https://github.com/llvm/llvm-project/issues/125878.
As I read the standard, an unlimited polymorphic pointer or target
should be viewed as compatible with any data target or data pointer when
used in the two-argument form of the intrinsic function ASSOCIATED().
Fixes https://github.com/llvm/llvm-project/issues/125774.
…dummy
We presently allow a NULL() actual argument to associate with a
non-optional dummy allocatable argument only under INTENT(IN). This is
too strict, as it precludes the case of a dummy argument with default
intent. Continue to require that the actual argument be definable under
INTENT(OUT) and INTENT(IN OUT), and (contra XLF) interpret NULL() as
being an expression, not a definable variable, even when it is given an
allocatable MOLD.
Fixes https://github.com/llvm/llvm-project/issues/115984.
Some Arm processors support exception halting control and some do not.
An Arm executable will run on either type of processor, so it is
effectively unknown at compile time whether or not this support will be
available. ieee_support_halting is therefore implemented with a runtime
check.
The result of a call to ieee_support_standard depends in part on support
for halting control. Update the ieee_support_standard implementation to
check for support for halting control at runtime.
This patch changes the codegen for non-precise erfc calls to generate
math.erfc ops. This wasn't done before because the math dialect did not
have a erfc operation at the time.
Extend semantic checks for `omp loop` directive to report errors when a
`reduction` clause is specified on a standalone `loop` directive with
`teams` binding.
This is similar to how clang behaves.
See RFC:
https://discourse.llvm.org/t/rfc-openmp-supporting-delayed-task-execution-with-firstprivate-variables/83084
The aim here is to ensure that tasks which are not executed for a while
after they are created do not try to reference any data which are now
out of scope. This is done by packing the data referred to by the task
into a heap allocated structure (freed at the end of the task).
I decided to create the task context structure in
OpenMPToLLVMIRTranslation instead of adapting how it is done
CodeExtractor (via OpenMPIRBuilder] because CodeExtractor is (at least
in theory) generic code which could have other unrelated uses.
This PR adds a test to ensure deterministic ordering in `.mod` files. It
also includes related changes to prevent non-deterministic symbol
ordering caused by pointers outside the cooked source. This issue is
particularly noticeable when using Flang as a library and compiling the
same files multiple times.