intel/llvm

mirror of https://github.com/intel/llvm.git synced 2026-02-03 02:26:27 +08:00


Matt a1826b4d26 [OpenMP][SIMD][FIX] Use conservative "omp simd ordered" lowering (#126172)

A proposed fix for issue #95611, "[OpenMP][SIMD] ordered has no
effect in a loop SIMD region as of LLVM 18.1.0".

Changes:

- Implement new lowering behavior: conservatively serialize `omp simd`
loops that contain an `omp ordered simd` directive, to prevent incorrect
vectorization (which makes the miscompiled program behave incorrectly
at run time).

Implementation outline:

- We start with the optimistic default initial value of
`LoopStack.setParallel(/*Enable=*/true);` in
`CodeGenFunction::EmitOMPSimdInit(const OMPLoopDirective &D)`.
- We disable the loop parallel memory access assumption only when needed,
with `if (HasOrderedDirective) LoopStack.setParallel(/*Enable=*/false);`,
where `HasOrderedDirective` tests for the presence of an
`OMPOrderedDirective`.
- As a result, the loop is no longer incorrectly vectorized when the
`omp ordered simd` directive is present.
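
The outline above can be modeled with a small, self-contained sketch. `ToyLoopStack`, `ToySimdDirective`, and `emitSimdInitSketch` are hypothetical stand-ins introduced only for illustration; they are not the real `clang::CodeGen` classes, and the actual change lives in `CodeGenFunction::EmitOMPSimdInit`.

```cpp
// Toy model of the lowering decision. All names here are hypothetical
// stand-ins for illustration, not the real clang::CodeGen types.

struct ToyLoopStack {
    bool Parallel = false;  // models the "parallel memory accesses" assumption
    void setParallel(bool Enable) { Parallel = Enable; }
};

struct ToySimdDirective {
    // True when the loop body contains an `omp ordered simd` directive.
    bool HasOrderedDirective;
};

// Mirrors the outline: start from the optimistic default, then back off
// when an ordered directive is present in the loop.
inline void emitSimdInitSketch(ToyLoopStack &LoopStack,
                               const ToySimdDirective &D) {
    LoopStack.setParallel(/*Enable=*/true);
    if (D.HasOrderedDirective)
        LoopStack.setParallel(/*Enable=*/false);
}
```

In this model, a directive without an ordered region leaves `Parallel` set to `true`, while one with an ordered region ends with `Parallel` set to `false`, which is the conservative outcome described above.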

Motivation: We'd like to prevent incorrect vectorization of loops that
contain the `#pragma omp ordered simd` directive, which has
previously resulted in miscompiled code.

At the same time, we'd like usage outside of the `#pragma omp
ordered simd` context to remain unaffected. Note that in the test
"clang/test/OpenMP/ordered_codegen.cpp" we "lose" the
`!llvm.access.group` metadata in `foo_simd` only.

This is conservative: some of these loops could in principle be
vectorized safely, but we prefer to avoid miscompiling the
loops that are currently illegal to vectorize.
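
As an illustration of why the parallel-access assumption is unsafe here, the following sketch models a distance-2 recurrence of the same shape as the `X[r][k] = X[r][k - 2] + ...` update in the test. `runScalar` and `runChunked` are hypothetical helpers, and `runChunked` is only a simplified model of a vectorized loop that performs all loads of a chunk before any store; it is not what the vectorizer literally emits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar reference: iteration k sees the store made by iteration k - 2.
inline std::vector<float> runScalar(std::vector<float> x, float add) {
    for (std::size_t k = 2; k < x.size(); ++k)
        x[k] = x[k - 2] + add;
    return x;
}

// Simplified model of a vectorized loop that ignores the dependence:
// every lane of a chunk loads its input before any lane stores its result.
inline std::vector<float> runChunked(std::vector<float> x, float add,
                                     std::size_t width) {
    assert(width > 0);
    std::vector<float> lanes(width);
    for (std::size_t k = 2; k < x.size(); k += width) {
        const std::size_t n = std::min(width, x.size() - k);
        for (std::size_t i = 0; i < n; ++i)  // all loads first
            lanes[i] = x[k + i - 2] + add;
        for (std::size_t i = 0; i < n; ++i)  // then all stores
            x[k + i] = lanes[i];
    }
    return x;
}
```

With eight zero-initialized elements and `add = 1`, a chunk width of 2 reproduces the scalar result, while a width of 4 does not, because lanes 2 and 3 read elements that the same chunk has not yet updated.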

A concrete example follows:

```cpp
// "test.c"
#include <float.h>
#include <math.h>
#include <omp.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int compare_float(float x1, float x2, float scalar) {
    const float diff = fabsf(x1 - x2);
    x1 = fabsf(x1);
    x2 = fabsf(x2);
    const float l = (x2 > x1) ? x2 : x1;
    if (diff <= l * scalar * FLT_EPSILON)
        return 1;
    else
        return 0;
}

#define ARRAY_SIZE 256

__attribute__((noinline)) void initialization_loop(
    float X[ARRAY_SIZE][ARRAY_SIZE], float Y[ARRAY_SIZE][ARRAY_SIZE]) {
    const float max = 1000.0;
    srand(time(NULL));
    for (int r = 0; r < ARRAY_SIZE; r++) {
        for (int c = 0; c < ARRAY_SIZE; c++) {
            X[r][c] = ((float)rand() / (float)(RAND_MAX)) * max;
            Y[r][c] = X[r][c];
        }
    }
}

__attribute__((noinline)) void omp_simd_loop(float X[ARRAY_SIZE][ARRAY_SIZE]) {
    for (int r = 1; r < ARRAY_SIZE; ++r) {
        for (int c = 1; c < ARRAY_SIZE; ++c) {
#pragma omp simd
            for (int k = 2; k < ARRAY_SIZE; ++k) {
#pragma omp ordered simd
                X[r][k] = X[r][k - 2] + sinf((float)(r / c));
            }
        }
    }
}

__attribute__((noinline)) int comparison_loop(float X[ARRAY_SIZE][ARRAY_SIZE],
                                              float Y[ARRAY_SIZE][ARRAY_SIZE]) {
    int totalErrors_simd = 0;
    const float scalar = 1.0;
    for (int r = 1; r < ARRAY_SIZE; ++r) {
        for (int c = 1; c < ARRAY_SIZE; ++c) {
            for (int k = 2; k < ARRAY_SIZE; ++k) {
                Y[r][k] = Y[r][k - 2] + sinf((float)(r / c));
            }
        }
        // check row for simd update
        for (int k = 0; k < ARRAY_SIZE; ++k) {
            if (!compare_float(X[r][k], Y[r][k], scalar)) {
                ++totalErrors_simd;
            }
        }
    }
    return totalErrors_simd;
}

int main(void) {
    float X[ARRAY_SIZE][ARRAY_SIZE];
    float Y[ARRAY_SIZE][ARRAY_SIZE];

    initialization_loop(X, Y);
    omp_simd_loop(X);
    const int totalErrors_simd = comparison_loop(X, Y);

    if (totalErrors_simd) {
        fprintf(stdout, "totalErrors_simd: %d \n", totalErrors_simd);
        fprintf(stdout, "%s : %d - FAIL: error in ordered simd computation.\n",
                __FILE__, __LINE__);
    } else {
        fprintf(stdout, "Success!\n");
    }

    return totalErrors_simd;
}
```

Before:

```
$ clang -fopenmp-simd -O3 -ffast-math -lm test.c -o test && ./test
totalErrors_simd: 15408
test.c : 76 - FAIL: error in ordered simd computation.
```

clang 19.1.0: https://godbolt.org/z/6EvhxqEhe

After:

```
$ clang -fopenmp-simd -O3 -ffast-math -lm test.c -o test && ./test
Success!
```

Co-authored-by: Matt P. Dziubinski <matt-p.dziubinski@hpe.com>

2025-02-12 08:53:47 -05:00

| Name | Last commit | Date |
| --- | --- | --- |
| .ci | … | |
| .github | [GitHub] Add aaronmondal to Bazel codeowners (#126760) | 2025-02-11 09:24:05 -08:00 |
| bolt | [BOLT] Use getMainExecutable() (#126698) | 2025-02-12 09:44:26 +01:00 |
| clang | [OpenMP][SIMD][FIX] Use conservative "omp simd ordered" lowering (#126172) | 2025-02-12 08:53:47 -05:00 |
| clang-tools-extra | [clang-tidy] Added support for 3-argument std::string ctor in bugprone-string-constructor check (#123413) | 2025-02-11 09:09:57 +08:00 |
| cmake | Bump version to 21.0.0git (#124870) | 2025-01-28 19:48:43 -08:00 |
| compiler-rt | [PGO][Offload] Profile profraw generation for GPU instrumentation #76587 (#93365) | 2025-02-11 23:30:54 -06:00 |
| cross-project-tests | … | |
| flang | [flang][OpenMP] Handle fixed length charaters in delayed privatization (#126704) | 2025-02-12 11:04:26 +01:00 |
| libc | [libc] implement endian related macros (#126368) | 2025-02-12 10:17:09 +08:00 |
| libclc | [libclc] Move conversion builtins to the CLC library (#124727) | 2025-02-12 08:55:02 +00:00 |
| libcxx | [libc++] Remove conditional for attributes that are always available (#126879) | 2025-02-12 13:48:54 +01:00 |
| libcxxabi | [libc++abi] Add a missing include for abort() (#126865) | 2025-02-12 14:18:02 +01:00 |
| libunwind | [libunwind] Unwind through loongarch64/Linux sigreturn frame (#123682) | 2025-02-08 09:48:41 +08:00 |
| lld | [LLD][MinGW] Add support for wrapped symbols on ARM64X (#126296) | 2025-02-10 22:52:11 +01:00 |
| lldb | [lldb] Support disassembling discontinuous functions (#126505) | 2025-02-12 10:47:22 +01:00 |
| llvm | Reland "CodeGen][NewPM] Port MachineScheduler to NPM. (#125703)" (#126684) | 2025-02-12 18:54:39 +05:30 |
| llvm-libgcc | … | |
| mlir | [MLIR][mesh] Mesh fixes (#124724) | 2025-02-12 12:44:48 +01:00 |
| offload | [PGO][Offload] Fix pgo1.c (#126864) | 2025-02-12 00:54:31 -06:00 |
| openmp | [OpenMP][SIMD][FIX] Use conservative "omp simd ordered" lowering (#126172) | 2025-02-12 08:53:47 -05:00 |
| polly | [Polly] Ensure i1 preload condition | 2025-01-27 18:47:12 +01:00 |
| pstl | Bump version to 21.0.0git (#124870) | 2025-01-28 19:48:43 -08:00 |
| runtimes | … | |
| third-party | [benchmark] Sync a few commits from upstream to help with CPU count (#126410) | 2025-02-10 00:06:25 -05:00 |
| utils/bazel | [mlir][bazel] Fix after 0fd50ec9a3 | 2025-02-12 14:28:37 +01:00 |
| .clang-format | … | |
| .clang-tidy | … | |
| .git-blame-ignore-revs | … | |
| .gitattributes | … | |
| .gitignore | … | |
| .mailmap | … | |
| CODE_OF_CONDUCT.md | … | |
| CONTRIBUTING.md | … | |
| LICENSE.TXT | … | |
| pyproject.toml | … | |
| README.md | … | |
| SECURITY.md | … | |

README.md

The LLVM Compiler Infrastructure

Welcome to the LLVM project!

This repository contains the source code for LLVM, a toolkit for the construction of highly optimized compilers, optimizers, and run-time environments.

The LLVM project has multiple components. The core of the project is itself called "LLVM". This contains all of the tools, libraries, and header files needed to process intermediate representations and convert them into object files. Tools include an assembler, disassembler, bitcode analyzer, and bitcode optimizer.

C-like languages use the Clang frontend. This component compiles C, C++, Objective-C, and Objective-C++ code into LLVM bitcode -- and from there into object files, using LLVM.

Other components include: the libc++ C++ standard library, the LLD linker, and more.

Getting the Source Code and Building LLVM

Consult the Getting Started with LLVM page for information on building and running LLVM.

For information on how to contribute to the LLVM project, please take a look at the Contributing to LLVM guide.

Getting in touch

Join the LLVM Discourse forums, Discord chat, LLVM Office Hours or Regular sync-ups.

The LLVM project has adopted a code of conduct for participants to all modes of communication within the project.

Languages

LLVM 41.3%

C++ 31.5%

C 13%

Assembly 9.5%

MLIR 1.5%

Other 2.8%