Commit Graph

1297 Commits

Author SHA1 Message Date
Christian Sigg
cd482968dc [Bazel][mlir] Avoid ODR violation introduced in 7ab749c.
This change also prepares for 9119325 to land again.

Adds `mlir_c_runner_utils_hdrs` and `mlir_runner_utils_hdrs` targets which do not depend on `//llvm::Support`.

These can be used by other 'runner.so' targets if they are loaded along with the 'runner_utils.so' without calling `__mlir_execution_engine_init()` twice.
2023-06-22 08:00:50 +02:00
Guillaume Chatelet
bd1cba9f4f Revert D148717 "[libc] Improve memcmp latency and codegen"
Once integrated into our codebase, the patch triggered a number of failing
tests. We do not yet understand where the bug is, but we are reverting it to
move forward with the integration.
This reverts commit 5e32765c15.
2023-06-21 12:37:14 +00:00
Christian Sigg
699e64c0d9 Revert "[Bazel][mlir] Fix ODR violation introduced in 7ab749c."
This reverts commit e83c8c3600.

Depending only on the support header files is not sufficient.
2023-06-21 14:29:44 +02:00
Christian Sigg
e83c8c3600 [Bazel][mlir] Fix ODR violation introduced in 7ab749c. 2023-06-21 11:15:09 +02:00
Christian Sigg
7ab749c3a8 [Bazel][mlir] Fix after bba2b65611 2023-06-20 23:00:38 +02:00
Tue Ly
46aa659a32 [libc][math] Improve exp2f performance.
Re-organize special cases and add a special case when `|x| < 2^-5`.

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153134
2023-06-20 09:34:20 -04:00
Tue Ly
5dbd5118ec [libc][math] Improve tanhf performance.
Re-order exceptional branches and slightly adjust the evaluation.

Performance tested with the CORE-MATH project on AMD EPYC 7B12 (clocks/op)

Reciprocal throughputs:
```
--- BEFORE ---

$ CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 7.794 + 0.102 clc/call; Median-Min = 0.066 clc/call; Max = 8.267 clc/call;
[####################] 100 %. (with -msse4.2)
Ntrial = 20 ; Min = 10.783 + 0.172 clc/call; Median-Min = 0.144 clc/call; Max = 11.446 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 18.926 + 0.381 clc/call; Median-Min = 0.342 clc/call; Max = 19.623 clc/call;

--- AFTER ---

$ CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 6.598 + 0.085 clc/call; Median-Min = 0.052 clc/call; Max = 6.868 clc/call;
[####################] 100 %  (with -msse4.2)
Ntrial = 20 ; Min = 9.245 + 0.304 clc/call; Median-Min = 0.248 clc/call; Max = 10.675 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 11.724 + 0.440 clc/call; Median-Min = 0.444 clc/call; Max = 12.262 clc/call;
```

Latency:
```
--- BEFORE ---

$ PERF_ARGS="--latency" CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 38.821 + 0.157 clc/call; Median-Min = 0.122 clc/call; Max = 39.539 clc/call;
[####################] 100 %. (with -msse4.2)
Ntrial = 20 ; Min = 44.767 + 0.766 clc/call; Median-Min = 0.681 clc/call; Max = 45.951 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 55.055 + 1.512 clc/call; Median-Min = 1.571 clc/call; Max = 57.039 clc/call;

--- AFTER ---

$ PERF_ARGS="--latency" CORE_MATH_PERF_MODE=rdtsc ./perf.sh tanhf
[####################] 100 %  (with -mavx2 -mfma)
Ntrial = 20 ; Min = 36.147 + 0.194 clc/call; Median-Min = 0.181 clc/call; Max = 36.536 clc/call;
[####################] 100 %  (with -msse4.2)
Ntrial = 20 ; Min = 40.904 + 0.728 clc/call; Median-Min = 0.557 clc/call; Max = 42.231 clc/call;
[####################] 100 %. (SSE2)
Ntrial = 20 ; Min = 55.776 + 0.557 clc/call; Median-Min = 0.542 clc/call; Max = 56.551 clc/call;
```

Reviewed By: michaelrj

Differential Revision: https://reviews.llvm.org/D153026
2023-06-20 09:25:07 -04:00
Matthias Springer
18ec203084 [mlir][transform] Add ApplyRegisteredPassOp transform op
This transform op runs a pass on the target op.

Differential Revision: https://reviews.llvm.org/D153143
2023-06-20 08:55:22 +02:00
Christian Sigg
53f6229328 [Bazel][mlir] Fix layering check after 11db162db0 2023-06-19 09:26:17 +02:00
Pranav Kant
11db162db0 [Bazel][mlir] Port ee8b8d6b58 2023-06-18 17:51:56 +00:00
Fangrui Song
6b53c35e15 [bazel] Fix clang after D148094 2023-06-16 22:19:32 -07:00
Pranav Kant
b35c3fd780 [Bazel][mlir] Port 120cd5aafc 2023-06-17 02:57:56 +00:00
Benjamin Kramer
7ae49609fd [bazel][mlir] Port 65305aeab9 2023-06-16 13:20:32 +02:00
Pranav Kant
ae7e6df15f [Bazel][mlir][tosa] Fix for 86c4972f5f 2023-06-15 19:50:19 +00:00
Kun Wu
b1c683f5c4 [mlir][sparse][gpu] enable sm80+ sparsity integration test only when explicitly set
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D152966
2023-06-15 17:44:38 +00:00
Benjamin Kramer
048796b887 [bazel][bolt] Port 05634f7346 2023-06-15 15:42:08 +02:00
Pranav Kant
c731bdd6ca [Bazel] Another fix for 7a2fdc685f 2023-06-14 23:29:06 +00:00
Pranav Kant
4cfc33b8b5 [Bazel] Fix for 7a2fdc685f 2023-06-14 23:12:06 +00:00
Benoit Jacob
1c532b5e44 bazel build --incompatible_no_implicit_file_export
The Bazel build was relying, for the two files enumerated in this diff, on the legacy implicit-export semantics described here:
https://bazel.build/reference/be/functions#exports_files

This documentation page encourages migrating away from this legacy behavior, and indeed we have a user who reported a Bazel build error and it appears that they were already using the new, stricter behavior:
https://github.com/openxla/iree/pull/13982
and while examining fixes on our side and trying to get a clean Bazel build, I ran into this similar issue in the LLVM overlay.

It would arguably be cleaner (in the sense of more structured) to rely on `filegroup` to export this, but I am insufficiently familiar with the Clang build (the dependent targets seem to be below Clang) to do this myself. The present `exports_files` solution has the merit of being localized in these few lines here.

Differential Revision: https://reviews.llvm.org/D152491
2023-06-14 19:24:47 +00:00
Tue Ly
055be3c30c [libc] Enable hermetic floating point tests again.
Fix an issue where LLVM libc's fenv.h defined rounding mode macros
differently from the system libc, making get_round() return different values
from fegetround().  Also let math tests skip rounding modes that cannot be
set.  This should allow math tests to run on platforms on which fenv.h is not
implemented yet.

This allows us to re-enable the hermetic floating point tests from
https://reviews.llvm.org/D151123 and revert https://reviews.llvm.org/D152742.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D152873
2023-06-14 10:53:35 -04:00
Guillaume Chatelet
9902fc8dad [libc] Enable custom logging in LibcTest
This patch mimics the behavior of Google Test and allows users to log custom messages after all flavors of ASSERT_ / EXPECT_.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152630
2023-06-14 13:37:50 +00:00
Guillaume Chatelet
bdb07c98c4 Revert D152630 "[libc] Enable custom logging in LibcTest"
Failing buildbot https://lab.llvm.org/buildbot/#/builders/73/builds/49707
This reverts commit 9a7b4c9348.
2023-06-14 10:31:49 +00:00
Guillaume Chatelet
9a7b4c9348 [libc] Enable custom logging in LibcTest
This patch mimics the behavior of Google Test and allows users to log custom messages after all flavors of ASSERT_ / EXPECT_.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152630
2023-06-14 10:26:18 +00:00
Pranav Kant
53e3380786 [Bazel] Fix build 2023-06-13 22:44:11 +00:00
Tue Ly
1557256ab0 [libc] Add Int<> type and fix (U)Int<128> compatibility issues.
Add Int<> and Int128 types to replace the usage of __int128_t in math
functions.  Clean up to make sure that (U)Int128 and __(u)int128_t are
interchangeable in the code base.

Reviewed By: sivachandra, mikhail.ramalho

Differential Revision: https://reviews.llvm.org/D152459
2023-06-13 09:40:48 -04:00
James Knight
c5f6a28749 [bazel] Repair clang_headers_gen when run on macOS.
The antique version of bash (3.2.57, from 2007) which is available on
macOS cannot deal with quoted slashes in a `${x/...}`
substitution. Since only prefix-removal is required here, switch to a
`${x#...}` substitution instead.

(E.g. `src="foo/bar/baz.h"; echo ${src/"foo/bar"}` echoes `bar/bar/baz.h`
instead of `/baz.h` on old bash versions).

Originally broken by 459420c33a.
Fixes #63222
2023-06-12 16:30:45 -04:00
Michael Jones
d3074f16a6 [libc] Add qsort_r
This patch adds the reentrant qsort entrypoint, qsort_r. This is done by
extending the qsort functionality and moving it to a shared utility
header. For this reason the qsort_r tests focus mostly on the places
where it differs from qsort, since they share the same sorting code.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D152467
2023-06-12 11:12:17 -07:00
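The qsort_r commit above describes one sorting core shared by both entrypoints. A minimal sketch of that layout follows; it is not the code from D152467. The names `sort_core`, `qsort_like`, `qsort_r_like` and `compare_ints` are hypothetical, the core is a plain insertion sort chosen for brevity, and the comparator signature follows the GNU form of `qsort_r` (context pointer passed last).

```cpp
#include <cstddef>

namespace sketch {

// Swap two elements that are `size` bytes wide.
inline void swap_bytes(unsigned char *a, unsigned char *b, std::size_t size) {
  for (std::size_t i = 0; i < size; ++i) {
    unsigned char tmp = a[i];
    a[i] = b[i];
    b[i] = tmp;
  }
}

// Shared core (hypothetical): insertion sort over raw bytes, parameterized
// on any callable that takes two element pointers and returns an int.
template <typename Compare>
void sort_core(void *base, std::size_t count, std::size_t size, Compare cmp) {
  unsigned char *bytes = static_cast<unsigned char *>(base);
  for (std::size_t i = 1; i < count; ++i) {
    for (std::size_t j = i; j > 0; --j) {
      unsigned char *prev = bytes + (j - 1) * size;
      unsigned char *curr = bytes + j * size;
      if (cmp(prev, curr) <= 0)
        break; // already in order
      swap_bytes(prev, curr, size);
    }
  }
}

using Comparator = int (*)(const void *, const void *);
using ComparatorR = int (*)(const void *, const void *, void *);

// qsort-style wrapper: the comparator takes no context.
inline void qsort_like(void *base, std::size_t count, std::size_t size,
                       Comparator cmp) {
  sort_core(base, count, size, cmp);
}

// qsort_r-style wrapper: the caller's context pointer is forwarded to every
// comparison, so no global state is needed.
inline void qsort_r_like(void *base, std::size_t count, std::size_t size,
                         ComparatorR cmp, void *arg) {
  sort_core(base, count, size, [cmp, arg](const void *a, const void *b) {
    return cmp(a, b, arg);
  });
}

// Example comparator: `arg` points to a bool selecting descending order.
inline int compare_ints(const void *a, const void *b, void *arg) {
  int lhs = *static_cast<const int *>(a);
  int rhs = *static_cast<const int *>(b);
  int ascending = (lhs > rhs) - (lhs < rhs);
  return *static_cast<const bool *>(arg) ? -ascending : ascending;
}

} // namespace sketch
```

Because both wrappers call the same core, tests for the reentrant variant only need to cover how the context pointer is forwarded, which matches the testing focus described in the commit message.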
Guillaume Chatelet
5e32765c15 [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.

The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-12 13:47:16 +00:00
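The memcmp commit above mentions a branchless version of 'cmp' for 'uint32_t'. One common way to write such a comparison is sketched below; it is an illustration of the general idea, not the code from D148717. The names `load_be32`, `cmp_u32` and `memcmp4` are hypothetical. The bytes are assembled in big-endian order so that integer order matches the lexicographic byte order that memcmp requires.

```cpp
#include <cstdint>

namespace sketch {

// Assemble 4 bytes most-significant first, so that comparing the resulting
// integers is equivalent to comparing the bytes lexicographically.
inline std::uint32_t load_be32(const unsigned char *p) {
  return (static_cast<std::uint32_t>(p[0]) << 24) |
         (static_cast<std::uint32_t>(p[1]) << 16) |
         (static_cast<std::uint32_t>(p[2]) << 8) |
         static_cast<std::uint32_t>(p[3]);
}

// Three-way comparison with no branch in the source: each relational
// operator yields 0 or 1, and their difference is -1, 0, or +1.
constexpr int cmp_u32(std::uint32_t a, std::uint32_t b) {
  return static_cast<int>(a > b) - static_cast<int>(a < b);
}

// memcmp restricted to exactly 4 bytes, built from the two helpers above.
inline int memcmp4(const void *lhs, const void *rhs) {
  return cmp_u32(load_be32(static_cast<const unsigned char *>(lhs)),
                 load_be32(static_cast<const unsigned char *>(rhs)));
}

} // namespace sketch
```

Whether the compiler actually emits branch-free machine code for `cmp_u32` depends on the target and optimization level, so the generated assembly should be inspected before relying on it.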
Tue Ly
a982431295 [libc] Add platform independent floating point rounding mode checks.
Many math functions need to check the floating point rounding mode to
return correct values.  Currently most of them use the internal implementation
of `fegetround`, which is platform-dependent and blocks math functions from
being enabled on platforms where `fegetround` is unimplemented.  In this change,
we add platform-independent rounding mode checks and switch math functions to
use them instead. https://github.com/llvm/llvm-project/issues/63016

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D152280
2023-06-12 09:36:41 -04:00
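The rounding mode commit above replaces `fegetround` with platform-independent checks. One way such a check can work is to perform a small floating point operation and observe which way the result was rounded. The sketch below illustrates that idea under the assumption of IEEE 754 single precision; it is not the code from D152280, and all function names are hypothetical.

```cpp
namespace sketch {

// 2^-30 is far below half an ulp of 1.0f (2^-24), so adding it to or
// subtracting it from +/-1.0f changes the result only when the rounding
// mode pushes the result in that direction.
constexpr float kTiny = 1.0f / 1073741824.0f; // exactly 2^-30

// Only round-upward turns 1 + tiny into the next float above 1.
inline bool is_round_up() {
  volatile float one = 1.0f; // volatile keeps the arithmetic at run time
  volatile float tiny = kTiny;
  volatile float sum = one + tiny;
  return sum != 1.0f;
}

// Only round-downward turns -1 - tiny into the next float below -1.
inline bool is_round_down() {
  volatile float minus_one = -1.0f;
  volatile float tiny = kTiny;
  volatile float diff = minus_one - tiny;
  return diff != -1.0f;
}

// 1 - tiny drops below 1 under both round-downward and round-toward-zero,
// so the downward case is excluded explicitly.
inline bool is_round_to_zero() {
  volatile float one = 1.0f;
  volatile float tiny = kTiny;
  volatile float diff = one - tiny;
  return diff != 1.0f && !is_round_down();
}

inline bool is_round_to_nearest() {
  return !is_round_up() && !is_round_down() && !is_round_to_zero();
}

} // namespace sketch
```

The checks never call into `fenv.h`, so they compile on targets where it is missing. The `volatile` temporaries are there to stop the compiler from folding the arithmetic at compile time under the default rounding mode.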
Guillaume Chatelet
1ec995cc1c Revert D148717 "[libc] Improve memcmp latency and codegen"
This broke aarch64 debug buildbot https://lab.llvm.org/buildbot/#/builders/223/builds/21703
This reverts commit bd4f978754.
2023-06-12 08:32:00 +00:00
Guillaume Chatelet
bd4f978754 [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.

The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-12 07:56:23 +00:00
Tue Ly
37458f6693 [libc][math] Move str method from FPBits class to testing utils.
The str method of the FPBits class is only used for pretty printing its objects
in tests.  It brings a cpp::string dependency to the FPBits class, which is not
ideal for embedded use cases.  We move the str method to a free function in the
test utils and remove this dependency from the FPBits class.

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D152607
2023-06-10 02:50:58 -04:00
Jordan Rupprecht
261b693afd [bazel][NFC] Add Dialect/Func/Extensions library and deps
Added in D120368
2023-06-09 17:04:41 -07:00
Mikhail Goncharov
b28614c4fc [bazel] format bazel files NFC 2023-06-09 12:13:07 +02:00
Michael Jones
47fd67ec34 [libc][NFC] land long double table for printf
The Mega Table that printf uses for long doubles with some flags is too
large for the linters, and so has been split out from the main patch.
The main patch: https://reviews.llvm.org/D150399

Reviewed By: sivachandra

Differential Revision: https://reviews.llvm.org/D152470
2023-06-08 16:14:56 -07:00
Michael Jones
688b9730d1 [libc] add options to printf decimal floats
This patch adds three options for printf decimal long doubles, and these
can also apply to doubles.

1. Use a giant table, which is fast and accurate but takes up ~5MB.
2. Use dyadic floats for approximations, which only gives ~50 digits of
   accuracy but is very fast.
3. Use large integers for approximations, which is accurate but very
   slow.

Reviewed By: sivachandra, lntue

Differential Revision: https://reviews.llvm.org/D150399
2023-06-08 14:23:15 -07:00
Kun Wu
8ed59c53de [mlir][sparse][gpu] add sm8.0+ tensor core 2:4 sparsity support
Reviewed By: aartbik

Differential Revision: https://reviews.llvm.org/D151775
2023-06-06 23:13:21 +00:00
Benjamin Kramer
e412650726 [bazel] Port 44268271f6 2023-06-06 22:47:30 +02:00
Benjamin Kramer
ba8c0bf37e [bazel] Port 1117b9a284 2023-06-06 22:47:17 +02:00
Jacques Pienaar
f007bcbc3c [mlir] Convert quantized dialect bytecode to generated.
Serves as rather self-contained documentation for using the generator
from https://reviews.llvm.org/D144820.

Differential Revision: https://reviews.llvm.org/D152118
2023-06-06 11:16:07 -07:00
Aart Bik
eb5308adc4 bazel build fix
Reviewed By: Peiming, manishucsd

Differential Revision: https://reviews.llvm.org/D152214
2023-06-05 17:24:14 -07:00
Aart Bik
62a06d8224 fix build issue on bazel
Needed to fix:
53a5c3ab4d
db7cc0348c

Reviewed By: Peiming, anlunx

Differential Revision: https://reviews.llvm.org/D152202
2023-06-05 15:33:31 -07:00
Siva Chandra Reddy
2bd82c5462 [bazel][libc] Add targets for integer abs and div functions.
Reviewed By: lntue

Differential Revision: https://reviews.llvm.org/D152084
2023-06-05 22:15:12 +00:00
Johannes Reifferscheid
70bd660709 [bazel] Merge BytecodeOpInterface target into IR.
Reviewed By: akuegel

Differential Revision: https://reviews.llvm.org/D152133
2023-06-05 11:57:12 +02:00
Guillaume Chatelet
e49a608511 Revert D148717 "[libc] Improve memcmp latency and codegen"
This reverts commit 9ec6ebd3ce.

The patch broke RISCV and aarch64 buildbots.
2023-06-05 09:50:30 +00:00
Guillaume Chatelet
9ec6ebd3ce [libc] Improve memcmp latency and codegen
This is based on ideas from @nafi to:
 - use a branchless version of 'cmp' for 'uint32_t',
 - completely resolve the lexicographic comparison through vector
   operations when wide types are available. We also get rid of byte
   reloads and serializing '__builtin_ctzll'.

I did not include the suggestion to replace comparisons of 'uint16_t'
with two 'uint8_t' as it did not seem to help the codegen. This can
be revisited in subsequent patches.

The code has been rewritten to reduce nested function calls, making the
job of the inliner easier and preventing harmful code duplication.

Reviewed By: nafi3000

Differential Revision: https://reviews.llvm.org/D148717
2023-06-05 09:46:05 +00:00
Adrian Kuegel
bc7f65cbd8 [mlir][Bazel] Adjust BUILD files for a9d003ef85 2023-06-05 09:57:01 +02:00
Mikhail Goncharov
34866154d6 [bazel] add missing dep for GPUTransforms 2023-06-05 09:20:20 +02:00
Benjamin Kramer
9d531c2dcf [bazel] Port 36f351098c 2023-06-04 21:39:52 +02:00
Tue Ly
5a4e344bd9 [libc][NFC] Add LIBC_INLINE and attribute.h header includes to targets' FMA.h.
Targets' FMA.h headers are missing LIBC_INLINE and attributes.h header.

Reviewed By: brooksmoses

Differential Revision: https://reviews.llvm.org/D152024
2023-06-02 21:15:58 -04:00