The new behavior of interrupting the longest running test with Ctrl-C is useful
when tests hang, but not when the run is completely broken for some reason.
Psychology tells us that the user will compulsively spam Ctrl-C in this case,
so exit if three Ctrl-C's are detected within a second.
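A minimal sketch of the bookkeeping this implies (the handler, message and exit code below are illustrative, not the actual mtest code):

    import signal
    import sys
    import time

    _interrupts = []  # monotonic timestamps of the most recent Ctrl-C presses

    def sigint_handler(signum, frame):
        now = time.monotonic()
        # Keep only the interrupts seen within the last second, then record this one.
        _interrupts[:] = [t for t in _interrupts if now - t <= 1.0]
        _interrupts.append(now)
        if len(_interrupts) >= 3:
            print('Received three Ctrl-C within a second, exiting', file=sys.stderr)
            sys.exit(1)  # placeholder exit code
        # Otherwise fall through and only interrupt the longest running test.

    signal.signal(signal.SIGINT, sigint_handler)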
This correctly formats tests with CJK names or, well, emoji. It is not perfect
(for example it does not correctly format emoji that are variations of 1-wide
characters), but it is as good as most terminal emulators.
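A rough sketch of the width computation this needs, assuming wide and fullwidth East Asian characters take two columns (the same approximation most terminal emulators use; variation selectors are ignored, which is exactly the imperfection noted above):

    from unicodedata import east_asian_width

    def display_width(s: str) -> int:
        # 'W' (wide) and 'F' (fullwidth) characters occupy two terminal cells.
        return sum(2 if east_asian_width(c) in 'WF' else 1 for c in s)

    def pad(s: str, width: int) -> str:
        return s + ' ' * max(0, width - display_width(s))

    print(pad('テスト', 12) + 'OK')    # CJK test name
    print(pad('emoji 🎉', 12) + 'OK')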
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of slurping in the entire stream, build the TestResult along
the way. This allows reporting the results of TAP and Rust subtests as
they come in, either as part of the progress report or (in the future)
as individual lines of the output.
Instead of creating temporary files, get the StreamReaders from
_run_subprocess's returned object. Through asyncio magic, their
contents will be read as they become ready and then returned when
the StreamReader.read future is awaited.
Because of this change, the stdout and stderr can be easily
preserved when TestSubprocess returns an additional_error.
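The underlying asyncio pattern looks roughly like this (a sketch, not the actual _run_subprocess code; 'echo' is just a stand-in command):

    import asyncio

    async def run_and_capture(*cmd):
        p = await asyncio.create_subprocess_exec(
            *cmd, stdout=asyncio.subprocess.PIPE, stderr=asyncio.subprocess.PIPE)
        # p.stdout and p.stderr are StreamReaders: their contents are consumed
        # as they become ready and handed back when the read() futures complete.
        stdout, stderr = await asyncio.gather(p.stdout.read(), p.stderr.read())
        await p.wait()
        return p.returncode, stdout, stderr

    print(asyncio.run(run_and_capture('echo', 'hello')))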
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We would like SingleTestRunner to run code before waiting on the process,
for example starting tasks to read stdout and stderr.
Return a new object that is able to complete _run_subprocess's task.
In the next patch, SingleTestRunner will also use the object to get hold
of the stdout and stderr StreamReaders.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Include the names from the TAP output and the SKIP/TODO explanations
if present. Omit the classname attribute; it is optional.
In order to enable this, TestRun.results becomes a list of TAPParser.Test
objects. If in the future there are other kinds of subtest results a
new class can be introduced, but for now it is enough.
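For illustration, emitting one <testcase> per subtest along these lines could look as follows (a sketch with made-up helper names, using the standard ElementTree module):

    import xml.etree.ElementTree as ET

    def add_testcase(suite, name, result, explanation=''):
        # No classname attribute: it is optional and we have nothing useful to put there.
        case = ET.SubElement(suite, 'testcase', name=name)
        if result == 'SKIP':
            ET.SubElement(case, 'skipped', message=explanation)
        elif result == 'FAIL':
            ET.SubElement(case, 'failure', message=explanation)

    suite = ET.Element('testsuite', name='subproj / mytest')
    add_testcase(suite, 'parses empty input', 'OK')
    add_testcase(suite, 'talks to server', 'SKIP', 'no network available')
    print(ET.tostring(suite, encoding='unicode'))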
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For now this is just a refactoring that simplifies the next patch. However,
it will also come in handy when we make the parsing asynchronous, because
it will make it possible to access subtest results while the test runs.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is cleaner than collections.namedtuple. It also catches that "count"
clashes with tuple's count() method, so rename the field to num_tests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Pass the StringIO object to the parse method instead, because
there will be no T.Iterator[str] to use in the asynchronous
case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is the first step towards asynchronous parsing of the TAP output.
We will need to call the same code from both a "for" loop (for unit
tests) and an "async for" loop (for mtest itself). Because the same
function cannot be both a generator and an asynchronous generator, we
need to build both on a common core. This commit therefore introduces
a parse_line function that "parse" can call in a loop. All the local
variables of TAPParser.parse move into "self".
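The resulting shape is roughly the following (heavily simplified; only the structure matters, the toy state machine is not the real TAP grammar):

    class MiniTAPParser:
        def __init__(self):
            self.num = 0  # formerly a local variable of parse(), now on self

        def parse_line(self, line):
            # One step of the state machine, shared by both entry points below.
            if line.startswith('ok') or line.startswith('not ok'):
                self.num += 1
                return (self.num, not line.startswith('not ok'))
            return None

        def parse(self, lines):
            # Synchronous generator, e.g. for the unit tests.
            for line in lines:
                event = self.parse_line(line)
                if event is not None:
                    yield event

        async def parse_async(self, lines):
            # Asynchronous generator, e.g. for mtest itself.
            async for line in lines:
                event = self.parse_line(line)
                if event is not None:
                    yield event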
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rust has its own built-in unit test format, which is invoked by
compiling a Rust executable with the `--test` flag to rustc. The tests
are then run by simply invoking that binary. They output a custom test
format, which this patch adds parsing support for. This means that we
can correctly report each subtest in the JUnit XML we generate, which should
be helpful for orchestration systems like GitLab and Jenkins that can
parse JUnit XML.
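The format consists of lines such as "test module::name ... ok" (or FAILED / ignored), so the parsing can be sketched with a simple regular expression (illustrative only, not the exact one used by the patch):

    import re

    TEST_RE = re.compile(r'^test (?P<name>\S+) \.\.\. (?P<result>ok|FAILED|ignored)$')

    def parse_rust_tests(output):
        for line in output.splitlines():
            m = TEST_RE.match(line)
            if m:
                yield m.group('name'), m.group('result')

    sample = ('running 2 tests\n'
              'test tests::parses_input ... ok\n'
              'test tests::needs_network ... ignored\n')
    print(list(parse_rust_tests(sample)))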
For non-TAP tests we want to associate names with the tests; to that end,
store them as a dict. For TAP tests, we'll store the "name" as an
integer string that corresponds to the order in which the tests were run.
Flush after each output line, even if printing to a file, so that each
result is immediately visible down a pipeline.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a progress report in the style of "yum". Every second the
report prints a different test among the ones that are running.
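Conceptually the reporter is just a periodic asyncio task cycling through the currently running tests, something like this simplified sketch:

    import asyncio
    import sys

    async def progress_report(running):
        # 'running' is a live list with the names of the tests currently executing.
        i = 0
        while running:
            name = running[i % len(running)]
            print('\r[{} running] {}'.format(len(running), name),
                  end='', file=sys.stderr, flush=True)
            i += 1
            await asyncio.sleep(1)  # show a different test every second

    async def _demo():
        running = ['test-alpha', 'test-beta']
        reporter = asyncio.create_task(progress_report(running))
        await asyncio.sleep(2.5)
        running.clear()             # all tests finished
        await reporter

    asyncio.run(_demo())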
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The TestLogger class lets us move the code for all those log files
out of TestHarness. The interface is based on JunitBuilder, which
is converted already in this commit. Over the next commits, we
will also convert JSON, text and console output.
The main difference with JunitBuilder is that the completion method is
asynchronous. This can be useful if the logger needs to clean up after
itself and wait for asyncio tasks.
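As a sketch, the interface can be pictured like this (method names are assumptions for illustration, not necessarily the final ones):

    class TestLogger:
        def start(self, harness):
            pass

        def log(self, harness, result):
            pass

        async def finish(self, harness):
            # Asynchronous so a logger can wait for its own asyncio tasks
            # (for example a progress reporter) and clean up after itself.
            pass

    class JunitBuilder(TestLogger):
        def __init__(self, filename):
            self.filename = filename
            self.cases = []

        def log(self, harness, result):
            self.cases.append(result)   # accumulate one <testcase> per result

        async def finish(self, harness):
            pass                        # write the XML file here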
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Just reuse the collected_failures collection now that it contains
TestRun objects. Move the code to generate the short form of the log
to TestRun.
Note that the first line of the error log is not included in
get_log()'s return value, so the magic "first four lines are passed
unscathed" is changed to three lines only. The resulting output is
like this:
--- command ---
<command line>
--- Listing only the last 100 lines from a long log. ---
--- stdout ---
...
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If there's a UnicodeEncodeError while printing the error logs,
TestHarness tries an encode/decode pair to get rid of anything that
is not a 7-bit ASCII character; this however results in "?" characters
that are not very clear. To make it easier to understand what is
going on, use backslashreplace instead.
While at it, fix the decode to use a matching encoding. This will
only matter in the rare case of sys.stdout.encoding not being an
ASCII superset, and even then it should be inconsequential.
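In other words, something along these lines (a small sketch of the substitution, not the exact mtest code):

    import sys

    def printable(line: str) -> str:
        enc = sys.stdout.encoding or 'utf-8'
        # Unencodable characters become visible \xNN or \uNNNN escapes instead
        # of '?', and the decode uses the same (matching) encoding.
        return line.encode(enc, errors='backslashreplace').decode(enc)

    print(printable('log line with a stray surrogate: \udcff'))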
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of colorizing the whole status line, only colorize the word
representing the outcome of the test (SKIP, OK, FAIL, etc.). This
is less intrusive, so the patch also makes the following changes:
- colorize OK and EXPECTEDFAIL, respectively as green and yellow
- colorize the summary of failures as well.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of storing the string, store the whole TestRun. In the
next patches we'll use this to colorize the summary of failures,
and to allow a few more simplifications.
There is some code duplication between the console and logfile
code, but it won't matter once console and logfile output
are in two completely separate classes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid passing them around as parameters; this will be useful when logging
is moved out of TestHarness, because individual loggers will call back
into TestHarness to do common formatting chores.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Place in TestRun everything that is needed in order to
format the result. This avoids passing around the number
and visible test name as arguments.
Test numbers are assigned the first time they are used.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will provide a way to pass more information from the TestHarness
local variables to the SingleTestRunner and use them outside the
run_test function. For example, the name could be used to report
progress while the tests are running.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is a common workflow to fix something and then rerun a particular test to
see whether the problem is gone. When tests start to become numerous, it becomes
time consuming for "meson test" to relink all of them (and in fact rebuild the
whole project) when the user has already specified the tests they want to run,
as well as the tests' dependencies.
Teach meson to be smart and only build what is needed for the tests (or suites)
that were specified.
Fixes: #7473
Related: #7830
Avoid calling self.collected_failures.append twice, and avoid
inflated indentation by adding a "plain" decorator to mlog.
Fixes: ba71fde18 ("mtest: collect failures regardless of colorized console", 2020-10-12)
Rewrite the SingleTestRunner to use asyncio to manage subprocesses,
while still using subprocess.Popen to run them. Concurrency is
managed with an asyncio Semaphore; for simplicity (since this is
a temporary state) we create a new thread for each test that is run
instead of having a pool.
This already provides the main advantage of asyncio, which is better
control over cancellation; with the current code, KeyboardInterrupt
was never handled by the thread executor, so the code that tried to handle
it in SingleTestRunner only worked for non-parallel tests. And
because executor futures cannot be cancelled, there was no way for
the user to kill a test that got stuck. Instead, without executors
^C exits "meson test" immediately. The next patch will improve things
even further, allowing a single test to be interrupted with ^C.
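The concurrency scheme boils down to something like this simplified sketch ('true' is a stand-in test command; unlike the description above, the sketch uses the default thread pool rather than one thread per test):

    import asyncio
    import subprocess

    async def run_all(commands, num_processes):
        semaphore = asyncio.Semaphore(num_processes)

        async def run_one(cmd):
            async with semaphore:       # bound the number of concurrent tests
                p = subprocess.Popen(cmd)
                loop = asyncio.get_running_loop()
                # Wait in a worker thread so the event loop stays responsive
                # and the awaiting task remains cancellable on ^C.
                return await loop.run_in_executor(None, p.wait)

        return await asyncio.gather(*(run_one(c) for c in commands))

    print(asyncio.run(run_all([['true'], ['true'], ['true']], num_processes=2)))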
Distinguish a failure due to user interrupt from a presumable ERROR
result due to the SIGTERM. The test should fail after CTRL+C even if
the test traps SIGTERM and exits with a return code of 0.
ProcessLookupError can also happen from p.kill(). There is also
nothing we can do in that case, so move the "try" for that
exception to the entire kill_process function.
The ValueError case seems like dead code, so get rid of it.
A large part of _run_cmd is devoted to setting up and killing the
test subprocess. Move that to a separate function to make the
test runner logic easier to understand.
Use asyncio futures for the run loop, while still handling I/O in
a thread pool using run_on_executor.
The handling of the test result is no longer duplicated between
run_tests and drain_futures. Instead, the test result is always processed
and printed by run_test after single_test.run() completes and (in verbose
mode) it cannot interleave with the test output. Therefore the special
case for self.options.num_processes == 1 can be removed.
run_special and doit are the same except that run_special forgot to
set self.is_run. There is no need for the duplication.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* Fix gtest invocation when workdir is set
* Fix gtest invocation when workdir is not set
* Code style fix
Co-authored-by: Sergey Kartashev <kartashev.sv@mipt.ru>
You could always specify a list of tests to run by passing the names as
arguments to `meson test`. If there were multiple tests with that name (in the
same project or different subprojects), all of them would be run. Now you can:
1. Run all tests with the specified name from a specific subproject: `meson test subprojname:testname`
2. Run all tests defined in a specific subproject: `meson test subprojectname:`
Also forbid ':' in test names. We already forbid this elsewhere, so it
should not be a big deal.
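A sketch of the matching logic implied by these two forms (illustrative; the real option handling is more involved):

    def test_matches(spec, test_subproject, test_name):
        if ':' in spec:
            subproj, name = spec.split(':', 1)
            # "subproj:" selects every test of that subproject,
            # "subproj:testname" selects a single named test.
            return subproj == test_subproject and name in ('', test_name)
        return spec == test_name

    print(test_matches('subprojname:testname', 'subprojname', 'testname'))  # True
    print(test_matches('subprojectname:', 'subprojectname', 'whatever'))    # True
    print(test_matches('othername', 'subprojname', 'testname'))             # False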
This allows the NINJA environment variable to support all the Windows special
cases, especially allowing an absolute path without extension.
Based on a patch by Yonggang Luo.
Fixes: #7659
Suggested-by: Nirbheek Chauhan <nirbheek@centricular.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This removes the check of platform.system() for "mingw". The only case I know
where "mingw" is returned is when using a msys Python under a msys2 mingw environment.
This combination is not really supported by meson and will result in weird errors,
so remove the check.
The second change is checking sys.platform for cygwin instead of platform.system().
The former is documented to return "cygwin", while the latter is not and just
returns uname().
While under Cygwin uname() always starts with "cygwin", it's not hardcoded in MSYS2
and starts with the environment name. Using sys.platform is safer here.
Fixes #7552
According to the specification:
https://testanything.org/tap-specification.html#skipping-tests
The harness should report the text after # SKIP\S*\s+ as a reason for
skipping.
(It's not exactly like the TODO directive; the phrasing/presentation of
the spec could be improved.)
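A small sketch of the extraction (the regular expression is illustrative, written from the wording of the spec quoted above):

    import re

    DIRECTIVE_RE = re.compile(r'#\s*(?P<directive>SKIP|TODO)\S*\s*(?P<explanation>.*)',
                              re.IGNORECASE)

    def skip_reason(tap_line):
        m = DIRECTIVE_RE.search(tap_line)
        if m and m.group('directive').upper() == 'SKIP':
            return m.group('explanation').strip()
        return None

    print(skip_reason('ok 3 - connect to server # SKIP no network available'))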
* mtest: TestResult.SKIP is not a failure
If some but not all tests in a run were skipped, then the overall result
is given by whether there were any failures among the non-skipped tests.
Resolves: https://github.com/mesonbuild/meson/issues/7515
Signed-off-by: Simon McVittie <smcv@debian.org>
* Add test-cases for partially skipped TAP tests
issue7515.txt is the output of one of the real TAP tests in gjs, which
failed as a result of #7515. The version inline in meson.build is
a minimal reproducer.
Signed-off-by: Simon McVittie <smcv@debian.org>
Otherwise a wrapper script which takes an executable as an argument will
mistakenly run when that executable is cross compiled. This does not
wrap said executable in an exe_wrapper; it is just skipped.
Fixes #5982
Gtest can output junit results with a command line switch. We can parse
this to get more detailed results than the returncode, and put those in
our own Junit output. We basically just throw away the top level
'testsuites' object, then fix up the names of the tests, and shove that
into our junit.
JUnit is pretty ubiquitous, lots of services and results viewers
understand it, in particular gitlab and jenkins know how to consume
JUnit xml. This means projects using CI services can have their test
results consumed automatically.
Fixes: #6972
Remove some weirdness from test output such as extra commas, missing
spaces and way too precise time durations. Also improve the overall
alignment of the output.
According to http://testanything.org/tap-specification.html
"Any output line that is not a version, a plan, a test line, a
diagnostic or a bail out is considered an “unknown” line. A TAP parser
is required to not consider an unknown line as an error but may
optionally choose to capture said line and hand it to the test
harness, which may have custom behavior attached [...] TAP::Harness
reports TAP syntax errors at the end of a test run".
(glib gtest can generate empty lines)
The size of WINEPATH is limited (1024 [until recently]); we
can very easily reach that limit, and even the new one (2048), so
try to keep the path as small as possible by using the shortPath
version of paths.
Also assert that we do not reach the new hard limit.
And avoid having duplicates in the list of paths.
[until recently]: https://bugs.winehq.org/show_bug.cgi?id=45810
This started out with a bug report of mtest trying to add bytes + str,
which I thought "Oh, mypy can help!" and turned into an entire day of
awful code traversal and trying to figure out why attributes were
changing type. Hopefully this makes everything cleaner and easier to
follow.
I wanted to look at the imports for annotations but was having a hard time
reading them because they're just all over the place. This is purely a
human readability issue.
For consistency, it can be useful to have an explicit empty test suite list
for a test:
test('test-name', binary, suite: [])
This currently passes meson but fails when running meson tests:
Traceback (most recent call last):
File "/usr/lib/python3.7/site-packages/mesonbuild/mesonmain.py", line 122, in run
return options.run_func(options)
File "/usr/lib/python3.7/site-packages/mesonbuild/mtest.py", line 1005, in run
return th.doit()
File "/usr/lib/python3.7/site-packages/mesonbuild/mtest.py", line 756, in doit
self.run_tests(tests)
File "/usr/lib/python3.7/site-packages/mesonbuild/mtest.py", line 896, in run_tests
visible_name = self.get_pretty_suite(test)
File "/usr/lib/python3.7/site-packages/mesonbuild/mtest.py", line 875, in get_pretty_suite
rv = TestHarness.split_suite_string(test.suite[0])[0]
IndexError: list index out of range
Fix it by simply checking that the test suite is a valid list we can pass on.
Fixes #5340
Signed-off-by: Peter Hutterer <peter.hutterer@who-t.net>
* mtest: fix TAP with --verbose
TAP needs to process the test stdout even if --verbose is passed.
Capture it to a separate temporary file, and print it at the end
of the test if --verbose was passed.
In the future, we could parse it on the fly and print the result of
each TAP test point in verbose mode.
* Prefer "stderr is stdout" to "=="
The previous commit used "==" in accordance with the preexisting code,
but reviewers preferred using "is" instead. Fix both occurrences.
This provides initial support for parsing TAP output. It detects failures
and skipped tests without relying on exit code, as well as early termination
of the test due to an error or a crash.
For now, subtests are not recorded in the TestRun object. However, because the
TAP output goes on stdout, it is printed by --print-errorlogs when a test does
not behave as expected. Handling subtests as TestRuns, and serializing them
to JSON, can be added later.
The parser was written specifically for Meson, and comes with its own
test suite.
Fixes #2923.
Hard errors also come from the GNU Automake test protocol. They happen when,
e.g., the set-up of a test case scenario fails, or when some
other unexpected or highly undesirable condition is encountered.
TAP will use them for parse errors too. Add them to the exitcode protocol
first.
--print-errorlogs is using the test's return code to look for failed
tests, instead of just looking at the TestResult. Simplify the code and
make it work for TAP too.
If the global gdb option of mesontest is disabled (e.g. '--gdb' is not passed)
and the gdb option of test_setup is enabled, an exception will be thrown,
because the signal.signal function can only be called from the main thread;
attempting to call it from another thread raises a ValueError.
When Python sees an invalid character in a filename for the current locale,
instead of clobbering it, it saves it as an invalid codepoint called a
surrogate. We need to explicitly instruct the encoder to write those out
as-is. In the JSON file, we replace them instead to produce valid JSON.
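A sketch of the two behaviours (file names below are made up):

    import json
    import os
    import tempfile

    name = 'test-\udcff-case'   # what a filename with undecodable bytes looks like
    logdir = tempfile.mkdtemp()

    # Text log: surrogateescape turns the surrogate back into the original byte.
    with open(os.path.join(logdir, 'testlog.txt'), 'w',
              encoding='utf-8', errors='surrogateescape') as f:
        f.write(name + '\n')

    # JSON log: surrogates are not representable there, so replace them instead.
    safe = name.encode('utf-8', errors='replace').decode('utf-8')
    with open(os.path.join(logdir, 'testlog.json'), 'w', encoding='utf-8') as f:
        json.dump({'name': safe}, f)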
$ flake8
./mesonbuild/mtest.py:524:9: E122 continuation line missing indentation or outdented
per PEP8, this line requires more indentation to distinguish it from the
following line
This makes it clear in the results that tests marked "should_fail"
exist. We also avoid the all-caps output and make the classifications
unambiguous compared to pytest or autotools' XFAIL/XPASS.
Before:
OK: 329
FAIL: 1
SKIP: 0
TIMEOUT: 0
After:
Ok: 323
Expected Fail: 1
Fail: 6
Unexpected Pass: 0
Skipped: 0
Timeout: 0
As instructed in the Python docs, you should not use PIPE here. This can
lead to deadlocks with massive testsuite output, which was the case for efl.
For now the output of the tests is redirected into a temp file; the
content from there can then be used to fill the TestRun structure.
This fixes test running problems in efl.
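A minimal sketch of the temp-file approach (command and details are illustrative):

    import subprocess
    import tempfile

    def run_test(cmd):
        with tempfile.TemporaryFile() as stdout:
            # No PIPE: the child writes into a file, so it can never block on a
            # full pipe buffer no matter how much output it produces.
            returncode = subprocess.call(cmd, stdout=stdout, stderr=subprocess.STDOUT)
            stdout.seek(0)
            output = stdout.read().decode(errors='replace')
        return returncode, output

    print(run_test(['echo', 'lots and lots of test output']))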
This has the advantage that "meson --help" shows a list of all commands,
making them discoverable. This also reduces the manual parsing of
arguments to the strict minimum needed for backward compatibility.
We used to immediately try to use whatever exe_wrapper was defined in
the cross file, but some people generate the cross file once and use
it for several projects, most of which do not even need an exe wrapper
to build.
Now we're a bit more resilient. We quietly fall back to using
non-exe-wrapper paths for compiler checks and skip the sanity check.
However, if some code needs the exe wrapper, f.ex., if you run a built
executable using custom_target() or run_target(), we will error out
during setup.
Tests will, of course, continue to error out when you run them if the
exe wrapper was not found. We don't want people's tests to silently
"pass" (aka skip) because of a bad CI setup.
Closes https://github.com/mesonbuild/meson/issues/3562
This commit also adds a test for the behaviour of exe_wrapper in these
cases, and refactors the unit tests a bit for it.
We already have code to fetch and find binaries specified in a cross
file, so use the same code for exe_wrapper. This allows us to handle
the same corner-cases that were fixed for other cross binaries.
When a test fails due to a signal (e.g., SIGSEGV) it can be somewhat
mysterious why the test failed. Also, even when a test fails due to a
non-zero exit status it would help if the exit status was reported. This
augments the result string to include the non-zero exit status or
signal number and name.
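The gist of the augmented result string, as a sketch (relying on the subprocess convention that a negative return code means "killed by that signal"):

    import signal

    def describe_exit(returncode):
        if returncode < 0:
            sig = signal.Signals(-returncode)
            return 'killed by signal {} ({})'.format(sig.value, sig.name)
        return 'exit status {}'.format(returncode)

    print(describe_exit(-signal.SIGSEGV))   # e.g. killed by signal 11 (SIGSEGV)
    print(describe_exit(77))                # exit status 77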
Resolves #3642
When the exe runner is `wine` or `wine32` or `wine64`, etc.
This allows people to run tests with wine.
Note that you also have to set WINEPATH to point to your custom
prefix(es) if your tests use external dependencies.
Closes https://github.com/mesonbuild/meson/issues/3620
Replace the logic where a test setup with no project specifier defaults to
the main project with one that takes the test setup from the same
(sub)project from where the to-be-executed test has been read from.
Use $project_name:$test_setup namespace scheme for test setups. This
allows one to choose which (sub)project a test setup is taken from,
should there be several sharing the same name. Defaults to the main
project. E.g. "meson test --setup subproj:valgrind".
Setting MALLOC_PERTURB_ to a non-zero value is fine for regular test
cases. It helps catch bugs, but also comes with some runtime
overhead.
This overhead is noticeable for benchmarks when compared to running them
directly instead of through Meson.
Therefore, MALLOC_PERTURB_ is not touched for benchmarks.
Closes #3034
According to Python documentation[1] dirname and basename
are defined as follows:
os.path.dirname() = os.path.split()[0]
os.path.basename() = os.path.split()[1]
For the purpose of better readability, split() is replaced
by the appropriate function if only one part of the returned tuple
is used.
[1]: https://docs.python.org/3/library/os.path.html#os.path.split
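For example:

    import os.path

    p = 'sub/dir/file.txt'
    assert os.path.dirname(p) == os.path.split(p)[0] == 'sub/dir'
    assert os.path.basename(p) == os.path.split(p)[1] == 'file.txt'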
When `ninja -C builddir/ test` is run, ninja will change into the build
dir before starting, but `meson test -C builddir/` does not. This is
important because meson does not use (for good reasons) absolute paths,
which means that if a test case needs to be passed, as an argument, a file name
that is part of the build process, it will be relative to the builddir. Without
changing into the builddir the path will not exist (or worse, point at
the wrong thing), and the test will not behave as intended.
To fix this mtest will change directory before starting tests, and will
change back after all tests have been finished.
Fixes #2710
Adding it to the end of PATH means that if an installed instance of a DLL
exists, that would be used instead of the built instance.
Compare with run_exe(), which already gets this right.