Postgres Email Lists

pgsql-hackers

❮

meson: Make test output much more useful on failure (both in CI and locally)

Jump to comment-1
Jelte Fennema-Nio<postgres@jeltef.nl>
Jan 26, 2026, 10:14 AM UTC
I've gotten totally fed up with the amount of clicking I need to do
every time a test fails in CI to find out what exactly failed. The
output that meson gives in its main output is completely useless
currently. Which means I have to go click through CirrusCI its (pretty
terrible) filebrowser UI searching for log files or regression.diffs.
And locally I have the exact same problem when a perl test is failing.
Meson often shows no useful output at all, which means I have to find
some unknown log file somewhere in my build directory. And then I have
to find the useful spot in that log file that actually tells me what the
error was.
So attached are a few small patches that greatly improve the test output
of the most common test failure reasons (at least the ones I tend to run
into myself).
An example CI run with some intentional failures (attached as the
nocfbot patch) can be found here:
https://cirrus-ci.com/task/6592551433535488
View in PostgreSQL Archives →
- Jump to comment-1
  Andres Freund<andres@anarazel.de>
  Jan 26, 2026, 4:42 PM UTC
  Hi,
  On 2026-01-26 11:13:57 +0100, Jelte Fennema-Nio wrote:
  I've gotten totally fed up with the amount of clicking I need to do
  every time a test fails in CI to find out what exactly failed. The
  output that meson gives in its main output is completely useless
  currently.
  This doesn't really seem meson specific, right? It just seems about the output
  of our own test tooling?
  Meson can't really output something that we hide from it...
  I agree that what we currently do is pretty unhelpful.
  Subject: [PATCH v1 1/5] meson: Include pg_regress diffs in meson output
  This isn't about meson, it's about adding a new argument to pg_regress.
  I think we'd need to limit the size of the output here somewhat. If e.g. the
  server crashes, you end up with all subsequent tests failing and printing out
  a couple hundred kB of pointless diff output.
  @@ -1521,22 +1523,40 @@ results_differ(const char testname, const char resultsfile, const char *defaul
  * append to the diffs summary file.
  */
  
  - / Write diff header /
  - difffile = fopen(difffilename, "a");
  + difffile = fopen(difffilename, "a+");
  if (difffile)
  {
  + fseek(difffile, 0, SEEK_END);
  + startpos = ftell(difffile);
  fprintf(difffile,
  "diff %s %s %s\n",
  prettydiffopts, bestexpectfile, resultsfile);
  + fflush(difffile);
  +
  + / Run diff /
  + snprintf(cmd, sizeof(cmd),
  + "diff %s \"%s\" \"%s\" >> \"%s\"",
  + prettydiffopts, bestexpectfile, resultsfile, difffilename);
  + run_diff(cmd, difffilename);
  +
  + / Emit diff as TAP diagnostics if requested /
  + if (print_diffs)
  + {
  + char line[1024];
  +
  + fseek(difffile, startpos, SEEK_SET);
  + while (fgets(line, sizeof(line), difffile))
  + {
  + size_t len = strlen(line);
  +
  + if (len > 0 && line[len - 1] == '\n')
  + line[len - 1] = '\0';
  + diag("%s", line);
  + }
  + }
  fclose(difffile);
  }
  I'm a bit confused. Why are we rerunning diffs and appending to the diff file?
  From c1db07a8bba54e6b903766585f1b521ef526de12 Mon Sep 17 00:00:00 2001
  From: Jelte Fennema-Nio <postgres@jeltef.nl>
  Date: Mon, 26 Jan 2026 09:09:11 +0100
  Subject: [PATCH v1 2/5] perl tap: Show failed command output
  
  This adds the output of failed commands to the TAP output. Before a
  failed libpq_pipeline test would look like this:
  
  Failed test 'libpq_pipeline cancel'
  at /home/jelte/work/postgres-3/src/test/modules/libpqpipeline/t/001libpq_pipeline.pl line 55.
  
  Now you can actually see the reason of the failure:
  
  Failed test 'libpq_pipeline cancel'
  at /home/jelte/work/postgres-3/src/test/modules/libpqpipeline/t/001libpq_pipeline.pl line 55.
  ----- command failed -----
  libpqpipeline -r 700 cancel port=14309 host=/tmp/htMib451qD dbname='postgres' maxprotocol_version=latest
  --------- stderr ---------
  test cancellations...
  libpq_pipeline:315: unexpected number of rows received: 1
  --------------------------
  ---
  src/test/perl/PostgreSQL/Test/Utils.pm | 36 +++++++++++++++++++++++---
  1 file changed, 32 insertions(+), 4 deletions(-)
  
  diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
  index ff843eecc6e..bd1e981c6f0 100644
  --- a/src/test/perl/PostgreSQL/Test/Utils.pm
  +++ b/src/test/perl/PostgreSQL/Test/Utils.pm
  @@ -955,8 +955,22 @@ sub command_ok
  {
  local $Test::Builder::Level = $Test::Builder::Level + 1;
  my ($cmd, $testname) = @;
  - my $result = run_log($cmd);
  - ok($result, $test_name);
  + # Doesn't rely on detecting end of file on the file descriptors,
  + # which can fail, causing the process to hang, notably on Msys
  + # when used with 'pg_ctl start'
  + my $stdoutfile = File::Temp->new();
  + my $stderrfile = File::Temp->new();
  I'm afraid that the creation of this many additional tempfiles would slow down
  the tests on some platforms...
  + my $result = IPC::Run::run $cmd, '>' => $stdoutfile, '2>' => $stderrfile;
  + ok($result, $test_name) or do
  + {
  + my $stdout = slurp_file($stdoutfile);
  + my $stderr = slurp_file($stderrfile);
  + diag("----- command failed -----");
  + diag(join(" ", @$cmd));
  + diag("--------- stdout ---------"), diag($stdout) if $stdout;
  + diag("--------- stderr ---------"), diag($stderr) if $stderr;
  + diag("--------------------------");
  + };
  Outputting the entire file seems to again have the danger of completely
  swamping the output.
  return;
  }
  
  @@ -972,8 +986,22 @@ sub command_fails
  {
  local $Test::Builder::Level = $Test::Builder::Level + 1;
  my ($cmd, $testname) = @;
  - my $result = run_log($cmd);
  - ok(!$result, $test_name);
  + # Doesn't rely on detecting end of file on the file descriptors,
  + # which can fail, causing the process to hang, notably on Msys
  + # when used with 'pg_ctl start'
  Huh? Any more details?
  + my $stdoutfile = File::Temp->new();
  + my $stderrfile = File::Temp->new();
  + my $result = IPC::Run::run $cmd, '>' => $stdoutfile, '2>' => $stderrfile;
  + ok(!$result, $test_name) or do
  + {
  + my $stdout = slurp_file($stdoutfile);
  + my $stderr = slurp_file($stderrfile);
  + diag("-- command succeeded unexpectedly --");
  + diag(join(" ", @$cmd));
  + diag("-------------- stdout --------------"), diag($stdout) if $stdout;
  + diag("-------------- stderr --------------"), diag($stderr) if $stderr;
  + diag("------------------------------------");
  + };
  I don't think it makes sense to duplicate the logic for this multiple times.
  From c88f1fbf6462da56a1f0739dd83a6284d6bdd389 Mon Sep 17 00:00:00 2001
  From: Jelte Fennema-Nio <postgres@jeltef.nl>
  Date: Mon, 26 Jan 2026 09:32:15 +0100
  Subject: [PATCH v1 3/5] perl tap: Show die reason in TAP output
  ...
  diff --git a/src/test/perl/PostgreSQL/Test/Utils.pm b/src/test/perl/PostgreSQL/Test/Utils.pm
  index bd1e981c6f0..df328d0f571 100644
  --- a/src/test/perl/PostgreSQL/Test/Utils.pm
  +++ b/src/test/perl/PostgreSQL/Test/Utils.pm
  @@ -206,6 +206,14 @@ INIT
  # test may still fail, but it's more likely to report useful facts.
  $SIG{PIPE} = 'IGNORE';
  
  + # Emit die messages as TAP diagnostics so they appear in test output.
  + $SIG{DIE} = sub {
  + return if $^S; # Ignore dies inside eval
  + my $msg = shift;
  + chomp $msg;
  + diag("die: $msg");
  + };
  +
  This looked familiar - turns out I had complained about this in the past and
  then forgotten about it :(
  https://www.postgresql.org/message-id/20220222181924.eehi7o4pmneeb4hm%40alap3.anarazel.de
  At least at the time I found it necessary to separately test for $^S, to avoid
  triggering the logic for syntax errors. Not sure why I concluded that though.
  In a reply Andrew suggested to use done_testing(), which avoids the pointless
  "Tests were run but no plan was declared and done_testing() was not seen."
  which does seem like an improvement.
  From a6b324678293d65f3a417b8f66889b3f3f6a9c29 Mon Sep 17 00:00:00 2001
  From: Jelte Fennema-Nio <postgres@jeltef.nl>
  Date: Mon, 26 Jan 2026 10:04:44 +0100
  Subject: [PATCH v1 4/5] perl tap: Include caller in die messages of psql
  
  This way by looking at the error output you can actually figure out
  where it occured, instead of just knowing that is in the very often
  called psql function.
  I think this is too localized a fix. Right now every die in a .pm file has
  this issue. The easiest fix would probably be to just replace all uses of die
  in .pm files with croak (which we already, inconsistently, use).
  stderr:
  # die: error running SQL: 'psql:<stdin>:1: ERROR: syntax error at or near "I"
  # LINE 1: I am the worst
  # ^'
  # while running 'psql --no-psqlrc --no-align --tuples-only --quiet --dbname port=20878 host=/tmp/K5V48mzFx9 dbname='postgres' --file - --variable ONERRORSTOP=1' with sql 'I am the worst' at /home/andres/src/postgresql/src/bin/scripts/t/020_createdb.pl line 19.
  Note that the failure now is reported as coming from the actual test file.
  We could also use Test::More->builder->croak(), but that's a heck of a lot
  longer, and would require us to be more liberal about sprinkling
  local $Test::Builder::Level = $Test::Builder::Level + 1;
  around. I don't see an improvement in the output either.
  In contrast Carp.pm just looks for a caller outside of the current package. If
  we are worried about errors across Cluster.pm/Utils.pm, we could just fix that
  in one place by using @CARP_NOT.
  Maybe there's a way to do that more centrally?
  Greetings,
  Andres Freund
  View in PostgreSQL Archives →
  Jump to comment-1
  Jelte Fennema-Nio<postgres@jeltef.nl>
  Jan 26, 2026, 5:46 PM UTC
  On Mon, 26 Jan 2026 at 17:42, Andres Freund <andres@anarazel.de> wrote:
  This doesn't really seem meson specific, right? It just seems about the output
  of our own test tooling?
  Yes, it's general improvements. But the problem is worse in meson than
  for because that really only shows the TAP output. The Makefile based
  builds will show a bit more information currently.
  I think we'd need to limit the size of the output here somewhat. If e.g. the
  server crashes, you end up with all subsequent tests failing and printing out
  a couple hundred kB of pointless diff output.
  Fair enough.
  I'm a bit confused. Why are we rerunning diffs and appending to the diff file?
  Now I'm confused. We're running the same amount of diffs as before.
  Did you miss the removal of the / Run diff / below the block that
  you quoted?
  I'm afraid that the creation of this many additional tempfiles would slow down
  the tests on some platforms...
  < snip>
  @@ -972,8 +986,22 @@ sub command_fails
  {
  local $Test::Builder::Level = $Test::Builder::Level + 1;
  my ($cmd, $testname) = @;
  - my $result = run_log($cmd);
  - ok(!$result, $test_name);
  + # Doesn't rely on detecting end of file on the file descriptors,
  + # which can fail, causing the process to hang, notably on Msys
  + # when used with 'pg_ctl start'
  
  Huh? Any more details?
  I copy pasted that comment and the pipe to tempfiles from
  commandlikesafe. I did this after I initially tried to pipe to
  variables, but that made the test that we run in SanityCheck get stuck
  forever.
  I don't get why this is needed either.
  + my $stdoutfile = File::Temp->new();
  + my $stderrfile = File::Temp->new();
  + my $result = IPC::Run::run $cmd, '>' => $stdoutfile, '2>' => $stderrfile;
  + ok(!$result, $test_name) or do
  + {
  + my $stdout = slurp_file($stdoutfile);
  + my $stderr = slurp_file($stderrfile);
  + diag("-- command succeeded unexpectedly --");
  + diag(join(" ", @$cmd));
  + diag("-------------- stdout --------------"), diag($stdout) if $stdout;
  + diag("-------------- stderr --------------"), diag($stderr) if $stderr;
  + diag("------------------------------------");
  + };
  
  I don't think it makes sense to duplicate the logic for this multiple times.
  It seemed a bit much to make it generic over the input, as it would
  need to remove dashes accordingly. Especially since it only has two
  copies.
  This looked familiar - turns out I had complained about this in the past and
  then forgotten about it :(
  
  https://www.postgresql.org/message-id/20220222181924.eehi7o4pmneeb4hm%40alap3.anarazel.de
  
  At least at the time I found it necessary to separately test for $^S, to avoid
  triggering the logic for syntax errors. Not sure why I concluded that though.
  
  In a reply Andrew suggested to use done_testing(), which avoids the pointless
  "Tests were run but no plan was declared and done_testing() was not seen."
  which does seem like an improvement.
  I'll take a look at doing that.
  I think this is too localized a fix. Right now every die in a .pm file has
  this issue. The easiest fix would probably be to just replace all uses of die
  in .pm files with croak (which we already, inconsistently, use).
  I didn't even know that function existed (due to my pretty much
  non-existent knowlegde of perl).
  View in PostgreSQL Archives →
  Jump to comment-1
  Jelte Fennema-Nio<postgres@jeltef.nl>
  Jan 27, 2026, 2:49 PM UTC
  On Mon Jan 26, 2026 at 5:42 PM CET, Andres Freund wrote:
  I agree that what we currently do is pretty unhelpful.
  Attached is v2, which I think fixes all the issues that you mentioned.
  View in PostgreSQL Archives →
- Jump to comment-1
  Zsolt Parragi<zsolt.parragi@percona.com>
  Jan 26, 2026, 3:36 PM UTC
  Hello,
  This is really useful!
  - my $result = run_log($cmd);
  - ok($result, $test_name);
  + # Doesn't rely on detecting end of file on the file descriptors,
  + # which can fail, causing the process to hang, notably on Msys
  + # when used with 'pg_ctl start'
  The function comments still refer to runlog, for both commandok and
  command_fails
  View in PostgreSQL Archives →