pgsql: Prevent invalidation of newly synced replication slots.

• Amit Kapila <akapila@postgresql.org>
    Jan 27, 2026, 5:56 AM UTC
    Prevent invalidation of newly synced replication slots.
    A race condition could cause a newly synced replication slot to become
    invalidated between its initial sync and the checkpoint.
When syncing a replication slot to a standby, the slot's initial
restart_lsn is taken from the publisher's remote restart_lsn. Because slot
    sync happens asynchronously, this value can lag behind the standby's
    current redo pointer. Without any interlocking between WAL reservation and
    checkpoints, a checkpoint may remove WAL required by the newly synced
    slot, causing the slot to be invalidated.
    To fix this, we acquire ReplicationSlotAllocationLock before reserving WAL
    for a newly synced slot, similar to commit 006dd4b2e5. This ensures that
    if WAL reservation happens first, the checkpoint process must wait for
    slotsync to update the slot's restart_lsn before it computes the minimum
    required LSN.
    However, unlike in ReplicationSlotReserveWal(), this lock alone cannot
    protect a newly synced slot if a checkpoint has already run
    CheckPointReplicationSlots() before slotsync updates the slot. In such
    cases, the remote restart_lsn may be stale and earlier than the current
    redo pointer. To prevent relying on an outdated LSN, we use the oldest
    WAL location available if it is greater than the remote restart_lsn.
    This ensures that newly synced slots always start with a safe, non-stale
    restart_lsn and are not invalidated by concurrent checkpoints.
    Author: Zhijie Hou <houzj.fnst@fujitsu.com>
    Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
    Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
    Reviewed-by: Vitaly Davydov <v.davydov@postgrespro.ru>
    Reviewed-by: Chao Li <li.evan.chao@gmail.com>
    Backpatch-through: 17
    Discussion: https://postgr.es/m/TY4PR01MB16907E744589B1AB2EE89A31F94D7A%40TY4PR01MB16907.jpnprd01.prod.outlook.com
    Branch
    ------
    master
    Details
    -------
    https://git.postgresql.org/pg/commitdiff/851f6649cc18c4b482fa2b6afddb65b35d035370
    Modified Files
    --------------
    src/backend/access/transam/xlog.c                  |  6 +-
    src/backend/replication/logical/slotsync.c         | 97 +++++++++++-----------
    src/include/access/xlog.h                          |  1 +
    src/test/recovery/t/046_checkpoint_logical_slot.pl | 84 ++++++++++++++++++-
    4 files changed, 136 insertions(+), 52 deletions(-)
• Robert Haas <robertmhaas@gmail.com>
      Jan 27, 2026, 2:59 PM UTC
      On Tue, Jan 27, 2026 at 12:56 AM Amit Kapila <akapila@postgresql.org> wrote:
      Prevent invalidation of newly synced replication slots.
      This commit has broken CI for me. On the "Windows - Server 2022, VS
      2019 - Meson & ninja" build, the following shows up in
046_checkpoint_logical_slot_standby.log:
2026-01-27 13:44:44.421 GMT startup[5172] FATAL: could not rename
file "backup_label" to "backup_label.old": Permission denied
      I imagine this is going to break CI for everybody else too, as well as cfbot.
      --
      Robert Haas
      EDB: http://www.enterprisedb.com
• Amit Kapila <amit.kapila16@gmail.com>
        Jan 28, 2026, 4:34 AM UTC
        On Tue, Jan 27, 2026 at 8:29 PM Robert Haas <robertmhaas@gmail.com> wrote:
        On Tue, Jan 27, 2026 at 12:56 AM Amit Kapila <akapila@postgresql.org> wrote:
        Prevent invalidation of newly synced replication slots.

        This commit has broken CI for me. On the "Windows - Server 2022, VS
        2019 - Meson & ninja" build, the following shows up in
046_checkpoint_logical_slot_standby.log:

2026-01-27 13:44:44.421 GMT startup[5172] FATAL: could not rename
file "backup_label" to "backup_label.old": Permission denied

        I imagine this is going to break CI for everybody else too, as well as cfbot.
        I'll try to reproduce and look into it.
        --
        With Regards,
        Amit Kapila.
• Thomas Munro <thomas.munro@gmail.com>
        Jan 27, 2026, 4:16 PM UTC
        On Wed, Jan 28, 2026 at 3:59 AM Robert Haas <robertmhaas@gmail.com> wrote:
        I imagine this is going to break CI for everybody else too, as well as cfbot.
        Just by the way, on that last point, we trained cfbot to watch out for
        CI pass/fail in this account:
        https://github.com/postgres/postgres/commits/master/
        and then use the most recent pass as the base commit when applying
        patches to make test branches. So if master is broken for a while, it
        no longer takes all the cfbot runs with it. Mentioning just in case
        anyone is confused by that...
        As for what's happening... hmm, there are a few holes in the "shared
        locking" stuff you get with the flags we use. For example you can't
        unlink a directory that contains a file that has been unlinked but
        someone still holds open. Doesn't seem to be the case here. But I
        wonder if you can't rename("old", "new") where "new" is a file that
        has already been unlinked (or renamed over) that someone still holds
        open, or something like that...
• Andres Freund <andres@anarazel.de>
          Jan 27, 2026, 4:38 PM UTC
          Hi,
          On 2026-01-28 05:16:13 +1300, Thomas Munro wrote:
          On Wed, Jan 28, 2026 at 3:59 AM Robert Haas <robertmhaas@gmail.com> wrote:
          I imagine this is going to break CI for everybody else too, as well as cfbot.

          Just by the way, on that last point, we trained cfbot to watch out for
          CI pass/fail in this account:

          https://github.com/postgres/postgres/commits/master/

          and then use the most recent pass as the base commit when applying
          patches to make test branches. So if master is broken for a while, it
          no longer takes all the cfbot runs with it. Mentioning just in case
          anyone is confused by that...
          Ah. I was indeed confused by that for a bit.
          But I wonder if you can't rename("old", "new") where "new" is a file that
          has already been unlinked (or renamed over) that someone still holds open,
          or something like that...
I don't see a source of that which would be specific to this test, though :(. We
do wait for pg_basebackup to have shut down, which wrote backup_label (which
was "manufactured" during streaming by basebackup.c).
          Perhaps we should crank up log level in the test? No idea if it'll help, but
          right now I don't even know where to start looking.
          Greetings,
          Andres Freund
• Thomas Munro <thomas.munro@gmail.com>
            Jan 28, 2026, 7:23 AM UTC
            On Tue, Jan 27, 2026 at 5:37 PM Andres Freund <andres@anarazel.de> wrote:
            On 2026-01-28 05:16:13 +1300, Thomas Munro wrote:
            But I wonder if you can't rename("old", "new") where "new" is a file that
            has already been unlinked (or renamed over) that someone still holds open,
            or something like that...

I don't see a source of that which would be specific to this test, though :(. We
do wait for pg_basebackup to have shut down, which wrote backup_label (which
was "manufactured" during streaming by basebackup.c).
            I have no specific ideas, but just in case it's helpful for this
            discussion, I looked at my old test suite[1] where I tried to
            catalogue all the edge conditions around this sort of stuff
            empirically, and saw that rename() always fails like that if the file
            is open (that is, it doesn't require a more complicated sequence with
            an earlier unlink/rename of the new name):
+	/*
+	 * Windows can't rename over an open non-unlinked file, even with
+	 * have_posix_unlink_semantics.
+	 */
+	pg_win32_dirmod_loops = 2;	/* minimize looping to fail fast in testing */
+	PG_EXPECT_SYS(rename(path, path2) == -1,
+		"Windows: can't rename name1.txt -> name2.txt while name2.txt is open");
+	PG_EXPECT_EQ(errno, EACCES);
            [1] https://www.postgresql.org/message-id/flat/CA%2BhUKG%2BajSQ_8eu2AogTncOnZ5me2D-Cn66iN_-wZnRjLN%2Bicg%40mail.gmail.com
• Robert Haas <robertmhaas@gmail.com>
            Jan 27, 2026, 5:43 PM UTC
            On Tue, Jan 27, 2026 at 11:37 AM Andres Freund <andres@anarazel.de> wrote:
            But I wonder if you can't rename("old", "new") where "new" is a file that
            has already been unlinked (or renamed over) that someone still holds open,
            or something like that...

            I don't see a source of that that would be specific to this test though :(. We
            do wait for pg_basebackup to have shut down, which wrote backup.label (which
            was "manifactured" during streaming by basebackup.c).

            Perhaps we should crank up log level in the test? No idea if it'll help, but
            right now I don't even know where to start looking.
            I tried sticking a pg_sleep(30) in just before starting the standby
            node, and that didn't help, so it doesn't seem like it's a race
            condition.
Here's what the standby log file looks like with log_min_messages=DEBUG2:
            2026-01-27 17:19:25.262 GMT postmaster[4932] DEBUG: registering
            background worker "logical replication launcher"
            2026-01-27 17:19:25.264 GMT postmaster[4932] DEBUG: dynamic shared
            memory system will support 229 segments
            2026-01-27 17:19:25.264 GMT postmaster[4932] DEBUG: created dynamic
            shared memory control segment 3769552926 (9176 bytes)
2026-01-27 17:19:25.266 GMT postmaster[4932] DEBUG: max_safe_fds =
990, usable_fds = 1000, already_open = 3
            2026-01-27 17:19:25.268 GMT postmaster[4932] LOG: starting PostgreSQL
            19devel on x86_64-windows, compiled by msvc-19.29.30159, 64-bit
            2026-01-27 17:19:25.271 GMT postmaster[4932] LOG: listening on Unix
            socket "C:/Windows/TEMP/3xesO1s4ba/.s.PGSQL.17575"
2026-01-27 17:19:25.273 GMT postmaster[4932] DEBUG: updating PMState
from PM_INIT to PM_STARTUP
            2026-01-27 17:19:25.273 GMT postmaster[4932] DEBUG: assigned pm child
            slot 57 for io worker
            2026-01-27 17:19:25.275 GMT postmaster[4932] DEBUG: assigned pm child
            slot 58 for io worker
            2026-01-27 17:19:25.277 GMT postmaster[4932] DEBUG: assigned pm child
            slot 59 for io worker
            2026-01-27 17:19:25.278 GMT postmaster[4932] DEBUG: assigned pm child
            slot 56 for checkpointer
            2026-01-27 17:19:25.280 GMT postmaster[4932] DEBUG: assigned pm child
            slot 55 for background writer
            2026-01-27 17:19:25.281 GMT postmaster[4932] DEBUG: assigned pm child
            slot 89 for startup
            2026-01-27 17:19:25.308 GMT checkpointer[6560] DEBUG: checkpointer
            updated shared memory configuration values
            2026-01-27 17:19:25.314 GMT startup[2488] LOG: database system was
            interrupted; last known up at 2026-01-27 17:19:21 GMT
            2026-01-27 17:19:25.317 GMT startup[2488] DEBUG: removing all
            temporary WAL segments
            The system cannot find the file specified.
            2026-01-27 17:19:25.336 GMT startup[2488] DEBUG: could not restore
            file "00000002.history" from archive: child process exited with exit
            code 1
            2026-01-27 17:19:25.337 GMT startup[2488] DEBUG: backup time
            2026-01-27 17:19:21 GMT in file "backup_label"
2026-01-27 17:19:25.337 GMT startup[2488] DEBUG: backup label
pg_basebackup base backup in file "backup_label"
            2026-01-27 17:19:25.337 GMT startup[2488] DEBUG: backup timeline 1 in
            file "backup_label"
            2026-01-27 17:19:25.337 GMT startup[2488] LOG: starting backup
            recovery with redo LSN 0/2A000028, checkpoint LSN 0/2A000080, on
            timeline ID 1
            The system cannot find the file specified.
            2026-01-27 17:19:25.352 GMT startup[2488] DEBUG: could not restore
            file "00000001000000000000002A" from archive: child process exited
            with exit code 1
            2026-01-27 17:19:25.353 GMT startup[2488] DEBUG: checkpoint record is
            at 0/2A000080
            2026-01-27 17:19:25.353 GMT startup[2488] LOG: entering standby mode
            2026-01-27 17:19:25.353 GMT startup[2488] DEBUG: redo record is at
            0/2A000028; shutdown false
            2026-01-27 17:19:25.353 GMT startup[2488] DEBUG: next transaction ID:
            769; next OID: 24576
            2026-01-27 17:19:25.353 GMT startup[2488] DEBUG: next MultiXactId: 1;
            next MultiXactOffset: 1
            2026-01-27 17:19:25.353 GMT startup[2488] DEBUG: oldest unfrozen
            transaction ID: 760, in database 1
            2026-01-27 17:19:25.353 GMT startup[2488] DEBUG: oldest MultiXactId:
            1, in database 1
            2026-01-27 17:19:25.353 GMT startup[2488] DEBUG: commit timestamp Xid
            oldest/newest: 0/0
            2026-01-27 17:19:25.353 GMT startup[2488] DEBUG: transaction ID wrap
            limit is 2147484407, limited by database with OID 1
            2026-01-27 17:19:25.353 GMT startup[2488] DEBUG: MultiXactId wrap
            limit is 2147483648, limited by database with OID 1
            2026-01-27 17:19:25.354 GMT startup[2488] DEBUG: starting up replication slots
            2026-01-27 17:19:25.354 GMT startup[2488] DEBUG: xmin required by
            slots: data 0, catalog 0
            2026-01-27 17:19:25.354 GMT startup[2488] DEBUG: starting up
            replication origin progress state
            2026-01-27 17:19:25.354 GMT startup[2488] DEBUG: didn't need to
            unlink permanent stats file "pg_stat/pgstat.stat" - didn't exist
2026-01-27 17:19:38.938 GMT startup[2488] FATAL: could not rename
file "backup_label" to "backup_label.old": Permission denied
            2026-01-27 17:19:38.983 GMT postmaster[4932] DEBUG: releasing pm child slot 89
            2026-01-27 17:19:38.983 GMT postmaster[4932] LOG: startup process
            (PID 2488) exited with exit code 1
            2026-01-27 17:19:38.983 GMT postmaster[4932] LOG: aborting startup
            due to startup process failure
            2026-01-27 17:19:38.983 GMT postmaster[4932] DEBUG: cleaning up
            dynamic shared memory control segment with ID 3769552926
            2026-01-27 17:19:38.985 GMT postmaster[4932] LOG: database system is shut down
            Unfortunately, I don't see any clues there. The "The system cannot
            find the file specified." messages look like they might be a clue, but
I think they are not, because they also occur in
040_standby_failover_slots_sync_standby1.log, and that test passes. At
            the point where this log file shows the FATAL error, that log file
            continues thus:
            2026-01-27 17:18:36.905 GMT startup[1420] DEBUG: resetting unlogged
            relations: cleanup 1 init 0
            2026-01-27 17:18:36.906 GMT startup[1420] DEBUG: initializing for hot standby
            2026-01-27 17:18:36.906 GMT startup[1420] LOG: redo starts at 0/02000028
            2026-01-27 17:18:36.906 GMT startup[1420] DEBUG: recovery snapshots
            are now enabled
            2026-01-27 17:18:36.906 GMT startup[1420] CONTEXT: WAL redo at
            0/02000048 for Standby/RUNNING_XACTS: nextXid 769 latestCompletedXid
            768 oldestRunningXid 769
            2026-01-27 17:18:36.907 GMT startup[1420] DEBUG: end of backup record reached
            2026-01-27 17:18:36.907 GMT startup[1420] CONTEXT: WAL redo at
            0/02000100 for XLOG/BACKUP_END: 0/02000028
            2026-01-27 17:18:36.907 GMT startup[1420] DEBUG: end of backup reached
            Which again seems totally normal.
            --
            Robert Haas
            EDB: http://www.enterprisedb.com
• Andres Freund <andres@anarazel.de>
              Jan 27, 2026, 6:17 PM UTC
              Hi,
              On 2026-01-27 12:42:51 -0500, Robert Haas wrote:
              I tried sticking a pg_sleep(30) in just before starting the standby
              node, and that didn't help, so it doesn't seem like it's a race
              condition.
              Interesting.
              It could be worth trying to run the test in isolation, without all the other
              concurrent tests.
              Greg, have you tried to repro it interactively?
Bryan, you seem to have become the resident Windows expert...
              2026-01-27 17:19:25.337 GMT startup[2488] LOG: starting backup
              recovery with redo LSN 0/2A000028, checkpoint LSN 0/2A000080, on
              timeline ID 1
              The system cannot find the file specified.
              2026-01-27 17:19:25.352 GMT startup[2488] DEBUG: could not restore
              file "00000001000000000000002A" from archive: child process exited
              with exit code 1
              I think that must be a message from "copy" (which we seem to be using for
              restore_command on windows).
              I don't know why the standby is created with has_restoring => 1. But it
              shouldn't be related to the issue, I think?
              Greetings,
              Andres Freund
• Greg Burd <greg@burd.me>
                Jan 28, 2026, 6:02 PM UTC
                On Tue, Jan 27, 2026, at 1:17 PM, Andres Freund wrote:
                Hi,

                On 2026-01-27 12:42:51 -0500, Robert Haas wrote:
                I tried sticking a pg_sleep(30) in just before starting the standby
                node, and that didn't help, so it doesn't seem like it's a race
                condition.

                Interesting.

                It could be worth trying to run the test in isolation, without all the other
                concurrent tests.

                Greg, have you tried to repro it interactively?
                Nope, not yet. I'm working on my ailing animals now and updated unicorn to include injection points.
                -greg
                Bryan, you seem to have become the resident windows expert...

                2026-01-27 17:19:25.337 GMT startup[2488] LOG: starting backup
                recovery with redo LSN 0/2A000028, checkpoint LSN 0/2A000080, on
                timeline ID 1
                The system cannot find the file specified.
                2026-01-27 17:19:25.352 GMT startup[2488] DEBUG: could not restore
                file "00000001000000000000002A" from archive: child process exited
                with exit code 1

                I think that must be a message from "copy" (which we seem to be using for
                restore_command on windows).
                I don't know why the standby is created with has_restoring => 1. But it
                shouldn't be related to the issue, I think?

                Greetings,

                Andres Freund
• Amit Kapila <amit.kapila16@gmail.com>
                Jan 28, 2026, 11:20 AM UTC
                On Tue, Jan 27, 2026 at 11:47 PM Andres Freund <andres@anarazel.de> wrote:
                I don't know why the standby is created with has_restoring => 1.
This is not required. I think it is a copy-paste oversight.
                But it
                shouldn't be related to the issue, I think?
                Yeah, tried without this as well apart from other experiments.
                --
                With Regards,
                Amit Kapila.
• Robert Haas <robertmhaas@gmail.com>
              Jan 27, 2026, 6:16 PM UTC
              On Tue, Jan 27, 2026 at 12:42 PM Robert Haas <robertmhaas@gmail.com> wrote:
              2026-01-27 17:19:25.354 GMT startup[2488] DEBUG: didn't need to
              unlink permanent stats file "pg_stat/pgstat.stat" - didn't exist
2026-01-27 17:19:38.938 GMT startup[2488] FATAL: could not rename
file "backup_label" to "backup_label.old": Permission denied
              Andrey Borodin pointed out to me off-list that there's a retry loop in
              pgrename(). The 13 second delay between the above two log messages
              almost certainly means that retry loop is iterating until it hits its
10 second timeout. This almost certainly means that the underlying
Windows error is ERROR_ACCESS_DENIED, ERROR_SHARING_VIOLATION, or
ERROR_LOCK_VIOLATION, and that somebody else has the file open. But
              nothing other than Perl touches that directory before we try to start
              the standby:
my $standby = PostgreSQL::Test::Cluster->new('standby');
$standby->init_from_backup(
    $primary, $backup_name,
    has_streaming => 1,
    has_restoring => 1);
$standby->append_conf(
    'postgresql.conf', qq(
hot_standby_feedback = on
primary_slot_name = 'phys_slot'
primary_conninfo = '$connstr1 dbname=postgres'
log_min_messages = 'debug2'
));
              $standby->start;
As far as I can see, only init_from_backup() touches the backup_label
              file, and that just copies the directory using RecursiveCopy.pm, which
              as far as I can tell is quite careful about closing file handles. So I
              still have no idea what's happening here.
              --
              Robert Haas
              EDB: http://www.enterprisedb.com
• Amit Kapila <amit.kapila16@gmail.com>
                Jan 28, 2026, 5:48 AM UTC
                On Tue, Jan 27, 2026 at 11:46 PM Robert Haas <robertmhaas@gmail.com> wrote:
                On Tue, Jan 27, 2026 at 12:42 PM Robert Haas <robertmhaas@gmail.com> wrote:
                2026-01-27 17:19:25.354 GMT startup[2488] DEBUG: didn't need to
                unlink permanent stats file "pg_stat/pgstat.stat" - didn't exist
2026-01-27 17:19:38.938 GMT startup[2488] FATAL: could not rename
file "backup_label" to "backup_label.old": Permission denied

                Andrey Borodin pointed out to me off-list that there's a retry loop in
                pgrename(). The 13 second delay between the above two log messages
                almost certainly means that retry loop is iterating until it hits its
                10 second timeout.
Yes, this is correct. I am able to reproduce it. In pgrename(), we use
the MoveFileEx() Windows API, which fails with error code 32; _dosmaperr()
maps that to errno 13 (EACCES) via the ERROR_SHARING_VIOLATION -> EACCES
entry in the doserrors table.
This almost certainly means that the underlying
Windows error is ERROR_ACCESS_DENIED, ERROR_SHARING_VIOLATION, or
ERROR_LOCK_VIOLATION, and that somebody else has the file open.
It is ERROR_SHARING_VIOLATION.
                But
                nothing other than Perl touches that directory before we try to start
                the standby:
                my $standby = PostgreSQL::Test::Cluster->new('standby');
$standby->init_from_backup(
$primary, $backup_name,
has_streaming => 1,
has_restoring => 1);
$standby->append_conf(
'postgresql.conf', qq(
hot_standby_feedback = on
primary_slot_name = 'phys_slot'
primary_conninfo = '$connstr1 dbname=postgres'
log_min_messages = 'debug2'
                ));
                $standby->start;

As far as I can see, only init_from_backup() touches the backup_label
                file, and that just copies the directory using RecursiveCopy.pm, which
                as far as I can tell is quite careful about closing file handles. So I
                still have no idea what's happening here.
It is not clear to me either why a similar test,
040_standby_failover_slots_sync, succeeds while
046_checkpoint_logical_slot fails. I am still thinking about it,
but thought of sharing the information I could gather by debugging.
                Do let me know if you could think of gathering any other information
                which can be of help here.
                --
                With Regards,
                Amit Kapila.
                • Jump to comment-1
                  Amit Kapila<amit.kapila16@gmail.com>
                  Jan 28, 2026, 10:47 AM UTC
                  On Wed, Jan 28, 2026 at 11:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

It is not clear to me either why a similar test,
040_standby_failover_slots_sync, succeeds while
046_checkpoint_logical_slot fails. I am still thinking about it,
but thought of sharing the information I could gather by debugging.
It seems some interaction with the previous test in the same file is
causing this failure, as we reuse the primary node from that test. When I
commented out get_changes and its corresponding injection_point in the
previous test (as attached), the entire test passed. I think this test
would pass with a freshly created primary node, but I wanted to spend
some more time to see how/why the previous test causes this issue.
                  --
                  With Regards,
                  Amit Kapila.
• Amit Kapila <amit.kapila16@gmail.com>
                    Jan 28, 2026, 12:35 PM UTC
                    On Wed, Jan 28, 2026 at 4:16 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
                    On Wed, Jan 28, 2026 at 11:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote:

It is not clear to me either why a similar test,
040_standby_failover_slots_sync, succeeds while
046_checkpoint_logical_slot fails. I am still thinking about it,
but thought of sharing the information I could gather by debugging.

It seems some interaction with the previous test in the same file is
causing this failure, as we reuse the primary node from that test. When I
commented out get_changes and its corresponding injection_point in the
previous test (as attached), the entire test passed. I think this test
would pass with a freshly created primary node, but I wanted to spend
some more time to see how/why the previous test causes this issue.
I noticed that the previous test didn't quit the background psql
session used for the concurrent checkpoint. After quitting that background
session, the test passed for me consistently. See attached. It is
written in the comments atop background_psql: "Be sure to "quit" the
returned object when done with it.". Now, this background session
doesn't directly access the backup_label file, but it could be
accessing one of the parent directories where backup_label is present.
One gen-AI response says: "In Windows, MoveFileEx (Error 32:
ERROR_SHARING_VIOLATION) can fail if a process is accessing the file's
parent directory in a way that creates a lock. While the error message
usually points to the file itself, the parent folder is a critical
part of the operation.". I admit that I don't know the internals of
MoveFileEx, so I can't say with complete conviction, but the attached
sounds like a reasonable fix. Can anyone else who can reproduce the
issue test the attached patch and share the results?
Does this fix/theory sound plausible?
                    --
                    With Regards,
                    Amit Kapila.
• Andres Freund <andres@anarazel.de>
                      Jan 28, 2026, 4:54 PM UTC
                      Hi,
                      On 2026-01-28 18:05:10 +0530, Amit Kapila wrote:
I noticed that the previous test didn't quit the background psql
session used for the concurrent checkpoint. After quitting that background
session, the test passed for me consistently. See attached. It is
written in the comments atop background_psql: "Be sure to "quit" the
returned object when done with it.". Now, this background session
doesn't directly access the backup_label file, but it could be
accessing one of the parent directories where backup_label is present.
                      Hm. I've seen (and complained about [1]) weird errors when not shutting down
                      IPC::Run processes - mostly the test hanging at the end though.
One gen-AI response says: "In Windows, MoveFileEx (Error 32:
ERROR_SHARING_VIOLATION) can fail if a process is accessing the file's
parent directory in a way that creates a lock. While the error message
usually points to the file itself, the parent folder is a critical
part of the operation.".
I don't see how that could be the plausible reason - after all, we have a lot
of other files open in the relevant directories. But: it seems to fix
the problem for you, so it's worth going for it, as it's the right thing to do
anyway.
                      I think it'd be worth, separately from committing the workaround, trying to
                      figure out what's holding the file open. Andrey observed that the tests pass
                      for him with a much longer timeout. If you can reproduce it locally, I'd try
                      to use something like [2] to see what has handles open to the relevant files,
                      while waiting for the timeout.
                      Greetings,
                      Andres Freund
                      [1] https://postgr.es/m/20240619030727.ldp3mcrjbd5fqwj5%40awork3.anarazel.de
                      [2] https://learn.microsoft.com/en-us/sysinternals/downloads/handle
• Amit Kapila <amit.kapila16@gmail.com>
                        Jan 29, 2026, 1:36 PM UTC
                        On Wed, Jan 28, 2026 at 10:24 PM Andres Freund <andres@anarazel.de> wrote:

                        I think it'd be worth, separately from committing the workaround, trying to
                        figure out what's holding the file open. Andrey observed that the tests pass
                        for him with a much longer timeout. If you can reproduce it locally, I'd try
                        to use something like [2] to see what has handles open to the relevant files,
                        while waiting for the timeout.
                        Thanks for the suggestion. I did some experiments by using handle.exe
                        and below are the results. To get the results, I added a long sleep
                        before rename of backup_label file.
                        After Fix:
                        ==========
                         handle.exe D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                        Nthandle v5.0 - Handle viewer
                        Copyright (C) 1997-2022 Mark Russinovich
                        Sysinternals - www.sysinternals.com
                        No matching handles found.
                        Before Fix:
                        ==========
                         handle.exe D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                        Nthandle v5.0 - Handle viewer
                        Copyright (C) 1997-2022 Mark Russinovich
                        Sysinternals - www.sysinternals.com
                         perl.exe pid: 33784 type: File 30C:
                         D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                         pg_ctl.exe pid: 51236 type: File 30C:
                         D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                         cmd.exe pid: 35332 type: File 30C:
                         D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                         postgres.exe pid: 48200 type: File 30C:
                         D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                         postgres.exe pid: 7420 type: File 30C:
                         D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                         postgres.exe pid: 17160 type: File 30C:
                         D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                         postgres.exe pid: 56192 type: File 30C:
                         D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                         postgres.exe pid: 53892 type: File 30C:
                         D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                         postgres.exe pid: 44732 type: File 30C:
                         D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                         postgres.exe pid: 43488 type: File 30C:
                         D:\Workspace\Postgresql\head\postgresql\build\testrun\recovery\046_checkpoint_logical_slot\data\t_046_checkpoint_logical_slot_standby_data\pgdata\backup_label
                        All the shown postgres processes are various standby processes. Below
                        are details of each postgres process:
                         43488: startup process
                         XLogCtl->SharedRecoveryState RECOVERY_STATE_ARCHIVE (1)
                         44732: bgwriter:
                         XLogCtl->SharedRecoveryState RECOVERY_STATE_ARCHIVE (1)
                         53892: checkpointer
                         XLogCtl->SharedRecoveryState RECOVERY_STATE_ARCHIVE (1)
                         56192: aio-worker
                         XLogCtl->SharedRecoveryState RECOVERY_STATE_ARCHIVE (1)
                         17160: aio-worker
                         XLogCtl->SharedRecoveryState RECOVERY_STATE_ARCHIVE (1)
                         7420: aio-worker
                         XLogCtl->SharedRecoveryState RECOVERY_STATE_ARCHIVE (1)
                         48200: postmaster
                         XLogCtl->SharedRecoveryState RECOVERY_STATE_ARCHIVE (1)
                        I printed XLogCtl->SharedRecoveryState to show all are standby processes.
                         The results are a bit strange in the sense that some unfinished psql
                         sessions on the primary could lead to standby processes being shown
                         in the results of handle.exe.
                        Note: I have access to this environment till tomorrow noon, so I can
                        try to investigate a bit tomorrow if there are more questions related
                        to the above experiment.
                        --
                        With Regards,
                        Amit Kapila.
                      • Jump to comment-1
                        Andrey Borodin<x4mmm@yandex-team.ru>
                        Jan 28, 2026, 6:09 PM UTC
                        On 28 Jan 2026, at 21:53, Andres Freund <andres@anarazel.de> wrote:

                        Andrey observed that the tests pass
                        for him with a much longer timeout.
                        Unfortunately, I was wrong. The job "Windows - Server 2022, MinGW64 - Meson" which failed yesterday did not fail today.
                         But it did not succeed either. CirrusCI seems to have simply not run it. I do not understand why.
                         Anyway, I cannot prove that it is a race condition. On the contrary, the test fails deterministically with any big timeout (pg_ctl will bail out).
                        Best regards, Andrey Borodin.
                    • Jump to comment-1
                      Robert Haas<robertmhaas@gmail.com>
                      Jan 28, 2026, 12:58 PM UTC
                      On Wed, Jan 28, 2026 at 7:35 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
                      Does this fix/theory sound plausible?
                      I wondered about this yesterday, too. I didn't actually understand how
                      the existence of the background psql could be causing the failure, but
                      I thought it might be. However, I couldn't figure out the correct
                      incantation to get rid of it in my testing, as I thought I would need
                      to detach the injection point first or something.
                      If it fixes it for you, I would suggest committing promptly. I think
                      we are too dependent on CI now to leave it broken for any period of
                      time, and indeed I suggest getting set up so that you test your
                      commits against it before committing.
                      --
                      Robert Haas
                      EDB: http://www.enterprisedb.com
                      • Jump to comment-1
                        Amit Kapila<amit.kapila16@gmail.com>
                        Jan 28, 2026, 3:01 PM UTC
                        On Wed, Jan 28, 2026 at 6:28 PM Robert Haas <robertmhaas@gmail.com> wrote:
                        On Wed, Jan 28, 2026 at 7:35 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
                        Does this fix/theory sound plausible?

                        I wondered about this yesterday, too. I didn't actually understand how
                        the existence of the background psql could be causing the failure, but
                        I thought it might be. However, I couldn't figure out the correct
                        incantation to get rid of it in my testing, as I thought I would need
                        to detach the injection point first or something.
                         Yeah, it would be better to quit these sessions after the test is
                         complete because there are two other background sessions as well. I
                         used the same method to quit these sessions as is used in
                         src\test\modules\test_misc\t\005_timeouts.pl. The attached passes for
                         me on both Linux and Windows (checked on HEAD only as of now). I'll do
                         some more testing on back branches as well and push tomorrow morning
                         if there are no more comments.
                        --
                        With Regards,
                        Amit Kapila.
                • Jump to comment-1
                  Andrey Borodin<x4mmm@yandex-team.ru>
                  Jan 28, 2026, 10:45 AM UTC
                  On 28 Jan 2026, at 10:47, Amit Kapila <amit.kapila16@gmail.com> wrote:

                  Do let me know if you could think of gathering any other information
                  which can be of help here.
                  Interestingly, increasing timeout in pgrename() to 500 seconds fixes "Windows - Server 2022, VS 2019 - Meson & ninja ", but does not fix "Windows - Server 2022, VS 2019 - Meson & ninja".
                  diff --git a/src/port/dirmod.c b/src/port/dirmod.c
                  index 467b50d6f09..da38e37aa45 100644
                  --- a/src/port/dirmod.c
                  +++ b/src/port/dirmod.c
                   @@ -88,7 +88,7 @@ pgrename(const char *from, const char *to)
                                      return -1;
                  #endif
                   - if (++loops > 100)		/* time out after 10 sec */
                   + if (++loops > 5000)		/* time out after 10 sec */
                                      return -1;
                              pg_usleep(100000);              /* us */
                      }
                  Best regards, Andrey Borodin.
      • Jump to comment-1
        Tom Lane<tgl@sss.pgh.pa.us>
        Jan 27, 2026, 3:11 PM UTC
        Robert Haas <robertmhaas@gmail.com> writes:
        On Tue, Jan 27, 2026 at 12:56 AM Amit Kapila <akapila@postgresql.org> wrote:
        Prevent invalidation of newly synced replication slots.
        This commit has broken CI for me.
        Hmm, I wonder why the buildfarm seems fine with it ... I'm prepared
        to believe a Windows-only problem, but at least hamerkop has run
        since 851f664.
        		regards, tom lane
        • Jump to comment-1
          Robert Haas<robertmhaas@gmail.com>
          Jan 27, 2026, 3:52 PM UTC
          On Tue, Jan 27, 2026 at 10:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
          Robert Haas <robertmhaas@gmail.com> writes:
          On Tue, Jan 27, 2026 at 12:56 AM Amit Kapila <akapila@postgresql.org> wrote:
          Prevent invalidation of newly synced replication slots.
          This commit has broken CI for me.

          Hmm, I wonder why the buildfarm seems fine with it ... I'm prepared
          to believe a Windows-only problem, but at least hamerkop has run
          since 851f664.
          I don't understand it, either. There's a bunch of error codes that we
          map to EACCES in _dosmaperr, but I don't know why any of those
          problems would have occurred here:
          ERROR_ACCESS_DENIED, EACCES
          ERROR_CURRENT_DIRECTORY, EACCES
          ERROR_LOCK_VIOLATION, EACCES
          ERROR_SHARING_VIOLATION, EACCES
          ERROR_NETWORK_ACCESS_DENIED, EACCES
          ERROR_CANNOT_MAKE, EACCES
          ERROR_FAIL_I24, EACCES
          ERROR_DRIVE_LOCKED, EACCES
          ERROR_SEEK_ON_DEVICE, EACCES
          ERROR_NOT_LOCKED, EACCES
          ERROR_LOCK_FAILED, EACCES
          (Side note: Wouldn't it make a lot of sense to go back and kill
          _dosmaperr in favor of displaying the actual Windows error code string?)
          What's also puzzling is that what this test is doing seems to be
          totally standard. 040_standby_failover_slots_sync.pl does this:
          my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
          $standby1->init_from_backup(
              $primary, $backup_name,
              has_streaming => 1,
              has_restoring => 1);
          And 046_checkpoint_logical_slot.pl does this:
          my $standby = PostgreSQL::Test::Cluster->new('standby');
          $standby->init_from_backup(
          $primary, $backup_name,
          has_streaming => 1,
          has_restoring => 1);
          So why is 046 failing and 040 is fine? I have no idea.
          --
          Robert Haas
          EDB: http://www.enterprisedb.com
          • Jump to comment-1
            Tom Lane<tgl@sss.pgh.pa.us>
            Jan 27, 2026, 4:11 PM UTC
            Robert Haas <robertmhaas@gmail.com> writes:
            What's also puzzling is that what this test is doing seems to be
            totally standard.
            Yeah. I do notice something interesting when running it here:
             046_checkpoint_logical_slot_mike.log shows that we are triggering
             quite a few checkpoints (via pg_switch_wal()) in quick succession
            on the primary. I wonder if that is somehow tickling a Windows
            filesystem restriction.
            		regards, tom lane
            • Jump to comment-1
              Robert Haas<robertmhaas@gmail.com>
              Jan 27, 2026, 4:18 PM UTC
              On Tue, Jan 27, 2026 at 11:11 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
              Robert Haas <robertmhaas@gmail.com> writes:
              What's also puzzling is that what this test is doing seems to be
              totally standard.

              Yeah. I do notice something interesting when running it here:
               046_checkpoint_logical_slot_mike.log shows that we are triggering
               quite a few checkpoints (via pg_switch_wal()) in quick succession
              on the primary. I wonder if that is somehow tickling a Windows
              filesystem restriction.
              Maybe, but it seems unlikely to me that this would mess up the
              standby, since it's a totally different node. What I kind of wonder is
              if somehow there's still a process that has backup_label open, or has
              closed it but not recently enough for Windows to unlock it. However, I
              don't see why that would affect this test case and not others.
              --
              Robert Haas
              EDB: http://www.enterprisedb.com
          • Jump to comment-1
            Andres Freund<andres@anarazel.de>
            Jan 27, 2026, 4:17 PM UTC
            Hi,
            On 2026-01-27 10:51:58 -0500, Robert Haas wrote:
            I don't understand it, either. There's a bunch of error codes that we
            map to EACCES in _dosmaperr, but I don't know why any of those
            problems would have occurred here:

             ERROR_ACCESS_DENIED, EACCES
             ERROR_CURRENT_DIRECTORY, EACCES
             ERROR_LOCK_VIOLATION, EACCES
             ERROR_SHARING_VIOLATION, EACCES
             ERROR_NETWORK_ACCESS_DENIED, EACCES
             ERROR_CANNOT_MAKE, EACCES
             ERROR_FAIL_I24, EACCES
             ERROR_DRIVE_LOCKED, EACCES
             ERROR_SEEK_ON_DEVICE, EACCES
             ERROR_NOT_LOCKED, EACCES
             ERROR_LOCK_FAILED, EACCES

             (Side note: Wouldn't it make a lot of sense to go back and kill
             _dosmaperr in favor of displaying the actual Windows error code string?)
            It'd be great to somehow preserve the mapping to preserve the original error
            message, but I don't really see how we could just give up on our mapping. We
            rely on e.g. knowing that a read failed due to ENOENT, not
             ERROR_FILE_NOT_FOUND or whatnot.
            What's also puzzling is that what this test is doing seems to be
             totally standard. 040_standby_failover_slots_sync.pl does this:
            my $standby1 = PostgreSQL::Test::Cluster->new('standby1');
             $standby1->init_from_backup(
            $primary, $backup_name,
            has_streaming => 1,
            has_restoring => 1);

             And 046_checkpoint_logical_slot.pl does this:
            my $standby = PostgreSQL::Test::Cluster->new('standby');
             $standby->init_from_backup(
            $primary, $backup_name,
            has_streaming => 1,
            has_restoring => 1);

            So why is 046 failing and 040 is fine? I have no idea.
            046 does a fair bit of stuff before the base backup is being taken, I guess?
            But what that concretely could be, I have no idea.
            It'd be one thing if it failed while creating a base backup, but the fact that
            it allows the base backup being created, but then fails during startup is just
            plain odd. The typical sharing violation issue seems like it'd require that
            we somehow are not waiting for pg_basebackup to actually have terminated?
            Greetings,
            Andres Freund
        • Jump to comment-1
          Tom Lane<tgl@sss.pgh.pa.us>
          Jan 27, 2026, 3:49 PM UTC
          I wrote:
          Robert Haas <robertmhaas@gmail.com> writes:
          This commit has broken CI for me.
          Hmm, I wonder why the buildfarm seems fine with it ... I'm prepared
          to believe a Windows-only problem, but at least hamerkop has run
          since 851f664.
          D'oh: hamerkop doesn't run any TAP tests, let alone ones that require
          --enable-injection-points. So that success proves nothing.
          Our other Windows animals (drongo, fairywren, unicorn) seem to be
          configured with -Dtap_tests=enabled, but nothing about injection
          points, so they will also skip 046_checkpoint_logical_slot.
          Seems like a bit of a blind spot in the buildfarm.
          		regards, tom lane
          • Jump to comment-1
            Greg Burd<greg@burd.me>
            Jan 27, 2026, 4:53 PM UTC
            On Jan 27, 2026, at 10:49 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

            I wrote:
            Robert Haas <robertmhaas@gmail.com> writes:
            This commit has broken CI for me.
            Hmm, I wonder why the buildfarm seems fine with it ... I'm prepared
            to believe a Windows-only problem, but at least hamerkop has run
            since 851f664.

            D'oh: hamerkop doesn't run any TAP tests, let alone ones that require
            --enable-injection-points. So that success proves nothing.

            Our other Windows animals (drongo, fairywren, unicorn) seem to be
            configured with -Dtap_tests=enabled, but nothing about injection
            points, so they will also skip 046checkpointlogical_slot.
            Seems like a bit of a blind spot in the buildfarm.

            regards, tom lane
            I'll see if I can update unicorn today to enable injection points to add some coverage on Win11/ARM64/MSVC. No promises that will be diagnostic at all, but it seems like a good idea.
            -Dinjection_points=true
            -greg
            • Jump to comment-1
              Robert Haas<robertmhaas@gmail.com>
              Jan 27, 2026, 5:11 PM UTC
              On Tue, Jan 27, 2026 at 11:53 AM Greg Burd <greg@burd.me> wrote:
              I'll see if I can update unicorn today to enable injection points to add some coverage on Win11/ARM64/MSVC. No promises that will be diagnostic at all, but it seems like a good idea.
              -Dinjection_points=true
              Sounds good!
              Thanks,
              --
              Robert Haas
              EDB: http://www.enterprisedb.com