BUG: Cascading standby fails to reconnect after falling back to archive recovery

    Marco Nenciarini<marco.nenciarini@enterprisedb.com>
    Jan 28, 2026, 5:03 PM UTC
    Hi hackers,
    I've encountered a bug in PostgreSQL's streaming replication where cascading
    standbys fail to reconnect after falling back to archive recovery. The issue
    occurs when the upstream standby uses archive-only recovery.
    The standby requests streaming from the wrong WAL position (the next
    segment boundary instead of the current position), causing connection
    failures with this error:
    ERROR: requested starting point 0/A000000 is ahead of the WAL flush
    position of this server 0/9000000
    Attached are two shell scripts that reliably reproduce the issue on
    PostgreSQL 17.x and 18.x:
    1. reproducerrestartupstream_portable.sh - triggers by restarting the upstream
    2. reproducercascaderestart_portable.sh - triggers by restarting the cascade
    The scripts set up this topology:
    - Primary with archiving enabled
    - Standby using only archive recovery (no streaming from primary)
    - Cascading standby streaming from the archive-only standby
    When the cascade loses its streaming connection and falls back to archive
    recovery, it cannot reconnect. The issue appears to be in xlogrecovery.c
    around line 3880, where the position passed to RequestXLogStreaming()
    determines which segment boundary is requested.
    The cascade restart reproducer shows that even restarting the cascade itself
    triggers the bug, which affects routine maintenance operations.
    Scripts require PostgreSQL binaries in PATH and use ports 15432-15434.
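    The mismatch in the error message can be illustrated with a minimal
    standalone sketch (not PostgreSQL source; segment_start and the variable
    names are hypothetical). With the default 16MB WAL segments, a restart
    position that has already crossed into the next segment is ahead of an
    upstream whose flush position is still in the previous one:

    ```c
    /* Standalone illustration (not PostgreSQL code) of how a restart
     * position aligned to the next 16MB WAL segment boundary ends up
     * ahead of the upstream's flush position. The constants mirror the
     * LSNs in the error message above. */
    #include <assert.h>
    #include <stdint.h>
    #include <stdio.h>

    #define WAL_SEGMENT_SIZE (16 * 1024 * 1024)  /* default 16MB segments */

    /* Round an LSN down to the start of its containing segment. */
    static uint64_t segment_start(uint64_t lsn)
    {
        return lsn - (lsn % WAL_SEGMENT_SIZE);
    }

    int main(void)
    {
        uint64_t upstream_flush = 0x9000000;  /* 0/9000000 */
        uint64_t cascade_read   = 0xA000000;  /* 0/A000000: next segment */

        /* The cascade restored the next segment from its own archive,
         * so its read position is segment-aligned at 0/A000000... */
        assert(segment_start(cascade_read) == 0xA000000);

        /* ...which is ahead of what the upstream has flushed, so the
         * upstream walsender rejects the streaming request. */
        assert(segment_start(cascade_read) > upstream_flush);
        printf("requested 0/%lX is ahead of flush 0/%lX\n",
               (unsigned long) segment_start(cascade_read),
               (unsigned long) upstream_flush);
        return 0;
    }
    ```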
    Best regards,
    Marco
      Xuneng Zhou<xunengzhou@gmail.com>
      Jan 29, 2026, 12:22 PM UTC
      Hi Marco,
      On Thu, Jan 29, 2026 at 1:03 AM Marco Nenciarini
      <marco.nenciarini@enterprisedb.com> wrote:

      Hi hackers,

      I've encountered a bug in PostgreSQL's streaming replication where cascading
      standbys fail to reconnect after falling back to archive recovery. The issue
      occurs when the upstream standby uses archive-only recovery.

      The standby requests streaming from the wrong WAL position (next segment boundary
      instead of the current position), causing connection failures with this error:

      ERROR: requested starting point 0/A000000 is ahead of the WAL flush
      position of this server 0/9000000

      Attached are two shell scripts that reliably reproduce the issue on PostgreSQL
      17.x and 18.x:

      1. reproducerrestartupstream_portable.sh - triggers by restarting upstream
      2. reproducercascaderestart_portable.sh - triggers by restarting the cascade

      The scripts set up this topology:
      - Primary with archiving enabled
      - Standby using only archive recovery (no streaming from primary)
      - Cascading standby streaming from the archive-only standby

      When the cascade loses its streaming connection and falls back to archive recovery,
      it cannot reconnect. The issue appears to be in xlogrecovery.c around line 3880,
      where the position passed to RequestXLogStreaming() determines which segment
      boundary is requested.

      The cascade restart reproducer shows that even restarting the cascade itself
      triggers the bug, which affects routine maintenance operations.

      Scripts require PostgreSQL binaries in PATH and use ports 15432-15434.

      Best regards,
      Marco
      Thanks for your report. I can reliably reproduce the issue on HEAD
      using your scripts. I’ve analyzed the problem and am proposing a patch
      to fix it.
      --- Analysis
      When a cascading standby streams from an archive-only upstream:
      1. The upstream's GetStandbyFlushRecPtr() returns only replay position
      (no received-but-not-replayed buffer since there's no walreceiver)
      2. When streaming ends and the cascade falls back to archive recovery,
      it can restore WAL segments from its own archive access
      3. The cascade's read position (RecPtr) advances beyond what the
      upstream has replayed
      4. On reconnect, the cascade requests streaming from RecPtr, which the
      upstream rejects as "ahead of flush position"
      --- Proposed Fix
      Track the last confirmed flush position from streaming
      (lastStreamedFlush) and clamp the streaming start request when it
      exceeds that position:
      - Same timeline: clamp to lastStreamedFlush if RecPtr > lastStreamedFlush
      - Timeline switch: fall back to timeline switchpoint as safe boundary
      This ensures the cascade requests from a position the upstream
      definitely has, rather than assuming the upstream can serve whatever
      the cascade restored locally from archive.
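      The clamping rule above might be sketched roughly as follows (a
      standalone illustration of the idea, not the actual patch; the names
      choose_stream_start, StreamState, and switchpoint are invented here,
      and XLogRecPtr is modeled as a plain 64-bit integer):

      ```c
      /* Hypothetical sketch of the proposed clamping logic. */
      #include <assert.h>
      #include <stdbool.h>
      #include <stdint.h>

      typedef uint64_t XLogRecPtr;

      typedef struct
      {
          XLogRecPtr lastStreamedFlush;  /* last flush position confirmed
                                          * while streaming from upstream */
          XLogRecPtr switchpoint;        /* timeline switch point, if any */
      } StreamState;

      /* Pick the position to hand to RequestXLogStreaming(). */
      static XLogRecPtr
      choose_stream_start(XLogRecPtr RecPtr, const StreamState *st,
                          bool timeline_switch)
      {
          if (timeline_switch)
              return st->switchpoint;       /* safe boundary across timelines */
          if (st->lastStreamedFlush != 0 && RecPtr > st->lastStreamedFlush)
              return st->lastStreamedFlush; /* don't ask for WAL the upstream
                                             * may not have flushed yet */
          return RecPtr;
      }
      ```

      The point of the sketch is only the decision rule: the request is
      capped at the last position the upstream actually confirmed, instead
      of whatever the cascade managed to restore locally from archive.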
      I’m not a fan of using sleep in TAP tests, but I haven’t found a
      better way to reproduce this behavior yet.
      --
      Best,
      Xuneng
        Fujii Masao<masao.fujii@gmail.com>
        Jan 30, 2026, 3:13 AM UTC
        On Thu, Jan 29, 2026 at 9:22 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
        Thanks for your report. I can reliably reproduce the issue on HEAD
        using your scripts. I’ve analyzed the problem and am proposing a patch
        to fix it.

        --- Analysis
        When a cascading standby streams from an archive-only upstream:

        1. The upstream's GetStandbyFlushRecPtr() returns only replay position
        (no received-but-not-replayed buffer since there's no walreceiver)
        2. When streaming ends and the cascade falls back to archive recovery,
        it can restore WAL segments from its own archive access
        3. The cascade's read position (RecPtr) advances beyond what the
        upstream has replayed
        4. On reconnect, the cascade requests streaming from RecPtr, which the
        upstream rejects as "ahead of flush position"

        --- Proposed Fix

        Track the last confirmed flush position from streaming
        (lastStreamedFlush) and clamp the streaming start request when it
        exceeds that position:
        I haven't read the patch yet, but doesn't lastStreamedFlush represent
        the same LSN as tliRecPtr or replayLSN (the arguments to
        WaitForWALToBecomeAvailable())? If so, we may not need to introduce
        a new variable to track this LSN.
        The choice of which LSN is used as the replication start point has varied
        over time to handle corner cases (for example, commit 06687198018).
        That makes me wonder whether we should first better understand
        why WaitForWALToBecomeAvailable() currently uses RecPtr as
        the starting point.
        BTW, even with the v1 patch applied, I was able to reproduce the issue
        using the following steps:
        --------------------------------------------
        initdb -D data
        mkdir arch
        cat <<EOF >> data/postgresql.conf
        archive_mode = on
        archive_command = 'cp %p ../arch/%f'
        restore_command = 'cp ../arch/%f %p'
        EOF
        pg_ctl -D data start
        pg_basebackup -D sby1 -c fast
        cp -a sby1 sby2
        cat <<EOF >> sby1/postgresql.conf
        port = 5433
        EOF
        touch sby1/standby.signal
        pg_ctl -D sby1 start
        cat <<EOF >> sby2/postgresql.conf
        port = 5434
        primary_conninfo = 'port=5433'
        EOF
        touch sby2/standby.signal
        pg_ctl -D sby2 start
        pgbench -i -s2
        pg_ctl -D sby2 restart
        --------------------------------------------
        In this case, after restarting the standby connecting to another
        (cascading) standby, I observed the following error.
        FATAL: could not receive data from WAL stream: ERROR: requested
        starting point 0/04000000 is ahead of the WAL flush position of this
        server 0/03FFE8D0
        Regards,
        --
        Fujii Masao
          Xuneng Zhou<xunengzhou@gmail.com>
          Jan 30, 2026, 6:01 AM UTC
          Hi Fujii-san,
          Thanks for looking into this.
          On Fri, Jan 30, 2026 at 11:12 AM Fujii Masao <masao.fujii@gmail.com> wrote:
          On Thu, Jan 29, 2026 at 9:22 PM Xuneng Zhou <xunengzhou@gmail.com> wrote:
          Thanks for your report. I can reliably reproduce the issue on HEAD
          using your scripts. I’ve analyzed the problem and am proposing a patch
          to fix it.

          --- Analysis
          When a cascading standby streams from an archive-only upstream:

          1. The upstream's GetStandbyFlushRecPtr() returns only replay position
          (no received-but-not-replayed buffer since there's no walreceiver)
          2. When streaming ends and the cascade falls back to archive recovery,
          it can restore WAL segments from its own archive access
          3. The cascade's read position (RecPtr) advances beyond what the
          upstream has replayed
          4. On reconnect, the cascade requests streaming from RecPtr, which the
          upstream rejects as "ahead of flush position"

          --- Proposed Fix

          Track the last confirmed flush position from streaming
          (lastStreamedFlush) and clamp the streaming start request when it
          exceeds that position:

          I haven't read the patch yet, but doesn't lastStreamedFlush represent
          the same LSN as tliRecPtr or replayLSN (the arguments to
          WaitForWALToBecomeAvailable())? If so, we may not need to introduce
          a new variable to track this LSN.
          I think they refer to different types of LSNs. I don’t have access to my
          computer at the moment, but I’ll look into it and get back to you shortly.
          The choice of which LSN is used as the replication start point has varied
          over time to handle corner cases (for example, commit 06687198018).
          That makes me wonder whether we should first better understand
          why WaitForWALToBecomeAvailable() currently uses RecPtr as
          the starting point.

          BTW, with v1 patch, I was able to reproduce the issue using the following
          steps:

          --------------------------------------------
          initdb -D data
          mkdir arch
          cat <<EOF >> data/postgresql.conf
          archive_mode = on
          archive_command = 'cp %p ../arch/%f'
          restore_command = 'cp ../arch/%f %p'
          EOF
          pg_ctl -D data start
          pg_basebackup -D sby1 -c fast
          cp -a sby1 sby2
          cat <<EOF >> sby1/postgresql.conf
          port = 5433
          EOF
          touch sby1/standby.signal
          pg_ctl -D sby1 start
          cat <<EOF >> sby2/postgresql.conf
          port = 5434
          primary_conninfo = 'port=5433'
          EOF
          touch sby2/standby.signal
          pg_ctl -D sby2 start
          pgbench -i -s2
          pg_ctl -D sby2 restart
          --------------------------------------------

          In this case, after restarting the standby connecting to another
          (cascading) standby, I observed the following error.

          FATAL: could not receive data from WAL stream: ERROR: requested
          starting point 0/04000000 is ahead of the WAL flush position of this
          server 0/03FFE8D0

          Regards,

          --
          Fujii Masao
          Best,
          Xuneng
      Fujii Masao<masao.fujii@gmail.com>
      Jan 29, 2026, 11:33 AM UTC
      On Thu, Jan 29, 2026 at 2:03 AM Marco Nenciarini
      <marco.nenciarini@enterprisedb.com> wrote:

      Hi hackers,

      I've encountered a bug in PostgreSQL's streaming replication where cascading
      standbys fail to reconnect after falling back to archive recovery. The issue
      occurs when the upstream standby uses archive-only recovery.

      The standby requests streaming from the wrong WAL position (next segment boundary
      instead of the current position), causing connection failures with this error:

      ERROR: requested starting point 0/A000000 is ahead of the WAL flush
      position of this server 0/9000000
      Thanks for the report!
      I was also able to reproduce this issue on the master branch.
      Interestingly, I couldn't reproduce it on v11 using the same test case.
      This makes me wonder whether the issue was introduced in v12 or later.
      Do you see the same behavior in your environment?
      Regards,
      --
      Fujii Masao