Wake up backends immediately when sync standbys decrease

    Shinya Kato <shinya11.kato@gmail.com>
    Jan 30, 2026, 7:00 AM UTC
    Hi hackers,
    I have noticed an issue where backends waiting for synchronous
    replication are not woken up immediately when the number of required
    synchronous standbys is reduced in a multiple synchronous standby
    environment.
    When synchronous_standby_names is updated to require fewer standbys
    (for example, changing from "FIRST 2 (s1, s2)" to "FIRST 1 (s1)"), the
    backends currently waiting for replication remain blocked even after a
    config reload (SIGHUP). They are only released when a new message
    eventually arrives from a standby, despite the fact that the new
    requirements are already satisfied.
    The attached patch adds SyncRepReleaseWaiters() calls within
    walsender.c immediately after the configuration is reloaded and
    SyncRepInitConfig() is called. This ensures that any backends whose
    waiting conditions are now met by the new configuration are released
    without unnecessary delay.
    Thoughts?
    --
    Best regards,
    Shinya Kato
    NTT OSS Center
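
The behavior described above can be illustrated with a toy model. This is a minimal sketch, not PostgreSQL code: `ToySyncState`, `toy_release_waiters`, `toy_reload`, and `toy_blocked_after_reload` are hypothetical names, and the model assumes a single waiter and a plain count of acknowledgements. It only shows why a waiter stays blocked if nothing re-evaluates the release condition after the requirement drops from 2 to 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; every name here is hypothetical, not a PostgreSQL API. */
typedef struct
{
    int  num_sync;        /* standbys required, like N in "FIRST N (...)" */
    int  acked;           /* standbys that already confirmed the waiter's LSN */
    bool waiter_blocked;  /* true while a backend is waiting on its commit */
} ToySyncState;

/* Release the waiter if the current requirement is already met. */
static void
toy_release_waiters(ToySyncState *s)
{
    assert(s != NULL);
    if (s->waiter_blocked && s->acked >= s->num_sync)
        s->waiter_blocked = false;
}

/* Apply a new requirement; optionally re-check waiters right away. */
static void
toy_reload(ToySyncState *s, int new_num_sync, bool recheck)
{
    assert(s != NULL);
    s->num_sync = new_num_sync;
    if (recheck)
        toy_release_waiters(s);
}

/* Reports whether the waiter is still blocked after reloading from 2 to 1. */
static bool
toy_blocked_after_reload(bool recheck)
{
    ToySyncState s = { .num_sync = 2, .acked = 1, .waiter_blocked = true };

    toy_release_waiters(&s);    /* 1 ack < 2 required: stays blocked */
    toy_reload(&s, 1, recheck); /* requirement drops to 1 */
    return s.waiter_blocked;
}
```

In this model, `toy_blocked_after_reload(false)` leaves the waiter blocked until some later event calls `toy_release_waiters`, which corresponds to waiting for the next standby message. `toy_blocked_after_reload(true)` releases it as part of the reload, which is the effect the patch aims for.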
      Chao Li <li.evan.chao@gmail.com>
      Jan 30, 2026, 7:49 AM UTC
      On Jan 30, 2026, at 14:59, Shinya Kato <shinya11.kato@gmail.com> wrote:

      Hi hackers,

      I have noticed an issue where backends waiting for synchronous
      replication are not woken up immediately when the number of required
      synchronous standbys is reduced in a multiple synchronous standby
      environment.

      When synchronous_standby_names is updated to require fewer standbys
      (for example, changing from "FIRST 2 (s1, s2)" to "FIRST 1 (s1)"), the
      backends currently waiting for replication remain blocked even after a
      config reload (SIGHUP). They are only released when a new message
      eventually arrives from a standby, despite the fact that the new
      requirements are already satisfied.

      The attached patch adds SyncRepReleaseWaiters() calls within
      walsender.c immediately after the configuration is reloaded and
      SyncRepInitConfig() is called. This ensures that any backends whose
      waiting conditions are now met by the new configuration are released
      without unnecessary delay.

      Thoughts?

      --
      Best regards,
      Shinya Kato
      NTT OSS Center
      <v1-0001-Wake-up-backends-immediately-when-sync-standbys-d.patch>
      Hi Shinya-san,
      This patch makes sense to me, and the behavior change looks reasonable.
      My main concern is code duplication. The same block is added in three places. While the existing reload handling is already duplicated there, adding more logic on top makes the situation a bit worse from a maintenance perspective.
      Would it make sense to factor the reload handling into a small helper, for example:
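      static void
      WalSndHandleConfigReload(void)
      {
          /* Nothing to do unless a reload (SIGHUP) has been requested. */
          if (!ConfigReloadPending)
              return;
      
          /* Re-read the configuration and refresh the sync-rep settings. */
          ConfigReloadPending = false;
          ProcessConfigFile(PGC_SIGHUP);
          SyncRepInitConfig();
      
          /* Wake waiters the new configuration already satisfies. */
          if (!am_cascading_walsender)
              SyncRepReleaseWaiters();
      }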
      Best regards,
      --
      Chao Li (Evan)
      HighGo Software Co., Ltd.
      https://www.highgo.com/
        Fujii Masao <masao.fujii@gmail.com>
        Jan 30, 2026, 3:28 PM UTC
        On Fri, Jan 30, 2026 at 4:49 PM Chao Li <li.evan.chao@gmail.com> wrote:


        On Jan 30, 2026, at 14:59, Shinya Kato <shinya11.kato@gmail.com> wrote:

        Hi hackers,

        I have noticed an issue where backends waiting for synchronous
        replication are not woken up immediately when the number of required
        synchronous standbys is reduced in a multiple synchronous standby
        environment.
        Thanks for reporting this!
        This issue can occur not only when the number of sync standbys is reduced,
        but also when the configured standby names change. For example, if the config
        changes from "FIRST 2 (sby1, sby2)" to "FIRST 2 (sby1, sby3)",
        waiters on sby2 should be released immediately. But currently there can
        be a delay before that happens. Right?
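
The name-change case can be sketched the same way. This is a toy model under stated assumptions, not PostgreSQL code: `ToyStandby`, `ToyConfig`, `toy_listed`, `toy_requirement_met`, and `toy_name_change_scenario` are hypothetical, and priority ordering is simplified to "count the listed standbys that have acknowledged". It assumes sby1 and sby3 have confirmed the waiter's LSN while sby2 has not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy model only; every name here is hypothetical, not a PostgreSQL API. */
typedef struct
{
    const char *name;
    bool        acked;    /* has this standby confirmed the waiter's LSN? */
} ToyStandby;

typedef struct
{
    int         num_sync; /* like N in "FIRST N (...)" */
    const char *names[4]; /* standby names listed in the setting */
    size_t      nnames;
} ToyConfig;

/* Is the given standby name listed in the configuration? */
static bool
toy_listed(const ToyConfig *cfg, const char *name)
{
    assert(cfg != NULL && name != NULL);
    for (size_t i = 0; i < cfg->nnames; i++)
        if (strcmp(cfg->names[i], name) == 0)
            return true;
    return false;
}

/* Simplified check: enough listed standbys have acknowledged. */
static bool
toy_requirement_met(const ToyConfig *cfg, const ToyStandby *sb, size_t n)
{
    int acked = 0;

    for (size_t i = 0; i < n; i++)
        if (sb[i].acked && toy_listed(cfg, sb[i].name))
            acked++;
    return acked >= cfg->num_sync;
}

/* Evaluates the same standby state under the old or the new name list. */
static bool
toy_name_change_scenario(bool use_new_config)
{
    const ToyStandby standbys[] = {
        {"sby1", true}, {"sby2", false}, {"sby3", true}
    };
    const ToyConfig old_cfg = {2, {"sby1", "sby2"}, 2};
    const ToyConfig new_cfg = {2, {"sby1", "sby3"}, 2};

    return toy_requirement_met(use_new_config ? &new_cfg : &old_cfg,
                               standbys, 3);
}
```

Under the old list the requirement is not met because sby2 has not acknowledged; under the new list it is met without any new standby message. A waiter would therefore be releasable as soon as the reload is processed, provided something re-runs the check at that point.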
        My main concern is code duplication. The same block is added in three places. While the existing reload handling is already duplicated there, adding more logic on top makes the situation a bit worse from a maintenance perspective.

        Would it make sense to factor the reload handling into a small helper, for example:
        +1
        Regards,
        --
        Fujii Masao