logical apply worker's lock waits in subscriber can stall checkpointer in publisher

    Fujii Masao<masao.fujii@gmail.com>
    Jan 27, 2026, 11:33 AM UTC
    Hi,
    While reviewing the patch at [1], I noticed a case where lock waits on
    a logical apply worker in the subscriber can cause the checkpointer on
    the publisher to stall. This seems like unexpected behavior and may
    need to be addressed.
    The issue can occur as follows:
    1. A logical apply worker on the subscriber blocks waiting for a lock.
    2. Because the apply worker cannot receive further messages, the walsender's
    send buffer on the publisher becomes full.
    3. If the walsender then encounters a max_slot_wal_keep_size error,
    it attempts to send an error message to the subscriber before exiting.
    However, with a full send buffer, the walsender blocks while trying to
    send this message.
    4. The checkpointer on the publisher calls InvalidateObsoleteReplicationSlots()
    and waits for the slot to be released. Since the walsender is stuck and
    the slot is not released, the checkpointer also becomes stuck.
    This behavior seems problematic, doesn't it?
    One possible approach to address this issue would be to make the walsender
    send the error message in non-blocking mode. Even if the send buffer is full,
    the walsender could then exit, allowing the slot to be released and
    the checkpointer to proceed. This would mean that, in some cases,
    the final error message might not reach the subscriber, which seems
    acceptable to me, though others may disagree.
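The non-blocking idea can be illustrated outside PostgreSQL with a plain socket pair. This is only a minimal sketch, not walsender code: `fill_send_buffer` and `try_send_final_message` are hypothetical helper names, and the peer that never reads stands in for the blocked apply worker.

```python
import socket


def fill_send_buffer(sock):
    """Write until the kernel send buffer is full; return the bytes queued."""
    sock.setblocking(False)
    total = 0
    chunk = b"x" * 65536
    while True:
        try:
            total += sock.send(chunk)
        except BlockingIOError:
            # The peer is not reading, so the buffer is now full.
            return total


def try_send_final_message(sock, message):
    """Attempt one non-blocking send; report failure instead of waiting."""
    sock.setblocking(False)
    try:
        sock.send(message)
        return True
    except BlockingIOError:
        # Give up on the message so the caller can exit right away.
        return False


# The receiver never reads, like an apply worker stuck on a lock.
sender, receiver = socket.socketpair()
queued = fill_send_buffer(sender)
delivered = try_send_final_message(sender, b"FATAL: terminating walsender")

# The sender exits either way; in the real case this is what would let
# the slot be released.
sender.close()
receiver.close()
```

With a blocking socket the final send would wait indefinitely here; in non-blocking mode it fails fast and the message is simply lost, which is the trade-off described above.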
    This approach would also help when users want to terminate a walsender
    via pg_terminate_backend() but the send buffer is full. In this case, today,
    the walsender can similarly block while trying to send the error message.
    Another idea would be to change the checkpointer so that
    InvalidateObsoleteReplicationSlots() operates in a non-blocking manner.
    I'm not sure whether that is feasible, but if immediate invalidation is not
    strictly required, the checkpointer could give up and retry later.
    Thoughts?
    Regards,
    [1]
    https://postgr.es/m/TYAPR01MB586668E50FC2447AD7F92491F5E89@TYAPR01MB5866.jpnprd01.prod.outlook.com
    --
    Fujii Masao
      Hayato Kuroda (Fujitsu)<kuroda.hayato@fujitsu.com>
      Jan 29, 2026, 7:03 AM UTC
      Dear Fujii-san,
      > While reviewing the patch at [1], I noticed a case where lock waits on
      > a logical apply worker in the subscriber can cause the checkpointer on
      > the publisher to stall. This seems like unexpected behavior and may
      > need to be addressed.
      >
      > The issue can occur as follows:
      >
      > 1. A logical apply worker on the subscriber blocks waiting for a lock.
      > 2. Because the apply worker cannot receive further messages, the walsender's
      > send buffer on the publisher becomes full.
      > 3. If the walsender then encounters a max_slot_wal_keep_size error,
      > it attempts to send an error message to the subscriber before exiting.
      > However, with a full send buffer, the walsender blocks while trying to
      > send this message.
      > 4. The checkpointer on the publisher calls InvalidateObsoleteReplicationSlots()
      > and waits for the slot to be released. Since the walsender is stuck and
      > the slot is not released, the checkpointer also becomes stuck.

      I confirmed this can happen when max_slot_wal_keep_size is enabled
      (in other words, the value is not -1).
      Per my test, wal_sender_timeout does not help because the process is stuck at
      a lower layer, but tcp_user_timeout can terminate the process. Can we mention
      the workaround in the doc instead of fixing the code?
      It won't work for a Unix domain socket connection, but such connections are
      not realistic in production.
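For reference, the socket option behind tcp_user_timeout can be probed from a small script. This is a rough sketch under assumptions, not PostgreSQL code: `set_tcp_user_timeout` is a hypothetical helper and 5000 ms is an arbitrary example value. It returns False on platforms where Python's `socket` module does not expose `TCP_USER_TIMEOUT`.

```python
import socket


def set_tcp_user_timeout(sock, timeout_ms):
    """Set TCP_USER_TIMEOUT if the platform exposes it; return True on success."""
    opt = getattr(socket, "TCP_USER_TIMEOUT", None)
    if opt is None:
        # The option is not available on this platform.
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, opt, timeout_ms)
    except OSError:
        return False
    return True


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
applied = set_tcp_user_timeout(sock, 5000)  # 5 seconds, example value only

# Read the value back only when it was actually applied.
effective_ms = (
    sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_USER_TIMEOUT)
    if applied
    else None
)
sock.close()
```

The option bounds how long transmitted data may stay unacknowledged before the kernel drops the connection, which is why it can end a send that is stuck below the level where wal_sender_timeout is checked.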
      Best regards,
      Hayato Kuroda
      FUJITSU LIMITED
        Fujii Masao<masao.fujii@gmail.com>
        Jan 29, 2026, 2:33 PM UTC
        On Thu, Jan 29, 2026 at 4:03 PM Hayato Kuroda (Fujitsu)
        <kuroda.hayato@fujitsu.com> wrote:

        > Dear Fujii-san,
        >
        > > While reviewing the patch at [1], I noticed a case where lock waits on
        > > a logical apply worker in the subscriber can cause the checkpointer on
        > > the publisher to stall. This seems like unexpected behavior and may
        > > need to be addressed.
        > >
        > > The issue can occur as follows:
        > >
        > > 1. A logical apply worker on the subscriber blocks waiting for a lock.
        > > 2. Because the apply worker cannot receive further messages, the walsender's
        > > send buffer on the publisher becomes full.
        > > 3. If the walsender then encounters a max_slot_wal_keep_size error,
        > > it attempts to send an error message to the subscriber before exiting.
        > > However, with a full send buffer, the walsender blocks while trying to
        > > send this message.
        > > 4. The checkpointer on the publisher calls InvalidateObsoleteReplicationSlots()
        > > and waits for the slot to be released. Since the walsender is stuck and
        > > the slot is not released, the checkpointer also becomes stuck.
        >
        > I confirmed this could happen if the max_slot_wal_keep_size is enabled
        > (in other words, the value is not -1).
        > Per my test, wal_sender_timeout cannot work well because the process is stuck at
        > the lower layer, but tcp_user_timeout can terminate the process. Can we mention
        > the workaround in the doc instead of fixing the code?
        >
        > It won't work for a Unix domain socket connection, but it's not realistic for the
        > production stage.

        This approach doesn't seem helpful on platforms that don't support
        TCP_USER_TIMEOUT, i.e., tcp_user_timeout is not available. Right?
        If I remember correctly, Windows is one of those platforms.
        Regards,
        --
        Fujii Masao
          Hayato Kuroda (Fujitsu)<kuroda.hayato@fujitsu.com>
          Jan 30, 2026, 4:20 AM UTC
          Dear Fujii-san,
          > This approach doesn't seem helpful on platforms that don't support
          > TCP_USER_TIMEOUT, i.e., tcp_user_timeout is not available. Right?
          > If I remember correctly, Windows is one of those platforms.

          Good point, the documentation says it is not supported on Windows.
          The easiest fix I can come up with is to set a timeout for the checkpointer's wait:
          ConditionVariableTimedSleep() can be used in InvalidatePossiblyObsoleteSlot(),
          and on timeout we can emit a LOG and skip the invalidation for a while. I'm not
          sure how long we should wait, but at least we can use a fixed value. This might
          be similar to your second option.
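The timed-wait idea can be sketched in isolation as follows. This is an illustration only, not the PostgreSQL implementation: `Slot` and `try_invalidate` are hypothetical names, `threading.Condition` stands in for the condition variable, and the 0.05 s timeout is an arbitrary value.

```python
import threading


class Slot:
    """Toy stand-in for a replication slot held by another process."""

    def __init__(self):
        self.active = True          # still held by the (stuck) walsender
        self.invalidated = False
        self.released = threading.Condition()

    def release(self):
        with self.released:
            self.active = False
            self.released.notify_all()


def try_invalidate(slot, timeout_s):
    """Wait up to timeout_s for the slot to be released, then invalidate it.

    Returns False when the wait timed out, so the caller can log the
    situation and retry later instead of blocking forever.
    """
    with slot.released:
        released = slot.released.wait_for(
            lambda: not slot.active, timeout=timeout_s
        )
        if not released:
            print("LOG: slot still in use, skipping invalidation for now")
            return False
        slot.invalidated = True
        return True


slot = Slot()
first_attempt = try_invalidate(slot, timeout_s=0.05)   # holder is stuck
slot.release()                                          # holder finally exits
second_attempt = try_invalidate(slot, timeout_s=0.05)  # retry succeeds
```

The first attempt gives up after the timeout and leaves the slot untouched; the retry after the release invalidates it. How long to wait and when to retry remain open questions.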
          Regarding the first option, it can solve the root cause, but I'm afraid we may
          have to modify very common code.
          Best regards,
          Hayato Kuroda
          FUJITSU LIMITED
            Fujii Masao<masao.fujii@gmail.com>
            Jan 30, 2026, 2:20 PM UTC
            On Fri, Jan 30, 2026 at 1:19 PM Hayato Kuroda (Fujitsu)
            <kuroda.hayato@fujitsu.com> wrote:

            > Dear Fujii-san,
            >
            > > This approach doesn't seem helpful on platforms that don't support
            > > TCP_USER_TIMEOUT, i.e., tcp_user_timeout is not available. Right?
            > > If I remember correctly, Windows is one of those platforms.
            >
            > Good point, documentation said it's not usable for Windows.
            > The easiest fix I can come up with is to determine a timeout for checkpoint wait;
            > ConditionVariableTimedSleep() can be used in InvalidatePossiblyObsoleteSlot(),
            > we can put some LOG and skip invalidating for a while. Not sure how long we
            > should wait but at least we can use a fixed value. This might be similar
            > with your second option.
            > Regarding the first option, it can solve the root cause, but I'm afraid we may
            > have to modify very common code.

            Yeah, but I'd like to try the first option. Attached is a very WIP patch that
            attempts to implement it.
            With this patch, when a walsender exits with >= FATAL,
            send_message_to_frontend() attempts to send the error message to the standby
            in non-blocking mode. If that fails, the walsender gives up on sending
            the message and exits immediately.
            I'm not yet sure whether treating walsender exit as a special case is
            acceptable, but I wanted to share this WIP patch to get feedback.
            Regards,
            --
            Fujii Masao