[Patch]: Fix excessive ProcArrayLock acquisitions with subscription max_retention_duration=0

    SATYANARAYANA NARLAPURAM<satyanarlapuram@gmail.com>
    Apr 27, 2026, 8:41 AM UTC
    Hi Hackers,
    When a subscription has retain_dead_tuples enabled with maxretention set
    to zero (unlimited retention), adjust_xid_advance_interval() caps
    xid_advance_interval to Min(interval, maxretention). Since maxretention
    is zero, this always collapses the interval to zero milliseconds.
    A zero makes TimestampDifferenceExceeds(last_time, now, 0) always
    true in get_candidate_xid(). This causes the apply worker to call
    GetOldestActiveTransactionId() on every single WAL message. This results in
    a huge number of ProcArrayLock acquisitions under moderate write load.
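To illustrate why a zero interval defeats the throttle, here is a minimal stand-alone sketch. The helper `timestamp_difference_exceeds` is a hypothetical stand-in that assumes the check compares elapsed microseconds against `msec * 1000`; the `TimestampTz` typedef is likewise a simplification for this example, not the server's definition.

```c
#include <stdbool.h>
#include <stdint.h>

/* Microseconds since some epoch; simplified stand-in for the server type. */
typedef int64_t TimestampTz;

/*
 * Hypothetical stand-in for the elapsed-time check: returns true once at
 * least msec milliseconds have passed between start_time and stop_time.
 */
static bool
timestamp_difference_exceeds(TimestampTz start_time, TimestampTz stop_time,
                             int msec)
{
    TimestampTz diff = stop_time - start_time;

    /* With msec == 0 this reduces to diff >= 0, which always holds. */
    return diff >= (TimestampTz) msec * 1000;
}
```

With a 100 ms interval, a call made 50 microseconds after the previous check returns false, so the expensive lookup is skipped. With a 0 ms interval the same call returns true, so every message triggers the lookup.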
    Fix by adding a maxretention > 0 guard to the cap. When maxretention is
    zero, the exponential back-off in adjust_xid_advance_interval() now works
    correctly, growing the interval from 100 ms toward the 180 s ceiling.
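The effect of the guard can be sketched with a simplified model. This is not the patch itself: `next_xid_advance_interval`, its parameters, and the `guard_zero` switch are hypothetical names for this example, and the model ignores other conditions the real function may consider. Only the 100 ms floor, the 180 s ceiling, and the Min() cap come from the description above.

```c
#include <stdbool.h>

#define MIN_XID_ADVANCE_INTERVAL 100     /* ms; the 100 ms starting value */
#define MAX_XID_ADVANCE_INTERVAL 180000  /* ms; the 180 s ceiling */

#define Min(a, b) ((a) < (b) ? (a) : (b))

/*
 * Hypothetical, simplified model of the interval adjustment.
 * max_retention is in milliseconds; 0 means unlimited retention.
 * guard_zero = false models the old behavior, true models the fix.
 */
static int
next_xid_advance_interval(int interval, bool new_xid_found,
                          int max_retention, bool guard_zero)
{
    if (interval > 0 && !new_xid_found)
        interval = Min(interval * 2, MAX_XID_ADVANCE_INTERVAL); /* back off */
    else
        interval = MIN_XID_ADVANCE_INTERVAL;                    /* reset */

    /* Cap to the retention limit; with the guard, a zero limit is skipped. */
    if (!guard_zero || max_retention > 0)
        interval = Min(interval, max_retention);

    return interval;
}
```

Without the guard, a zero limit pins the result at 0 on every call, so the back-off never gets started. With the guard, repeated calls with no new transaction ID give 100, 200, 400 and so on up to 180000 ms, while a positive limit still caps the interval as before.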
    Measured with perf uprobe counting GetOldestActiveTransactionId calls
    at ~39K TPS (pgbench, 5 clients):
    Before fix: 25,104 calls / 5 s  (~5,021/s)
    After fix:     31 calls / 5 s  (~6/s)
    Thanks,
    Satya
      shveta malik<shveta.malik@gmail.com>
      Apr 27, 2026, 9:48 AM UTC
      Thanks for reporting it. I am reviewing the problem statement.
      Meanwhile, could you please take a look at this? I am getting the
      following error while applying the patch on my Ubuntu setup (git am):
      error: corrupt patch at line 22
      thanks
      Shveta
        shveta malik<shveta.malik@gmail.com>
        Apr 27, 2026, 10:19 AM UTC
        On Mon, Apr 27, 2026 at 3:18 PM shveta malik <shveta.malik@gmail.com> wrote:
        >
        > On Mon, Apr 27, 2026 at 2:11 PM SATYANARAYANA NARLAPURAM
        > <satyanarlapuram@gmail.com> wrote:
        >>
        >> Hi Hackers,
        >>
        >> When a subscription has retain_dead_tuples enabled with maxretention set
        >> to zero (unlimited retention), adjust_xid_advance_interval() caps
        >> xid_advance_interval to Min(interval, maxretention). Since maxretention
        >> is zero, this always collapses the interval to zero milliseconds.
        >>
        >> A zero makes TimestampDifferenceExceeds(last_time, now, 0) always
        >> true in get_candidate_xid(). This causes the apply worker to call
        >> GetOldestActiveTransactionId() on every single WAL message. This results in
        >> a huge number of ProcArrayLock acquisitions under moderate write load.

        I agree with the problem statement. I can see it in my debugging.

        >> Fix by adding a maxretention > 0 guard to the cap. When maxretention is
        >> zero, the exponential back-off in adjust_xid_advance_interval() now works
        >> correctly, growing the interval from 100 ms toward the 180 s ceiling.

        Yes, this should work. Let's see what others have to say on this.

        >> Measured with perf uprobe counting GetOldestActiveTransactionId calls
        >> at ~39K TPS (pgbench, 5 clients):
        >>
        >> Before fix: 25,104 calls / 5 s (~5,021/s)
        >> After fix: 31 calls / 5 s (~6/s)

        Just curious, how did you catch this problem? Did it show up in any of
        your profiling reports?
        thanks
        Shveta
        SATYANARAYANA NARLAPURAM<satyanarlapuram@gmail.com>
        Apr 27, 2026, 5:03 PM UTC
        Hi,
        On Mon, Apr 27, 2026 at 2:48 AM shveta malik <shveta.malik@gmail.com> wrote:
        > Thanks for reporting it. I am reviewing the problem statement.
        > Meanwhile, could you please take a look at this? I am getting the
        > following error while applying the patch on my Ubuntu setup (git am):
        >
        > error: corrupt patch at line 22

        Thanks! Please find the updated v2 patch.
          shveta malik<shveta.malik@gmail.com>
          Apr 28, 2026, 4:08 AM UTC
          On Mon, Apr 27, 2026 at 10:32 PM SATYANARAYANA NARLAPURAM
          <satyanarlapuram@gmail.com> wrote:

          > Thanks! Please find the updated v2 patch.

          Thanks. The patch looks good.
          thanks
          Shveta
            Amit Kapila<amit.kapila16@gmail.com>
            Apr 28, 2026, 10:58 AM UTC
            On Tue, Apr 28, 2026 at 9:38 AM shveta malik <shveta.malik@gmail.com> wrote:

            > Thanks. The patch looks good.

            LGTM as well, so pushed.
            --
            With Regards,
            Amit Kapila.
          Nisha Moond<nisha.moond412@gmail.com>
          Apr 28, 2026, 8:35 AM UTC
          On Mon, Apr 27, 2026 at 10:32 PM SATYANARAYANA NARLAPURAM
          <satyanarlapuram@gmail.com> wrote:

          > Thanks! Please find the updated v2 patch.

          Thanks for the patch. I am able to reproduce the reported issue while
          debugging. The xid_advance_interval stays 0 when retain_dead_tuples
          is enabled but max_retention_duration is off, which is unexpected
          behavior.
          Confirmed that the patch fixed it.
          --
          Thanks,
          Nisha