[PATCH] Preserve replication origin OIDs in pg_upgrade

    Ajin Cherian<itsajin@gmail.com>
    Apr 28, 2026, 11:19 AM UTC
    Hello hackers,
    The idea for this patch came up during discussions in the thread [1]
    on migration of the pg_commit_ts directory as part of pg_upgrade.
    There was a problem raised by Sawada-san in that thread which this
    patch addresses. [2]
    The problem:
    The pg_commit_ts directory stores commit-timestamp records for each
    transaction, and each record embeds the replication origin ID
    (roident) that identifies which subscription wrote that transaction.
    When pg_upgrade migrates a subscriber, the pg_commit_ts directory is
    copied directly from the old cluster to the new cluster, so those
    embedded roidents must remain valid in the new cluster. However,
    CREATE SUBSCRIPTION on the new cluster calls replorigin_create(),
    which assigns fresh roidents to each subscription's replication
    origin. Because subscription OIDs are not stable across upgrades, the
    origin names change (e.g. pg_16392 becomes pg_16403), and consequently
    the roidents can be assigned differently or, in the worst case,
    swapped between subscriptions.
    Consider two subscriptions subA and subB with roidents 1 and 2
    respectively before upgrade. After upgrade, due to OID reassignment,
    subA might get roident 2 and subB might get roident 1. The
    commit-timestamp records copied from the old cluster still say roident
    1 for rows written by subA, but the new cluster now thinks roident 1
    belongs to subB. This causes spurious update_origin_differs conflicts:
    the new cluster incorrectly thinks a row was last modified by a
    different subscription than it actually was.
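The swap described above can be illustrated with a toy model. This is only a sketch, not PostgreSQL code: the dictionaries and the `last_writer` helper are hypothetical stand-ins for the commit-timestamp records and the origin catalog.

```python
# Toy model (not PostgreSQL code): commit-timestamp records keep the numeric
# roident, while the roident -> subscription mapping lives in the catalog.

def last_writer(commit_ts, origins, xid):
    """Resolve which subscription a transaction is attributed to."""
    return origins[commit_ts[xid]]

# Old cluster: subA owns roident 1, subB owns roident 2.
old_origins = {1: "subA", 2: "subB"}
# Transaction 100 was applied by subA, so its record embeds roident 1.
commit_ts = {100: 1}

# New cluster after an upgrade that swapped the assignments; commit_ts is
# copied verbatim from the old cluster.
swapped_origins = {1: "subB", 2: "subA"}
# New cluster when the roidents are preserved.
preserved_origins = dict(old_origins)

before = last_writer(commit_ts, old_origins, 100)
after_swapped = last_writer(commit_ts, swapped_origins, 100)
after_preserved = last_writer(commit_ts, preserved_origins, 100)

print(before, after_swapped, after_preserved)
```

With the swapped mapping the same record is attributed to subB, which is the misattribution that leads to the spurious conflict; with preserved roidents the attribution is unchanged.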
    This patch attempts to fix this by preserving the roident of each
    subscription's replication origin during migration. It also migrates
    all replication origins as part of pg_upgrade.
    Sequence of Events During Upgrade
    1. pg_dumpall dumps all non-subscription replication origins from the
    old cluster with their roidents and LSN positions.
    2. pg_dump dumps each subscription, but now records the old roident
    alongside the subscription info.
    3. During restore, pg_dumpall's output recreates non-subscription
    origins on the new cluster with their original roidents via
    binary_upgrade_create_replication_origin().
    4. During per-database restore, CREATE SUBSCRIPTION runs but skips
    origin creation.
    5. binary_upgrade_set_next_replorigin_oid() creates the origin for
    each subscription with the preserved roident.
    6. binary_upgrade_replorigin_advance() restores the LSN position for
    each subscription.
    7. Subscriptions that were running before upgrade are re-enabled.
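The seven steps above can be sketched as a small simulation. This is a hedged illustration only: the dictionary layout, the function names `dump_old_cluster` and `restore_new_cluster`, the LSN strings, and the assumption that subscriptions are created disabled and enabled at the end are all hypothetical and are not taken from the patch.

```python
# Toy simulation of the upgrade sequence (hypothetical data model).

def dump_old_cluster(old):
    """Steps 1-2: record roidents and LSNs for plain and subscription origins."""
    by_id = {o["roident"]: o for o in old["origins"]}
    sub_ids = {s["roident"] for s in old["subscriptions"]}
    plain = [o for o in old["origins"] if o["roident"] not in sub_ids]
    subs = [
        {"name": s["name"], "roident": s["roident"],
         "lsn": by_id[s["roident"]]["lsn"], "enabled": s["enabled"]}
        for s in old["subscriptions"]
    ]
    return {"origins": plain, "subscriptions": subs}


def restore_new_cluster(dump, new_sub_oids):
    """Steps 3-7: recreate everything with the original roidents."""
    new = {"origins": {}, "subscriptions": {}}
    # Step 3: non-subscription origins keep their roidents.
    for o in dump["origins"]:
        new["origins"][o["roident"]] = {"name": o["name"], "lsn": o["lsn"]}
    for s in dump["subscriptions"]:
        oid = new_sub_oids[s["name"]]
        # Step 4: create the subscription without creating its origin.
        new["subscriptions"][s["name"]] = {"oid": oid, "enabled": False}
        # Step 5: create the origin under the preserved roident.
        new["origins"][s["roident"]] = {"name": f"pg_{oid}", "lsn": None}
        # Step 6: restore the LSN position.
        new["origins"][s["roident"]]["lsn"] = s["lsn"]
        # Step 7: re-enable subscriptions that were running before.
        new["subscriptions"][s["name"]]["enabled"] = s["enabled"]
    return new


old = {
    "origins": [
        {"roident": 1, "name": "pg_16392", "lsn": "0/3000A10"},
        {"roident": 2, "name": "my_origin", "lsn": "0/2000F00"},
    ],
    "subscriptions": [
        {"name": "subA", "oid": 16392, "roident": 1, "enabled": True},
    ],
}
new = restore_new_cluster(dump_old_cluster(old), {"subA": 16403})
print(new["origins"])
```

In this model the subscription's origin name changes from pg_16392 to pg_16403, but it still sits under roident 1, so the copied commit-timestamp records continue to resolve to the same subscription.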
    Please let me know your feedback on this patch.
    [1] - https://www.postgresql.org/message-id/flat/182311743703924%40mail.yandex.ru
    [2] - https://www.postgresql.org/message-id/CAD21AoDG8zQpHHfw7OvaEy7W0ZSyP%3D_dS-hrcquJ3C_ctMDmMQ%40mail.gmail.com
    regards,
    Ajin Cherian
    Fujitsu Australia
      Hayato Kuroda (Fujitsu)<kuroda.hayato@fujitsu.com>
      Apr 29, 2026, 8:41 AM UTC
      Dear Ajin,
      > Sequence of Events During Upgrade
      >
      > 1. pg_dumpall dumps all non-subscription replication origins from the
      > old cluster with their roidents and LSN positions.
      > 2. pg_dump dumps each subscription, but now records the old roident
      > alongside the subscription info.
      > 3. During restore, pg_dumpall's output recreates non-subscription
      > origins on the new cluster with their original roidents via
      > binary_upgrade_create_replication_origin().

      To confirm, why do subscription-associated origins have to be handled
      separately? I think that is unnecessary if the subscription's OID is preserved
      during the upgrade.
      I checked the old thread that proposed preserving it [1], but it was not accepted
      because there was no strong motivation. I feel this is a good reason to do so now.
      What do you think?
      [1]: https://www.postgresql.org/message-id/CALDaNm2Wj63VcbB0SY2NECHr1mKM1YSaV1ZydrdQVxyox2O2hg%40mail.gmail.com
      Best regards,
      Hayato Kuroda
      FUJITSU LIMITED
        vignesh C<vignesh21@gmail.com>
        Apr 30, 2026, 6:52 AM UTC
        On Wed, 29 Apr 2026 at 14:11, Hayato Kuroda (Fujitsu)
        <kuroda.hayato@fujitsu.com> wrote:

        > Dear Ajin,
        >> Sequence of Events During Upgrade
        >>
        >> 1. pg_dumpall dumps all non-subscription replication origins from the
        >> old cluster with their roidents and LSN positions.
        >> 2. pg_dump dumps each subscription, but now records the old roident
        >> alongside the subscription info.
        >> 3. During restore, pg_dumpall's output recreates non-subscription
        >> origins on the new cluster with their original roidents via
        >> binary_upgrade_create_replication_origin().
        >
        > To confirm, why do subscription-associated origins have to be handled
        > separately? I think that is unnecessary if the subscription's OID is preserved
        > during the upgrade.

        +1 to preserving the subscription OID. This should make preserving
        replication origins easier.

        > I checked the old thread that proposed preserving it [1], but it was not accepted
        > because there was no strong motivation. I feel this is a good reason to do so now.

        Here is a rebased version of the patch.
        Regards,
        Vignesh
        shveta malik<shveta.malik@gmail.com>
        Apr 30, 2026, 9:39 AM UTC
        On Wed, Apr 29, 2026 at 2:11 PM Hayato Kuroda (Fujitsu)
        <kuroda.hayato@fujitsu.com> wrote:

        > Dear Ajin,
        >> Sequence of Events During Upgrade
        >>
        >> 1. pg_dumpall dumps all non-subscription replication origins from the
        >> old cluster with their roidents and LSN positions.
        >> 2. pg_dump dumps each subscription, but now records the old roident
        >> alongside the subscription info.
        >> 3. During restore, pg_dumpall's output recreates non-subscription
        >> origins on the new cluster with their original roidents via
        >> binary_upgrade_create_replication_origin().
        >
        > To confirm, why do subscription-associated origins have to be handled
        > separately? I think that is unnecessary if the subscription's OID is preserved
        > during the upgrade.

        I’m not sure how preserving the subscription OID would ensure that the
        origin ID is also preserved for sub-associated origins. Could you
        please elaborate?
        As I understand it, roident values are assigned independently during
        origin creation. Even if subscription OIDs are preserved, the origin
        IDs could still be reassigned differently on the new cluster. For
        example, suppose we have two subscriptions, sub1 and sub2, with
        roident values 2 and 3, assuming 1 was previously used and dropped.
        After upgrade, origin creation may start allocating from 1 again,
        resulting in roident values 1 and 2 instead. Since pg_commit_ts stores
        the numeric roident, not the origin name, this mismatch could still
        lead to incorrect conflict detection. Wouldn’t that result in the same
        wrong conflict detection issue we are trying to avoid?
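The scenario above can be sketched as follows, assuming the allocator hands out the lowest unused ID when an origin is created. The `allocate_roident` helper and the dictionaries are hypothetical illustrations, not PostgreSQL code.

```python
# Toy model: first-free roident allocation on a fresh cluster.

def allocate_roident(used):
    """Return the lowest unused positive ID (simplified first-free policy)."""
    roident = 1
    while roident in used:
        roident += 1
    return roident

# Old cluster: roident 1 was used and dropped, so sub1 and sub2 hold 2 and 3.
old = {"sub1": 2, "sub2": 3}

# New cluster: origins are created from scratch, in the same order.
new = {}
for sub in ("sub1", "sub2"):
    new[sub] = allocate_roident(set(new.values()))

# Commit-timestamp records copied from the old cluster still carry 2 and 3.
new_by_roident = {roident: sub for sub, roident in new.items()}
owner_of_2 = new_by_roident.get(2)  # record written by sub1 on the old cluster
owner_of_3 = new_by_roident.get(3)  # record written by sub2 on the old cluster

print(new, owner_of_2, owner_of_3)
```

In this model a record written by sub1 (roident 2) is attributed to sub2 after the upgrade, and roident 3 no longer maps to any origin, even though the subscription names and their relative order are unchanged.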
        Please let me know if my understanding is wrong.
        thanks
        Shveta