pgsql-hackers
[PATCH] Preserve replication origin OIDs in pg_upgrade
Ajin Cherian <itsajin@gmail.com>, Apr 28, 2026, 11:19 AM UTC
Hello hackers,
The idea for this patch came up during discussions in the thread [1]
on migration of the pg_commit_ts directory as part of pg_upgrade.
There was a problem raised by Sawada-san in that thread which this
patch addresses. [2]
The problem:
The pg_commit_ts directory stores commit-timestamp records for each
transaction, and each record embeds the replication origin ID
(roident) that identifies which subscription wrote that transaction.
When pg_upgrade migrates a subscriber, the pg_commit_ts directory is
copied directly from the old cluster to the new cluster. This means
those embedded roidents must remain valid in the new cluster. However,
CREATE SUBSCRIPTION on the new cluster calls replorigin_create(),
which assigns fresh roidents to each subscription's replication
origin. Because subscription OIDs are not stable across upgrades, the
origin names change (e.g. pg_16392 becomes pg_16403), and consequently
the roidents can be assigned differently or, in the worst case,
swapped between subscriptions.
Consider two subscriptions subA and subB with roidents 1 and 2
respectively before upgrade. After upgrade, due to OID reassignment,
subA might get roident 2 and subB might get roident 1. The
commit-timestamp records copied from the old cluster still say roident
1 for rows written by subA, but the new cluster now thinks roident 1
belongs to subB. This causes spurious update_origin_differs conflicts:
the new cluster incorrectly thinks a row was last modified by a
different subscription than it actually was.
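The swap scenario above can be illustrated with a toy model. This is only a sketch: the dictionaries, the row keys, and the helper misattributed() are hypothetical and do not correspond to any PostgreSQL data structure or to code in the patch. It shows that records keyed by numeric roident resolve to the wrong subscription once the mapping changes underneath them.

```python
# Toy model: commit-timestamp records carry a numeric roident, not a name.
# All names and values here are illustrative, not taken from the patch.

# Old cluster: roident -> subscription that owns the origin.
old_origins = {1: "subA", 2: "subB"}

# Commit-timestamp records copied verbatim by pg_upgrade: (row key, roident).
commit_ts = [("row1", 1), ("row2", 2)]

# New cluster after upgrade, with roidents swapped between subscriptions.
new_origins = {2: "subA", 1: "subB"}


def misattributed(records, before, after):
    """Return rows whose stored roident now resolves to a different subscription."""
    return [
        (row, before[roident], after[roident])
        for row, roident in records
        if before[roident] != after[roident]
    ]


# With swapped roidents, every copied record points at the wrong subscription.
swapped = misattributed(commit_ts, old_origins, new_origins)

# With preserved roidents, nothing is misattributed.
preserved = misattributed(commit_ts, old_origins, dict(old_origins))
```

In this model, each entry in swapped is a row that would look as if it had last been modified by the other subscription.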
This patch attempts to fix this by preserving the roident of each
subscription's replication origin on migration. This patch also
migrates all other replication origins as part of pg_upgrade.
Sequence of Events During Upgrade
1. pg_dumpall dumps all non-subscription replication origins from the
old cluster with their roidents and LSN positions.
2. pg_dump dumps each subscription, but now records the old roident
alongside the subscription info.
3. During restore, pg_dumpall's output recreates non-subscription
origins on the new cluster with their original roidents via
binary_upgrade_create_replication_origin().
4. During per-database restore, CREATE SUBSCRIPTION runs but skips
origin creation.
5. binary_upgrade_set_next_replorigin_oid() creates the origin for
each subscription with the preserved roident.
6. binary_upgrade_replorigin_advance() restores the LSN position for
each subscription.
7. Subscriptions that were running before upgrade are re-enabled.
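As a rough illustration of the intent behind steps 1 to 6, the sketch below models a cluster's origin catalog as a dictionary and recreates every origin with its original roident and LSN. The Cluster class, its methods, and the upgrade() helper are hypothetical stand-ins, not the binary_upgrade_* functions from the patch. The model also collapses the separate handling of subscription and non-subscription origins into one loop.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Hypothetical stand-in for a cluster's replication origin catalog."""

    origins: dict = field(default_factory=dict)  # roident -> (name, lsn)

    def create_origin(self, roident, name):
        # Create an origin with an explicitly chosen roident.
        if roident in self.origins:
            raise ValueError(f"roident {roident} already in use")
        self.origins[roident] = (name, 0)

    def advance(self, roident, lsn):
        # Restore the replication progress (LSN) for an existing origin.
        name, _ = self.origins[roident]
        self.origins[roident] = (name, lsn)


def upgrade(old, rename):
    """Recreate every origin in a new cluster with its original roident and LSN.

    `rename` maps old origin names to new ones, since a subscription's
    origin name embeds the subscription OID, which may change.
    """
    new = Cluster()
    for roident, (name, lsn) in sorted(old.origins.items()):
        new.create_origin(roident, rename.get(name, name))
        new.advance(roident, lsn)
    return new


# roident 2 is unused in the old cluster; the gap must survive the upgrade.
old = Cluster({1: ("my_origin", 100), 3: ("pg_16392", 250)})
new = upgrade(old, {"pg_16392": "pg_16403"})
```

Here the origin name changes from pg_16392 to pg_16403 while the roident stays 3, so records in pg_commit_ts that refer to roident 3 would still resolve to the same subscription.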
Please let me know your feedback regarding this patch.
[1] - https://www.postgresql.org/message-id/flat/182311743703924%40mail.yandex.ru
[2] - https://www.postgresql.org/message-id/CAD21AoDG8zQpHHfw7OvaEy7W0ZSyP%3D_dS-hrcquJ3C_ctMDmMQ%40mail.gmail.com
regards,
Ajin Cherian
Fujitsu Australia

Hayato Kuroda (Fujitsu) <kuroda.hayato@fujitsu.com>, Apr 29, 2026, 8:41 AM UTC
Dear Ajin,
> Sequence of Events During Upgrade
> 1. pg_dumpall dumps all non-subscription replication origins from the
> old cluster with their roidents and LSN positions.
> 2. pg_dump dumps each subscription, but now records the old roident
> alongside the subscription info.
> 3. During restore, pg_dumpall's output recreates non-subscription
> origins on the new cluster with their original roidents via
> binary_upgrade_create_replication_origin().
To confirm, why do we have to handle separately for subscription-associated
origins? I'm thinking it's not needed if the subscription's OID is preserved
during the upgrade.
I checked the old thread to preserve it [1], but it could not be accepted because
there are no strong motivations. But I feel this is the good reason to do so now.
How do you feel?
[1]: https://www.postgresql.org/message-id/CALDaNm2Wj63VcbB0SY2NECHr1mKM1YSaV1ZydrdQVxyox2O2hg%40mail.gmail.com
Best regards,
Hayato Kuroda
FUJITSU LIMITED

vignesh C <vignesh21@gmail.com>, Apr 30, 2026, 6:52 AM UTC
On Wed, 29 Apr 2026 at 14:11, Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
> Dear Ajin,
> > Sequence of Events During Upgrade
> > 1. pg_dumpall dumps all non-subscription replication origins from the
> > old cluster with their roidents and LSN positions.
> > 2. pg_dump dumps each subscription, but now records the old roident
> > alongside the subscription info.
> > 3. During restore, pg_dumpall's output recreates non-subscription
> > origins on the new cluster with their original roidents via
> > binary_upgrade_create_replication_origin().
> To confirm, why do we have to handle separately for subscription-associated
> origins? I'm thinking it's not needed if the subscription's OID is preserved
> during the upgrade.
+1 to preserve the subscription OID. This should make preserving
replication origin easier.
> I checked the old thread to preserve it [1], but it could not be accepted because
> there are no strong motivations. But I feel this is the good reason to do so now.
Here is a rebased version of the patch.
Regards,
Vignesh

shveta malik <shveta.malik@gmail.com>, Apr 30, 2026, 9:39 AM UTC
On Wed, Apr 29, 2026 at 2:11 PM Hayato Kuroda (Fujitsu)
<kuroda.hayato@fujitsu.com> wrote:
> Dear Ajin,
> > Sequence of Events During Upgrade
> > 1. pg_dumpall dumps all non-subscription replication origins from the
> > old cluster with their roidents and LSN positions.
> > 2. pg_dump dumps each subscription, but now records the old roident
> > alongside the subscription info.
> > 3. During restore, pg_dumpall's output recreates non-subscription
> > origins on the new cluster with their original roidents via
> > binary_upgrade_create_replication_origin().
> To confirm, why do we have to handle separately for subscription-associated
> origins? I'm thinking it's not needed if the subscription's OID is preserved
> during the upgrade.
I’m not sure how preserving the subscription OID would ensure that the
origin ID is also preserved for sub-associated origins. Could you
please elaborate?
As I understand it, roident values are assigned independently during
origin creation. Even if subscription OIDs are preserved, the origin
IDs could still be reassigned differently on the new cluster. For
example, suppose we have two subscriptions, sub1 and sub2, with
roident values 2 and 3, assuming 1 was previously used and dropped.
After upgrade, origin creation may start allocating from 1 again,
resulting in roident values 1 and 2 instead. Since pg_commit_ts stores
the numeric roident, not the origin name, this mismatch could still
lead to incorrect conflict detection. Wouldn’t that result in the same
wrong conflict detection issue we are trying to avoid?
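The sub1/sub2 example can be sketched as follows, under the assumption that origin creation hands out the lowest unused ID. The helper lowest_free_roident() is a hypothetical model of that policy, not the actual replorigin_create() code.

```python
def lowest_free_roident(used):
    """Return the smallest positive ID not in `used` (assumed allocation policy)."""
    roident = 1
    while roident in used:
        roident += 1
    return roident


# Old cluster: roident 1 was created and later dropped, so sub1 and sub2
# ended up with roidents 2 and 3.
old = {"sub1": 2, "sub2": 3}

# New cluster: origins are recreated from scratch in the same order, with no
# memory of the dropped origin, so allocation starts from 1 again.
new = {}
for sub in old:
    new[sub] = lowest_free_roident(set(new.values()))
```

In this model, new ends up as sub1 = 1 and sub2 = 2 even though the subscriptions themselves are unchanged, so a copied record that says roident 2 would now resolve to sub2 instead of sub1.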
Please let me know if my understanding is wrong.
thanks
Shveta