New Remix App

pgsql-hackers

❮

promotion related handling in pg_sync_replication_slots()

Jump to comment-1
shveta.malik@gmail.com2024-04-04T11:35:50+00:00
Hello hackers, Currently, promotion related handling is missing in the slot sync SQL function pg_sync_replication_slots(). Here is the background on how it is done in slot sync worker: During promotion, the startup process in order to shut down the slot-sync worker, sets the 'stopSignaled' flag, sends the shut-down signal, and waits for slot sync worker to exit. Meanwhile if the postmaster has not noticed the promotion yet, it may end up restarting slot sync worker. In such a case, the worker exits if 'stopSignaled' is set. Since there is a chance that the user (or any of his scripts/tools) may execute SQL function pg_sync_replication_slots() in parallel to promotion, such handling is needed in this SQL function as well, The attached patch attempts to implement the same. Changes are: 1) If pg_sync_replication_slots() is already running when the promotion is triggered, ShutDownSlotSync() checks the 'SlotSyncCtx->syncing' flag as well and waits for it to become false i.e. waits till parallel running SQL function is finished. 2) If pg_sync_replication_slots() is invoked when promotion is already in progress, pg_sync_replication_slots() respects the 'stopSignaled' flag set by the startup process and becomes a no-op. thanks Shveta
- Jump to comment-1
  amit.kapila16@gmail.com2024-04-04T11:47:07+00:00
  On Thu, Apr 4, 2024 at 5:05 PM shveta malik <shveta.malik@gmail.com> wrote: > > Hello hackers, > > Currently, promotion related handling is missing in the slot sync SQL > function pg_sync_replication_slots(). Here is the background on how > it is done in slot sync worker: > During promotion, the startup process in order to shut down the > slot-sync worker, sets the 'stopSignaled' flag, sends the shut-down > signal, and waits for slot sync worker to exit. Meanwhile if the > postmaster has not noticed the promotion yet, it may end up restarting > slot sync worker. In such a case, the worker exits if 'stopSignaled' > is set. > > Since there is a chance that the user (or any of his scripts/tools) > may execute SQL function pg_sync_replication_slots() in parallel to > promotion, such handling is needed in this SQL function as well, The > attached patch attempts to implement the same. > Thanks for the report and patch. I'll look into it. -- With Regards, Amit Kapila.
  Jump to comment-1
  shveta.malik@gmail.com2024-04-05T05:01:35+00:00
  On Thu, Apr 4, 2024 at 5:17 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Thu, Apr 4, 2024 at 5:05 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > Hello hackers, > > > > Currently, promotion related handling is missing in the slot sync SQL > > function pg_sync_replication_slots(). Here is the background on how > > it is done in slot sync worker: > > During promotion, the startup process in order to shut down the > > slot-sync worker, sets the 'stopSignaled' flag, sends the shut-down > > signal, and waits for slot sync worker to exit. Meanwhile if the > > postmaster has not noticed the promotion yet, it may end up restarting > > slot sync worker. In such a case, the worker exits if 'stopSignaled' > > is set. > > > > Since there is a chance that the user (or any of his scripts/tools) > > may execute SQL function pg_sync_replication_slots() in parallel to > > promotion, such handling is needed in this SQL function as well, The > > attached patch attempts to implement the same. > > > > Thanks for the report and patch. I'll look into it. > Please find v2. Changes are: 1) Rebased the patch as there was conflict due to recent commit 6f132ed. 2) Added an Assert in update_synced_slots_inactive_since() to ensure that the slot does not have active_pid. 3) Improved commit msg and comments. thanks Shveta
  Jump to comment-1
  bharath.rupireddyforpostgres@gmail.com2024-04-06T06:55:47+00:00
  On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta.malik@gmail.com> wrote: > > Please find v2. Changes are: Thanks for the patch. Here are some comments. 1. Can we have a clear saying in the shmem variable who's syncing at the moment? Is it a slot sync worker or a backend via SQL function? Perhaps turn "bool syncing;" to "SlotSyncSource sync_source;" typedef enum SlotSyncSource { SLOT_SYNC_NONE, SLOT_SYNC_WORKER, SLOT_SYNC_BACKEND, } SlotSyncSource; Then, the check in ShutDownSlotSync can be: + /* + * Return if neither the slot sync worker is running nor the function + * pg_sync_replication_slots() is executing. + */ + if ((SlotSyncCtx->pid == InvalidPid) && SlotSyncCtx->sync_source != SLOT_SYNC_BACKEND) { 2. SyncReplicationSlots(WalReceiverConn *wrconn) { + /* + * Startup process signaled the slot sync to stop, so if meanwhile user + * has invoked slot sync SQL function, simply return. + */ + SpinLockAcquire(&SlotSyncCtx->mutex); + if (SlotSyncCtx->stopSignaled) + { + ereport(LOG, + errmsg("skipping slot synchronization as slot sync shutdown is signaled during promotion")); + Unless I'm missing something, I think this can't detect if the backend via SQL function is already half-way through syncing in synchronize_one_slot. So, better move this check to (or also have it there) slot sync loop that calls synchronize_one_slot. To avoid spinlock acquisitions, we can perhaps do this check in when we acquire the spinlock for synced flag. /* Search for the named slot */ if ((slot = SearchNamedReplicationSlot(remote_slot->name, true))) { bool synced; SpinLockAcquire(&slot->mutex); synced = slot->data.synced; << get SlotSyncCtx->stopSignaled here >> SpinLockRelease(&slot->mutex); << do the slot sync skip check here if (stopSignaled) >> 3. Can we have a test or steps at least to check the consequences manually one can get if slot syncing via SQL function is happening during the promotion? IIUC, we need to ensure there is no backend acquiring it and performing sync while the slot sync worker is shutting down/standby promotion is occuring. Otherwise, some of the slots can get resynced and some are not while we are shutting down the slot sync worker as part of the standby promotion which might leave the slots in an inconsistent state. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
  Jump to comment-1
  shveta.malik@gmail.com2024-04-12T02:28:00+00:00
  On Sat, Apr 6, 2024 at 12:25 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > Please find v2. Changes are: > > Thanks for the patch. Here are some comments. Thanks for reviewing. > > 1. Can we have a clear saying in the shmem variable who's syncing at > the moment? Is it a slot sync worker or a backend via SQL function? > Perhaps turn "bool syncing;" to "SlotSyncSource sync_source;" > > typedef enum SlotSyncSource > { > SLOT_SYNC_NONE, > SLOT_SYNC_WORKER, > SLOT_SYNC_BACKEND, > } SlotSyncSource; > > Then, the check in ShutDownSlotSync can be: > > + /* > + * Return if neither the slot sync worker is running nor the function > + * pg_sync_replication_slots() is executing. > + */ > + if ((SlotSyncCtx->pid == InvalidPid) && > SlotSyncCtx->sync_source != SLOT_SYNC_BACKEND) > { > > 2. > SyncReplicationSlots(WalReceiverConn *wrconn) > { > + /* > + * Startup process signaled the slot sync to stop, so if meanwhile user > + * has invoked slot sync SQL function, simply return. > + */ > + SpinLockAcquire(&SlotSyncCtx->mutex); > + if (SlotSyncCtx->stopSignaled) > + { > + ereport(LOG, > + errmsg("skipping slot synchronization as slot sync > shutdown is signaled during promotion")); > + > > Unless I'm missing something, I think this can't detect if the backend > via SQL function is already half-way through syncing in > synchronize_one_slot. So, better move this check to (or also have it > there) slot sync loop that calls synchronize_one_slot. To avoid > spinlock acquisitions, we can perhaps do this check in when we acquire > the spinlock for synced flag. If the sync via SQL function is already half-way, then promotion should wait for it to finish. I don't think it is a good idea to move the check to synchronize_one_slot(). The sync-call should either not start (if it noticed the promotion) or finish the sync and then let promotion proceed. But I would like to know others' opinion on this. > > /* Search for the named slot */ > if ((slot = SearchNamedReplicationSlot(remote_slot->name, true))) > { > bool synced; > > SpinLockAcquire(&slot->mutex); > synced = slot->data.synced; > << get SlotSyncCtx->stopSignaled here >> > SpinLockRelease(&slot->mutex); > > << do the slot sync skip check here if (stopSignaled) >> > > 3. Can we have a test or steps at least to check the consequences > manually one can get if slot syncing via SQL function is happening > during the promotion? > > IIUC, we need to ensure there is no backend acquiring it and > performing sync while the slot sync worker is shutting down/standby > promotion is occuring. Otherwise, some of the slots can get resynced > and some are not while we are shutting down the slot sync worker as > part of the standby promotion which might leave the slots in an > inconsistent state. I do not think that we can reach a state (exception is some error scenario) where some of the slots are synced while the rest are not during a *particular* sync-cycle only because promotion is going in parallel. (And yes we need to fix the race-condition stated by Amit up-thread for this statement to be true.) thanks Shveta
  Jump to comment-1
  amit.kapila16@gmail.com2024-04-12T10:48:12+00:00
  On Fri, Apr 12, 2024 at 7:57 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Sat, Apr 6, 2024 at 12:25 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Please find v2. Changes are: > > > > Thanks for the patch. Here are some comments. > > Thanks for reviewing. > > > > 1. Can we have a clear saying in the shmem variable who's syncing at > > the moment? Is it a slot sync worker or a backend via SQL function? > > Perhaps turn "bool syncing;" to "SlotSyncSource sync_source;" > > > > typedef enum SlotSyncSource > > { > > SLOT_SYNC_NONE, > > SLOT_SYNC_WORKER, > > SLOT_SYNC_BACKEND, > > } SlotSyncSource; > > > > Then, the check in ShutDownSlotSync can be: > > > > + /* > > + * Return if neither the slot sync worker is running nor the function > > + * pg_sync_replication_slots() is executing. > > + */ > > + if ((SlotSyncCtx->pid == InvalidPid) && > > SlotSyncCtx->sync_source != SLOT_SYNC_BACKEND) > > { > > I don't know if this will be help, especially after fixing the race condition I mentioned. But otherwise, also, at this stage it doesn't seem helpful to add the source of sync explicitly. > > 2. > > SyncReplicationSlots(WalReceiverConn *wrconn) > > { > > + /* > > + * Startup process signaled the slot sync to stop, so if meanwhile user > > + * has invoked slot sync SQL function, simply return. > > + */ > > + SpinLockAcquire(&SlotSyncCtx->mutex); > > + if (SlotSyncCtx->stopSignaled) > > + { > > + ereport(LOG, > > + errmsg("skipping slot synchronization as slot sync > > shutdown is signaled during promotion")); > > + > > > > Unless I'm missing something, I think this can't detect if the backend > > via SQL function is already half-way through syncing in > > synchronize_one_slot. So, better move this check to (or also have it > > there) slot sync loop that calls synchronize_one_slot. To avoid > > spinlock acquisitions, we can perhaps do this check in when we acquire > > the spinlock for synced flag. > > If the sync via SQL function is already half-way, then promotion > should wait for it to finish. I don't think it is a good idea to move > the check to synchronize_one_slot(). The sync-call should either not > start (if it noticed the promotion) or finish the sync and then let > promotion proceed. But I would like to know others' opinion on this. > Agreed. -- With Regards, Amit Kapila.
  Jump to comment-1
  shveta.malik@gmail.com2024-04-12T11:56:09+00:00
  On Fri, Apr 12, 2024 at 4:18 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Apr 12, 2024 at 7:57 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Sat, Apr 6, 2024 at 12:25 PM Bharath Rupireddy > > <bharath.rupireddyforpostgres@gmail.com> wrote: > > > > > > On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > Please find v2. Changes are: > > > > > > Thanks for the patch. Here are some comments. > > > > Thanks for reviewing. > > > > > > 1. Can we have a clear saying in the shmem variable who's syncing at > > > the moment? Is it a slot sync worker or a backend via SQL function? > > > Perhaps turn "bool syncing;" to "SlotSyncSource sync_source;" > > > > > > typedef enum SlotSyncSource > > > { > > > SLOT_SYNC_NONE, > > > SLOT_SYNC_WORKER, > > > SLOT_SYNC_BACKEND, > > > } SlotSyncSource; > > > > > > Then, the check in ShutDownSlotSync can be: > > > > > > + /* > > > + * Return if neither the slot sync worker is running nor the function > > > + * pg_sync_replication_slots() is executing. > > > + */ > > > + if ((SlotSyncCtx->pid == InvalidPid) && > > > SlotSyncCtx->sync_source != SLOT_SYNC_BACKEND) > > > { > > > > > I don't know if this will be help, especially after fixing the race > condition I mentioned. But otherwise, also, at this stage it doesn't > seem helpful to add the source of sync explicitly. > Agreed. Please find v3 addressing race-condition and one other comment. Up-thread it was suggested that, probably, checking SlotSyncCtx->syncing should be sufficient in ShutDownSlotSync(). On re-thinking, it might not be. Slot sync worker sets and resets 'syncing' with each sync-cycle, and thus we need to rely on worker's pid in ShutDownSlotSync(), as there could be a window where promotion is triggered and 'syncing' is not set for worker, while the worker is still running. This implementation of setting and resetting syncing with each sync-cycle looks better as compared to setting syncing during the entire life-cycle of the worker. So, I did not change it. To fix the race condition, I moved the setting of the 'syncing' flag together with the 'stopSignaled' check under the same spinLock for the SQL function. OTOH, for worker, I feel it is good to check 'stopSignaled' at the beginning itself, while retaining the setting/resetting of 'syncing' at a later stage during the actual sync cycle. This makes handling for SQL function and worker slightly different. And thus to achieve this, I had to take the 'syncing' flag handling out of synchronize_slots() and move it to both worker and SQL function by introducing 2 new functions check_and_set_syncing_flag() and reset_syncing_flag(). I am analyzing if there are better ways to achieve this, any suggestions are welcome. thanks Shveta
  Jump to comment-1
  amit.kapila16@gmail.com2024-04-15T08:59:33+00:00
  On Fri, Apr 12, 2024 at 5:25 PM shveta malik <shveta.malik@gmail.com> wrote: > > Please find v3 addressing race-condition and one other comment. > > Up-thread it was suggested that, probably, checking > SlotSyncCtx->syncing should be sufficient in ShutDownSlotSync(). On > re-thinking, it might not be. Slot sync worker sets and resets > 'syncing' with each sync-cycle, and thus we need to rely on worker's > pid in ShutDownSlotSync(), as there could be a window where promotion > is triggered and 'syncing' is not set for worker, while the worker is > still running. This implementation of setting and resetting syncing > with each sync-cycle looks better as compared to setting syncing > during the entire life-cycle of the worker. So, I did not change it. > To retain this we need to have different handling for 'syncing' for workers and function which seems like more maintenance burden than the value it provides. Moreover, in SyncReplicationSlots(), we are calling a function after acquiring spinlock which is not our usual coding practice. One minor comment: * All the fields except 'syncing' are used only by slotsync worker. * 'syncing' is used both by worker and SQL function pg_sync_replication_slots. */ typedef struct SlotSyncCtxStruct { pid_t pid; bool stopSignaled; bool syncing; time_t last_start_time; slock_t mutex; } SlotSyncCtxStruct; I feel the above comment is no longer valid after this patch. We can probably remove this altogether. -- With Regards, Amit Kapila.
  Jump to comment-1
  shveta.malik@gmail.com2024-04-15T10:09:12+00:00
  On Mon, Apr 15, 2024 at 2:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Apr 12, 2024 at 5:25 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > Please find v3 addressing race-condition and one other comment. > > > > Up-thread it was suggested that, probably, checking > > SlotSyncCtx->syncing should be sufficient in ShutDownSlotSync(). On > > re-thinking, it might not be. Slot sync worker sets and resets > > 'syncing' with each sync-cycle, and thus we need to rely on worker's > > pid in ShutDownSlotSync(), as there could be a window where promotion > > is triggered and 'syncing' is not set for worker, while the worker is > > still running. This implementation of setting and resetting syncing > > with each sync-cycle looks better as compared to setting syncing > > during the entire life-cycle of the worker. So, I did not change it. > > > > To retain this we need to have different handling for 'syncing' for > workers and function which seems like more maintenance burden than the > value it provides. Moreover, in SyncReplicationSlots(), we are calling > a function after acquiring spinlock which is not our usual coding > practice. Okay. Changed it to consistent handling. Now both worker and SQL function set 'syncing' when they start and reset it when they exit. > One minor comment: > * All the fields except 'syncing' are used only by slotsync worker. > * 'syncing' is used both by worker and SQL function pg_sync_replication_slots. > */ > typedef struct SlotSyncCtxStruct > { > pid_t pid; > bool stopSignaled; > bool syncing; > time_t last_start_time; > slock_t mutex; > } SlotSyncCtxStruct; > > I feel the above comment is no longer valid after this patch. We can > probably remove this altogether. Yes, changed. Please find v4 addressing the above comments. thanks Shveta
  Jump to comment-1
  bertranddrouvot.pg@gmail.com2024-04-15T12:51:14+00:00
  Hi, On Mon, Apr 15, 2024 at 03:38:53PM +0530, shveta malik wrote: > On Mon, Apr 15, 2024 at 2:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Apr 12, 2024 at 5:25 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Please find v3 addressing race-condition and one other comment. > > > > > > Up-thread it was suggested that, probably, checking > > > SlotSyncCtx->syncing should be sufficient in ShutDownSlotSync(). On > > > re-thinking, it might not be. Slot sync worker sets and resets > > > 'syncing' with each sync-cycle, and thus we need to rely on worker's > > > pid in ShutDownSlotSync(), as there could be a window where promotion > > > is triggered and 'syncing' is not set for worker, while the worker is > > > still running. This implementation of setting and resetting syncing > > > with each sync-cycle looks better as compared to setting syncing > > > during the entire life-cycle of the worker. So, I did not change it. > > > > > > > To retain this we need to have different handling for 'syncing' for > > workers and function which seems like more maintenance burden than the > > value it provides. Moreover, in SyncReplicationSlots(), we are calling > > a function after acquiring spinlock which is not our usual coding > > practice. > > Okay. Changed it to consistent handling. Thanks for the patch! > Now both worker and SQL > function set 'syncing' when they start and reset it when they exit. It means that it's not possible anymore to trigger a manual sync if sync_replication_slots is on. Indeed that would trigger: postgres=# select pg_sync_replication_slots(); ERROR: cannot synchronize replication slots concurrently That looks like an issue to me, thoughts? Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
  Jump to comment-1
  amit.kapila16@gmail.com2024-04-15T13:00:09+00:00
  On Mon, Apr 15, 2024 at 6:21 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Mon, Apr 15, 2024 at 03:38:53PM +0530, shveta malik wrote: > > On Mon, Apr 15, 2024 at 2:29 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > On Fri, Apr 12, 2024 at 5:25 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > > > Please find v3 addressing race-condition and one other comment. > > > > > > > > Up-thread it was suggested that, probably, checking > > > > SlotSyncCtx->syncing should be sufficient in ShutDownSlotSync(). On > > > > re-thinking, it might not be. Slot sync worker sets and resets > > > > 'syncing' with each sync-cycle, and thus we need to rely on worker's > > > > pid in ShutDownSlotSync(), as there could be a window where promotion > > > > is triggered and 'syncing' is not set for worker, while the worker is > > > > still running. This implementation of setting and resetting syncing > > > > with each sync-cycle looks better as compared to setting syncing > > > > during the entire life-cycle of the worker. So, I did not change it. > > > > > > > > > > To retain this we need to have different handling for 'syncing' for > > > workers and function which seems like more maintenance burden than the > > > value it provides. Moreover, in SyncReplicationSlots(), we are calling > > > a function after acquiring spinlock which is not our usual coding > > > practice. > > > > Okay. Changed it to consistent handling. > > Thanks for the patch! > > > Now both worker and SQL > > function set 'syncing' when they start and reset it when they exit. > > It means that it's not possible anymore to trigger a manual sync if > sync_replication_slots is on. Indeed that would trigger: > > postgres=# select pg_sync_replication_slots(); > ERROR: cannot synchronize replication slots concurrently > > That looks like an issue to me, thoughts? > This is intentional as of now for the sake of keeping implementation/code simple. It is not difficult to allow them but I am not sure whether we want to add another set of conditions allowing them in parallel. And that too in an unpredictable way as the API will work only for the time slot sync worker is not performing the sync. -- With Regards, Amit Kapila.
  Jump to comment-1
  bertranddrouvot.pg@gmail.com2024-04-15T14:17:15+00:00
  Hi, On Mon, Apr 15, 2024 at 06:29:49PM +0530, Amit Kapila wrote: > On Mon, Apr 15, 2024 at 6:21 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Mon, Apr 15, 2024 at 03:38:53PM +0530, shveta malik wrote: > > > Now both worker and SQL > > > function set 'syncing' when they start and reset it when they exit. > > > > It means that it's not possible anymore to trigger a manual sync if > > sync_replication_slots is on. Indeed that would trigger: > > > > postgres=# select pg_sync_replication_slots(); > > ERROR: cannot synchronize replication slots concurrently > > > > That looks like an issue to me, thoughts? > > > > This is intentional as of now for the sake of keeping > implementation/code simple. It is not difficult to allow them but I am > not sure whether we want to add another set of conditions allowing > them in parallel. I think that the ability to launch a manual sync before a switchover would be missed. Except for this case I don't think that's an issue to prevent them to run in parallel. > And that too in an unpredictable way as the API will > work only for the time slot sync worker is not performing the sync. Yeah but then at least you would know that there is "really" a sync in progress (which is not the case currently with v4, as the sync worker being started is enough to prevent a manual sync even if a sync is not in progress). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
  Jump to comment-1
  amit.kapila16@gmail.com2024-04-16T02:51:26+00:00
  On Mon, Apr 15, 2024 at 7:47 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Mon, Apr 15, 2024 at 06:29:49PM +0530, Amit Kapila wrote: > > On Mon, Apr 15, 2024 at 6:21 PM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > On Mon, Apr 15, 2024 at 03:38:53PM +0530, shveta malik wrote: > > > > Now both worker and SQL > > > > function set 'syncing' when they start and reset it when they exit. > > > > > > It means that it's not possible anymore to trigger a manual sync if > > > sync_replication_slots is on. Indeed that would trigger: > > > > > > postgres=# select pg_sync_replication_slots(); > > > ERROR: cannot synchronize replication slots concurrently > > > > > > That looks like an issue to me, thoughts? > > > > > > > This is intentional as of now for the sake of keeping > > implementation/code simple. It is not difficult to allow them but I am > > not sure whether we want to add another set of conditions allowing > > them in parallel. > > I think that the ability to launch a manual sync before a switchover would be > missed. Except for this case I don't think that's an issue to prevent them to > run in parallel. > I think if the slotsync worker is available, it can do that as well. There is no clear use case for allowing them in parallel and I feel it would add more confusion when it can work sometimes but not other times. However, if we receive some report from the field where there is a real demand for such a thing, it should be easy to achieve. For example, I can imagine that we can have sync_state that has values 'started', 'in_progress' , and 'finished'. This should allow us to achieve what the current proposed patch is doing along with allowing the API to work in parallel when the sync_state is not 'in_progress'. I think for now let's restrict their usage in parallel and make the promotion behavior consistent both for worker and API. -- With Regards, Amit Kapila.
  Jump to comment-1
  bertranddrouvot.pg@gmail.com2024-04-16T06:33:36+00:00
  Hi, On Tue, Apr 16, 2024 at 08:21:04AM +0530, Amit Kapila wrote: > On Mon, Apr 15, 2024 at 7:47 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Mon, Apr 15, 2024 at 06:29:49PM +0530, Amit Kapila wrote: > > > On Mon, Apr 15, 2024 at 6:21 PM Bertrand Drouvot > > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > > > On Mon, Apr 15, 2024 at 03:38:53PM +0530, shveta malik wrote: > > > > > Now both worker and SQL > > > > > function set 'syncing' when they start and reset it when they exit. > > > > > > > > It means that it's not possible anymore to trigger a manual sync if > > > > sync_replication_slots is on. Indeed that would trigger: > > > > > > > > postgres=# select pg_sync_replication_slots(); > > > > ERROR: cannot synchronize replication slots concurrently > > > > > > > > That looks like an issue to me, thoughts? > > > > > > > > > > This is intentional as of now for the sake of keeping > > > implementation/code simple. It is not difficult to allow them but I am > > > not sure whether we want to add another set of conditions allowing > > > them in parallel. > > > > I think that the ability to launch a manual sync before a switchover would be > > missed. Except for this case I don't think that's an issue to prevent them to > > run in parallel. > > > > I think if the slotsync worker is available, it can do that as well. Right, but one has no control as to when the sync is triggered. > There is no clear use case for allowing them in parallel and I feel it > would add more confusion when it can work sometimes but not other > times. However, if we receive some report from the field where there > is a real demand for such a thing, it should be easy to achieve. For > example, I can imagine that we can have sync_state that has values > 'started', 'in_progress' , and 'finished'. This should allow us to > achieve what the current proposed patch is doing along with allowing > the API to work in parallel when the sync_state is not 'in_progress'. > > I think for now let's restrict their usage in parallel and make the > promotion behavior consistent both for worker and API. Okay, let's do it that way. Is it worth to add a few words in the doc related to pg_sync_replication_slots() though? (to mention it can not be used if the sync slot worker is running). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
  Jump to comment-1
  amit.kapila16@gmail.com2024-04-16T06:38:42+00:00
  On Tue, Apr 16, 2024 at 12:03 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > On Tue, Apr 16, 2024 at 08:21:04AM +0530, Amit Kapila wrote: > > > There is no clear use case for allowing them in parallel and I feel it > > would add more confusion when it can work sometimes but not other > > times. However, if we receive some report from the field where there > > is a real demand for such a thing, it should be easy to achieve. For > > example, I can imagine that we can have sync_state that has values > > 'started', 'in_progress' , and 'finished'. This should allow us to > > achieve what the current proposed patch is doing along with allowing > > the API to work in parallel when the sync_state is not 'in_progress'. > > > > I think for now let's restrict their usage in parallel and make the > > promotion behavior consistent both for worker and API. > > Okay, let's do it that way. Is it worth to add a few words in the doc related to > pg_sync_replication_slots() though? (to mention it can not be used if the sync > slot worker is running). > Yes, this makes sense to me. -- With Regards, Amit Kapila.
  Jump to comment-1
  shveta.malik@gmail.com2024-04-16T08:22:58+00:00
  On Tue, Apr 16, 2024 at 12:03 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > I think for now let's restrict their usage in parallel and make the > > promotion behavior consistent both for worker and API. > > Okay, let's do it that way. Is it worth to add a few words in the doc related to > pg_sync_replication_slots() though? (to mention it can not be used if the sync > slot worker is running). +1. Please find v6 having the suggested doc changes. thanks Shveta
  Jump to comment-1
  houzj.fnst@fujitsu.com2024-04-16T03:57:57+00:00
  On Monday, April 15, 2024 6:09 PM shveta malik <shveta.malik@gmail.com> wrote: > > Please find v4 addressing the above comments. Thanks for the patch. Here are few comments: 1. + ereport(ERROR, + errmsg("promotion in progress, can not synchronize replication slots")); + } I think an errcode is needed. The style of the error message seems a bit unnatural to me. I suggest: "cannot synchronize replication slots when standby promotion is ongoing" 2. + if (worker_pid != InvalidPid) + Assert(SlotSyncCtx->pid == InvalidPid); We could merge the checks into one Assert(). Assert(SlotSyncCtx->pid == InvalidPid || worker_pid == InvalidPid); 3. - pqsignal(SIGINT, SignalHandlerForShutdownRequest); I realized that we should register this before setting SlotSyncCtx->pid, otherwise if the standby is promoted after setting pid and before registering signal handle function, the slotsync worker could miss to handle SIGINT sent by startup process(ShutDownSlotSync). This is an existing issue for slotsync worker, but maybe we could fix it together with the patch. Best Regards, Hou zj
  Jump to comment-1
  shveta.malik@gmail.com2024-04-16T04:30:27+00:00
  On Tue, Apr 16, 2024 at 9:27 AM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > On Monday, April 15, 2024 6:09 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > Please find v4 addressing the above comments. > > Thanks for the patch. > > Here are few comments: Thanks for reviewing the patch. > > 1. > > + ereport(ERROR, > + errmsg("promotion in progress, can not synchronize replication slots")); > + } > > I think an errcode is needed. > > The style of the error message seems a bit unnatural to me. I suggest: > "cannot synchronize replication slots when standby promotion is ongoing" Modified. > > 2. > > + if (worker_pid != InvalidPid) > + Assert(SlotSyncCtx->pid == InvalidPid); > > We could merge the checks into one Assert(). > Assert(SlotSyncCtx->pid == InvalidPid || worker_pid == InvalidPid); Modified. > > 3. > > - pqsignal(SIGINT, SignalHandlerForShutdownRequest); > > I realized that we should register this before setting SlotSyncCtx->pid, > otherwise if the standby is promoted after setting pid and before registering > signal handle function, the slotsync worker could miss to handle SIGINT sent by > startup process(ShutDownSlotSync). This is an existing issue for slotsync > worker, but maybe we could fix it together with the patch. Yes, it seems like a problem. Fixed it. Also to be consistent, moved other signal handlers' registration as well before we set pid. Please find v5 addressing above comments. thanks Shveta
  Jump to comment-1
  bertranddrouvot.pg@gmail.com2024-04-16T06:52:51+00:00
  Hi, On Tue, Apr 16, 2024 at 10:00:04AM +0530, shveta malik wrote: > Please find v5 addressing above comments. Thanks! @@ -1634,9 +1677,14 @@ SyncReplicationSlots(WalReceiverConn *wrconn) { PG_ENSURE_ERROR_CLEANUP(slotsync_failure_callback, PointerGetDatum(wrconn)); { + check_flags_and_set_sync_info(InvalidPid); + Given the fact that if the sync worker is running it won't be possible to trigger a manual sync with pg_sync_replication_slots(), what about also checking the "sync_replication_slots" value at the start of SyncReplicationSlots() and emmit an error if sync_replication_slots is set to on? (The message could explicitly states that it's not possible to use the function if sync_replication_slots is set to on). Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
  Jump to comment-1
  houzj.fnst@fujitsu.com2024-04-16T08:25:45+00:00
  On Tuesday, April 16, 2024 2:52 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: Hi, > On Tue, Apr 16, 2024 at 10:00:04AM +0530, shveta malik wrote: > > Please find v5 addressing above comments. > > Thanks! > > @@ -1634,9 +1677,14 @@ SyncReplicationSlots(WalReceiverConn *wrconn) { > PG_ENSURE_ERROR_CLEANUP(slotsync_failure_callback, > PointerGetDatum(wrconn)); > { > + check_flags_and_set_sync_info(InvalidPid); > + > > Given the fact that if the sync worker is running it won't be possible to trigger a > manual sync with pg_sync_replication_slots(), what about also checking the > "sync_replication_slots" value at the start of SyncReplicationSlots() and emmit > an error if sync_replication_slots is set to on? (The message could explicitly > states that it's not possible to use the function if sync_replication_slots is set to > on). I personally feel adding the additional check for sync_replication_slots may not improve the situation here. Because the GUC sync_replication_slots can change at any point, the GUC could be false when performing this addition check and is set to true immediately after the check, so It could not simplify the logic anyway. Best Regards, Hou zj
  Jump to comment-1
  shveta.malik@gmail.com2024-04-16T08:37:03+00:00
  On Tue, Apr 16, 2024 at 1:55 PM Zhijie Hou (Fujitsu) <houzj.fnst@fujitsu.com> wrote: > > On Tuesday, April 16, 2024 2:52 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > > Hi, > > > On Tue, Apr 16, 2024 at 10:00:04AM +0530, shveta malik wrote: > > > Please find v5 addressing above comments. > > > > Thanks! > > > > @@ -1634,9 +1677,14 @@ SyncReplicationSlots(WalReceiverConn *wrconn) { > > PG_ENSURE_ERROR_CLEANUP(slotsync_failure_callback, > > PointerGetDatum(wrconn)); > > { > > + check_flags_and_set_sync_info(InvalidPid); > > + > > > > Given the fact that if the sync worker is running it won't be possible to trigger a > > manual sync with pg_sync_replication_slots(), what about also checking the > > "sync_replication_slots" value at the start of SyncReplicationSlots() and emmit > > an error if sync_replication_slots is set to on? (The message could explicitly > > states that it's not possible to use the function if sync_replication_slots is set to > > on). > > I personally feel adding the additional check for sync_replication_slots may > not improve the situation here. Because the GUC sync_replication_slots can > change at any point, the GUC could be false when performing this addition check > and is set to true immediately after the check, so It could not simplify the logic > anyway. +1. I feel doc and "cannot synchronize replication slots concurrently" check should suffice. In the scenario which Hou-San pointed out, if after performing the GUC check in SQL function, this GUC is enabled immediately and say worker is started sooner than the function could get chance to sync, in that case as well, SQL function will ultimately get error "cannot synchronize replication slots concurrently", even though GUC is enabled. Thus, I feel we should stick with samer error in all scenarios. thanks Shveta
  Jump to comment-1
  bertranddrouvot.pg@gmail.com2024-04-16T09:22:14+00:00
  Hi, On Tue, Apr 16, 2024 at 02:06:45PM +0530, shveta malik wrote: > On Tue, Apr 16, 2024 at 1:55 PM Zhijie Hou (Fujitsu) > <houzj.fnst@fujitsu.com> wrote: > > I personally feel adding the additional check for sync_replication_slots may > > not improve the situation here. Because the GUC sync_replication_slots can > > change at any point, the GUC could be false when performing this addition check > > and is set to true immediately after the check, so It could not simplify the logic > > anyway. > > +1. > I feel doc and "cannot synchronize replication slots concurrently" > check should suffice. > > In the scenario which Hou-San pointed out, if after performing the > GUC check in SQL function, this GUC is enabled immediately and say > worker is started sooner than the function could get chance to sync, > in that case as well, SQL function will ultimately get error "cannot > synchronize replication slots concurrently", even though GUC is > enabled. Thus, I feel we should stick with samer error in all > scenarios. Okay, fine by me, let's forget about checking sync_replication_slots then. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
  Jump to comment-1
  shveta.malik@gmail.com2024-04-18T07:05:35+00:00
  On Tue, Apr 16, 2024 at 2:52 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Tue, Apr 16, 2024 at 02:06:45PM +0530, shveta malik wrote: > > On Tue, Apr 16, 2024 at 1:55 PM Zhijie Hou (Fujitsu) > > <houzj.fnst@fujitsu.com> wrote: > > > I personally feel adding the additional check for sync_replication_slots may > > > not improve the situation here. Because the GUC sync_replication_slots can > > > change at any point, the GUC could be false when performing this addition check > > > and is set to true immediately after the check, so It could not simplify the logic > > > anyway. > > > > +1. > > I feel doc and "cannot synchronize replication slots concurrently" > > check should suffice. > > > > In the scenario which Hou-San pointed out, if after performing the > > GUC check in SQL function, this GUC is enabled immediately and say > > worker is started sooner than the function could get chance to sync, > > in that case as well, SQL function will ultimately get error "cannot > > synchronize replication slots concurrently", even though GUC is > > enabled. Thus, I feel we should stick with samer error in all > > scenarios. > > Okay, fine by me, let's forget about checking sync_replication_slots then. Thanks. While reviewing and testing this patch further, we encountered 2 race-conditions which needs to be handled: 1) For slot sync worker, the order of cleanup execution was a) first reset 'syncing' flag (slotsync_failure_callback) b) then reset pid and syncing (slotsync_worker_onexit). But in ShutDownSlotSync(), we rely only on the 'syncing' flag for wait-exit logic. So it may so happen that in the window between these two callbacks, ShutDownSlotSync() proceeds and calls update_synced_slots_inactive_since() which may then hit assert Assert((SlotSyncCtx->pid == InvalidPid). 2) Another problem as described by Hou-San off-list: When the slotsync worker error out after acquiring a slot, it will first call slotsync_worker_onexit() and then ReplicationSlotShmemExit(), so in the window between these two callbacks, it's possible that the SlotSyncCtx->syncing SlotSyncCtx->pid has been reset but the slot->active_pid is still valid. The Assert will be broken in this. @@ -1471,6 +1503,9 @@ update_synced_slots_inactive_since(void) { Assert(SlotIsLogical(s)); + /* The slot must not be acquired by any process */ + Assert(s->active_pid == 0); + To fix above issues, these changes have been made in v7: 1) For worker, replaced slotsync_failure_callback() with slotsync_worker_disconnect() so that the latter only disconnects and thus slotsync_worker_onexit() does pid cleanup followed by syncing flag cleanup. This will make ShutDownSlotSync()'s wait exit reliably. 2) To fix second problem, changes are: 2.1) For worker, moved slotsync_worker_onexit() registration before BaseInit() (BaseInit is the one doing ReplicationSlotShmemExit registration). If we do this change in order of registration, then order of cleanup for worker will be a) slotsync_worker_disconnect() b) ReplicationSlotShmemExit() c) slotsync_worker_onexit(). This order ensures that the worker is actually done with slots release and cleanup before it marks itself as done syncing. 2.2) For SQL function, did ReplicationSlotRelease() and ReplicationSlotCleanup() as first step in slotsync_failure_callback(). While doing change 2.2, it occurred to us, that it would be a clean solution to do ReplicationSlotCleanup() even on successful execution of SQL function. It seems better that the temporary slots are cleaned-up when SQL function exists, as we do not know when the user will run this SQL function again and thus leaving temp slots for longer does not seem a good idea. Thoughts? thanks Shveta
  Jump to comment-1
  shveta.malik@gmail.com2024-04-18T12:06:30+00:00
  On Thu, Apr 18, 2024 at 12:35 PM shveta malik <shveta.malik@gmail.com> wrote: > > To fix above issues, these changes have been made in v7: Please find v8 attached. Changes are: 1) It fixes ShutDownSlotSync() issue, where we perform kill(SlotSyncCtx->pid). There are chances that after we release spin-lock and before we perform kill, slot-sync worker has error-ed out and has set SlotSyncCtx->pid to InvalidPid (-1) already. And thus kill(-1) could result in abnormal process kills on some platforms. Now, we get pid under spin-lock and then use it to perform kill to avoid pid=-1 kill. This is on a similar line of how ShutdownWalRcv() does it. 2) Improved comments in code. 3) Updated commit message with new fixes. I had missed to update it in the previous version. thanks Shveta
  Jump to comment-1
  bertranddrouvot.pg@gmail.com2024-04-19T05:23:20+00:00
  Hi, On Thu, Apr 18, 2024 at 05:36:05PM +0530, shveta malik wrote: > Please find v8 attached. Changes are: Thanks! A few comments: 1 === @@ -1440,7 +1461,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len) * slotsync_worker_onexit() but that will need the connection to be made * global and we want to avoid introducing global for this purpose. */ - before_shmem_exit(slotsync_failure_callback, PointerGetDatum(wrconn)); + before_shmem_exit(slotsync_worker_disconnect, PointerGetDatum(wrconn)); The comment above this change still states "Register the failure callback once we have the connection", I think it has to be reworded a bit now that v8 is making use of slotsync_worker_disconnect(). 2 === + * Register slotsync_worker_onexit() before we register + * ReplicationSlotShmemExit() in BaseInit(), to ensure that during exit of + * slot sync worker, ReplicationSlotShmemExit() is called first, followed + * by slotsync_worker_onexit(). Startup process during promotion waits for Worth to mention in shmem_exit() (where it "while (--before_shmem_exit_index >= 0)" or before the shmem_exit() definition) that ReplSlotSyncWorkerMain() relies on this LIFO behavior? (not sure if there is other "strong" LIFO requirement in other part of the code). 3 === + * Startup process during promotion waits for slot sync to finish and it + * does that by checking the 'syncing' flag. worth to mention ShutDownSlotSync()? 4 === I did a few tests manually (launching ShutDownSlotSync() through gdb / with and without sync worker and with / without pg_sync_replication_slots() running concurrently) and it looks like it works as designed. Having said that, the logic that is in place to take care of the corner cases described up-thread seems reasonable to me. Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
  Jump to comment-1
  shveta.malik@gmail.com2024-04-19T06:07:26+00:00
  On Fri, Apr 19, 2024 at 10:53 AM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote: > > Hi, > > On Thu, Apr 18, 2024 at 05:36:05PM +0530, shveta malik wrote: > > Please find v8 attached. Changes are: > > Thanks! > > A few comments: Thanks for reviewing. > 1 === > > @@ -1440,7 +1461,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len) > * slotsync_worker_onexit() but that will need the connection to be made > * global and we want to avoid introducing global for this purpose. > */ > - before_shmem_exit(slotsync_failure_callback, PointerGetDatum(wrconn)); > + before_shmem_exit(slotsync_worker_disconnect, PointerGetDatum(wrconn)); > > The comment above this change still states "Register the failure callback once > we have the connection", I think it has to be reworded a bit now that v8 is > making use of slotsync_worker_disconnect(). > > 2 === > > + * Register slotsync_worker_onexit() before we register > + * ReplicationSlotShmemExit() in BaseInit(), to ensure that during exit of > + * slot sync worker, ReplicationSlotShmemExit() is called first, followed > + * by slotsync_worker_onexit(). Startup process during promotion waits for > > Worth to mention in shmem_exit() (where it "while (--before_shmem_exit_index >= 0)" > or before the shmem_exit() definition) that ReplSlotSyncWorkerMain() relies on > this LIFO behavior? (not sure if there is other "strong" LIFO requirement in > other part of the code). I see other modules as well relying on LIFO behavior. Please see applyparallelworker.c where 'before_shmem_exit(pa_shutdown)' is needed to be done after 'before_shmem_exit(logicalrep_worker_onexit)' (commit id 3d144c6). Also in postinit.c, I see such comments atop 'before_shmem_exit(ShutdownPostgres, 0)'. I feel we can skip adding this specific comment about ReplSlotSyncWorkerMain() in ipc.c, as none of the other modules has also not added any. I will address the rest of your comments in the next version. > 3 === > > + * Startup process during promotion waits for slot sync to finish and it > + * does that by checking the 'syncing' flag. > > worth to mention ShutDownSlotSync()? > > 4 === > > I did a few tests manually (launching ShutDownSlotSync() through gdb / with and > without sync worker and with / without pg_sync_replication_slots() running > concurrently) and it looks like it works as designed. Thanks for testing it. > Having said that, the logic that is in place to take care of the corner cases > described up-thread seems reasonable to me. thanks Shveta
  Jump to comment-1
  shveta.malik@gmail.com2024-04-19T08:22:43+00:00
  On Fri, Apr 19, 2024 at 11:37 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Fri, Apr 19, 2024 at 10:53 AM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > Hi, > > > > On Thu, Apr 18, 2024 at 05:36:05PM +0530, shveta malik wrote: > > > Please find v8 attached. Changes are: > > > > Thanks! > > > > A few comments: > > Thanks for reviewing. > > > 1 === > > > > @@ -1440,7 +1461,7 @@ ReplSlotSyncWorkerMain(char *startup_data, size_t startup_data_len) > > * slotsync_worker_onexit() but that will need the connection to be made > > * global and we want to avoid introducing global for this purpose. > > */ > > - before_shmem_exit(slotsync_failure_callback, PointerGetDatum(wrconn)); > > + before_shmem_exit(slotsync_worker_disconnect, PointerGetDatum(wrconn)); > > > > The comment above this change still states "Register the failure callback once > > we have the connection", I think it has to be reworded a bit now that v8 is > > making use of slotsync_worker_disconnect(). > > > > 2 === > > > > + * Register slotsync_worker_onexit() before we register > > + * ReplicationSlotShmemExit() in BaseInit(), to ensure that during exit of > > + * slot sync worker, ReplicationSlotShmemExit() is called first, followed > > + * by slotsync_worker_onexit(). Startup process during promotion waits for > > > > Worth to mention in shmem_exit() (where it "while (--before_shmem_exit_index >= 0)" > > or before the shmem_exit() definition) that ReplSlotSyncWorkerMain() relies on > > this LIFO behavior? (not sure if there is other "strong" LIFO requirement in > > other part of the code). > > I see other modules as well relying on LIFO behavior. > Please see applyparallelworker.c where > 'before_shmem_exit(pa_shutdown)' is needed to be done after > 'before_shmem_exit(logicalrep_worker_onexit)' (commit id 3d144c6). > Also in postinit.c, I see such comments atop > 'before_shmem_exit(ShutdownPostgres, 0)'. > I feel we can skip adding this specific comment about > ReplSlotSyncWorkerMain() in ipc.c, as none of the other modules has > also not added any. I will address the rest of your comments in the > next version. > > > 3 === > > > > + * Startup process during promotion waits for slot sync to finish and it > > + * does that by checking the 'syncing' flag. > > > > worth to mention ShutDownSlotSync()? > > > > 4 === > > > > I did a few tests manually (launching ShutDownSlotSync() through gdb / with and > > without sync worker and with / without pg_sync_replication_slots() running > > concurrently) and it looks like it works as designed. > > Thanks for testing it. > > > Having said that, the logic that is in place to take care of the corner cases > > described up-thread seems reasonable to me. Please find v9 with the above comments addressed. thanks Shveta
  Jump to comment-1
  houzj.fnst@fujitsu.com2024-04-22T00:31:30+00:00
  On Friday, April 19, 2024 4:22 PM shveta malik <shveta.malik@gmail.com> wrote: > On Fri, Apr 19, 2024 at 11:37 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > On Fri, Apr 19, 2024 at 10:53 AM Bertrand Drouvot > > <bertranddrouvot.pg@gmail.com> wrote: > > > > > > Hi, > > > > > > On Thu, Apr 18, 2024 at 05:36:05PM +0530, shveta malik wrote: > > > > Please find v8 attached. Changes are: > > > > > > Thanks! > > > > > > A few comments: > > > > Thanks for reviewing. > > > > > 1 === > > > > > > @@ -1440,7 +1461,7 @@ ReplSlotSyncWorkerMain(char *startup_data, > size_t startup_data_len) > > > * slotsync_worker_onexit() but that will need the connection to be > made > > > * global and we want to avoid introducing global for this purpose. > > > */ > > > - before_shmem_exit(slotsync_failure_callback, > PointerGetDatum(wrconn)); > > > + before_shmem_exit(slotsync_worker_disconnect, > > > + PointerGetDatum(wrconn)); > > > > > > The comment above this change still states "Register the failure > > > callback once we have the connection", I think it has to be reworded > > > a bit now that v8 is making use of slotsync_worker_disconnect(). > > > > > > 2 === > > > > > > + * Register slotsync_worker_onexit() before we register > > > + * ReplicationSlotShmemExit() in BaseInit(), to ensure that during > exit of > > > + * slot sync worker, ReplicationSlotShmemExit() is called first, > followed > > > + * by slotsync_worker_onexit(). Startup process during > > > + promotion waits for > > > > > > Worth to mention in shmem_exit() (where it "while > (--before_shmem_exit_index >= 0)" > > > or before the shmem_exit() definition) that ReplSlotSyncWorkerMain() > > > relies on this LIFO behavior? (not sure if there is other "strong" > > > LIFO requirement in other part of the code). > > > > I see other modules as well relying on LIFO behavior. > > Please see applyparallelworker.c where > > 'before_shmem_exit(pa_shutdown)' is needed to be done after > > 'before_shmem_exit(logicalrep_worker_onexit)' (commit id 3d144c6). > > Also in postinit.c, I see such comments atop > > 'before_shmem_exit(ShutdownPostgres, 0)'. > > I feel we can skip adding this specific comment about > > ReplSlotSyncWorkerMain() in ipc.c, as none of the other modules has > > also not added any. I will address the rest of your comments in the > > next version. > > > > > 3 === > > > > > > + * Startup process during promotion waits for slot sync to finish > and it > > > + * does that by checking the 'syncing' flag. > > > > > > worth to mention ShutDownSlotSync()? > > > > > > 4 === > > > > > > I did a few tests manually (launching ShutDownSlotSync() through gdb > > > / with and without sync worker and with / without > > > pg_sync_replication_slots() running > > > concurrently) and it looks like it works as designed. > > > > Thanks for testing it. > > > > > Having said that, the logic that is in place to take care of the > > > corner cases described up-thread seems reasonable to me. > > Please find v9 with the above comments addressed. Thanks, the patch looks good to me. I also tested a few concurrent promotion/function execution cases and didn't find issues. Best Regards, Hou zj
  Jump to comment-1
  amit.kapila16@gmail.com2024-04-22T11:41:18+00:00
  On Fri, Apr 19, 2024 at 1:52 PM shveta malik <shveta.malik@gmail.com> wrote: > > Please find v9 with the above comments addressed. > I have made minor modifications in the comments and a function name. Please see the attached top-up patch. Apart from this, the patch looks good to me. -- With Regards, Amit Kapila.
  Jump to comment-1
  shveta.malik@gmail.com2024-04-22T12:02:31+00:00
  On Mon, Apr 22, 2024 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Apr 19, 2024 at 1:52 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > Please find v9 with the above comments addressed. > > > > I have made minor modifications in the comments and a function name. > Please see the attached top-up patch. Apart from this, the patch looks > good to me. Thanks for the patch, the changes look good Amit. Please find the merged patch. thanks Shveta
  Jump to comment-1
  sawada.mshk@gmail.com2024-04-22T13:34:25+00:00
  On Mon, Apr 22, 2024 at 9:02 PM shveta malik <shveta.malik@gmail.com> wrote: > > On Mon, Apr 22, 2024 at 5:10 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > On Fri, Apr 19, 2024 at 1:52 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Please find v9 with the above comments addressed. > > > > > > > I have made minor modifications in the comments and a function name. > > Please see the attached top-up patch. Apart from this, the patch looks > > good to me. > > Thanks for the patch, the changes look good Amit. Please find the merged patch. > I've reviewed the patch and have some comments: --- /* - * Early initialization. + * Register slotsync_worker_onexit() before we register + * ReplicationSlotShmemExit() in BaseInit(), to ensure that during the + * exit of the slot sync worker, ReplicationSlotShmemExit() is called + * first, followed by slotsync_worker_onexit(). The startup process during + * promotion invokes ShutDownSlotSync() which waits for slot sync to + * finish and it does that by checking the 'syncing' flag. Thus worker + * must be done with the slots' release and cleanup before it marks itself + * as finished syncing. */ I'm slightly worried that we register the slotsync_worker_onexit() callback before BaseInit(), because it could be a blocker when we want to add more work in the callback, for example sending the stats. --- synchronize_slots(wrconn); + + /* Cleanup the temporary slots */ + ReplicationSlotCleanup(); + + /* We are done with sync, so reset sync flag */ + reset_syncing_flag(); I think it ends up removing other temp slots that are created by the same backend process using pg_create_{physical,logical_replication_slots() function, which could be a large side effect of this function for users. Also, if users want to have a process periodically calling pg_sync_replication_slots() instead of the slotsync worker, it doesn't support a case where we create a temp not-ready slot and turn it into a persistent slot if it's ready for sync. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
  Jump to comment-1
  amit.kapila16@gmail.com2024-04-23T03:37:40+00:00
  On Mon, Apr 22, 2024 at 7:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > On Mon, Apr 22, 2024 at 9:02 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > Thanks for the patch, the changes look good Amit. Please find the merged patch. > > > > I've reviewed the patch and have some comments: > > --- > /* > - * Early initialization. > + * Register slotsync_worker_onexit() before we register > + * ReplicationSlotShmemExit() in BaseInit(), to ensure that during the > + * exit of the slot sync worker, ReplicationSlotShmemExit() is called > + * first, followed by slotsync_worker_onexit(). The startup process during > + * promotion invokes ShutDownSlotSync() which waits for slot sync to > + * finish and it does that by checking the 'syncing' flag. Thus worker > + * must be done with the slots' release and cleanup before it marks itself > + * as finished syncing. > */ > > I'm slightly worried that we register the slotsync_worker_onexit() > callback before BaseInit(), because it could be a blocker when we want > to add more work in the callback, for example sending the stats. > The other possibility is that we do slot release/clean up in the slotsync_worker_onexit() call itself and then we can do it after BaseInit(). Do you have any other/better idea for this? > --- > synchronize_slots(wrconn); > + > + /* Cleanup the temporary slots */ > + ReplicationSlotCleanup(); > + > + /* We are done with sync, so reset sync flag */ > + reset_syncing_flag(); > > I think it ends up removing other temp slots that are created by the > same backend process using > pg_create_{physical,logical_replication_slots() function, which could > be a large side effect of this function for users. > True, I think here we should either remove only temporary and synced marked slots. The other possibility is to create slots as RS_EPHEMERAL initially when called from the SQL function but that doesn't sound like a neat approach. > Also, if users want > to have a process periodically calling pg_sync_replication_slots() > instead of the slotsync worker, it doesn't support a case where we > create a temp not-ready slot and turn it into a persistent slot if > it's ready for sync. > True, but eventually the API should be able to directly create the persistent slots and anyway this can happen only for the first time (till the slots are created and marked persistent) and one who wants to use this function periodically should be able to see regular syncs. OTOH, leaving temp slots created via this API could remain as-is after promotion and we need to document for users to remove such slots. Now, we can do that if we want but I think it is better to clean up such slots rather than putting the onus on users to remove them after promotion. -- With Regards, Amit Kapila.
  Jump to comment-1
  shveta.malik@gmail.com2024-04-24T04:58:52+00:00
  On Tue, Apr 23, 2024 at 9:07 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Apr 22, 2024 at 7:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Mon, Apr 22, 2024 at 9:02 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Thanks for the patch, the changes look good Amit. Please find the merged patch. > > > > > > > I've reviewed the patch and have some comments: Thanks for the comments. > > --- > > /* > > - * Early initialization. > > + * Register slotsync_worker_onexit() before we register > > + * ReplicationSlotShmemExit() in BaseInit(), to ensure that during the > > + * exit of the slot sync worker, ReplicationSlotShmemExit() is called > > + * first, followed by slotsync_worker_onexit(). The startup process during > > + * promotion invokes ShutDownSlotSync() which waits for slot sync to > > + * finish and it does that by checking the 'syncing' flag. Thus worker > > + * must be done with the slots' release and cleanup before it marks itself > > + * as finished syncing. > > */ > > > > I'm slightly worried that we register the slotsync_worker_onexit() > > callback before BaseInit(), because it could be a blocker when we want > > to add more work in the callback, for example sending the stats. > > > > The other possibility is that we do slot release/clean up in the > slotsync_worker_onexit() call itself and then we can do it after > BaseInit(). Do you have any other/better idea for this? I have currently implemented it this way in v11. > > --- > > synchronize_slots(wrconn); > > + > > + /* Cleanup the temporary slots */ > > + ReplicationSlotCleanup(); > > + > > + /* We are done with sync, so reset sync flag */ > > + reset_syncing_flag(); > > > > I think it ends up removing other temp slots that are created by the > > same backend process using > > pg_create_{physical,logical_replication_slots() function, which could > > be a large side effect of this function for users. Yes, this is a problem. Thanks for catching it. > > True, I think here we should either remove only temporary and synced > marked slots. The other possibility is to create slots as RS_EPHEMERAL > initially when called from the SQL function but that doesn't sound > like a neat approach. Modified the logic to remove only synced temporary slots during SQL-function exit. Please find v11 with above changes. thanks Shveta
  Jump to comment-1
  amit.kapila16@gmail.com2024-04-25T10:52:09+00:00
  On Wed, Apr 24, 2024 at 10:28 AM shveta malik <shveta.malik@gmail.com> wrote: > > Modified the logic to remove only synced temporary slots during > SQL-function exit. > > Please find v11 with above changes. > LGTM, so pushed! -- With Regards, Amit Kapila.
  Jump to comment-1
  sawada.mshk@gmail.com2024-04-24T05:44:43+00:00
  On Tue, Apr 23, 2024 at 12:37 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Mon, Apr 22, 2024 at 7:04 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: > > > > On Mon, Apr 22, 2024 at 9:02 PM shveta malik <shveta.malik@gmail.com> wrote: > > > > > > Thanks for the patch, the changes look good Amit. Please find the merged patch. > > > > > > > I've reviewed the patch and have some comments: > > > > --- > > /* > > - * Early initialization. > > + * Register slotsync_worker_onexit() before we register > > + * ReplicationSlotShmemExit() in BaseInit(), to ensure that during the > > + * exit of the slot sync worker, ReplicationSlotShmemExit() is called > > + * first, followed by slotsync_worker_onexit(). The startup process during > > + * promotion invokes ShutDownSlotSync() which waits for slot sync to > > + * finish and it does that by checking the 'syncing' flag. Thus worker > > + * must be done with the slots' release and cleanup before it marks itself > > + * as finished syncing. > > */ > > > > I'm slightly worried that we register the slotsync_worker_onexit() > > callback before BaseInit(), because it could be a blocker when we want > > to add more work in the callback, for example sending the stats. > > > > The other possibility is that we do slot release/clean up in the > slotsync_worker_onexit() call itself and then we can do it after > BaseInit(). This approach sounds clearer and safer to me. The current approach relies on the callback registration order of ReplicationSlotShmemExit(). If it changes in the future, we will silently have the same problem. Every slot sync related work should be done before allowing someone to touch synced slots by clearing the 'syncing' flag. > > > --- > > synchronize_slots(wrconn); > > + > > + /* Cleanup the temporary slots */ > > + ReplicationSlotCleanup(); > > + > > + /* We are done with sync, so reset sync flag */ > > + reset_syncing_flag(); > > > > I think it ends up removing other temp slots that are created by the > > same backend process using > > pg_create_{physical,logical_replication_slots() function, which could > > be a large side effect of this function for users. > > > > True, I think here we should either remove only temporary and synced > marked slots. The other possibility is to create slots as RS_EPHEMERAL > initially when called from the SQL function but that doesn't sound > like a neat approach. > > > > Also, if users want > > to have a process periodically calling pg_sync_replication_slots() > > instead of the slotsync worker, it doesn't support a case where we > > create a temp not-ready slot and turn it into a persistent slot if > > it's ready for sync. > > > > True, but eventually the API should be able to directly create the > persistent slots and anyway this can happen only for the first time > (till the slots are created and marked persistent) and one who wants > to use this function periodically should be able to see regular syncs. I agree that we remove temp-and-synced slots created via the API at the end of the API . We end up creating and dropping slots in every API call but since the pg_sync_replication_slots() function is a kind of debug-purpose function and it will not be common to call this function regularly instead of using the slot sync worker, we can live with such overhead. Regards, -- Masahiko Sawada Amazon Web Services: https://aws.amazon.com
  Jump to comment-1
  amit.kapila16@gmail.com2024-04-06T06:19:42+00:00
  On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta.malik@gmail.com> wrote: > > Please find v2. Changes are: > 1) Rebased the patch as there was conflict due to recent commit 6f132ed. > 2) Added an Assert in update_synced_slots_inactive_since() to ensure > that the slot does not have active_pid. > 3) Improved commit msg and comments. > Few comments: ============== 1. void SyncReplicationSlots(WalReceiverConn *wrconn) { + /* + * Startup process signaled the slot sync to stop, so if meanwhile user + * has invoked slot sync SQL function, simply return. + */ + SpinLockAcquire(&SlotSyncCtx->mutex); + if (SlotSyncCtx->stopSignaled) + { + ereport(LOG, + errmsg("skipping slot synchronization as slot sync shutdown is signaled during promotion")); + + SpinLockRelease(&SlotSyncCtx->mutex); + return; + } + SpinLockRelease(&SlotSyncCtx->mutex); There is a race condition with this code. Say during promotion ShutDownSlotSync() is just before setting this flag and the user has invoked pg_sync_replication_slots() and passed this check but still didn't set the SlotSyncCtx->syncing flag. So, now, the promotion would recognize that there is slot sync going on in parallel, and slot sync wouldn't know that the promotion is in progress. The current coding for slot sync worker ensures there is no such race by checking the stopSignaled and setting the pid together under spinlock. I think we need to move the setting of the syncing flag similarly. Once we do that probably checking SlotSyncCtx->syncing should be sufficient in ShutDownSlotSync(). If we change the location of setting the 'syncing' flag then please ensure its cleanup as we currently do in slotsync_failure_callback(). 2. @@ -1395,6 +1395,7 @@ update_synced_slots_inactive_since(void) if (s->in_use && s->data.synced) { Assert(SlotIsLogical(s)); + Assert(s->active_pid == 0); We can add a comment like: "The slot must not be acquired by any process" -- With Regards, Amit Kapila.
  Jump to comment-1
  shveta.malik@gmail.com2024-04-12T02:17:57+00:00
  On Sat, Apr 6, 2024 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Fri, Apr 5, 2024 at 10:31 AM shveta malik <shveta.malik@gmail.com> wrote: > > > > Please find v2. Changes are: > > 1) Rebased the patch as there was conflict due to recent commit 6f132ed. > > 2) Added an Assert in update_synced_slots_inactive_since() to ensure > > that the slot does not have active_pid. > > 3) Improved commit msg and comments. > > > > Few comments: > ============== > 1. > void > SyncReplicationSlots(WalReceiverConn *wrconn) > { > + /* > + * Startup process signaled the slot sync to stop, so if meanwhile user > + * has invoked slot sync SQL function, simply return. > + */ > + SpinLockAcquire(&SlotSyncCtx->mutex); > + if (SlotSyncCtx->stopSignaled) > + { > + ereport(LOG, > + errmsg("skipping slot synchronization as slot sync shutdown is > signaled during promotion")); > + > + SpinLockRelease(&SlotSyncCtx->mutex); > + return; > + } > + SpinLockRelease(&SlotSyncCtx->mutex); > > There is a race condition with this code. Say during promotion > ShutDownSlotSync() is just before setting this flag and the user has > invoked pg_sync_replication_slots() and passed this check but still > didn't set the SlotSyncCtx->syncing flag. So, now, the promotion would > recognize that there is slot sync going on in parallel, and slot sync > wouldn't know that the promotion is in progress. Did you mean that now, the promotion *would not* recognize... I see, I will fix this. > The current coding for slot sync worker ensures there is no such race > by checking the stopSignaled and setting the pid together under > spinlock. I think we need to move the setting of the syncing flag > similarly. Once we do that probably checking SlotSyncCtx->syncing > should be sufficient in ShutDownSlotSync(). If we change the location > of setting the 'syncing' flag then please ensure its cleanup as we > currently do in slotsync_failure_callback(). Sure, let me review. > 2. > @@ -1395,6 +1395,7 @@ update_synced_slots_inactive_since(void) > if (s->in_use && s->data.synced) > { > Assert(SlotIsLogical(s)); > + Assert(s->active_pid == 0); > > We can add a comment like: "The slot must not be acquired by any process" > > -- > With Regards, > Amit Kapila.
  Jump to comment-1
  amit.kapila16@gmail.com2024-04-12T10:37:22+00:00
  On Fri, Apr 12, 2024 at 7:47 AM shveta malik <shveta.malik@gmail.com> wrote: > > On Sat, Apr 6, 2024 at 11:49 AM Amit Kapila <amit.kapila16@gmail.com> wrote: > > > > > > Few comments: > > ============== > > 1. > > void > > SyncReplicationSlots(WalReceiverConn *wrconn) > > { > > + /* > > + * Startup process signaled the slot sync to stop, so if meanwhile user > > + * has invoked slot sync SQL function, simply return. > > + */ > > + SpinLockAcquire(&SlotSyncCtx->mutex); > > + if (SlotSyncCtx->stopSignaled) > > + { > > + ereport(LOG, > > + errmsg("skipping slot synchronization as slot sync shutdown is > > signaled during promotion")); > > + > > + SpinLockRelease(&SlotSyncCtx->mutex); > > + return; > > + } > > + SpinLockRelease(&SlotSyncCtx->mutex); > > > > There is a race condition with this code. Say during promotion > > ShutDownSlotSync() is just before setting this flag and the user has > > invoked pg_sync_replication_slots() and passed this check but still > > didn't set the SlotSyncCtx->syncing flag. So, now, the promotion would > > recognize that there is slot sync going on in parallel, and slot sync > > wouldn't know that the promotion is in progress. > > Did you mean that now, the promotion *would not* recognize... > Right. > I see, I will fix this. > Thanks. -- With Regards, Amit Kapila.

Lists