[PATCH] Fix ProcKill lock-group vs procLatch recycle race

  • Jump to comment-1
    Vlad Lesin<vladlesin@gmail.com>
    Apr 27, 2026, 8:15 AM UTC
    Hello all,
    The following are the patches and demonstration material for a concurrency bug in ProcKill() where a lock-group leader and a member can exit in parallel.
    ------------------------------------------------------------------------
    Problem
    ------------------------------------------------------------------------
    If a leader detaches from the lock group under leader_lwlock but
    has not yet reached DisownLatch(&MyProc->procLatch), a concurrent
    last follower can still put the leader PGPROC on a free list, or
    the leader and the follower can make inconsistent decisions about
    who returns which PGPROC, so that a slot is linked into the free
    list with procLatch still owned, or is pushed twice. A new backend
    that recycles the slot can then hit:
     PANIC: latch already owned by PID ...
    A concrete interleaving (lock group leader vs last member)
    is the following(PG15 code).
    === Lock group leader ===
    L1: LWLockAcquire(leader_lwlock)
    L2: dlist_delete(&MyProc->lockGroupLink)
     The list contains follower.
    L3: dlistisempty → false (the follower still in)
    L4: else if (leader != MyProc) → false, do nothing
    L5: LWLockRelease(leader_lwlock)
     *lwlock released*
    
      >>>WINDOW OPEN HERE<<<
    L6: SwitchBackToLocalLatch()
    L7: pgstatresetwaiteventstorage()
    L8: proc = MyProc; MyProc = NULL;
    L9: DisownLatch(&proc->procLatch)
     *only here owner_pid = 0*
    L10: SpinLockAcquire(ProcStructLock)
    L11: if (proc->lockGroupLeader == NULL) → false, skip
    L12: SpinLockRelease(ProcStructLock)
    ===========================
    === What the last lock group member does in that WINDOW ===
    M1: LWLockAcquire(leader_lwlock)
     Can proceed as soon as L5 releases it
    M2: dlist_delete(&MyProc->lockGroupLink)
     The list becomes EMPTY
    M3: dlistisempty → true
    M4: leader->lockGroupLeader = NULL
     Flip leader's "don't free me" flag
    M5: leader != MyProc → true
    M6: SpinLockAcquire(ProcStructLock)
    M7: push leader's PGPROC onto freelist
     !!! while leader's latch is still owned !!!
    M8: SpinLockRelease(ProcStructLock)
    M9: LWLockRelease(leader_lwlock)
    ===========================================================
    At M7, the leader's PGPROC is back on freeProcs with
    procLatch.ownerpid == leaderpid (non-zero). The leader is still
    sitting between L5 and L9.
    === The new backend picks it up in InitProcess() ===
    B1: SpinLockAcquire(ProcStructLock)
    B2: MyProc = *procgloballist
     pops the leader's PGPROC
    B3: *procgloballist = MyProc->links.next
    B4: SpinLockRelease(ProcStructLock)
    B5: ...
    B6: OwnLatch(&MyProc->procLatch)
    → owner_pid != 0
    → elog(PANIC, "latch already owned by PID %d", owner_pid);
    =====================================================
    Note that all PG versions since 9.6 are affected.
    The fix is to run SwitchBackToLocalLatch(),
    pgstatresetwaiteventstorage() and DisownLatch() before the
    lock-group block, and to decide under leader_lwlock
    (pushleader / pushself) with a single freelist update section at
    the end. This matches what we implemented; detailed commentary
    is in the commit messages and in the patch.
    Mainline parallel query usually avoids the race: the leader is not
    expected to reach ProcKill with another group member still in play
    the way some extension-driven workloads can be.
    ------------------------------------------------------------------------
    Testing
    ------------------------------------------------------------------------
    Technically, a regression test should be included. However, developing one is non-trivial given the current PG18 injection point implementation. Because ProcKill() partially de-initializes data necessary for injection points to function, a significant workaround would be required. Given the complexity of such a workaround, I would like to discuss whether a test is mandatory for this patch.
    I have provided versions without the test (0001, 0002) as well as a test that uses an injection point workaround to reproduce the bug deterministically(0003). If a test for the bug fix is required, please note that I can only provide it for PG17+, as earlier versions do not support injection points.
    ------------------------------------------------------------------------
    Attached patches
    ------------------------------------------------------------------------
    1) 0001-ProcKill-REL15-lockgroup-freelist-race.patch
      Core proc.c for REL_15_STABLE.
      No TAP: PG15 has no suitable injection support for a core
      reproducer; fix-only for that line.
    2) 0002-ProcKill-PG18-core-lockgroup-freelist-race.patch
      Same *core* fix for PG18: proc.c only.
    3) 0003-PG18-unfixed-repro-tap-injection-harness.patch
      *Demonstration only* Shows the PANIC; not the shape to land after
      the fix without replacing the test.
    --
    Best regards,
    Vlad