Postgres Email Lists

pgsql-hackers

❮

Optimize CPU usage of dropping buffers during recovery

Jump to comment-1
Jingtang Zhang<mrdrivingduck@gmail.com>
Jan 27, 2026, 5:42 PM UTC
Hi hackers,
I happen to find a case where a transaction involves dropping a relation
file, and a relation happens to be empty, the WAL recovery will waste
a lot of time on scanning the whole buffer pool.
In DropRelationBuffers/DropRelationsAllBuffers, we will take fast path
to avoid scanning whole buffer pool, if the buffers to be invalidated
are less than typically 1/32 of buffer pool. However, since no more WAL
records are referencing an empty relation, if we want to drop it, startup
process will never initialize smgrcachednblocks for that relation,
causing fallback to slow path.
The situation can be like refreshing a materialized view whose query
returns no row, or truncating a table and then loading it but the query
returns no row. It will generate WAL sequence like:
rmgr: Standby len (rec/tot): 42/ 42, tx: 780, lsn: 0/41238520, prev 0/41238490, desc: LOCK xid 780 db 16384 rel 16416
rmgr: Storage len (rec/tot): 42/ 42, tx: 780, lsn: 0/412385F0, prev 0/412385C0, desc: CREATE base/16384/16416
...
rmgr: Transaction len (rec/tot): 1073/ 1073, tx: 783, lsn: 0/4136BF70, prev 0/41369F60, desc: COMMIT 2026-01-27 08:07:49.953301 UTC; rels: base/16384/16416
This makes me think about initializing the cached nblocks to zero when
redo CREATE record. Note that the 0-size may not correct, because the
relation may have been zeroextended before server crash, and recovery
starts from an earlier checkpoint. But we can correct that by invalidating
the cached value when we really meet a WAL record referencing a block of
that relation, so that a later smgrnblocks() will fill-in the correct
file length. It can avoid extending the relation in the middle of a file
by mistake.
Quickly proposed a patch for it, with a Perl program for judging the
performance difference.
In my environment with 16GB shared buffers, -O2 optimized, debugging off,
I've run a test of 10 client * 500 transactions using such workload, made
the server crash and poll it until recovery is done. The CPU time:
w/o patch: CPU user: 77.58 s, system: 0.27 s
patched: CPU user: 0.14 s, system: 0.09 s
Without the patch, the CPU is wasted on BufTagMatchesRelFileLocator:
```
Overhead  Command   Shared Object       Symbol
```
- 98.23% postgres postgres [.] DropRelationsAllBuffers
```
 98.21% DropRelationsAllBuffers
    smgrdounlinkall
    DropRelationFiles
    xact_redo_commit
    xact_redo
    PerformWalRecovery
    StartupXLOG
    StartupProcessMain
    postmaster_child_launch
    StartChildProcess
    PostmasterStateMachine
    ServerLoop.isra.0
 0.06%  postgres  postgres            [.] hash_search_with_hash_value
```
Not sure if there will be any problem of correctness. Also, not sure about
what to do about INIT fork.
Any thoughts?
--
Regards, Jingtang
View in PostgreSQL Archives →