Test cluster with high OIDs above the signed-int limit (2B+)

  • Dominique Devienne <ddevienne@gmail.com>
    Apr 20, 2026, 9:45 AM UTC
    Hi. A few weeks ago, one of our clusters, with high DDL churn from
    UTs, crossed the 2B mark for OIDs, which exposed a bug in our code.
    I'm moving into creating clusters on-the-fly for testing, and would
    like to force that situation to avoid a future silent regression,
    since it takes a long time to cross that threshold, and we do move up
    in major versions, so the over-the-threshold cluster will be
    abandoned. How can I achieve that? A quick AI query yielded nothing,
    but this is unusual enough that there's little to no material to have
    good answers. Can PostgreSQL experts/hackers weigh in on this please?
    If not possible now, can this be supported in the future please? --DD
    • Ron Johnson <ronljohnsonjr@gmail.com>
      Apr 20, 2026, 12:44 PM UTC
      On Mon, Apr 20, 2026 at 5:45 AM Dominique Devienne <ddevienne@gmail.com> wrote:
      > Hi. A few weeks ago, one of our clusters, with high DDL churn from
      > UTs, crossed the 2B mark for OIDs, which exposed a bug in our code.

      Because you track and remember OIDs?
      --
      Death to <Redacted>, and butter sauce.
      Don't boil me, I'm still alive.
      <Redacted> lobster!
      • Dominique Devienne <ddevienne@gmail.com>
        Apr 20, 2026, 1:00 PM UTC
        On Mon, Apr 20, 2026 at 2:45 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
        > On Mon, Apr 20, 2026 at 5:45 AM Dominique Devienne <ddevienne@gmail.com> wrote:
        >> Hi. A few weeks ago, one of our clusters, with high DDL churn from
        >> UTs, crossed the 2B mark for OIDs, which exposed a bug in our code.
        >
        > Because you track and remember OIDs?

        No. I don't even remember the exact bug, and we lost networking to our
        SCM right now, so can't even look it up (obviously it's not
        decentralized SCM). But signed vs unsigned and 2B+ is a classic bug,
        worth testing for, except it's impractical to reach such high OIDs on
        demand. Given there's a cluster-wide OID counter, surely there's a
        way, even hackish, to influence that counter, no? PostgreSQL itself
        has mitigation strategies when running out of OIDs, doesn't it? It's a
        different use-case, but that implies also reaching large OIDs, and I
        suspect this is unit tested, no?
        • Dominique Devienne <ddevienne@gmail.com>
          Apr 20, 2026, 1:08 PM UTC
          On Mon, Apr 20, 2026 at 2:59 PM Dominique Devienne <ddevienne@gmail.com> wrote:
          > No. I don't even remember the exact bug

          Was an old test using lo_creat(-1) RETURNING the OID, and code doing
          `std::stoi(PQgetvalue(...))`. In production we don't use LO and use
          the binary protocol, so no such issue, still my original point
          remains. We process OIDs in several places, and making sure our test
          suite works with high OIDs would be better. If I fully control the
          cluster, which is created specifically for the test run, on-the-fly,
          I'd like to be able to simulate high OIDs "instantly".
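
The failure mode described above is easy to reproduce: where `int` is 32 bits, `std::stoi` throws `std::out_of_range` for any OID text above 2147483647. Below is a minimal C++17 sketch of a stricter alternative. `parse_oid` is a hypothetical helper name, not part of libpq, and the sketch assumes the OID arrives in text format, as returned by `PQgetvalue()`.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical helper (not a libpq function): parse the text form of an OID
// into an unsigned 32-bit value. Returns std::nullopt for empty input, a sign,
// trailing characters, or a value that does not fit in 32 bits.
std::optional<std::uint32_t> parse_oid(std::string_view text) {
    std::uint32_t value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    const std::from_chars_result res = std::from_chars(first, last, value);
    if (res.ec != std::errc{} || res.ptr != last) {
        return std::nullopt;  // not a clean unsigned 32-bit decimal number
    }
    return value;
}
```

In a test, something like `parse_oid(PQgetvalue(res, 0, 0))` could stand in for the `std::stoi` call, with the empty result treated as a hard failure.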
          • Ron Johnson <ronljohnsonjr@gmail.com>
            Apr 20, 2026, 1:23 PM UTC
            On Mon, Apr 20, 2026 at 9:08 AM Dominique Devienne <ddevienne@gmail.com> wrote:
            > On Mon, Apr 20, 2026 at 2:59 PM Dominique Devienne <ddevienne@gmail.com> wrote:
            >> No. I don't even remember the exact bug
            >
            > Was an old test using lo_creat(-1) RETURNING the OID, and code doing
            > `std::stoi(PQgetvalue(...))`. In production we don't use LO and use
            > the binary protocol, so no such issue, still my original point
            > remains. We process OIDs in several places, and making sure our test
            > suite works with high OIDs would be better. If I fully control the
            > cluster, which is created specifically for the test run, on-the-fly,
            > I'd like to be able to simulate high OIDs "instantly".

            It's an unsigned integer, so I'd say don't use signed ints when processing
            OIDs.

            It's a valid question, though, what happens when the OID counter wraps
            around and hits a duplicate.
            --
            Death to <Redacted>, and butter sauce.
            Don't boil me, I'm still alive.
            <Redacted> lobster!
            • Dominique Devienne <ddevienne@gmail.com>
              Apr 20, 2026, 1:32 PM UTC
              On Mon, Apr 20, 2026 at 3:23 PM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
              > It's an unsigned integer, so I'd say don't use signed ints when processing OIDs.

              Well duh, that's why it's a bug.
              But it's a sneaky bug, because clusters rarely enter that high-OID territory.
              That's precisely why I'd like a way to provoke it.

              > It's a valid question, though, what happens when the OID counter wraps around and hits a duplicate.

              Again, I'm NOT interested in OID wrap-around, but in the "second half" of
              the OID space.
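
For reference, the "second half" is every OID above 2^31 - 1 = 2147483647, the largest value a signed 32-bit integer can hold. The sketch below shows one way to keep such values intact on the binary-protocol path mentioned earlier in the thread, where an `oid` is sent as 4 bytes in network byte order. Both function names are hypothetical helpers for illustration, not libpq APIs.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true when the OID lies above the largest signed
// 32-bit value, i.e. in the range where signed handling goes wrong.
constexpr bool in_upper_half(std::uint32_t oid) {
    return oid > static_cast<std::uint32_t>(std::numeric_limits<std::int32_t>::max());
}

// Hypothetical helper: decode 4 bytes in network byte order (big-endian)
// into an unsigned 32-bit OID without going through any signed type.
inline std::uint32_t decode_oid_be(const unsigned char* bytes) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
           static_cast<std::uint32_t>(bytes[3]);
}
```

With a binary-format result, the pointer from `PQgetvalue()` would need a cast to `const unsigned char*` before decoding. A test suite could then assert `in_upper_half(...)` on a freshly created object to confirm the cluster really is in high-OID territory.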
    • Tom Lane <tgl@sss.pgh.pa.us>
      Apr 20, 2026, 1:29 PM UTC
      Dominique Devienne <ddevienne@gmail.com> writes:
      > Hi. A few weeks ago, one of our clusters, with high DDL churn from
      > UTs, crossed the 2B mark for OIDs, which exposed a bug in our code.
      > I'm moving into creating clusters on-the-fly for testing, and would
      > like to force that situation to avoid a future silent regression,
      > since it takes a long time to cross that threshold, and we do move up
      > in major versions, so the over-the-threshold cluster will be
      > abandoned. How can I achieve that?

      See pg_resetwal --next-oid. Don't recall what else you need to say
      to avoid breaking the cluster in other ways.
      		regards, tom lane
      • Dominique Devienne <ddevienne@gmail.com>
        Apr 20, 2026, 1:40 PM UTC
        On Mon, Apr 20, 2026 at 3:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
        > Dominique Devienne <ddevienne@gmail.com> writes:
        >
        > See pg_resetwal --next-oid. Don't recall what else you need to say
        > to avoid breaking the cluster in other ways.

        Great, thanks Tom.
        So I just initdb, run the above, then start the cluster? That's it?
        • Tom Lane <tgl@sss.pgh.pa.us>
          Apr 20, 2026, 1:42 PM UTC
          Dominique Devienne <ddevienne@gmail.com> writes:
          > On Mon, Apr 20, 2026 at 3:29 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
          >> See pg_resetwal --next-oid. Don't recall what else you need to say
          >> to avoid breaking the cluster in other ways.
          >
          > So I just initdb, run the above, then start the cluster? That's it?

          Right. As I said, I don't recall what other options you might
          need, but that's the game plan.
          		regards, tom lane
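
The game plan above (initdb, then `pg_resetwal --next-oid`, then start) could be wired into a C++ test harness roughly as follows. This is an untested sketch. The helper name, the data directory, and the starting OID are made up for illustration, and, as noted in the thread, other `pg_resetwal` options may turn out to be necessary.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: returns, in order, the shell commands for creating a
// throwaway cluster whose OID counter starts at next_oid. It assumes the
// PostgreSQL binaries are on PATH and that datadir contains no single quotes
// (the quoting below is deliberately naive).
std::vector<std::string> high_oid_cluster_commands(const std::string& datadir,
                                                   std::uint32_t next_oid) {
    const std::string dir = "'" + datadir + "'";
    return {
        "initdb -D " + dir,                                   // 1. create the cluster
        "pg_resetwal --next-oid=" + std::to_string(next_oid)  // 2. bump the OID counter
            + " -D " + dir,                                   //    while the server is stopped
        "pg_ctl -D " + dir + " start",                        // 3. start the server
    };
}
```

Each command could be passed to `std::system()` in turn, stopping at the first non-zero exit status. A starting value such as 2200000000 sits above 2147483647 yet well below the 32-bit wrap-around point.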
    • Greg Sabino Mullane <htamfids@gmail.com>
      Apr 20, 2026, 1:31 PM UTC
      You can change the define for FirstNormalObjectId
      in src/include/access/transam.h to a very large number and recompile Postgres.
      I don't know an easy way to increment that for an existing cluster other
      than creating/removing an object in a client loop.