Improvements and refactoring in shmem.c

  • Jump to comment-1
    Ashutosh Bapat<ashutosh.bapat.oss@gmail.com>
    Jan 29, 2026, 9:56 AM UTC
    Hi All,
    PFA two patches for $subject.
    0001: Adds assertions to InitShmemAccess() and InitShmemAllocation
    which indicate the conditions, EXEC_BACKEND and IsUnderPostmaster,
    these functions are expected to be called. I found these annotations
    to be useful when modifying these functions to handle multiple Shmem
    segments required by buffer pool resizing project [1].
    0002: We use two different methods to pass ShmemIndex and ShmemLock
    respectively to new backends even though both the structures are
    allocated before creating named structures in the shared memory. This
    patch consistently uses the same method - passing via PGShmemHeader.
    Further the patch gets rid of InitShmemAllocation and moves that code
    inside InitShmemAccess() itself. Maybe that's overkill but at least we
    would be able to call InitShmemAllocation() from InitShmemAccess() and
    declare first as static within shmem.c. That way we avoid a minor risk
    of InitShmemAllocation() being called twice.
    We may achieve consistency by passing ShmemIndex through
    BackendParameter, but I haven't tried that.
    [1] https://www.postgresql.org/message-id/cnthxg2eekacrejyeonuhiaezc7vd7o2uowlsbenxqfkjwgvwj%40qgzu6eoqrglb
    --
    Best Wishes,
    Ashutosh Bapat
    • Jump to comment-1
      Heikki Linnakangas<hlinnaka@iki.fi>
      Jan 29, 2026, 12:28 PM UTC
      On 29/01/2026 11:56, Ashutosh Bapat wrote:
      0001: Adds assertions to InitShmemAccess() and InitShmemAllocation
      which indicate the conditions, EXEC_BACKEND and IsUnderPostmaster,
      these functions are expected to be called. I found these annotations
      to be useful when modifying these functions to handle multiple Shmem
      segments required by buffer pool resizing project [1].
      0002: We use two different methods to pass ShmemIndex and ShmemLock
      respectively to new backends even though both the structures are
      allocated before creating named structures in the shared memory. This
      patch consistently uses the same method - passing via PGShmemHeader.
      Further the patch gets rid of InitShmemAllocation and moves that code
      inside InitShmemAccess() itself. Maybe that's overkill but at least we
      would be able to call InitShmemAllocation() from InitShmemAccess() and
      declare first as static within shmem.c. That way we avoid a minor risk
      of InitShmemAllocation() being called twice.
      We may achieve consistency by passing ShmemIndex through
      BackendParameter, but I haven't tried that.
      Hmm, I agree it's inconsistent currently. And I'd like to reduce the use of BackendParameters, I don't like that mechanism.
      I don't much like moving the 'shmem_lock' pointer to PGShmemHeader either though. It feels like a modularity violation, as no one else than shmem.c should be accessing those fields. The same goes for the existing 'index' and 'freeoffset' fields really.
      Also, does it make sense to have those fields in the "shim" shmem segment that PGSharedMemoryCreate() creates? It's a little confusing, we don't keep the 'freeoffset' in the shim up-to-date, for example.
      One idea is to move all those fields to another struct, see attached. What do you think?
      @@ -35,6 +36,7 @@ typedef struct PGShmemHeader / standard header for all Postgres shmem /
      Size freeoffset; / offset to first free space /
      dsmhandle dsmcontrol; / ID of dynamic shared memory control seg /
      void index; / pointer to ShmemIndex table */
      + slockt *shmemlock; / spinlock for shmem allocation /
      #ifndef WIN32 / Windows doesn't have useful inode#s /
      dev_t device; / device data directory is on /
      ino_t inode; / inode number of data directory /
      Could this contain the spinlock itself, instead of just a pointer to it? Would be a little simpler. (That's moot if we go with my approach though)
      - Heikki
      • Jump to comment-1
        Ashutosh Bapat<ashutosh.bapat.oss@gmail.com>
        Jan 29, 2026, 3:42 PM UTC
        On Thu, Jan 29, 2026 at 5:58 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

        On 29/01/2026 11:56, Ashutosh Bapat wrote:
        0001: Adds assertions to InitShmemAccess() and InitShmemAllocation
        which indicate the conditions, EXEC_BACKEND and IsUnderPostmaster,
        these functions are expected to be called. I found these annotations
        to be useful when modifying these functions to handle multiple Shmem
        segments required by buffer pool resizing project [1].

        0002: We use two different methods to pass ShmemIndex and ShmemLock
        respectively to new backends even though both the structures are
        allocated before creating named structures in the shared memory. This
        patch consistently uses the same method - passing via PGShmemHeader.
        Further the patch gets rid of InitShmemAllocation and moves that code
        inside InitShmemAccess() itself. Maybe that's overkill but at least we
        would be able to call InitShmemAllocation() from InitShmemAccess() and
        declare first as static within shmem.c. That way we avoid a minor risk
        of InitShmemAllocation() being called twice.

        We may achieve consistency by passing ShmemIndex through
        BackendParameter, but I haven't tried that.

        Hmm, I agree it's inconsistent currently. And I'd like to reduce the use
        of BackendParameters, I don't like that mechanism.

        I don't much like moving the 'shmem_lock' pointer to PGShmemHeader
        either though. It feels like a modularity violation, as no one else than
        shmem.c should be accessing those fields. The same goes for the existing
        'index' and 'freeoffset' fields really.

        Also, does it make sense to have those fields in the "shim" shmem
        segment that PGSharedMemoryCreate() creates? It's a little confusing, we
        don't keep the 'freeoffset' in the shim up-to-date, for example.

        One idea is to move all those fields to another struct, see attached.
        What do you think?
        This looks good. Good that we got rid of ShmemAllocUnlocked() too.
        A nitpick should content_offset be contentOffset (like totalSize) or
        contentoffset (like dsmcontrol)? I am ok either way.
        A minor discomfort I have is ShmemBase, which is the starting address
        that the allocator will use, remains outside of ShmemAllocatorData().
        But it doesn't change once set so no need to "share" it through the
        memory and also that will create a self-referencing pointer within the
        shared memory. So it's fine.
        @@ -35,6 +36,7 @@ typedef struct PGShmemHeader / standard header for all Postgres shmem /
        Size freeoffset; / offset to first free space /
        dsmhandle dsmcontrol; / ID of dynamic shared memory control seg /
        void index; / pointer to ShmemIndex table */
        + slockt *shmemlock; / spinlock for shmem allocation /
        #ifndef WIN32 / Windows doesn't have useful inode#s /
        dev_t device; / device data directory is on /
        ino_t inode; / inode number of data directory /

        Could this contain the spinlock itself, instead of just a pointer to it?
        Would be a little simpler. (That's moot if we go with my approach though)
        Yes, that would have been better just like dsm_control - I missed
        that. Not needed with your approach.
        --
        Best Wishes,
        Ashutosh Bapat
        • Jump to comment-1
          Ashutosh Bapat<ashutosh.bapat.oss@gmail.com>
          Jan 30, 2026, 9:22 AM UTC
          On Thu, Jan 29, 2026 at 9:11 PM Ashutosh Bapat
          <ashutosh.bapat.oss@gmail.com> wrote:
          On Thu, Jan 29, 2026 at 5:58 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

          On 29/01/2026 11:56, Ashutosh Bapat wrote:
          0001: Adds assertions to InitShmemAccess() and InitShmemAllocation
          which indicate the conditions, EXEC_BACKEND and IsUnderPostmaster,
          these functions are expected to be called. I found these annotations
          to be useful when modifying these functions to handle multiple Shmem
          segments required by buffer pool resizing project [1].

          0002: We use two different methods to pass ShmemIndex and ShmemLock
          respectively to new backends even though both the structures are
          allocated before creating named structures in the shared memory. This
          patch consistently uses the same method - passing via PGShmemHeader.
          Further the patch gets rid of InitShmemAllocation and moves that code
          inside InitShmemAccess() itself. Maybe that's overkill but at least we
          would be able to call InitShmemAllocation() from InitShmemAccess() and
          declare first as static within shmem.c. That way we avoid a minor risk
          of InitShmemAllocation() being called twice.

          We may achieve consistency by passing ShmemIndex through
          BackendParameter, but I haven't tried that.

          Hmm, I agree it's inconsistent currently. And I'd like to reduce the use
          of BackendParameters, I don't like that mechanism.

          I don't much like moving the 'shmem_lock' pointer to PGShmemHeader
          either though. It feels like a modularity violation, as no one else than
          shmem.c should be accessing those fields. The same goes for the existing
          'index' and 'freeoffset' fields really.

          Also, does it make sense to have those fields in the "shim" shmem
          segment that PGSharedMemoryCreate() creates? It's a little confusing, we
          don't keep the 'freeoffset' in the shim up-to-date, for example.

          One idea is to move all those fields to another struct, see attached.
          What do you think?

          This looks good. Good that we got rid of ShmemAllocUnlocked() too.

          A nitpick should content_offset be contentOffset (like totalSize) or
          contentoffset (like dsmcontrol)? I am ok either way.
          Huh, what was I looking at? there is no totalSize in PGShmemHeader.
          The new member's name fits the style of other names. Ignore this
          comment.

          A minor discomfort I have is ShmemBase, which is the starting address
          that the allocator will use, remains outside of ShmemAllocatorData().
          But it doesn't change once set so no need to "share" it through the
          memory and also that will create a self-referencing pointer within the
          shared memory. So it's fine.
          I think we can just get rid of ShmemBase and ShmemEnd. Their values
          can be derived from other variables at run time as done in the
          attached patch (0002). Alternatively, we could add them to
          ShmemAllocatorData itself to keep everything related to allocation
          together. But it's not really needed.
          I wanted to go as far as creating yet another structure to hold
          ShmemSegHdr and ShmemAllocator together. Having a structure will help
          in the shared buffer resizing project which needs multiple shared
          memory segments. But it can wait.
          What do you think?
          Added this to CF https://commitfest.postgresql.org/patch/6443/ to get
          exercised by CFBot.
          --
          Best Wishes,
          Ashutosh Bapat
          • Jump to comment-1
            Heikki Linnakangas<hlinnaka@iki.fi>
            Jan 30, 2026, 4:29 PM UTC
            On 30/01/2026 11:22, Ashutosh Bapat wrote:
            On Thu, Jan 29, 2026 at 9:11 PM Ashutosh Bapat
            <ashutosh.bapat.oss@gmail.com> wrote:
            A nitpick should content_offset be contentOffset (like totalSize) or
            contentoffset (like dsmcontrol)? I am ok either way.
            Huh, what was I looking at? there is no totalSize in PGShmemHeader.
            The new member's name fits the style of other names. Ignore this
            comment.
            Right. I renamed "freeoffset" to "free_offset", though, since we're moving it to a different struct anyway.
            A minor discomfort I have is ShmemBase, which is the starting address
            that the allocator will use, remains outside of ShmemAllocatorData().
            But it doesn't change once set so no need to "share" it through the
            memory and also that will create a self-referencing pointer within the
            shared memory. So it's fine.
            I think we can just get rid of ShmemBase and ShmemEnd. Their values
            can be derived from other variables at run time as done in the
            attached patch (0002). Alternatively, we could add them to
            ShmemAllocatorData itself to keep everything related to allocation
            together. But it's not really needed.
            I wanted to go as far as creating yet another structure to hold
            ShmemSegHdr and ShmemAllocator together. Having a structure will help
            in the shared buffer resizing project which needs multiple shared
            memory segments. But it can wait.
            What do you think?
            Meh, it doesn't feel much of an improvement to me. You're right that ShmemBase and ShmemEnd can be derived, but it feels kind of nice to have them as as explicit variables.
            So, pushed just the first patch, thanks!
            - Heikki
            • Jump to comment-1
              Ashutosh Bapat<ashutosh.bapat.oss@gmail.com>
              Jan 31, 2026, 9:00 AM UTC
              On Fri, Jan 30, 2026 at 9:58 PM Heikki Linnakangas <hlinnaka@iki.fi> wrote:

              On 30/01/2026 11:22, Ashutosh Bapat wrote:
              On Thu, Jan 29, 2026 at 9:11 PM Ashutosh Bapat
              <ashutosh.bapat.oss@gmail.com> wrote:
              A nitpick should content_offset be contentOffset (like totalSize) or
              contentoffset (like dsmcontrol)? I am ok either way.

              Huh, what was I looking at? there is no totalSize in PGShmemHeader.
              The new member's name fits the style of other names. Ignore this
              comment.

              Right. I renamed "freeoffset" to "free_offset", though, since we're
              moving it to a different struct anyway.
              A minor discomfort I have is ShmemBase, which is the starting address
              that the allocator will use, remains outside of ShmemAllocatorData().
              But it doesn't change once set so no need to "share" it through the
              memory and also that will create a self-referencing pointer within the
              shared memory. So it's fine.

              I think we can just get rid of ShmemBase and ShmemEnd. Their values
              can be derived from other variables at run time as done in the
              attached patch (0002). Alternatively, we could add them to
              ShmemAllocatorData itself to keep everything related to allocation
              together. But it's not really needed.

              I wanted to go as far as creating yet another structure to hold
              ShmemSegHdr and ShmemAllocator together. Having a structure will help
              in the shared buffer resizing project which needs multiple shared
              memory segments. But it can wait.

              What do you think?

              Meh, it doesn't feel much of an improvement to me. You're right that
              ShmemBase and ShmemEnd can be derived, but it feels kind of nice to have
              them as as explicit variables.

              So, pushed just the first patch, thanks!
              Thanks. Closed the CF entry as committed.
              --
              Best Wishes,
              Ashutosh Bapat
          • Jump to comment-1
            Ashutosh Bapat<ashutosh.bapat.oss@gmail.com>
            Jan 30, 2026, 2:35 PM UTC
            On Fri, Jan 30, 2026 at 2:52 PM Ashutosh Bapat
            <ashutosh.bapat.oss@gmail.com> wrote:

            Added this to CF https://commitfest.postgresql.org/patch/6443/ to get
            exercised by CFBot.
            Patches fixing windows build.
            --
            Best Wishes,
            Ashutosh Bapat