Some questions about JIT optimization

    ywgrit<yw987194828@gmail.com>
    Jan 26, 2026, 3:11 AM UTC
    Presentation:
    https://anarazel.de/talks/2018-06-01-pgcon-state-of-jit/state-of-jit.pdf
    It mentions “Future things to JIT: Aggregate & Hashjoin hash computation.”
    I'm not entirely clear where this optimization would actually show up. I
    tested TPC-H at 100 GB. Performance improvements for Q5 and Q11 were
    minimal because expression evaluation accounts for a very small share of
    the execution time in these two queries, leaving little room for JIT
    optimization. Both queries share the same hot functions:
    `ExecParallelScanHashBucket` and `ExecParallelHashJoin`. However, I don't
    see how these hot functions could be optimized via JIT. I'd appreciate
    hearing everyone's thoughts.
    Thanks.
      David Rowley<dgrowleyml@gmail.com>
      Jan 26, 2026, 3:35 AM UTC
      I imagine Andres is talking about the computation of the hash value
      itself. The talk predates [1] (from PG18), so I expect that item
      is no longer relevant.

      In [1], you can see that the Hash Join hash value used to be obtained
      via ExecHashGetHashValue(), which evaluated each hash key independently
      (perhaps resulting in inefficient incremental tuple deformation)
      before calling the hash function on the resulting value. [1] moved all
      of that into the expression evaluation code so that all hash keys are
      evaluated in the same expression. That allows the hash function call
      to be inlined when jitted. It also allows the "keep_nulls" run-time
      check to be jitted away, which removes the overhead of effectively
      having to continually check which join type is being processed.
      David
      [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=adf97c156
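
To make the difference concrete, here is a minimal sketch in C of the two shapes described above. It is a toy model, not PostgreSQL code: `ToyKey`, `toy_hash_uint32`, `hash_keys_generic` and `hash_2keys_strict` are hypothetical names, and the rotate-and-XOR combining step is an assumption chosen for illustration. The first function hashes each key through an indirect call and checks `keep_nulls` on every key; the second is the kind of specialised code a JIT could emit once the key count, the hash function and `keep_nulls` are all known when the expression is built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a key value plus its null flag. */
typedef struct
{
    uint32_t value;
    bool     isnull;
} ToyKey;

/* Hypothetical per-type hash function (an integer mixer, not PostgreSQL's). */
static inline uint32_t
toy_hash_uint32(uint32_t v)
{
    v ^= v >> 16;
    v *= 0x85ebca6bU;
    v ^= v >> 13;
    v *= 0xc2b2ae35U;
    v ^= v >> 16;
    return v;
}

/* Rotate a 32-bit value left by one bit. */
static inline uint32_t
rotl32_1(uint32_t x)
{
    return (x << 1) | (x >> 31);
}

typedef uint32_t (*toy_hash_fn) (uint32_t);

/*
 * Generic shape: one indirect call per key, and keep_nulls is a run-time
 * parameter that has to be tested every time a NULL key is seen.
 * Returns false if the tuple cannot match and should be skipped.
 */
static bool
hash_keys_generic(const ToyKey *keys, size_t nkeys, const toy_hash_fn *fns,
                  bool keep_nulls, uint32_t *hash_out)
{
    uint32_t h = 0;

    assert(nkeys > 0);
    for (size_t i = 0; i < nkeys; i++)
    {
        h = rotl32_1(h);
        if (keys[i].isnull)
        {
            if (!keep_nulls)
                return false;
            /* otherwise leave h unchanged, as if the key hashed to 0 */
        }
        else
            h ^= fns[i](keys[i].value);
    }
    *hash_out = h;
    return true;
}

/*
 * Specialised shape for exactly two uint32 keys with keep_nulls = false:
 * no loop, no function pointers, no keep_nulls test. The hash function is
 * called directly, so the compiler is free to inline it.
 */
static bool
hash_2keys_strict(const ToyKey *keys, uint32_t *hash_out)
{
    if (keys[0].isnull || keys[1].isnull)
        return false;
    *hash_out = rotl32_1(toy_hash_uint32(keys[0].value))
        ^ toy_hash_uint32(keys[1].value);
    return true;
}
```

For non-NULL keys both functions produce the same hash value; the specialised one just does less work per tuple. Note that this only covers computing the hash value, so it would not be expected to change the time spent probing buckets in functions such as `ExecParallelScanHashBucket`.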
        ywgrit<yw987194828@gmail.com>
        Jan 27, 2026, 7:33 AM UTC
        I've got it figured out. Thank you very much.