Limit GRAPH_TABLE path combinations to prevent memory exhaustion

  • Jump to comment-1
    SATYANARAYANA NARLAPURAM<satyanarlapuram@gmail.com>
    Apr 29, 2026, 4:36 PM UTC
    Hi hackers,
    generatequeriesforpathpattern_recurse() enumerates all path
    combinations by recursing over the Cartesian product of matching elements
    per pattern position. Without IS label filters, each position matches
    ALL tables of that kind, leading to N^K combinations (N tables, K
    pattern positions). Each combination allocates a Query node via palloc
    causing unbounded memory growth.
    A 8-table graph with a -element pattern reaches 81.3 GB RES in a few seconds
    before I cancel the query. Tests in the patch (those were failed) can
    reproduce the problem
    without the fix included in the patch.
    top - 15:04:19 up 43 days, 19:18, 5 users, load average: 0.43, 0.19, 0.08
    Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie
    %Cpu(s): 0.9 us, 0.8 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si,
    0.0 st
    MiB Mem : 515766.2 total, 248412.7 free, 234847.7 used, 48014.7 buff/cache
    MiB Swap: 0.0 total, 0.0 free, 0.0 used. 280918.6 avail Mem
    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+
    COMMAND
    649642 azureus+ 20 0 212.2g 81.3g 33948 R 100.0 16.1 0:41.20
    postgres
    As a POC I added a pre-computation check that calculates the total number
    of path combinations before entering the
    generatequeriesforpathpattern_recurse.
    If the product exceeds MAXGRAPHTABLEPATHCOMBINATIONS (set to 10,000),
    the rewriter reports ERRCODEPROGRAMLIMIT_EXCEEDED with a hint suggesting
    IS label filters to reduce the search space. The limit of 10,000 is
    somewhat arbitrary
    but conservative. It caps memory at roughly 5 MB of Query nodes.
    Patterns that would exceed the limit without labels can always be made to
    succeed
    by adding IS expressions to pin specific positions to fewer tables.
    Alternatively, we can consider adding a GUC to control the limit but appears
    to be an overkill. Thoughts?
    Thanks,
    Satya
    • Jump to comment-1
      Ashutosh Bapat<ashutosh.bapat.oss@gmail.com>
      Apr 30, 2026, 7:16 AM UTC
      On Wed, Apr 29, 2026 at 10:05 PM SATYANARAYANA NARLAPURAM
      <satyanarlapuram@gmail.com> wrote:

      Hi hackers,

      generatequeriesforpathpattern_recurse() enumerates all path
      combinations by recursing over the Cartesian product of matching elements
      per pattern position. Without IS label filters, each position matches
      ALL tables of that kind, leading to N^K combinations (N tables, K
      pattern positions). Each combination allocates a Query node via palloc
      causing unbounded memory growth.

      A 8-table graph with a -element pattern reaches 81.3 GB RES in a few seconds
      before I cancel the query. Tests in the patch (those were failed) can reproduce the problem
      without the fix included in the patch.

      top - 15:04:19 up 43 days, 19:18, 5 users, load average: 0.43, 0.19, 0.08
      Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie
      %Cpu(s): 0.9 us, 0.8 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
      MiB Mem : 515766.2 total, 248412.7 free, 234847.7 used, 48014.7 buff/cache
      MiB Swap: 0.0 total, 0.0 free, 0.0 used. 280918.6 avail Mem

      PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
      649642 azureus+ 20 0 212.2g 81.3g 33948 R 100.0 16.1 0:41.20 postgres
      I tried to reproduce this problem on my laptop with queries included
      in the patch. For me all the queries finished. First within 4 ms,
      second within 133 ms and the third in 12ms. Did you try with more
      edges or more rows in the tables?

      As a POC I added a pre-computation check that calculates the total number
      of path combinations before entering the generatequeriesforpathpattern_recurse.
      If the product exceeds MAXGRAPHTABLEPATHCOMBINATIONS (set to 10,000),
      the rewriter reports ERRCODEPROGRAMLIMIT_EXCEEDED with a hint suggesting
      IS label filters to reduce the search space. The limit of 10,000 is somewhat arbitrary
      but conservative. It caps memory at roughly 5 MB of Query nodes.
      Patterns that would exceed the limit without labels can always be made to succeed
      by adding IS expressions to pin specific positions to fewer tables.
      Alternatively, we can consider adding a GUC to control the limit but appears
      to be an overkill. Thoughts?
      I understand the problem and can understand the pain. Somebody who is
      writing a query like ()-()-()-()-() ... should know that every pattern
      that doesn't have a label will be matched against each vertex/edge.
      Our documentation makes it explicit [1] "The above patterns would
      match any vertex, or any two vertices connected by any edge ... ". But
      if they are really interested in matching every vertex or edge they
      don't have any other choice. Using restrictive label expressions would
      lead to wrong or incomplete results. If they can use label expressions
      they should do so anyway, otherwise they will end up with incorrect
      results. To me it seems like somebody who is writing such a pattern
      without providing enough resources is writing a bad query.
      We have other places where queries can consume large amounts of memory
      during planning or execution. Simply take the SQL query equivalent to
      the above pattern. We do not have a way to prohibit such queries as
      far as I know. I understand that SQL/PGQ makes it easy to write such
      queries by hand. But that seems to be abusing a powerful tool.
      Another example is joining partitioned tables with thousands of
      partitions. We have a GUC which enables or disables partitionwise join
      but there is no GUC to limit the number of tables or partitions being
      joined.
      I think we can document that such a pattern can result in large
      queries which may consume memory.
      Said that 81.3 GB looks unreasonably large for
      generatequeriesforpathpattern_recurse() alone. I guess a large
      portion of it comes from planning and execution. How many rows did
      those tables had? Which phase of query execution consumed that much
      memory? Do you have a dump of memory contexts when it reaches that
      limit?
      [1] https://www.postgresql.org/docs/devel/queries-graph.html
      --
      Best Wishes,
      Ashutosh Bapat
      • Jump to comment-1
        SATYANARAYANA NARLAPURAM<satyanarlapuram@gmail.com>
        Apr 30, 2026, 7:01 PM UTC
        Hi,
        On Thu, Apr 30, 2026 at 12:16 AM Ashutosh Bapat <
        ashutosh.bapat.oss@gmail.com> wrote:
        On Wed, Apr 29, 2026 at 10:05 PM SATYANARAYANA NARLAPURAM
        <satyanarlapuram@gmail.com> wrote:

        Hi hackers,

        generatequeriesforpathpattern_recurse() enumerates all path
        combinations by recursing over the Cartesian product of matching elements
        per pattern position. Without IS label filters, each position matches
        ALL tables of that kind, leading to N^K combinations (N tables, K
        pattern positions). Each combination allocates a Query node via palloc
        causing unbounded memory growth.

        A 8-table graph with a -element pattern reaches 81.3 GB RES in a few
        seconds
        before I cancel the query. Tests in the patch (those were failed) can
        reproduce the problem
        without the fix included in the patch.

        top - 15:04:19 up 43 days, 19:18, 5 users, load average: 0.43, 0.19,
        0.08
        Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie
        %Cpu(s): 0.9 us, 0.8 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si,
        0.0 st
        MiB Mem : 515766.2 total, 248412.7 free, 234847.7 used, 48014.7
        buff/cache
        MiB Swap: 0.0 total, 0.0 free, 0.0 used. 280918.6 avail
        Mem

        PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
        COMMAND
        649642 azureus+ 20 0 212.2g 81.3g 33948 R 100.0 16.1 0:41.20
        postgres

        I tried to reproduce this problem on my laptop with queries included
        in the patch. For me all the queries finished. First within 4 ms,
        second within 133 ms and the third in 12ms. Did you try with more
        edges or more rows in the tables?

        As a POC I added a pre-computation check that calculates the total number
        of path combinations before entering the
        generatequeriesforpathpattern_recurse.
        If the product exceeds MAXGRAPHTABLEPATHCOMBINATIONS (set to 10,000),
        the rewriter reports ERRCODEPROGRAMLIMIT_EXCEEDED with a hint
        suggesting
        IS label filters to reduce the search space. The limit of 10,000 is
        somewhat arbitrary
        but conservative. It caps memory at roughly 5 MB of Query nodes.
        Patterns that would exceed the limit without labels can always be made
        to succeed
        by adding IS expressions to pin specific positions to fewer tables.
        Alternatively, we can consider adding a GUC to control the limit but
        appears
        to be an overkill. Thoughts?

        I understand the problem and can understand the pain. Somebody who is
        writing a query like ()-()-()-()-() ... should know that every pattern
        that doesn't have a label will be matched against each vertex/edge.
        Our documentation makes it explicit [1] "The above patterns would
        match any vertex, or any two vertices connected by any edge ... ". But
        if they are really interested in matching every vertex or edge they
        don't have any other choice. Using restrictive label expressions would
        lead to wrong or incomplete results. If they can use label expressions
        they should do so anyway, otherwise they will end up with incorrect
        results. To me it seems like somebody who is writing such a pattern
        without providing enough resources is writing a bad query.

        We have other places where queries can consume large amounts of memory
        during planning or execution. Simply take the SQL query equivalent to
        the above pattern. We do not have a way to prohibit such queries as
        far as I know. I understand that SQL/PGQ makes it easy to write such
        queries by hand. But that seems to be abusing a powerful tool.

        Another example is joining partitioned tables with thousands of
        partitions. We have a GUC which enables or disables partitionwise join
        but there is no GUC to limit the number of tables or partitions being
        joined.

        I think we can document that such a pattern can result in large
        queries which may consume memory.

        Said that 81.3 GB looks unreasonably large for
        generatequeriesforpathpattern_recurse() alone. I guess a large
        portion of it comes from planning and execution. How many rows did
        those tables had? Which phase of query execution consumed that much
        memory? Do you have a dump of memory contexts when it reaches that
        limit?
        I am worried about a potential DOS by a low privileged user or an
        accidental query
        causing OOM.
        Please find the attached patch that added traces to print the memory
        context.
        Patch also includes CFI to cancel the query which we didn't have earlier.
        You can see traces like below once you run the repro:
        2026-04-30 18:43:51.268 UTC [927121] LOG: GRAPH_TABLE: before
        generatequeriesforpathpattern_recurse, current memory context
        "MessageContext": 39392903168 bytes total
        2026-04-30 18:43:51.268 UTC [927121] STATEMENT: SELECT count(*) FROM
        GRAPH_TABLE (g5 MATCH (a)-[e1]->(b)-[e2]->(c)-[e3]->(d)-[e4]->(e) COLUMNS (
        a.id AS aid));
        2026-04-30 18:43:51.268 UTC [927121] ERROR: canceling statement due to
        user request
        2026-04-30 18:43:51.268 UTC [927121] STATEMENT: SELECT count(*) FROM
        GRAPH_TABLE (g5 MATCH (a)-[e1]->(b)-[e2]->(c)-[e3]->(d)-[e4]->(e) COLUMNS (
        a.id AS aid));
        Repro:
        CREATE temp TABLE v1 (id int PRIMARY KEY, val int);
        CREATE temp TABLE v2 (id int PRIMARY KEY, val int);
        CREATE temp TABLE v3 (id int PRIMARY KEY, val int);
        CREATE temp TABLE v4 (id int PRIMARY KEY, val int);
        CREATE temp TABLE v5 (id int PRIMARY KEY, val int);
        CREATE temp TABLE v6 (id int PRIMARY KEY, val int);
        CREATE temp TABLE v7 (id int PRIMARY KEY, val int);
        CREATE temp TABLE v8 (id int PRIMARY KEY, val int);
        CREATE temp TABLE e1 (id int PRIMARY KEY, src int, dest int);
        CREATE temp TABLE e2 (id int PRIMARY KEY, src int, dest int);
        CREATE temp TABLE e3 (id int PRIMARY KEY, src int, dest int);
        CREATE temp TABLE e4 (id int PRIMARY KEY, src int, dest int);
        CREATE temp TABLE e5 (id int PRIMARY KEY, src int, dest int);
        CREATE temp TABLE e6 (id int PRIMARY KEY, src int, dest int);
        CREATE temp TABLE e7 (id int PRIMARY KEY, src int, dest int);
        CREATE temp TABLE e8 (id int PRIMARY KEY, src int, dest int);
        CREATE PROPERTY GRAPH g5
        VERTEX TABLES (
            v1 LABEL vl PROPERTIES (id, val) LABEL vl2 PROPERTIES (id, val),
            v2 LABEL vl PROPERTIES (id, val),
            v3 LABEL vl PROPERTIES (id, val),
            v4 LABEL vl PROPERTIES (id, val),
            v5 LABEL vl PROPERTIES (id, val),
            v6 LABEL vl PROPERTIES (id, val),
            v7 LABEL vl PROPERTIES (id, val),
            v8 LABEL vl PROPERTIES (id, val)
        )
        EDGE TABLES (
            e1 SOURCE KEY (src) REFERENCES v1 (id) DESTINATION KEY (dest)
        REFERENCES v2 (id) LABEL el PROPERTIES (id),
            e2 SOURCE KEY (src) REFERENCES v2 (id) DESTINATION KEY (dest)
        REFERENCES v3 (id) LABEL el PROPERTIES (id),
            e3 SOURCE KEY (src) REFERENCES v3 (id) DESTINATION KEY (dest)
        REFERENCES v4 (id) LABEL el PROPERTIES (id),
            e4 SOURCE KEY (src) REFERENCES v4 (id) DESTINATION KEY (dest)
        REFERENCES v5 (id) LABEL el PROPERTIES (id),
            e5 SOURCE KEY (src) REFERENCES v5 (id) DESTINATION KEY (dest)
        REFERENCES v6 (id) LABEL el PROPERTIES (id),
            e6 SOURCE KEY (src) REFERENCES v6 (id) DESTINATION KEY (dest)
        REFERENCES v7 (id) LABEL el PROPERTIES (id),
            e7 SOURCE KEY (src) REFERENCES v7 (id) DESTINATION KEY (dest)
        REFERENCES v8 (id) LABEL el PROPERTIES (id),
            e8 SOURCE KEY (src) REFERENCES v8 (id) DESTINATION KEY (dest)
        REFERENCES v1 (id) LABEL el PROPERTIES (id)
        );
        
        SELECT count(*) FROM GRAPH_TABLE (g5 MATCH
        (a)-[e1]->(b)-[e2]->(c)-[e3]->(d)-[e4]->(e) COLUMNS (a.id AS aid));