Patch: Add tsmatch JSONPath operator for granular Full Text Search

  • Jump to comment-1
    Florents Tselai<florents.tselai@gmail.com>
    Jan 26, 2026, 5:23 PM UTC
    Hi,
    in real-life I work a lot with json & fts search, here's a feature I've
    always wished I had,
    but never tackle it. Until yesterday that is.
    SELECT jsonb_path_query(doc, '$.comments[*] ? (@.user == "Alice" && @.body
    tsmatch "performance")');
    This patch introduces a tsmatch boolean operator to the JSONPath engine.
    By integrating FTS natively into path expressions,
    this operator allows for high-precision filtering of nested JSONB
    structures—
    solving issues with structural ambiguity and query complexity.
    Currently, users must choose between two suboptimal paths for FTS-ing
    nested JSON:
    - Imprecise Global Indexing
    jsonbtotsvector aggregates text into a flat vector.
    This ignores JSON boundaries, leading to false positives when the same key
    (e.g., "body")
    appears in different contexts (e.g., a "Product Description" vs. a
    "Customer Review").
    - Complex SQL Workarounds
    Achieving 100% precision requires unnesting the document via
    jsonbarrayelements and LATERAL joins.
    This leads to verbose SQL and high memory overhead from generating
    intermediate heap tuples.
    One of the most significant advantages of tsmatch is its ability to
    participate in multi-condition predicates
    within the same JSON object - something jsonbtotsvector cannot do.
    SELECT jsonb_path_query(doc, '$.comments[*] ? (@.user == "Alice" && @.body
    tsmatch "performance")');
    In a flat vector, the association between "Alice" and "performance" is lost.
    tsmatch preserves this link by evaluating the FTS predicate in-place during
    path traversal.
    While the SQL/JSON standard (ISO/IEC 9075-2) does not explicitly define an
    FTS operator,
    tsmatch is architecturally modeled after the standard-defined like_regex.
    The implementation follows the like_regex precedent:
    it is a non-indexable predicate that relies on GIN path-matching for
    pruning and heap re-checks for precision.
    Caching is scoped to the JsonPathExecContext,
    ensuring 'compile-once' efficiency per execution without violating the
    stability requirements of prepared statements.
    This initial implementation uses plainto_tsquery.
    However, the grammar is designed to support a "mode" flag (similar to
    like_regex flags)
    in future iterations to toggle between totsquery, websearchto_tsquery,
    and phraseto_tsquery.