Recursive query challenge - simple parent/child example Recursive query challenge - simple parent/child example postgresql postgresql

Recursive query challenge - simple parent/child example


With help from RhodiumToad on #postgresql, I've arrived at this solution:

WITH RECURSIVE node_graph AS (    SELECT ancestor_node_id as path_start, descendant_node_id as path_end,           array[ancestor_node_id, descendant_node_id] as path     FROM node_relations    UNION ALL     SELECT ng.path_start, nr.descendant_node_id as path_end,           ng.path || nr.descendant_node_id as path    FROM node_graph ng    JOIN node_relations nr ON ng.path_end = nr.ancestor_node_id) SELECT * from node_graph order by path_start, array_length(path,1);

The result is exactly as expected.


An alternative approach would be to traverse the graph in reversed order:

WITH RECURSIVE cte AS (   SELECT array[r.ancestor_node_id, r.descendant_node_id] AS path   FROM   node_relations r   LEFT   JOIN node_relations r0 ON r0.ancestor_node_id = r.descendant_node_id   WHERE  r0.ancestor_node_id IS NULL  -- start at the end   UNION ALL    SELECT r.ancestor_node_id || c.path   FROM   cte c   JOIN   node_relations r ON r.descendant_node_id = c.path[1]   ) SELECT pathFROM   cteORDER  BY path;

This produces a subset with every path from each root node to its ultimate descendant. For deep trees that also spread out a lot this would entail much fewer join operations. To additionally add every sub-path, you could append a LATERAL join to the outer SELECT:

WITH RECURSIVE cte AS (   SELECT array[r.ancestor_node_id, r.descendant_node_id] AS path   FROM   node_relations r   LEFT   JOIN node_relations r0 ON r0.ancestor_node_id = r.descendant_node_id   WHERE  r0.ancestor_node_id IS NULL  -- start at the end   UNION ALL    SELECT r.ancestor_node_id || c.path   FROM   cte c   JOIN   node_relations r ON r.descendant_node_id = c.path[1]   ) SELECT l.pathFROM   cte, LATERAL (   SELECT path[1:g] AS path   FROM   generate_series(2, array_length(path,1)) g   ) lORDER  BY l.path;

I ran a quick test, but it didn't run faster than RhodiumToad's solution. It might still be faster for big or wide tables. Try with your data.


I see two problems with the query:

  1. the non-recursive part does not specify the root node. You need to either explicitely select that using WHERE descendant_node_id = 14 or "dynamically" using:

    SELECT ..FROM   node_relations nr1WHERE  NOT EXISTS (SELECT 1                   FROM node_relations nr2                   WHERE nr2.ancestor_node_id = nr1.descendant_node_id)
  2. with the correct starting point, the path is not complete as it will miss the final node during the aggregation in the recursive part. So in the outer query you need to append ancestor_node_id to the generated path.

So the query would look like this:

WITH RECURSIVE node_graph AS (   SELECT nr1.id, nr1.ancestor_node_id, nr1.descendant_node_id, ARRAY[nr1.descendant_node_id] AS path, 0 as level   FROM   node_relations nr1   WHERE  NOT EXISTS (SELECT 1                      FROM node_relations nr2                      WHERE nr2.ancestor_node_id = nr1.descendant_node_id)   UNION  ALL   SELECT nr.id, nr.ancestor_node_id, nr.descendant_node_id, nr.descendant_node_id || ng.path, ng.level + 1 as level   FROM node_relations nr     JOIN node_graph ng ON ng.ancestor_node_id = nr.descendant_node_id)SELECT ancestor_node_id||path as path, -- add the last element of the sub-tree to the path       level as depthFROM   node_graphORDER  BY level

Here is the SQLFiddle: http://sqlfiddle.com/#!15/e646b/3