Mark Needham

Thoughts on Software Development

neo4j/cypher: Keep longest path when finding taxonomy

with one comment

I’ve been playing around with modelling a product taxonomy and one thing that I wanted to do was find out the full path where a product sits under the tree.

I created a simple data set to show the problem:

CREATE (cat { name: "Cat" })
CREATE (subcat1 { name: "SubCat1" })
CREATE (subcat2 { name: "SubCat2" })
CREATE (subsubcat1 { name: "SubSubCat1" })
CREATE (product1 { name: "Product1" })
CREATE (cat)-[:CHILD]-subcat1-[:CHILD]-subsubcat1
CREATE (product1)-[:HAS_CATEGORY]-(subsubcat1)

I wanted to write a query which would return ‘product1’ and the tree ‘Cat -> SubCat1 -> SubSubCat1’ and initially wrote the following query:

START product=node:node_auto_index(name="Product1") 
MATCH product-[:HAS_CATEGORY]-category, taxonomy=category<-[:CHILD*1..]-parent 
RETURN product, EXTRACT(n IN NODES(taxonomy): n.name)

which returns:

==> +--------------------------------------------------------------------+
==> | product                    | EXTRACT(n IN NODES(taxonomy): n.name) |
==> +--------------------------------------------------------------------+
==> | Node[888]{name:"Product1"} | ["SubSubCat1","SubCat1"]              |
==> | Node[888]{name:"Product1"} | ["SubSubCat1","SubCat1","Cat"]        |
==> +--------------------------------------------------------------------+
==> 2 rows

I didn’t want to return the first row since that isn’t the full tree and Andres suggested that looking for nodes which didn’t have any incoming children would help me do that:

START product=node:node_auto_index(name="Product1") 
MATCH product-[:HAS_CATEGORY]-category, 
      taxonomy=category<-[:CHILD*1..]-parent 
WHERE NOT parent<-[:CHILD]-() 
RETURN product, EXTRACT(n IN NODES(taxonomy): n.name)
==> +--------------------------------------------------------------------+
==> | product                    | EXTRACT(n IN NODES(taxonomy): n.name) |
==> +--------------------------------------------------------------------+
==> | Node[888]{name:"Product1"} | ["SubSubCat1","SubCat1","Cat"]        |
==> +--------------------------------------------------------------------+
==> 1 row

If we want to reverse the taxonomy so it’s in the right order we can follow Wes Freeman’s advice from the following Stack Overflow thread:

START product=node:node_auto_index(name="Product1") 
MATCH product-[:HAS_CATEGORY]-category, taxonomy=category<-[:CHILD*1..]-parent 
WHERE NOT parent<-[:CHILD]-() 
RETURN product, 
       REDUCE(acc=[], cat IN EXTRACT(n IN NODES(taxonomy): n.name): cat + acc) AS taxonomy
==> +-------------------------------------------------------------+
==> | product                    | taxonomy                       |
==> +-------------------------------------------------------------+
==> | Node[888]{name:"Product1"} | ["Cat","SubCat1","SubSubCat1"] |
==> +-------------------------------------------------------------+
==> 1 row
Be Sociable, Share!

Written by Mark Needham

May 19th, 2013 at 10:15 pm

Posted in neo4j

Tagged with ,