Mark Needham

Thoughts on Software Development

neo4j: Searching for nodes by name


As I mentioned in a post a few days ago, I’ve been graphing connections between ThoughtWorks people using neo4j and wanted to build autocomplete functionality so that I can search for the names of people in the graph.

The solution I came up with was to create a Lucene index with an entry for each node and a common property on each document in the index, so that I’d be able to get all the index entries easily.

I created the index like this, using the neography gem:

Neography::Rest.new.add_node_to_index("people", "type", "person", node)

I can then get all the names like this:

all_people = Neography::Rest.new.get_index("people", "type", "person").map { |n| n["data"]["name"] }

It seemed like there must be a better way to do this and Michael Hunger was kind enough to show me a couple of cleaner solutions.

One way is to query the initial index rather than creating a new one:

all_people = Neography::Rest.new.find_node_index("people", "name:*").map { |n| n["data"]["name"] }

The ‘find_node_index’ method allows us to pass in a Lucene query which gets executed via neo4j’s REST API. In this case we’re using a wildcard query on the ‘name’ property, so it returns every document in the index that has a ‘name’.

This way of getting all the names seemed to be much more memory intensive than my other approach, and when I ran it a few times in a row I started getting OutOfMemory errors. My graph only has a few thousand nodes in it, so I’m not sure why that is yet.

I think it should be possible to query the Lucene index directly with the partial name, but I was struggling to get spaces in the search term to encode correctly and was getting no results back.
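One thing that might help with the spaces is to escape them for Lucene first and then percent-encode the whole query. The following is only a sketch: ‘lucene_escape’ and ‘prefix_query’ are hypothetical helper names of my own, not part of neography, and I haven’t confirmed whether neography encodes the query itself or how the index’s analyzer treats the escaped term.

```ruby
require "erb"

# Characters that Lucene's classic query syntax treats as special,
# plus whitespace, which would otherwise split the term in two.
LUCENE_SPECIAL = /([+\-!(){}\[\]^"~*?:\\\/&|\s])/

# Hypothetical helper: backslash-escape anything Lucene would interpret.
def lucene_escape(term)
  term.gsub(LUCENE_SPECIAL) { "\\#{$1}" }
end

# Hypothetical helper: build a prefix (trailing wildcard) query on one field.
def prefix_query(field, partial)
  "#{field}:#{lucene_escape(partial)}*"
end

query = prefix_query("name", "Mark Need")

# Percent-encode for use in a URL query string; spaces become %20 rather than '+'.
encoded = ERB::Util.url_encode(query)

puts query
puts encoded
```

The escaped query could then be tried with ‘find_node_index’ in place of "name:*"; the percent-encoded form is only needed if the HTTP request is built by hand.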

Another approach is to use a Cypher query to get a collection of all the nodes:

all_people = Neography::Rest.new.execute_query("start n=node(*) return n")["data"].map { |n| n[0]["data"]["name"] }

I imagine this approach wouldn’t scale with graph size but for my graph it works just fine.
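Whichever way the names are fetched, the autocomplete itself can then be a simple in-memory filter. This is a minimal sketch rather than what my application actually does: ‘autocomplete’ is a hypothetical helper, and the array below is sample data standing in for ‘all_people’. The ‘compact’ call is there because a node without a ‘name’ property would show up as nil in the list.

```ruby
# Hypothetical helper: return the names starting with the typed prefix,
# ignoring case, sorted and capped at `limit` results.
def autocomplete(names, prefix, limit = 10)
  needle = prefix.strip.downcase
  return [] if needle.empty?

  names.compact
       .select { |name| name.downcase.start_with?(needle) }
       .sort
       .first(limit)
end

# Sample data standing in for the all_people array; nil represents a
# node that has no 'name' property.
all_people = ["Mark Needham", "Michael Hunger", nil, "Andy Palmer"]

puts autocomplete(all_people, "m").inspect
puts autocomplete(all_people, "MARK n").inspect
```

With a few thousand names this linear scan should be fast enough, and it avoids a round trip to neo4j on every keystroke.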


Written by Mark Needham

April 20th, 2012 at 7:10 am

Posted in neo4j,Software Development


  • http://andypalmer.com Andy Palmer

    I’m probably doing it wrong, but I’ve set up my database with type-roots which means that I can get all people’s names either by:
    start root=node(0) match root-[:PERSON_ROOT]->()<-[:IS_A]-person return person.name
    or (where I know the node id of the person root):
    start person_root=node(1) match person_root<-[:IS_A]-person return person.name

  • Michael Hunger

    That’s perfectly fine. In-graph indexes (which is what type or category trees are) are the graphy way of addressing the problem. The only thing that could be better is checking whether a relationship already exists between the two nodes when adding to the type node.

    Index-based lookups might help with some super-node issues (though there are other solutions too) and with full-text searching.