Mark Needham

Thoughts on Software Development

neography/neo4j/Lucene: Getting a list of all the nodes indexed

I’ve been playing around with neo4j using the neography gem to create a graph of all the people in ThoughtWorks and the connections between them based on working with each other.

I created a UI where you can type in the names of two people and see when they’ve worked together, or the shortest path between them if they haven’t.

I thought it would be cool to have autocomplete functionality when typing in a name, but I couldn’t figure out how to partially query the index of people’s names that I’d created.

I have this Lucene index:

@neo = Neography::Rest.new
@neo.create_node_index("people", "fulltext", "lucene")

Which I add to like this:

node = @neo.create_node("name" => "Mark Needham")
@neo.add_node_to_index("people", "name", "Mark Needham", node)

Looking a person up by their exact name works:

> @neo.get_index("people", "name", "Mark Needham")
=> [{"indexed"=>"http://localhost:7474/db/data/index/node/people/name/Mark%20Needham/979", "outgoing_relationships"=>"http://localhost:7474/db/data/node/979/relationships/out", "data"=>{"name"=>"Mark Needham"}, "traverse"=>"http://localhost:7474/db/data/node/979/traverse/{returnType}", "all_typed_relationships"=>"http://localhost:7474/db/data/node/979/relationships/all/{-list|&|types}", "property"=>"http://localhost:7474/db/data/node/979/properties/{key}", "self"=>"http://localhost:7474/db/data/node/979", "properties"=>"http://localhost:7474/db/data/node/979/properties", "outgoing_typed_relationships"=>"http://localhost:7474/db/data/node/979/relationships/out/{-list|&|types}", "incoming_relationships"=>"http://localhost:7474/db/data/node/979/relationships/in", "extensions"=>{}, "create_relationship"=>"http://localhost:7474/db/data/node/979/relationships", "paged_traverse"=>"http://localhost:7474/db/data/node/979/paged/traverse/{returnType}{?pageSize,leaseTime}", "all_relationships"=>"http://localhost:7474/db/data/node/979/relationships/all", "incoming_typed_relationships"=>"http://localhost:7474/db/data/node/979/relationships/in/{-list|&|types}"}]

I came across an old mailing list thread which suggested the following solution:

One solution is to add a field with a known and constant value to each document in the index. Then searching for that field and value will give you all documents in the index.

I changed my code to do that:

node = @neo.create_node("name" => "Mark Needham")
@neo.add_node_to_index("people", "name", "Mark Needham", node)
@neo.add_node_to_index("people", "type", "person", node)
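
To see why the constant field helps, here is a toy in-memory model of an exact-match index. It is only a sketch: `ToyIndex` and its `add`/`get` methods are made up for illustration and are not part of neography or neo4j, and the names are sample data. The point is that a lookup needs an exact key and value, so giving every node the same `type`/`person` entry turns "give me everything" into an ordinary lookup.

```ruby
# Toy stand-in for an exact-match index (hypothetical, not the neography API).
# Entries are stored under a [key, value] pair, just like an exact lookup expects.
class ToyIndex
  def initialize
    @entries = Hash.new { |hash, pair| hash[pair] = [] }
  end

  # Register a node under the given key and value.
  def add(key, value, node)
    @entries[[key, value]] << node
  end

  # Return every node registered under exactly this key and value.
  def get(key, value)
    @entries.fetch([key, value], [])
  end
end

index = ToyIndex.new
["Mark Needham", "Michael Hunger"].each do |name|
  node = { "name" => name }
  index.add("name", name, node)       # one distinct entry per person
  index.add("type", "person", node)   # the same constant entry for everyone
end

names = index.get("type", "person").map { |node| node["name"] }
# names is now ["Mark Needham", "Michael Hunger"]
```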

From my Sinatra web app I then load the names of all the people into an application-level setting when the app starts:

configure do
  set :all_people, Neography::Rest.new.get_index("people", "type", "person").map { |n| n["data"]["name"] }
end

And then search through that like so:

get '/people' do
  # Fall back to an empty term so a missing parameter returns everyone
  search_term = (params["term"] || "").downcase
  settings.all_people.select { |p| p.downcase.start_with?(search_term) }.to_json
end
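
The filtering itself doesn't depend on Sinatra or neo4j, so it can be pulled out and tried on its own. This is a minimal sketch: `names_starting_with` is a name I've made up for the example, and the list of people is sample data rather than anything read from the index.

```ruby
# Hypothetical helper: case-insensitive prefix match over a list of names.
# A nil term is treated as an empty prefix, which matches every name.
def names_starting_with(names, term)
  prefix = term.to_s.downcase
  names.select { |name| name.downcase.start_with?(prefix) }
end

# Sample data standing in for settings.all_people
all_people = ["Mark Needham", "Mary Example", "Michael Hunger"]

matches = names_starting_with(all_people, "mar")
# matches is ["Mark Needham", "Mary Example"]
```

Note that this only matches from the start of the full name, so typing a surname on its own won't find anyone.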

It works, and since there’s only one query against the Lucene index, made when the web server first starts, it’s pretty quick. But surely there’s a less hacky, more proper way?

Written by Mark Needham

April 17th, 2012 at 6:54 am

Posted in Software Development

  • Michael Hunger

    If you only have people nodes in the graph you can use a cypher query with start person = node(*) to select all nodes.

    But for the type-ahead it is more sensible to do a Lucene index query (not a lookup):

    query_index("people", "name:*") for all
    query_index("people", "name:#{prefix}*") for the people starting with prefix

    Cheers

    Michael
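
Following up on the comment above: if the prefix comes straight from user input, it probably needs escaping before it goes into a Lucene query string. Here is a rough sketch of building that string. `lucene_prefix_query` is a made-up helper, not something neography provides, and I haven't checked which neography method accepts a raw query string, so the sketch only builds the string. The prefix is lowercased on the assumption that the fulltext index stores lowercased terms.

```ruby
# Characters with special meaning in Lucene query syntax, plus whitespace.
LUCENE_SPECIAL = /[+\-!(){}\[\]^"~*?:\\\/&|\s]/

# Hypothetical helper: build a Lucene wildcard query that matches terms
# in `field` starting with `prefix`. An empty prefix matches everything.
def lucene_prefix_query(field, prefix)
  escaped = prefix.to_s.downcase.gsub(LUCENE_SPECIAL) { |char| "\\#{char}" }
  escaped.empty? ? "#{field}:*" : "#{field}:#{escaped}*"
end

lucene_prefix_query("name", "Mar")  # => "name:mar*"
lucene_prefix_query("name", "")     # => "name:*"
```

Because a fulltext index splits names into words, a query like this would match against individual words rather than the whole name, so a prefix containing a space is unlikely to match anything.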