Mark Needham

Thoughts on Software Development

neo4j: Make properties relationships

with 5 comments

I spent some of the weekend working my way through Jim, Ian & Emil’s book ‘Graph Databases’, and one of the things they emphasise is that graphs allow us to make relationships first-class citizens in our model.

Looking back on a couple of the graphs that I modelled last year, I realise that I didn’t quite get this: although those graphs had some relationships, a lot of the time I was defining things as properties on nodes.

While it’s fine to do this, I think we lose some of the power of a graph, and it’s not necessarily obvious what we’ve lost until we model a property as a relationship and see what possibilities open up.

For example, in my football graph I wanted to record the date of matches and initially stored this as a property on the match, before realising that modelling it as a relationship might open up some interesting queries.

I created this relationship between a match and the month that the match took place in:
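
As a rough sketch, a relationship like this could be created with legacy Cypher along the following lines. Only the 'months' index and the in_month relationship type come from the queries in this post; the 'games' index, the lookup by game name and the direction of the relationship are assumptions made for illustration.

```cypher
// Sketch only: the 'games' index and the lookup by name are hypothetical.
// The direction (game -> month) is also an assumption; the queries in this
// post match the relationship without specifying a direction.
START game = node:games('name:"Reading vs Tottenham Hotspur"'),
      month = node:months('name:September')
CREATE UNIQUE game-[:in_month]->month
```

CREATE UNIQUE is used here so that running the statement twice doesn’t add a second in_month relationship between the same two nodes.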


As a result of having this relationship, I can now really easily find out which matches Gareth Bale played in September, for example:

START player = node:players('name:"Gareth Bale"'), month=node:months('name:September')
MATCH player-[:played_in]-game
WHERE game-[:in_month]-month
RETURN, game.home_goals + "-" +game.away_goals AS score,
|                                  | score |                   |
| "Reading vs Tottenham Hotspur"             | "1-3" | "2012-09-16 16:00:00 +0100" |
| "Tottenham Hotspur vs Norwich City"        | "1-1" | "2012-09-01 15:00:00 +0100" |
| "Tottenham Hotspur vs Queens Park Rangers" | "2-1" | "2012-09-23 16:00:00 +0100" |
| "Manchester United vs Tottenham Hotspur"   | "2-3" | "2012-09-29 17:30:00 +0100" |

Or we could find all the matches in December where one of the teams won by more than 2 goals:

START month=node:months('name:December')
MATCH month-[:in_month]-game
WHERE ABS(game.home_goals - game.away_goals) > 2
RETURN, game.home_goals + "-" +game.away_goals AS score,
|                            | score |                   |
| "Sunderland vs Reading"              | "3-0" | "2012-12-11 19:45:00 +0000" |
| "Reading vs Arsenal"                 | "2-5" | "2012-12-17 20:00:00 +0000" |
| "Newcastle United vs Wigan Athletic" | "3-0" | "2012-12-03 20:00:00 +0000" |
| "Fulham vs Tottenham Hotspur"        | "0-3" | "2012-12-01 15:00:00 +0000" |
| "Liverpool vs Fulham"                | "4-0" | "2012-12-22 17:30:00 +0000" |
| "Chelsea vs Aston Villa"             | "8-0" | "2012-12-23 16:00:00 +0000" |
| "Aston Villa vs Tottenham Hotspur"   | "0-4" | "2012-12-26 17:30:00 +0000" |
| "Aston Villa vs Wigan Athletic"      | "0-3" | "2012-12-29 15:00:00 +0000" |
| "Arsenal vs Newcastle United"        | "7-3" | "2012-12-29 17:30:00 +0000" |
| "Queens Park Rangers vs Liverpool"   | "0-3" | "2012-12-30 16:00:00 +0000" |

There are certainly other things we can find out now that the relationship between months and matches is explicit, but dates aren’t the only place where this idea comes in useful.

I already had players modelled in the data set, but I thought it’d be interesting to explore the data based on where players came from.

I therefore added the following relationships:
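
As a rough sketch, relationships like these could be created with legacy Cypher along the following lines. The 'players' and 'continents' indexes and the comes_from and is_in relationship types appear in the queries in this post; the 'countries' index and the directions of the relationships are assumptions made for illustration.

```cypher
// Sketch only: the 'countries' index is hypothetical, and the directions
// (player -> country -> continent) are assumed; the queries in this post
// match these relationships without specifying a direction.
START player = node:players('name:"Luis Suárez"'),
      country = node:countries('name:Uruguay'),
      continent = node:continents('name:"South America"')
CREATE UNIQUE player-[:comes_from]->country-[:is_in]->continent
```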


We can now find, for example, the top scorers in the Premiership who come from South America (accurate up to, but not including, last weekend):

START continent = node:continents('name:"South America"')
MATCH continent-[:is_in]-country-[:comes_from]-player-[:played|subbed_on]-stats-[:in]-game
WHERE player-[:scored_in]-game
RETURN,,, SUM(stats.goals) AS goals
|       | |       | goals |
| "Luis Suárez"     | "Uruguay"    | "Liverpool"       | 18    |
| "Sergio Agüero"   | "Argentina"  | "Manchester City" | 9     |
| "Carlos Tevez"    | "Argentina"  | "Manchester City" | 8     |
| "Franco Di Santo" | "Argentina"  | "Wigan Athletic"  | 5     |
| "Ramires"         | "Brazil"     | "Chelsea"         | 4     |

Or we could find out how many goals have been scored by players from each continent:

START continent = node:continents('name:*')
MATCH continent-[:is_in]-country-[:comes_from]-player-[:played|subbed_on]-stats-[:in]-game
WHERE player-[:scored_in]-game
RETURN, SUM(stats.goals) AS goals
|  | goals |
| "Europe"        | 569   |
| "Africa"        | 73    |
| "South America" | 62    |
| "North America" | 22    |
| "Asia"          | 3     |
| "Oceania"       | 3     |

I don’t think every property needs to be a relationship, but it can certainly be useful to consider modelling one that way, because it prompts interesting queries that you may not have thought of before.

As an aside, I’m working on putting this data set somewhere so people can play around with cypher queries on it, so if you’d be interested, let me know.


Written by Mark Needham

March 6th, 2013 at 12:59 am

Posted in neo4j


  • Well, I’d definitely be up for playing with your dataset 🙂

    I’m pretty sure that any queries you have with a WHERE clause that is just a graph description can be completely described in the MATCH clause.
    For example, continent-[:is_in]-country-[:comes_from]-player-[:scored_in], player-[:played]-stats

    Why do you make the differentiation between played and subbed_on? (This could possibly be a property on the relationship).
    E.g. :played[:on_time=0, :off_time=90]

  • Andy Palmer I was thinking of spinning up an AWS instance and putting it on there if that’s most useful. Or else I could just put it in a github repo and provide a script to set it up locally. Not sure which option is best!

    I think you’re probably right about the MATCH vs WHERE stuff, I’ll give it a try though.

    I’m not sure actually – I just realised that the distinction existed in the original data set and so I modelled it separately. I guess this may actually be one of those occasions where it’s better to use a property than relationship!

  • vinny

    Hi Mark.

    Great article. I’m taking my first steps with neo4j and your dataset is very similar to the 2 projects I’m currently working on. I come from a SQL server background and am finding the relationship vs property option quite tricky. Would I be able to look at your dataset for inspiration?

  • vinny I totally forgot to reply once I’d put the data up, sorry!

    I’ve created a VM with a script to import the football data into neo here ->

    I’ve put some instructions on the README file.

    If the VM approach is no good and you want to install it somewhere else then you can grab the files from and then copy the code from

    It makes use of the batch-import JAR which you can download from

    You’d obviously need to change the paths of stuff to suit whatever you have on your machine.

    Let me know if you need any help.


  • Pingback: neo4j/cypher: Properties or relationships? It’s easy to switch at Mark Needham