Mark Needham

Thoughts on Software Development

Neo4j: Modelling hyper edges in a property graph

with 4 comments

At the Graph Database meet up in Antwerp last week we discussed how you would model a hyper edge in a property graph like Neo4j and I realised that I’d done this in my football graph without realising.

A hyper edge is defined as follows:

A hyperedge is a connection between two or more vertices, or nodes, of a hypergraph. A hypergraph is a graph in which generalized edges (called hyperedges) may connect more than two nodes with discrete properties.

In Neo4j an edge (or relationship) can only be between itself or another node, there’s no way of creating a relationship between more than 2 nodes.

I had problems when trying to model the relationship between a player and a football match because I wanted to say that a player participated in a match and represented a specific team in that match.

I started out with the following model:

2013 10 22 22 47 13

Unfortunately creating a direct relationship from the player to the match means that there’s no way to work out which team they played for.

This information is useful because sometimes players transfer teams in the middle of a season and we want to analyse how they performed for each team.

In a property graph we need to introduce an extra node which links the match, player and team together:

2013 10 22 22 54 25

Although we are forced to adopt this design it actually helps us realise an extra entity in our domain which wasn’t visible before – a player’s performance in a match.

If we want to capture information about a players’ performance in a match we can store it on this node.

We can also easily aggregate players stats by following the played relationship without needing to worry about the matches they played in.

The Neo4j manual has a few more examples of domain models containing hyper edges which are worth having a look at if you want to learn more.

Be Sociable, Share!

Written by Mark Needham

October 22nd, 2013 at 10:02 pm

Posted in neo4j

Tagged with

  • Patrick Durusau

    Err, but the definition of a hyperedge is not a series of connections between nodes. From Wikipedia: “In mathematics, a hypergraph is a generalization of a graph in which an edge can connect any number of vertices.” (http://en.wikipedia.org/wiki/Hyperedge)

    That is, one edge that contains an arbitrary number of nodes is a hyperedge.

    The Neo4j manual, elides the difference between a hyperedge and their simulation of a hyperedge saying: “The association of a User, a Group and a Role can be referred to as a HyperEdge. However, it can be easily modeled in a property graph as a node that captures this n-ary relationship, as depicted below in the U1G2R1 node.”

    Unless it is one edge with an arbitrary number of nodes, it’s not a hyperedge.

    For some purposes it may be sufficient to use the property graph modeling suggested but that doesn’t make it a hyperedge.

    Thanks for all the great graph posts!

    Patrick

  • http://www.markhneedham.com/blog Mark Needham

    Hi @patrickdurusau:disqus,

    Yes you are right…what I meant is that if you need a hyper edge in your model then this can be simulated by introducing an extra node as in my football example.

    Thanks for pointing this out and I’m glad you’re enjoying the graph posts!

    Mark

  • Pingback: Neo4j: Value in relationships, but value in nodes too! at Mark Needham

  • http://polycademy.com/ Roger Qiu

    What would it take to make Neo4j support hyperedges natively?