24 Aug 2013

Ranking Systems: What I've learnt so far

I often go off on massive tangents reading all about a new topic but don’t record what I’ve read so if I go back to the topic again in the future I have to start from scratch which is quite frustrating.

In this instance after playing around with calculating the eigenvector centrality of a sub graph I learnt that this algorithm can also be used in ranking systems.

I started off by reading a paper written by James Keener about the Perron-Frobenius Theorem and the ranking of American football teams.

The Perron-Frobenius Theorem asserts the following:

a real square matrix with positive entries has a unique largest real eigenvalue and that the corresponding eigenvector has strictly positive components

This is applicable for network based ranking systems as we can build up a matrix of teams, store a value representing their performance against each other, and then calculate an ordered ranking based on eigenvector centrality.

I also came across the following articles describing different network-based approaches to ranking teams/players in tennis and basketball respectively:

Unfortunately I haven’t come across any corresponding code showing how to implement those algorithms so I need to do a bit more reading and figure out how to do it.

In the world of non network based ranking systems I came across 3 algorithms:

Elo - this is a method originally developed to calculate the relative skill of chess players. Players start out with an average rating which then increases/decreases based on games they take part in. If they beat someone much more highly ranked then they’d gain a lot of points whereas losing to someone similarly ranked wouldn’t affect their ranking too much. I came across a version used to rank country football teams. and the algorithm is quite well described in Christopher Allen’s article on competitive ranking systems.
Glicko - this method was developed as the author, Mark Glickman, detected some flaws in the Elo rating system around the reliability of players' ratings. This algorithm therefore introduces the concept of a ratings deviation (RD) to measure uncertainty in a rating. If a player player plays regularly they’d have a low RD and if they don’t it’d be higher. This is then taken into account when assigning points based on games between different players. Rob Kohr has an implementation of this one using Javascript on his github.
TrueSkill - this one was developed by Microsoft Research to rank players using XBox Live. This seems similar to Glicko in that it has a rating and uncertainty for each player. TrueSkill’s FAQs suggest the following difference between the two:

Glicko was developed as an extension of ELO and was thus naturally limited to two player matches which end in either win or loss. Glicko cannot update skill levels of players if they compete in multi-player events or even in teams. The logistic model would make it computationally expensive to deal with team and multi-player games. Moreover, chess is usually played in pre-set tournaments and thus matching the right opponents was not considered a relevant problem in Glicko. In contrast, the TrueSkill ranking system offers a way to measure the quality of a match between any set of players.

Scott Hamilton has an implementation of all these algorithms in Python which I need to play around with. He based his algorithms on a blog post written by Jeff Moser in which he explains probabilities, the Gaussian distribution, Bayesian probability and factor graphs in deciphering the TrueSkill algorithm. Moser’s created a project implementing TrueSkill in C# on github.

I follow tennis and football reasonably closely so I thought I’d do a bit of reading about the main two rankings I know about there as well:

UEFA club coefficients - used to rank football clubs that have taken part in a European competition over the last 5 seasons. It takes into account the importance of the match but not the strength of the opposition
ATP Tennis Rankings - used to rank tennis players on a rolling basis over the last 12 months. They take into account the importance of a tournament and the round a player reached to assign ranking points.

Now that I’ve recorded all that it’s time to go and play with some of them!

About the author

I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.