Mark Needham

Thoughts on Software Development

Archive for the ‘ranking-systems’ tag

Elo Rating System: Ranking Champions League teams using Clojure Part 2

without comments

A few weeks ago I wrote about ranking Champions League teams using the Elo Rating algorithm, and since writing that post I’ve collated data for 10 years’ worth of matches, so I thought an update was in order.

After extracting the details of all those matches I saved them to a JSON file so that I wouldn’t have to parse the HTML pages every time I tweaked the algorithm. This should also make it easier for other people to play with the data.

I described the algorithm in the previous post and had analysed the rankings for one season. However, it was difficult to understand why teams had been ranked in a certain order, so I drilled into the data to find out.

Since the 2012/2013 season is the freshest in my memory I started with that.

The first thing to do was load the matches from disk:

(ns ranking-algorithms.uefa
  (:require [clj-time.format :as f])
  (:require [clojure.data.json :as json]))
 
;; "yyyy" is the calendar year; "YYYY" is the ISO week-year and misparses dates around New Year
(defn as-date [date-field] (f/parse (f/formatter "dd MMM yyyy") date-field))
 
(defn date-aware-value-reader [key value] (if (= key :date) (as-date value) value))
 
(defn read-from-file [file]
  (json/read-str (slurp file)
                 :value-fn date-aware-value-reader
                 :key-fn keyword))

I’ve written previously about reifying a date masquerading as a string but for now we want to give those matches a name and then process them.
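For comparison, the date-aware reader can be sketched in Python. The :date field name and the "dd MMM yyyy" format come from the Clojure code above; the in-memory example document is made up rather than taken from the data file:

```python
import json
from datetime import datetime

def date_aware_pairs(pairs):
    """Parse any 'date' field into a datetime, mirroring the
    date-aware-value-reader in the Clojure code."""
    return {k: datetime.strptime(v, "%d %b %Y") if k == "date" else v
            for k, v in pairs}

def read_from_file(path):
    with open(path) as f:
        return json.load(f, object_pairs_hook=date_aware_pairs)

# Made-up document rather than the real data file:
doc = json.loads('{"home": "Bayern", "date": "25 May 2013"}',
                 object_pairs_hook=date_aware_pairs)
print(doc["date"].year)  # 2013
```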

> (def the-matches (read-from-file "data/cl-matches-2013.json"))
#'ranking-algorithms.uefa/the-matches
 
> (count the-matches)
213

I already had a function top-teams which would apply the Elo algorithm across the matches, but since I wanted to drill into each team’s performance I wrapped that function in another one called print-top-teams:

(comment "other functions excluded for brevity")
 
(defn format-for-printing [all-matches idx [team ranking & [rd]]]
  (let [team-matches (show-matches team all-matches)]
    (merge {:rank (inc idx)
            :team team
            :ranking ranking
            :rd rd
            :round (performance team-matches)}
           (match-record team-matches))))
 
(defn print-top-teams
  ([number all-matches] (print-top-teams number all-matches {}))
  ([number all-matches base-rankings]
      (clojure.pprint/print-table
       [:rank :team :ranking :round :wins :draw :loses]
       (map-indexed
        (partial format-for-printing all-matches)
        (top-teams number all-matches base-rankings)))))
> (ranking-algorithms.core/print-top-teams 10 the-matches)
========================================================================
:rank | :team       | :ranking | :round         | :wins | :draw | :loses
========================================================================
1     | Bayern      | 1272.74  | Final          | 10    | 1     | 2     
2     | PSG         | 1230.02  | Quarter-finals | 6     | 3     | 1     
3     | Dortmund    | 1220.96  | Final          | 7     | 4     | 2     
4     | Real Madrid | 1220.33  | Semi-finals    | 6     | 3     | 3     
5     | Porto       | 1216.97  | Round of 16    | 5     | 1     | 2     
6     | CFR Cluj    | 1216.56  | Group stage    | 7     | 1     | 2     
7     | Galatasaray | 1215.56  | Quarter-finals | 5     | 2     | 3     
8     | Juventus    | 1214.0   | Quarter-finals | 5     | 3     | 2     
9     | Málaga      | 1211.53  | Quarter-finals | 5     | 5     | 2     
10    | Valencia    | 1211.0   | Round of 16    | 4     | 2     | 2     
========================================================================

I’ve excluded most of the functions but you can find the source in core.clj on github.

Clojure-wise I learnt about the print-table function, which came in handy, and Elo-wise I realised that the ranking places a heavy emphasis on winning matches.

If you follow the Champions League closely you’ll have noticed that Barcelona are missing from the top 10 despite reaching the Semi Final. The Elo algorithm actually ranks them in 65th position:

=========================================================================================
:rank | :team               | :ranking | :round                  | :wins | :draw | :loses
=========================================================================================
63    | Motherwell          | 1195.04  | Third qualifying round  | 0     | 0     | 2     
64    | Feyenoord           | 1195.04  | Third qualifying round  | 0     | 0     | 2     
65    | Barcelona           | 1194.68  | Semi-finals             | 5     | 3     | 4     
66    | BATE                | 1194.36  | Group stage             | 5     | 3     | 4     
67    | Anderlecht          | 1193.41  | Group stage             | 4     | 2     | 4 
=========================================================================================

It’s all about winning

I thought there might be a bug in my implementation, but having looked through it multiple times I’m satisfied that Barcelona’s low ranking results from losing multiple games – to Celtic, AC Milan and twice to Bayern Munich – and from progressing from their Quarter Final tie against PSG without winning either match.

I did apply a higher weighting to matches won later on in the competition but otherwise the Elo algorithm doesn’t take into account progress in a tournament.

Low variation in rankings

The initial ranking of each team was 1200, so I was surprised to see that the top ranked team had only achieved a ranking of 1272 – I expected it to be higher.

I read a bit more about the algorithm and learnt that a 200-point gap in rankings signifies that the higher-ranked team should win about 75% of the time.
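That relationship drops straight out of the expected-score formula; a quick check in Python (illustrative only, the ratings are arbitrary):

```python
def expected(my_rating, opponent_rating):
    # Elo expected score for the side with my_rating, between 0 and 1
    return 1.0 / (1 + 10 ** ((opponent_rating - my_rating) / 400))

print(round(expected(1400, 1200), 2))  # 0.76 -- a 200-point gap
print(expected(1200, 1200))            # 0.5  -- evenly matched
```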

For this data set the top ranked team has 1272 points and the lowest ranked team has 1171 points so we probably need to tweak the algorithm to make it more accurate.

Accuracy of the Elo algorithm

My understanding of the Elo algorithm is that it becomes more accurate as teams play more matches so I decided to try it out on all the matches from 2004 – 2012.

I adapted the print-top-teams function to exclude ‘:round’ since it doesn’t make sense in this context:

(comment "I really need to pull out the printing stuff into a function but I'm lazy so I haven't...yet")
 
(defn print-top-teams-without-round
  ([number all-matches] (print-top-teams-without-round number all-matches {}))
  ([number all-matches base-rankings]
      (clojure.pprint/print-table
       [:rank :team :ranking :wins :draw :loses]
       (map-indexed
        (partial format-for-printing all-matches)
        (top-teams number all-matches base-rankings)))))

If we evaluate that function we see the following rankings:

> (def the-matches (read-from-file "data/cl-matches-2004-2012.json"))
#'ranking-algorithms.uefa/the-matches
 
> (ranking-algorithms.core/print-top-teams-without-round 10 the-matches)
==========================================================
:rank | :team          | :ranking | :wins | :draw | :loses
==========================================================
1     | Barcelona      | 1383.85  | 55    | 25    | 12    
2     | Man. United    | 1343.54  | 49    | 21    | 14    
3     | Chelsea        | 1322.0   | 44    | 27    | 17    
4     | Real Madrid    | 1317.68  | 42    | 14    | 18    
5     | Bayern         | 1306.18  | 42    | 13    | 19    
6     | Arsenal        | 1276.83  | 47    | 21    | 18    
7     | Liverpool      | 1272.52  | 41    | 17    | 17    
8     | Internazionale | 1260.27  | 36    | 18    | 21    
9     | Milan          | 1257.63  | 34    | 22    | 18    
10    | Bordeaux       | 1243.04  | 12    | 3     | 7     
==========================================================

The only finalists missing from this list are Monaco and Porto who contested the final in 2004 but haven’t reached that level of performance since.

Bordeaux are the only unlikely entry in this list, and they have played ~60 fewer games than the other teams, which suggests we should have less confidence in their ranking. It was at this stage that I started looking at the Glicko algorithm, which calculates a rating’s reliability as well as the rating itself.

I’ve added instructions to the github README showing some examples but if you have any questions feel free to ping me on here or twitter.

Written by Mark Needham

September 30th, 2013 at 8:26 pm

Posted in Ranking Systems


Elo Rating System: Ranking Champions League teams using Clojure

with 4 comments

As I mentioned in an earlier blog post I’ve been learning about ranking systems and one of the first ones I came across was the Elo rating system which is most famously used to rank chess players.

The Elo rating system uses the following formula to work out a player/team’s ranking after they’ve participated in a match:

R’ = R + K * (S - E)

  • R’ is the new rating
  • R is the old rating
  • K is a maximum value for increase or decrease of rating (16 or 32 for Elo)
  • S is the score for a game
  • E is the expected score for a game
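The formula translates almost mechanically into code. Here’s a minimal Python sketch (Python used purely for illustration; the Clojure version splits win and loss into separate functions):

```python
def expected(rating, opponent_rating):
    # Expected score E for the side with this rating
    return 1.0 / (1 + 10 ** ((opponent_rating - rating) / 400))

def update(rating, opponent_rating, score, k=32):
    """R' = R + K * (S - E); score is 1 for a win, 0.5 for a draw, 0 for a loss."""
    return rating + k * (score - expected(rating, opponent_rating))

# A 1200-rated team beating a 1500-rated team gains ~27 points:
print(update(1200, 1500, 1))  # 1227.1686541692377
```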

I converted that formula into the following Clojure functions:

;; expected needs to be defined before the functions that use it;
;; math/expt comes from clojure.math.numeric-tower
(defn expected [my-ranking opponent-ranking]
  (/ 1.0
     (+ 1 (math/expt 10 (/ (- opponent-ranking my-ranking) 400)))))

(defn ranking-after-win
  [{ranking :ranking opponent-ranking :opponent-ranking importance :importance}]
  (+ ranking (* importance (- 1 (expected ranking opponent-ranking)))))

(defn ranking-after-loss
  [{ranking :ranking opponent-ranking :opponent-ranking importance :importance}]
  (+ ranking (* importance (- 0 (expected ranking opponent-ranking)))))

which would be called like this to work out the new ranking of a team ranked 1200 that beat a team ranked 1500:

> (ranking-after-win { :ranking 1200 :opponent-ranking 1500 :importance 32 })
1227.1686541692377

The way it works is that we first work out the likelihood that we should win the match by calling expected:

> (expected 1200 1500)
0.15097955721132328

This tells us that we have a 15% chance of winning the match so if we do win then our ranking should be increased by a large amount as we aren’t expected to win. In this case a win gives us a points increase of ’32 * (1-0.15)’ which is ~27 points.

I kept things simple by always setting the importance/maximum value of increase or decrease to 32. The World Football Rankings took a different approach where they vary it based on the importance of a match and the margin of victory.

I decided to try out the algorithm on the 2002/2003 Champions League season. I was able to grab the data from The Rec Sport Soccer Statistics Foundation and I’ve written previously about how I scraped it using Enlive.

With a lot of help from Paul Bostrom I ended up with the following code to run a reduce over the matches while updating team rankings after each match:

(declare process-match)

(defn top-teams [number matches]
  (let [teams-with-rankings
        (apply array-map (mapcat (fn [x] [x {:points 1200}]) (extract-teams matches)))]
    (take number
          (sort-by (fn [x] (:points (val x)))
                   >
                   (seq (reduce process-match teams-with-rankings matches))))))

(defn process-match [ts match]
  (let [{:keys [home away home_score away_score]} match]
    (cond
     (> home_score away_score)
     (-> ts
         (update-in [home :points]
                    #(ranking-after-win {:ranking % :opponent-ranking (:points (get ts away)) :importance 32}))
         (update-in [away :points]
                    #(ranking-after-loss {:ranking % :opponent-ranking (:points (get ts home)) :importance 32})))
     (> away_score home_score)
     (-> ts
         (update-in [home :points]
                    #(ranking-after-loss {:ranking % :opponent-ranking (:points (get ts away)) :importance 32}))
         (update-in [away :points]
                    #(ranking-after-win {:ranking % :opponent-ranking (:points (get ts home)) :importance 32})))
     (= home_score away_score) ts)))

The matches parameter that we pass into top-teams looks like this:

> (take 5 all-matches)
({:home "Tampere", :away "Pyunik Erewan", :home_score 0, :away_score 4}
 {:home "Pyunik Erewan", :away "Tampere", :home_score 2, :away_score 0}
 {:home "Skonto Riga", :away "Barry Town", :home_score 5, :away_score 0}
 {:home "Barry Town", :away "Skonto Riga", :home_score 0, :away_score 1}
 {:home "Portadown", :away "Belshina Bobruisk", :home_score 0, :away_score 0})

And calling extract-teams on it gets us a set of all the teams involved:

> (extract-teams (take 5 all-matches))
#{"Portadown" "Tampere" "Pyunik Erewan" "Barry Town" "Skonto Riga"}

We then mapcat over it to get a sequence of team/default-points pairs:

> (mapcat (fn [x] [x {:points 1200}]) (extract-teams (take 5 all-matches)))
("Portadown" {:points 1200} "Tampere" {:points 1200} "Pyunik Erewan" {:points 1200} "Barry Town" {:points 1200} "Skonto Riga" {:points 1200})

before calling array-map to turn the result into a map:

> (apply array-map (mapcat (fn [x] [x {:points 1200}]) (extract-teams (take 5 all-matches))))
{"Portadown" {:points 1200}, "Tampere" {:points 1200}, "Pyunik Erewan" {:points 1200}, "Barry Town" {:points 1200}, "Skonto Riga" {:points 1200}}

We then apply a reduce over all the matches and call the function process-match on each iteration to update team rankings appropriately. The final step is to sort the teams by their ranking so we can list the top teams:
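The same fold can be sketched in Python, with made-up matches in the same shape as all-matches. One thing worth noticing: like the Clojure (= home_score away_score) ts clause, a draw is a no-op here rather than scoring 0.5 for each side, so draws count for nothing in these rankings:

```python
from functools import reduce

def expected(rating, opponent_rating):
    return 1.0 / (1 + 10 ** ((opponent_rating - rating) / 400))

def process_match(ratings, match, k=32):
    """One reduction step: update both teams off their pre-match ratings.
    Draws are a no-op, exactly as in the Clojure cond."""
    home, away = match["home"], match["away"]
    if match["home_score"] == match["away_score"]:
        return ratings
    h, a = ratings[home], ratings[away]  # capture pre-match ratings
    home_won = 1 if match["home_score"] > match["away_score"] else 0
    ratings[home] = h + k * (home_won - expected(h, a))
    ratings[away] = a + k * ((1 - home_won) - expected(a, h))
    return ratings

matches = [
    {"home": "Tampere", "away": "Pyunik Erewan", "home_score": 0, "away_score": 4},
    {"home": "Pyunik Erewan", "away": "Tampere", "home_score": 2, "away_score": 0},
]
ratings = {t: 1200.0 for m in matches for t in (m["home"], m["away"])}
final = reduce(process_match, matches, ratings)
print(sorted(final.items(), key=lambda kv: -kv[1]))
```

With a constant K the updates are zero-sum: the winner gains exactly the points the loser drops.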

> (top-teams 10 all-matches)
(["CF Barcelona" {:points 1343.900393287903}] 
 ["Manchester United" {:points 1292.4731214788262}] 
 ["FC Valencia" {:points 1277.1820905112208}] 
 ["Internazionale Milaan" {:points 1269.8028023141364}] 
 ["AC Milan" {:points 1257.4564374787687}]
 ["Juventus Turijn" {:points 1254.2498432522466}] 
 ["Real Madrid" {:points 1248.0758162475993}] 
 ["Deportivo La Coruna" {:points 1235.7792317210403}] 
 ["Borussia Dortmund" {:points 1231.1671952364256}] 
 ["Sparta Praag" {:points 1229.3249513256828}])

Interestingly the winners (Juventus) are only in 6th place and the top 2 places are occupied by teams that lost in the Quarter Final. I wrote the following functions to investigate what was going on:

(defn show-matches [team matches]
  (->> matches
       (filter #(or (= team (:home %)) (= team (:away %))))
       (map #(show-opposition team %))))
 
(defn show-opposition [team match]
  (if (= team (:home match))
    {:opposition (:away match) :score (str (:home_score match) "-" (:away_score match))}
    {:opposition (:home match) :score (str (:away_score match) "-" (:home_score match))}))
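The same home/away normalisation in Python, with an invented match rather than one from the data set:

```python
def show_opposition(team, match):
    # Normalise home/away so the given team's score always comes first
    if team == match["home"]:
        return {"opposition": match["away"],
                "score": f"{match['home_score']}-{match['away_score']}"}
    return {"opposition": match["home"],
            "score": f"{match['away_score']}-{match['home_score']}"}

def show_matches(team, matches):
    return [show_opposition(team, m)
            for m in matches
            if team in (m["home"], m["away"])]

# Made-up match rather than one from the data set:
match = {"home": "Juventus", "away": "Real Madrid",
         "home_score": 3, "away_score": 1}
print(show_matches("Real Madrid", [match]))
# [{'opposition': 'Juventus', 'score': '1-3'}]
```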

If we call it with Juventus we can see how they performed in their matches:

ranking-algorithms.parse> (show-matches "Juventus Turijn" all-matches)
({:opposition "Feyenoord", :score "1-1"} 
 {:opposition "Dynamo Kiev", :score "5-0"} 
 {:opposition "Newcastle United", :score "2-0"} 
 {:opposition "Newcastle United", :score "0-1"} 
 {:opposition "Feyenoord", :score "2-0"} 
 {:opposition "Dynamo Kiev", :score "2-1"} 
 {:opposition "Deportivo La Coruna", :score "2-2"} 
 {:opposition "FC Basel", :score "4-0"} 
 {:opposition "Manchester United", :score "1-2"} 
 {:opposition "Manchester United", :score "0-3"} 
 {:opposition "Deportivo La Coruna", :score "3-2"} 
 {:opposition "FC Basel", :score "1-2"} 
 {:opposition "CF Barcelona", :score "1-1"} 
 {:opposition "CF Barcelona", :score "2-1"} 
 {:opposition "Real Madrid", :score "1-2"} 
 {:opposition "Real Madrid", :score "3-1"})

Although I’m missing the final – I need to fix the parser to pick that match up and it was a draw anyway – they actually only won 8 of their matches outright. Barcelona, on the other hand, won 13 matches although 2 of those were qualifiers.

The next step is to take into account the importance of each match rather than applying an importance of 32 across the board, and to give some value to winning a tie even when it’s decided on penalties or away goals.

The code is on github if you want to play around with it or have suggestions for something else I can try out.

Written by Mark Needham

August 31st, 2013 at 1:01 pm

Ranking Systems: What I’ve learnt so far

with 5 comments

I often go off on massive tangents reading all about a new topic but don’t record what I’ve read, so if I come back to the topic in the future I have to start from scratch, which is quite frustrating.

In this instance after playing around with calculating the eigenvector centrality of a sub graph I learnt that this algorithm can also be used in ranking systems.

I started off by reading a paper written by James Keener about the Perron-Frobenius Theorem and the ranking of American football teams.

The Perron-Frobenius Theorem asserts the following:

a real square matrix with positive entries has a unique largest real eigenvalue and that the corresponding eigenvector has strictly positive components

This is applicable for network based ranking systems as we can build up a matrix of teams, store a value representing their performance against each other, and then calculate an ordered ranking based on eigenvector centrality.
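As an illustration, power iteration recovers that strictly positive ranking vector. The 3-team "performance" matrix below is entirely made up, not data from any of the papers:

```python
def principal_eigenvector(matrix, iterations=100):
    """Power iteration: for a matrix with strictly positive entries the
    Perron-Frobenius theorem guarantees a unique dominant eigenvalue whose
    eigenvector has strictly positive components."""
    n = len(matrix)
    v = [1.0 / n] * n
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        total = sum(w)  # positive entries keep this > 0, so we can normalise
        v = [x / total for x in w]
    return v

# Made-up 3-team matrix: entry [i][j] measures how well team i did
# against team j (kept strictly positive so the theorem applies).
m = [[1.0, 3.0, 2.0],
     [1.0, 1.0, 2.0],
     [1.0, 1.0, 1.0]]
ranking = principal_eigenvector(m)
print(sorted(range(3), key=lambda i: -ranking[i]))  # [0, 1, 2]
```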

I also came across the following articles describing different network-based approaches to ranking teams/players in tennis and basketball respectively:

Unfortunately I haven’t come across any corresponding code showing how to implement those algorithms so I need to do a bit more reading and figure out how to do it.

In the world of non network based ranking systems I came across 3 algorithms:

  • Elo – this is a method originally developed to calculate the relative skill of chess players.

    Players start out with an average rating which then increases/decreases based on games they take part in. If they beat someone much more highly ranked then they’d gain a lot of points whereas losing to someone similarly ranked wouldn’t affect their ranking too much.

    I came across a version used to rank national football teams, and the algorithm is quite well described in Christopher Allen’s article on competitive ranking systems.

  • Glicko – this method was developed as the author, Mark Glickman, detected some flaws in the Elo rating system around the reliability of players’ ratings.

    This algorithm therefore introduces the concept of a ratings deviation (RD) to measure uncertainty in a rating. If a player plays regularly they’d have a low RD and if they don’t it’d be higher. This is then taken into account when assigning points based on games between different players.

    Rob Kohr has an implementation of this one using Javascript on his github.

  • TrueSkill – this one was developed by Microsoft Research to rank players on Xbox Live. It seems similar to Glicko in that it has a rating and uncertainty for each player. TrueSkill’s FAQs suggest the following difference between the two:

    Glicko was developed as an extension of ELO and was thus naturally limited to two player matches which end in either win or loss. Glicko cannot update skill levels of players if they compete in multi-player events or even in teams. The logistic model would make it computationally expensive to deal with team and multi-player games. Moreover, chess is usually played in pre-set tournaments and thus matching the right opponents was not considered a relevant problem in Glicko. In contrast, the TrueSkill ranking system offers a way to measure the quality of a match between any set of players.

Scott Hamilton has an implementation of all these algorithms in Python which I need to play around with. He based his algorithms on a blog post written by Jeff Moser in which he explains probabilities, the Gaussian distribution, Bayesian probability and factor graphs in deciphering the TrueSkill algorithm. Moser’s created a project implementing TrueSkill in C# on github.

I follow tennis and football reasonably closely so I thought I’d do a bit of reading about the main two rankings I know about there as well:

  • UEFA club coefficients – used to rank football clubs that have taken part in a European competition over the last 5 seasons. It takes into account the importance of the match but not the strength of the opposition
  • ATP Tennis Rankings – used to rank tennis players on a rolling basis over the last 12 months. They take into account the importance of a tournament and the round a player reached to assign ranking points.

Now that I’ve recorded all that it’s time to go and play with some of them!

Written by Mark Needham

August 24th, 2013 at 11:05 am