· python strava running strava-api dtw dynamic-time-warping google-encoded-polyline-algorithm-format

Strava: Calculating the similarity of two runs

I go running several times a week and wanted to compare my runs against each other to see how similar they are.

I record my runs with the Strava app and it has an API that returns lat/long coordinates for each run in the Google encoded polyline algorithm format.

We can use the polyline library to decode these values into a list of lat/long tuples. For example:

import polyline
polyline.decode('u{~vFvyys@fS]')
[(40.63179, -8.65708), (40.62855, -8.65693)]

Once we’ve got the route defined as a set of coordinates we need to compare them. My Googling led me to an algorithm called Dynamic Time Warping

DTW is a method that calculates an optimal match between two given sequences (e.g. time series) with certain restrictions. The sequences are "warped" non-linearly in the time dimension to determine a measure of their similarity independent of certain non-linear variations in the time dimension.

The fastdtw library implements an approximation of this library and returns a value indicating the distance between sets of points.

We can see how to apply fastdtw and polyline against Strava data in the following example:

import os
import polyline
import requests
from fastdtw import fastdtw

token = os.environ["TOKEN"]
headers = {'Authorization': "Bearer {0}".format(token)}

def find_points(activity_id):
    r = requests.get("https://www.strava.com/api/v3/activities/{0}".format(activity_id), headers=headers)
    response = r.json()
    line = response["map"]["polyline"]
    return polyline.decode(line)

Now let’s try it out on two runs, 1361109741 and 1346460542:

from scipy.spatial.distance import euclidean

activity1_id = 1361109741
activity2_id = 1346460542

distance, path = fastdtw(find_points(activity1_id), find_points(activity2_id), dist=euclidean)

>>> print(distance)
2.91985018100644

These two runs are both near my house so the value is small. Let’s change the second route to be from my trip to New York:

activity1_id = 1361109741
activity2_id = 1246017379

distance, path = fastdtw(find_points(activity1_id), find_points(activity2_id), dist=euclidean)

>>> print(distance)
29383.492965394034

Much bigger!

I’m not really interested in the actual value returned but I am interested in the relative values. I’m building a little application to generate routes that I should run and I want it to come up with a routes that are different to recent ones that I’ve run. This score can now form part of the criteria.

  • LinkedIn
  • Tumblr
  • Reddit
  • Google+
  • Pinterest
  • Pocket