## Archive for the ‘strava api’ tag

## Strava: Calculating the similarity of two runs

I go running several times a week and wanted to compare my runs against each other to see how similar they are.

I record my runs with the Strava app and it has an API that returns lat/long coordinates for each run in the Google encoded polyline algorithm format.

We can use the polyline library to decode these values into a list of lat/long tuples. For example:

import polyline polyline.decode('u{~vFvyys@fS]') [(40.63179, -8.65708), (40.62855, -8.65693)] |

Once we’ve got the route defined as a set of coordinates we need to compare them. My Googling led me to an algorithm called Dynamic Time Warping

DTW is a method that calculates an optimal match between two given sequences (e.g. time series) with certain restrictions.

The sequences are “warped” non-linearly in the time dimension to determine a measure of their similarity independent of certain non-linear variations in the time dimension.

The fastdtw library implements an approximation of this library and returns a value indicating the distance between sets of points.

We can see how to apply fastdtw and polyline against Strava data in the following example:

import os import polyline import requests from fastdtw import fastdtw token = os.environ["TOKEN"] headers = {'Authorization': "Bearer {0}".format(token)} def find_points(activity_id): r = requests.get("https://www.strava.com/api/v3/activities/{0}".format(activity_id), headers=headers) response = r.json() line = response["map"]["polyline"] return polyline.decode(line) |

Now let’s try it out on two runs, 1361109741 and 1346460542:

from scipy.spatial.distance import euclidean activity1_id = 1361109741 activity2_id = 1346460542 distance, path = fastdtw(find_points(activity1_id), find_points(activity2_id), dist=euclidean) >>> print(distance) 2.91985018100644 |

These two runs are both near my house so the value is small. Let’s change the second route to be from my trip to New York:

activity1_id = 1361109741 activity2_id = 1246017379 distance, path = fastdtw(find_points(activity1_id), find_points(activity2_id), dist=euclidean) >>> print(distance) 29383.492965394034 |

Much bigger!

I’m not really interested in the actual value returned but I am interested in the relative values. I’m building a little application to generate routes that I should run and I want it to come up with a routes that are different to recent ones that I’ve run. This score can now form part of the criteria.