Plotly: Visualising a normal distribution given average and standard deviation
I’ve been playing around with Microsoft’s TrueSkill algorithm, which attempts to quantify a player’s skill using Bayesian inference. A rating in this system is a Gaussian distribution that starts with a mean (mu) of 25 and a standard deviation (sigma) of 8.333, which represents the uncertainty in that rating. I wanted to visualise various ratings using Plotly, and that’s what we’ll be doing in this blog post.
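For context, this is roughly what a default rating looks like if you do install the trueskill package (a quick sketch, assuming the library’s standard defaults):
import trueskill
rating = trueskill.Rating()  # defaults to mu=25.0, sigma=25/3
print(rating.mu, rating.sigma)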
To save you from having to install TrueSkill, we’re going to create a named tuple to simulate a TrueSkill Rating object:
from collections import namedtuple
Rating = namedtuple('Rating', ['mu', 'sigma'])
base = Rating(25, 25/3)
If we print that object, we’ll see the following output:
Rating(mu=25, sigma=8.333333333333334)
To create a visualisation of this distribution, we’ll need to install the following libraries:
pip install plotly numpy scipy kaleido==0.2.1
Next, import the following libraries:
import numpy as np
from scipy.stats import norm
import plotly.graph_objects as go
Next, we’re going to create some values for the x-axis. We’ll use numpy’s arange function to create a series of values ranging from 4 standard deviations below the mean to 4 standard deviations above it:
x = np.arange(base.mu-4*base.sigma, base.mu+4*base.sigma, 0.001)
array([-8.33333333, -8.33233333, -8.33133333, ..., 58.33066667,
58.33166667, 58.33266667])
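As an aside (my own addition, not from the original post), np.linspace would build a roughly equivalent axis if you’d rather specify the number of points than the step size:
# Roughly equivalent x-axis: a fixed number of evenly spaced points across the same range
x = np.linspace(base.mu - 4*base.sigma, base.mu + 4*base.sigma, 10_000)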
Next, we’re going to run scipy’s probability density function over the x values for our mu and sigma. In other words, we’re going to compute the probability density of our normal distribution at each of those x values:
y = norm.pdf(x, base.mu, base.sigma)
array([1.60596271e-05, 1.60673374e-05, 1.60750513e-05, ...,
       1.60801958e-05, 1.60724796e-05, 1.60647669e-05])
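As a quick sanity check (again, my own aside), the density should peak at the mean with a height of 1/(sigma·√(2π)), which for sigma ≈ 8.333 is roughly 0.048:
# The peak of a normal density sits at x = mu and has height 1 / (sigma * sqrt(2 * pi))
peak = 1 / (base.sigma * np.sqrt(2 * np.pi))
print(peak)                                    # ~0.0479
print(norm.pdf(base.mu, base.mu, base.sigma))  # should print the same value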
Now that we’ve got x and y values, we can create a visualisation by running the following code:
fig = go.Figure()
fig.add_trace(go.Scatter(x=x, y=y, mode='lines', fill='tozeroy', line_color='black'))
fig.write_image("fig1.png", width=1000, height=800)
If we run that code, the following image will be generated:

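If you’d rather explore the chart interactively than export a static PNG (kaleido is only needed for the static export), Plotly can render it in a browser or notebook instead:
# Open an interactive version of the same figure instead of writing a file
fig.show()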
We can then wrap that all together into a function that can take in multiple ratings:
def visualise_distribution(**kwargs):
    fig = go.Figure()
    min_x = min([r.mu-4*r.sigma for r in kwargs.values()])
    max_x = max([r.mu+4*r.sigma for r in kwargs.values()])
    x = np.arange(min_x, max_x, 0.001)
    for key, value in kwargs.items():
        y = norm.pdf(x, value.mu, value.sigma)
        fig.add_trace(go.Scatter(x=x, y=y, mode='lines', fill='tozeroy', name=key))
    fig.write_image(f"fig_{'_'.join(kwargs.keys())}.png", width=1000, height=800)
Let’s give it a try with 3 different ratings:
visualise_distribution(p1=Rating(25, 8.333), p2=Rating(50, 4.12), p3=Rating(10, 5.7))
We can see the resulting image below:

We can see that the curve for p2 is much narrower than the other two, which is because we used a smaller sigma value. p1 is the default rating, so we have the most uncertainty about that player. p3 hasn’t performed well, and there’s reasonable certainty around their low rating.
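To put some numbers behind that, we can compare the peak density of each rating; the peak height is inversely proportional to sigma, so the smallest sigma gives the tallest, narrowest curve (a quick check of my own, using the same ratings as above):
ratings = {"p1": Rating(25, 8.333), "p2": Rating(50, 4.12), "p3": Rating(10, 5.7)}
for name, r in ratings.items():
    # norm.pdf at the mean equals 1 / (sigma * sqrt(2 * pi))
    print(name, round(norm.pdf(r.mu, r.mu, r.sigma), 4))
# roughly: p1 ~0.048, p2 ~0.097, p3 ~0.070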
About the author
I'm currently working on real-time user-facing analytics with Apache Pinot at StarTree. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.