Mark Needham

Thoughts on Software Development

Archive for the ‘Ruby’ Category

Ruby: Create and share Google Drive Spreadsheet

without comments

Over the weekend I’ve been trying to write some code to help me create and share a Google Drive spreadsheet and for the first bit I started out with the Google Drive gem.

This worked reasonably well but that gem doesn’t have an API for changing the permissions on a document so I ended up using the google-api-client gem for that bit.

This tutorial provides a good quick start for getting up and running but it still has a manual step to copy/paste the ‘OAuth token’ which I wanted to get rid of.

The first step is to create a project via the Google Developers Console. Once the project is created, click through to it and then click on ‘credentials’ on the left menu. Click on the “Create new Client ID” button to create the project credentials.

You should see something like this on the right hand side of the screen:

[Screenshot: the client ID and client secret shown in the Google Developers Console]

These are the credentials that we’ll use in our code.

Since I now have two libraries I need to satisfy the OAuth credentials for both, preferably without getting the user to go through the process twice.

After a bit of trial and error I realised that it was easier to get the google-api-client to handle authentication and just pass in the token to the google-drive code.

I wrote the following code using Sinatra to handle the OAuth authorisation with Google:

require 'sinatra'
require 'json'
require "google_drive"
require 'google/api_client'
CLIENT_ID = 'my client id'
CLIENT_SECRET = 'my client secret'
# The scope definition is missing from this listing. The value below is an assumption:
# the Drive scope for google-api-client plus the spreadsheets feed for the google-drive gem.
OAUTH_SCOPE = 'https://www.googleapis.com/auth/drive https://spreadsheets.google.com/feeds/'
REDIRECT_URI = 'http://localhost:9393/oauth2callback'
helpers do
  def partial(template, locals = {})
    haml(template, :layout => false, :locals => locals)
  end
end
enable :sessions
get '/' do
  haml :index
end
configure do
  google_client = Google::APIClient.new
  google_client.authorization.client_id = CLIENT_ID
  google_client.authorization.client_secret = CLIENT_SECRET
  google_client.authorization.scope = OAUTH_SCOPE
  google_client.authorization.redirect_uri = REDIRECT_URI
  set :google_client, google_client
  set :google_client_driver, google_client.discovered_api('drive', 'v2')
end
post '/login/' do
  client = settings.google_client
  redirect client.authorization.authorization_uri
end
get '/oauth2callback' do
  authorization_code = params['code']
  client = settings.google_client
  client.authorization.code = authorization_code
  # Exchange the authorisation code for an access token
  client.authorization.fetch_access_token!
  oauth_token = client.authorization.access_token
  session[:oauth_token] = oauth_token
  redirect '/'
end

And this is the code for the index page:

%html
  %head
    %title Google Docs Spreadsheet
  %body
    %h2 Create Google Docs Spreadsheet
    - unless session['oauth_token']
      %form{:name => "spreadsheet", :id => "spreadsheet", :action => "/login/", :method => "post", :enctype => "text/plain"}
        %input{:type => "submit", :value => "Authorise Google Account", :class => "button"}
    - else
      %form{:name => "spreadsheet", :id => "spreadsheet", :action => "/spreadsheet/", :method => "post", :enctype => "text/plain"}
        %input{:type => "submit", :value => "Create Spreadsheet", :class => "button"}

We initialise the Google API client inside the ‘configure’ block, which Sinatra runs once when the application starts, and then from ‘/’ the user can click a button which does a POST request to ‘/login/’.

‘/login/’ redirects us to the OAuth authorisation URI where we select the Google account we want to use and login if necessary. We’ll then get redirected back to ‘/oauth2callback’ where we extract the authorisation code and exchange it for an access token.

We’ll store that token in the session so that we can use it later on.
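To make the first leg of that flow a bit more concrete, here is a rough sketch of the kind of URL that ‘/login/’ ends up redirecting to. It uses only Ruby’s standard library rather than google-api-client, the helper name authorization_url is made up for illustration, and the endpoint and parameter names are the usual OAuth 2.0 ones for Google as far as I know, so treat it as an approximation of what the gem builds rather than its actual implementation.

```ruby
require 'uri'

# Assumption: Google's OAuth 2.0 authorisation endpoint. google-api-client
# is expected to point at an equivalent URL internally.
AUTH_ENDPOINT = 'https://accounts.google.com/o/oauth2/auth'

# Hypothetical helper: builds an OAuth 2.0 authorisation URL by hand.
def authorization_url(client_id, redirect_uri, scopes)
  query = URI.encode_www_form(
    'response_type' => 'code',           # ask for an authorisation code
    'client_id'     => client_id,
    'redirect_uri'  => redirect_uri,     # where the user is sent back to
    'scope'         => scopes.join(' ')  # scopes are space separated
  )
  "#{AUTH_ENDPOINT}?#{query}"
end

url = authorization_url(
  'my client id',
  'http://localhost:9393/oauth2callback',
  ['https://www.googleapis.com/auth/drive']
)
puts url
```

The ‘code’ parameter that comes back on the redirect URI is what the ‘/oauth2callback’ handler swaps for an access token.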

Now we need to create the spreadsheet and share that document with someone else:

post '/spreadsheet/' do
  client = settings.google_client
  if session[:oauth_token]
    client.authorization.access_token = session[:oauth_token]
  end
  # Reuse the same token for the google-drive gem
  google_drive_session = GoogleDrive.login_with_oauth(session[:oauth_token])
  spreadsheet = google_drive_session.create_spreadsheet(title = "foobar")
  ws = spreadsheet.worksheets[0]
  ws[2, 1] = "foo"
  ws[2, 2] = "bar"
  # Cell changes are only sent to Google when we save the worksheet
  ws.save
  file_id = ws.worksheet_feed_url.split("/")[-4]
  drive = settings.google_client_driver
  new_permission = {
      'value' => "",  # email address of the person to share with (blank in this listing)
      'type' => "user",
      'role' => "reader"
  }
  result = client.execute(
    :api_method => drive.permissions.insert,
    :body_object => new_permission,
    :parameters => { 'fileId' => file_id })
  if result.status != 200
    puts "An error occurred: #{result.data['error']['message']}"
  end
  "spreadsheet created and shared"
end

Here we create a spreadsheet with some arbitrary values using the google-drive gem before granting permission to a different email address than the one which owns it. I’ve given that other user read permission on the document.

One other thing to keep in mind is which ‘scopes’ the OAuth authentication is for. If you authenticate for one URI and then try to do something against another one you’ll get a ‘Token invalid – AuthSub token has wrong scope’ error.

Written by Mark Needham

August 17th, 2014 at 9:42 pm

Posted in Ruby


Ruby: Receive JSON in request body

with one comment

I’ve been building a little Sinatra app to play around with the Google Drive API and one thing I struggled with was processing JSON posted in the request body.

I came across a few posts which suggested that the request body would be available as params[‘data’] or request[‘data’] but after trying several ways of sending a POST request that doesn’t seem to be the case.

I eventually came across this StackOverflow post which shows how to do it:

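require 'sinatra'
require 'json'
post '/somewhere/' do
  # Rewind in case something upstream has already read the body
  request.body.rewind
  request_payload = JSON.parse(request.body.read)
  p request_payload
  "ok"
end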

I can then POST to that endpoint and see the JSON printed back on the console:


$ cat dummy.json
{"i": "am json"}
$ curl -H "Content-Type: application/json" -XPOST http://localhost:9393/somewhere/ -d @dummy.json
{"i"=>"am json"}

Of course if I’d just RTFM I could have found this out much more quickly!
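The same idea can be tried without starting a web server. In this sketch a StringIO stands in for the request body (Rack exposes the body as an IO-like object) and parse_json_body is a name I’ve made up for illustration, so it shows the shape of the technique rather than anything Sinatra provides.

```ruby
require 'json'
require 'stringio'

# Hypothetical helper: reads and parses a JSON request body.
# `body` is any IO-like object, which is how the request body is exposed.
def parse_json_body(body)
  body.rewind              # the stream may already have been read once
  JSON.parse(body.read)
end

# StringIO plays the part of the request body here.
fake_body = StringIO.new('{"i": "am json"}')
fake_body.read             # simulate something upstream consuming the body
payload = parse_json_body(fake_body)
p payload
```

Without the rewind the second read would return an empty string and JSON.parse would blow up, which is why it’s worth keeping even when it looks redundant.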

Written by Mark Needham

August 17th, 2014 at 12:21 pm

Posted in Ruby


Ruby: Google Drive – Error=BadAuthentication (GoogleDrive::AuthenticationError) Info=InvalidSecondFactor

without comments

I’ve been using the Google Drive gem to try and interact with my Google Drive account and almost immediately ran into problems trying to login.

I started out with the following code:

require "rubygems"
require "google_drive"
session = GoogleDrive.login("", "mypassword")

I’ll move it to use OAuth when I put it into my application but for spiking this approach works. Unfortunately I got the following error when running the script:

/Users/markneedham/.rbenv/versions/1.9.3-p327/lib/ruby/gems/1.9.1/gems/google_drive-0.3.10/lib/google_drive/session.rb:93:in `rescue in login': Authentication failed for Response code 403 for post Error=BadAuthentication (GoogleDrive::AuthenticationError)
	from /Users/markneedham/.rbenv/versions/1.9.3-p327/lib/ruby/gems/1.9.1/gems/google_drive-0.3.10/lib/google_drive/session.rb:86:in `login'
	from /Users/markneedham/.rbenv/versions/1.9.3-p327/lib/ruby/gems/1.9.1/gems/google_drive-0.3.10/lib/google_drive/session.rb:38:in `login'
	from /Users/markneedham/.rbenv/versions/1.9.3-p327/lib/ruby/gems/1.9.1/gems/google_drive-0.3.10/lib/google_drive.rb:18:in `login'
	from src/gdoc.rb:15:in `<main>'

Since I have two factor authentication enabled on my account it turns out that I need to create an app password to login:

[Screenshot: creating an app password in the Google account settings]

It will then pop up with a password that we can use to login (I have revoked this one!):

[Screenshot: the generated app password]

We can then use this password instead and everything works fine:

require "rubygems"
require "google_drive"
session = GoogleDrive.login("", "tuceuttkvxbvrblf")

Written by Mark Needham

August 17th, 2014 at 1:49 am

Posted in Ruby


Ruby: Regex – Matching the Trademark ™ character

without comments

I’ve been playing around with some World Cup data and while cleaning up the data I wanted to strip out the year and host country for a world cup.

I started with a string like this which I was reading from a file:

1930 FIFA World Cup Uruguay ™

And I wanted to be able to extract just the ‘Uruguay’ bit without getting the trademark or the space preceding it. I initially tried the following to match all parts of the line and extract my bit:

p text.match(/\d{4} FIFA World Cup (.*?) ™/)[1]

Unfortunately that doesn’t actually compile:

tm.rb:4: syntax error, unexpected $end, expecting ')'
p text.match(/\d{4} FIFA World Cup (.*?) ™/)[1]

I was initially able to work around the problem by matching the unicode code point instead:

p text.match(/\d{4} FIFA World Cup (.*?) \u2122/)[1]

While working on this blog post I also remembered that you can specify the character set of your Ruby file. In Ruby 1.9 the default is US-ASCII (Ruby 2.0 changed it to UTF-8), which would explain why it doesn’t like the ™ character.

If we add the following line at the top of the file then we can happily use the ™ character in our regex:

# encoding: utf-8
# ...
p text.match(/\d{4} FIFA World Cup (.*?) ™/)[1]
# returns "Uruguay"

This post therefore ends up being more of a reminder for future Mark when he comes across this problem again having forgotten about Ruby character sets!
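For future Mark’s benefit, here’s the code point version wrapped up as a small function. The name host_country is just something I’ve picked for this sketch, and it assumes every title follows the ‘year FIFA World Cup host ™’ layout shown above.

```ruby
# Hypothetical helper: pulls the host country out of a tournament title.
# \u2122 is the code point for the trademark sign, so the source file itself
# stays plain ASCII whatever encoding Ruby assumes for it.
def host_country(title)
  match = title.match(/\d{4} FIFA World Cup (.*?) \u2122/)
  match && match[1]   # nil when the title doesn't fit the pattern
end

p host_country("1930 FIFA World Cup Uruguay \u2122")
p host_country("no tournament here")
```

Returning nil for a title that doesn’t match avoids the NoMethodError we’d otherwise get from calling [1] on a failed match.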

Written by Mark Needham

June 8th, 2014 at 1:34 am

Posted in Ruby


Ruby: Calculating the orthodromic distance using the Haversine formula

without comments

As part of the UI I’m building around my football stadiums data set I wanted to calculate the distance from a football stadium to a point on the map in Ruby since cypher doesn’t currently return this value.

I had the following cypher query to return the football stadiums near Westminster along with their lat/long values:

lat, long, distance = ["51.55786291569685", "0.144195556640625", 10]
query =  " START node = node:geom('withinDistance:[#{lat}, #{long}, #{distance}]')"
query << " RETURN,,, node.lon"
rows = result["data"].map do |row| 
         { :team => row[1], 
           :stadium => row[0],            
           :lat => row[2],
           :lon => row[3]
         }
       end
p rows

which returns the following:

[{:team=>"Millwall", :stadium=>"The Den", :lat=>51.4859, :lon=>-0.050743},
 {:team=>"Arsenal", :stadium=>"Emirates Stadium", :lat=>51.5549, :lon=>-0.108436}, 
 {:team=>"Chelsea", :stadium=>"Stamford Bridge", :lat=>51.4816, :lon=>-0.191034},
 {:team=>"Fulham", :stadium=>"Craven Cottage", :lat=>51.4749, :lon=>-0.221619}, 
 {:team=>"Queens Park Rangers", :stadium=>"Loftus Road", :lat=>51.5093, :lon=>-0.232204}, 
 {:team=>"Leyton Orient", :stadium=>"Brisbane Road", :lat=>51.5601, :lon=>-0.012551}]

In the neo4j spatial code the distance between two points is referred to as the ‘orthodromic distance’ but searching for that didn’t come up with anything. However, I did eventually come across the following post which referred to the Haversine formula which is exactly what we want.

There is a good explanation of the formula on the Ask Dr Math forum which defines the formula like so:

dlon = lon2 - lon1
dlat = lat2 - lat1
a = (sin(dlat/2))^2 + cos(lat1) * cos(lat2) * (sin(dlon/2))^2
c = 2 * atan2(sqrt(a), sqrt(1-a)) 
d = R * c


  • R – the radius of the Earth
  • c – the great circle distance in radians
  • d – the great circle distance in the same units as R
  • lat1, lat2, lon1, lon2 – latitude and longitudes in radians

To convert decimal degrees to radians we need to multiply the number of degrees by pi/180 radians/degree.

The Ruby translation of that formula looks like this:

def haversine(lat1, long1, lat2, long2)
  radius_of_earth = 6378.14 # in kilometres
  rlat1, rlong1, rlat2, rlong2 = [lat1, long1, lat2, long2].map { |d| as_radians(d)}
  # The sign of the differences doesn't matter because they get squared below
  dlon = rlong1 - rlong2
  dlat = rlat1 - rlat2
  a = power(Math::sin(dlat/2), 2) + Math::cos(rlat1) * Math::cos(rlat2) * power(Math::sin(dlon/2), 2)
  great_circle_distance = 2 * Math::atan2(Math::sqrt(a), Math::sqrt(1-a))
  radius_of_earth * great_circle_distance
end
def as_radians(degrees)
  degrees * Math::PI/180
end
def power(num, pow)
  num ** pow
end
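A quick way to sanity check a translation like this is to feed it a case whose answer is known in advance. The sketch below is a compacted restatement of the function (the name great_circle_km is mine) applied to two points on the equator one degree of longitude apart, which should come out at 2πR/360, or about 111.32 km for the radius used here.

```ruby
RADIUS_OF_EARTH_KM = 6378.14

# Compact restatement of the haversine function; inputs are decimal degrees.
def great_circle_km(lat1, lon1, lat2, lon2)
  rlat1, rlon1, rlat2, rlon2 = [lat1, lon1, lat2, lon2].map { |d| d * Math::PI / 180 }
  dlat = rlat2 - rlat1
  dlon = rlon2 - rlon1
  a = Math.sin(dlat / 2)**2 + Math.cos(rlat1) * Math.cos(rlat2) * Math.sin(dlon / 2)**2
  RADIUS_OF_EARTH_KM * 2 * Math.atan2(Math.sqrt(a), Math.sqrt(1 - a))
end

# One degree of longitude along the equator is 1/360 of the circumference.
one_degree = great_circle_km(0, 0, 0, 1)
expected   = 2 * Math::PI * RADIUS_OF_EARTH_KM / 360
close      = (one_degree - expected).abs < 1e-6
puts one_degree.round(2)
puts close
```

If the two numbers disagree, the usual culprit is passing degrees straight into the trigonometric functions without converting them to radians first.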

And if we change our initial code to use it:

lat, long, distance = ["51.55786291569685", "0.144195556640625", 10]
query =  " START node = node:geom('withinDistance:[#{lat}, #{long}, #{distance}]')"
query << " RETURN,,, node.lon"
rows = result["data"].map do |row| 
         { :team => row[1], 
           :stadium => row[0], 
           :distance => haversine(lat.to_f, long.to_f, row[2], row[3]).round(2), # lat/long start out as strings
           :lat => row[2],
           :lon => row[3]
         }
       end
p rows

which gives us the output we want:

[{:team=>"Millwall", :stadium=>"The Den", :distance=>4.87, :lat=>51.4859, :lon=>-0.050743}, 
 {:team=>"Arsenal", :stadium=>"Emirates Stadium", :distance=>5.57, :lat=>51.5549, :lon=>-0.108436}, 
 {:team=>"Chelsea", :stadium=>"Stamford Bridge", :distance=>5.94, :lat=>51.4816, :lon=>-0.191034}, 
 {:team=>"Fulham", :stadium=>"Craven Cottage", :distance=>8.18, :lat=>51.4749, :lon=>-0.221619}, 
 {:team=>"Queens Park Rangers", :stadium=>"Loftus Road", :distance=>8.21, :lat=>51.5093, :lon=>-0.232204}, 
 {:team=>"Leyton Orient", :stadium=>"Brisbane Road", :distance=>9.33, :lat=>51.5601, :lon=>-0.012551}]

Written by Mark Needham

June 30th, 2013 at 10:53 pm

Posted in Ruby


Ruby/Python: Constructing a taxonomy from an array using zip

with 2 comments

As I mentioned in my previous blog post I’ve been hacking on a product taxonomy and I wanted to create a ‘CHILD’ relationship between a collection of categories.

For example, I had the following array and I wanted to transform it into an array of ‘Category, SubCategory’ pairs:

taxonomy = ["Cat", "SubCat", "SubSubCat"]
# I wanted this to become [("Cat", "SubCat"), ("SubCat", "SubSubCat")]

In order to do this we need to zip the first 2 items with the last 2 which I found reasonably easy to do using Python:

>>> zip(taxonomy[:-1], taxonomy[1:])
[('Cat', 'SubCat'), ('SubCat', 'SubSubCat')]

Here we’re using the Python list slicing notation to get all but the last item of ‘taxonomy’ and then all but the first item of ‘taxonomy’, and zipping them together.

I wanted to achieve that effect in Ruby though because my import job was written in that!

We can’t achieve the open ended slicing as far as I can tell so the following gives us an error:

> taxonomy[..-1]
SyntaxError: (irb):10: syntax error, unexpected tDOT2, expecting ']'
	from /Users/markhneedham/.rbenv/versions/1.9.3-p327/bin/irb:12:in `<main>'

Ruby ranges written with ‘..’ include their end point, so to drop the last item of the array we slice up to ‘-2’ rather than ‘-1’:

> taxonomy[0..-2].zip(taxonomy[1..-1])
=> [["Cat", "SubCat"], ["SubCat", "SubSubCat"]]
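Another way to get the same pairs, without any slicing at all, is Enumerable#each_cons which walks over every run of consecutive elements. This is an alternative to the zip approach rather than what my import job used:

```ruby
taxonomy = ["Cat", "SubCat", "SubSubCat"]

# each_cons(2) yields every run of two consecutive elements.
pairs = taxonomy.each_cons(2).to_a
p pairs

# The zip version from above, for comparison.
zipped = taxonomy[0..-2].zip(taxonomy[1..-1])
p pairs == zipped
```

Both versions return an empty array for a single item taxonomy, so there is no special case to handle for a category with no children.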

Written by Mark Needham

May 19th, 2013 at 10:44 pm

Posted in Python,Ruby


Ruby 1.9.3 p0: Investigating weirdness with HTTP POST request in net/http

with one comment

Thibaut and I spent the best part of the last couple of days trying to diagnose a problem we were having trying to make a POST request using rest-client to one of our services.

We have nginx fronting the application server so the request passes through there first.


The problem we were having was that the request was timing out on the client side before it had been processed and the request wasn’t reaching the application server.

We initially thought there might be a problem with our nginx configuration because we don’t have many POST requests with largish (40kb) payloads so we initially tried tweaking the proxy buffer size.

It was a bit of a long shot because changing that setting only reduces the likelihood that nginx writes the request body to disc and then loads it later which shouldn’t impact performance that much.

The next thing we tried was replicating the request using cURL with a smaller payload which worked fine. cURL had no problem with the bigger payload either.

We therefore thought there must be a difference in the request headers being sent by rest-client and our initial investigation suggested that it might be to do with the ‘Content-Length‘ header.

There was a 1 byte difference in the value being sent by cURL and the one being sent by rest-client which was to do with the last character of the payload being a 0A (linefeed) character.

We changed the ‘Content-Length’ header on our cURL request to match that of the rest-client request (i.e. 1 byte too large) and were able to replicate the timeout problem.

At this stage we thought that calling ‘strip’ on the body of our rest-client request would solve the problem as the ‘Content-Length’ header would now be set to the correct value. It did set the ‘Content-Length’ header properly but unfortunately didn’t get rid of the timeout.

Our next step was to check whether or not we could get any request to work from rest-client so we tried using a smaller payload which worked fine.

At this stage Jason heard us discussing what to do next and said that he’d come across it earlier and that upgrading our Ruby Version from ‘1.9.3p0’ would solve all our woes.

That Ruby version is a couple of years old and most of our servers are running ‘1.9.3p392’ but somehow this one had slipped through the net.

We spun up a new server with that version of Ruby installed and it did indeed fix the problem.

However, we were curious what the fix was and had a look at the change log of the first patch release after ‘1.9.3p0’. We noticed the following which seemed relevant:

Tue May 31 17:03:24 2011 Hiroshi Nakamura

* lib/net/http.rb, lib/net/protocol.rb: Allow to configure to wait
server returning ‘100 continue’ response before sending HTTP request
body. See NEWS for more detail. See #3622.
Original patch is made by Eric Hodel.

* test/net/http/test_http.rb: test it.

* NEWS: Add new feature.

One thing we noticed from looking at the requests with ngrep was that cURL was sending the ‘Expect: 100-continue’ request header and rest-client wasn’t.

When the payload size was small nginx didn’t seem to send a ‘100 Continue’ response which was presumably why we weren’t seeing a problem with the small payloads.

I wasn’t sure how to go about finding out exactly what was going wrong but given how long it took us to get to this point I thought I’d summarise what we tried and see if anyone could explain it to me.

So if you’ve come across this problem (probably 2 years ago!) it’d be cool to know exactly what the problem was.
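For anyone wanting to experiment with that changelog entry, the setting it describes is exposed on Net::HTTP as continue_timeout. The following sketch only builds the client and the request without sending anything; the host, path and payload are placeholders, and I haven’t confirmed that this setting is what made the difference in our case.

```ruby
require 'net/http'

# Placeholder host and path, purely for illustration.
http = Net::HTTP.new('service.example.com', 80)
# Seconds to wait for a '100 Continue' response before sending the body anyway.
http.continue_timeout = 1

request = Net::HTTP::Post.new('/some/endpoint')
request['Content-Type'] = 'application/json'
request['Expect'] = '100-continue'   # the header cURL was sending
request.body = '{"i": "am json"}'

# http.request(request) would send it; left out so the sketch runs offline.
puts request['Expect']
puts http.continue_timeout
```

Comparing what this sends against the cURL request in ngrep would be one way to narrow down where the two clients differ.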

Written by Mark Needham

April 30th, 2013 at 9:37 pm

Posted in Ruby


Ruby/Haml: Conditionally/Optionally setting an attribute/class

with 2 comments

One of the things that we want to do reasonably frequently is set an attribute (most often a class) on an HTML element depending on the value of a variable.

I always forget how to do this in Haml so I thought I’d better write it down so I’ll remember next time!

Let’s say we want to add a success class to a paragraph if the variable correct is true and not have any value if it’s false.

The following code does what we want:

- correct = true
%p{:class => (correct ? "success" : nil) }
  important text

This generates the following HTML if correct is true:

<p class="success">
  important text
</p>

And the following HTML if it’s false:

<p>
  important text
</p>

To summarise, if we set an attribute to nil in Haml it just won’t be rendered at all which is exactly what we want in this situation.

Written by Mark Needham

March 2nd, 2013 at 11:22 pm

Posted in Ruby


Ruby/Haml: Maintaining white space/indentation in a <pre> tag

without comments

I’ve been writing a little web app in which I wanted to display cypher queries inside a <pre> tag which was then prettified using SyntaxHighlighter but I was having problems with how code on new lines was being displayed.

I had the following Haml code to display a query looking up Gareth Bale in a graph:

%pre{ :class => "brush: cypher; gutter: false; toolbar: false;"}
  START player = node:players('name:"Gareth Bale"') 

When I rendered the page it looked like this:

[Screenshot: the query rendered incorrectly]

After a bit of googling I ended up on this Stack Overflow post which described the preserve helper which seems to do the job:

%pre{ :class => "brush: cypher; gutter: false; toolbar: false;"}
  = preserve do
    START player = node:players('name:"Gareth Bale"') 

That part of the page now looks much better:

[Screenshot: the query rendered correctly]

Written by Mark Needham

March 2nd, 2013 at 10:19 pm

Posted in Ruby


Ruby: Stripping out a non breaking space character (&nbsp;)

without comments

A couple of days ago I was playing with some code to scrape data from a web page and I wanted to skip a row in a table if the row didn’t contain any text.

I initially had the following code to do that:

rows.each do |row|
  next if row.strip.empty?
  # other scraping code
end

Unfortunately that approach broke down fairly quickly because empty rows contained a non breaking space i.e. ‘&nbsp;’.

If we try calling strip on a string containing that character we can see that it doesn’t get stripped:

# its hex representation is A0
> "\u00A0".strip
=> " "
> "\u00A0".strip.empty?
=> false

I wanted to see whether I could use gsub to solve the problem so I tried the following code which didn’t help either:

> "\u00A0".gsub(/\s*/, "")
=> " "
> "\u00A0".gsub(/\s*/, "").empty?
=> false

A bit of googling led me to this Stack Overflow post which suggests using the POSIX space character class to match the non breaking space rather than ‘\s’ because that will match more of the different space characters.


> "\u00A0".gsub(/[[:space:]]+/, "")
=> ""
> "\u00A0".gsub(/[[:space:]]+/, "").empty?
=> true

We don’t want to indiscriminately remove every space though, because then we’d mash the two names together…

> "Mark Needham".gsub(/[[:space:]]+/, "")
=> "MarkNeedham"

…the poster suggested the following regex which does the job:

> "\u00A0".gsub(/\A[[:space:]]+|[[:space:]]+\z/, '')
=> ""
> ("Mark" + "\u00A0" + "Needham").gsub(/\A[[:space:]]+|[[:space:]]+\z/, '')
=> "Mark Needham"
  • \A matches the beginning of the string
  • \z matches the end of the string

So what this bit of code does is match all the spaces that appear at the beginning or end of the string and replace them with an empty string.
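Pulling that back into the original scraping problem, the regex can be wrapped in a small helper and used to skip the empty rows. The helper name strip_unicode_space and the sample rows are made up for this sketch:

```ruby
# Hypothetical helper: like String#strip, but also removes non breaking
# spaces and other Unicode whitespace from both ends of the string.
def strip_unicode_space(text)
  text.gsub(/\A[[:space:]]+|[[:space:]]+\z/, '')
end

rows = ["\u00A0", "  Mark\u00A0Needham\u00A0", "   "]

# Skip rows that are empty once the surrounding whitespace has gone.
non_empty = rows.map { |row| strip_unicode_space(row) }.reject(&:empty?)
p non_empty
```

Only the middle row survives, and the non breaking space between the two names is left alone because it isn’t at either end of the string.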

Written by Mark Needham

February 23rd, 2013 at 3:04 pm

Posted in Ruby
