Mark Needham

Thoughts on Software Development

Archive for the ‘DevOps’ Category

Treat servers as cattle: Spin them up, tear them down

with one comment

A few weeks ago I wrote a post about treating servers as cattle, not as pets, in which I described an approach to managing virtual machines at uSwitch whereby we frequently spin up new ones and delete the existing ones.

I’ve worked on teams previously where we’ve also talked about this mentality but ended up not doing it because it was difficult, usually for one of two reasons:

  • Slow spin up – this might be due to the cloud provider’s infrastructure, doing too much on spin up or, I’m sure, a variety of other reasons.
  • Manual steps involved in spin up – the process isn’t 100% automated so we have to make some manual tweaks. Once the machine is finally working we don’t want to have to go through that again.

Martin Fowler wrote a post a couple of years ago where he said the following:

One of my favorite soundbites is: if it hurts, do it more often. It has the happy property of seeming nonsensical on the surface, but yielding some valuable meaning when you dig deeper.

I think it applies in this context too and I have noticed that the more frequently we tear down and spin up new nodes the easier it becomes to do so.

Part of this is because there’s been less time for changes to have happened in package repositories but we are also more inclined to optimise things that we have to do frequently so the whole process is faster as well.

For example in one of our sets of machines we need to give one machine a specific tag so that when the application is deployed it sets up a bunch of cron jobs to run each evening.

Initially this was done manually and we were quite reluctant to ever tear down that machine but we’ve now got it all automated and it’s not a big deal anymore – it can be cattle just like the rest of them!

One neat rule of thumb Phil taught me is that whenever we make major changes to our infrastructure we should spin up some new machines to check that spin up still actually works.

If we don’t do this then when we actually need to spin up a new node because of a traffic spike or machine corruption problem it’s not going to work and we’re going to have to fix things in a much more stressful context.

For example we recently moved some repositories around in github and although it’s a fairly simple change spinning up new nodes helped us see all the places where we’d failed to make the appropriate change.

While I appreciate that taking this approach is more time-consuming in the short term, I’d argue that if we automate as much of the pain as possible it will probably be beneficial in the long run.

Written by Mark Needham

April 27th, 2013 at 2:22 pm

Posted in DevOps


Puppet: Package Versions – To pin or not to pin

with 6 comments

Over the last year or so I’ve spent quite a bit of time working with puppet and one of the things that we had to decide when installing packages was whether or not to specify a particular version.

On the first project I worked on we didn’t bother and just let the package manager choose the most recent version.

Therefore if we were installing nginx the puppet code would read like this:

package { 'nginx':
  ensure  => 'present',
}

We can see which version that would install by checking the version table for the package:

$ apt-cache policy nginx
nginx:
  Installed: (none)
  Candidate: 1:1.2.6-1~43~precise1
  Version table:
     1:1.2.6-1~43~precise1 0
        500 http://ppa.launchpad.net/brightbox/ruby-ng/ubuntu/ precise/main amd64 Packages
     1.4.0-1~precise 0
        500 http://nginx.org/packages/ubuntu/ precise/nginx amd64 Packages
     1.1.19-1ubuntu0.1 0
        500 http://us.archive.ubuntu.com/ubuntu/ precise-updates/universe amd64 Packages
     1.1.19-1 0
        500 http://us.archive.ubuntu.com/ubuntu/ precise/universe amd64 Packages

In this case if we don’t specify a version the Brightbox ‘1:1.2.6-1~43~precise1’ version will be installed.

Running dpkg with the ‘compare-versions’ flag shows us that this version is considered higher than the nginx.org one:

$ dpkg --compare-versions '1:1.2.6-1~43~precise1' gt '1.4.0-1~precise' ; echo $?
0

From what I understand you can pin versions higher up the list by giving them a higher pin priority. All these sources sit at the default priority of ‘500’ and when priorities are tied apt simply takes the highest version – the ‘1:’ epoch prefix is what makes the Brightbox package sort above the seemingly newer 1.4.0.

The problem with not specifying a version is that when a new version becomes available the next time puppet runs it will automatically upgrade the version for us.

Most of the time this isn’t a problem but there were a couple of occasions when a version got bumped, something elsewhere stopped working and it took us quite a while to work out what had changed.

The alternative approach is to pin the package installation to a specific version. So if we want the recent 1.4.0 version installed we’d have the following code:

package { 'nginx':
  ensure  => '1.4.0-1~precise',
}

The nice thing about this approach is that we always know which version is going to be installed.

The problem we now introduce is that when an updated version is added to the repository the old one is typically removed, which means a puppet run on a new machine will fail because it can’t find the pinned version.

After working with puppet for a few months it becomes quite easy to see when this is the reason for a failure, but it creates the perception for newer people that ‘puppet is always failing’, which isn’t so good.

I think on balance I prefer to have the versions explicitly defined because I find it easier to work out what’s going on that way but I’m sure there’s an equally strong argument for just picking the latest version.

Written by Mark Needham

April 27th, 2013 at 1:40 pm

Posted in DevOps


Puppet: Installing Oracle Java – oracle-license-v1-1 license could not be presented

with 5 comments

In order to run the neo4j server on my Ubuntu 12.04 Vagrant VM I needed to install the Oracle/Sun JDK which proved to be more difficult than I’d expected.

I initially tried to install it via the OAB-Java script but was running into some dependency problems and eventually came across a post which specified a PPA that had an installer I could use.

I wrote a little puppet Java module to wrap the commands in:

class java($version) {
  # python-software-properties provides the add-apt-repository command
  package { "python-software-properties": }
 
  exec { "add-apt-repository-oracle":
    command => "/usr/bin/add-apt-repository -y ppa:webupd8team/java",
    # Exec["apt_update"] is assumed to be defined elsewhere - an exec which
    # runs 'apt-get update' so that the PPA's packages become visible
    notify => Exec["apt_update"]
  }
 
  package { 'oracle-java7-installer':
    ensure => "${version}",
    require => [Exec['add-apt-repository-oracle']],
  }
}

I then included this in my default node definition:

node default {
  class { 'java': version => '7u21-0~webupd8~0', }
}

(as Dave Yeung points out in the comments, you may need to tweak the version. Running aptitude versions oracle-java7-installer should indicate the latest version.)

Unfortunately when I ran that I ended up with the following error:

err: /Stage[main]/Java/Package[oracle-java7-installer]/ensure: change from purged to present failed: Execution of '/usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install oracle-java7-installer' returned 100: Reading package lists...
Building dependency tree...
Reading state information...
The following extra packages will be installed:
  java-common
Suggested packages:
...
Unpacking oracle-java7-installer (from .../oracle-java7-installer_7u21-0~webupd8~0_all.deb) ...
 
oracle-license-v1-1 license could not be presented
try 'dpkg-reconfigure debconf' to select a frontend other than noninteractive
 
dpkg: error processing /var/cache/apt/archives/oracle-java7-installer_7u21-0~webupd8~0_all.deb (--unpack):
 subprocess new pre-installation script returned error exit status 2
Processing triggers for man-db ...
Errors were encountered while processing:
 /var/cache/apt/archives/oracle-java7-installer_7u21-0~webupd8~0_all.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)

I came across this post on Ask Ubuntu which explained a neat trick for getting around it by making it look like we’ve agreed to the licence. This is done by passing options to debconf-set-selections.

For a real server I guess you’d want some step where a person accepts the licence but since this is just for my hacking it seems to make sense.

My new Java manifest looks like this:

class java($version) {
  package { "python-software-properties": }
 
  exec { "add-apt-repository-oracle":
    command => "/usr/bin/add-apt-repository -y ppa:webupd8team/java",
    notify => Exec["apt_update"]
  }
 
  # pre-seed debconf so it looks like the Oracle licence has already been
  # selected and seen, which stops the installer prompting when it runs
  # noninteractively
  exec {
    'set-licence-selected':
      command => '/bin/echo debconf shared/accepted-oracle-license-v1-1 select true | /usr/bin/debconf-set-selections';
 
    'set-licence-seen':
      command => '/bin/echo debconf shared/accepted-oracle-license-v1-1 seen true | /usr/bin/debconf-set-selections';
  }
 
  package { 'oracle-java7-installer':
    ensure => "${version}",
    require => [Exec['add-apt-repository-oracle'], Exec['set-licence-selected'], Exec['set-licence-seen']],
  }
}

Written by Mark Needham

April 18th, 2013 at 11:36 pm

Posted in DevOps


dpkg/apt-cache: Useful commands

without comments

As I’ve mentioned in a couple of previous posts I’ve been playing around with creating a Vagrant VM that I can use for my neo4j hacking which has involved a lot of messing around with installing apt packages.

There are loads of different ways of working out what’s going on when packages aren’t installing as you’d expect so I thought it’d be good to document the ones I’ve been using so I can find them more easily next time.

Finding reverse dependencies

A couple of times I found myself wondering how a certain package had ended up on the VM because I hadn’t specified that it should be installed so I wanted to know who had!

I wanted to find the reverse dependencies of the package e.g. who depended on make, which we can do with the following command:

$ apt-cache rdepends make
make
Reverse Depends:
...
  build-essential
  make:i386
  libc6-dev:i386
  open-vm-dkms
  mythbuntu-desktop
  broadcom-sta-source
...

The nice thing about ‘rdepends’ is that it will tell us reverse dependencies even for a package that we haven’t installed. This was helpful here as I had forgotten to install ‘build-essential’ and this made it obvious.

Finding which version of a package is installed

I added one of the Brightbox repositories to get a more recent Ruby version and noticed that something weird was going on with the version of ‘nginx-common’ that puppet was trying to install.

It seemed like one of my dependencies was trying to pull in the ‘latest’ version of ‘nginx-common’ which I’d expected to be ‘1.1.19-1ubuntu0.1’.

By passing the ‘policy’ subcommand to apt-cache I was able to see that there was a more recent version available via Brightbox:

$ apt-cache policy nginx-common
nginx-common:
  Installed: 1.1.19-1ubuntu0.1
  Candidate: 1:1.2.6-1~43~precise1
  Version table:
     1:1.2.6-1~43~precise1 0
        500 http://ppa.launchpad.net/brightbox/ruby-ng/ubuntu/ precise/main amd64 Packages
 *** 1.1.19-1ubuntu0.1 0
        500 http://us.archive.ubuntu.com/ubuntu/ precise-updates/universe amd64 Packages
        100 /var/lib/dpkg/status
     1.1.19-1 0
        500 http://us.archive.ubuntu.com/ubuntu/ precise/universe amd64 Packages

Finding which versions of a package are available

Another subcommand that we can pass to apt-cache is ‘madison’ which shows us the available versions for a package but doesn’t indicate which version is installed:

$ apt-cache madison nginx-common
nginx-common | 1:1.2.6-1~43~precise1 | http://ppa.launchpad.net/brightbox/ruby-ng/ubuntu/ precise/main amd64 Packages
nginx-common | 1.1.19-1ubuntu0.1 | http://us.archive.ubuntu.com/ubuntu/ precise-updates/universe amd64 Packages
nginx-common |   1.1.19-1 | http://us.archive.ubuntu.com/ubuntu/ precise/universe amd64 Packages
     nginx |   1.1.19-1 | http://us.archive.ubuntu.com/ubuntu/ precise/universe Sources
     nginx | 1.1.19-1ubuntu0.1 | http://us.archive.ubuntu.com/ubuntu/ precise-updates/universe Sources
     nginx | 1:1.2.6-1~43~precise1 | http://ppa.launchpad.net/brightbox/ruby-ng/ubuntu/ precise/main Sources

Finding which package a file belongs to

At some stage I wanted to check which exact package was installing nginx which I was able to do with the following command:

$ dpkg -S `which nginx`
nginx-extras: /usr/sbin/nginx

I had installed ‘nginx-common’ which I learnt depends on ‘nginx-extras’ by using our ‘rdepends’ command:

$ apt-cache rdepends nginx-extras
nginx-extras
Reverse Depends:
  nginx-naxsi:i386
...
  nginx-common

Finding the dependencies of a package

I wanted to check the dependencies of the ‘ruby1.9.1’ package to see whether or not I needed to explicitly install ‘libruby1.9.1’ or if that would be taken care of.

Passing the ‘-s’ flag to dpkg let me check this:

$ dpkg -s ruby1.9.1
Package: ruby1.9.1
Status: install ok installed
Architecture: amd64
Version: 1:1.9.3.327-1bbox2~precise1
Replaces: irb1.9.1, rdoc1.9.1, rubygems1.9.1
Provides: irb1.9.1, rdoc1.9.1, ruby-interpreter, rubygems1.9.1
Depends: libruby1.9.1 (= 1:1.9.3.327-1bbox2~precise1), libc6 (>= 2.2.5)
Suggests: ruby1.9.1-examples, ri1.9.1, graphviz, ruby1.9.1-dev, ruby-switch
Conflicts: irb1.9.1 (<< 1.9.1.378-2~), rdoc1.9.1 (<< 1.9.1.378-2~), ri (<= 4.5), ri1.9.1 (<< 1.9.2.180-3~), ruby (<= 4.5), rubygems1.9.1
...

These are the ones that I’ve found useful so far. I’d love to hear other people’s favourites though as I’m undoubtedly missing some.

Written by Mark Needham

April 18th, 2013 at 9:54 pm

Posted in DevOps


Puppet Debt

without comments

I’ve been playing around with a puppet configuration to run a neo4j server on an Ubuntu VM and one thing that has been quite tricky is getting the Sun/Oracle Java JDK to install repeatably.

I adapted Julian’s Java module which uses OAB-Java and although it was certainly working cleanly at one stage I somehow ended up with it not working because of failed dependencies:

[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[install OAB repo]/returns:  [x] Installing Java build requirements failed
[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[install OAB repo]/returns:  [i] Showing the last 5 lines from the logfile (/root/oab-java.sh.log)...
[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[install OAB repo]/returns:  nginx-common
[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[install OAB repo]/returns:  nginx-extras
[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[install OAB repo]/returns: E: Sub-process /usr/bin/dpkg returned an error code (1)
...
[2013-04-12 07:03:10] Warning: /Stage[main]/Java/Package[sun-java6-jdk]: Skipping because of failed dependencies
[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[default JVM]: Dependency Exec[install OAB repo] has failures: true
[2013-04-12 07:03:10] Warning: /Stage[main]/Java/Exec[default JVM]: Skipping because of failed dependencies

I spent a few hours looking at this but couldn’t quite figure out how to sort out the dependency problem, and ended up running one of the commands manually, after which applying puppet again worked.

Obviously this is a bit of a cop out because ideally I’d like it to be possible to spin up the VM in one puppet run without manual intervention.

A couple of days ago I was discussing the problem with Ashok and he suggested that it was probably good to know when I could defer fixing the problem to a later stage since having a completely automated spin up isn’t my highest priority.

i.e. when I could take on what he referred to as ‘Puppet debt’.

I think this is a reasonable way of looking at things and I have worked on projects where we’ve been baffled by puppet’s dependency graph and have set up scripts which run puppet twice until we have time to sort it out.

If we’re spinning up new instances frequently then we have less ability to take on this type of debt because it’s going to hurt us much more but if not then I think it is reasonable to defer the problem.

This feels like another type of technical debt to me but I’d be interested in others’ thoughts and whether I’m just a complete cop out!

Written by Mark Needham

April 16th, 2013 at 8:57 pm

Posted in DevOps


Capistrano: Host key verification failed. ** [err] fatal: The remote end hung up unexpectedly

with 3 comments

As I mentioned in my previous post I’ve been deploying a web application to a vagrant VM using Capistrano and my initial configuration was like so:

require 'capistrano/ext/multistage'
 
set :application, "thinkingingraphs"
set :scm, :git
set :repository,  "git@bitbucket.org:markhneedham/thinkingingraphs.git"
set :scm_passphrase, ""
 
set :ssh_options, {:forward_agent => true, :paranoid => false, keys: ['~/.vagrant.d/insecure_private_key']}
set :stages, ["vagrant"]
set :default_stage, "vagrant"
 
set :user, "vagrant"
server "192.168.33.101", :app, :web, :db, :primary => true
set :deploy_to, "/var/www/thinkingingraphs"

When I ran ‘cap deploy’ I ended up with the following error:

  * executing "git clone -q git@bitbucket.org:markhneedham/thinkingingraphs.git /var/www/thinkingingraphs/releases/20130414171523 && cd /var/www/thinkingingraphs/releases/20130414171523 && git checkout -q -b deploy 6dcbf945ef5b8a5d5d39784800f4a6b7731c7d8a && (echo 6dcbf945ef5b8a5d5d39784800f4a6b7731c7d8a > /var/www/thinkingingraphs/releases/20130414171523/REVISION)"
    servers: ["192.168.33.101"]
    [192.168.33.101] executing command
 ** [192.168.33.101 :: err] Host key verification failed.
 ** [192.168.33.101 :: err] fatal: The remote end hung up unexpectedly

As far as I can tell the reason for this is that bitbucket hasn’t been verified as a host by the VM and therefore the equivalent of the following happens when it tries to clone the repository:

$ ssh git@bitbucket.org
The authenticity of host 'bitbucket.org (207.223.240.182)' can't be established.
RSA key fingerprint is 97:8c:1b:f2:6f:14:6b:5c:3b:ec:aa:46:46:74:7c:40.
Are you sure you want to continue connecting (yes/no)?

Since we aren’t answering ‘yes’ to that question and bitbucket isn’t in our ~/.ssh/known_hosts file it’s not able to continue.

One solution to this problem is to run the ssh command above and then answer ‘yes’ to the question which will add bitbucket to our known_hosts file and we can then run ‘cap deploy’ again.

It’s a bit annoying to have that manual step though so another way is to set cap to use pty by putting the following line in our config file:

set :default_run_options, {:pty => true}

Now when we run ‘cap deploy’ we can see that bitbucket automatically gets added to the known_hosts file:

    servers: ["192.168.33.101"]
    [192.168.33.101] executing command
 ** [192.168.33.101 :: out] The authenticity of host 'bitbucket.org (207.223.240.181)' can't be established.
 ** RSA key fingerprint is 97:8c:1b:f2:6f:14:6b:5c:3b:ec:aa:46:46:74:7c:40.
 ** Are you sure you want to continue connecting (yes/no)?
 ** [192.168.33.101 :: out] yes
 ** [192.168.33.101 :: out] Warning: Permanently added 'bitbucket.org,207.223.240.181' (RSA) to the list of known hosts.

As far as I can tell this runs the command using a pseudo terminal and then automatically adds bitbucket into the known_hosts file, but I’m not entirely sure how that works. My google skillz have also failed me so if anyone can explain it to me that’d be cool.

Written by Mark Needham

April 14th, 2013 at 6:18 pm

Posted in DevOps


Capistrano: Deploying to a Vagrant VM

with 2 comments

I’ve been working on a tutorial around thinking through problems in graphs using my football graph and I wanted to deploy it on a local vagrant VM as a stepping stone to deploying it in a live environment.

My Vagrant file for the VM looks like this:

# -*- mode: ruby -*-
# vi: set ft=ruby :
 
Vagrant::Config.run do |config|
  config.vm.box = "precise64"
 
  config.vm.define :neo01 do |neo|
    neo.vm.network :hostonly, "192.168.33.101"
    neo.vm.host_name = 'neo01.local'
    neo.vm.forward_port 7474, 57474
    neo.vm.forward_port 80, 50080
  end
 
  config.vm.box_url = "http://files.vagrantup.com/precise64.box"
 
  config.vm.provision :puppet do |puppet|
    puppet.manifests_path = "puppet/manifests"
    puppet.manifest_file  = "site.pp"
    puppet.module_path = "puppet/modules"
  end
end

I’m port forwarding ports 80 and 7474 to 50080 and 57474 respectively so that I can access the web app and neo4j console from my browser.

There is a bunch of puppet code to configure the machine in the location specified by the provision block.

Since the web app is written in Ruby/Sinatra the easiest deployment tool to use is probably Capistrano and I found the tutorial on the Beanstalk website really helpful for getting me set up.

My config/deploy.rb file which I’ve got Capistrano setup to read looks like this:

require 'capistrano/ext/multistage'
 
set :application, "thinkingingraphs"
set :scm, :git
set :repository,  "git@bitbucket.org:markhneedham/thinkingingraphs.git"
set :scm_passphrase, ""
 
set :ssh_options, {:forward_agent => true}
set :default_run_options, {:pty => true}
set :stages, ["vagrant"]
set :default_stage, "vagrant"

In my config/deploy/vagrant.rb file I have the following:

set :user, "vagrant"
server "192.168.33.101", :app, :web, :db, :primary => true
set :deploy_to, "/var/www/thinkingingraphs"

So that IP is the same one that I assigned in the Vagrantfile. If you didn’t do that then you’d need to use ‘vagrant ssh’ to go onto the VM and then ‘ifconfig’ to grab the IP instead.

I figured there was probably another step required to tell Capistrano where it should get the vagrant public key from but I thought I’d try and deploy anyway just to see what would happen.

$ bundle exec cap deploy

It asked me to enter the vagrant user’s password, which is ‘vagrant’ by default, and I eventually found a post on Stack Overflow which suggested changing the ‘ssh_options’ to the following:

set :ssh_options, {:forward_agent => true, keys: ['~/.vagrant.d/insecure_private_key']}

And with that the deployment worked flawlessly! Happy days.

Written by Mark Needham

April 13th, 2013 at 11:17 am

Posted in DevOps


Treating servers as cattle, not as pets

with 2 comments

Although I didn’t go to Dev Ops Days London earlier in the year I was following the hash tag on twitter and one of my favourite things that I read was the following:

“Treating servers as cattle, not as pets” #DevOpsDays

I think this is particularly applicable now that a lot of the time we’re using virtualised production environments via AWS, Rackspace or <insert-cloud-provider-here>.

At uSwitch we use AWS and over the last week Sid and I spent some time investigating a memory leak by running our applications against two different versions of Ruby.

One of them was from the Brightbox repository and the other was custom built but they had annoyingly different puppet configurations so we decided to treat them as separate machine types.

We spun up one of the custom built Ruby nodes, put it in the load balancer alongside 11 nodes of the other type and left it for the day serving traffic.

The next day we had a look at the New Relic memory consumption for both node types and it was clear that the custom built one’s memory usage was climbing much more slowly than the other one’s.

Instead of trying to work out how to change the Ruby version of the 11 existing nodes we realised it would probably be quicker to just spin up 11 new ones with the custom built Ruby and swap them with the existing ones.

This was pretty much as easy as removing the existing nodes from the load balancer and putting the new ones in although we do have one ‘special’ machine which runs some background jobs.

We needed to make sure there weren’t any jobs on its queue that hadn’t been processed and then make sure that we tagged one of the new machines so that it could take over that role.

One thing that made it particularly easy for us to do this is that spin up of new VMs is extremely quick and completely automated including the installation and start up of applications.

The only manual step we have is to put the new nodes into the load balancer which I think works ok as a manual step because it gives us a chance to quickly scan the box and check everything spun up correctly.

We install all packages/configuration on nodes using puppet headless (i.e. masterless puppet apply) which makes spin up easier than if you use server/client mode where you have to coordinate node registration with the master on spin up.

I do like this philosophy to machines and although I’m sure it doesn’t apply to all situations we’re almost at the point where if something breaks on a node we might as well spin up a new one while we’re investigating and see which finishes first!

Written by Mark Needham

April 7th, 2013 at 11:41 am

Posted in DevOps


Incrementally rolling out machines with a new puppet role

with one comment

Last week Jason and I, with (a lot of) help from Tim, worked on moving several of our applications from Passenger to Unicorn and decided that the easiest way to do this was to create a new set of nodes with this setup.

The architecture we’re working with looks like this at a VM level:

[Architecture diagram]

The ‘nginx LB’ nodes are responsible for routing all the requests to their appropriate application servers and the ‘web’ nodes serve the different applications initially using Passenger.

We started off by creating a new ‘nginx LB’ node which we pointed to a new ‘web ELB’ and just put one ‘unicorn web’ node behind it so that we could test everything was working.

We then pointed ‘www.uswitch.com’ at the IP of our new ‘nginx LB’ node in our /etc/hosts file and checked that the main flows through the various applications were working correctly.

Once we were happy this was working correctly we increased the number of ‘unicorn web’ nodes to three and then repeated our previous checks while tailing the log files across the three machines to make sure everything was ok.

The next step was to send some of the real traffic to the new nodes and check whether they were able to handle it.

Initially we thought that we could put our ‘unicorn web’ nodes alongside the ‘web’ nodes but we realised that we’d made some changes on our new ‘nginx LB’ nodes which meant that the ‘unicorn web’ nodes needed to receive requests proxied through there rather than from the old style nodes.

Jason and Sid between them came up with the idea of just plugging our new ‘nginx LB’ into the ‘nginx ELB’ so that any request it received would be processed entirely by the new stack.

Our intermediate architecture therefore looked like this:

[Architecture diagram: rollover]

We initially served 1/4 of the requests from Unicorn and watched the performance of the nodes via New Relic to check that everything was working as expected.

One thing we did notice was that the CPU usage on the Unicorn nodes was really high because we’d set up each Unicorn master process with 5 workers, which meant that we had 25 workers on the VM in total (five masters with five workers each). In comparison our Passenger instances used 5 workers in total.

Once we’d sorted that out we removed one of the ‘nginx LB’ nodes from the ‘nginx ELB’ and served 1/3 of the traffic from our new stack.

We didn’t see any problems so we removed all the ‘nginx LB’ nodes and served all the traffic from our new stack for half an hour.

Again we didn’t notice any problems so our next step before we can decommission the old nodes is to run the new stack for a day and iron out any problems before using it for real.

Written by Mark Needham

March 24th, 2013 at 10:52 pm

Posted in DevOps


Understanding what lsof socket/port aliases refer to

with 2 comments

Earlier in the week we wanted to check which ports were being listened on, and by which processes, which we can do with the following command on Mac OS X:

$ lsof -ni | grep LISTEN
idea       2398 markhneedham   58u  IPv6 0xac8f13f77b903331      0t0  TCP *:49410 (LISTEN)
idea       2398 markhneedham   65u  IPv6 0xac8f13f7799a4af1      0t0  TCP *:58741 (LISTEN)
idea       2398 markhneedham  122u  IPv6 0xac8f13f7799a4711      0t0  TCP 127.0.0.1:6942 (LISTEN)
idea       2398 markhneedham  249u  IPv6 0xac8f13f777586711      0t0  TCP *:63342 (LISTEN)
idea       2398 markhneedham  253u  IPv6 0xac8f13f777586331      0t0  TCP 127.0.0.1:63342 (LISTEN)
java      16973 markhneedham  152u  IPv6 0xac8f13f777586af1      0t0  TCP *:56471 (LISTEN)
java      16973 markhneedham  154u  IPv6 0xac8f13f779e6b711      0t0  TCP *:menandmice-dns (LISTEN)
java      16973 markhneedham  168u  IPv6 0xac8f13f77b902f51      0t0  TCP 127.0.0.1:7474 (LISTEN)
java      16973 markhneedham  171u  IPv6 0xac8f13f77b013711      0t0  TCP 127.0.0.1:7473 (LISTEN)

One of the interesting things about this output is that for the most part it shows the port number and which IPs it will accept a connection from but sometimes it uses a socket/port alias.

In this case we can see that the 3rd last line refers to ‘menandmice-dns’ but others could be ‘http-alt’ or ‘mysql’.

We can find out what port those names refer to by looking in /etc/services:

$ cat /etc/services | grep menandmice-dns
menandmice-dns  1337/udp    # menandmice DNS
menandmice-dns  1337/tcp    # menandmice DNS
$ cat /etc/services | grep http-alt
http-alt	591/udp     # FileMaker, Inc. - HTTP Alternate (see Port 80)
http-alt	591/tcp     # FileMaker, Inc. - HTTP Alternate (see Port 80)
http-alt	8008/udp     # HTTP Alternate
http-alt	8008/tcp     # HTTP Alternate
http-alt	8080/udp     # HTTP Alternate (see port 80)
http-alt	8080/tcp     # HTTP Alternate (see port 80)

There’s a massive XML document on the IANA website with a full list of the port assignments which is presumably where /etc/services is derived from.

Written by Mark Needham

March 17th, 2013 at 2:00 pm

Posted in DevOps
