Mark Needham

Thoughts on Software Development

Archive for the ‘puppet’ tag

Puppet: Package Versions – To pin or not to pin


Over the last year or so I’ve spent quite a bit of time working with puppet and one of the things that we had to decide when installing packages was whether or not to specify a particular version.

On the first project I worked on we didn’t bother and just let the package manager choose the most recent version.

Therefore if we were installing nginx the puppet code would read like this:

package { 'nginx':
  ensure  => 'present',
}

We can see which version that would install by checking the version table for the package:

$ apt-cache policy nginx
nginx:
  Installed: (none)
  Candidate: 1:1.2.6-1~43~precise1
  Version table:
     1:1.2.6-1~43~precise1 0
        500 http://ppa.launchpad.net/brightbox/ruby-ng/ubuntu/ precise/main amd64 Packages
     1.4.0-1~precise 0
        500 http://nginx.org/packages/ubuntu/ precise/nginx amd64 Packages
     1.1.19-1ubuntu0.1 0
        500 http://us.archive.ubuntu.com/ubuntu/ precise-updates/universe amd64 Packages
     1.1.19-1 0
        500 http://us.archive.ubuntu.com/ubuntu/ precise/universe amd64 Packages

In this case if we don’t specify a version the Brightbox ‘1:1.2.6-1~43~precise1’ version will be installed.

Running dpkg with the ‘compare-versions’ flag shows us that this version is considered higher than the nginx.org one:

$ dpkg --compare-versions '1:1.2.6-1~43~precise1' gt '1.4.0-1~precise' ; echo $?
0

From what I understand you can push a repository up the list by giving it a higher pin priority, but since all of these sit at the default ‘500’ apt just picks the highest version number. The ‘1:’ epoch prefix is what makes the Brightbox package sort above the nginx.org one despite its lower upstream version, which is what the exit code of 0 above is telling us.
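If we did want the nginx.org package to win we could pin it explicitly. A minimal sketch of what that might look like (the file name and priority here are my own choices, not something from our actual setup):

/etc/apt/preferences.d/nginx

# prefer anything served from nginx.org over the default priority of 500
Package: nginx
Pin: origin nginx.org
Pin-Priority: 700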

The problem with not specifying a version is that we don’t really control which version we end up with: with ‘ensure => latest’ puppet will upgrade the package on its next run as soon as a new version reaches the repository, and even with ‘present’ a machine provisioned later will quietly get a newer version than its older siblings.

Most of the time this isn’t a problem but there were a couple of occasions when a version got bumped and something elsewhere stopped working and it took us quite a while to work out what had changed.
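For completeness, this is what the ‘always track the newest version’ variant looks like (a sketch rather than something lifted from our manifests):

package { 'nginx':
  ensure  => 'latest',
}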

The alternative approach is to pin the package installation to a specific version. So if we want the recent 1.4.0 version installed we’d have the following code:

package { 'nginx':
  ensure  => '1.4.0-1~precise',
}

The nice thing about this approach is that we always know which version is going to be installed.

The problem we now introduce is that when an updated version is added to the repository the old one is typically removed which means a puppet run on a new machine will fail because it can’t find the version.

After working with puppet for a few months it becomes quite easy to spot when this is the reason for a failure, but for newer people it creates the perception that ‘puppet is always failing’, which isn’t so good.
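When that happens, the quickest way to confirm that the pinned version really has disappeared is to ask apt what it can still see for the package (using nginx as the running example):

$ apt-cache madison nginx

If the pinned version is no longer listed then the manifest needs bumping, or the packages need hosting somewhere we control.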

I think on balance I prefer to have the versions explicitly defined because I find it easier to work out what’s going on that way but I’m sure there’s an equally strong argument for just picking the latest version.

Written by Mark Needham

April 27th, 2013 at 1:40 pm

Posted in DevOps


Puppet Debt


I’ve been playing around with a puppet configuration to run a neo4j server on an Ubuntu VM and one thing that has been quite tricky is getting the Sun/Oracle Java JDK to install repeatably.

I adapted Julian’s Java module which uses OAB-Java and although it was certainly working cleanly at one stage I somehow ended up with it not working because of failed dependencies:

[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[install OAB repo]/returns:  [x] Installing Java build requirements failed
[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[install OAB repo]/returns:  [i] Showing the last 5 lines from the logfile (/root/oab-java.sh.log)...
[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[install OAB repo]/returns:  nginx-common
[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[install OAB repo]/returns:  nginx-extras
[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[install OAB repo]/returns: E: Sub-process /usr/bin/dpkg returned an error code (1)
...
[2013-04-12 07:03:10] Warning: /Stage[main]/Java/Package[sun-java6-jdk]: Skipping because of failed dependencies
[2013-04-12 07:03:10] Notice: /Stage[main]/Java/Exec[default JVM]: Dependency Exec[install OAB repo] has failures: true
[2013-04-12 07:03:10] Warning: /Stage[main]/Java/Exec[default JVM]: Skipping because of failed dependencies

I spent a few hours looking at this problem but couldn’t quite figure out how to sort out the dependency issue, and ended up running one of the commands manually, after which applying puppet again worked.

Obviously this is a bit of a cop out because ideally I’d like it to be possible to spin up the VM in one puppet run without manual intervention.

A couple of days ago I was discussing the problem with Ashok and he suggested that it was probably good to know when I could defer fixing the problem to a later stage since having a completely automated spin up isn’t my highest priority.

i.e. when I could take on what he referred to as ‘Puppet debt’.

I think this is a reasonable way of looking at things and I have worked on projects where we’ve been baffled by puppet’s dependency graph and have set up scripts which run puppet twice until we have time to sort it out.
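The stop-gap itself is nothing more than something like this (the manifest path is made up, and the ‘|| true’ just stops the wrapper bailing out on the first run’s known failure):

# provision.sh - temporary wrapper until the dependency ordering is fixed
puppet apply /etc/puppet/manifests/site.pp || true
puppet apply /etc/puppet/manifests/site.pp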

If we’re spinning up new instances frequently then we have less ability to take on this type of debt because it’s going to hurt us much more but if not then I think it is reasonable to defer the problem.

This feels like another type of technical debt to me but I’d be interested in others’ thoughts and whether I’m just a complete cop out!

Written by Mark Needham

April 16th, 2013 at 8:57 pm

Posted in DevOps


puppetdb: Failed to submit ‘replace catalog’ command for client to PuppetDB at puppetmaster:8081: [500 Server Error]


I’m still getting used to the idea of following the logs when working out what’s going wrong with distributed systems, but it worked well when we were trying to work out why our puppet client was throwing this error when we ran ‘puppet agent -tdv’:

err: Could not retrieve catalog from remote server: Error 400 on SERVER: Failed to submit 'replace catalog' command for client to PuppetDB at puppetmaster:8081: [500 Server Error]

We were seeing the same error in /var/log/syslog on the puppet master, and a quick look at the process list didn’t suggest that the puppet master or puppetdb services were under particularly heavy load.

Eventually we ended up looking at the puppetdb logs which showed that it was running out of memory:

/var/log/puppetdb/puppetdb.log

2012-08-15 17:48:38,535 WARN  [qtp814855969-66] [server.AbstractHttpConnection] /commands
java.lang.OutOfMemoryError: Java heap space

The default /etc/default/puppetdb only has 192MB of heap space so we upped that to 1GB and problem solved! (at least for now)

/etc/default/puppetdb

###########################################
# Init settings for puppetdb
###########################################
 
# Location of your Java binary (version 6 or higher)
JAVA_BIN="/usr/bin/java"
 
# Modify this if you'd like to change the memory allocation, enable JMX, etc
JAVA_ARGS="-Xmx1024m"
 
# These normally shouldn't need to be edited if using OS packages
USER="puppetdb"
INSTALL_DIR="/usr/share/puppetdb"
CONFIG="/etc/puppetdb/conf.d"

Written by Mark Needham

August 16th, 2012 at 11:31 pm

Posted in Software Development


Puppet: Keeping the discipline


For the last 5 weeks or so I’ve been working with puppet every day to automate the configuration of various nodes in our stack and my most interesting observation so far is that you really need to keep your discipline when doing this type of work.

We can keep that discipline in three main ways when developing modules.

Running from scratch

Configuring various bits of software seems to follow the 80/20 rule and we get very close to having each thing working quite quickly but then end up spending a disproportionate amount of time tweaking the last little bits.

When it’s ‘just little bits’ there is a massive temptation to make the changes manually, retrospectively put them into a puppet module and assume that the module will work if run from scratch.

From doing that a few times and seeing a module fail when run at a later stage we’ve realised it isn’t a very good approach and we’ve started to get into the discipline of starting from scratch each time.

Ideally that would mean creating snapshots of a Virtual Machine instance and rolling back after each run but that ends up taking quite a bit of time so more often we’re removing everything that a module installs and then running that module again.
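If the VM is Vagrant-managed (an assumption on my part, the tooling isn’t stated above), the truly-from-scratch version is simply:

$ vagrant destroy -f && vagrant up

which rebuilds the box and runs the provisioner against a clean slate, at the cost of the extra time mentioned above.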

Running as part of environment

We tend to run the modules individually when we start writing them, but it’s also important to run them as part of an environment definition to make sure the modules play nicely with each other.

This is particularly useful if there are dependencies between our module and another one – perhaps our module will be notified if there’s a configuration file change in another module for example.
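As a rough illustration (the node name, class names and file path are all hypothetical, and it assumes the two classes actually declare those resources), this is the kind of wiring that only gets exercised when the modules run together:

node 'web01' {
  include nginx
  include ourapp

  # ourapp drops an nginx vhost; nginx should be reloaded when it changes
  File['/etc/nginx/sites-enabled/ourapp.conf'] ~> Service['nginx']
}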

Running on different machines

Even if we do the above two things we’ve still found that we catch problems with our modules when we try them out on other machines.

Machines which have been in use for a longer period of time will have earlier versions of a module which aren’t easy to upgrade and we’ll need to remove the old version and install the new one.

We’ve also come across occasions where other machines have files left over from modules that have since been removed, and although the files seem harmless they can interfere with new modules that we’re writing.
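When we find one of those the tidy-up usually ends up in the new module itself, along these lines (the path is a made-up example):

# clean up a stale file left behind by a module that no longer exists
file { '/etc/nginx/conf.d/old-vhost.conf':
  ensure => absent,
}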

Written by Mark Needham

July 29th, 2012 at 9:53 pm