Testing XML generation with vimdiff

A couple of weeks ago I spent a bit of time writing a Ruby DSL to automate the setup of load balancers, firewall rules and NAT rules through the vCloud API.

The vCloud API deals primarily in XML so the DSL is just a thin layer which creates the appropriate markup.

When we started out we configured everything manually through the web console and then exported the XML so the first thing that the DSL needed to do was create XML that matched what we already had.

My previous experience of using testing frameworks to do this is that they’ll tell you whether the XML you’ve generated is equivalent to your expected XML, but if the two differ it isn’t easy to work out where.

I therefore decided to use a poor man’s approach where I first copied one rule into an XML file, attempted to replicate that in the DSL, and then used vimdiff to compare the files.

Although I had to manually verify whether or not the code was working I found this approach useful as any differences between the two pieces of XML were very easy to see.

90% of the rules were almost identical so I focused on the 10% that were different and once I’d got those working it was reasonably plain sailing.

My vimdiff command read like this:

ruby generate_networking_xml.rb > bar && vimdiff -c 'set diffopt+=iwhite' bar initialFirewall.xml

After I was reasonably confident that I understood the way that the XML should be generated I created an RSpec test which checked that we could correctly create all of the existing configurations using the DSL.

While discussing this approach with Jen she suggested that an alternative would be to start with an RSpec test with most of the rules hard-coded in XML and then replace them one by one with the DSL.
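To give a flavour of what such an automated check can look like, here is a minimal sketch in Java rather than RSpec (the original test isn’t shown in this post). It treats two documents as the same if they differ only in whitespace-only text nodes between elements, which is similar in spirit to what 'diffopt+=iwhite' was hiding in vimdiff. The class name 'XmlComparison', its methods and the sample element names are made up for illustration and are not part of the vCloud schema.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical helper: compares two XML strings while ignoring indentation.
class XmlComparison {
    // Parses the XML and returns its root element with whitespace-only text nodes removed.
    static Element parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            stripWhitespace(root);
            return root;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    // Recursively removes text nodes that contain nothing but whitespace.
    static void stripWhitespace(Node node) {
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember the sibling before removing
            if (child.getNodeType() == Node.TEXT_NODE && child.getTextContent().trim().isEmpty()) {
                node.removeChild(child);
            } else {
                stripWhitespace(child);
            }
            child = next;
        }
    }

    // True when both documents have the same elements, attributes and text.
    static boolean sameXml(String expected, String actual) {
        return parse(expected).isEqualNode(parse(actual));
    }
}
```

A check like this only answers yes or no, which is exactly the limitation described above, so vimdiff is still the nicer tool for seeing where two files differ.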

I think that probably does make more sense but I still quite like my hacky approach as well!

Testing: Trying not to overdo it

The design of the code which contains the main logic of the application that I’m currently working on looks a bit like the diagram on the right hand side:

[Diagram: Orchestration code]

We load a bunch of stuff from an Oracle database, construct some objects from the data and then invoke a sequence of methods on those objects in order to execute our domain logic.

Typically we might expect to see unit-level tests against all the classes described in this diagram but we’ve actually been trying out an approach where we don’t test the orchestration code directly but rather only test it via the resource which makes use of it.

We originally started off writing some tests around that code but they ended up being really similar to our database and resource tests.

Having them around also made it difficult to change the way the orchestration worked since we’d end up breaking most of the tests when we tried to change anything.

One disadvantage of not testing this code directly is that we end up using the debugger more when trying to work out why resource tests aren’t working, since those tests now exercise more code that has no direct tests of its own.

[Diagram: Orchestration tests]

On the other hand we’ve been forced to drive logic into the domain objects as a result since we don’t have any other place to test that functionality from.

Testing directly against the domain objects is much easier since everything’s in memory and we can easily set up the data to be how we want it to be and inject it into the objects.

Another approach we could have taken would be to mock out the dependencies of the orchestration code but since this code is mostly coordinating other classes there are a lot of dependencies and the tests ended up being quite complicated and brittle.

Initially I was of the opinion that it wasn’t a good idea to not test the orchestration code but looking back a month later I think it’s working reasonably well and putting this constraint on ourselves has made the code easier to change while still being well tested.

A newfound respect for acceptance tests

On the project that I’ve been working on over the past few months one of the key benefits of the application was its ability to perform various calculations based on user input.

In order to check that these calculators are producing the correct outputs we created a series of acceptance tests that ran directly against one of the objects in the system.

We did this by defining the inputs and expected outputs for each scenario in an Excel spreadsheet which we converted into a CSV file before reading that into an NUnit test.

It looked roughly like this:


We found that testing the calculations like this gave us a quicker feedback cycle than testing them from UI tests both in terms of the time taken to run the tests and the fact that we were able to narrow in on problematic areas of the code more quickly.
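The general shape of a data-driven test like this can be sketched as follows. This is only an illustration in plain Java, not the NUnit code from the project: the 'CsvScenarioRunner' class, the three-column row layout (two inputs followed by the expected output) and the calculator passed in are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntBinaryOperator;

// Hypothetical runner: feeds each CSV row to a calculator and records mismatches.
class CsvScenarioRunner {
    // Each row is assumed to look like "input1,input2,expected".
    // Returns one description per row whose calculated value differs from the expected one.
    static List<String> failures(List<String> csvRows, IntBinaryOperator calculator) {
        List<String> failures = new ArrayList<>();
        for (String row : csvRows) {
            String[] cells = row.split(",");
            int first = Integer.parseInt(cells[0].trim());
            int second = Integer.parseInt(cells[1].trim());
            int expected = Integer.parseInt(cells[2].trim());
            int actual = calculator.applyAsInt(first, second);
            if (actual != expected) {
                failures.add(row + " -> got " + actual);
            }
        }
        return failures;
    }
}
```

Reporting every failing row, rather than stopping at the first one, is what makes it quick to narrow in on the problematic scenarios.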

As I mentioned in a previous post we’ve been trying to move the creation of the calculators away from the ‘CalculatorProvider’ and ‘CalculatorFactory’ so that they’re all created in one place based on a DSL which describes the data required to initialise a calculator.

In order to introduce this DSL into the code base these acceptance tests acted as our safety net as we pulled out the existing code and replaced it with the new DSL.


We had to completely rewrite the ‘CalculationService’ unit tests so those unit tests didn’t provide us much protection while we made the changes I described above.

The acceptance tests, on the other hand, were invaluable and saved us from incorrectly changing the code even when we were certain we’d taken such small steps along the way that we couldn’t possibly have made a mistake.

This is certainly an approach I’d use again in a similar situation although it could probably be improved by removing the step where we convert the data from the spreadsheet to a CSV file.

TDD: Driving from the assertion up

About a year ago I wrote a post about a book club we ran in Sydney covering ‘The readability of tests’ from Steve Freeman and Nat Pryce’s book in which they suggest that their preferred way of writing tests is to drive them from the assertion up:

Write Tests Backwards

Although we stick to a canonical format for test code, we don’t necessarily write tests from top to bottom. What we often do is: write the test name, which helps us decide what we want to achieve; write the call to the target code, which is the entry point for the feature; write the expectations and assertions, so we know what effects the feature should have; and, write the setup and teardown to define the context for the test. Of course, there may be some blurring of these steps to help the compiler, but this sequence reflects how we tend to think through a new unit test. Then we run it and watch it fail.

At the time I wasn’t necessarily convinced that this was the best way to drive but we came across an interesting example today where that approach might have been beneficial.

The test in question was an integration test and we were following the approach of saving the test object directly through the NHibernate session and then loading it again through a repository.

We started the test from the setup of the data and decided to get the mappings and table setup in order to successfully persist the test object first. We didn’t write the assertion or repository call in the test initially.

Having got that all working correctly we got back to our test and wrote the rest of it only to realise as we drove out the repository code that we actually needed to create a new object which would be a composition of several objects including our original test object.

We wanted to retrieve a ‘Foo’ by providing a key and a date – we would retrieve different values depending on the values we provided for those parameters.

This is roughly what the new object looked like:

public class FooRecord
{
   public Foo Foo { get; set; }
   public FooKey FooKey { get; set; }
   public DateTime OnDate { get; set; }
}

‘FooRecord’ would need to be saved to the session although we would still retrieve ‘Foo’ from the repository having queried the database for the appropriate one.

public class FooRepository
{
   public Foo Find(DateTime onDate, FooKey fooKey)
   {
      // code to query NHibernate which retrieves FooRecords
      // and then filters those to find the one we want
   }
}
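The filtering step can be illustrated with an in-memory sketch. It is written in Java rather than C#, it leaves NHibernate out entirely, and the selection rule is an assumption on my part: the post doesn’t say how the right record is chosen, so here I pick the latest record for the key that is dated on or before the requested date. Plain strings stand in for ‘Foo’ and ‘FooKey’.

```java
import java.time.LocalDate;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Simplified stand-in for the FooRecord in the post.
class FooRecord {
    final String foo;       // stands in for the Foo object
    final String fooKey;    // stands in for FooKey
    final LocalDate onDate;

    FooRecord(String foo, String fooKey, LocalDate onDate) {
        this.foo = foo;
        this.fooKey = fooKey;
        this.onDate = onDate;
    }
}

class FooRecordFilter {
    // ASSUMPTION: the record we want is the most recent one for the key
    // whose date is on or before the requested date.
    static Optional<String> find(List<FooRecord> records, LocalDate onDate, String fooKey) {
        return records.stream()
                .filter(record -> record.fooKey.equals(fooKey))
                .filter(record -> !record.onDate.isAfter(onDate))
                .max(Comparator.comparing((FooRecord record) -> record.onDate))
                .map(record -> record.foo);
    }
}
```

Whatever the real rule was, the point stands that the repository returns a ‘Foo’ while the thing being persisted and queried is the composite record.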

We wouldn’t necessarily have discovered this more quickly if we’d driven from the assertion because we’d still have had to start driving the implementation with an incomplete test to avoid any re-work.

I think it would have been more likely that we’d have seen the problem though.

Late integration: Some thoughts

John Daniels has an interesting post summarising GOOSgaggle, an event run a few weeks ago where people met up to talk about the ideas in ‘Growing Object-Oriented Software, Guided by Tests’.

It’s an interesting post and towards the end he states the following:

Given these two compelling justifications for starting with end-to-end tests, why is it that many people apparently don’t start there? We came up with two possibilities, although there may be many others:

      Starting with the domain model can provide an illusion of rapid progress. You can show business features working while ignoring the realities of the larger system environment. Clearly, this is not normally an approach that addresses the biggest risks first. But it’s an easy option and attractive when you’re under pressure.
      For some reason the system environment is not available to you; perhaps, for example, the team creating the infrastructure is late delivering. So rather than taking the correct – and brave – option of loudly declaring progress on your project to be blocked, you restrict yourself to creating those parts of the system that are within your control.

My first thought on reading this was that it seems like a reasonable thing to say but what if we can’t do anything about the fact that it’s blocked?

I’ve previously worked on projects where we’ve had components that are integral to the whole application delivered late by another team and despite doing exactly as John suggests we’ve struggled to influence that situation.

In terms of systems thinking it might be said that we didn’t have sufficient leverage to change the system we were operating within to be the way that we wanted it.

Either way, we could just sit there and stop doing anything, or we could keep going and accept that there would be some difficult integration work later on.

We decided to go with the second option and we had exactly those illusions of ‘rapid progress’ until we actually had to integrate with those components.

However, we did make it clear that late integration was going to carry a high cost in terms of re-work of the code. That was accepted as being a limitation of the system that we were working within and although the situation did improve slightly as time went on we never completely fixed it.

In an ideal world it would be good if just shouting loudly that you were blocked was enough to make everything fine but in reality it just becomes very repetitive and annoying!

TDD: Consistent test structure

While pairing with Damian we came across the fairly common situation where we’d written two different tests – one to handle the positive case and one the negative case.

While tidying up the tests after we’d got them passing we noticed that the test structure wasn’t exactly the same. The two tests looked a bit like this:

public void ShouldSetSomethingIfWeHaveAFoo()
{
	var aFoo = FooBuilder.Build.WithBar("bar").WithBaz("baz").AFoo();
	// some random setup
	// some stubs/expectations
	var result = new Controller(...).Submit(aFoo);
	Assert.That(result.HasFoo, Is.True);
}

public void ShouldNotSetSomethingIfWeDoNotHaveAFoo()
{
	// some random setup
	// some stubs/expectations
	var result = new Controller(...).Submit(null);
	Assert.That(result.HasFoo, Is.False);
}

There isn’t a great deal of difference between these two bits of code but the structure of the test isn’t the same because I inlined the ‘aFoo’ variable in the second test.

Damian pointed out that if we were just glancing at the tests in the future it would be much easier for us if the structure was exactly the same. This would mean that we would immediately be able to identify what the test was supposed to be doing and why.

In this contrived example we would just need to pull out the ‘null’ into a descriptive variable:

public void ShouldNotSetSomethingIfWeDoNotHaveAFoo()
{
	Foo noFoo = null;
	// some random setup
	// some stubs/expectations
	var result = new Controller(...).Submit(noFoo);
	Assert.That(result.HasFoo, Is.False);
}

Although this is a simple example I’ve been trying to follow this guideline wherever possible and my tests now tend to have the following structure:

public void ShouldShowTheStructureOfMarksTests()
{
	// The test data that's important for the test
	// Less important test data
	// Expectation/Stub setup
	// Call to object under test
	// Assertions
}

As a neat side effect I’ve also noticed that it seems to be easier to spot duplication that we can possibly extract with this approach as well.

Preventing systematic errors: An example

James Shore has an interesting recent blog post where he describes some alternatives to over reliance on acceptance testing and one of the ideas that he describes is fixing the process whenever a bug is found in exploratory testing.

He describes two ways of preventing bugs from making it through to exploratory testing:

  • Make the bug impossible
  • Catch the bug automatically

Sometimes we can prevent defects by changing the design of our system so that type of defect is impossible. For example, if we find a defect that’s caused by mismatch between UI field lengths and database field lengths, we might change our build to automatically generate the UI field lengths from database metadata.

When we can’t make defects impossible, we try to catch them automatically, typically by improving our build or test suite. For example, we might create a test that looks at all of our UI field lengths and checks each one against the database.

We had an example of the latter this week around some code which loads rules out of a database and then tries to map those rules to classes in the code through use of reflection.

For example a rule might refer to a specific property on an object so if the name of the property in the database doesn’t match the name of the property on the object then we end up with an exception.

This hadn’t happened before because we hadn’t been making many changes to the names of those properties and when we did people generally remembered that if they changed the object then they should change the database script as well.

Having that sort of manual step always seems a bit risky to me since it’s prone to human error, so having worked out what was going on we wrote a couple of integration tests to ensure that every property in the database matched up with those in the code.
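A check along these lines can be sketched with reflection. The version below is in Java and is only an approximation of what we wrote: the 'RulePropertyCheck' class and the 'Customer' type are hypothetical, the database lookup is replaced by a plain list of names, and I’m assuming the names are stored in the database in the form ‘Name’ or ‘Age’ so that they line up with getters.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class RulePropertyCheck {
    // Hypothetical domain object that the rules refer to.
    static class Customer {
        public String getName() { return "n/a"; }
        public int getAge() { return 0; }
    }

    // Returns the property names (as stored in the database) that have no
    // matching public getter on the given type.
    static List<String> unknownProperties(Class<?> type, List<String> namesFromDatabase) {
        Set<String> known = new HashSet<>();
        for (Method method : type.getMethods()) {
            String name = method.getName();
            if (name.startsWith("get") && name.length() > 3 && method.getParameterCount() == 0) {
                known.add(name.substring(3)); // getName -> Name
            }
        }
        List<String> unknown = new ArrayList<>();
        for (String name : namesFromDatabase) {
            if (!known.contains(name)) {
                unknown.add(name);
            }
        }
        return unknown;
    }
}
```

An integration test would then load the real names from the database and assert that the returned list is empty, so a renamed property fails the build instead of throwing at runtime.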

We couldn’t completely eliminate this type of bug in this case because the business wanted to have the rules configurable on the fly via the database.

It perhaps seems quite obvious that we should look to write these types of tests to shorten the feedback loop and allow us to catch problems earlier than we otherwise would but it’s easy to forget to do this so James’ post provides a good reminder!

TDD: Only mock types you own

Liz recently posted about mock objects and the original ‘mock roles, not objects’ paper and one thing that stood out for me is the idea that we should only mock types that we own.

I think this is quite an important guideline to follow otherwise we can end up in a world of pain.

One area which seems particularly vulnerable to this type of thing is when it comes to testing code which interacts with Hibernate.

A common pattern that I’ve noticed is to create a mock for the ‘EntityManager’ and then verify that certain methods on it were called when we persist or load an object for example.

There are a couple of reasons why doing this isn’t a great idea:

  1. We have no idea what the correct method calls are in the first place so we’re just guessing based on looking through the Hibernate code and selecting the methods that we think make it work correctly.
  2. If the library code gets changed then our tests break even though functionally the code might still work.

The suggestion in the paper when confronted with this situation is to put a wrapper around the library and then presumably test that the correct methods were called on the wrapper.

Programmers should not write mocks for fixed types, such as those defined by the runtime or external libraries. Instead they should write thin wrappers to implement the application abstractions in terms of the underlying infrastructure. Those wrappers will have been defined as part of a need-driven test.
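As a rough illustration of the wrapper idea, the sketch below defines an application-level interface that we own and tests the calling code against a stand-in for it. All of the names ('CustomerStore', 'InMemoryCustomerStore', 'RenameCustomer') are hypothetical, the Hibernate-backed implementation is deliberately left out, and I’ve used a hand-written in-memory fake rather than a mocking library to keep the example self-contained.

```java
import java.util.HashMap;
import java.util.Map;

// Application abstraction that we own. Production code would implement it
// in terms of the persistence library (not shown here).
interface CustomerStore {
    void save(String id, String name);
    String findName(String id);
}

// In-memory stand-in used by tests; this is also a type that we own.
class InMemoryCustomerStore implements CustomerStore {
    private final Map<String, String> names = new HashMap<>();

    public void save(String id, String name) {
        names.put(id, name);
    }

    public String findName(String id) {
        return names.get(id);
    }
}

// Code under test depends only on the wrapper, never on the library itself.
class RenameCustomer {
    private final CustomerStore store;

    RenameCustomer(CustomerStore store) {
        this.store = store;
    }

    void rename(String id, String newName) {
        if (store.findName(id) == null) {
            throw new IllegalArgumentException("unknown customer " + id);
        }
        store.save(id, newName);
    }
}
```

The real implementation of the wrapper still needs its own tests against the library, which is where functional tests come in.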

I’ve never actually used that approach but I’ve found that with Hibernate in particular it makes much more sense to write functional tests which verify the expected behaviour of using the library.

With other libraries which perhaps don’t have side effects like Hibernate does those tests would be closer to unit tests but the goal is still to test the result that we get from using the library rather than being concerned with the way that the library achieves that result.

TDD: Combining the when and then steps

I’ve written before about my favoured approach of writing tests in such a way that they have clear ‘Given/When/Then’ sections. Something which I come across quite frequently is tests where the latter two steps have been combined into one method call which takes care of both of them.

An example of this which I came across recently was roughly like this:

public void shouldCalculatePercentageDifferences() {
	verifyPercentage(50, 100, 100);
	verifyPercentage(100, 100, 0);
	verifyPercentage(100, 50, -50);
}

private void verifyPercentage(int originalValue, int newValue, int expectedValue) {
	assertEquals(expectedValue, new PercentageCalculator().calculatePercentage(originalValue, newValue));
}

This code is certainly adhering to the DRY principle although it took us quite a while to work out what the different numbers being passed into ‘verifyPercentage’ were supposed to represent.

With this type of test I think it makes more sense to have a bit of duplication to make it easier for us to understand the test.

We changed this test to have its assertions inline and make use of the Hamcrest library to do those assertions:

public void shouldCalculatePercentageDifferences() {
	assertThat(new PercentageCalculator().calculatePercentage(50, 100), is(100));
	assertThat(new PercentageCalculator().calculatePercentage(100, 100), is(0));
	assertThat(new PercentageCalculator().calculatePercentage(100, 50), is(-50));
}

I think we may have also created a field to instantiate ‘PercentageCalculator’ so that we didn’t have to instantiate that three times.

Although we end up writing more code than in the first example I don’t think it’s a problem because it’s now easier to understand and we’ll be able to resolve any failures more quickly than we were able to previously.

As Michael Feathers points out during Jay Fields’ ‘Beta Test’ presentation we need to remember why we try and adhere to the DRY principle in the first place.

To paraphrase his comments:

In production code if we don’t adhere to the DRY principle then we might make a change to a piece of code and we won’t know if there’s another place where we need to make a change as well.

In test code the tests always tell us where we need to make changes because the tests will break.

Testing End Points: Integration tests vs Contract tests

We recently changed the way that we test against our main integration point on the project I’ve been working on so that in our tests we retrieve the service object from our dependency injection container instead of ‘newing’ one up.

Our tests therefore went from looking like this:

public void ShouldTestSomeService()
{
	var someService = new SomeService();
	// and so on
}

To something more like this:

public void ShouldTestSomeService()
{
	var someService = UnityFactory.Container.Resolve<ISomeService>();
	// and so on
}

This actually happened as a side effect of another change we made to inject users into our system via our dependency injection container.

We have some ‘authenticated services’ which require the request to contain a SAML token for a valid user so it seemed to make sense to use the container in the tests instead of having to duplicate this piece of behaviour for every test.

We needed to add our fake authorised user into the container for our tests but apart from this the container being used is the same as the one being used in the production code.

Our tests are therefore now calling the services in a way which is much closer to the way that they are called in the code than was previously the case.

I think this is good as it was previously possible to have the tests working but then have a problem calling the services in production because something in the container wasn’t configured properly.
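To show why resolving from the container catches this kind of problem, here is a deliberately tiny, hand-rolled container in Java. It is not Unity and doesn’t try to mimic its API; 'TinyContainer', 'ContainerSetup' and 'CurrentUser' are names I’ve made up for the sketch, and 'SomeService' is a simplified stand-in for a service that needs a user.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Minimal stand-in for a dependency injection container.
class TinyContainer {
    private final Map<Class<?>, Supplier<?>> registrations = new HashMap<>();

    <T> void register(Class<T> type, Supplier<? extends T> factory) {
        registrations.put(type, factory);
    }

    <T> T resolve(Class<T> type) {
        Supplier<?> factory = registrations.get(type);
        if (factory == null) {
            throw new IllegalStateException("Nothing registered for " + type.getName());
        }
        return type.cast(factory.get());
    }
}

// Hypothetical dependency representing the authenticated user.
interface CurrentUser {
    String name();
}

class SomeService {
    private final CurrentUser user;

    SomeService(CurrentUser user) {
        this.user = user;
    }

    String greet() {
        return "hello " + user.name();
    }
}

class ContainerSetup {
    // Registrations shared by production code and tests; the user is
    // registered separately (a fake one in tests).
    static TinyContainer productionLike() {
        TinyContainer container = new TinyContainer();
        container.register(SomeService.class,
                () -> new SomeService(container.resolve(CurrentUser.class)));
        return container;
    }
}
```

A test that calls 'new SomeService(fakeUser)' directly never exercises the registrations, whereas one that resolves the service fails straight away if something it depends on hasn’t been registered.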

The downside is that these tests now have more room for failure than they did previously and they are not just testing the end point which was their original purpose.

In a way what we have done is convert these tests from being contract tests to integration tests.

I like the new way but I’m not completely convinced that it’s a better approach.