Mark Needham

Thoughts on Software Development

TDD: Balancing DRYness and Readability

with 11 comments

I wrote previously about creating DRY tests and after some conversations with my colleagues recently about the balance between reducing duplication but maintaining readability I think I’ve found the compromise between the two that works best for me.

The underlying idea is that in any unit test I want to be aiming for a distinct 3 sections in the test – Given/When/Then, Arrange/Act/Assert or whatever your favourite description for those is.

Why?

I find that tests written like this are the easiest for me to understand – there would typically be a blank line between each distinct section so that scanning through the test it is easy to understand what is going on and I can zoom in more easily on the bit which concerns me at the time.

When there’s expectations on mocks involved in the test then we might end up with the meat of the ‘Then’ step being defined before the ‘When’ section but for other tests it should be possible to keep to the structure.

A lot of the testing I’ve been working on recently has been around mapping data between objects – there’s not that much logic going on but it’s still important to have some sort of verification that we have mapped everything that we need to.

We often end up with a couple of tests which might look something like this:

public void ShouldEnsureThatFemaleCustomerIsMappedCorrectly()
{
	var customer = new Customer() 
					{
						Gender = Gender.Female
						Address = new Address(...)
					}
 
	var customerMessage = new CustomerMapper().MapFrom(customer)
 
	Assert.AreEqual(CustomerMessage.Gender.Female, customerMessage.Gender);
	Assert.AreEqual(new Address(..), customerMessage.Address);
	// and so on...
}
 
public void ShouldEnsureThatMaleCustomerIsMappedCorrectly()
{
	var customer = new Customer() 
				{
					Gender = Gender.Male
					Address = new Address(...)
				}
 
	var customerMessage = new CustomerMapper().MapFrom(customer)
 
	Assert.AreEqual(CustomerMessage.Gender.Male, customerMessage.Gender);
	Assert.AreEqual(new Address(..), customerMessage.Address);
	// and so on...
}

(For the sake of this example ‘CustomerMessage’ is being auto generated from an xsd)

We’ve got a bit of duplication here – it’s not that bad but if there are changes to the CustomerMessage class, for example, we have more than one place to change.

It is actually possible to refactor this so that we encapsulate nearly everything in the test, but I’ve never found a clean way to do this so that you can still understand the intent of the test.

public void ShouldEnsureThatFemaleCustomerIsMappedCorrectly()
{
	AssertCustomerDetailsAreMappedCorrectly(customer, Gender.Female, CustomerMessage.Gender.Female);
}
 
public void ShouldEnsureThatMaleCustomerIsMappedCorrectly()
{				
	AssertCustomerDetailsAreMappedCorrectly(customer, Gender.Male, CustomerMessage.Gender.Male);
}
 
private void AssertCustomerDetailsAreMappedCorrectly(Customer customer, Gender gender, CustomerMessage.Gender gender)
{			
	var customer = new Customer() 
					{				
						Gender = gender,
						Address = new Address(...)
					}
 
	var customerMessage = new CustomerMapper().MapFrom(customer)
 
	Assert.AreEqual(CustomerMessage.Gender.Male, customerMessage.Gender);
	// and so on...	
}

(Of course we would be mapping more than just gender normally but gender helps illustrate the pattern that I’ve noticed)

We’ve achieved our goal of reducing duplication but it’s not immediately obvious what we’re testing because that’s encapsulated too. I find with this approach that it’s more difficult to work out what went wrong when the test stops working, so I prefer to refactor to somewhere in between the two extremes.

public void ShouldEnsureThatFemaleCustomerIsMappedCorrectly()
{
	var customer = CreateCustomer(Gender.Female, new Address(...));
 
	var customerMessage = MapCustomerToCustomerMessage(customer);
 
	AssertFemaleCustomerDetailsAreMappedCorrectly(customer, customerMessage);
}
 
public void ShouldEnsureThatMaleCustomerIsMappedCorrectly()
{
	var customer = CreateCustomer(Gender.Male, new Address(...));
 
	var customerMessage = MapCustomerToCustomerMessage(customer);
 
	AssertMaleCustomerDetailsAreMappedCorrectly(customer, customerMessage);
}
 
private CustomerMessage MapCustomerToCustomerMessage(Customer customer)
{
	return new CustomerMapper().MapFrom(customer);
}
 
private Customer CreateCustomer(Gender gender, Address address)
{
	return new Customer() 
				{
					Gender = gender,
					Address = address
				};
}
 
private void AssertMaleCustomerDetailsAreMappedCorrectly(Customer customer, CustomerMessage customerMessage)
{			
	Assert.AreEqual(CustomerMessage.Gender.Male, customerMessage.Gender);
	// and so on...	
}
 
private void AssertFemaleCustomerDetailsAreMappedCorrectly(Customer customer, CustomerMessage customerMessage)
{			
	Assert.AreEqual(CustomerMessage.Gender.Female, customerMessage.Gender);
	// and so on...	
}

Although this results in more code than the 1st approach I like it because there’s a clear three part description of what is going on which will make it easier for me to work out which bit is going wrong. I’ve also split the assertions for Male and Female because I think it makes the test easier to read.

I’m not actually sure whether we need to put the 2nd step into its own method or not – it’s an idea I’ve been experimenting with lately.

I’m open to different ideas on this – until recently I was quite against the idea of encapsulating all the assertion statements in one method but a few conversations with Fabio have led me to trying it out and I think it does help reduce some duplication without hurting our ability to debug a test when it fails.

Be Sociable, Share!

Written by Mark Needham

April 13th, 2009 at 12:47 am

Posted in Testing

Tagged with ,

  • Pingback: DotNetShoutout

  • http://www.jphamilton.net J.P. Hamilton

    Just thinking as I read this. For repetitive tests like this, what about leaving the first test alone, in a non-refactored state. This would be your usage example. Everything else could be strongly refactored for DRY. A little weird, I know.

  • http://www.guardian.co.uk Michael Brunton-Spall

    Hey Mark,

    Two points.
    At the Guardian, we use a Builder pattern for most of our domain objects. These are only used in tests, and produce the equivalent of your CreateCustomer method.
    See, within your test you are applying DRY, but between multiple tests you will have several CreateCustomer methods, each repeated.
    Using a builder means you can do something like:
    new CustomerBuilder().gender(Gender.Male).toCustomer()
    The CustomerBuilder will construct a well formed object, such that new CustomerBuilder().toCustomer() must return an object with a satisfactory set of defaults. This means that within your tests you don’t need to worry about specifying the address, since it’s irrelevant to the tests as specified.
    It also extracts the responsibility for creating an object for your tests into a single location, meaning that when you change your constructors, or the definition of your objects, you only have to change the builder, not every test that uses that object.

    Secondly, while I like the concept of using DRY in tests, I think it’s always a mistake to pull the asserts out into other methods, it makes the tests harder to read.
    If you feel your assert lines aren’t easy to read, then you need to either split up your tests so there are less assertions per test, or use a library like hamcrest to provide assert methods that are easier to read, for example:
    assertThat(customerMessage.gender, is(CustomerMessage.Gender.Male))
    which is far more readable than AssertMaleCustomerDetailsAreMappedCorrectly(customer, customerMessage)
    or
    Assert.AreEqual(CustomerMessage.Gender.Female, customerMessage.Gender);

    Cheers

  • http://www.markhneedham.com Mark Needham

    Hey Michael,

    Good point about the builders – that would certainly make the test setup more readable.

    That’s quite an interesting point about Hamcrest – I really like Hamcrest and have been working on a .NET version with some colleagues. Maybe I should hurry up and finish that off so we don’t have to group them in methods.

    Would you be in favour of grouping Hamcrest customer assertions into a method so we’d have a list of Hamcrest assertions instead of a list of NUnit ones for example? Maybe that’s not actually needed because of the readability of Hamcrest. I’ve certainly never refactored code like in this post when using Hamcrest.

    Cheers,
    Mark

  • http://brianbrowning.wordpress.com Brian Browning

    Hey Mark,

    Do you have a public repo of your .net hamcrest port out there anywhere?
    I would love to take a look at it ( even a really early here there be dragons beta version :) ).

    Thanks,
    Brian

  • Pingback: Reflective Perspective - Chris Alcock » The Morning Brew #326

  • http://www.guardian.co.uk Michael Brunton-Spall

    Mark,

    I’d not be in favour of grouping the assertions into a method regardless of whether they are hamcrest or NUnit assertions.

    The reason for that is that in my mind, the primary purpose of the test is to execute a single method, then make assertions. Both of these should be in the core body of the test, so that it’s obvious what this test method is doing.
    The setup to get objects into the right state to call the method under test is very much incidental, and so can be factored out to a Builder, and to setup methods, or a private method.

    I understand that you are talking around the edges of BDD (or BDT I guess), in which you want your test separated into Given/When/Then methods.
    But for me, there are two use cases of my test. 90% of the time the test will be read, therefore making the test easy to read is key.
    Choosing a good private method name will do this, as will using matchers that are more expressive, and hiding object construction.
    However 10% of the time, a test will be modified.
    Now in my experience, people who change tests (including me) rarely remember to change the names of private methods, and often copy and paste construction code around. The Builders ensure that there is one true place to construct objects, and nice expressive matchers will naturally ensure that the tests express the truth (since they fail if you use the wrong matchers), but private method names will almost always be forgotten.

    You end up in the case whereby you have ridiculous methods like
    private void AssertMaleCustomerDetailsAreMappedCorrectly(Customer customer, CustomerMessage customerMessage)
    {
    Assert.AreEqual(CustomerMessage.Gender.Female, customerMessage.Gender);
    }

    Where someone changed the requirement, but didn’t change the method name.
    At that point, the name AssertMaleCustomerDetailsAreMapppedCorrectly can be misleading in your test, whereas the assert will have changed.

    How strongly I feel about that is questionable, I’d love to work with someone who writes tests like that both to write some tests, and 6 months later, after the code base is maintained by 30 other developers, and see what the results are. Maybe I’m wrong and the developers would change the names of the private methods, if that was the developer culture. My experience tells me otherwise, but out of some 12,000 tests, only a few hundred are written like that, so it’s not culture here.

    An incidental benefit to keeping assertions and method under test inline is that it becomes apparent to everybody when a test is testing too much, or testing the wrong thing. A test with 14 lines of mocks, three method calls and 10 assertions are easily spotted by almost all developers as broken.
    A private method with 6 lines of assertions, which is called from every test method becomes much easier with your approach.
    (Rule of thumb, if it’s more than 4 lines of setup, or more than 4 assertions, it’s probably testing the wrong thing. YMMV of course)

    Cheers
    Michael

  • http://www.markhneedham.com Mark Needham

    @Brian – this is what we’ve got so far – http://bitbucket.org/markhneedham/hambread/ – I think there’s some stuff still to commit. Hopefully it makes some sort of sense. Quite a long way from finished as you shall see!

  • http://www.guardian.co.uk Michael Brunton-Spall

    Hey Mark,

    (Writing for the second time since firefox ate my comment I wrote over lunch)

    I still feel that extracting the assertions out to a separate method is the wrong thing to do for several reasons.

    Primarily, it’s extracting the core functionality of the test. The assertions are the single most important thing in the test method, with the method under test coming in second.
    When you factor out the assertions into a private helper method, you hide the very purpose of the test, making it one click further away for the developer.
    Well chosen helper method names are argued to counter this, but there also comes a problem when at a later date the requirements change, and the test needs to change to represent the requirement change.
    Using a private method or builder insulates your tests from the construction change, meaning that there is one single source of truth for object construction and changing the assertions is mandated by your test framework, and CI process.
    But in my experience within a large team, remembering to change the name of a private method to reflect the changes within the method seems to be suprisingly difficult. I will admit that I’ve done it myself, you are in the flow, you’re changing tests, you see a helper object that needs changing and you change it without looking at the name of the helper class or method. You checkin, and someone bearded comes and yells at you for making a method called “ensureCustomerIsMale” return Gender.Female!

    Maybe it’s a cultural thing, my experience in a large team, with over 12000 tests, of which only a few hundred are written this way, is that private methods are rarely renamed, leading to tests that seem to make assertions, or setup objects a certain way, and actually do something slightly different.

    There is a final point as well, When assertions are factored into a method, it can be deceptively easy to reuse them. I.e. your test does something like
    Setup();
    object = repository.getMethod();
    assertObjectIsValid(object);
    assertObjectCameFromRepository(object);
    assertObjectsDependenciesAreValid(object);

    where each assert method contains 3 or more assertions. If you had a single test that did 12 or more assertions, you should know something was wrong. Equally, if you have a test that has 10 lines of setup, including passing mock objects to other mock objects to be returned, then you know you’ve done something wrong.
    By hiding that implementation away within your test code, you are hiding the complexity surrounding the object.
    Does your test really need to check that the object is valid? If one test checks it, every other test can assume it. Test each thing once, not in every test.

    My version of your starting tests would be

    public void ShouldEnsureThatFemaleCustomerIsMappedCorrectly()
    {
    var customer = new CustomerBuilder.Gender(Gender.Female).toCustomer()

    var customerMessage = new CustomerMapper().MapFrom(customer)

    Assert.AreEqual(CustomerMessage.Gender.Female, customerMessage.Gender);
    }

    public void ShouldEnsureThatMaleCustomerIsMappedCorrectly()
    {
    var customer = new CustomerBuilder.Gender(Gender.Female).toCustomer()
    {
    var customerMessage = new CustomerMapper().MapFrom(customer)

    Assert.AreEqual(CustomerMessage.Gender.Male, customerMessage.Gender);
    }

    public void ShouldEnsureThatcustomerAddressIsMappedCorrectly()
    {
    var address = new AddressBuilder().Number(“12″).Street(“Bob St”).toAddress();
    var customer = new CustomerBuilder.Address(address).toCustomer()
    {
    var customerAddress = new CustomerMapper().MapFrom(customer).getAddress()

    Assert.AreEqual(“12″, customerAddress.HouseNumber);
    Assert.AreEqual(“Bob St”, customerAddress.Line1);
    // or Assert.AreEqual(address, customerAddress);
    }

    Here, hopefully, the meaning is clear, for a customer who is unremarkable except for feature X, lets assert stuff about feature X.
    I’ve extrapolated a little about what the address assertions where, but I figure it’s probably something like that.
    I am of course assuming that the address here is not affected by the gender of the customer. If it is, I’d write the test a little differently.

    Cheers
    Michael

  • http://www.markhneedham.com Mark Needham

    @Michael – I found the original reply – hidden in the comments that Akismet decided is spam! Replies are a little different so I’ll leave both if that’s cool with you?

    I can definitely see where the approach I’ve mentioned could go wrong, can’t argue with any of your points!

    What I’ve noticed for for the tests I’ve described above is that at least so far changes to the code I’m testing seem to all be around the same area and the names of the methods are quite unlikely to change since they’re not really doing that much.

    Very good point about keeping the methods small as well.

  • Pingback: Good Lazy and Bad Lazy at Mark Needham