Archive for the ‘Testing’ tag
Writing unit tests can be fun
I recently came across Pavel Brodzinski’s blog and, while browsing through some of his most recent posts, found one discussing when unit testing doesn’t work.
The majority of what Pavel says I’ve seen happen before on projects I’ve worked on but I disagree with his suggestion that writing unit tests is boring:
6. Writing unit tests is boring. That’s not amusing or challenging algorithmic problem. That’s not cool hacking trick which you can show off with in front of your geeky friends. That’s not a new technology which gets a lot of buzz. It’s boring. People don’t like boring things. People tend to skip them.
I think it depends on the way that the unit tests are being written.
When I first started working at ThoughtWorks I used to think that writing tests was boring and that it was much more fun writing production code. A couple of years have gone by since then and I think I actually get more enjoyment out of writing tests these days.
There are some things we've done on teams I've worked on which contribute to my enjoyment when writing unit tests:
Small steps
While working on a little application to parse some log files last week I had to implement an algorithm to find the closing tag of an XML element in a stream of text.
I had a bit of an idea of how to do that but coming up with little examples to drive out the algorithm helped me a lot as I find it very difficult to keep large problems in my head.
The key with following the small steps approach is to write only one test at a time, as that helps keep you focused on just that one use of the class, which I find much easier than considering all the cases at the same time.
The feeling of progress all the time, however small, contributes to my enjoyment of using this approach.
Test first
I think a lot of the enjoyment comes from writing unit tests before writing code, TDD style.
The process of moving up and down the code as we discover different objects that should be created and different places where functionality should be written means that writing our tests/examples first is a much more enjoyable process than writing them afterwards.
The additional enjoyment in this process comes from the fact that we often discover scenarios of code use and problems that we probably wouldn’t have come across if we hadn’t driven our code that way.
Ping pong pairing
I think this is the most fun variation of pair programming that I’ve experienced. The basic idea is that one person writes a test, the other writes the code to make it pass and then writes the next test, and the first person then writes the code for that test.
I like it to become a bit of a game whereby when it’s your turn to write the code you write just the minimal amount of code possible to make the test pass before driving out a proper implementation with the next test you write.
I think this makes the whole process much more light hearted than it can be otherwise.
In Summary
What makes writing unit tests fun pretty much seems to come down to driving our code through those unit tests, preferably while working with someone else.
Even if we choose not to unit test because we find it boring we’re still going to test the code whether or not we do it in an automated way!
I don’t have time not to test!
I recently read a blog post by Joshua Lockwood where he spoke of some people who claim they don’t have time to test.
Learning the TDD approach to writing code has been one of the best things that I’ve learnt over the last few years – before I worked at ThoughtWorks I didn’t know how to do it and the only way I could verify whether something worked was to load up the application and manually check it.
It was severely painful and on one particular occasion I managed to put some code with a bug into production because I didn’t know all the places that making that code change would impact.
It’s not a good way of working and I’m glad I’ve been given the opportunity to work with people who have shown me a better way.
My experience pretty much matches a comment made by Chris Missal on the post where he pointed out that you are going to test your code anyway so you might as well automate that test!
“You’re already testing with the debugger, TestPage1.aspx, or whatever… Just save that code and automate it!”
I’ve just spent the last 2 hours doing some refactoring on an F# twitter application I’m working on and because I didn’t write any tests it’s been a very painful experience indeed.
Every time I make a change I have to copy all the code into F# interactive, run the code and then manually make sure that I haven’t broken anything.
I’ve been doing this in fairly small steps – make one change then run it – but the cycle time is still much greater than it would be if I had just put some tests around the code in the first place.
I think we should be looking to test more than just the ‘complex code’ as well – there have been numerous occasions when I’ve put the logic for a conditional statement the wrong way around and a test has come to the rescue.
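As a small illustration of the kind of test that catches an inverted conditional, here is a sketch assuming NUnit. `DiscountPolicy` and its threshold are hypothetical names invented for the example, not anything from the application mentioned above.

```csharp
using NUnit.Framework;

// Hypothetical production code: the comparison is the part that is easy to flip.
public static class DiscountPolicy
{
    public const decimal Threshold = 100m;

    public static bool QualifiesForDiscount(decimal orderTotal)
    {
        // Writing "<=" here by mistake would make both tests below fail.
        return orderTotal >= Threshold;
    }
}

[TestFixture]
public class DiscountPolicyTests
{
    [Test]
    public void ShouldGiveDiscountWhenTotalIsAboveThreshold()
    {
        Assert.That(DiscountPolicy.QualifiesForDiscount(150m), Is.True);
    }

    [Test]
    public void ShouldNotGiveDiscountWhenTotalIsBelowThreshold()
    {
        Assert.That(DiscountPolicy.QualifiesForDiscount(50m), Is.False);
    }
}
```

Neither test is interesting on its own, but together they pin down which way round the comparison goes.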
It pretty much applies to all the languages that I’ve worked in and if we can’t see how to easily create an automated test for a bit of code then it’s a sign that we’re doing something wrong and we might want to take a look at that!
TDD: Balancing DRYness and Readability
I wrote previously about creating DRY tests and after some conversations with my colleagues recently about the balance between reducing duplication but maintaining readability I think I’ve found the compromise between the two that works best for me.
The underlying idea is that in any unit test I want to be aiming for three distinct sections in the test – Given/When/Then, Arrange/Act/Assert or whatever your favourite description for those is.
Why?
I find that tests written like this are the easiest for me to understand – there would typically be a blank line between each distinct section so that scanning through the test it is easy to understand what is going on and I can zoom in more easily on the bit which concerns me at the time.
When there are expectations on mocks involved in the test then we might end up with the meat of the ‘Then’ step being defined before the ‘When’ section but for other tests it should be possible to keep to the structure.
A lot of the testing I’ve been working on recently has been around mapping data between objects – there’s not that much logic going on but it’s still important to have some sort of verification that we have mapped everything that we need to.
We often end up with a couple of tests which might look something like this:
public void ShouldEnsureThatFemaleCustomerIsMappedCorrectly()
{
    var customer = new Customer()
    {
        Gender = Gender.Female,
        Address = new Address(...)
    };

    var customerMessage = new CustomerMapper().MapFrom(customer);

    Assert.AreEqual(CustomerMessage.Gender.Female, customerMessage.Gender);
    Assert.AreEqual(new Address(..), customerMessage.Address);
    // and so on...
}

public void ShouldEnsureThatMaleCustomerIsMappedCorrectly()
{
    var customer = new Customer()
    {
        Gender = Gender.Male,
        Address = new Address(...)
    };

    var customerMessage = new CustomerMapper().MapFrom(customer);

    Assert.AreEqual(CustomerMessage.Gender.Male, customerMessage.Gender);
    Assert.AreEqual(new Address(..), customerMessage.Address);
    // and so on...
}
(For the sake of this example ‘CustomerMessage’ is being auto generated from an xsd)
We’ve got a bit of duplication here – it’s not that bad but if there are changes to the CustomerMessage class, for example, we have more than one place to change.
It is actually possible to refactor this so that we encapsulate nearly everything in the test, but I’ve never found a clean way to do this so that you can still understand the intent of the test.
public void ShouldEnsureThatFemaleCustomerIsMappedCorrectly()
{
    AssertCustomerDetailsAreMappedCorrectly(Gender.Female, CustomerMessage.Gender.Female);
}

public void ShouldEnsureThatMaleCustomerIsMappedCorrectly()
{
    AssertCustomerDetailsAreMappedCorrectly(Gender.Male, CustomerMessage.Gender.Male);
}

private void AssertCustomerDetailsAreMappedCorrectly(Gender gender, CustomerMessage.Gender expectedGender)
{
    var customer = new Customer()
    {
        Gender = gender,
        Address = new Address(...)
    };

    var customerMessage = new CustomerMapper().MapFrom(customer);

    // Compare against the expected value passed in rather than a hard-coded gender
    Assert.AreEqual(expectedGender, customerMessage.Gender);
    // and so on...
}
(Of course we would be mapping more than just gender normally but gender helps illustrate the pattern that I’ve noticed)
We’ve achieved our goal of reducing duplication but it’s not immediately obvious what we’re testing because that’s encapsulated too. I find with this approach that it’s more difficult to work out what went wrong when the test stops working, so I prefer to refactor to somewhere in between the two extremes.
public void ShouldEnsureThatFemaleCustomerIsMappedCorrectly()
{
    var customer = CreateCustomer(Gender.Female, new Address(...));

    var customerMessage = MapCustomerToCustomerMessage(customer);

    AssertFemaleCustomerDetailsAreMappedCorrectly(customer, customerMessage);
}

public void ShouldEnsureThatMaleCustomerIsMappedCorrectly()
{
    var customer = CreateCustomer(Gender.Male, new Address(...));

    var customerMessage = MapCustomerToCustomerMessage(customer);

    AssertMaleCustomerDetailsAreMappedCorrectly(customer, customerMessage);
}

private CustomerMessage MapCustomerToCustomerMessage(Customer customer)
{
    return new CustomerMapper().MapFrom(customer);
}

private Customer CreateCustomer(Gender gender, Address address)
{
    return new Customer()
    {
        Gender = gender,
        Address = address
    };
}

private void AssertMaleCustomerDetailsAreMappedCorrectly(Customer customer, CustomerMessage customerMessage)
{
    Assert.AreEqual(CustomerMessage.Gender.Male, customerMessage.Gender);
    // and so on...
}

private void AssertFemaleCustomerDetailsAreMappedCorrectly(Customer customer, CustomerMessage customerMessage)
{
    Assert.AreEqual(CustomerMessage.Gender.Female, customerMessage.Gender);
    // and so on...
}
Although this results in more code than the 1st approach I like it because there’s a clear three part description of what is going on which will make it easier for me to work out which bit is going wrong. I’ve also split the assertions for Male and Female because I think it makes the test easier to read.
I’m not actually sure whether we need to put the 2nd step into its own method or not – it’s an idea I’ve been experimenting with lately.
I’m open to different ideas on this – until recently I was quite against the idea of encapsulating all the assertion statements in one method but a few conversations with Fabio have led me to trying it out and I think it does help reduce some duplication without hurting our ability to debug a test when it fails.
TDD: Test DRYness
I had a discussion recently with Fabio about DRYness in our tests and how we don’t tend to adhere to this principle as often in test code as in production code.
I think certainly some of the reason for this is that we don’t take as much care of our test code as we do production code but for me at least some of it is down to the fact that if we make our tests too DRY then they become very difficult to read and perhaps more importantly, very difficult to debug when there is a failure.
There seem to be different types of DRYness that weave themselves into our test code which result in our code becoming more difficult to read.
Suboptimal DRYness
Setup method
Putting code into a setup method is the most common way to reduce duplication in our tests but I don’t think this is necessarily the best way to do it.
The problem is that we end up increasing the context required to understand what a test does such that the reader needs to read/scroll around the test class a lot more to work out what is going on. This problem becomes especially obvious when we put mock expectations into our setup method.
One of those expectations becomes unnecessary in one of our tests and not only is it not obvious why the test has failed but we also have a bit of a refactoring job to move the expectations out and only into the tests that rely on them.
Helper methods with more than one responsibility
Extracting repeated code into helper methods is good practice but going too far and putting too much code into these methods defeats the purpose.
One of the most common ways that this is violated is when we have methods which create the object under test but also define some expectations on that object’s dependencies in the same method.
This violates the idea of having intention revealing method names as well as making it difficult to identify the reason for test failures when they happen.
Assertions
I tend to follow the Arrange, Act, Assert approach to designing tests whereby the last section of the test asserts whether or not the code under test acted as expected.
I’m not yet convinced that following the DRY approach is beneficial here because it means that you need to do more work to understand why a test is failing.
On the other hand if assertions are pulled out into an intention revealing method then the gain in readability might level out the extra time it takes to click through to a failing assertion.
My favourite approach to test assertions is to use behavioral style assertions
e.g.
stringValue.ShouldEqual("someString")
…and I don’t think applying the DRY principle here, if we have a lot of similar assertions, adds a lot of value.
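For anyone who hasn’t seen this style, an assertion like the one above can be built as a C# extension method. This is only a sketch of the idea: `ShouldExtensions` is a made-up class, and libraries that offer this style of assertion will differ in the details.

```csharp
using System;
using System.Collections.Generic;

public static class ShouldExtensions
{
    // Hypothetical helper: throws with a readable message when the values differ.
    public static void ShouldEqual<T>(this T actual, T expected)
    {
        if (!EqualityComparer<T>.Default.Equals(actual, expected))
        {
            throw new InvalidOperationException(
                $"Expected <{expected}> but was <{actual}>");
        }
    }
}
```

Because the assertion already reads like a sentence, wrapping several of them in another helper method tends to hide more than it saves.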
DRY and expressive
I’m not against DRYness in tests, I think it’s a good thing as long as we go about it in a way that still keeps the code expressive.
Test data setup
The setup and use of test data is certainly an area where we don’t gain an awful lot by having duplication in our tests. If anything having duplication merely leads to clutter and doesn’t make the tests any easier to read.
I have found the builder pattern to be very useful for creating clutter free test data where you only specifically define the data that you care about for your test and default the rest.
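A minimal sketch of what such a builder might look like is below. The cut-down `Customer` class, the `With...` method names and the default values are all assumptions made up for illustration.

```csharp
public enum Gender { Male, Female }

public class Customer
{
    public Gender Gender { get; set; }
    public string Name { get; set; }
    public string Postcode { get; set; }
}

// Hypothetical test data builder: every field has a default,
// so a test only mentions the values it cares about.
public class CustomerBuilder
{
    private Gender gender = Gender.Female;
    private string name = "Default Name";
    private string postcode = "AB1 2CD";

    public CustomerBuilder WithGender(Gender value) { gender = value; return this; }
    public CustomerBuilder WithName(string value) { name = value; return this; }
    public CustomerBuilder WithPostcode(string value) { postcode = value; return this; }

    public Customer Build()
    {
        return new Customer { Gender = gender, Name = name, Postcode = postcode };
    }
}
```

A test that only cares about gender can then say `new CustomerBuilder().WithGender(Gender.Male).Build()` and leave everything else at its default.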
Single responsibility helper methods
If we decide that extracting code into a helper method increases the readability of a test then the key is to ensure that these helper methods only do one thing, otherwise it becomes much more difficult to understand what’s going on.
My current thinking is that we should aim for having only one statement per method where possible so that we can skim through these helper methods quickly without having to spend too much time working out what’s going on.
An idea Dan North talks about (and which is nicely illustrated in this blog post) is putting these helper methods just before the test which makes use of them. I haven’t tried this out yet but it seems like a neat way of making the code more DRY and more readable.
In Summary
I’ve noticed recently that I don’t tend to read test names as often as I used to so I’m looking to the test code to be expressive enough that I can quickly understand what is going on just from scanning the test.
Keeping the code as simple as possible, extracting method when it makes sense and removing clutter are some useful steps on the way to achieving this.
TDD: Design tests for failure
As with most code, tests are read many more times than they are written and as the majority of the time the reason for reading them is to identify a test failure I think it makes sense that we should be designing our tests with failure in mind.
Several ideas come to mind when thinking about ways to write/design our tests so that when we do have to read them our task is made easier.
Keep tests data independent
The worst failures for me are the ones where a test fails and when we investigate the cause it turns out that it only failed because some data it relied on changed.
This tends to be the case particularly when we are writing boundary tests against external services where the data is prone to change.
In these situations we need to try and keep our tests general enough that they don’t give us these false failures, but also specific enough that they aren’t completely worthless.
As an example, when testing XML based services it makes more sense to check that certain elements exist in the document rather than checking that these elements have certain values. The latter approach leads to brittle, difficult to maintain tests while the former leads to tests that are more independent and whose failures are actually a cause for concern.
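Here is a rough sketch of the “check the element exists” style, assuming NUnit and LINQ to XML. The `FetchQuoteXml` method and the element names are hypothetical stand-ins for whatever the real service returns.

```csharp
using System.Linq;
using System.Xml.Linq;
using NUnit.Framework;

[TestFixture]
public class QuoteServiceBoundaryTests
{
    // Stand-in for a real service call; in a real boundary test this
    // XML would come back from the external service.
    private static string FetchQuoteXml()
    {
        return "<quote><symbol>ABC</symbol><price>12.34</price></quote>";
    }

    [Test]
    public void ShouldReturnQuoteContainingSymbolAndPrice()
    {
        var document = XDocument.Parse(FetchQuoteXml());

        // Check the shape of the response, not the values, which change over time.
        Assert.That(document.Descendants("symbol").Any(), Is.True);
        Assert.That(document.Descendants("price").Any(), Is.True);
    }
}
```

If the price changes tomorrow this test stays green, but if the service stops returning a price element at all it fails, which is a failure worth investigating.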
Consistent Structure
Jay Fields touched on this in a post he wrote a couple of months ago about having a ubiquitous assertion syntax for every test.
That way when we look at a failing test we know what to expect and we can get down to fixing the test rather than trying to work out how exactly it failed.
We have used the Arrange, Act, Assert approach on the last couple of projects I’ve worked on which has worked quite well for dividing the tests into their three main parts. We typically leave empty lines between the different sections or add a comment explaining what each section is.
The nice thing about this approach when you get it right is that you don’t even have to read the test name – the test reads like a specification and explains for itself what is going on.
My personal preference for the Assert step is that I should be able to work out why the test is failing from within the test method without having to click through to another method in the test class. There is a debate about whether or not that approach is DRY, but that’s a discussion for another post!
Avoid false failures
Failing because of test reliance on data is one example of a false failure but there are other ways that a test failure can be quite misleading as to the actual reason that it failed.
Null Reference or Null Pointer Exceptions are the chief culprits when it comes to this – a test will seemingly randomly start throwing one of these exceptions either on an assertion or in the actual code.
With the former we should shore the test up by testing something more general further up the test, so that we get a more meaningful failure the next time.
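One way of shoring up a test like that, sketched here assuming NUnit, is to assert that the object exists before asserting anything about its properties. `FindCustomer` is a hypothetical in-memory lookup standing in for the code under test.

```csharp
using System.Collections.Generic;
using NUnit.Framework;

public class Customer
{
    public string Name { get; set; }
}

[TestFixture]
public class CustomerLookupTests
{
    // Hypothetical lookup that returns null when nothing matches.
    private static Customer FindCustomer(IDictionary<int, Customer> customers, int id)
    {
        Customer customer;
        return customers.TryGetValue(id, out customer) ? customer : null;
    }

    [Test]
    public void ShouldFindCustomerById()
    {
        var customers = new Dictionary<int, Customer>
        {
            { 1, new Customer { Name = "Mark" } }
        };

        var customer = FindCustomer(customers, 1);

        // General check first: this fails with a clear message instead of
        // a NullReferenceException being thrown by the next line.
        Assert.That(customer, Is.Not.Null, "Expected to find a customer with id 1");
        Assert.That(customer.Name, Is.EqualTo("Mark"));
    }
}
```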
With the latter this usually happens because we added in some code without changing the tests first. I always get bitten when I disrespect Uncle Bob’s Three Laws.
- Write no production code except to pass a failing test.
- Write only enough of a test to demonstrate a failure.
- Write only enough production code to pass the test.
Sometimes we get false failures due to not having enough data set up on our objects. Depending on the situation we might have a look at the test to see whether it is testing too much and the class has taken on more responsibility.
If it turns out all is fine then the builder pattern is a really good way for ensuring we don’t run into this problem again.
Testing First vs Testing Last
I recently posted about my experiences of testing last where it became clear to me how important writing the test before the code is.
If we view the tests purely as a way of determining whether or not our code works correctly for a given set of examples then it doesn’t make much difference whether we test before or after we have written the code.
If on the other hand we want to get more value out of our tests, such as having them act as documentation, drive the design of our APIs and generally prove useful reading to ourselves and others in future, then a test first approach is the way to go.
Testing last means we’ve applied assumption driven development when we wrote the code and now we’re trying to work out how to use the API rather than driving out the API with some examples.
In a way writing tests first is applying the YAGNI concept to this area of development. Since we are only writing code to satisfy the examples/tests that we have written it is likely to include much less ‘just in case’ code and therefore lead to a simpler solution. Incrementally improving the code with small steps works particularly well for keeping the code simple.
As Scott Bellware points out, the cost of testing after the code has been written is much higher than we would imagine and we probably won’t cover as many scenarios as we would have done had we taken a test first approach.
I think we also spend less time thinking about exactly where the best place to test a bit of functionality is and therefore don’t end up writing the most useful tests.
Obviously sometimes we want to just try out a piece of code to see whether or not an approach is going to work but when we have gained this knowledge it makes sense to go back and test drive the code again.
As has been said many times, TDD isn’t about the testing, it’s much more.
TDD: Mock expectations in Setup
One of the ideas that I mentioned in a recent post about what I consider to be a good unit test was the idea that we shouldn’t necessarily consider the DRY (Don’t Repeat Yourself) principle to be our number one driver.
I consider putting mock expectations in the setup methods of our tests to be one of those occasions where we shouldn’t obey this principle and I thought this would be fairly unanimously agreed upon but putting the question to the Twittersphere led to mixed opinions.
The case for expectations in setup
The argument for putting expectations in the setup method is that it helps remove duplication and helps us to fail more quickly.
This would certainly be the case if, for example, we instantiated our object under test in the setup method and there were some expectations on its dependencies on creation.
The case against expectations in setup
The reason I’m so against putting expectations in setup methods derives from the pain of trying to debug NMock error messages when we put expectations and stubs in the setup method on a project I worked on about a year ago.
The number of times we were caught out by a failure which seemed ‘impossible’ from looking at the failing test was ridiculous.
After that experience we made sure that it was always obvious which expectations belonged to which test by inlining them and taking the duplication hit.
I believe a lot of the value of tests comes from the way that they fail, and if we can write tests in a way that the failure message and subsequent fix are really obvious then we are going the right way.
My current approach
My current approach to try and get the best of both worlds is to follow the approach Phil describes in his post on Domain Driven Tests.
If we have repeated expectations across different tests then I now try to extract those into appropriately named methods which can be called from each test.
[Test]
public void ShouldDoSomething()
{
    ExpectServiceToReturnSomeValue();

    // rest
    // of
    // test
}

private void ExpectServiceToReturnSomeValue()
{
    // code describing expectations
}
This creates a little bit of duplication in that we have to call this method individually in each test which uses it but I think it makes the test more readable and easier to debug.
I’m still not sure what I consider the best way to name these types of methods – Phil uses a combination of a comment and method name to create readable tests but I’m keen to try and have the intent completely described by a method name if possible.
Testing: What is a defect?
One of the key ideas that I have learnt from my readings of The Toyota Way and Taiichi Ohno’s Workplace Management is that we should strive not to pass defects through the system to the next process, which you should consider to be your customer.
As a developer the next process for each story is the testing phase where the testers will (amongst other things) run through the acceptance criteria and then do some exploratory testing for scenarios which weren’t explicitly part of the acceptance criteria.
The question is how far should we go down this route and what exactly is a defect using this terminology – if a tester finds a bug which was listed in the acceptance criteria then I think it’s reasonable enough to suggest that the developer has moved a defect onto the next stage.
But what if that bug only appears on one particular browser, one that the developer didn’t test against but the tester did? Clearly automating tests against different browsers can help solve this problem but there are still some types of tests (particularly ones requiring visual verification) where it’s much more grey.
We want developers to write code with as few defects as possible but at the end of the day testers are much better at using software in ways that are likely to expose defects that developers wouldn’t even think about and I think this is definitely a good thing.
My current thinking around this area is that a defect is something which was covered by the acceptance criteria or something which has been previously exposed by exploratory testing and reappears.
Anything else is a normal part of the process.
TDD: One test at a time
My colleague Sarah Taraporewalla has written a series of posts recently about her experiences with TDD and introducing it at her current client.
While I agreed with the majority of the posts, one thing I found interesting was that in the conversation with a TDDer there were two tests being worked on at the same time (at least as far as I understand from the example).
This means that there will be two tests failing if we run our test suite, something which I try to avoid wherever possible.
I like to keep my focus just on the test that I am currently working on, so my approach if I had another test that I knew needed to be written would either be to write it down on a piece of paper or to write the skeleton and then just not put anything inside the test body.
This could be seen as being a touch risky in case I then forget to actually write the test and the build remains green, but I prefer this trade-off to the distraction that I feel when having more than one test red.
When driving out the design of classes I am now veering towards the approach of severe simplicity such that we literally only do enough to make the test green even if that involves returning a hard coded value for example.
The next test after that would probably be the one that drives out the implementation since it becomes easier to write the code to handle the general case rather than hard coding specific implementations for the individual tests.
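To make that concrete, here is a deliberately tiny sketch, assuming NUnit. `Greeter` is a hypothetical class invented for the example; the point is only the order in which the code came into being.

```csharp
using NUnit.Framework;

public static class Greeter
{
    // After the first test alone, the body was simply: return "Hello, Mark";
    // The second test made that hard-coded value fail, driving out the general case.
    public static string Greet(string name)
    {
        return "Hello, " + name;
    }
}

[TestFixture]
public class GreeterTests
{
    [Test]
    public void ShouldGreetMark()
    {
        Assert.That(Greeter.Greet("Mark"), Is.EqualTo("Hello, Mark"));
    }

    [Test]
    public void ShouldGreetSarah()
    {
        Assert.That(Greeter.Greet("Sarah"), Is.EqualTo("Hello, Sarah"));
    }
}
```

With only the first test in place the hard-coded return is the simplest thing that works, and it is the second test that makes the general implementation the easier option.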
I started becoming convinced of this approach after trying the Karate Chop Code Kata a couple of months ago where I set up all the tests initially and therefore had 20 tests failing all at once.
It felt quite overwhelming having that many tests failing, and the sense of progress from making a test pass wasn’t there for me.
It seems a bit ridiculous but keeping the steps as small as possible is certainly the approach I am seeing the most success with at the moment.
What makes a good unit test?
Following on from my post around the definition of a unit test, a recent discussion on the Test Driven Development mailing list led me to question what my own approach is for writing unit tests.
To self quote from my previous post:
A well written unit test in my book should be simple to understand and run quickly.
Quite simple in theory but as I have learnt (and am still learning) the hard way, much harder to do in practice. Breaking that down further what does it actually mean?
Intention revealing name
There was some discussion a few months ago with regards to whether test names were actually valuable, but as the majority of my work has been in Java or C# I think they are very important.
I favour BDD style test names which describe the behaviour of what we are testing rather than the implementation details. For me naming the tests in this way allows people who look at the test in future to question whether it is a valid test as well as whether it is actually doing what it claims to be doing.
No clutter
If we can keep tests short and to the point they are much easier for the next person to read.
To achieve this we need to ensure that we keep the code in the test method to the minimum, including putting object setup code into another method so that it doesn’t clutter the test and only setting the expectations that we care about if we are using a mocking framework.
This is made much easier by the Arrange-Act-Assert approach being followed by mocking frameworks nowadays. I think this approach maps quite nicely to the Given-When-Then BDD syntax as a nice way of defining our tests or examples in BDD land.
Don’t remove all duplication
While removing duplication from code is generally a good thing I don’t think we should apply the DRY principle too zealously on test code.
As Phil points out this can make tests very difficult to read and understand. I tend to favour test expressiveness over removing all duplication.
One behaviour per test
I used to try and follow the idea of having only one assertion per test but Szczepan’s idea of testing one behaviour per test is much better.
This is one part of writing tests where we should stick to the Single Responsibility Principle in as far as not overloading the test with assertions which then make it more difficult to work out where the code failed if a test fails.
Expressive failure messages
When using JUnit or NUnit for assertions in the IDE the assertion failure messages don’t really make much difference because we have the code fresh in our mind and it’s only one click to get to the failure.
If an assertion with either of these frameworks fails on the build on the other hand it’s much harder at a glance to tell exactly why it failed. This is why I favour Hamcrest which tells you precisely why your test failed.
In Summary
For me the key with unit tests is to make sure that other people in the team can read and understand them easily.
No doubt there are other ways of ensuring our unit tests are well written but these are the ways that I consider the most important at the moment.