Archive for the ‘Testing’ tag
A new found respect for acceptance tests
On the project that I've been working on over the past few months one of the key benefits of the application was its ability to perform various calculations based on user input.
In order to check that these calculators are producing the correct outputs we created a series of acceptance tests that ran directly against one of the objects in the system.
We did this by defining the inputs and expected outputs for each scenario in an Excel spreadsheet, which we converted into a CSV file before reading that into an NUnit test.
It looked roughly like this:
We found that testing the calculations like this gave us a quicker feedback cycle than testing them from UI tests both in terms of the time taken to run the tests and the fact that we were able to narrow in on problematic areas of the code more quickly.
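To illustrate the CSV-driven approach described above, here is a minimal sketch in Java (the project itself used NUnit). The 'DiscountCalculator', the column layout and all names are hypothetical stand-ins rather than the project's real code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical calculator standing in for the real object under test.
class DiscountCalculator {
    // Returns the price after applying a whole-number percentage discount.
    static long priceAfterDiscount(long priceInCents, int discountPercent) {
        return priceInCents - (priceInCents * discountPercent) / 100;
    }
}

class CsvAcceptanceSketch {
    // Assumed row layout: priceInCents,discountPercent,expectedPriceInCents
    // Returns a description of every row whose actual output differs from the expected one.
    static List<String> failures(List<String> csvLines) {
        List<String> failures = new ArrayList<>();
        for (String line : csvLines) {
            if (line.isBlank() || line.startsWith("#")) {
                continue; // skip blank lines and header/comment lines
            }
            String[] cells = line.split(",");
            long price = Long.parseLong(cells[0].trim());
            int discount = Integer.parseInt(cells[1].trim());
            long expected = Long.parseLong(cells[2].trim());
            long actual = DiscountCalculator.priceAfterDiscount(price, discount);
            if (actual != expected) {
                failures.add(line + " -> got " + actual);
            }
        }
        return failures;
    }
}
```

A test would read the exported file into a list of lines, call 'failures' and assert that the result is empty. Reporting the failing rows is what lets us narrow in on the problematic scenario quickly.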
As I mentioned in a previous post we've been trying to move the creation of the calculators away from the 'CalculatorProvider' and 'CalculatorFactory' so that they're all created in one place based on a DSL which describes the data required to initialise a calculator.
In order to introduce this DSL into the code base these acceptance tests acted as our safety net as we pulled out the existing code and replaced it with the new DSL.
We had to completely rewrite the 'CalculationService' unit tests so those unit tests didn't provide us much protection while we made the changes I described above.
The acceptance tests, on the other hand, were invaluable and saved us from incorrectly changing the code even when we were certain we'd taken such small steps along the way that we couldn't possibly have made a mistake.
This is certainly an approach I'd use again in a similar situation although it could probably be improved by removing the step where we convert the data from the spreadsheet to a CSV file.
TDD: Driving from the assertion up
About a year ago I wrote a post about a book club we ran in Sydney covering 'The readability of tests' from Steve Freeman and Nat Pryce's book in which they suggest that their preferred way of writing tests is to drive them from the assertion up:
Write Tests Backwards
Although we stick to a canonical format for test code, we don't necessarily write tests from top to bottom. What we often do is: write the test name, which helps us decide what we want to achieve; write the call to the target code, which is the entry point for the feature; write the expectations and assertions, so we know what effects the feature should have; and, write the setup and teardown to define the context for the test. Of course, there may be some blurring of these steps to help the compiler, but this sequence reflects how we tend to think through a new unit test. Then we run it and watch it fail.
At the time I wasn't necessarily convinced that this was the best way to drive but we came across an interesting example today where that approach might have been beneficial.
The test in question was an integration test and we were following the approach of saving the test object directly through the NHibernate session and then loading it again through a repository.
We started the test from the setup of the data and decided to get the mappings and table set up first so that we could successfully persist the test object. We didn't write the assertion or repository call in the test initially.
Having got that all working correctly we got back to our test and wrote the rest of it only to realise as we drove out the repository code that we actually needed to create a new object which would be a composition of several objects including our original test object.
We wanted to retrieve a 'Foo' by providing a key and a date – we would retrieve different values depending on the values we provided for those parameters.
This is roughly what the new object looked like:
public class FooRecord
{
    public Foo Foo { get; set; }
    public FooKey FooKey { get; set; }
    public DateTime OnDate { get; set; }
}
'FooRecord' would need to be saved to the session although we would still retrieve 'Foo' from the repository having queried the database for the appropriate one.
public class FooRepository
{
    public Foo Find(DateTime onDate, FooKey fooKey)
    {
        // code to query NHibernate which retrieves FooRecords
        // and then filters those to find the one we want
    }
}
We wouldn't necessarily have discovered this more quickly if we'd driven from the assertion because we'd still have had to start driving the implementation with an incomplete test to avoid any re-work.
I think it would have been more likely that we'd have seen the problem though.
Late integration: Some thoughts
John Daniels has an interesting post summarising GOOSgaggle, an event run a few weeks ago where people met up to talk about the ideas in 'Growing Object Oriented Software, Guided by Tests'.
It's an interesting post and towards the end he states the following:
Given these two compelling justifications for starting with end-to-end tests, why is it that many people apparently don't start there? We came up with two possibilities, although there may be many others:
Starting with the domain model can provide an illusion of rapid progress. You can show business features working while ignoring the realities of the larger system environment. Clearly, this is not normally an approach that addresses the biggest risks first. But it's an easy option and attractive when you're under pressure.
For some reason the system environment is not available to you; perhaps, for example, the team creating the infrastructure is late delivering. So rather than taking the correct – and brave – option of loudly declaring progress on your project to be blocked, you restrict yourself to creating those parts of the system that are within your control.
My first thought on reading this was that it seems like a reasonable thing to say but what if we can't do anything about the fact that it's blocked?
I've previously worked on projects where we've had components that are integral to the whole application delivered late by another team and despite doing exactly as John suggests we've struggled to influence that situation.
In terms of systems thinking it might be said that we didn't have sufficient leverage to change the system we were operating within to be the way that we wanted it.
Either way, we were in a situation where we could either just sit there and stop doing anything, or keep going and accept that there would be some difficult late integration work later on.
We decided to go with the second option and we had exactly those illusions of 'rapid progress' until we actually had to integrate with those components.
However, we did make it clear that there was going to be a high cost with respect to re-work of the code from late integration. That was accepted as being a limitation of the system that we were working within and although the situation did improve slightly as time went on we never completely fixed it.
In an ideal world it would be good if just shouting loudly that you were blocked was enough to make everything fine but in reality it just becomes very repetitive and annoying!
TDD: Consistent test structure
While pairing with Damian we came across the fairly common situation where we'd written two different tests – one to handle the positive case and one the negative case.
While tidying up the tests after we'd got them passing we noticed that the test structure wasn't exactly the same. The two tests looked a bit like this:
[Test]
public void ShouldSetSomethingIfWeHaveAFoo()
{
    var aFoo = FooBuilder.Build.WithBar("bar").WithBaz("baz").AFoo();

    // some random setup
    // some stubs/expectations

    var result = new Controller(...).Submit(aFoo);

    Assert.That(result.HasFoo, Is.True);
}

[Test]
public void ShouldNotSetSomethingIfWeDoNotHaveAFoo()
{
    // some random setup
    // some stubs/expectations

    var result = new Controller(...).Submit(null);

    Assert.That(result.HasFoo, Is.False);
}
There isn't a great deal of difference between these two bits of code but the structure of the test isn't the same because I inlined the 'aFoo' variable in the second test.
Damian pointed out that if we were just glancing at the tests in the future it would be much easier for us if the structure was exactly the same. This would mean that we would immediately be able to identify what the test was supposed to be doing and why.
In this contrived example we would just need to pull out the 'null' into a descriptive variable:
[Test]
public void ShouldNotSetSomethingIfWeDoNotHaveAFoo()
{
    // 'var' cannot be used here because C# cannot infer a type from null
    Foo noFoo = null;

    // some random setup
    // some stubs/expectations

    var result = new Controller(...).Submit(noFoo);

    Assert.That(result.HasFoo, Is.False);
}
Although this is a simple example I've been trying to follow this guideline wherever possible and my tests now tend to have the following structure:
[Test]
public void ShouldShowTheStructureOfMarksTests()
{
    // The test data that's important for the test
    // Less important test data
    // Expectation/Stub setup
    // Call to object under test
    // Assertions
}
As a neat side effect I've also noticed that it seems to be easier to spot duplication that we can possibly extract with this approach as well.
Preventing systematic errors: An example
James Shore has an interesting recent blog post where he describes some alternatives to over reliance on acceptance testing and one of the ideas that he describes is fixing the process whenever a bug is found in exploratory testing.
He describes two ways of preventing bugs from making it through to exploratory testing:
- Make the bug impossible
- Catch the bug automatically
Sometimes we can prevent defects by changing the design of our system so that type of defect is impossible. For example, if we find a defect that's caused by mismatch between UI field lengths and database field lengths, we might change our build to automatically generate the UI field lengths from database metadata.
When we can't make defects impossible, we try to catch them automatically, typically by improving our build or test suite. For example, we might create a test that looks at all of our UI field lengths and checks each one against the database.
We had an example of the latter this week around some code which loads rules out of a database and then tries to map those rules to classes in the code through use of reflection.
For example a rule might refer to a specific property on an object so if the name of the property in the database doesn't match the name of the property on the object then we end up with an exception.
This hadn't happened before because we hadn't been making many changes to the names of those properties and when we did people generally remembered that if they changed the object then they should change the database script as well.
Having that sort of manual step always seems a bit risky to me since it's prone to human error, so having worked out what was going on we wrote a couple of integration tests to ensure that every property in the database matched up with those in the code.
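A check along these lines can be sketched in Java using reflection. This is only an illustration: the 'Customer' class and the idea that the property names arrive as a plain list are assumptions, and in a real integration test that list would be loaded from the rules tables.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical domain object that rules refer to by property name.
class Customer {
    private String name;
    private int age;
}

class RulePropertyCheck {
    // Returns the property names from the database that have no matching field on the class.
    static List<String> missingProperties(Class<?> type, List<String> propertyNamesFromDatabase) {
        Set<String> fieldNames = new HashSet<>();
        for (Field field : type.getDeclaredFields()) {
            fieldNames.add(field.getName());
        }
        List<String> missing = new ArrayList<>();
        for (String property : propertyNamesFromDatabase) {
            if (!fieldNames.contains(property)) {
                missing.add(property);
            }
        }
        return missing;
    }
}
```

The test then asserts that 'missingProperties' returns an empty list, so renaming a property without updating the database script fails the build instead of throwing an exception at runtime.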
We couldn't completely eliminate this type of bug in this case because the business wanted to have the rules configurable on the fly via the database.
It perhaps seems quite obvious that we should look to write these types of tests to shorten the feedback loop and allow us to catch problems earlier than we otherwise would but it's easy to forget to do this so James' post provides a good reminder!
TDD: Only mock types you own
Liz recently posted about mock objects and the original 'mock roles, not objects' paper and one thing that stood out for me is the idea that we should only mock types that we own.
I think this is quite an important guideline to follow otherwise we can end up in a world of pain.
One area which seems particularly vulnerable to this type of thing is when it comes to testing code which interacts with Hibernate.
A common pattern that I've noticed is to create a mock for the 'EntityManager' and then verify that certain methods on it were called when we persist or load an object for example.
There are a couple of reasons why doing this isn't a great idea:
- We have no idea what the correct method calls are in the first place so we're just guessing based on looking through the Hibernate code and selecting the methods that we think make it work correctly.
- If the library code gets changed then our tests break even though functionally the code might still work
The suggestion in the paper when confronted with this situation is to put a wrapper around the library and then presumably test that the correct methods were called on the wrapper.
Programmers should not write mocks for fixed types, such as those defined by the runtime or external libraries. Instead they should write thin wrappers to implement the application abstractions in terms of the underlying infrastructure. Those wrappers will have been defined as part of a need-driven test.
I've never actually used that approach but I've found that with Hibernate in particular it makes much more sense to write functional tests which verify the expected behaviour of using the library.
With other libraries which perhaps don't have side effects like Hibernate does those tests would be closer to unit tests but the goal is still to test the result that we get from using the library rather than being concerned with the way that the library achieves that result.
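For comparison, the wrapper approach suggested in the paper might look something like the following sketch. All of the names here ('CustomerStore', 'InMemoryCustomerStore', 'RegistrationService') are hypothetical, and the production implementation that would delegate to Hibernate is deliberately left out.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Application-owned abstraction (hypothetical). A production implementation
// would delegate to the persistence library; it is not shown here.
interface CustomerStore {
    void save(String id, String name);
    Optional<String> findName(String id);
}

// Hand-written in-memory implementation that tests can use instead of the real wrapper.
class InMemoryCustomerStore implements CustomerStore {
    private final Map<String, String> namesById = new HashMap<>();

    @Override
    public void save(String id, String name) {
        namesById.put(id, name);
    }

    @Override
    public Optional<String> findName(String id) {
        return Optional.ofNullable(namesById.get(id));
    }
}

// The code under test depends only on the type we own, never on the library directly.
class RegistrationService {
    private final CustomerStore store;

    RegistrationService(CustomerStore store) {
        this.store = store;
    }

    // Registers a customer unless the id is already taken; returns whether it was saved.
    boolean register(String id, String name) {
        if (store.findName(id).isPresent()) {
            return false;
        }
        store.save(id, name);
        return true;
    }
}
```

The real wrapper would still need functional tests against the library, which is where the approach described above comes in.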
TDD: Combining the when and then steps
I've written before about my favoured approach of writing tests in such a way that they have clear 'Given/When/Then' sections and something which I come across quite frequently is tests where the latter steps have been combined into one method call which takes care of both of these.
An example of this which I came across recently was roughly like this:
@Test
public void shouldCalculatePercentageDifferences() {
    verifyPercentage(50, 100, 100);
    verifyPercentage(100, 100, 0);
    verifyPercentage(100, 50, -50);
}

private void verifyPercentage(int originalValue, int newValue, int expectedValue) {
    assertEquals(expectedValue, new PercentageCalculator().calculatePercentage(originalValue, newValue));
}
This code is certainly adhering to the DRY principle although it took us quite a while to work out what the different numbers being passed into 'verifyPercentage' were supposed to represent.
With this type of test I think it makes more sense to have a bit of duplication to make it easier for us to understand the test.
We changed this test to have its assertions inline and make use of the Hamcrest library to do those assertions:
@Test
public void shouldCalculatePercentageDifferences() {
    assertThat(new PercentageCalculator().calculatePercentage(50, 100), is(100));
    assertThat(new PercentageCalculator().calculatePercentage(100, 100), is(0));
    assertThat(new PercentageCalculator().calculatePercentage(100, 50), is(-50));
}
I think we may have also created a field to instantiate 'PercentageCalculator' so that we didn't have to instantiate that three times.
Although we end up writing more code than in the first example I don't think it's a problem because it's now easier to understand and we'll be able to resolve any failures more quickly than we were able to previously.
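The version with the calculator held in a field might have looked roughly like this. It is a dependency-free sketch: the 'PercentageCalculator' implementation is a guess inferred only from the three expected values, and the small 'expect' helper stands in for Hamcrest's 'assertThat' so that the example compiles on its own.

```java
// Hypothetical implementation, inferred only from the three expected values in the test.
class PercentageCalculator {
    int calculatePercentage(int originalValue, int newValue) {
        return (newValue - originalValue) * 100 / originalValue;
    }
}

class PercentageCalculatorCheck {
    // One shared instance held in a field instead of a new instance per assertion.
    private final PercentageCalculator calculator = new PercentageCalculator();

    void shouldCalculatePercentageDifferences() {
        expect(calculator.calculatePercentage(50, 100), 100);
        expect(calculator.calculatePercentage(100, 100), 0);
        expect(calculator.calculatePercentage(100, 50), -50);
    }

    // Stand-in for a Hamcrest assertion so the sketch needs no external libraries.
    private static void expect(int actual, int expected) {
        if (actual != expected) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }
}
```

The call under test and the expected value still sit next to each other on every line, which is what made the inline version easier to read.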
As Michael Feathers points out during Jay Fields' 'Beta Test' presentation we need to remember why we try and adhere to the DRY principle in the first place.
To paraphrase his comments:
In production code if we don't adhere to the DRY principle then we might make a change to a piece of code and we won't know if there's another place where we need to make a change as well.
In test code the tests always tell us where we need to make changes because the tests will break.
Testing End Points: Integration tests vs Contract tests
We recently changed the way that we test against our main integration point on the project I've been working on so that in our tests we retrieve the service object from our dependency injection container instead of 'newing' one up.
Our tests therefore went from looking like this:
[Test]
public void ShouldTestSomeService()
{
    var someService = new SomeService();
    // and so on
}
To something more like this:
[Test]
public void ShouldTestSomeService()
{
    var someService = UnityFactory.Container.Resolve<ISomeService>();
    // and so on
}
This actually happened as a side effect of another change we made to inject users into our system via our dependency injection container.
We have some 'authenticated services' which require the request to contain a SAML token for a valid user so it seemed to make sense to use the container in the tests instead of having to duplicate this piece of behaviour for every test.
We needed to add our fake authorised user into the container for our tests but apart from this the container being used is the same as the one being used in the production code.
Our tests are therefore now calling the services in a way which is much closer to the way that they are called in the code than was previously the case.
I think this is good as it was previously possible to have the tests working but then have a problem calling the services in production because something in the container wasn't configured properly.
The downside is that these tests now have more room for failure than they did previously and they are not just testing the end point which was their original purpose.
In a way what we have done is convert these tests from being contract tests to integration tests.
I like the new way but I'm not completely convinced that it's a better approach.
Test Doubles: My current approach
My colleague Sarah Taraporewalla recently wrote about her thoughts on test doubles (to use Gerard Meszaros' language) and it got me thinking about the approach I generally take in this area.
Stub objects
I use stubs mostly to control the output of depended on components of the system under test where we don't want to verify those outputs.
Most of the time I make use of the mocking library's ability to stub out method calls on these dependencies.
I find that it generally seems to require less effort to do this than to create hand written stubs although chatting to Dave about this he pointed out that one situation where it would make more sense to use a hand written stub is when stubbing out a clock/time provider. This is because there are likely to be multiple calls to it all over the place and most of the time you probably want it to return the same value anyway.
I actually quite like the fact that you need to specify all the stub calls that you want to make in each test – it helps you to see when you have too many dependencies and then hopefully you can do something about that.
On previous projects I worked on we decided the way to get around that problem was to define all the stub method calls in a setup method but that seems to lead to a world of pain later on when you forget that you've stubbed a method in the setup and now want to assert an expectation on it or (to a lesser extent) write a test which doesn't actually use the stub.
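The hand written clock stub that Dave suggested could be as small as the following sketch. The 'TimeProvider' interface, the 'FixedTimeProvider' stub and the 'Invoice' class that uses them are all hypothetical names chosen for illustration.

```java
import java.time.LocalDate;

// Hypothetical application-owned abstraction over "what is today's date?".
interface TimeProvider {
    LocalDate today();
}

// Hand-written stub: always returns the date it was created with.
class FixedTimeProvider implements TimeProvider {
    private final LocalDate fixedDate;

    FixedTimeProvider(LocalDate fixedDate) {
        this.fixedDate = fixedDate;
    }

    @Override
    public LocalDate today() {
        return fixedDate;
    }
}

// Example code under test that needs to know the current date.
class Invoice {
    private final LocalDate dueDate;

    Invoice(LocalDate dueDate) {
        this.dueDate = dueDate;
    }

    // An invoice is overdue once today is strictly after the due date.
    boolean isOverdue(TimeProvider time) {
        return time.today().isAfter(dueDate);
    }
}
```

Because the stub returns the same value however many times it is called, there is nothing to set up per call, which is the situation where a mocking library adds more noise than it removes.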
Fake objects
I often confuse fakes with stubs as they seem to be quite similar to each other in their intent – the difference as I understand it is that with a stub we are controlling the output of a dependency whereas a fake just sits there and lets interactions happen with it. The values passed in earlier calls to the fake may be returned in later calls to it.
The most common use of this pattern is to replace a real database with a fake one for testing although on a recent project we were making use of a hand written fake session store to avoid having to refer to the real session in our test code.
We might have one call to the 'SessionFake' to store a value and then if a retrieve call is made later on we would return the value that we previously stored.
The approach Sarah describes for stubbing repositories seems quite similar to this as well.
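A fake session store of the kind described above might look like this minimal sketch. 'SessionFake' is the name used on the project, but the 'SessionStore' interface and the method names are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical abstraction that production code would use instead of the real session.
interface SessionStore {
    void store(String key, Object value);
    Object retrieve(String key);
}

// Hand-written fake: a working in-memory implementation, with no canned answers.
class SessionFake implements SessionStore {
    private final Map<String, Object> values = new HashMap<>();

    @Override
    public void store(String key, Object value) {
        values.put(key, value);
    }

    // Returns whatever was stored earlier under this key, or null if nothing was.
    @Override
    public Object retrieve(String key) {
        return values.get(key);
    }
}
```

Unlike a stub, the test never tells the fake what to return; it only gets back what the code under test previously put in.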
Mock objects
I use mocks to replace depended on components of the system under test when I do care about the way that it is used i.e. we want to verify the behaviour of the dependencies.
If we see a mock object being created in a test then we should see a call to a 'verify' method later on to ensure that the expected methods are called on it.
I used to use these all over the place for just about every test where I wanted to control the way that a dependency acted until I realised how fragile and confusing that made the tests.
Now, after recently watching a presentation by Jay Fields, I try to ensure that I'm only setting up one expectation per test and use the other test double approaches for any other dependencies that need to be taken care of in that test.
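To show the shape of a test with a single expectation without depending on a particular mocking library, here is a hand-written mock in Java. The 'Mailer', 'MockMailer' and 'WelcomeService' names are hypothetical; with a mocking library the recording and the 'verify' step would be generated for us.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical dependency whose usage we care about.
interface Mailer {
    void send(String address, String message);
}

// Hand-written mock: records calls and lets the test verify them afterwards.
class MockMailer implements Mailer {
    private final List<String> calls = new ArrayList<>();

    @Override
    public void send(String address, String message) {
        calls.add(address + " <- " + message);
    }

    // The single expectation: exactly one send with these arguments.
    void verifySent(String address, String message) {
        String expected = address + " <- " + message;
        if (!calls.equals(List.of(expected))) {
            throw new AssertionError("expected exactly [" + expected + "] but got " + calls);
        }
    }
}

// Example system under test.
class WelcomeService {
    private final Mailer mailer;

    WelcomeService(Mailer mailer) {
        this.mailer = mailer;
    }

    void welcome(String address) {
        mailer.send(address, "Welcome!");
    }
}
```

A test creates the 'MockMailer', calls 'welcome' and finishes with one 'verifySent' call; any other dependencies would be stubs or fakes with nothing verified on them.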
Dummy objects
Most of the time when I pass dummy values into tests they tend to be strings and I prefer to pass in a value of 'irrelevantValue' rather than just passing in a null which may lead to difficult to locate Null Pointer Exceptions further down the line if the value which we thought was just a dummy starts being used.
We are generally only passing in these dummy values to satisfy the requirements of the system under test which may require values to be entered even if the particular piece of functionality that we are testing doesn't make use of them.
Overall
I think my current approach to testing leans more towards mockist rather than classicist although I think I am probably moving more towards the middle as I see the problems we can run into with over mocking.
With test doubles the most important consideration in my current approach is minimising the effort required to create them, but I'm sure that will change given a different context. With all the test doubles I generally try and use test data builders where it's not overkill.
Book Club: The Readability of Tests – Growing Object Oriented Software (Steve Freeman/Nat Pryce)
Our technical book club this week focused on 'The Readability of Tests' chapter from Steve Freeman & Nat Pryce's upcoming book 'Growing Object Oriented Software, Guided by Tests'.
I've been reading through some of the other chapters online and I thought this would be an interesting chapter to talk about as people seem to have different opinions on how DRY tests should be, how we build test data, how we name tests and so on.
These were some of my thoughts and our discussion on the chapter:
- I found it interesting that there wasn't any mention of the BDD style of test naming whereby the name of the test begins with 'should…'. I've been using this style of naming for about 2 years now as I find it useful for allowing us to question whether or not the test is valid. There are equally arguments against using the word 'should' as it's not particularly assertive and perhaps we ought to be more certain about what our tests are asserting.
Recently I have started to move more towards Jay Fields' idea that test names are just comments and if we write tests to be really clear and readable then the test name becomes redundant.
- The chapter talks about the order in which the authors write their tests, the approach being to try and start with the assertion first and then write the execution and setup steps. My current approach is to write the execution step first and then build up the setup and expectations almost simultaneously. I've never been able to quite get the hang of writing the test bottom up but it's something I might experiment with again.
- Refactoring tests is something I've written about previously and my current thinking is that our aim shouldn't be to remove absolutely all duplication in tests but instead remove it to a stage where we can still easily understand the test when it fails. This seems to fit in with the authors' idea of 'refactoring but not too hard'.
I am currently following the idea of having three distinct areas in my tests (Given, When, Then) with each section separated by an empty line. I find writing them in this style makes it easier for me to quickly work out why a test is failing.
I was recently watching Jay Fields' presentation from SpeakerConf and Michael Feathers makes an interesting comment that we need to keep in mind that the reason for removing duplication in code is so that when we need to make changes we know where to do that. In test code the test failing will tell us where we need to make changes so the need to remove duplication to do this is less.
I'm still heavily in favour of trading duplication for better readability when it comes to writing tests.
- The idea of keeping consistency in tests is an important one although I think it's difficult to keep this consistency across the whole suite of tests. Certainly within a single test fixture it should be possible though.
One example of something which doesn't follow this approach is the 'ExpectedException' annotation in JUnit/NUnit which goes against the style of pretty much all other tests.
- When it comes to setting up test data I think it's pretty much a given that test data builders are a really good way to help remove noise and duplication from our tests. Other patterns such as object mother can be useful but they don't seem to work as well when you have multiple different ways that you want to set up your data for tests.
- There's no specific mention of 'Setup' and 'Teardown' methods in the chapter but this is another area which I think has an important impact on readability.
I'm not yet completely against tear down methods for integration style tests but I've seen a lot of pain caused by putting mocks in setup methods and even just having the setup method means that you have to go up and down the test fixture just to work out what's going on. I prefer to try and keep all the context needed for a test in one place.
- I found the section about the way that we name literals/variables in tests to be particularly interesting as this is a discussion I've been having with a couple of colleagues recently.
I find it useful to state why that variable is important or not important for this particular test (i.e. give it context) so that someone can easily understand what's going on when they look at the test. For example if we have a variable in a test that doesn't affect the outcome then it might be useful to name it 'stubFoo' or 'irrelevantFoo' or something similar.
I've previously been against the idea of naming dependencies we're mocking/stubbing as 'mockRepository' or 'stubRepository' but I've been trying this out a bit this week and it exposed some mistakes I'd made which I don't think I would have seen otherwise.
- Another idea which I quite liked is the idea of only testing one feature set per test.
I've certainly written a lot of tests which break this rule and you really suffer when you need to make a change later on.
Jay Fields also applies this rule to mocks whereby you can only have one expectation per test but as many stubs as you want.
I've been trying out both these approaches this week and although there's probably more code overall as a result of writing more tests, each of the tests feels much more succinct and understandable.
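As a reminder of what the test data builders mentioned in the discussion look like, here is a minimal sketch in Java. 'Foo', its two fields and the builder's method names are hypothetical; the point is that a test only spells out the values it cares about and everything else gets an obviously irrelevant default.

```java
// Hypothetical object that tests need instances of.
class Foo {
    final String bar;
    final String baz;

    Foo(String bar, String baz) {
        this.bar = bar;
        this.baz = baz;
    }
}

// Test data builder: sensible defaults, overridden only where a test cares.
class FooBuilder {
    private String bar = "irrelevantBar";
    private String baz = "irrelevantBaz";

    static FooBuilder aFoo() {
        return new FooBuilder();
    }

    FooBuilder withBar(String bar) {
        this.bar = bar;
        return this;
    }

    FooBuilder withBaz(String baz) {
        this.baz = baz;
        return this;
    }

    Foo build() {
        return new Foo(bar, baz);
    }
}
```

A test that only depends on 'bar' would write 'FooBuilder.aFoo().withBar("bar").build()', which keeps the important data visible and the noise out, whereas an object mother would need a new factory method for every combination.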