Archive for January, 2009
Coding Dojo #8: Isola
Our latest coding dojo involved writing the board game Isola in Java.
The Format
We used the Randori approach again with around 8 or 9 people participating for the majority of the session, our biggest turnout yet. I think the majority of people had the opportunity to drive a couple of times over the evening.
We had the pair driving at the front of the room and everyone else further back to stop the tendency of observers to whiteboard stuff.
What We Learnt
- This time we tried to get a useable front end for the game working as quickly as possible, which was a bit different to our normal approach where we tend to focus more heavily on the modeling side of the problem. We therefore decided to do the simplest thing that could possibly work and hardcoded the representation of the board as a string. Surprisingly (to me at least) this proved adequate for nearly the whole time we were coding, and it was only towards the end that we felt we needed to put in a more robust data structure. Certainly a lesson for me in the value of not over-engineering a solution.
- This led to a discussion around what sort of situation this would represent on a real project. The closest we came up with was that of using an in memory repository early on until a real database is actually needed. We need to trade off the complexity we are adding in by doing this integration versus the gains we get from integrating early.
- A cool approach which Nick showed us was to always implement code inline, make the test pass and then extract it into methods or classes as part of a refactoring step. This is similar to the idea of sprouting inner classes which Pat Kua wrote about a couple of months ago. This takes the idea of taking small steps even further which can only be a good thing.
- It was interesting to see that without a domain expert we ended up changing the ubiquitous language in the code quite frequently but never really came up with one that made sense to everyone – when talking about the code we ended up with people translating between their own understandings of various concepts.
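As a rough illustration of the board-as-a-string idea (this is not the code we wrote in the dojo), something like the following might be enough to get a front end going. The class name, the symbols and the 3x3 grid are all assumptions, and the rules are simplified.

```java
// Sketch only: the class name, symbols and grid size are assumptions, not the dojo's code.
final class StringBoard {
    // '.' = free square, 'x' = removed square, 'A'/'B' = players; rows are separated by '\n'.
    private final String squares;
    private final int width;

    StringBoard(String squares) {
        this.squares = squares;
        int firstNewline = squares.indexOf('\n');
        this.width = firstNewline < 0 ? squares.length() : firstNewline;
    }

    private int indexOf(int row, int col) {
        return row * (width + 1) + col; // +1 skips the newline at the end of each row
    }

    char at(int row, int col) {
        return squares.charAt(indexOf(row, col));
    }

    // Returns a new board with the square at (row, col) replaced by the given symbol.
    StringBoard with(int row, int col, char symbol) {
        StringBuilder copy = new StringBuilder(squares);
        copy.setCharAt(indexOf(row, col), symbol);
        return new StringBoard(copy.toString());
    }

    @Override
    public String toString() {
        return squares;
    }

    public static void main(String[] args) {
        StringBoard board = new StringBoard("A..\n...\n..B");
        // Move player A one square to the right, then remove a square (here, the one it left).
        StringBoard moved = board.with(0, 1, 'A').with(0, 0, 'x');
        System.out.println(moved);
    }
}
```

Printing the string is the whole "front end" here, which is roughly why the representation can survive for so long before a richer data structure is needed.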
For next time
- The plan for next week is to continue working on Isola – it turned out to be quite an interesting game to try and model, simple enough that we could understand the rules quickly but complicated enough that it takes a while to implement.
TDD: Test DRYness
I had a discussion recently with Fabio about DRYness in our tests and how we don't tend to adhere to this principle as often in test code as in production code.
I think certainly some of the reason for this is that we don't take as much care of our test code as we do production code but for me at least some of it is down to the fact that if we make our tests too DRY then they become very difficult to read and perhaps more importantly, very difficult to debug when there is a failure.
There seem to be different types of DRYness that weave themselves into our test code which result in our code becoming more difficult to read.
Suboptimal DRYness
Setup method
Putting code into a setup method is the most common way to reduce duplication in our tests but I don't think this is necessarily the best way to do it.
The problem is that we end up increasing the context required to understand what a test does such that the reader needs to read/scroll around the test class a lot more to work out what is going on. This problem becomes especially obvious when we put mock expectations into our setup method.
One of those expectations becomes unnecessary in one of our tests and not only is it not obvious why the test has failed but we also have a bit of a refactoring job to move the expectations out and only into the tests that rely on them.
Helper methods with more than one responsibility
Extracting repeated code into helper methods is good practice but going too far and putting too much code into these methods defeats the purpose.
One of the most common ways that this is violated is when we have methods which create the object under test but also define some expectations on that object's dependencies in the same method.
This violates the idea of having intention revealing method names as well as making it difficult to identify the reason for test failures when they happen.
Assertions
I tend to follow the Arrange, Act, Assert approach to designing tests whereby the last section of the test asserts whether or not the code under test acted as expected.
I'm not yet convinced that following the DRY approach is beneficial here because it means that you need to do more work to understand why a test is failing.
On the other hand if assertions are pulled out into an intention revealing method then the gain in readability might level out the extra time it takes to click through to a failing assertion.
My favourite approach to test assertions is to use behavioral style assertions
e.g.
stringValue.ShouldEqual("someString")
…and I don't think applying the DRY principle here, if we have a lot of similar assertions, adds a lot of value.
DRY and expressive
I'm not against DRYness in tests, I think it's a good thing as long as we go about it in a way that still keeps the code expressive.
Test data setup
The setup and use of test data is certainly an area where we don't gain an awful lot by having duplication in our tests. If anything having duplication merely leads to clutter and doesn't make the tests any easier to read.
I have found the builder pattern to be very useful for creating clutter free test data where you only specifically define the data that you care about for your test and default the rest.
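A minimal sketch of what such a test data builder might look like in Java. Customer and its fields are hypothetical, and the default values are arbitrary.

```java
// Sketch only: Customer and its fields are hypothetical test data, not from a real project.
final class Customer {
    final String name;
    final String email;
    final int age;

    Customer(String name, String email, int age) {
        this.name = name;
        this.email = email;
        this.age = age;
    }
}

final class CustomerBuilder {
    // Sensible defaults, so a test only mentions the values it actually cares about.
    private String name = "defaultName";
    private String email = "default@example.com";
    private int age = 30;

    CustomerBuilder name(String value) {
        name = value;
        return this;
    }

    CustomerBuilder email(String value) {
        email = value;
        return this;
    }

    CustomerBuilder age(int value) {
        age = value;
        return this;
    }

    Customer build() {
        return new Customer(name, email, age);
    }

    public static void main(String[] args) {
        // A test about under-age customers only needs to state the age.
        Customer customer = new CustomerBuilder().age(17).build();
        System.out.println(customer.name + " " + customer.age);
    }
}
```

The test reads as "a customer who is 17" and nothing else, which is the clutter-free effect I'm after.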
Single responsibility helper methods
If we decide that extracting code into a helper method increases the readability of a test then the key is to ensure that these helper methods only do one thing, otherwise it becomes much more difficult to understand what's going on.
My current thinking is that we should aim for having only one statement per method where possible so that we can skim through these helper methods quickly without having to spend too much time working out what's going on.
An idea Dan North talks about (and which is nicely illustrated in this blog post) is putting these helper methods just before the test which makes use of them. I haven't tried this out yet but it seems like a neat way of making the code more DRY and more readable.
In Summary
I've noticed recently that I don't tend to read test names as often as I used to so I'm looking to the test code to be expressive enough that I can quickly understand what is going on just from scanning the test.
Keeping the code as simple as possible, extracting methods when it makes sense and removing clutter are some useful steps on the way to achieving this.
TDD: Design tests for failure
As with most code, tests are read many more times than they are written and as the majority of the time the reason for reading them is to identify a test failure I think it makes sense that we should be designing our tests with failure in mind.
Several ideas come to mind when thinking about ways to write/design our tests so that when we do have to read them our task is made easier.
Keep tests data independent
The worst failures for me are the ones where a test fails and when we investigate the cause it turns out that it only failed because some data it relied on changed.
This tends to be the case particularly when we are writing boundary tests against external services where the data is prone to change.
In these situations we need to try and keep our tests general enough that they don't give us these false failures, but also specific enough that they aren't completely worthless.
As an example, when testing XML based services it makes more sense to check that certain elements exist in the document rather than checking that these elements have certain values. The latter approach leads to brittle, difficult to maintain tests while the former leads to tests that are more independent and whose failures are actually a cause for concern.
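The "check that the element exists" style might be sketched like this in Java using the XML and XPath classes in the standard library. The response document, the element names and the hasElement helper are all made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Sketch only: hasElement and the sample response are hypothetical.
final class XmlStructureCheck {
    // True when at least one node matches the XPath expression, whatever its value is.
    static boolean hasElement(String xml, String xpath) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList matches = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(xpath, document, XPathConstants.NODESET);
            return matches.getLength() > 0;
        } catch (Exception e) {
            throw new IllegalStateException("Could not evaluate " + xpath, e);
        }
    }

    public static void main(String[] args) {
        String response = "<quote><symbol>ABC</symbol><price>12.34</price></quote>";
        // The price can change from day to day without breaking this check.
        System.out.println(hasElement(response, "/quote/price"));  // true
        System.out.println(hasElement(response, "/quote/volume")); // false
    }
}
```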
Consistent Structure
Jay Fields touched on this in a post he wrote a couple of months ago about having a ubiquitous assertion syntax for every test.
That way when we look at a failing test we know what to expect and we can get down to fixing the test rather than trying to work out how exactly it failed.
We have used the Arrange, Act, Assert approach on the last couple of projects I've worked on which has worked quite well for dividing the tests into their three main parts. We typically leave empty lines between the different sections or add a comment explaining what each section is.
The nice thing about this approach when you get it right is that you don't even have to read the test name – the test reads like a specification and explains for itself what is going on.
My personal preference for the Assert step is that I should be able to work out why the test is failing from within the test method without having to click through to another method in the test class. There is a debate about whether or not that approach is DRY, but that's a discussion for another post!
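A small sketch of the Arrange, Act, Assert layout described above, written as plain Java without a test framework so that it stands on its own. ShoppingBasket is a hypothetical class under test.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch only: ShoppingBasket is a hypothetical class under test.
final class ShoppingBasketTest {
    static final class ShoppingBasket {
        private final List<Integer> pricesInCents = new ArrayList<>();

        void add(int priceInCents) {
            pricesInCents.add(priceInCents);
        }

        int total() {
            int sum = 0;
            for (int price : pricesInCents) {
                sum += price;
            }
            return sum;
        }
    }

    static void totalIsTheSumOfAllItemPrices() {
        // Arrange
        ShoppingBasket basket = new ShoppingBasket();
        basket.add(250);
        basket.add(100);

        // Act
        int total = basket.total();

        // Assert: kept inline so the reason for a failure is visible in the test itself
        if (total != 350) {
            throw new AssertionError("expected total of 350 but was " + total);
        }
    }

    public static void main(String[] args) {
        totalIsTheSumOfAllItemPrices();
        System.out.println("passed");
    }
}
```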
Avoid false failures
Failing because of test reliance on data is one example of a false failure but there are other ways that a test failure can be quite misleading as to the actual reason that it failed.
Null Reference or Null Pointer Exceptions are the chief culprits when it comes to this – a test will seemingly randomly start throwing one of these exceptions either on an assertion or in the actual code.
With the former we should shore the test up by testing something more general further up the test, so that we get a more meaningful failure the next time.
With the latter this usually happens because we added in some code without changing the tests first. I always get bitten when I disrespect Uncle Bob's Three Laws.
- Write no production code except to pass a failing test.
- Write only enough of a test to demonstrate a failure
- Write only enough production code to pass the test
Sometimes we get false failures due to not having enough data set up on our objects. Depending on the situation we might have a look at the test to see whether it is testing too much and the class has taken on more responsibility.
If it turns out all is fine then the builder pattern is a really good way for ensuring we don't run into this problem again.
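To illustrate the earlier point about shoring a test up with a more general check, here is a rough Java sketch. The lookup, the names and the assertNotNull helper are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: the lookup and the helper names are hypothetical.
final class GeneralCheckFirst {
    static void assertNotNull(String description, Object value) {
        if (value == null) {
            throw new AssertionError(description + " should not be null");
        }
    }

    static void emailIsStoredInLowerCase(Map<String, String> emailsByName, String name) {
        String email = emailsByName.get(name); // null when the name is unknown

        // General check first, so a missing entry fails with a readable message
        // instead of a NullPointerException on the more specific check below.
        assertNotNull("email for " + name, email);
        if (!email.equals(email.toLowerCase())) {
            throw new AssertionError("expected lower case email but was " + email);
        }
    }

    public static void main(String[] args) {
        Map<String, String> emails = new HashMap<>();
        emails.put("alice", "alice@example.com");

        emailIsStoredInLowerCase(emails, "alice");
        try {
            emailIsStoredInLowerCase(emails, "bob");
        } catch (AssertionError e) {
            System.out.println(e.getMessage()); // email for bob should not be null
        }
    }
}
```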
Learning alone or Learning together
One of the things that I have been curious about since we started running coding dojos is whether people learn more effectively alone or when learning as part of a group.
Not that I think they are mutually exclusive, I think a combination of both is probably the way to go depending on what it is we are trying to learn and the way that we're trying to learn it.
A new language
I think learning a new programming language is one of the times when learning on your own makes the most sense.
That way we have the freedom to try things out and understand how everything fits together without frustrating other people who may understand more than we do.
There comes a certain stage though where we have questions that we can't answer and then I think it is quite useful to be able to work through the problem with someone else and get their input.
I'm currently learning F# and although I've been learning it alone it has still proved to be useful to talk through some of the ideas I've come across with colleagues and get their input on useful approaches to take for my future learning.
Technical Book Club
We recently started running a weekly Technical book club in the ThoughtWorks Sydney office. The first book we're reading is Eric Evans' Domain Driven Design.
The idea is that people read part of a chapter alone and then we come together once a week to discuss that area of reading. The idea of discussing what you read is one which is encouraged by Andy Hunt in Pragmatic Thinking and Learning, and I have certainly got more from reading the book and discussing it with my colleagues than I did from just reading it alone about 6 months ago.
We've been going through a few chapters over the last couple of weeks which we've found quite difficult to discuss but actually showing some code of how to implement certain ideas has made the sessions much more interesting.
For example last week two of the members of the group were able to show some examples of using the specification pattern which brought the ideas to life a lot more for me than just reading about them in the book.
I've found taking part in a book club brings a renewed focus when reading and others in the book club are able to explain areas of the text that I didn't understand when I read it.
Coding Dojos
Coding Dojos are the ultimate when it comes to learning together. The format that we have used the most is to have one pair driving the code, rotating a new person in every seven minutes.
We started out solving problems in an area where everyone had a high level of proficiency, but for the last two weeks we have tried playing around with libraries that not everyone had a great deal of experience with.
While there is learning to be gained from the latter, people seemed less keen to get involved at the keyboard on these sessions so I don't know if as much was gained from these sessions as could have been.
In previous sessions though I think the dojo approach has been really good for helping us to 'practice what we preach' in terms of following good practices while at the keyboard. Having 4 or 5 other people watching you tends to encourage that!
I'm sure we haven't found all the learning areas where coding dojos can be useful, but I'm veering towards thinking the maximum is gained when the majority of people have some experience with the technology being used.
Alan Dean looks to be taking the idea even further with his Open Space Coding Days which start next weekend. It will be interesting to see the feedback on the learning people were able to achieve in this environment.
jQuery: Approaches to testing
We've been doing a bit of work with jQuery and true to our TDD roots we've been trying to work out the best way to test drive our coding in this area.
There seem to be 3 main ways that you can go about doing this, regardless of the testing framework you choose to use. We are using screw-unit for our javascript testing.
Mock everything out
The idea here is that we mock out all calls made to jQuery functions and then we assert that the expected calls were made in our test.
Taking this approach means we are able to reduce the dependencies in our tests and they run very quickly.
The problem is that a lot of assertions become assertions checking that certain operations were called on the DOM since jQuery makes a lot of these type of calls in its code. The tests therefore end up being quite long and difficult to understand unless you were the one who initially wrote them.
We also effectively end up testing how the framework interacts with the DOM rather than testing our own code with this approach.
Don't mock anything
The opposite approach is to just test directly against jQuery and then do assertions against the part of the DOM affected by the javascript code we are testing.
The problem here is that if you want to test against plugins which make ajax requests or carry out animation effects then the tests become dependent on how long these calls take and we need to find a way to block the test until those calls return which is quite difficult!
Only stub certain calls
The happy medium is that we test directly against the jQuery library but stub out ajax requests and animation effects. Our test assertions are against the state of the DOM to check that it was changed in the way that we expected.
This approach allows us to test our javascript code in terms of its behaviour without testing the internals of how jQuery works.
The wiring up of events to different controls on the page is done in our test but the actual logic that happens when these events are fired is in our js file under test.
We currently favour this last approach as it seems to give us the best of both worlds, so to speak. It would be interesting to hear how other people are going about Javascript testing.
Coding Dojo #7: Retlang/Hamcrest .NET attempt
We ran a sort of coding dojo/playing around session which started with us looking at the .NET concurrency library, Retlang, and ended with an attempt to write Hamcrest style assertions in C#.
The Format
We had the same setup as for our normal coding dojos with two people at the keyboard although we didn't rotate as aggressively as normal.
What We Learnt
- We started off having a look at a concurrency problem in CruiseControl.NET which Dave Cameron recently fixed. The intention was to try and take one of the cases of multi-threading and replace it with a message based approach using the Retlang library.
- As I understand it, you can have any number of subscribers subscribe to any channel using Retlang which is different to the Erlang approach whereby only one subscriber would be allowed. A bit of experimentation also suggests that subscribers need to be subscribed to a channel at the time a message is published in order to receive it.
- We started off with an initial test case but got sidetracked in trying to work out how to make the assertion syntax a bit nicer. The original assertion read like the examples on the website in that we check the state of a ManualResetEvent so that we know whether or not a message was received by a subscriber.
The assertion read like this:
var gotMessage = new ManualResetEvent(false);
...
Assert.IsTrue(gotMessage.WaitOne(2000, false));
We initially worked this to read like so:
AssertThat(gotMessage, HasTrippedWithin(2.Seconds()));
AssertThat and HasTrippedWithin were local methods and Seconds was an extension method. It's pretty nice but the problem is that we can't reuse this code easily in other test classes and keep the readability.
C# doesn't have Java's ability to import static methods so we would need to reference the class which the AssertThat method and HasTrippedWithin methods reside on directly either by having every test case extend it or by explicitly referencing it when we use the methods.
- A bit more playing around with extension methods and trying to work out a good way to write Matchers led us to the following syntax:
gotMessage.Should(Be.TrippedWithin(2.Seconds()));
We also considered putting a Verify extension method on object so that a test case could have a series of different matchers to be evaluated.
this.Verify( gotMessage.Is().TrippedWithin(2.Seconds()) );
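We need to use the 'this' keyword in order to access an extension method defined on object. I didn't understand why at first, as I thought classes implicitly extended object, but the compiler only considers extension methods when a call is written against an explicit receiver (expression.Method(...)), so the following unqualified call does not compile: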
Verify(gotMessage.Is().TrippedWithin(2.Seconds()));
- I think the way that our tests fail and the way that they report this failure is vital for getting the most out of TDD so I'd be interested to know of any ideas people have with regards to this. The thing that makes Hamcrest so good is not just the fluent syntax but the error messages that you receive when tests fail – it's very clear where the problem lies when a test fails, there is rarely a need to get out the debugger in complete confusion as to why the test failed.
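For comparison, here is a rough Java sketch of the matcher style we were reaching for, where the failure message is built from the matcher's own description. This is a hand-rolled illustration in the spirit of Hamcrest rather than Hamcrest's actual API, and a CountDownLatch stands in for the ManualResetEvent.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Sketch only: a hand-rolled matcher in the spirit of Hamcrest, not Hamcrest's own API.
final class LatchMatchers {
    interface Matcher<T> {
        boolean matches(T actual);

        String description();
    }

    // Fails with a message built from the matcher's description.
    static <T> void assertThat(T actual, Matcher<T> matcher) {
        if (!matcher.matches(actual)) {
            throw new AssertionError("Expected: " + matcher.description() + " but it did not");
        }
    }

    // Rough counterpart of HasTrippedWithin(2.Seconds()) from the C# version.
    static Matcher<CountDownLatch> hasTrippedWithin(final long millis) {
        return new Matcher<CountDownLatch>() {
            @Override
            public boolean matches(CountDownLatch latch) {
                try {
                    return latch.await(millis, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }

            @Override
            public String description() {
                return "latch to trip within " + millis + " ms";
            }
        };
    }

    public static void main(String[] args) {
        CountDownLatch gotMessage = new CountDownLatch(1);
        gotMessage.countDown(); // simulate a subscriber receiving the message
        assertThat(gotMessage, hasTrippedWithin(100));

        CountDownLatch neverTripped = new CountDownLatch(1);
        try {
            assertThat(neverTripped, hasTrippedWithin(100));
        } catch (AssertionError e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In Java these methods can be pulled into any test class with a static import, which is exactly the reuse that was awkward for us in C#.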
Next Time
- I think we may make a return to coding some OO problems again next week – I'm not convinced that we are getting the most out of the Dojo sessions learning something which is new to the majority of people taking part.
C#: Builder pattern still useful for test data
I had thought that the new object initializer syntax in C# 3.0 meant that the builder pattern was no longer necessary, but some recent refactoring efforts have made me believe otherwise.
My original thought was that the builder pattern was really useful for providing a nicely chained way of creating objects, but after a bit of discussion with some colleagues I have come across three different reasons why we might want to use the builder pattern to create test data:
- It creates a nice to read fluent interface describing the object being created. This argument holds more for Java rather than C# where we now have object initializers.
- Domain objects are a bit complicated to create – encapsulate this logic in the builder.
- We want to default non-null data on some of the fields in our object. If we don't explicitly set a value for a reference-type property (a string, for example) in C# it defaults to null.
Even with the object initializer syntax we can still end up having to specify extra data that we don't really care about in our test. The following is not uncommon:
new Foo {Bar = "bar", Baz = "baz", Bling = "bling"};
public class Foo
{
    public string Bar { get; set; }
    public string Baz { get; set; }
    public string Bling { get; set; }
}
Let's say we only care about Bar for this test, but Baz and Bling are both being used in our code so we end up with a Null Reference Exception if we don't set values for them. We can quickly end up having this redundant data being repeated across all our tests.
In steps the builder pattern!
new FooBuilder().Bar("bar").Build();
public class FooBuilder
{
    private string bar = "defaultBar";
    private string baz = "defaultBaz";
    private string bling = "defaultBling";

    public FooBuilder Bar(string value)
    {
        bar = value;
        return this;
    }

    public FooBuilder Baz(string value)
    {
        baz = value;
        return this;
    }

    public FooBuilder Bling(string value)
    {
        bling = value;
        return this;
    }

    public Foo Build()
    {
        return new Foo { Bar = bar, Baz = baz, Bling = bling };
    }
}
It takes a bit more code to set up but every time we use the builder it saves us typing in extra data that we don't need.
It would be even better if we didn't have to call that 'Build' method, and we can get around this by using an implicit conversion operator. C# allows a user-defined conversion to be declared on either the source type or the target type, so it can live on FooBuilder (public static implicit operator Foo(FooBuilder builder)) rather than on Foo.
That means we don't have to change a production code class just for test purposes. The conversion only applies where the target type is stated explicitly (it won't happen with 'var'), so I'd keep 'Build' in there as well.
Coding: Contextual learning
While reading my colleague's notes on a brown bag session on pair programming she gave I was reminded of my belief that we learn much more effectively when we are learning in a practical environment.
The bit that interested me was this bit regarding onboarding:
On board new team members to bring them up to speed on the overall goal and design, so you do not need to repeat basic details when you work with them on a story.
It's fairly normal for the Tech Lead to give new starters on a project this kind of overview and although it is useful as a starting point, nearly everyone I have worked with is keen to see how these ideas are implemented in the code.
I think there is still a place for the presentation/context free style of teaching but we should look for opportunities to get people into the context of what they are being taught as often as possible.
Pair Programming
I've written about this quite a few times in the past so I don't want to labour the point, but this is by far the most effective learning approach that I have seen so far.
It works especially well when at least one person is skillful in the technology currently being used. Obviously it works even better if both people know it but it is useful to have one person who has the ability to teach the other.
Being shown how to do something and then trying it out yourself is much more effective than having someone talk about it at a more abstract level and then trying to apply what they have taught.
For example, we have been discussing recently how to write better Javascript/jQuery code and all the things talked about make sense but it didn't really click in my head until I got the chance to work with some colleagues who were really skilled in this area. I'm certainly not an expert but having this opportunity has given me the chance to improve more effectively.
Coding Dojo Style Learning
We have been holding some Coding Dojo sessions over the last couple of months in the ThoughtWorks Sydney office and I think they are really useful for helping to spread best practices.
For example, one of the key ideas of TDD is that we should take small steps, at all times making a change and then running the tests to make sure we didn't break something. In a pair programming session it is quite easy to ignore this guideline and then suffer the consequences, but with 5 or 6 other people watching you make that mistake it is much less likely to happen!
I haven't seen this approach used on a project yet, but Alan Dean has been posting on Twitter recently about using a Coding Dojo for a refactoring session on the code base he is currently working on.
This certainly seems like a more effective approach than talking about how the code base needs refactoring in a white boarding session without being able to show exactly what is meant.
And if you must whiteboard…
Although I think these other two approaches are more effective, the whiteboard is still an effective tool as long as we use it in an interactive way.
If it's just one person drawing stuff out and others are not having any input then from my experience it's not going to be an effective way to learn.
A far more useful approach is for the first person to start drawing out their ideas and the others can then add to this to check that their understanding is correct.
Although this is a useful exercise, it certainly makes sense to then go and try out those ideas in the code to ensure that you actually did understand what was being discussed.
Overall
I think the thing with all these approaches is that they are designed for small groups – with pair programming just two people for example!
I'm not sure how we could get the same effectiveness of learning with a bigger group – certainly the university style of lecturing is not the answer.
Whatever approach we take, keeping people involved and keeping it contextual is the best way to go.
Cruise: Pipelining for fast visual feedback
One of the cool features in build servers like Cruise and TeamCity is the ability to create build pipelines.
I have done a bit of work using this feature in previous projects but the key driver for doing so there was to create a chain of producers/consumers (producing and consuming artifacts) eventually resulting in a manual step to put the application into a testing environment.
While this is certainly a good reason to create a build pipeline, a colleague pointed out an equally useful way of using this feature to split the build into separate steps pipelined together.
By doing this we get a nice graphical display from the Cruise dashboard which allows us to see where the build is failing, therefore pointing out where we need to direct our focus.

One way to use the pipelines is to work out the distinct potential areas where you would want to signal that something needs to be investigated and then make each of these targets a separate build target.
For example we could set it up like so:
No dependency build =>
Services build =>
End to End smoke test build =>
Full build
Benefits of this approach
The benefit of this approach is that it helps to create more confidence in the build process.
When we have a long running build it is easy to get into a state where it is failing after 3/4 of the build has run and all we get is the red failed build to indicate something has gone wrong. We can drill down to find out where the failure is but it's not as obvious.
The approach we have taken to checking in is that it is fine to do as long as the first stage of the build is green. This has worked reasonably well so far and failure further down stream has been fixed relatively quickly.
Things to watch for
We have set up the final step of the build to be a manual step due to the fact that it takes quite a long time to run and we've been unable to get a dedicated machine to run an agent on. Ideally we would have it running constantly on its own agent.
This isn't run as frequently as when we had it running automatically and I guess the danger is that we are pushing problems further down stream rather than catching them early. Hopefully this issue will be solved if we can get a dedicated agent running this build.
We're still looking for ways to improve our build process but this is what's working reasonably well for us at the moment. It would be interesting to hear what others are doing.
F# vs C# vs Java: Functional Collection Parameters
I wrote a post about a month ago on using functional collection parameters in C# and over the weekend Fabio and I decided to try and contrast the way you would do this in Java, C# and then F# just for fun.
Map
Map evaluates a higher-order function on all the elements in a collection and then returns a new collection containing the results of the function evaluation.
Given the numbers 1-5, return the square of each number
Java
int[] numbers = { 1, 2, 3, 4, 5 };
for (int number : numbers) {
    System.out.println(f(number));
}

private int f(int value) {
    return value * value;
}
C#
new List<int> (new[] {1, 2, 3, 4, 5}.Select(x => x*x)).ForEach(Console.WriteLine);
F#
[1..5] |> List.map (fun x -> x*x) |> List.iter (printfn "%d");;
Filter
Filter applies a predicate against all of the elements in a collection and then returns a collection of elements which matched the predicate.
Given the numbers 1-5, print out only the numbers greater than 3:
Java
int[] numbers = { 1, 2, 3, 4, 5 };
for (int number : numbers) {
    f(number);
}

private void f(int value) {
    if (value > 3) {
        System.out.println(value);
    }
}
C#
new List<int> { 1,2,3,4,5}.FindAll(x => x > 3).ForEach(Console.WriteLine);
F#
[1..5] |> List.filter (fun x -> x > 3) |> List.iter (printfn "%d");;
Reduce
Reduce applies a higher-order function against all the elements in a collection and then returns a single result.
Given a list of numbers 1-5, add them all together and print out the answer
Java
int sum = 0;
int[] numbers = { 1, 2, 3, 4, 5 };
for (int number : numbers) {
    sum += number;
}
System.out.println(sum);
C#
Console.WriteLine(new[] {1, 2, 3, 4, 5}.Aggregate(0, (accumulator, x) => accumulator + x));
F#
[1..5] |> List.fold (+) 0 |> printfn "%d";;
In Summary
I was surprised that we could achieve these results in relatively few lines of Java. The C# and F# versions are still more concise but the Java version isn't too bad. The Apache Commons Library has a class which allows you to write these in a functional way but the need to use anonymous inner classes means it's not as clean as what you can achieve in C# and F#.
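To give a feel for the extra noise, here is a sketch of Map written in a functional style in Java with an anonymous inner class. Mapper and map are hand-rolled for illustration and are not the Apache Commons API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch only: Mapper and map are hand-rolled for illustration, not the Apache Commons API.
final class AnonymousClassMap {
    interface Mapper<A, B> {
        B apply(A value);
    }

    // Applies the mapper to every item and collects the results in a new list.
    static <A, B> List<B> map(List<A> items, Mapper<A, B> mapper) {
        List<B> results = new ArrayList<B>();
        for (A item : items) {
            results.add(mapper.apply(item));
        }
        return results;
    }

    public static void main(String[] args) {
        List<Integer> squares = map(Arrays.asList(1, 2, 3, 4, 5), new Mapper<Integer, Integer>() {
            @Override
            public Integer apply(Integer value) {
                return value * value;
            }
        });
        System.out.println(squares); // [1, 4, 9, 16, 25]
    }
}
```

The actual function is the single line "return value * value", and everything around it is ceremony that the C# lambda and the F# fun keyword don't need.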
I think there is still a bit of a mindset switch to make from thinking procedurally about these things to thinking in a way that allows you to make the most of functional programming concepts.
Keeping the code as declarative as possible and reducing the amount of state in our code are the most obvious things I have learned so far from playing with F#.