Mark Needham

Thoughts on Software Development

Roy Osherove’s TDD Kata: My first attempt

with 5 comments

I recently came across Roy Osherove’s commentary on Corey Haines’ attempt at Roy’s TDD Kata so I thought I’d try it out in C#.

Andrew Woodward has recorded his version of the kata, in which he avoids using the mouse for the whole exercise. I tried to avoid using the mouse as well and it was surprisingly difficult!

I’ve only done the first part of the exercise so far which is as follows:

  1. Create a simple String calculator with a method int Add(string numbers)
    1. The method can take 0, 1 or 2 numbers, and will return their sum (for an empty string it will return 0) for example “” or “1” or “1,2”
    2. Start with the simplest test case of an empty string and move to 1 and two numbers
    3. Remember to solve things as simply as possible so that you force yourself to write tests you did not think about
    4. Remember to refactor after each passing test
  2. Allow the Add method to handle an unknown amount of numbers
  3. Allow the Add method to handle new lines between numbers (instead of commas).
    1. the following input is ok:  “1\n2,3”  (will equal 6)
    2. the following input is NOT ok:  “1,\n” 
    3. Make sure you only test for correct inputs. there is no need to test for invalid inputs for these katas
  4. Allow the Add method to handle a different delimiter:
    1. to change a delimiter, the beginning of the string will contain a separate line that looks like this:   “//[delimiter]\n[numbers…]” for example “//;\n1;2” should return three where the default delimiter is ‘;’ .
    2. the first line is optional. all existing scenarios should still be supported
  5. Calling Add with a negative number will throw an exception “negatives not allowed” – and the negative that was passed. If there are multiple negatives, show all of them in the
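
The steps listed above can be sketched in one method. This is only an illustration of the requirements, not the code I ended up with: the class name `StringCalculatorSketch`, the use of `ArgumentException` and the exact layout of the exception message are all assumptions, and it only handles the valid inputs the kata asks for.

```csharp
using System;
using System.Linq;

// Hypothetical sketch of steps 1-5; the class name and message layout are assumptions.
public static class StringCalculatorSketch
{
    public static int Add(string numbers)
    {
        if (string.IsNullOrEmpty(numbers)) return 0;

        // Steps 1-3: commas and new lines both separate numbers.
        var delimiters = new[] { ",", "\n" };

        // Step 4: an optional first line "//[delimiter]\n" adds a custom delimiter.
        if (numbers.StartsWith("//", StringComparison.Ordinal))
        {
            int endOfHeader = numbers.IndexOf('\n');
            string custom = numbers.Substring(2, endOfHeader - 2);
            delimiters = new[] { custom, ",", "\n" };
            numbers = numbers.Substring(endOfHeader + 1);
            if (numbers.Length == 0) return 0;
        }

        int[] values = numbers
            .Split(delimiters, StringSplitOptions.None)
            .Select(s => int.Parse(s))
            .ToArray();

        // Step 5: reject negatives and list every one of them in the message.
        string[] negatives = values.Where(n => n < 0).Select(n => n.ToString()).ToArray();
        if (negatives.Length > 0)
        {
            throw new ArgumentException("negatives not allowed: " + string.Join(", ", negatives));
        }

        return values.Sum();
    }
}
```

With this sketch `StringCalculatorSketch.Add("1\n2,3")` returns 6 and `StringCalculatorSketch.Add("//;\n1;2")` returns 3. An invalid input such as “1,\n” simply fails inside `int.Parse`, which is in keeping with the instruction not to test invalid inputs.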

Mouseless coding

I know a lot of the Resharper shortcuts but I found myself using the mouse mostly to switch to the solution explorer and run the tests.

These are some of the shortcuts that have become more obvious to me from trying not to use the mouse:

  • I’m using a Mac and VMWare so I followed the instructions on Chris Chew’s blog to set up the key binding for ‘Alt-Insert’. I also set up a key binding for ‘Ctrl-~’ that maps to ‘Menu’, which let me right click in the Solution Explorer to create my unit tests project, add references and so on. I found that I needed VMWare 2.0 to get those key bindings set up – I couldn’t work out how to do it with the earlier versions.
  • I found that I had to use ‘Ctrl-Tab’ to get to the various tool windows such as Solution Explorer and the Unit Test Runner. ‘Ctrl-E’ also became useful for switching between the different code files.

Simplest thing possible

On my first run through the exercise I used a guard block for the empty string case and then went straight to ‘String.Split’ to get each of the numbers and add them together.

It annoyed me that there had to be a special case for the empty string so I changed my solution to make use of a regular expression instead:

Regex.Matches(numbers, "\\d").Cast<Match>().Select(x => int.Parse(x.Value)).Aggregate(0, (acc, num) => acc + num);

That works for nearly all of the cases provided but it’s not incremental at all and it doesn’t even care whether there are delimiters between the numbers or not. It just gets the numbers!
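
To make that behaviour concrete, here is the same one-liner wrapped in a method so it can be called with a few inputs. The class name `RegexAdd` is just a wrapper I’ve assumed for illustration; the expression itself is unchanged.

```csharp
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical wrapper around the regular expression one-liner above.
public static class RegexAdd
{
    public static int Add(string numbers)
    {
        // "\\d" matches one digit at a time, so every digit becomes its own number.
        return Regex.Matches(numbers, "\\d")
            .Cast<Match>()
            .Select(x => int.Parse(x.Value))
            .Aggregate(0, (acc, num) => acc + num);
    }
}
```

`RegexAdd.Add("")` returns 0 without a special case, and `RegexAdd.Add("1,2")` and `RegexAdd.Add("1\n2,3")` return 3 and 6 as the kata expects. But `RegexAdd.Add("1;2")` also returns 3 even though no delimiter was declared, `RegexAdd.Add("12")` returns 3 rather than 12 because each digit is matched separately, and `RegexAdd.Add("-1,2")` returns 3 because the minus sign is never matched.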

It eventually came unstuck when trying to work out if there were negative numbers or not. I considered trying to work out how to do that with a regular expression but it did feel as if I’d totally missed the point of the exercise:

Remember to solve things as simply as possible so that you force yourself to write tests you did not think about

I decided to watch Corey’s video to see how he’d achieved this and I realised he was taking much smaller steps than me.

I started again following his lead and found it interesting that I wasn’t naturally seeing the smallest step but more often than not the more general solution to a problem.

For example the first part of the problem is to add together two numbers separated by a comma.

Given an input of “1,2” we should get a result of 3.

I really wanted to write this code to do that:

if(number == "") return 0;
return number.Split(',').Aggregate(0, (acc, num) => acc + int.Parse(num));

But a simpler version would be this (assuming that we’ve already written the code for handling a single number):

if (number == "") return 0;
if (number.Length == 1) return int.Parse(number);
return int.Parse(number.Substring(0, 1)) + int.Parse(number.Substring(2, 1));

After writing a few more examples we do eventually end up at something closer to that first solution.
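
The examples that force that move can be seen by putting the two versions side by side. This is a sketch: the class and method names (`IncrementalSteps`, `AddPositional`, `AddWithSplit`) are mine, made up for the comparison, and the bodies are the two snippets from above.

```csharp
using System;
using System.Linq;

// Hypothetical holder for the two versions discussed above.
public static class IncrementalSteps
{
    // Positional version: only reads the characters at index 0 and index 2.
    public static int AddPositional(string number)
    {
        if (number == "") return 0;
        if (number.Length == 1) return int.Parse(number);
        return int.Parse(number.Substring(0, 1)) + int.Parse(number.Substring(2, 1));
    }

    // Split version: works on whatever sits between the commas.
    public static int AddWithSplit(string number)
    {
        if (number == "") return 0;
        return number.Split(',').Aggregate(0, (acc, num) => acc + int.Parse(num));
    }
}
```

Both return 3 for “1,2”. A third number shows the first gap: `AddPositional("1,2,3")` still returns 3 while `AddWithSplit("1,2,3")` returns 6. Two digit numbers show the second: `AddPositional("10,20")` throws a `FormatException` because the character at index 2 is the comma, whereas `AddWithSplit("10,20")` returns 30.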

Describing the relationships in code

I’m normally a fan of simple incremental steps, but for me the first solution expresses the intent much better than the second one does, and the step from using ‘Substring’ to using ‘Split’ doesn’t seem that incremental to me. It’s a bit of a leap.

This exercise reminds me a bit of a post by Reg Braithwaite where he talks about programming golf. In this post he makes the following statement:

The goal is readable code that expresses the underlying relationships.

In the second version of this we’re describing the relationship very specifically and then we’ll generalise that relationship later when we have an example which forces us to do that. I think that’s a good thing that the incremental approach encourages.

Programming in the large/medium/small

In this exercise I found that the biggest benefit of only coding what you needed was that the code was easier to change when a slightly different requirement was added. If we’ve already generalised our solution then it can be quite difficult to add that new requirement.

I recently read a post by Matt Podwysocki where he talks about three different types of programming:

  • Programming in the large: a high level that affects as well as crosscuts multiple classes and functions
  • Programming in the medium: a single API or group of related APIs in such things as classes, interfaces, modules
  • Programming in the small: individual function/method bodies

In my experience, generalising code prematurely hurts us the most when we’re programming in the large/medium, and it’s really difficult to recover once we’ve done that.

I’m not so sure where the line is when programming in the small. I feel like generalising code inside small functions is not such a bad thing although based on this experience perhaps that’s me just trying to justify my currently favoured approach!

Written by Mark Needham

December 25th, 2009 at 10:25 pm

Posted in Coding

  • http://adventuresinsoftware.com/blog/ Michael L Perry

    I have a hard time seeing the smallest possible step as well. I, like you, tend to find a more general solution.

    I believe that this is the core of the argument against TDD. While some find it more natural to gradually evolve a correct solution, others prefer to reason through the whole problem and solve it once.

    It is not incorrect to write a complete algorithm the first time. There is a time for evolutionary coding and a time for revolutionary coding. Each person must find their own balance.

  • Pingback: The Morning Brew - Chris Alcock » The Morning Brew #506

  • Thomas Eyde

    I decided to code my way through the whole kata before I commented.

    First of all, I will say that your SubString solution is simple, but incorrect. It relates to positional data and not the separator at all. It will choke on two digit numbers.

    Then I have to agree with Michael, that the next simple step can very well be the generic solution. If you have done these kinds of things before, you better know the generic solution already.

    So to my experiences with this kata: 3 times did I experience that my next test was already covered. I don’t see that as a bad thing, but as a reminder to investigate why that is.

    At one point I discovered my multi-separator algorithm succeeded for the wrong reason: it treated each character as a separator, so that [*][%] was treated as 4 separators, not two, because there are 4 distinct characters present.

    Introducing multi-separators was also the point where I had to rethink and refactor my approach, and I spent just as much time implementing this one test as all the previous.

    At the end, with all tests passing, I had this one giant Calculator class which did more than its share of responsibilities.

    The final refactoring also took nearly as much time as all the coding preceding it. The final code had more classes and looked cleaner, but if this were a real life project, I would defer the last refactoring until I actually needed to change the code.

    The single calculator-class solution wasn’t that bad, but it contained a lot of small static helper methods which had nothing to do with the actual calculation.

  • http://www.markhneedham.com Mark Needham

    Hey Thomas,

    Yeh the SubString solution certainly doesn’t work once you’ve driven out an example with 2 digit numbers but it does work if you only have single digit numbers which is what I did for my first example. It’s totally position related as you point out. If we’re building incrementally then it seems like it’s a step we should be looking to take.

    I haven’t done the second part of the exercise which you describe but it does feel like the Calculator class does too much even in the state that I’ve got it in. There are a few responsibilities which don’t belong there.

    Did you post up your final solution anywhere?

  • Pingback: “Code Katas”, Scott Wallace | some assembly required, batteries not included