Mark Needham

Thoughts on Software Development

Coding: Copy/Paste then refactor

with 5 comments

We’re currently reading Michael Feathers ‘Working Effectively With Legacy Code‘ in our technical book club and one interesting technique which he describes in the Test Driven Development section is copying and pasting some existing code, changing the appropriate part to make the test pass before refactoring to remove the duplication we just created.

I can’t remember coming across this approach previously but I found myself using it to solve a Scala problem last week.

I wanted to find the smallest number in a list having previously written the code to find the largest number in a list.

This was the code I already had:

1
2
3
4
5
6
7
8
  def max(values : List[Int]) : Int = {
    def inner(remainingValues : List[Int], currentMax : Int) : Int = remainingValues match {
      case Nil => currentMax
      case head::tail if head > currentMax => inner(tail, head)
      case _::tail => inner(tail, currentMax)
    }
    inner(values, Math.MIN_INT)
  }

I figured finding the smallest value would be quite similar to that but I couldn’t quite see where the duplication between the two functions was going to be.

I somewhat reluctantly found myself copying and pasting my way to the following code:

1
2
3
4
5
6
7
8
  def min(values : List[Int]) : Int = {
     def inner(remainingValues : List[Int], currentMin : Int) : Int = remainingValues match {
      case Nil => currentMin
      case head::tail if head < currentMin => inner(tail, head)
      case _::tail => inner(tail, currentMin)
    }
    inner(values, Math.MAX_INT)
  }

Having done that it became clearer that the duplication between the two functions was on line 4 where we do the comparison of values and on line 7 where we choose the initial max/min value.

My first attempt at refactoring was to try and pull out the comparator function on line 4:

def comparator(x : Int, y : Int, f : Function2[Int, Int, Boolean]) : Boolean = f(x,y)

Which resulted in the ‘max’ function looking like this:

  def max(values : List[Int]) : Int = {
    def inner(remainingValues : List[Int], currentMax : Int) : Int = remainingValues match {
      case Nil => currentMax
      case head::tail if comparator(head, currentMax, (x,y) => x > y) => inner(tail, head)
      case _::tail => inner(tail, currentMax)
    }
    inner(values, Math.MIN_INT)
  }

At this point I realised that I was doing the refactoring the wrong way around and that I should be trying to make ‘max’ and ‘min’ call another function while passing in the differences as parameters to that function.

I ended up with this common function:

   def findValue(values : List[Int], comparisonFunction : Function2[Int, Int, Boolean], startingValue : Int) : Int = {
    def inner(remainingValues : List[Int], current : Int) : Int = remainingValues match {
      case Nil => current
      case head::tail if comparisonFunction(head, current) => inner(tail, head)
      case _::tail => inner(tail, current)
    }
    inner(values, startingValue)
  }
  def max(values : List[Int]) : Int = findValue(values, (x,y) => x>y, Math.MIN_INT)
  def min(values : List[Int]) : Int = findValue(values, (x,y) => x<y, Math.MAX_INT)

I’m not sure whether the name ‘findValue’ is a very good one for the common function but I can’t think of a better one at the moment.

The neat thing about this approach is that it made it much easier to see where the duplication was and I was able to have both the functions working before trying to do that refactoring.

I’m not sure if there are built in functions to do these calculations. I looked quickly and couldn’t find any and thought it would be an interesting exercise to write the code anyway.

It worked out much better than I thought it would.

Be Sociable, Share!

Written by Mark Needham

October 31st, 2009 at 5:54 pm

Posted in Coding

Tagged with

  • Kent Beck uses this method as his introduction to TDD, he had me sold right away.

  • Oh yeah, and I started to notice recently sometimes the 3rd function that is born ends up becoming a public method and a core part of the system’s API. Maybe not in this case but take for example a method extraction in the middle of a loop, now you have a method to call when you have an object instead of a collection

  • Pingback: Book Club: Working Effectively With Legacy Code – Chapter 8 (Michael Feathers) at Mark Needham()

  • Typically I find when working with collections that everything has been done already and I like to use the mechanisms there where possible. They may not be the most efficient but often easy to understand. I know you’re learning Scala and one of the best ways is trying to do things manually so you learn the idioms etc.

    Anyway after looking at the code you ended up with it reminded me of a sorted collection and a parameterised sort closure.
    So in Smalltalk (sorry I haven’t learned Scala yet, but Groovy would look similar) it would be something like:

    |min max|
    min := aCollectionHoldingInts asSortedCollection first.

    max := aCollectionHoldingInts asSortedCollection last.
    Or we could do (although slower),
    max := aCollectionHoldingInts asSortedCollection reversed last.

    Alternatively if you’re not storing int’s in the collection but rather an object then you sould pass the sort block around:
    max := aCollection asSortedCollection: [:a :b | a someIntField > b someIntField].

    Cheers
    Carlo

  • Pingback: Coding: Generalising too early at Mark Needham()