Mark Needham

Thoughts on Software Development

Archive for the ‘refactoring’ tag

Small step refactoring: Overload constructor

with 2 comments

I’ve previously written about some approaches I’ve been taught for taking small steps when refactoring code, and another approach a couple of colleagues have been using recently is to overload the constructor when refactoring objects.

On a couple of occasions we’ve wanted to completely change the way an object was designed, and changing the current constructor would have meant changing all the tests against that object before we could check whether the new design was actually going to work.

Given a simple example where we want to change the following object to take in a completely different type and not use ‘field1’ or ‘field2’ anymore:

public class Foo
{
	public Foo(string field1, int field2)
	{
		// and so on
	}
}

One approach would be to change that constructor to take in the extra parameter and then gradually phase out the other parameters:

public class Foo
{
	public Foo(string field1, int field2, Bar bar)
	{
		// and so on
	}
}

An alternative is to create another constructor and leave the old one as it is before changing the tests one by one to make use of the new approach.

public class Foo
{
	public Foo(string field1, int field2)
	{
		// we gradually phase this one out
	}
 
	public Foo(Bar bar)
	{
		// the new constructor
	}
}

Even if we still want to use ‘field1’ and ‘field2’ we can make use of this approach and just default the extra value until we’ve migrated our tests across.

public class Foo
{
	public Foo(string field1, int field2, Bar bar)
	{
		// and so on
	}
 
	public Foo(string field1, int field2) : this(field1, field2, new Bar())
	{
	}
}

I quite like this approach as it allows us to keep the code compiling while we’re making changes to it to improve the design.

The main thing to remember with this approach is not to keep the old constructor around for too long: we should move our code across to the new approach as quickly as possible, otherwise it becomes very confusing for other pairs who come across the code in this halfway state.

Written by Mark Needham

April 25th, 2010 at 10:48 pm

Refactoring: Small steps to pull out responsibilities

with 3 comments

I wrote previously about how I’ve been using effect sketches to identify responsibilities in objects so that I can pull them out into other objects, and once I’ve done this I often find that I can’t see a small next step to take.

In the past I’ve often stopped at this stage and left the refactoring until I had more time to complete it, but that hasn’t really worked – a lot of the time the change only ever happens in my mind and never in the actual code.

I ran into this problem again last week with an object which had 8 dependencies when I came across it.

Having drawn the effect sketch I realised that 3 of those could be pulled out into a new object, which could then be injected into the original object and encapsulate those 3 dependencies.

The code that I wanted to change was something like this:

public class TheObject 
{
	private readonly DependencyA dependencyA;
	private readonly DependencyB dependencyB;
	private readonly DependencyC dependencyC;
	...	
 
	public Foo FooCreation()
	{
		var dependencyAValue = dependencyA.GetSomething();
		var dependencyBValue = dependencyB.GetSomething();
		var dependencyCValue = dependencyC.GetSomething();
 
		return new Foo(dependencyAValue, dependencyBValue, dependencyCValue);
	}
 
	...
}

I wanted to pull the ‘FooCreation’ method out into another object and then change all the places that were calling ‘TheObject.FooCreation’ to just call this new object directly.

The first step was to create a ‘FooFactory’ and have the original method delegate to it internally:

public class FooFactory 
{
	private readonly DependencyA dependencyA;
	private readonly DependencyB dependencyB;
	private readonly DependencyC dependencyC;
	...	
 
	public Foo Create()
	{
		var dependencyAValue = dependencyA.GetSomething();
		var dependencyBValue = dependencyB.GetSomething();
		var dependencyCValue = dependencyC.GetSomething();
 
		return new Foo(dependencyAValue, dependencyBValue, dependencyCValue);
	}
 
}
public class TheObject 
{
	...	
 
	public Foo FooCreation()
	{
		return new FooFactory(dependencyA, dependencyB, dependencyC).Create();
	}
 
	...
}

I ran out of time at this stage to finish things off, but it was obvious where the refactoring was going, so the next time I got the chance I injected the ‘FooFactory’ into ‘TheObject’:

public interface IFooFactory 
{
	Foo Create();
}
public class TheObject 
{
	private readonly IFooFactory fooFactory;
	...	
	public TheObject(IFooFactory fooFactory) 
	{
		this.fooFactory = fooFactory;
	}
 
	public Foo FooCreation()
	{
		return fooFactory.Create();
	}
 
	...
}

To do this I had to go and change the tests on ‘TheObject’ and move some of them to go directly against ‘FooFactory’.
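As a rough sketch of what the remaining test against ‘TheObject’ might look like (the NUnit attributes and the hand-rolled ‘StubFooFactory’ here are my own illustration rather than the actual test code), we can now just stub out the factory:

using NUnit.Framework;

[TestFixture]
public class TheObjectTests
{
	[Test]
	public void FooCreationShouldDelegateToTheFactory()
	{
		var stubFooFactory = new StubFooFactory();
		var theObject = new TheObject(stubFooFactory);

		theObject.FooCreation();

		Assert.That(stubFooFactory.CreateWasCalled, Is.True);
	}

	private class StubFooFactory : IFooFactory
	{
		public bool CreateWasCalled;

		public Foo Create()
		{
			CreateWasCalled = true;
			return null; // good enough for this sketch
		}
	}
}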

The third stage of the refactoring was to change all the places which called ‘TheObject.FooCreation()’ to just call ‘FooFactory.Create()’ directly.
Some of those places were also using other methods on ‘TheObject’, so those objects now have an extra dependency, although I think the code is at least more intention revealing than it was previously.
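A typical call site after that third stage looked something like this (the calling class here is invented for illustration; previously it took ‘TheObject’ and called ‘FooCreation’ on it):

// previously this class took TheObject and called theObject.FooCreation()
public class SomeConsumerOfFoo
{
	private readonly IFooFactory fooFactory;

	public SomeConsumerOfFoo(IFooFactory fooFactory)
	{
		this.fooFactory = fooFactory;
	}

	public void DoSomethingWithFoo()
	{
		var foo = fooFactory.Create();
		// ... whatever the caller was already doing with foo
	}
}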

I’m sure there are some other patterns for this type of small step refactoring but this is just one that I’ve noticed so far.

Written by Mark Needham

February 24th, 2010 at 12:45 am

Safe refactoring: Removing object initializer, introducing builder

with 3 comments

I previously wrote about an approach we took to safely remove some duplication, and I recently followed a similar mantra to replace an object initializer which was setting up around 40 properties with a builder, to try and make the code a bit easier to understand.

We did have tests checking the values being set up by the object initializer, so I was already able to refactor with some degree of safety. It would probably have been possible to just create the builder, build the object from that, and then delete the old code and replace it with the new, but I’ve caused myself too many problems doing that in the past, so I decided to try a more incremental approach.

The idea was to have the builder and the object initializer both creating the object at the same time. As I added each property to the builder I would set the corresponding property on the original object from the builder’s object, until eventually all of the properties were being set from the builder’s object and I could just return that instead. This approach feels quite similar to what Kent Beck describes as having parallel implementations in his recent presentation on Responsive Design.

The code I started with was something like this:

public Foo CreateMeAFoo() 
{
	return new Foo
	{
		Bar = "someValue",
		OtherBar = "someOtherValue",
		UltraBar = "aValue",
		...
		AnotherArgumentWayDownHere = 1,
		...
		AndSoOn = "yes"
	};
}

I worked together with one of the business analysts on our team, who pointed out that there were actually some clear groupings in what I had just been considering ‘data’, and we were able to put those explicit domain concepts into the code as part of the builder.

My first step, however, was to remove the object initializer, to avoid making any silly mistakes. An example of a mistake I’ve made when using object initializers is setting a property with ‘Int16.Parse’ and passing in a string with a value of “53700”: the method throws an exception, and the stack trace just points at the first line of the object initializer, making it quite difficult to work out which line has actually failed.
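The intermediate version, with the initializer unrolled into a plain sequence of setters but no builder involved yet, would have looked something like this (my reconstruction of that step rather than the original code):

public Foo CreateMeAFoo()
{
	var foo = new Foo();
	foo.Bar = "someValue";
	foo.OtherBar = "someOtherValue";
	foo.UltraBar = "aValue";
	...
	foo.AnotherArgumentWayDownHere = 1;
	...
	foo.AndSoOn = "yes";

	return foo;
}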

Having worked the code into a sequence of setters, I gradually added methods to the builder to create the ‘Foo’:

public Foo CreateMeAFoo() 
{
	var fooBuilder = new FooBuilder().AddBars("someValue", "someOtherValue", "aValue");
	var newFoo = fooBuilder.Build();
 
	var foo = new Foo();
	foo.Bar = newFoo.Bar;
	foo.OtherBar = newFoo.OtherBar;
	foo.UltraBar = newFoo.UltraBar;
	...
	foo.AnotherArgumentWayDownHere = 1;
 
	return foo;
}
public class FooBuilder  
{
	private string bar;
	private string otherBar;
	private string ultraBar;
 
	public FooBuilder AddBars(string bar, string otherBar, string ultraBar)
	{
		this.bar = bar;
		this.otherBar = otherBar;
		this.ultraBar = ultraBar;
		return this;
	}
 
	public Foo Build()
	{
		var foo = new Foo();
		foo.Bar = bar;
		foo.OtherBar = otherBar;
		foo.UltraBar = ultraBar;
		return foo;
	}
}

I created some duplication by doing this – I am now creating the ‘Foo’ twice – but I didn’t check any of the code into the main branch until I had totally replaced the original Foo creation with the builder.

I did about two or three properties at a time and then ran the tests, which I thought might be too small a step, but it saved me on a couple of occasions so it probably saved me time overall.

Eventually when I had all the tests passing I got rid of the original Foo and replaced it with the one from the builder:

public Foo CreateMeAFoo() 
{
	var fooBuilder = new FooBuilder().AddBars("someValue", "someOtherValue", "aValue");
	return fooBuilder.Build();
}

This code is still in a state of transition – it is still possible to create an object with half the fields not set by passing in nulls to the builder for example – and I’m trying to work out what the best step is to fix that.

I generally prefer to have everything setup in the constructor and then you know that the object is in a good state as soon as you create it, but in this case moving everything straight into the constructor will probably make the object even more difficult to deal with.

My current thinking is to check the preconditions for creating the object inside the builder and then refactor out value objects so that there are not so many properties overall, but I’m open to other ideas.
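As a sketch of the first of those ideas (the exception type and message here are just illustrative), the builder’s ‘Build’ method could refuse to return a half-initialised ‘Foo’:

public Foo Build()
{
	// fail fast rather than handing back a Foo with half its fields unset
	if (bar == null || otherBar == null || ultraBar == null)
	{
		throw new InvalidOperationException("FooBuilder is missing some required values");
	}

	var foo = new Foo();
	foo.Bar = bar;
	foo.OtherBar = otherBar;
	foo.UltraBar = ultraBar;
	return foo;
}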

Written by Mark Needham

June 26th, 2009 at 12:02 am

Coding: Why do we extract method?

with 9 comments

Ever since I read Uncle Bob’s Clean Code book my approach to coding has been all about the ‘extract method’ refactoring – I pretty much look to extract methods as much as I can, until I get to the point where extracting another method would result in me just describing the language semantics in the method name.

One approach that I’ve come across is that this refactoring should only be used when there is duplication of code: we reduce that duplication by moving the code into one place and then calling that method from both places.
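As a small invented example of that, the lines that were repeated in two methods end up in one private method called from both:

public class OrderNotifier
{
	public void NotifyOfShipment(string emailAddress)
	{
		SendEmail(emailAddress, "Your order has shipped");
	}

	public void NotifyOfDelay(string emailAddress)
	{
		SendEmail(emailAddress, "Your order has been delayed");
	}

	// the lines that used to be duplicated in both methods now live in one place
	private void SendEmail(string emailAddress, string message)
	{
		// build up and send the email
	}
}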

While this is certainly a valuable reason for extracting a method, I think there are other reasons why we would want to do it more frequently than just that.

Expressing intent

One of the main reasons we design code in an object oriented way is that it often allows us to describe the intent of our code more clearly than we would be able to if we wrote it with a more procedural approach, and I think the same thing applies when extracting methods inside a class.

Quite often when reading code we’re interested in knowing a bit more about a class than we can derive from its name but we’re not really that interested in all the low level details of its implementation.

If methods have been extracted, abstracting all that detail away from us, then we can glance at the class, fairly quickly work out what is going on, and then get back to whatever we were actually doing in the first place.

It makes code easier to read

A consequence of extracting that detail away is that it makes the code easier to read because we don’t have to hold as much information about what is going on in our head at any one time.

The aim is to try and ensure that the chunks of code that we extract into a method are all at the same level of abstraction – Neal Ford refers to this as the Single Level of Abstraction Principle (SLAP) in The Productive Programmer.

As an extreme example, we would therefore not have a chunk of code describing some business concept or rule mixed in with a bit of code interacting with the database.
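As a small invented example of keeping those levels separate, the public method reads as business-level steps while the data access detail sits one level down in its own method:

public class CustomerDiscountService
{
	// each line here is at the same level of abstraction - the business rule
	// reads straight through without any persistence detail getting in the way
	public decimal DiscountFor(int customerId)
	{
		var orderTotal = TotalOrderValueFor(customerId);
		return CalculateDiscount(orderTotal);
	}

	private decimal TotalOrderValueFor(int customerId)
	{
		// the lower level database interaction is hidden behind this method
		// (imagine an ADO.NET or ORM call here)
		return 0m;
	}

	private decimal CalculateDiscount(decimal orderTotal)
	{
		return orderTotal > 1000m ? orderTotal * 0.1m : 0m;
	}
}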

I find myself most frequently extracting a method when I come across several lines of code doing similar operations, the aim being that when we read the code we don’t need to care about each of the individual operations, just the fact that the operations are being done.

It exposes semantic errors

One benefit which I hadn’t appreciated until recently is that extracting a method can help to identify code which shouldn’t be where it is.

We were recently working on some code around starting up a Selenium session where the ‘ResetSeleniumSession’ method was doing the following:

public ISelenium ResetSeleniumSession()
{
	if(Selenium != null)
	{
		Selenium.Stop();
	}
	Selenium = new CustomSelenium(....);
	Selenium.Start();
	Selenium.Open(ApplicationRootUrl);
	Selenium.WindowMaximize();
	return Selenium;
}

We didn’t think those last two lines belonged in there, so we extracted them out, making sure that the opening of the Selenium client was still being done in all the places where ResetSeleniumSession was being called:

public ISelenium ResetSeleniumSession()
{
	...
	Selenium = new CustomSelenium(....);
	Selenium.Start();
	LoadAndMaximise(ApplicationRootUrl);
	return Selenium;
}
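The extracted method itself is tiny – something along these lines:

private void LoadAndMaximise(string url)
{
	// the two lines that used to sit at the bottom of ResetSeleniumSession
	Selenium.Open(url);
	Selenium.WindowMaximize();
}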

Later on another colleague passed by, saw us looking at this method, and pointed out that it was wrong to be launching the client from inside it – that code had probably been added there by mistake!

Maybe that code would have been spotted anyway but it had been like that for a while and I think extracting it out into its own method to make it more obvious was useful for exposing that.

In Summary

That’s all I can think of for the moment although I’m sure there are more reasons why we’d want to extract method.

From my experience extract method is the most useful refactoring we can do: it can quickly make a bit of code that seems impossible to understand somewhat readable, and it can help keep new code that we write easy for others to understand instead of becoming a legacy mess.

Written by Mark Needham

June 4th, 2009 at 8:30 pm

Posted in Coding


Pair Programming: Refactoring

with 5 comments

One area of development where I have sometimes wondered about the value that we can get from pair programming is when we’re spending time doing some major refactoring of our code base.

The reason I felt that pairing on big refactoring tasks might be more difficult than pairing on a story is that a story tends to have a more clearly defined goal, set by the business, whereas with a refactoring task the goal is often less clear and people have much more widely differing opinions about the approach that should be taken.

Having spent time pairing on some refactoring work for most of the last week I have changed my mind: I think that pairing on refactoring tasks is actually extremely valuable – certainly as valuable as pairing on story work and perhaps more so.

What I noticed while pairing on this task is that although there is more conflict over the approach to take, this actually works out reasonably well in terms of driving an approach that sits somewhere between pragmatic and dogmatic.

From my experience, the mentality when you’re the driver in a refactoring session is pretty much to want to fix everything that you come across but having someone else alongside you helps to rein in this desire and focus on making the fixes that add value to the code right now.

I was quite surprised to notice that within just one keyboard switch we had each suggested to the other that a particular refactoring should probably be added to our ‘refactoring list’ to look at later, rather than doing it now and potentially getting into a yak shaving situation.

Another thing I noticed was that the navigator was often able to point out things that the other person didn’t see – sometimes making a certain change to the code had a bigger impact than the driver had expected and the navigator was able to spot this quickly and initiate a discussion about what the next step should be before we had gone too far down the wrong path.

Refactoring code effectively, without making mistakes, is one of the more difficult skills to learn as a developer, and I think it is very hard to pick up just by working on your own, since the approach of refactoring in very small steps is not necessarily obvious.

While there are certainly books which explain how to refactor code, a lot of the approaches that I like to use were taught to me by more experienced people that I’ve had the chance to pair with. Pairing creates the opportunity for these skills to be passed on to other members of the team.

Written by Mark Needham

May 26th, 2009 at 11:44 pm

Refactoring: Removing duplication more safely

with 5 comments

One of the most important things that I’ve learnt from the coding dojo sessions that we’ve been running over the last six months is the importance of small step refactorings.

Granted, we have been trying to take some of the practices to the extreme, but the basic idea of keeping the tests green for as much of the time as possible, as well as keeping our code in a state where it still compiles (in a static language), is very useful no matter what code we’re working on.

Recently a colleague and I were doing some work on our code base to try and remove some duplication – we had two classes which were doing pretty much the same thing but they were named just slightly differently.

The implementation was also slightly different – one was a list containing objects with Key and Value properties, and the other was a dictionary/map of key/value pairs.

We spent a bit of time checking that we hadn’t misunderstood what these two classes were doing and, having convinced ourselves that we knew what was going on, decided to get rid of one of them – one was used in around 50% more places than the other, so we kept that one.

We now needed to replace the usages of the other one.

My pre-coding-dojo refactoring approach would have been to just find the first place where the class we wanted to replace was being used, delete that, let the compiler guide me through replacing its usages, and then do the same with the next usage and so on.

The problem with this approach is that we would probably have had the code in a state where it didn’t compile for maybe half an hour, during which time we would be unable to run our tests and would therefore lose confidence that the changes we were making actually worked.

The approach we therefore took was to add the class we wanted to keep side by side with the one we wanted to get rid of and slowly move our tests across to set up data for the one we were keeping.

We therefore ended up with code looking a bit like this to start with:

public class IHaveUsages
{
	public IDictionary<OldType, string> OldType { get; set; }
	public IList<NewType> OldType2 { get; set; }
}

When changing tests we took the approach of commenting out the old setup code to start with, so that we could see exactly what was going on in the setup, and then deleting it once we had made the change. I’ve written previously about my dislike of commented-out code, but in this case we were using it as a mechanism to guide our refactoring and we never checked the code in with these comments, so I think it was a reasonable approach.

[Test]
public void SomeTest()
{
	// some test stuff
	new IHaveUsages 
	{
		// OldType = new Dictionary<OldType, string> {{new OldType("key"), "value"}},
		OldType2 = new List<NewType> { new NewType { Key = "key", Value = "value" } } 
	};
}

The intention was to try and reduce the number of places that OldType was being used until the point where there were none left which would allow us to safely remove it.

Once we had made that change to the test setup we needed to make the changes in the class using that data to get our green bar back.

On a couple of occasions we found methods in the production code which took in the OldType as a parameter. In order to refactor these areas we decided to take a copy of the method, rename it slightly, and re-implement it using the NewType.

private void OldMethod(OldType oldType) 
{
	// Do some stuff
}
private void OldMethod2(NewType newType) 
{
	// replicate what's being done in OldMethod
}

We then looked for the usages of ‘OldMethod’ and replaced those with calls to ‘OldMethod2’, also ensuring that we passed a NewType into the method instead of an OldType.
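A typical call site change was along these lines (a made-up example based on the types above):

public void SomeCallingCode()
{
	// before:
	// OldMethod(new OldType("key"));

	// after - the same data now goes through the new type and the new method
	OldMethod2(new NewType { Key = "key", Value = "value" });
}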

I’m intrigued as to whether there is an even better way to perform this refactoring – when I chatted with Nick about this he suggested that it might have been even easier to create a temporary inheritance hierarchy with NewType extending OldType. We could then just change any calls which use OldType to use NewType before eventually getting rid of OldType.
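In code that temporary hierarchy would look something like this – a sketch I haven’t tried, and it assumes OldType can be subclassed and has a parameterless constructor:

// NewType temporarily extends OldType purely so that it can be passed
// anywhere an OldType is still expected while the call sites are switched over
public class NewType : OldType
{
	public string Key { get; set; }
	public string Value { get; set; }
}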

I haven’t tried this approach out yet, but if it makes our feedback cycle quicker and our chances of failure lower then I’m all for it.

Written by Mark Needham

May 26th, 2009 at 1:20 pm

F#: Refactoring that little twitter application into objects

with 9 comments

I previously wrote about a little twitter application I’ve been writing to go through my twitter feed and find only the tweets with links in them, and while it works I realised that I was finding it quite difficult to add any additional functionality to it.

I’ve been following the examples in Real World Functional Programming which has encouraged an approach of creating functions to do everything that you want to do and then mixing them together.

This works quite well for getting a quick development cycle but I found that I ended up mixing different concerns in the same functions, making it really difficult to test the code I’ve been working on – I decided not to TDD this application because I don’t know the syntax well enough. I am now suffering from that decision!

Chatting to Nick about the problems I was having he encouraged me to look at the possibility of structuring the code into different objects – this is still the best way that I know for describing intent and managing complexity although it doesn’t feel like ‘the functional way’.

Luckily Chapter 9 of the book (which I hadn’t reached yet!) explains how to restructure your code into a more manageable structure.

I’m a big fan of creating lots of little objects in C# land so I followed the same approach here. I found this post really useful for helping me understand the F# syntax for creating classes and so on.

I started by creating a type to store all the statuses:

type Tweets = { TwitterStatuses: seq<TwitterStatus> }

F# provides quite a nice way of moving between the quick cycle of writing functions and testing them to structuring objects with behaviour and data together by allowing us to append members using augmentation.

From the previous code we have these two functions:

let withLinks (statuses:seq<TwitterStatus>) = 
    statuses |> Seq.filter (fun eachStatus -> eachStatus.Text.Contains("http"))
 
let print (statuses:seq<TwitterStatus>) =
    for status in statuses do
        printfn "[%s] %s" status.User.ScreenName status.Text

We can add these two methods to the Tweets type using type augmentations:

type Tweets with
    member x.print() = print x.TwitterStatuses
    member x.withLinks() = { TwitterStatuses = withLinks x.TwitterStatuses}

It looks quite similar to C# extension methods but the methods are actually added to the class rather than being defined as static methods. The type augmentations need to be in the same file as the type is defined.

Next I wanted to put the tweetsharp API calls into their own class. It was surprisingly tricky working out how to create a class with a no argument constructor but I guess it’s fairly obvious in the end.

type TwitterService() = 
        static member GetLatestTwitterStatuses(recordsToSearch) =    
            findStatuses(0L, 0, recordsToSearch, [])

I managed to simplify the recursive calls to the Twitter API to keep getting the next 20 tweets as well:

let friendsTimeLine = FluentTwitter.CreateRequest().AuthenticateAs("userName", "password").Statuses().OnFriendsTimeline()
let getStatusesBefore (statusId:int64) = 
    if(statusId = 0L) then
        friendsTimeLine.AsJson().Request().AsStatuses()  
    else
        friendsTimeLine.Before(statusId).AsJson().Request().AsStatuses()        
 
let rec findStatuses (args:int64 * int * int * seq<TwitterStatus>) =
    let findOldestStatus (statuses:seq<TwitterStatus>) = 
        statuses |> Seq.sort_by (fun eachStatus -> eachStatus.Id) |> Seq.hd
    match args with 
    | (_, numberProcessed, statusesToSearch, soFar) when numberProcessed >= statusesToSearch -> soFar
    | (lastId, numberProcessed, statusesToSearch, soFar) ->  
        let latestStatuses = getStatusesBefore lastId
        findStatuses(findOldestStatus(latestStatuses).Id, numberProcessed + 20, statusesToSearch, Seq.append soFar latestStatuses)

To get the tweets we can now do the following:

let myTweets = { TwitterStatuses = TwitterService.GetLatestTwitterStatuses 100 };;
myTweets.withLinks().print();;

I still feel that I’m thinking a bit too procedurally when writing this code but hopefully that will get better as I play around with F# more.

One other lesson from this refactoring is that it’s so much easier to refactor code when you have tests around it – because I didn’t do this I had to change a little bit, then run the code manually and check nothing had broken. Painful!

Written by Mark Needham

April 18th, 2009 at 8:47 am

Posted in F#
