Mark Needham

Thoughts on Software Development

Archive for the ‘Coding’ tag

Coding: Make the mutation obvious

with 7 comments

Although I’m generally quite opposed to coding approaches whereby we mutate objects, sometimes the way a framework is designed seems to make this a preferable option.

We came across a situation like this last week when we wanted to hydrate an object with data coming back from the browser.

The signature of the action in question looked like this:

public class SomeController
{
	public ActionResult SomeAction(string id, UserData userData)
	{
	}

We were able to automatically bind most of the values onto ‘UserData’ except for the ‘id’ which was coming in from the URL.

We could have included the ‘id’ in a hidden field in the page to get around this problem but that seemed like a more complicated/hacky solution.

Instead what we needed to do is the following:

public class SomeController
{
	public ActionResult SomeAction(string id, UserData userData)
	{
		userData.Id = id.ParseIntoId();
	}

‘ParseIntoId’ is an extension method which converts a string representation of an ‘id’ into an object representation.

While this method is fine as it is, I find that as code gets added it can sometimes be less than obvious that we’ve actually mutated ‘UserData’ so I now prefer to explicitly call that out like so:

public class SomeController
{
	public ActionResult SomeAction(string id, UserData userData)
	{
		var userDataWithId = UpdateUserDataWithId(userData, id);
	}
 
	private UserData UpdateUserDataWithId(UserData userData, string id)
	{
		userData.Id = id.ParseIntoId();
		return userData;
	}

We would then want to use ‘userDataWithId’ after this point in the function.

Of course the original ‘UserData’ has still been mutated and it doesn’t seem worth the extra code that it would take to avoid this so we’ll just trust that anyone reading the code would realise that they should use ‘userDataWithId’ instead!

Written by Mark Needham

May 4th, 2010 at 6:32 pm

Posted in Coding

Tagged with

Coding: The Kestrel

with 4 comments

Reg Braithwaite has a cool series of posts where he covers the different combinators from Raymond Smullyan’s ‘To Mock a Mockingbird’ book and one of my favourites is the ‘Kestrel’ or ‘K Combinator’ which describes a function that returns a constant function.

It’s described like so:

Kxy = x

The Kestrel function would take in 2 arguments and return the value of the first one. The second argument would probably be a function that takes in the first argument and then performs some side effects with that value.

Braithwaite descirbes the ‘returning’ function from Rails as an example of this combinator whereby instead of writing code like this:

def registered_person(params = {})
  person = Person.new(params.merge(:registered => true))
  Registry.register(person)
  person.send_email_notification
  person
end

We can write this:

def registered_person(params = {})
  returning Person.new(params.merge(:registered => true)) do |person|
    Registry.register(person)
    person.send_email_notification
  end
end

i.e. we can group all the side effects together and it’s more obvious to the reader that we’re returning the value of ‘Person.new(…)’ from this method.

I’ve been writing a bit of code in F# to generate some test objects and I realised that I had code like this in a few places:

let build (t:Type) =
      let theType = Activator.CreateInstance(t)
      theType.GetType().GetProperties() |> Array.iter (fun p -> p.SetValue(t, createValueFor p, null))
      theType

We’re creating ‘theType’ and then mutating it straight away using reflection before returning the value.

We can implement a ‘returning’ function like so to simplify the code a little:

let returning t f = f(t); t
let build (t:Type) =
    returning (Activator.CreateInstance(t)) (fun t -> 
        t.GetType().GetProperties() |> Array.iter (fun p -> p.SetValue(t, createValueFor p, null)))

I think this is the same as what Martin Fowler refers to as a nested closure and it seems quite a neat way of encapsulating side effects and making the code a bit more expressive.

Written by Mark Needham

May 3rd, 2010 at 12:28 am

Posted in Coding

Tagged with

Coding: Generalising too early

with 3 comments

I’ve previously written about the value of adding duplication to code before removing it and we had an interesting situation this week where we failed to do that and ended up generalising a piece of code too early to the point where it actually didn’t solve the problem anymore.

The problem we were trying to solve was around the validation of some dependent fields and to start with we had this requirement:

Given 2 fields...
 
Field 1
Field 2 - Can be empty if Field 1 is 'Foo'. Otherwise must have a value.

We wrote an object that looked a bit like this to do that logic:

public class DependentOnField1 : IValidateFields
{
	public bool IsSatisfied(Field field)
	{
		if(OtherFieldIsFoo())
		{
			return true;
		}		
 
		return string.IsNullOrEmpty(field.Value);
	}
}

Later on we had the following requirement:

Given 2 more fields...
 
Field 3 
Field 4 - Must have a value of more than 0 if Field 3 is 'Bar'. Otherwise can be empty.

It seemed at first glance that they were the same type of validation because in both cases there is talk of dependence between fields.

We started refactoring the ‘DependentOnField1′ object into a state where we would be able to use it for both cases. This meant that we needed to parameterise the object so that we could vary the dependent field.

We eventually ended up with an interface that could be called like this:

var field1DependenceValidator = DependentOn.Field(field1);
public class DependentOn : IValidateFields
{
	public static DependentOn Field(Field field)
	{
		return new DependentOn(field);
	}
}

Unfortunately while refactoring this object we hadn’t taken into account the fact that the requirement for fields 3 and 4 is slightly different to that for fields 1 and 2. With fields 1 & 2 we only needed one condition to be true whereas with fields 3 and 4 two conditions needed to be true – field 3 must be ‘Bar’ AND field 4 must be greater than 0.

We initially tried to hack in the fields 3/4 requirement like this:

public class DependentOn : IValidateFields
{
	public bool IsSatisfied(Field field)
	{		
		if(OtherFieldIsX() && Double.Parse(field.Value) > 0)
		{
			return true;
		}		
 
		return string.IsNullOrEmpty(field.Value);
	}
}

Unfortunately that breaks the way that validation works for fields 1 & 2 and every other attempt we made seemed to make the situation worse and worse until after about half an hour of wrestling with this we decided to revert everything and just create a new object for this new type of validation.

We would accept some duplication until we were able to see a more meaningful abstraction that we could pull out.

Dave Cameron, who first taught me this approach, recently wrote a post describing how he removed some duplication in JavaScript code and you can see from the initial example in his post that he waited until he was exactly sure where the duplication was before getting rid of it.

I think this is the best approach and the rule of thumb seems to be that we should wait until we have the same thing 3 times before trying to remove the duplication.

Written by Mark Needham

April 30th, 2010 at 7:12 am

Posted in Coding

Tagged with

Listening to your tests: An example

without comments

I was recently reading a blog post by Esko Luontola where he talks about the direct and indirect effects of TDD and one particularly interesting point he makes is that driving our code with a TDD approach helps to amplify the problems caused by writing bad code.

if the code is not maintainable, it will be hard to change. Also if the code is not testable, it will be hard to write tests for it. If the code would not be written iteratively and no tests would be written for it, life would be much easier to the developer. In other words, TDD increases the pain caused by bad code, because it’s not possible to avoid changing the code, nor avoid writing tests for it.

This is something Steve Freeman and Nat Pryce talk about in their book and we recently had an example of this which really stemmed from a failure to drive this particular piece of code from the outside in.

We’d reached the stage where one object was taking in 8 different parameters in the constructor and then only used 2 of them in any one path through the code.

The code was pretty bad but it was even more noticeable how much of a mess we’d made when we had to write a new test.

This is a rough example of what the test fixture had begun to look like:

[TestFixture]
public class BadObjectTests
{
	private double parameter1 = 0.10;
	private double parameter2 = 0.20;
	private double parameter3 = 0.30
 
	[Test]
	public void ATestGoneABitWrong()
	{
		var badObject = CreateBadObject();
		var result = badObject.CalculateSomething()
 
		Assert.That(result, Is.EqualTo(parameter1));
	}
 
	private BadObject CreateBadObject()
	{
		return new BadObject(parameter1, parameter2, parameter3...);
	}
}

As a general guideline it’s good if we’re able to keep the context of a test all in one place but since the ‘BadObject’ had become increasingly difficult to construct we’d pulled that out into another method and extracted all its parameters as fields in the test fixture.

The alternative was to have all 8 parameters created in each test and then construct the ‘BadObject’ each time and in retrospect that would actually have made it more obvious that we were doing something wrong.

With that second approach we would undoubtably start copy/pasting tests which would provide another signal that we need to look at the design of the code and make it easier to test.

In this case the solution was to create several smaller objects which actually used all the parameters being passed in. I think we ended up with around 4 objects instead of 1 and each had simple tests that were easy to write.

Written by Mark Needham

April 27th, 2010 at 10:34 pm

Posted in Coding

Tagged with

Coding: Another outside in example

with 4 comments

I’ve written before about my thoughts on outside in development and we came across another example last week where we made our life difficult by not initially following this approach.

The rough design of what we were working on looked like this:

outsideIn1.gif

My pair and I were working on the code to do the calculations and we deliberately chose not to drive the functionality from the UI because the other pair were reworking all our validation code and we didn’t want to step on each others toes.

We therefore started driving the code directly from the individual calculations and decided to create an object to represent each of the types of calculation. It worked quite well for all but one of the calculation classes, ‘Calculation3, which became increasingly complicated as we added more test cases.

At this stage we rotated pairs and still unable to see a way to simplify ‘Calculation3′ we decided to try and drive it from the object which would actually make use of it.

As Darren Hobbs pointed out in another discussion:

Don’t design an API. Write the code that you’d want to be able to write to use your API.

In this case it became clear that the missing object which could drive out the Calculation objects was a ‘Calculator’:

outsideIn2.gif

We started writing tests for the calculator and initially it didn’t seem to make that much difference – the design of nearly all the individual ‘Calculation’ objects remained the same.

However, once we got to the tests which would drive out the functionality currently in ‘Calculation3′ it became clear that we had actually got 3 objects inside 1 and that what we really needed in this case was an orchestrating class which could delegate down to small objects to do the calculations.

outsideIn3.gif

The most interesting thing about this situation for me is that I often do drive code from the outside in and I’ve even previously written about the benefits of doing so but in this context I got it wrong and it was a slightly painful lesson!

Written by Mark Needham

April 18th, 2010 at 10:46 pm

Posted in Coding

Tagged with

Coding: Maybe vs Null Object patterns

with 13 comments

On the project I’m currently working on my colleague Christian Blunden has introduced a version of the Maybe type into the code base, a concept that originally derives from the world of functional programming.

The code looks a bit like this:

public interface Maybe<T>
{
	bool HasValue();
	T Value();
}
public class Some<T> : Maybe<T>
{
	private readonly T t;
 
	public Some(T t)
	{
		this.t = t;
	}
 
	public bool HasValue()
	{	
		return true;
	}
 
	public T Value()
	{		
		return t;
	}	
}
public class None<T> : Maybe<T>
{
	public bool HasValue()
	{	
		return false;
	}
 
	public T Value()
	{		
		throw new NotImplementedException();
	}	
}

We would then use it in the code like this:

public FooRepository
{
	public Maybe<Foo> Find(int fooId)
	{
		var foo = LookUpFooFromDatabase();
 
		if(foo == null)
		{
			return new None<Foo>();
		}
		return new Some<Foo>(foo);
	}
var maybeFoo = fooRepository.Find(1);
 
if(maybeFoo.HasValue())
{
	// do something with it
}
// fail in misery

The benefit we get from using this pattern is that we’re explicitly defining in the contract of ‘FooRepository.Find’ that the method might not return a ‘Foo’ rather than leaving the callee to work out whether or not they need to check for a null value.

It’s effectively the Nullable pattern except we can use it for reference types and not just primitives.

An alternative approach which Dermot pointed out is the null object pattern.

Typically when using that pattern we would treat the result of calling ‘FooRepository.Find’ the same regardless of whether we get a real ‘Foo’ or not.

That pattern would work quite well if we have to show a list of items in a grid, for example, and just showed blank cells if there isn’t a real ‘Foo’.

In our case we want to distinguish between whether we did or did not find a ‘Foo’ because the application behaves differently if we can’t find one. Therefore in this case the null object pattern doesn’t work so well.

Written by Mark Needham

April 10th, 2010 at 11:21 am

Posted in Coding

Tagged with

Coding: FindOrCreateUser and similar methods

with 8 comments

One of the general guidelines that I like to follow when writing methods is trying to ensure that it’s only doing one thing but on several recent projects I’ve noticed us breaking this guideline and it feels like the right thing to do.

The method in question typically takes in some user details, looks up that user in some data store and then returning it if there is an existing user and creating a new user if not.

It would probably look something like this:

public class UserService
{
	public User FindOrCreateUser(string username)
	{
		var user = userRepository.FindUserBy(username);
		if(user == null)
		{
			user = CreateUserFrom(username);
		}	
		return user;
	}
}

My initial thought when we wrote it was that it seemed wrong but the method name does clearly describe our requirement so I’m not so sure it’s a bad thing.

NHibernate has a ‘SaveOrUpdate’ method which to me seems to cover similar ground although Save/Update are much closer in functionality then Find/Create so maybe that’s more justifiable.

I’d be interested to know whether you think this type of method is perfectly fine or whether we’re very very bad people for writing it!

Written by Mark Needham

April 9th, 2010 at 7:09 am

Posted in Coding

Tagged with

Coding: Shared libraries

with 6 comments

On a few projects that I’ve worked on one of the things that we’ve done is create a shared library of objects which can be used across several different projects and while at the time it seemed like a good idea, in hindsight I’m not sure if it’s an entirely successful strategy.

I’m quite a fan of not recreating effort which is generally the goal when trying to pull out common code and within one team this seems to be a good approach the majority of the time.

When it comes to sharing across teams then I think we need to consider the perceived benefits a bit more because it doesn’t come without costs.

These are some of the types of code that we’ve shared previously:

Domain objects

I think this is the most dangerous type of code to share because although we often do have the same domain concepts in different projects, it’s quite rare that they mean exactly the same thing.

In addition there is an implicit coupling created with our database since we pretty much now have to make sure that our database schema matches up with the current version of that domain object.

Either that or we do have a shared database for all the applications which use that shared domain object in which case we have an even stronger coupling between applications.

We’re assuming that the two application have exactly the same domain concept and from my experience quite often that isn’t the case – even if there is a concept with the same name it may be used in different ways or mean something completely different in different applications.

This is quite similar to the problem with having a universal domain model which Dan points out in his classic SOA article.

In general I don’t think it makes sense to share this type of code.

Test code

This one seems like it should fairly universally a good idea – after all we often import 3rd party testing libraries so it seems like just sharing some common testing code shouldn’t be much different.

One piece of code that we shared was the Selenium bootstrapping code and this approach worked reasonably well until we wanted to adjust the amount of time between each command because commands were being sent to the browser before elements had the chance to load.

Apart from the fact that the other users of the library didn’t want anything change with respect to how they used the code we had to go and make the change in another project, build that and then update the reference that we had to the library.

Certainly this process would have been made easier if we’d used something like Ivy but the amount of duplication of code that we were saving didn’t seem worth the hassle it caused so we ended up inlining the code.

Infrastructure code

General infrastructure code e.g. code to handle NHibernate transactions which is quite unlikely to change seems one candidate which can work quite well in a shared library and so far I haven’t seen many problems arise from doing this.

I think the key with these bits of reusable code is that we keep them quite small and ensure that they have only one responsibility which will be useful for all the applications.

We eventually ended up slimming down our shared library and the majority of the code that remains in there is solving specific infrastructure type problems which will be the same across any applications using the same technical stack.

Things to be careful about when sharing code

One reason that we may share code is so that if there is a change then it only needs to be done in one place.

We need to have a degree of confident that if we put code in a shared library that this is actually the case.

If it’s likely that different applications might need shared code to change in different ways then we might not want to make that bit of code shared otherwise we’ll just end up with application specific code in a shared library.

From what I’ve noticed it makes most sense to put code which is unlikely to change and is generic enough to be useful across several applications as is into shared libraries.

For any other code it might actually be beneficial to accept that there will be some duplication between applications in the same organisation and not try and pull out a common piece.

Written by Mark Needham

February 26th, 2010 at 12:36 am

Posted in Coding

Tagged with

Refactoring: Small steps to pull out responsibilities

with 3 comments

I wrote previously about how I’ve been using effect sketches to identify responsibilities in objects so that I can pull them out into other objects and once I’ve done this I often find that I can’t see a small next step to take.

At this stage in the past I’ve often then stopped and left the refactoring until I have more time to complete it but this hasn’t really worked and a lot of the time I end up only seeing the code change in my mind and not in the actual code.

I came across this problem again last week in an object which had 8 dependencies when I came across it.

Having drawn the effect sketch I realised that 3 of those could be pulled out into a new object which could then be injected into the original object and help to encapsulate those other 3 dependencies.

The code that I wanted to change was something like this:

public class TheObject 
{
	private readonly DependencyA;
	private readonly DependencyB;
	private readonly DependencyB;
	...	
 
	public Foo FooCreation()
	{
		var dependencyAValue = dependencyA.GetSomething();
		var dependencyBValue = dependencyB.GetSomething();
		var dependencyCValue = dependencyC.GetSomething();
 
		return new Foo(dependencyAValue, dependencyBValue, dependencyCValue);
	}
 
	...
}

I wanted to pull the ‘FooCreation’ method out into another object and then change all the places that were calling ‘TheObject’FooCreation’ to just call this new object directly.

The first step was to create a ‘FooFactory’ and just have that delegated to internally:

public class FooFactory 
{
	private readonly DependencyA;
	private readonly DependencyB;
	private readonly DependencyB;
	...	
 
	public Foo Create()
	{
		var dependencyAValue = dependencyA.GetSomething();
		var dependencyBValue = dependencyB.GetSomething();
		var dependencyCValue = dependencyC.GetSomething();
 
		return new Foo(dependencyAValue, dependencyBValue, dependencyCValue);
	}
 
}
public class TheObject 
{
	...	
 
	public Foo FooCreation()
	{
		return new FooFactory(dependencyA, dependencyB, dependencyC).Create();
	}
 
	...
}

I ran out of time at this stage to finish off the refactoring but it was obvious where the refactoring was going so the next time I got the chance I injected the ‘FooFactory’ into ‘TheObject’:

public interface IFooFactory 
{
	Foo Create();
}
public class TheObject 
{
	private readonly IFooFactory;
	...	
	public TheObject(IFooFactory fooFactory) 
	{
		this.fooFactory = fooFactory;
	}
 
	public Foo FooCreation()
	{
		return fooFactory.Create();
	}
 
	...
}

To do this I had to go and change the tests on ‘TheObject’ and move some of them to go directly against ‘FooFactory’.

The third stage of the refactoring was to change all the places which called ‘TheObject.FooCreation()’ to just call ‘FooFactory.Create()’ directly.
Some of those places were also using other methods on ‘TheObject’ so those objects now have an extra dependency although I think at least the code is more intention revealing than it was previously.

I’m sure there are some other patterns for this type of small step refactoring but this is just one that I’ve noticed so far.

Written by Mark Needham

February 24th, 2010 at 12:45 am

Coding: Effect sketches and the Mikado method

with 7 comments

I’ve written previously about how useful I find effect sketches for helping me to understand how an object’s methods and fields fit together and while drawing one a couple of weeks ago I noticed that it’s actually quite useful for seeing which parts of the code will be the easiest to change.

I was fairly sure one of the object’s in our code base was doing too many things due to the fact that it had a lot of dependencies.

However, it wasn’t obvious to me from looking at the code which would be the easiest place to start in pulling out some of those responsibilities.

I therefore drew out an effect sketch which looked something like this:

blog.png

From the diagram I could see more clearly that ‘MethodC’ is using 3 fields which are not used by any of the other methods in the object.

This therefore seemed like the perfect method to pull out since I could do so really easily and get rid of 3 of the object’s fields since noone else used them anyway.

This reminded me a lot of the Mikado method for addressing technical debt which I’ve read about but haven’t used yet.

As I understand it, the goal with the Mikado method is to locate areas of the code base that we can change easily because there are no dependencies on this piece of code.

When using effect sketches the goal is to try and use the sketch to work out how we can group functionality and I think the idea of making initial changes that have a low impact is a good one to follow.

I drew my initial effect sketch on paper but I noticed that drawing it up in graphviz actually makes it even more obvious which bits of functionality are related.

For example I hadn’t realised that ‘fieldB’ was used by so many methods until I typed this up.

It’s quite a neat tool and easy to pick up. This is my ‘dot’ file for the above sketch:

blog.dot

digraph effectgraph {
	size="8,8"; 
 
	"MethodA" -> "fieldA";
	"MethodA" -> "fieldB";
	"MethodA" -> "fieldC";
 
	"MethodB" -> "fieldB";
	"MethodB" -> "fieldE";
	"MethodB" -> "fieldF";
 
	"MethodC" -> "fieldG";
	"MethodC" -> "fieldH"
	"MethodC" -> "fieldD"
 
	"MethodD" -> "fieldB"
	"MethodD" -> "fieldI"
}

And to generate a png I ran the following command from the terminal:

dot -Tpng -blog.png blog.dot

Written by Mark Needham

February 23rd, 2010 at 12:29 am

Posted in Coding

Tagged with