Mark Needham

Thoughts on Software Development

Archive for April, 2009

Coding: Applying levels of abstraction

with one comment

One interesting problem that we often run into when writing code is working out the best time to apply a level of abstraction.

I think there is always a trade off to be made when it comes to creating abstractions – creating the abstraction adds to the complexity of the code we’re writing but it is often the case that creating it makes it easier for us to navigate the code base.

The trick then seems to be working out when the benefits we get from making that abstraction outweigh the complexity/indirection that we create in the code. Of course if we can name the abstraction in an obvious way then it need not be the case that we overcomplicate the code that much.

If we apply a pattern or abstraction effectively then we would hope that the code becomes more expressive and readable.

A recent situation where I was confronted with this decision was in a bit of code being used to render the view model for one of our pages.

The code started out quite simple and there was originally just one path that we would go down when creating that model.

Soon though there was a second path, where the data being rendered differed slightly if there was a logged in user.

It felt like there was a need to try and abstract this into a Model Renderer or something similar but it felt like I would be over engineering the solution if I went for this approach. Possible case of YAGNI linked to my dislike of having to write if statements in the code!

Anyway I didn’t apply the abstraction and now unfortunately that code has ended up having 5 different paths and it’s quite tricky to refactor since the logic is spread all over the place.

Although I felt the decision I made at the time was reasonable I don’t think it satisfied one of the ideas that I’ve picked up from speaking to Dan North – that we should make it easy for people to do the right thing in the code. If I’d gone with the abstraction when I considered it then maybe the code would have taken a different and better direction.

One idea suggested to me recently by Nick about when to create the abstraction is based on the idea of what I think might be a myth of the aboriginal counting system but nevertheless is quite useful here.

The idea is that we only have three numbers: ‘1’, ‘2’ and ‘Many’. When we reach the stage where we have ‘Many’ branches in an if/else statement, for example, that might be a good time to create an abstraction to take care of that complexity.
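
As a rough illustration of the ‘Many’ rule, here is a minimal F# sketch of what pulling the branches into one abstraction might look like. The `User`, `PageModel`, `Visitor` and `renderModel` names are hypothetical and are not taken from the project described above; the point is only that each path gets a name and all of the branching lives in one place.

```fsharp
// Hypothetical types for illustration; none of these names come from the real code base.
type User = { Name: string; IsAdmin: bool }

type PageModel = { Greeting: string; ShowAdminLinks: bool }

// Each path through the rendering code becomes one named case.
type Visitor =
    | Anonymous
    | LoggedIn of User

// All of the branching lives here rather than being spread around the code.
let renderModel visitor =
    match visitor with
    | Anonymous ->
        { Greeting = "Welcome"; ShowAdminLinks = false }
    | LoggedIn user ->
        { Greeting = "Welcome back, " + user.Name; ShowAdminLinks = user.IsAdmin }

printfn "%A" (renderModel (LoggedIn { Name = "Mark"; IsAdmin = false }))
```

With this shape, a new path means adding a case to `Visitor`, and the compiler warns about any `match` that does not yet handle it.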

I know there generally aren’t rules that we can apply at all times in software development but this seems a reasonable rule to keep in mind. Maybe it should also be the type of idea that goes together with the coding conventions that a team decides to follow.

Written by Mark Needham

April 19th, 2009 at 11:03 pm

Posted in Coding

I don’t have time not to test!

with 6 comments

I recently read a blog post by Joshua Lockwood where he spoke of some people who claim they don’t have time to test.

Learning the TDD approach to writing code has been one of the best things that I’ve learnt over the last few years – before I worked at ThoughtWorks I didn’t know how to do it and the only way I could verify whether something worked was to load up the application and manually check it.

It was severely painful and on one particular occasion I managed to put some code with a bug into production because I didn’t know all the places that making that code change would impact.

It’s not a good way of working and I’m glad I’ve been given the opportunity to work with people who have shown me a better way.

My experience pretty much matches a comment made by Chris Missal on the post where he pointed out that you are going to test your code anyway so you might as well automate that test!

“You’re already testing with the debugger, TestPage1.aspx, or whatever… Just save that code and automate it!”

I’ve just spent the last 2 hours doing some refactoring on an F# twitter application I’m working on and because I didn’t write any tests it’s been a very painful experience indeed.

Every time I make a change I have to copy all the code into F# interactive, run the code and then manually make sure that I haven’t broken anything.

I’ve been doing this in fairly small steps – make one change then run it – but the cycle time is still much greater than it would be if I had just put some tests around the code in the first place.

I think we should be looking to test more than just the ‘complex code’ as well – there have been numerous occasions when I’ve put the logic for a conditional statement the wrong way around and a test has come to the rescue.

It pretty much applies to all the languages that I’ve worked in and if we can’t see how to easily create an automated test for a bit of code then it’s a sign that we’re doing something wrong and we might want to take a look at that!

Written by Mark Needham

April 18th, 2009 at 9:25 am

Posted in Testing

F#: Refactoring that little twitter application into objects

with 9 comments

I previously wrote about a little twitter application I’ve been writing to go through my twitter feed and find only the tweets with links in them. While it works, I realised that I was finding it quite difficult to add any additional functionality to it.

I’ve been following the examples in Real World Functional Programming which has encouraged an approach of creating functions to do everything that you want to do and then mixing them together.

This works quite well for getting a quick development cycle but I found that I ended up mixing different concerns in the same functions, making it really difficult to test the code I’ve been working on – I decided not to TDD this application because I don’t know the syntax well enough. I am now suffering from that decision!

Chatting to Nick about the problems I was having he encouraged me to look at the possibility of structuring the code into different objects – this is still the best way that I know for describing intent and managing complexity although it doesn’t feel like ‘the functional way’.

Luckily Chapter 9 of the book (which I hadn’t reached yet!) explains how to restructure your code into a more manageable structure.

I’m a big fan of creating lots of little objects in C# land so I followed the same approach here. I found this post really useful for helping me understand the F# syntax for creating classes and so on.

I started by creating a type to store all the statuses:

type Tweets = { TwitterStatuses: seq<TwitterStatus> }

F# provides quite a nice way of moving between the quick cycle of writing functions and testing them to structuring objects with behaviour and data together by allowing us to append members using augmentation.

From the previous code we have these two functions:

let withLinks (statuses:seq<TwitterStatus>) = 
    statuses |> Seq.filter (fun eachStatus -> eachStatus.Text.Contains("http"))
 
let print (statuses:seq<TwitterStatus>) =
    for status in statuses do
        printfn "[%s] %s" status.User.ScreenName status.Text

We can add these two methods to the Tweets type using type augmentations:

type Tweets with
    member x.print() = print x.TwitterStatuses
    member x.withLinks() = { TwitterStatuses = withLinks x.TwitterStatuses}

It looks quite similar to C# extension methods but the methods are actually added to the class rather than being defined as static methods. The type augmentations need to be in the same file as the type definition.

Next I wanted to put the tweetsharp API calls into their own class. It was surprisingly tricky working out how to create a class with a no argument constructor but I guess it’s fairly obvious in the end.

type TwitterService() = 
    static member GetLatestTwitterStatuses(recordsToSearch) =    
        // Start with no last id, nothing processed and an empty sequence of statuses
        findStatuses(0L, 0, recordsToSearch, Seq.empty)

I managed to simplify the recursive calls to the Twitter API to keep getting the next 20 tweets as well:

let friendsTimeLine = FluentTwitter.CreateRequest().AuthenticateAs("userName", "password").Statuses().OnFriendsTimeline()
let getStatusesBefore (statusId:int64) = 
    if(statusId = 0L) then
        friendsTimeLine.AsJson().Request().AsStatuses()  
    else
        friendsTimeLine.Before(statusId).AsJson().Request().AsStatuses()        
 
let rec findStatuses (args:int64 * int * int * seq<TwitterStatus>) =
    // The status with the lowest id is the oldest one in the page
    let findOldestStatus (statuses:seq<TwitterStatus>) = 
        statuses |> Seq.sortBy (fun eachStatus -> eachStatus.Id) |> Seq.head
    match args with 
    // Stop once we have looked at enough statuses
    | (_, numberProcessed, statusesToSearch, soFar) when numberProcessed >= statusesToSearch -> soFar
    // Otherwise fetch the next page of 20 and carry on from its oldest status
    | (lastId, numberProcessed, statusesToSearch, soFar) ->  
        let latestStatuses = getStatusesBefore lastId
        findStatuses(findOldestStatus(latestStatuses).Id, numberProcessed + 20, statusesToSearch, Seq.append soFar latestStatuses)

To get the tweets we can now do the following:

let myTweets = { TwitterStatuses = TwitterService.GetLatestTwitterStatuses 100 };;
myTweets.withLinks().print();;

I still feel that I’m thinking a bit too procedurally when writing this code but hopefully that will get better as I play around with F# more.

One other lesson from this refactoring is that it’s so much easier to refactor code when you have tests around them – because I didn’t do this I had to change a little bit then run the code manually and check nothing had broken. Painful!
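
To give an idea of the kind of test that would have helped here, below is a minimal sketch that exercises `withLinks` through the `Tweets` type without touching the Twitter API. The `Status` record is an assumed stand-in for tweetsharp's `TwitterStatus`, the field is renamed to `Statuses` for brevity, and the check is hand-rolled rather than written against a test framework, so treat it as an outline rather than the way the application was actually tested.

```fsharp
// Stand-in for tweetsharp's TwitterStatus (an assumption, so the sketch needs no external library).
type Status = { ScreenName: string; Text: string }

let withLinks (statuses: seq<Status>) =
    statuses |> Seq.filter (fun eachStatus -> eachStatus.Text.Contains("http"))

type Tweets = { Statuses: seq<Status> }

// Same augmentation style as above, applied to the stand-in type.
type Tweets with
    member x.withLinks() = { Statuses = withLinks x.Statuses }

// A hand-rolled check; in a real project this would more likely be an xUnit fact.
let shouldKeepOnlyStatusesWithLinks () =
    let linked = { ScreenName = "a"; Text = "worth reading http://example.com" }
    let plain = { ScreenName = "b"; Text = "no link here" }
    let tweets = { Statuses = Seq.ofList [ linked; plain ] }
    let result = tweets.withLinks().Statuses |> Seq.toList
    if result <> [ linked ] then failwithf "unexpected result: %A" result

shouldKeepOnlyStatusesWithLinks ()
```

Because the filtering logic only needs a sequence of statuses, it can be checked in F# interactive in one step instead of re-running the whole application after each change.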

Written by Mark Needham

April 18th, 2009 at 8:47 am

Posted in F#

Coding Dojo #12: F#

with one comment

In our latest coding dojo we worked on trying to port some of the functionality of some C# 1.0 brain models, and in particular one around simulating chaos behaviour, that Dave worked on at university.

The Format

This was more of an experimental dojo since everyone was fairly new to F#, so we didn’t rotate the pair at the keyboard as frequently as we normally would.

What We Learnt

  • The aim of the session was to try and put some unit tests around the C# code and then try and replace that code with an F# version of it piece by piece. We created an F# project in the same solution as the C# one and then managed to hook up the C# and F# projects, referencing the C# one from the F# one, with some success although the references did seem to get slightly confused at times. The support from the IDE isn’t really there yet so it can be a bit tricky at times.
  • We were using the XUnit.NET framework to unit test our code – this seems like a useful framework for testing F# code since it doesn’t require so much setup to get a simple test working. We can just annotate a function with the ‘Fact’ annotation and we’re good to go. One thing to be careful about is to make sure that you are actually creating functions to be evaluated by the test runner and not having the tests evaluated immediately and therefore not being picked up by the runner. For a while I had written a test similar to this and it wasn’t being picked up:
    [<Fact>] let should_do_something = Assert.Equal(2,2)

    The type of ‘should_do_something’ is ‘unit’ and, as I understand it, the expression gets evaluated immediately. What we really want to do though is create a function (with type ‘unit -> unit’) which can be evaluated later on:

    [<Fact>] let should_do_something() = Assert.Equal(2,2)

    The brackets are important, something that I hadn’t appreciated. We were generally running the test by directly calling the test runner from the command line – we couldn’t quite work out how to hook everything up inside Visual Studio.

  • I’m not sure if we went exactly to the book with our refactoring of the code to make it testable – the method on the class doing the work was private so we made it public – but it helped to get us moving. We were able to then replace this with an F# function while verifying that the output was still the same. As I mentioned on my post about the little twitter application I’m working on, I’m intrigued as to how we should structure code in F#. Apparently the answer is as objects but I’m interested how the design would differ from one done in a predominantly OO as opposed to functional language.
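
The difference between the two test definitions in the xUnit point above can be seen without a test runner. The following is a small sketch using a mutable counter; the names `evaluations`, `evaluatedImmediately` and `evaluatedOnCall` are made up for the illustration.

```fsharp
// Counts how many times either body below has run.
let mutable evaluations = 0

// No parameter list: a value of type unit, evaluated once when the definition is loaded.
let evaluatedImmediately = evaluations <- evaluations + 1

// With (): a function of type unit -> unit, whose body runs each time it is called.
let evaluatedOnCall () = evaluations <- evaluations + 1

printfn "after loading: %d" evaluations
evaluatedOnCall ()
evaluatedOnCall ()
printfn "after two calls: %d" evaluations
```

Run as a script in F# interactive, the counter should already be at 1 before anything is called, which is the same reason the test written without brackets had nothing left for the runner to execute.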

For next time

  • I’m really enjoying playing around with F# – it’s definitely interesting learning a different approach to programming than I’m used to – so we might continue working on that next time around. If not then we need to find another game to model!

Written by Mark Needham

April 16th, 2009 at 6:20 pm

Posted in Coding Dojo,F#

Lean: Big Picture over Local Optimisations

with 7 comments

I recently finished reading Lean Thinking and one of the things that was repeatedly emphasised is the need to look at the process as a whole rather than trying to optimise each part individually.

If we phrased this in a similar way to the Agile Manifesto it would probably read ‘Big Picture over Local Optimisations‘.

The examples in Lean Thinking tend to be more manufacturing focused but I think this idea can certainly be applied in thinking about software projects too.

Individual vs Team Focus

For me on a software development team what we care about is the quality of the team as a whole not the individuals on it.

It’s all good being very strong technically but you have to be able to work effectively with other people to solve problems otherwise your value to a development team is much reduced.

I wrote a bit about this previously in my lean look at pair programming but I think it goes beyond just ensuring that we are sharing knowledge around the team by having people collaborating closely with each other.

When I first joined ThoughtWorks my former colleague Fred George was always talking about the value of being poly skilled – having the ability to carry out more than one role – and I think this is lost when we refer to people as being their role i.e. ‘you ARE a developer’ rather than ‘you CAN PLAY the developer role’.

If we have people on a team who are poly skilled or generalising specialists (although I think generalising specialist refers more to being poly skilled across a role rather than being able to carry out multiple roles) then we can get away with building teams which are smaller in number and which are more resilient and able to respond when team members are ill or absent.

Horizontal vs Vertical teams

Creating teams of people by the layer or horizontal which they work on rather than the piece of functionality or vertical which they are working on seems to be one of the most obvious ways of locally optimising instead of looking at the big picture i.e. the delivery of a piece of functionality to the business.

I guess the theory is that if we put everyone working on similar things on the same layer in the same team then it will result in greater productivity than having people who are working on different areas of the system but have the same underlying goals.

The problem is that we create a lot of handover points – opportunities for context and information to be lost – and the ineffectiveness of this approach as a communication mechanism means that it takes much longer to integrate the components from each horizontal than it would have done with a team with people from each layer in it.

In addition any techniques used to try and increase the productivity of an individual team will be ineffective since they will only satisfy a local optimum and will have a knock on effect on the rest of the system.

For example, let’s say we have three teams A, B and C and team B aren’t as productive as we were expecting. Applying pressure to team B may have an impact on their immediate productivity but it’s likely to affect the other two teams since team B will probably be less communicative towards those teams since they are trying to optimise their own performance.

Having a vertical team which is responsible for the delivery of a feature or features end to end works much better from my experience.

Colocated vs Distributed teams

One of my favourite agile ideas is having a co-located team, with the business people, analysts and developers all working in the same physical location.

Sometimes though the decision is taken to distribute teams across multiple locations – be it because that’s more convenient or cheaper to do.

The big picture i.e. project delivery is likely to become more difficult to achieve due to the increased cost of communication that has been created.

For me there is nothing better than face to face communication – even with phone calls/instant messaging/emailing it is still an extremely valuable form of communication. With any of the other three it’s much more difficult to read what someone is really thinking and you can come across as being much more aggressive than you intended to in any of those mediums.

My colleagues in India/China will certainly have some tools/techniques to bridge these gaps that I’m not aware of but I still remain convinced that if we have the choice then a co-located team is the best choice.

In Summary

These are just some of the ideas that came to mind when trying to apply one of the principles of lean to software development teams. Although many of these may seem obvious sometimes it takes another angle of looking at the problem to make that visible.

If you can think of any others feel free to mention them in the comments.

Written by Mark Needham

April 14th, 2009 at 10:10 pm

Posted in Agile,Lean

F#: A day of writing a little twitter application

with 13 comments

I spent most of the bank holiday Monday here in Sydney writing a little application to scan through my twitter feed and find me just the tweets which have links in them since for me that’s where a lot of the value of twitter lies.

I’m sure someone has done this already but it seemed like a good opportunity to try and put a little of the F# that I’ve learned from reading Real World Functional Programming to use. The code I’ve written so far is at the end of this post.

What did I learn?

  • I didn’t really want to write a wrapper on top of the twitter API so I put out a request for suggestions for a .NET twitter API. It pretty much seemed to be a choice of either Yedda or tweetsharp and since the latter seemed easier to use I went with that. In the code you see at the end I have added the ‘Before’ method to the API because I needed it for what I wanted to do.
  • I found it really difficult writing the ‘findLinks’ method – the way I’ve written it at the moment uses pattern matching and recursion which isn’t something I’ve spent a lot of time doing. Whenever I tried to think how to solve the problem my mind just wouldn’t move away from the procedural approach of going down the collection, setting a flag depending on whether we had a ‘lastId’ or not and so on.

    Eventually I explained the problem to Alex and working together through it we realised that there are three paths that the code can take:

    1. When we have processed all the tweets and want to exit
    2. The first call to get tweets when we don’t have a ‘lastId’ starting point – I was able to get 20 tweets at a time through the API
    3. Subsequent calls to get tweets when we have a ‘lastId’ from which we want to work backwards from

    I think it is probably possible to reduce the code in this function to follow just one path by passing in the function to find the tweets but I haven’t been able to get this working yet.

  • I recently watched an F# video from Alt.NET Seattle featuring Amanda Laucher where she spoke of the need to explicitly state types that we import from C# into our F# code. You can see that I needed to do that in my code when referencing the TwitterStatus class – I guess it would be pretty difficult for the use of that class to be inferred but it still made the code a bit more clunky than any of the other simple problems I’ve played with before.
  • I’ve not used any of the functions on ‘Seq’ until today – from what I understand these are available for applying operations to any collections which implement IEnumerable – which is exactly what I had!
  • I had to use the following code to allow F# interactive to recognise the Dimebrain namespace:
    #r @"\path\to\Dimebrain.Tweetsharp.dll"

    I thought it would be enough to reference it in my Visual Studio project and reference the namespace but apparently not.
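
The idea mentioned above of passing in the function that finds the tweets could look roughly like the sketch below. It is not the tweetsharp-based code: `Status` is an assumed stand-in for `TwitterStatus`, `collectStatuses` and `fakeFetch` are hypothetical names, and the fake API only exists so that the example runs offline. The first call and the later calls differ only in whether a last id is available, so an `option` lets a single path handle both.

```fsharp
// Hypothetical stand-in for a tweet; the real code uses tweetsharp's TwitterStatus.
type Status = { Id: int64; Text: string }

// fetchPage is given the oldest id seen so far (None on the first call)
// and returns the next page of older statuses.
// Pages are appended whole, so the result may contain more than 'wanted' statuses.
let rec collectStatuses (fetchPage: int64 option -> Status list) lastId wanted soFar =
    if List.length soFar >= wanted then
        soFar
    else
        match fetchPage lastId with
        | [] -> soFar // nothing older is available, so stop early
        | page ->
            let oldestId = page |> List.map (fun status -> status.Id) |> List.min
            collectStatuses fetchPage (Some oldestId) wanted (soFar @ page)

// A fake API that serves ids 50 down to 1, five at a time, so the sketch runs offline.
let fakeFetch lastId =
    let start =
        match lastId with
        | Some oldest -> oldest - 1L
        | None -> 50L
    let stop = max 1L (start - 4L)
    [ for n in start .. -1L .. stop -> { Id = n; Text = sprintf "tweet %d" n } ]

let statuses = collectStatuses fakeFetch None 12 []
printfn "fetched %d statuses" (List.length statuses)
```

Swapping `fakeFetch` for a function that wraps the real API call would leave `collectStatuses` unchanged, which also makes the recursion testable on its own.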

The code

This is the code I have at the moment – there are certainly some areas that it can be improved but I’m not exactly sure how to do it.

In particular:

  • What’s the best way to structure F# code? I haven’t seen any resources on how to do this so it’d be cool if someone could point me in the right direction. The code I’ve written is just a collection of functions which doesn’t really have any structure at all.
  • Reducing duplication – I hate the fact I’ve basically got the same code twice in the ‘getStatusesBefore’ and ‘getLatestStatuses’ functions – I wasn’t sure of the best way to refactor that. Maybe putting the common code up to the ‘OnFriendsTimeline’ call into a common function and then call that from the other two functions? I think a similar approach can be applied to findLinks as well.
  • The code doesn’t feel that expressive to me – I was debating whether or not I should have passed a type into the ‘findLinks’ function – right now it’s only possible to tell what each part of the tuple means by reading the pattern matching code which feels wrong. I think there may also be some opportunities to use the function composition operator but I couldn’t quite see where.
  • How much context should we put in the names of functions? Most of my programming has been in OO languages where whenever we have a method its context is defined by the object on which it resides. When naming functions such as ‘findOldestStatus’ and ‘oldestStatusId’ I wasn’t sure whether or not I was putting too much context into the function name. I took the alternative approach with the ‘withLinks’ function since I think it reads more clearly like that when it’s actually used.
#light
 
open Dimebrain.TweetSharp.Fluent
open Dimebrain.TweetSharp.Extensions
open Dimebrain.TweetSharp.Model
open Microsoft.FSharp.Core.Operators 
 
let getStatusesBefore (statusId:int64) = FluentTwitter
                                            .CreateRequest()
                                            .AuthenticateAs("userName", "password")
                                            .Statuses()
                                            .OnFriendsTimeline()
                                            .Before(statusId)
                                            .AsJson()
                                            .Request()
                                            .AsStatuses()
 
let withLinks (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) = 
    statuses |> Seq.filter (fun eachStatus -> eachStatus.Text.Contains("http"))
 
let print (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) =
    for status in statuses do
        printfn "[%s] %s" status.User.ScreenName status.Text    
 
let getLatestStatuses  = FluentTwitter
                            .CreateRequest()
                            .AuthenticateAs("userName", "password")
                            .Statuses()
                            .OnFriendsTimeline()
                            .AsJson()
                            .Request()
                            .AsStatuses()                                    
 
let findOldestStatus (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) = 
    statuses |> Seq.sortBy (fun eachStatus -> eachStatus.Id) |> Seq.head
 
let oldestStatusId = (getLatestStatuses |> findOldestStatus).Id  
 
let rec findLinks (args:int64 * int * int) =
    match args with
    // All requested statuses have been processed, so stop
    | (_, numberProcessed, recordsToSearch) when numberProcessed >= recordsToSearch -> ()
    // First call: there is no 'lastId' yet, so start from the latest statuses
    | (0L, numberProcessed, recordsToSearch) -> 
        let latestStatuses = getLatestStatuses
        (latestStatuses |> withLinks) |> print
        findLinks(findOldestStatus(latestStatuses).Id, numberProcessed + 20, recordsToSearch)    
    // Subsequent calls: work backwards from the oldest status seen so far
    | (lastId, numberProcessed, recordsToSearch) ->  
        let latestStatuses = getStatusesBefore lastId
        (latestStatuses |> withLinks) |> print
        findLinks(findOldestStatus(latestStatuses).Id, numberProcessed + 20, recordsToSearch)
 
 
let findStatusesWithLinks recordsToSearch =
    findLinks(0L, 0, recordsToSearch)

And to use it to find the links contained in the most recent 100 statuses of the people I follow:

findStatusesWithLinks 100;;

Any advice on how to improve this will be gratefully received. I’m going to continue working this into a little DSL which can print me up a nice summary of the links that have been posted during the times that I’m not on twitter watching what’s going on.
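
Two of the questions raised above, naming the parts of the tuple and finding a use for the function composition operator, could be approached roughly as follows. This is only a sketch: `Status` is an assumed stand-in for tweetsharp's `TwitterStatus`, and `SearchProgress`, `isFinished` and `printLinks` are hypothetical names that do not appear in the code above.

```fsharp
// Assumed stand-in for tweetsharp's TwitterStatus so the sketch runs without the library.
type Status = { Id: int64; ScreenName: string; Text: string }

let withLinks (statuses: seq<Status>) =
    statuses |> Seq.filter (fun eachStatus -> eachStatus.Text.Contains("http"))

let print (statuses: seq<Status>) =
    for status in statuses do
        printfn "[%s] %s" status.ScreenName status.Text

// Function composition: filter first, then print, as one named step.
let printLinks = withLinks >> print

// Named fields in place of the int64 * int * int tuple.
type SearchProgress = { LastId: int64; Processed: int; ToSearch: int }

// The exit condition now reads without consulting the pattern matching code.
let isFinished progress = progress.Processed >= progress.ToSearch

[ { Id = 2L; ScreenName = "a"; Text = "see http://example.com" }
  { Id = 1L; ScreenName = "b"; Text = "nothing to click" } ]
|> Seq.ofList
|> printLinks
```

A record costs a few more lines than a tuple, but each call site then states what the three numbers mean.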

Written by Mark Needham

April 13th, 2009 at 10:09 pm

Posted in .NET,F#

TDD: Balancing DRYness and Readability

with 11 comments

I wrote previously about creating DRY tests and after some conversations with my colleagues recently about the balance between reducing duplication but maintaining readability I think I’ve found the compromise between the two that works best for me.

The underlying idea is that in any unit test I want to be aiming for a distinct 3 sections in the test – Given/When/Then, Arrange/Act/Assert or whatever your favourite description for those is.

Why?

I find that tests written like this are the easiest for me to understand – there would typically be a blank line between each distinct section so that scanning through the test it is easy to understand what is going on and I can zoom in more easily on the bit which concerns me at the time.

When there’s expectations on mocks involved in the test then we might end up with the meat of the ‘Then’ step being defined before the ‘When’ section but for other tests it should be possible to keep to the structure.

A lot of the testing I’ve been working on recently has been around mapping data between objects – there’s not that much logic going on but it’s still important to have some sort of verification that we have mapped everything that we need to.

We often end up with a couple of tests which might look something like this:

public void ShouldEnsureThatFemaleCustomerIsMappedCorrectly()
{
	var customer = new Customer() 
					{
						Gender = Gender.Female,
						Address = new Address(...)
					};
 
	var customerMessage = new CustomerMapper().MapFrom(customer);
 
	Assert.AreEqual(CustomerMessage.Gender.Female, customerMessage.Gender);
	Assert.AreEqual(new Address(...), customerMessage.Address);
	// and so on...
}
 
public void ShouldEnsureThatMaleCustomerIsMappedCorrectly()
{
	var customer = new Customer() 
				{
					Gender = Gender.Male,
					Address = new Address(...)
				};
 
	var customerMessage = new CustomerMapper().MapFrom(customer);
 
	Assert.AreEqual(CustomerMessage.Gender.Male, customerMessage.Gender);
	Assert.AreEqual(new Address(...), customerMessage.Address);
	// and so on...
}

(For the sake of this example ‘CustomerMessage’ is being auto generated from an xsd)

We’ve got a bit of duplication here – it’s not that bad but if there are changes to the CustomerMessage class, for example, we have more than one place to change.

It is actually possible to refactor this so that we encapsulate nearly everything in the test, but I’ve never found a clean way to do this so that you can still understand the intent of the test.

public void ShouldEnsureThatFemaleCustomerIsMappedCorrectly()
{
	AssertCustomerDetailsAreMappedCorrectly(Gender.Female, CustomerMessage.Gender.Female);
}
 
public void ShouldEnsureThatMaleCustomerIsMappedCorrectly()
{				
	AssertCustomerDetailsAreMappedCorrectly(Gender.Male, CustomerMessage.Gender.Male);
}
 
private void AssertCustomerDetailsAreMappedCorrectly(Gender gender, CustomerMessage.Gender expectedGender)
{			
	// Arrange, act and assert are all hidden inside this one helper
	var customer = new Customer() 
					{				
						Gender = gender,
						Address = new Address(...)
					};
 
	var customerMessage = new CustomerMapper().MapFrom(customer);
 
	Assert.AreEqual(expectedGender, customerMessage.Gender);
	// and so on...	
}

(Of course we would be mapping more than just gender normally but gender helps illustrate the pattern that I’ve noticed)

We’ve achieved our goal of reducing duplication but it’s not immediately obvious what we’re testing because that’s encapsulated too. I find with this approach that it’s more difficult to work out what went wrong when the test stops working, so I prefer to refactor to somewhere in between the two extremes.

public void ShouldEnsureThatFemaleCustomerIsMappedCorrectly()
{
	var customer = CreateCustomer(Gender.Female, new Address(...));
 
	var customerMessage = MapCustomerToCustomerMessage(customer);
 
	AssertFemaleCustomerDetailsAreMappedCorrectly(customer, customerMessage);
}
 
public void ShouldEnsureThatMaleCustomerIsMappedCorrectly()
{
	var customer = CreateCustomer(Gender.Male, new Address(...));
 
	var customerMessage = MapCustomerToCustomerMessage(customer);
 
	AssertMaleCustomerDetailsAreMappedCorrectly(customer, customerMessage);
}
 
private CustomerMessage MapCustomerToCustomerMessage(Customer customer)
{
	return new CustomerMapper().MapFrom(customer);
}
 
private Customer CreateCustomer(Gender gender, Address address)
{
	return new Customer() 
				{
					Gender = gender,
					Address = address
				};
}
 
private void AssertMaleCustomerDetailsAreMappedCorrectly(Customer customer, CustomerMessage customerMessage)
{			
	Assert.AreEqual(CustomerMessage.Gender.Male, customerMessage.Gender);
	// and so on...	
}
 
private void AssertFemaleCustomerDetailsAreMappedCorrectly(Customer customer, CustomerMessage customerMessage)
{			
	Assert.AreEqual(CustomerMessage.Gender.Female, customerMessage.Gender);
	// and so on...	
}

Although this results in more code than the 1st approach I like it because there’s a clear three part description of what is going on which will make it easier for me to work out which bit is going wrong. I’ve also split the assertions for Male and Female because I think it makes the test easier to read.

I’m not actually sure whether we need to put the 2nd step into its own method or not – it’s an idea I’ve been experimenting with lately.

I’m open to different ideas on this – until recently I was quite against the idea of encapsulating all the assertion statements in one method but a few conversations with Fabio have led me to trying it out and I think it does help reduce some duplication without hurting our ability to debug a test when it fails.

Written by Mark Needham

April 13th, 2009 at 12:47 am

Posted in Testing

The Mythical Man Month: Book Review

with 4 comments

The Book

The Mythical Man Month by Fred Brooks Junior

The Review

Pretty much since I started working at ThoughtWorks 2 1/2 years ago I’ve been told that this is a book I have to read and I’ve finally got around to doing so.

Maybe it’s not that surprising but my overriding thought about the book is that just about every mistake that we make in software development today is covered in this book!

What did I learn?

  • The title of the book and its second chapter refer to the situation that surely everyone who has ever worked on a software development project is aware of – if a project is late then adding new people onto it will make it even later. This is due to the fact that a big part of software development is communication and adding people makes that communication more complicated than it previously was, therefore meaning it takes longer to get things done. My colleague Francisco has a nice post describing the ways that adding people can slow down a development team. The idea that a baby can’t be produced any quicker by having 9 women rather than just one is a particularly common metaphor used to explain this.
  • Incompleteness and inconsistencies of ideas only become clear during implementation – pretty much putting a dagger into the idea that we can define everything up front and then code it just like that. This is certainly the area that the agile and lean approaches look to change, and the earlier we can try out different ideas by using approaches such as set based concurrent engineering, the more quickly we can end up with a useful solution.
  • An interesting idea suggested as a successful route to delivering software is the surgical team, with a few very experienced people doing the majority of the coding and being assisted by other members of the team. It sounds quite different to the teams that I have worked on, where everyone on the team is involved, although the objectives behind it seem valid – reducing the communication points and ensuring the conceptual integrity of the solution. Uncle Bob recently wrote about this, describing these teams as master craftsman teams, but it sounds as if this would require quite a radical shift in the recruiting strategies of organisations. Dave Hoover also has an interesting post on this subject but he takes the angle of building apprentices on teams like this.
  • This seems closely linked to another idea about team composition described later on in the book, which speaks of the need for a team to have a technical director and a producer – the technical director sounds quite similar to Toyota’s idea of the Chief Engineer and would be technically in charge, while the producer (Iteration Manager?) is in charge of everything else. The underlying idea here is that we don’t just have one person in charge of a team; there are two distinct and important roles.
  • Brooks says the most important aspect of the design of a system is to ensure its conceptual integrity i.e. a consistent set of design ideas. In order to achieve this Brooks suggests the need for a system architect – while I agree with this idea I think it is more a role, and maybe one that can be done by the Tech Lead on a project. The Poppendiecks also talk of the need for conceptual integrity in Lean Software Development. The point here is to create a system which is easy to use, judged by its ratio of function to conceptual complexity. I am reminded of a Dan North quote at this stage: “We’re done not when there’s nothing more to add, but when there’s nothing more to take away.”
  • The productivity increases gained by using high level languages are mentioned – the underlying idea being that using these allows us to avoid an entire level of exposure to error. I think this makes sense, and as an example I think the introduction of functional collection parameters into C# 3.0 will lead to a reduction in the amount of time spent debugging loop constructs since we no longer have to write these so frequently.
  • When talking about object oriented programming Brooks speaks of the need to design objects which describe the concepts of the client.

    If we design large grained classes that address concepts our clients are already working with, they can understand and question the design as it grows, and they can cooperate in the design of test cases.

    In other words…Domain Driven Design! Reading this part of the book very much reminded me of Phil Will’s QCon presentation where he spoke of the way that the business and software development teams at the Guardian were able to collaborate to drive the design of the domain model for their new website.

  • The idea of only performing system debugging when each individual component actually works is something which should be obvious but is often not followed. If we know a component doesn’t work on its own then we can guarantee it is not going to work when we try to integrate it with other components so the exercise seems slightly pointless to me. Common sense advice I think!
  • Speaking of code reuse, Brooks points out that what matters is the perceived cost of finding a component to reuse – this ties in nicely with an idea from Dan Bergh Johnsson’s QCon presentation:

    Your API has 10-30 seconds to direct a programmer to the right spot before they implement it [the functionality] themselves

  • Brooks talks of the need to have documentation for our projects – he uses a project workbook to do this and I think the modern day equivalent would be the project wiki. The idea of creating self documenting programs to help minimise the documentation that needs to be written is also covered. How we name concepts in our code is especially important in this area.
  • The need to progressively refine the system by growing it rather than building it is suggested later on in the book – the limitations of the waterfall model are described and the approaches of agile/lean are pretty much described – building frequently, getting it working end to end, rapid prototyping and so on.
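As a small illustration of the point above about functional collection parameters, here is a rough sketch comparing a hand-written loop with the equivalent Where/Select chain. The class and method names are made up for this example and don't come from the book or from any project code.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical example class, used only to contrast the two styles.
public static class CollectionExamples
{
    // Loop version: we manage the index and the result list ourselves,
    // which leaves room for off-by-one and bookkeeping mistakes.
    public static List<int> SquaresOfEvensWithLoop(IList<int> numbers)
    {
        var result = new List<int>();
        for (var i = 0; i < numbers.Count; i++)
        {
            if (numbers[i] % 2 == 0)
            {
                result.Add(numbers[i] * numbers[i]);
            }
        }
        return result;
    }

    // LINQ version: Where and Select state the intent and hide the iteration.
    public static List<int> SquaresOfEvensWithLinq(IEnumerable<int> numbers)
    {
        return numbers.Where(n => n % 2 == 0)
                      .Select(n => n * n)
                      .ToList();
    }
}
```

Both methods should return the same result for the same input; the second one just has fewer moving parts to get wrong.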

In Summary

I really enjoyed reading this book and seeing how a lot of the ideas in more modern methodologies were already known about in the 1980s and aren’t in essence new ideas.

I’d certainly recommend this book.

Written by Mark Needham

April 11th, 2009 at 12:33 pm

Posted in Books


Pair Programming: The Code Fairy

with 6 comments

One of the hardest situations that comes up when pair programming is when you want to solve a problem in a certain way but you can’t persuade your pair that it’s the approach you should take.

The temptation in these situations is to wait until your pair isn’t around, maybe by staying late at the end of the day or coming in early the next day and then making the changes to the code that you wanted to make but didn’t when you were pairing with them.

My colleague Cam Swords has coined this pattern of behaviour ‘the code fairy’ and it’s something that we want to try and avoid in software development teams.

The thing to note here is that you’re following a different direction from the one you had been following when pairing, not merely working alone on the code because your pair can’t work with you at the moment.

The problem with following this approach is that you lose one of the benefits of pairing: two pairs of eyes looking at the code, which reduces the likelihood of stupid code being put into the code base – by stupid I mean code which has been written with cleverness in mind rather than simplicity. I find that when I work alone those types of ideas are much more likely to seem like a good idea than when I have someone else there to put me right.

When we can’t convince our pair to follow our approach it is often due to the fact that we can’t articulate why we prefer it well enough to convince them. In a recent conversation with Cam about the problem of persuading people of our approach he pointed out that even if we are really good at constructing an argument for why our approach is better our pair may still not want to do it and we can’t do a lot about it.

Most recently I’ve been really enthused by the idea of writing functional OO code but I find it very frustrating that I can’t currently clearly explain why I prefer this approach to writing code over an approach which favours there being more mutable state.

I really want to go and change the code to follow that approach but it would be inconsistent with the rest of the code base as well as being less clear to my team mates since it would have been me who wrote it alone rather than doing it in collaboration with a colleague.

Another problem is that if we go and re-write the code ourselves the same situation will happen again in the future, so we haven’t really achieved anything apart from temporarily making the code a bit better in our eyes. In addition there are now fewer people on the team who understand that particular bit of code, which is not a good position to be in when we’re working in a team.

It’s definitely not easy to resist the urge to go and ‘fix’ the code on our own but for the benefit of our team mates it’s something we need to consider.

Written by Mark Needham

April 10th, 2009 at 7:28 pm

Posted in Pair Programming


Coding: Passing booleans into methods

with 9 comments

In a post I wrote a couple of days ago about understanding the context of a piece of code before criticising it, one of the examples that I used of a time when it seems fine to break a rule was passing a boolean into a method to determine whether or not to show an editable version of a control on the page.

Chatting with Nick about this yesterday it became clear to me that I’d missed one important reason why you’d not want to pass a boolean into a method.

The first reason I hate passing booleans around is that it usually means we are controlling the path the code should take inside a method rather than just calling the appropriate method ourselves.

The following type of code is not that unusual to see:

public void SomeMethod(bool someBoolean) 
{
	if(someBoolean) 
	{
		// doThis
	}
	else
	{
		// doThat		
	}
}

The client of this method knows what it wants to happen so why not just have two methods, like so:

public void DoThis() 
{
}
public void DoThat() 
{
}

In the specific case I was referring to in the post we had an HtmlHelper (ASP.NET MVC) method called DropDownOrReadOnly which either rendered a drop down with options for a user to select or just displayed the option they had previously selected if they were an existing user.

The boolean in this case was a property on the model which indicated whether or not the user had the ability to change these options.

It was therefore a case of doing an if statement either in the aspx page or inside the helper. Initially we went for putting it in the aspx pages but they started to look so messy that we moved it into the helper.

Now what I totally didn’t see in this example until Nick pointed it out is that where we are passing in a boolean to this method, what we really want is an object which defines a strategy for how we render the control – we can delegate the decision for whether to display a drop down or read only version of the control.

Instead of passing in a boolean we could end up with something like this:

public abstract class EditMode
{
    public static readonly EditMode Editable = new Editable();
    public static readonly EditMode ReadOnly = new ReadOnly();
 
    public abstract void RenderFieldWith(HtmlHelper htmlHelper);
}
public class Editable : EditMode
{
    // The user can change the value, so render the drop down
    public override void RenderFieldWith(HtmlHelper htmlHelper)
    {
        htmlHelper.DropDownList(...);
    }
}
public class ReadOnly : EditMode
{
    // The user can only see the previously selected value, so render a label
    public override void RenderFieldWith(HtmlHelper htmlHelper)
    {
        htmlHelper.Label(...);
    }
}

We’ve added the ‘Label’ method to HtmlHelper as an extension method for the sake of the above example. I’m sure the API for EditMode can be done better but that’s the basic idea.

We could then use it like this:

public static class HtmlHelperExtensions
{
    public static void DropDownOrReadOnly(this HtmlHelper htmlHelper, EditMode editMode)
    {
        editMode.RenderFieldWith(htmlHelper);
    }
}

Again I’ve simplified the API to show the idea of delegating responsibility for how we render the control to the EditMode. Nick has written more about this idea in a post about refactoring to the law of demeter.
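To make the idea a bit more concrete, here is a minimal sketch that runs without ASP.NET MVC. IFieldRenderer and PlainTextRenderer are hypothetical stand-ins for HtmlHelper, and the For method and the string output format are my own assumptions rather than anything from our code base, so treat it as an illustration of the shape rather than the real API.

```csharp
using System;

// Hypothetical stand-in for the two HtmlHelper calls the example needs.
public interface IFieldRenderer
{
    string Label(string text);
    string DropDownList(string[] options, string selected);
}

// Hypothetical renderer that produces plain strings instead of HTML.
public class PlainTextRenderer : IFieldRenderer
{
    public string Label(string text)
    {
        return "[label: " + text + "]";
    }

    public string DropDownList(string[] options, string selected)
    {
        return "[drop down: " + string.Join(", ", options) + "; selected: " + selected + "]";
    }
}

public abstract class EditMode
{
    public static readonly EditMode Editable = new EditableMode();
    public static readonly EditMode ReadOnly = new ReadOnlyMode();

    // Assumed helper: the only place where the boolean is turned into a strategy.
    public static EditMode For(bool userCanChangeOptions)
    {
        return userCanChangeOptions ? Editable : ReadOnly;
    }

    public abstract string RenderFieldWith(IFieldRenderer renderer, string[] options, string selected);

    private sealed class EditableMode : EditMode
    {
        // Editable fields let the user pick from the options.
        public override string RenderFieldWith(IFieldRenderer renderer, string[] options, string selected)
        {
            return renderer.DropDownList(options, selected);
        }
    }

    private sealed class ReadOnlyMode : EditMode
    {
        // Read only fields just display the previously selected value.
        public override string RenderFieldWith(IFieldRenderer renderer, string[] options, string selected)
        {
            return renderer.Label(selected);
        }
    }
}
```

With something like this the if statement lives in one place (the For method in this sketch), and the code doing the rendering just asks the EditMode it has been given to render the field.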

The final reason that passing booleans around is not a great idea is that when you read the code it’s not immediately obvious what’s going on – the API is not expressive at all.

If we compare

HtmlHelper.DropDownOrReadOnly(true)

with

HtmlHelper.DropDownOrReadOnly(EditMode.ReadOnly)

I think it’s clear that with the second approach it’s much easier for someone coming into the code to understand what is going on.

Written by Mark Needham

April 8th, 2009 at 5:43 am

Posted in Coding
