Mark Needham

Thoughts on Software Development

Archive for July, 2009

Book Club: Why no one uses functional languages (Philip Wadler)

with 3 comments

Our latest technical book club discussion was based around Philip Wadler’s paper ‘Why no one uses functional languages’, which he wrote in 1998. I came across this paper when reading some of the F# goals in the FAQs on the Microsoft website.

These are some of my thoughts and our discussion of the paper:

  • One of the points suggested in the paper is that functional languages aren’t used because they aren’t readily available on machines, but as Dave pointed out this doesn’t really seem to be such a big problem these days. I’ve certainly found it relatively painless to get F# set up and running, and even for a language like Ruby people are happy to download and install it on their machines, which is also pretty much painless.
  • Erik pointed us to an interesting article which suggests that functional programming can be very awkward for solving certain problems – I think this is definitely true to an extent although perhaps not as much as we might think. I am certainly seeing some benefit in an overall OO approach with some functional concepts mixed in which seems to strike a nice balance between code which is descriptive yet concise in places. I’m finding the problems that F# is useful for tend to be very data intensive in nature.
  • Matt Dunn pointed out that an e-commerce store written by Paul Graham, which he later sold to Yahoo, was actually written in Lisp – to me this would seem like the type of problem that wouldn’t be that well suited for a functional language but interestingly only part of the system was written in Lisp and the other part in C.

    Viaweb at first had two parts: the editor, written in Lisp, which people used to build their sites, and the ordering system, written in C, which handled orders. The first version was mostly Lisp, because the ordering system was small. Later we added two more modules, an image generator written in C, and a back-office manager written mostly in Perl.

  • The article also suggests that it takes a while for Java programmers to come to grips with functional programs – I would agree with this statement to an extent although one of the things I found really hard when first reading functional programs is the non descriptiveness of the variable names. It seems to be more idiomatic to make use of single letter variable names instead of something more descriptive which I would use in an imperative language.

    I’m intrigued as to whether this will change as more people use functional languages or whether this is just something we will need to get used to.

  • The author makes a very valid point with regards to the risk that a project manager would be taking if they decided to use a functional language for a project:

    If a manager chooses to use a functional language for a project and the project fails, then he or she will certainly be fired. If a manager chooses C++ and the project fails, then he or she has the defense that the same thing has happened to everyone else.

    I’m sure I remember a similar thing being said about the reluctance to make use of Ruby a couple of years ago – it’s something of a risk and human nature is often geared towards avoiding those!

  • I think the availability of libraries is probably very relevant even today – it helps F# a lot that we have access to all the .NET libraries and I imagine it’s also the same for Scala with the Java libraries. I don’t know a lot about the Lisp world but I’m told that people often end up rolling their own libraries for some quite basic things since there aren’t standard libraries available as there are in some other languages.
  • Another paper pointed out as being a good one to read was ‘Functional Programming For The Rest Of Us’ – I haven’t read it yet but it does look quite lengthy! Wes Dyer also has a couple of articles which I found interesting – one around thinking functionally and the other around how functional programming can fit in a mixed programming environment.

I think in general a lot of the points this paper raises have been addressed by some of the functional languages which are gaining prominence more recently – Erlang, F# and Scala to name a few.

It will definitely be interesting to see what role functional languages have to play in the polyglot programming era that my colleague Neal Ford foresees.

Written by Mark Needham

July 8th, 2009 at 12:29 am

C#: Removing duplication in mapping code with partial classes

with one comment

One of the problems that we’ve come across while writing the mapping code for our anti corruption layer is that there is quite a lot of duplication of mapping similar types due to the fact that each service has different auto generated classes representing the same data structure.

We are making SOAP web service calls and generating classes to represent the requests and responses to those end points using SvcUtil. We then translate from those auto generated classes to our domain model using various mapper classes.

One example of duplication which really stood out is the creation of a ‘ShortAddress’ which is a data structure consisting of a postcode, suburb and state.

In order to map addresses we have a lot of code similar to this:

private ShortAddress MapAddress(XsdGeneratedAddress xsdGeneratedAddress)
{
	return new ShortAddress(xsdGeneratedAddress.Postcode, xsdGeneratedAddress.Suburb, xsdGeneratedAddress.State);
}
private ShortAddress MapAddress(AnotherXsdGeneratedAddress xsdGeneratedAddress)
{
	return new ShortAddress(xsdGeneratedAddress.Postcode, xsdGeneratedAddress.Suburb, xsdGeneratedAddress.State);
}

Where the XsdGeneratedAddress might be something like this:

public partial class XsdGeneratedAddress
{
	public string Postcode { get; set; }
	public string Suburb { get; set; }
	public string State { get; set; }
	// random other code
}

It’s really quite boring code to write and it’s pretty much exactly the same apart from the class name.

I realise here that if we were using a dynamic language we wouldn’t have a problem since we could just write the code as if the object being passed into the method had those properties on it.

Sadly we are in C# which doesn’t yet have that capability!

Luckily for us the SvcUtil generated classes are partial classes so (as Dave pointed out) we can create another partial class declaration which implements an interface that we define. We can then refer to types which implement this interface in our mapping code, helping to reduce the duplication.

In this case we create a ‘ShortAddressDTO’ interface with properties that match those on the auto generated class:

public interface ShortAddressDTO 
{
	string Postcode { get; }
	string Suburb { get; }
	string State { get; }
}

We then need to make the generated classes implement this:

public partial class XsdGeneratedAddress : ShortAddressDTO {}

Which means in our mapping code we can now do the following:

private ShortAddress MapAddress(ShortAddressDTO shortAddressDTO)
{
	return shortAddressDTO.ConvertToShortAddress();
}

Which uses this extension method:

public static class ServiceDTOExtensions 
{
	public static ShortAddress ConvertToShortAddress(this ShortAddressDTO shortAddressDTO)
	{
		return new ShortAddress(shortAddressDTO.Postcode, shortAddressDTO.Suburb, shortAddressDTO.State);
	}
}

Which seems much cleaner than what we had to do before.
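To show the pieces working together, here is a minimal self-contained sketch. The shapes of the two generated classes, the extra ‘Country’ property and the sample values are assumptions made up for illustration, since the real SvcUtil output isn’t shown here. Note that all the partial declarations of a class have to live in the same namespace and assembly as the generated code.

```csharp
using System;

// Domain type, built the same way as in the mapping code above.
public class ShortAddress
{
    public ShortAddress(string postcode, string suburb, string state)
    {
        Postcode = postcode;
        Suburb = suburb;
        State = state;
    }

    public string Postcode { get; private set; }
    public string Suburb { get; private set; }
    public string State { get; private set; }
}

public interface ShortAddressDTO
{
    string Postcode { get; }
    string Suburb { get; }
    string State { get; }
}

// Hypothetical stand-ins for two classes SvcUtil might generate.
public partial class XsdGeneratedAddress
{
    public string Postcode { get; set; }
    public string Suburb { get; set; }
    public string State { get; set; }
}

public partial class AnotherXsdGeneratedAddress
{
    public string Postcode { get; set; }
    public string Suburb { get; set; }
    public string State { get; set; }
    public string Country { get; set; } // extra data we don't map
}

// Our hand-written halves: they only attach the shared interface.
public partial class XsdGeneratedAddress : ShortAddressDTO { }
public partial class AnotherXsdGeneratedAddress : ShortAddressDTO { }

public static class ServiceDTOExtensions
{
    // One mapping method now covers every generated address type.
    public static ShortAddress ConvertToShortAddress(this ShortAddressDTO dto)
    {
        return new ShortAddress(dto.Postcode, dto.Suburb, dto.State);
    }
}

public static class Program
{
    public static void Main()
    {
        var first = new XsdGeneratedAddress { Postcode = "2000", Suburb = "Sydney", State = "NSW" };
        var second = new AnotherXsdGeneratedAddress { Postcode = "3000", Suburb = "Melbourne", State = "VIC" };

        Console.WriteLine(first.ConvertToShortAddress().Suburb);
        Console.WriteLine(second.ConvertToShortAddress().Suburb);
    }
}
```

The public auto properties on the generated halves satisfy the getter-only interface, so the hand-written halves can stay empty.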

Written by Mark Needham

July 7th, 2009 at 6:11 pm

Posted in .NET


Domain Driven Design: Anti Corruption Layer

with 6 comments

I previously wrote about some of the Domain Driven Design patterns we have noticed on my project and I think the pattern which ties all these together is the anti corruption layer.

The reason why you might use an anti corruption layer is to create a little padding between subsystems so that they do not leak into each other too much.

Remember, an ANTICORRUPTION LAYER is a means of linking two BOUNDED CONTEXTS. Ordinarily, we are thinking of a system created by someone else; we have incomplete understanding of the system and little control over it.

Even if the model we are using is being defined by an external subsystem I think it still makes sense to have an anti corruption layer, no matter how thin, to restrict any future changes we need to make in our code as a result of external system changes to that layer.

In our case the anti corruption layer is a variation on the repository pattern although we do have one repository per service rather than one repository per aggregate root so it’s not quite the same as the Domain Driven Design definition of this pattern.

[Diagram: anti-corruption.gif]

The mapping code is generally just used to go from our representation of the domain to the representation of the domain that is auto generated from an xsd file.

We also try to ensure that any data which is only important to the service layer doesn’t find its way into the rest of our code.

The code looks a bit similar to this:

public class FooRepository 
{
	private readonly FooService fooService;
 
	public FooRepository(FooService fooService)
	{
		this.fooService = fooService;
	}
 
	public Foo RetrieveFoo(int fooId)
	{
		var xsdGeneratedFooRequest = new FooIdToXsdFooRequestMapper().MapFrom(fooId);
		var xsdGeneratedFooResponse = fooService.RetrieveFoo(xsdGeneratedFooRequest); 
		return new XsdFooResponseToFooMapper().MapFrom(xsdGeneratedFooResponse);
	}
}
public class FooIdToXsdFooRequestMapper 
{
	public XsdGeneratedFooRequest MapFrom(int fooId)
	{
		return new XsdGeneratedFooRequest { fooId = fooId };
	}
}
public class XsdFooResponseToFooMapper 
{
	public Foo MapFrom(XsdGeneratedFooResponse xsdGeneratedFooResponse)
	{
		var bar = MapToBar(xsdGeneratedFooResponse.Bar);
		// and so on
		return new Foo(bar);
	}
}

Right now we are transitioning our code to a place where it conforms more closely to the model being defined in the service layer so inside some of the mappers there is some code which is complicated in terms of the number of branches it has but doesn’t really add much value.

We are in the process of moving to a stage where the mappers will just be moving data between data structures with minimal logic for working out how to do so.

This will lead to a much simpler anti corruption layer but I think it will still add value since the coupling between the sub systems will be contained mainly to the mapper and repository classes and the rest of our code doesn’t need to care about it.
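As a rough sketch of where that might end up, here is a self-contained version of the repository with mappers that only copy data across. The shapes of the generated request/response classes, the ‘FooService’ interface and the stub implementation are assumptions for illustration; in the real code the service is the generated SOAP client.

```csharp
using System;

// Hypothetical stand-ins for the classes generated from the xsd.
public class XsdGeneratedFooRequest { public int fooId { get; set; } }
public class XsdGeneratedFooResponse { public string Bar { get; set; } }

// Our own representation; the only type the rest of the code sees.
public class Foo
{
    public Foo(string bar) { Bar = bar; }
    public string Bar { get; private set; }
}

// Assumption: the service sits behind an interface so it can be stubbed.
public interface FooService
{
    XsdGeneratedFooResponse RetrieveFoo(XsdGeneratedFooRequest request);
}

public class StubFooService : FooService
{
    public XsdGeneratedFooResponse RetrieveFoo(XsdGeneratedFooRequest request)
    {
        return new XsdGeneratedFooResponse { Bar = "bar-" + request.fooId };
    }
}

public class FooIdToXsdFooRequestMapper
{
    public XsdGeneratedFooRequest MapFrom(int fooId)
    {
        return new XsdGeneratedFooRequest { fooId = fooId };
    }
}

public class XsdFooResponseToFooMapper
{
    // No branching: the mapper only moves data between structures.
    public Foo MapFrom(XsdGeneratedFooResponse response)
    {
        return new Foo(response.Bar);
    }
}

public class FooRepository
{
    private readonly FooService fooService;

    public FooRepository(FooService fooService)
    {
        this.fooService = fooService;
    }

    public Foo RetrieveFoo(int fooId)
    {
        var request = new FooIdToXsdFooRequestMapper().MapFrom(fooId);
        var response = fooService.RetrieveFoo(request);
        return new XsdFooResponseToFooMapper().MapFrom(response);
    }
}

public static class Program
{
    public static void Main()
    {
        var foo = new FooRepository(new StubFooService()).RetrieveFoo(7);
        Console.WriteLine(foo.Bar);
    }
}
```

If the service renames a field, only the mapper and the generated classes should need to change; callers of ‘FooRepository’ keep working with ‘Foo’.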

Written by Mark Needham

July 7th, 2009 at 9:05 am

Brownfield Application Development in .NET: Book Review

without comments

The Book

Brownfield Application Development in .NET by Kyle Baley and Donald Belcham

The Review

I asked to be sent this book to review by Manning as I was quite intrigued to see how well it would complement Michael Feathers’ Working Effectively with Legacy Code, the other book I’m aware of which covers approaches to dealing with non green field applications.

What did I learn?

  • The authors provide a brief description of the two different approaches to unit testing – state based and behaviour based – I’m currently in favour of the latter approach and Martin Fowler has a well known article which covers pretty much anything you’d want to know about this topic area.
  • I really like the section of the book which talks about ‘Zero Defect Count’, whereby the highest priority should be to fix any defects that are found in work done previously rather than racing ahead onto the next new piece of functionality:

    Developers are geared towards driving to work on, and complete, new features and tasks. The result is that defect resolution subconsciously takes a back seat in a developer’s mind.

    I think this is quite difficult to achieve when the team is getting pressure to complete new features but then again it will take longer to fix defects if we leave them until later since we need to regain the context around them which is more fresh in our mind the earlier we fix them.

  • Another cool idea is that of time boxing efforts at fixing technical debt in the code base – that way we spend a certain amount of time fixing one area and when the time’s up we stop. I think this will work well as an approach as often when trying to fix code we can either get into the mindset of not fixing anything at all because it will take too long to do so or end up shaving the yak in an attempt to fix a particularly problematic area of code.
  • I like the definition of abstraction that the authors give:

    From the perspective of object-oriented programming, it is the method in which we simplify a complex “thing”, like an object, a set of objects, or a set of services.

    I often end up over complicating code in an attempt to create ‘abstractions’ but by this definition I’m not really abstracting since I’m not simplifying but complicating! This seems like a useful definition to keep in mind when looking to make changes to code.

  • Maintainability of code is something which is seriously undervalued – I think it’s very important to write your code in such a way that the next person who works with it can actually understand what’s going on. The authors have a fantastic quote from Perl Best Practices:

    Always code as if the guy who ends up maintaining your code is a violent psychopath who knows where you live.

    Writing code that is easy for the next person to understand is much harder than I would expect it to be although on teams which pair programmed frequently I’ve found the code easier to understand. I recently read a blog post by Jaibeer Malik where he claims that it is harder to read code than to write code which I think is certainly true in some cases.

  • There is a discussion of some of the design patterns and whether or not we should explicitly call out their use in our code, the suggestion being that we should only do so if it makes our intent clearer.
  • While describing how to refactor some code to loosen its dependencies it’s pointed out that when the responsibilities of a class are a bit fuzzy the name of the class will probably be quite fuzzy too – it seems like this would serve as quite a useful indicator for refactoring code to the single responsibility principle. The authors also suggest trying not to append the suffix ‘Service’ to classes since it tends to be a very overloaded term and a lot of the time doesn’t add much value to our code.
  • It is constantly pointed out how important it is to do refactoring in small steps so that we don’t break the rest of our code and to allow us to get rapid feedback on whether the refactoring is actually working or not. This is something that we’ve practiced in coding dojos and Kent mentions it as being one of his tools when dealing with code – I’ve certainly found that the overall time is much less when doing small step refactorings than trying to do everything in one go.

    I’m quite interested in trying out an idea called ‘Bowling Scorecards’ which my former colleague Bernardo Heynemann wrote about – the idea is to have a card which has a certain number of squares, each square representing a task that needs to be done. These are then crossed off as members of the team do them.

  • An interesting point which is made when talking about how to refactor data access code is to try and make sure that we are getting all the data from a single entry point – this is something which I noticed on a recent project where we were cluttering the controller with two calls to different repositories to retrieve some data when it probably could have been encapsulated into a single call.
  • Although they are talking specifically about poor encapsulation in data access layers, I think the following section about this applies to anywhere in our code base where we expose the inner workings of classes by failing to encapsulate properly:

    Poor encapsulation will lead to the code changes requiring what is known as the Shotgun Effect. Instead of being able to make one change, the code will require you to make changes in a number of scattered places, similar to how the pellets of a shotgun hit a target. The cost of performing this type of change quickly becomes prohibitive and you will see developers pushing to not have to make changes where this will occur.

  • The creation of an anti corruption layer to shield us from 3rd party dependency changes is suggested and I think this is absolutely vital otherwise whenever there is a change in the 3rd party code our code breaks all over the place. The authors also adeptly point out:

    The reality is that when you rely on another company’s web service, you are ultimately at their mercy. It’s the nature of third-party dependencies. You don’t have control over them.

    Even if we do recognise that we are completely reliant on a 3rd party service for our model I think there is still a need for an anti corruption layer even if it is very thin to protect us from changes.

    The authors also describe run time and compile time 3rd party dependencies – I think it’s preferable if we can have compile time dependencies since this gives us much quicker feedback and this is an approach we used on a recent project I worked on by making use of generated classes to interact with a SOAP service rather than using WCF message attributes which only provided us feedback at runtime.
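The point above about getting data from a single entry point could be sketched like this. Every type name here (‘Customer’, ‘Order’, ‘CustomerSummaryRepository’ and so on) is hypothetical and the in-memory data is made up; it is only meant to show a controller making one call instead of talking to two repositories itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical domain types and repositories, for illustration only.
public class Customer { public string Name { get; set; } }
public class Order { public decimal Total { get; set; } }

public class CustomerRepository
{
    public Customer FindById(int id)
    {
        return new Customer { Name = "Customer " + id };
    }
}

public class OrderRepository
{
    public List<Order> FindByCustomerId(int customerId)
    {
        return new List<Order> { new Order { Total = 10m }, new Order { Total = 5m } };
    }
}

public class CustomerSummary
{
    public CustomerSummary(string name, decimal orderTotal)
    {
        Name = name;
        OrderTotal = orderTotal;
    }

    public string Name { get; private set; }
    public decimal OrderTotal { get; private set; }
}

// The single entry point: the controller no longer needs to know
// that the data comes from two different places.
public class CustomerSummaryRepository
{
    private readonly CustomerRepository customers;
    private readonly OrderRepository orders;

    public CustomerSummaryRepository(CustomerRepository customers, OrderRepository orders)
    {
        this.customers = customers;
        this.orders = orders;
    }

    public CustomerSummary Retrieve(int customerId)
    {
        var customer = customers.FindById(customerId);
        var total = orders.FindByCustomerId(customerId).Sum(order => order.Total);
        return new CustomerSummary(customer.Name, total);
    }
}

public static class Program
{
    public static void Main()
    {
        var repository = new CustomerSummaryRepository(new CustomerRepository(), new OrderRepository());
        var summary = repository.Retrieve(42);
        Console.WriteLine(summary.Name + ": " + summary.OrderTotal);
    }
}
```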

In Summary

This book starts off with the very basics of any software development project covering things such as version control, continuous integration servers, automated testing and so on but it gets into some quite interesting areas later on which I think are applicable to any project and not necessarily just ‘brownfield’ ones.

There is a lot of useful advice about making use of abstractions to protect the code against change both from internal and external dependencies and I particularly like the fact that there are code examples showing the progression of the code through each of the refactoring ideas suggested by the authors.

Definitely worth reading although if you’ve been working on any type of agile projects then you’re probably better off skim reading the first half of the book but paying more attention to the second half.

Written by Mark Needham

July 6th, 2009 at 12:43 am

Posted in Books


Domain Driven Design: Conformist

with 2 comments

Something which constantly surprises me about Domain Driven Design is how there is a pattern described in the book for just about every possible situation you find yourself in when coding on projects.

A lot of these patterns appear in the ‘Strategic Design’ section of the book and one which is very relevant for the project I’m currently working on is the ‘Conformist’ pattern which is described like so:

When two development teams have an upstream/downstream relationship in which the upstream has no motivation to provide for the downstream team’s needs, the downstream team is helpless. Altruism may motivate upstream developers to make promises, but they are unlikely to be fulfilled. Belief in those good intentions leads the downstream team to make plans based on features that will never be available. The downstream project will be delayed until the team ultimately learns to live with what it is given. An interface tailored to the needs of the downstream team is not in the cards.

We are working on the front end of an application which interacts with some services to get and save the data from the website.

We realised that we had a situation similar to this originally but didn’t know that it was the conformist pattern. Our original approach was to rely completely on the model in the service layer, to the extent that we were mapping directly from SOAP calls to WCF message objects and then passing these around the code – I originally described this as being an externally defined domain model.

This led to quite a lot of pain as whenever there was a change in the service layer model our code base broke all over the place and we then ended up spending most of the day fire fighting – we were too tightly coupled to an external system.

At this stage we were reading Domain Driven Design in our Technical Book Club and I was fairly convinced that what we really needed to do was have our own model and create an anti corruption layer to translate between the service layer model and the new model that we would create.

We changed our code to follow this approach and created repositories and mappers which were the main places in our code base where we cared about this external dependency and although the isolation of the end point has worked much better we never really ended up with a rich domain model that really represented the business domain.

We had something in between the service layer model and the real business model which didn’t really help anyone and meant we ended up spending a lot of time trying to translate between the different definitions that were floating around.

Writing the code for the anti corruption layer also takes a lot of time, is quite frustrating/tedious and it was hard to see the value we were getting from doing so.

We’ve now reached the stage where we know this is the case and that it probably makes much more sense to just accept it and to not spend any more time trying to create our own model but instead just adapt what we have to more closely match the model we get from the services layer.

We will still keep a thin mapping layer as this gives us some protection against changes that may happen in the service layer.

I think a key thing for me here is that it’s really easy to be in denial about what is actually happening since what you really want is to be in control of your own domain model and design it so that it closely matches the business so that they would be able to read and understand your code if they wanted to. Sometimes that isn’t the case.

Chatting with Dave about this, he suggested that a lesson for us here is that it’s important to know which pattern you are following, which Andy Palmer also pointed out on Twitter.

Written by Mark Needham

July 4th, 2009 at 10:17 am

Coding Dojo #19: Groovy Traveling salesman variation

with one comment

Our latest coding dojo involved working on a variation of the traveling salesman problem in Groovy again.

The Format

We had 8 people participating this week so we returned to the Randori format, rotating the pair at the keyboard every 7 minutes.

Given the number of people it might have actually been better to have a couple of machines and use the UberDojo format.

What We Learnt

  • The importance of just getting started stood out a lot for me in this dojo – there have been quite a few times when we’ve met intending to do some coding and spent so long talking about coding that we didn’t end up writing anything. Luckily Dave took the lead in this dojo and got the ball rolling. The code we wrote originally wasn’t perfect but it helped create the momentum to keep the session going so it was valuable in that way.
  • Another interesting feature of dojos for me is that it really doesn’t matter if you make mistakes – if you write really terrible code in a dojo it’s probably a good thing since you’ll probably not go and repeat the same mistake on a real project. I learnt a lot about the perils of not refactoring early enough and having too much state in our code from our Isola Dojo a few months ago.
  • We refactored much earlier than we normally do in this dojo and I think it worked really well for allowing us to progress later on. Often we fall into the trap of just chasing the green bar a bit too much and we forget to clean up the code after each cycle but we handled that a bit better in this one.

    We also backed up a bit after around 3 cycles after realising that the code was becoming a bit horrific and spent 1 cycle working it into shape for the next one.

  • We fell into the trap of going several cycles with broken tests while trying to do some redesign on the code – the steps were clearly not small enough!

    Later on we corrected this when refactoring the code into a more functional style by taking very small steps and running the tests after each small change – this was a far more effective approach.

  • Although we were working in a dynamic language it didn’t feel that the conversations were that different when discussing the code – we were still talking about types when working out what to do. I’m not sure whether this means we haven’t quite got the idea of dynamic languages or whether there isn’t such a big difference between the way you talk about your code in them.

For next time

  • We might continue with another problem in Groovy – it’s been quite fun working in a language that runs on the JVM without the verbosity you sometimes get when writing Java code.

Written by Mark Needham

July 4th, 2009 at 9:36 am

Posted in Coding Dojo


F#: Pattern matching with the ‘:?’ operator

without comments

I’ve been doing a bit more reading of the Fake source code and one interesting thing which I came across which I hadn’t seen was an active pattern which was making use of the ‘:?’ operator to match the input type against .NET types.

  let (|File|Directory|) (fileSysInfo : FileSystemInfo) =
    match fileSysInfo with
      | :? FileInfo as file -> File (file.Name)
      | :? DirectoryInfo as dir -> Directory (dir.Name, seq { for x in dir.GetFileSystemInfos() -> x })
      | _ -> failwith "No file or directory given."

I thought maybe this was just a wild card operator to say that we don’t care what the value is as long as it matches ‘FileInfo’ or ‘DirectoryInfo’ respectively but I couldn’t see it defined on the list of operators on the Microsoft Research website.

A bit of googling led me to Matthew Podwysocki’s post about pattern matching which explained the purpose of the operator (about 1/3 of the way down):

What the above example does is check for the corresponding .NET types by using the ‘:?’ operator especially reserved for this behavior.

I’ve been playing around with a simple ‘add’ function to try and understand F#’s type inference and one thing I noticed is that if you just define it with minimal code you end up with a function which takes in 2 integers and returns an integer as the result:

let add a b = a + b
 
val add: int -> int -> int

I had thought that the signature and result of that function might remain generic due to the fact that there are more types than just ‘int’ with which you can make use of the addition operator.

For example, it is possible to add two strings together but in fact you need to be more explicit about that:

let add (a:string) (b:string) = a + b
 
val add: string -> string -> string

From what I can tell if we wanted to write a generic add function we would need to do something like this – I originally tried just returning ‘newA + newB’ from each of the pattern matches but the return type of add3 then becomes ‘string’ since the first path in the pattern matching returns a ‘string’.

    let add3 a b =
        match (box a,box b) with
            | (:? string as newA),(:? string as newB) -> newA +  newB |> box
            | (:? int as newA),(:? int as newB) -> newA + newB |> box
            | (:? decimal as newA),(:? decimal as newB) -> newA + newB |> box
            | _ -> failwith "you can't add these together"

Which is slightly verbose and has a type of ‘'a -> 'b -> obj’ – I haven’t been able to work out whether it’s possible to create a generic function like this without needing to box the result to ‘obj’.

I thought it might be possible to get rid of the boxing by making use of the downcast operator:

You can also use the downcast operator to perform a dynamic type conversion. The following expression specifies a conversion down the hierarchy to a type that is inferred from program context.

I tried surrounding the ‘newA + newB |> box’ code with a call to ‘downcast’ but that just resulted in the following error message when trying to make use of the function:

Value restriction. The value 'it' has been inferred to have generic type
	val it : '_a
Either define 'it' as a simple data term, make it a function with explicit arguments or, if you do not intend for it to be generic, add a type annotation.

I’d be intrigued to see if anyone has worked out how to do this as I’m out of ideas.

Written by Mark Needham

July 2nd, 2009 at 11:10 pm

Posted in F#


Book Club: Logging – Release It (Michael Nygard)

without comments

Our latest technical book club session was a discussion of the logging section in Michael Nygard’s Release It.

I recently listened to an interview with Michael Nygard on Software Engineering Radio so I was interested in reading more of his stuff and Cam suggested that the logging chapter would be an interesting one to look at as it’s often something which we don’t spend a lot of time thinking about on software development teams.

These are some of my thoughts and our discussion of the chapter:

  • An idea which Nick introduced on a project I worked on last year was the idea of having a ‘SupportTeam‘ class that could be used to do any logging of information that would be useful to the operations/support team that looked after our application once it was in production.

    This is an approach also suggested by Steve Freeman/Nat Pryce in Growing Object-Oriented Software (in the ‘Logging is a feature’ section) and the idea is that we will then focus more on logging the type of information that is actually useful to them rather than just logging what we think is needed.

    One thing which Dave pointed out is that it’s often difficult to get access to the operations team to try and get their requirements for the type of logging and monitoring they need and so often ends up being something that’s done very late on. On projects I’ve worked on there has often been a story card for logging and I think this is a good way to go as they are a stakeholder of the system so logging shouldn’t just be dealt with as a nice extra.

  • Something which I hadn’t considered until reading this book is the idea of making logs human readable and machine parseable as well. The default format of most of the logging tools is not actually that useful when you’re trying to scan through hundreds of lines of data and it was intriguing how a little indentation could improve this so dramatically with the added benefit of making it much easier to create a regular expression to find what you want.
  • One thing I’m interested in understanding is how we work out what’s too much logging and what’s too little, since it seems that the answer to this question is fairly context sensitive. For example on a recent project we logged all unhandled exceptions that came from the system as well as any exceptions that happened when retrieving data from the service layer. In general the data we’ve had available has been enough to solve problems but we could probably have done more; it just isn’t obvious how to work out what would be useful.
  • I think it was Alex who pointed out that it’s often useful to have an explicit step in the build to remove any debug logging from the code so that it doesn’t end up in production by mistake. This seems like a pretty neat idea although I haven’t seen it done yet – it also leads towards the idea that logging is for the operations team which I think is correct although it is often suggested that logging is actually for developers since it is assumed that they would be the ones to eventually solve any problems that arise.
  • The idea of having message codes for specific error messages seems like a really cool idea for allowing easy searching of log files – we’ve done this on some projects I’ve worked on and not on others. I guess the key here is to ensure we don’t end up with too many different error codes otherwise it’s just as confusing as not having them at all.
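A minimal sketch of how the ‘SupportTeam’ idea, message codes and a machine-parseable line format might fit together. The method names, the codes (‘SVC001’, ‘APP001’) and the pipe-separated layout are all assumptions for illustration rather than anything taken from the book or from our projects.

```csharp
using System;
using System.IO;

// Logging expressed in terms of what the operations/support team needs
// to know, rather than scattered calls to a generic logger.
public class SupportTeam
{
    private readonly TextWriter log;

    public SupportTeam(TextWriter log)
    {
        this.log = log;
    }

    public void NotifyServiceCallFailed(string serviceName, Exception exception)
    {
        Write("ERROR", "SVC001", "Call to " + serviceName + " failed: " + exception.Message);
    }

    public void NotifyUnhandledException(Exception exception)
    {
        Write("ERROR", "APP001", "Unhandled exception: " + exception.Message);
    }

    // Fixed columns separated by " | " keep each line readable for a person
    // scanning the file and easy to match with a regular expression.
    private void Write(string level, string code, string message)
    {
        log.WriteLine("{0:yyyy-MM-dd HH:mm:ss} | {1,-5} | {2} | {3}",
            DateTime.UtcNow, level, code, message);
    }
}

public static class Program
{
    public static void Main()
    {
        var supportTeam = new SupportTeam(Console.Out);
        supportTeam.NotifyServiceCallFailed("AddressService", new TimeoutException("timed out"));
        supportTeam.NotifyUnhandledException(new InvalidOperationException("something broke"));
    }
}
```

Searching the log for a given code then becomes a matter of matching the third column, and keeping the codes as a short list inside one class makes it harder for them to multiply unnoticed.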

Written by Mark Needham

July 2nd, 2009 at 12:04 pm

Posted in Book Club
