Mark Needham

Thoughts on Software Development

Archive for the ‘Book Club’ Category

Book Club: Why no one uses functional languages (Philip Wadler)

with 3 comments

Our latest technical book club discussion was based around Philip Wadler’s paper ‘Why no one uses functional languages‘ which he wrote in 1998. I came across this paper when reading some of the F# goals in the FAQs on the Microsoft website.

These are some of my thoughts and our discussion of the paper:

  • One of the points suggested in the paper is that functional languages aren’t used because of their lack of availability on machines, but as Dave pointed out this doesn’t really seem to be such a big problem these days. Certainly for F# I’ve found it relatively painless to get it set up and running, and even for a language like Ruby people are happy to download and install it on their machines, which is also pretty much painless.
  • Erik pointed us to an interesting article which suggests that functional programming can be very awkward for solving certain problems – I think this is definitely true to an extent although perhaps not as much as we might think. I am certainly seeing some benefit in an overall OO approach with some functional concepts mixed in which seems to strike a nice balance between code which is descriptive yet concise in places. I’m finding the problems that F# is useful for tend to be very data intensive in nature.
  • Matt Dunn pointed out that an e-commerce store written by Paul Graham, which he later sold to Yahoo, was actually written in Lisp – to me this would seem like the type of problem that wouldn’t be that well suited for a functional language but interestingly only part of the system was written in Lisp and the other part in C.

    Viaweb at first had two parts: the editor, written in Lisp, which people used to build their sites, and the ordering system, written in C, which handled orders. The first version was mostly Lisp, because the ordering system was small. Later we added two more modules, an image generator written in C, and a back-office manager written mostly in Perl.

  • The article also suggests that it takes a while for Java programmers to come to grips with functional programs – I would agree with this statement to an extent, although one of the things I found really hard when first reading functional programs was how non-descriptive the variable names are. It seems to be more idiomatic to make use of single letter variable names instead of the more descriptive names which I would use in an imperative language.

    I’m intrigued as to whether this will change as more people use functional languages or whether this is just something we will need to get used to.

  • The author makes a very valid point with regards to the risk that a project manager would be taking if they decided to use a functional language for a project:

    If a manager chooses to use a functional language for a project and the project fails, then he or she will certainly be fired. If a manager chooses C++ and the project fails, then he or she has the defense that the same thing has happened to everyone else.

    I’m sure I remember a similar thing being said about the reluctance to make use of Ruby a couple of years ago – it’s something of a risk and human nature is often geared towards avoiding those!

  • I think the availability of libraries is probably very relevant even today – it helps F# a lot that we have access to all the .NET libraries and I imagine it’s also the same for Scala with the Java libraries. I don’t know a lot about the Lisp world but I’m told that people often end up rolling their own libraries for some quite basic things since there aren’t standard libraries available as there are in some other languages.
  • Another paper pointed out as being a good one to read was ‘Functional Programming For The Rest Of Us‘ – I haven’t read it yet but it does look quite lengthy! Wes Dyer also has a couple of articles which I found interesting – one around thinking functionally and the other around how functional programming can fit in a mixed programming environment.

I think in general a lot of the points this paper raises have been addressed by some of the functional languages which are gaining prominence more recently – Erlang, F# and Scala to name a few.

It will definitely be interesting to see what role functional languages have to play in the polyglot programming era that my colleague Neal Ford foresees.

Written by Mark Needham

July 8th, 2009 at 12:29 am

Book Club: Logging – Release It (Michael Nygard)

without comments

Our latest technical book club session was a discussion of the logging section in Michael Nygard’s Release It.

I recently listened to an interview with Michael Nygard on Software Engineering Radio so I was interested in reading more of his stuff and Cam suggested that the logging chapter would be an interesting one to look at as it’s often something which we don’t spend a lot of time thinking about on software development teams.

These are some of my thoughts and our discussion of the chapter:

  • An idea which Nick introduced on a project I worked on last year was the idea of having a ‘SupportTeam‘ class that could be used to do any logging of information that would be useful to the operations/support team that looked after our application once it was in production.

    This is an approach also suggested by Steve Freeman/Nat Pryce in Growing Object Oriented Software (in the ‘Logging is a feature’ section) and the idea is that we will then focus more on logging the type of information that is actually useful to the support team rather than just logging what we think is needed.

    One thing which Dave pointed out is that it’s often difficult to get access to the operations team to try and get their requirements for the type of logging and monitoring they need, and so logging often ends up being something that’s done very late on. On projects I’ve worked on there has often been a story card for logging and I think this is a good way to go: the operations team are a stakeholder of the system so logging shouldn’t just be dealt with as a nice extra.

  • Something which I hadn’t considered until reading this book is the idea of making logs human readable and machine parseable as well. The default format of most of the logging tools is not actually that useful when you’re trying to scan through hundreds of lines of data and it was intriguing how a little indentation could improve this so dramatically with the added benefit of making it much easier to create a regular expression to find what you want.
  • One thing I’m interested in understanding is how we work out what’s too much logging and what’s too little since it seems that the answer to this question is fairly context sensitive. For example on a recent project we logged all unhandled exceptions that came from the system as well as any exceptions that happened when retrieving data from the service layer. In general the data we’ve had available has been enough to solve problems but we could probably have done more; it’s just that working out what would be useful doesn’t seem obvious.
  • I think it was Alex who pointed out that it’s often useful to have an explicit step in the build to remove any debug logging from the code so that it doesn’t end up in production by mistake. This seems like a pretty neat idea although I haven’t seen it done yet – it also leads towards the idea that logging is for the operations team which I think is correct although it is often suggested that logging is actually for developers since it is assumed that they would be the ones to eventually solve any problems that arise.
  • The idea of having message codes for specific error messages seems like a really cool idea for allowing easy searching of log files – we’ve done this on some projects I’ve worked on and not on others. I guess the key here is to ensure we don’t end up with too many different error codes, otherwise it’s just as confusing as not having them at all.
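
As a rough illustration of the message code and machine parseable points above, here is a minimal Python sketch. Only the ‘SupportTeam‘ name comes from our discussion; the notify method, the ‘ORD-1001‘ style codes and the line layout are all invented for the example, so treat it as one possible shape rather than what we actually built.

```python
import re

# Hypothetical layout: a padded level column so lines scan easily by eye,
# followed by a message code that a regular expression can pick out.
LINE_PATTERN = re.compile(r"^(?P<level>\w+)\s+\[(?P<code>[A-Z]+-\d+)\]\s+(?P<message>.*)$")


class SupportTeam:
    """Collects the messages intended for the operations/support team."""

    def __init__(self):
        self.lines = []

    def notify(self, level, code, message):
        # Pad the level so the code column lines up across entries.
        line = f"{level:<5} [{code}] {message}"
        self.lines.append(line)
        return line


def find_by_code(lines, code):
    """Return the message text of every line carrying the given code."""
    matches = (LINE_PATTERN.match(line) for line in lines)
    return [m.group("message") for m in matches if m and m.group("code") == code]


support_team = SupportTeam()
support_team.notify("ERROR", "ORD-1001", "Order service timed out")
support_team.notify("WARN", "ORD-2001", "Retrying order lookup")
support_team.notify("ERROR", "ORD-1001", "Order service timed out again")

timeouts = find_by_code(support_team.lines, "ORD-1001")
```

Keeping the set of codes small, as suggested above, is what keeps a search like find_by_code useful.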

Written by Mark Needham

July 2nd, 2009 at 12:04 pm

Posted in Book Club

Book Club: The Readability of Tests – Growing Object Oriented Software (Steve Freeman/Nat Pryce)

with 2 comments

Our technical book club this week focused on ‘The Readability of Tests‘ chapter from Steve Freeman & Nat Pryce’s upcoming book ‘Growing Object Oriented Software, guided by tests‘.

I’ve been reading through some of the other chapters online and I thought this would be an interesting chapter to talk about as people seem to have different opinions on how DRY tests should be, how we build test data, how we name tests and so on.

These were some of my thoughts and our discussion on the chapter:

  • I found it interesting that there wasn’t any mention of the BDD style of test naming whereby the name of the test begins with ‘should…’. I’ve been using this style of naming for about 2 years now as I find it useful for allowing us to question whether or not the test is valid. There are equally arguments against using the word ‘should’ as it’s not particularly assertive and perhaps we ought to be more certain about what our tests are asserting.

    Recently I have started to move more towards Jay Fields’ idea that test names are just comments and if we write tests to be really clear and readable then the test name becomes redundant.

  • The chapter talks about the order in which the authors write their tests, the approach being to try and start with the assertion first and then write the execution and setup steps. My current approach is to write the execution step first and then build up the setup and expectations almost simultaneously. I’ve never been able to quite get the hang of writing the test bottom up but it’s something I might experiment with again.
  • Refactoring tests is something I’ve written about previously and my current thinking is that our aim shouldn’t be to remove absolutely all duplication in tests but instead remove it to a stage where we can still easily understand the test when it fails. This seems to fit in with the authors’ idea of ‘refactoring but not too hard’.

    I am currently following the idea of having three distinct areas in my tests (Given, When, Then) with each section separated by an empty line. I find writing them in this style makes it easier for me to quickly work out why a test is failing.

    I was recently watching Jay Fields’ presentation from SpeakerConf and Michael Feathers makes an interesting comment that we need to keep in mind that the reason for removing duplication in code is so that when we need to make changes we know where to do that. In test code the test failing will tell us where we need to make changes so the need to remove duplication to do this is less.

    I’m still heavily in favour of trading duplication for better readability when it comes to writing tests.

  • The idea of keeping consistency in tests is an important one although I think it’s difficult to keep this consistency across the whole suite of tests. Certainly within a single test fixture it should be possible though.

    One example of something which doesn’t follow this approach is the ‘ExpectedException’ annotation in JUnit/NUnit which goes against the style of pretty much all other tests.

  • When it comes to setting up test data I think it’s pretty much given that test data builders are a really good way to help remove noise and duplication from our tests. Other patterns such as object mother can be useful but they don’t seem to work as well when you have multiple different ways that you want to set up your data for tests.
  • There’s no specific mention of ‘Setup’ and ‘Teardown’ methods in the chapter but this is another area which I think has an important impact on readability.

    I’m not yet completely against tear down methods for integration style tests but I’ve seen a lot of pain caused by putting mocks in setup methods, and even just having the setup method means that you have to go up and down the test fixture just to work out what’s going on. I prefer to try and keep all the context needed for a test in one place.

  • I found the section about the way that we name literals/variables in tests to be particularly interesting as this is a discussion I’ve been having with a couple of colleagues recently.

    I find it useful to state why that variable is important or not important for this particular test (i.e. give it context) so that someone can easily understand what’s going on when they look at the test. For example if we have a variable in a test that doesn’t affect the outcome then it might be useful to name it ‘stubFoo’ or ‘irrelevantFoo’ or something similar.

    I’ve previously been against the idea of naming dependencies we’re mocking/stubbing as ‘mockRepository’ or ‘stubRepository’ but I’ve been trying this out a bit this week and it exposed some mistakes I’d made which I don’t think I would have seen otherwise.

  • Another idea which I quite liked is the idea of only testing one feature set per test.

    I’ve certainly written a lot of tests which break this rule and you really suffer when you need to make a change later on.

    Jay Fields also applies this rule to mocks whereby you can only have one expectation per test but as many stubs as you want.

    I’ve been trying out both these approaches this week and although there’s probably more code overall as a result of writing more tests, each of the tests feels much more succinct and understandable.
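
To make a few of these ideas concrete, here is a small Python sketch that combines a test data builder, the Given/When/Then layout and a value named to show that it doesn’t matter to the test. The Order and OrderBuilder classes are hypothetical and not taken from the chapter; the comments marking each section are only there for this example, since the blank lines alone would normally do that job.

```python
import unittest


class Order:
    def __init__(self, customer, quantity, unit_price):
        self.customer = customer
        self.quantity = quantity
        self.unit_price = unit_price

    def total(self):
        return self.quantity * self.unit_price


class OrderBuilder:
    """Test data builder: defaults for everything, override only what matters."""

    def __init__(self):
        self._customer = "irrelevant customer"
        self._quantity = 1
        self._unit_price = 10

    def with_quantity(self, quantity):
        self._quantity = quantity
        return self

    def with_unit_price(self, unit_price):
        self._unit_price = unit_price
        return self

    def build(self):
        return Order(self._customer, self._quantity, self._unit_price)


class OrderTotalTest(unittest.TestCase):
    def test_should_multiply_quantity_by_unit_price(self):
        # Given
        order = OrderBuilder().with_quantity(3).with_unit_price(5).build()

        # When
        total = order.total()

        # Then
        self.assertEqual(15, total)
```

Everything the test depends on sits inside the test method, so there is no setup method to scroll up to when it fails.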

Written by Mark Needham

June 20th, 2009 at 11:26 am

Posted in Book Club

Book Club: Arguments and Results (James Noble)

without comments

We restarted our book club again last week by reading James Noble’s Arguments and Results paper, a paper I came across from a Michael Feathers blog post a few months ago detailing 10 papers that every programmer should read.

We decided to try out the idea of reading papers/individual chapters from books as it allows us to vary the type of stuff we’re reading more frequently and is an approach which Obie seems to be having some success with.

The first half of the paper describes some approaches for dealing with complexity in the arguments that we send to methods and the second half describes approaches for the results that we get back from methods.

We split the different patterns between us and attempted to come up with examples in code we’d worked on where we’d seen each of the patterns in use.

Arguments Object

This pattern is used to simplify a method’s signature by grouping the common arguments into an object and changing the method signature to take in this object instead.

The author quotes Alan Perlis as saying “If you have a procedure with 10 parameters you probably missed some” which I think best sums up the reason why you would want to do this refactoring.

Another advantage of doing this, which Cam pointed out, is that it may help to bring out domain concepts which weren’t explicit before.

The disadvantage of this approach is that we make it more difficult for clients to use our API. I think this was best summed up in a comment on a post I wrote about using weak or strong APIs by ‘Eric’:

The choice is between passing (1) what the client probably has on hand already (the pieces) vs. passing (2) what the class demands (a pre-formed object). Which serves the client better and more naturally?

Which brick-and-mortar shop gets more business–the one taking credit cards (already in the customers’ hands), or the one demanding cash in some foreign currency (which the customers first have to go get)?

Subordinate the service’s convenience to the client’s. Accept what a client likely has at the ready. (At the least, offer an overloaded constructor that does so.)

A simple example from some code I worked on recently involved refactoring a few methods similar to this:

public void Process(string streetName, string streetNumber, string state) { }

to:

public void Process(Address address) { }

Some other examples which we thought of were:

  • The way that Ruby makes use of hashmaps to pass data into constructors instead of passing in each option individually to a separate parameter
  • The specification pattern/Query object pattern

Selector Object

This pattern is used to try and reduce the number of similar methods that exist on an object by creating one method which takes in the original object and an additional ‘selector object’ argument which is used to determine what exactly we do inside the new method.

The author also describes another way of solving this problem which is to build a small inheritance hierarchy and then make use of the double dispatch pattern to determine which specific method needs to be called.

The advantage of this pattern is that it helps to simplify the API of the object as there is now just one method to call to perform that specific operation.

The disadvantage is that the client of this object needs to do more work to use it although again I feel that this pattern helps to draw out domain concepts and move behaviour to the correct places.

I worked on a project where we did something similar to this although I’m not sure if it’s exactly the same:

public abstract class OurObject 
{
	public abstract void Configure(Configuration configuration);
}
 
public class ObjectOne : OurObject
{
	// Each subclass calls back the method that matches its own type
	public override void Configure(Configuration configuration)
	{
		configuration.ConfigureObjectOne(this);
	}
}
 
public class ObjectTwo : OurObject
{
	public override void Configure(Configuration configuration)
	{
		configuration.ConfigureObjectTwo(this);
	}
}
 
public class Configuration
{
	public void ConfigureObjectOne(ObjectOne objectOne)
	{
		// do config stuff
	}	
 
	public void ConfigureObjectTwo(ObjectTwo objectTwo)
	{
		// do config stuff
	}	
	// ...
}

We felt that possibly the composite pattern might work well especially for the example described in the paper although we didn’t come up with any other examples of this pattern on code we’d worked on.
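
For comparison, the selector object form described at the start of this section might look something like the following Python sketch. The Report, CsvFormat and JsonFormat names are invented for illustration and are not the example from the paper, so I may well be stretching the pattern a little here.

```python
import json


class CsvFormat:
    """Selector object: chooses comma-separated output."""

    def render(self, rows):
        return "\n".join(",".join(str(value) for value in row) for row in rows)


class JsonFormat:
    """Selector object: chooses JSON output."""

    def render(self, rows):
        return json.dumps(rows)


class Report:
    def __init__(self, rows):
        self.rows = rows

    # One method stands in for export_csv(), export_json() and so on;
    # the selector argument decides what actually happens.
    def export(self, output_format):
        return output_format.render(self.rows)


report = Report([[1, "apples"], [2, "pears"]])
as_csv = report.export(CsvFormat())
as_json = report.export(JsonFormat())
```

Adding another output format means adding another selector class rather than another method on Report, which is where the simpler API comes from.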

Curried Object

This pattern simplifies the arguments that a client needs to send to an object by introducing a ‘curried object’ which takes care of any values that the object needs but which the client doesn’t need to worry about (arguments which don’t really change, for example).

The client now sends its arguments to the ‘curried object’ which is then sent to the object.

I’ve come across the idea of currying while playing around with F# and in the functional world it seems to be about composing functions together, each of which only takes in one argument. I’m not sure if the author uses such a strict definition as that.

The advantage of this pattern is that it helps to simplify things for the client of the object although we now add in a level of indirection into the code because we send data to the ‘curried object’ which is actually executed by another object.

I’m not sure whether it’s strictly currying but an example which seemed to fit into this pattern is where we make calls to external services which might require some arguments passed to them that the rest of our application doesn’t care about.

We therefore keep this setup inside a ‘Gateway’ object which the rest of our code interacts with. The ‘Gateway’ object can then send the external services this data and the data we pass it.

Iterators are suggested as being the most common use of currying in our code as they shield us from the internals of how the data is stored.
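
The ‘Gateway’ idea above could be sketched roughly like this in Python. The PaymentGateway class, the call_payment_service function and its arguments are all made up for the example; the second half uses functools.partial, which is strictly partial application rather than currying, but it shows the same idea of fixing some arguments up front.

```python
from functools import partial


def call_payment_service(api_key, region, account_id, amount):
    """Stand-in for an external service that needs extra arguments."""
    return f"{region}:{api_key}:{account_id}:{amount}"


class PaymentGateway:
    """Holds the arguments that rarely change so callers never see them."""

    def __init__(self, api_key, region):
        self._api_key = api_key
        self._region = region

    def charge(self, account_id, amount):
        return call_payment_service(self._api_key, self._region, account_id, amount)


gateway = PaymentGateway(api_key="secret", region="eu")
receipt = gateway.charge("acc-42", 100)

# The functional equivalent: fix the leading arguments up front.
charge = partial(call_payment_service, "secret", "eu")
same_receipt = charge("acc-42", 100)
```

The rest of the application only ever sees charge(account_id, amount), which is the simplification the pattern is after.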

Result Object

This pattern is used to allow us to keep the results of potentially expensive operations so that we don’t need to make several of these expensive calls to get the data that we want.

The advantage of doing this is that we are able to make our code more efficient although the client needs to do more work to get the data they care about out of the returned object.

An example of this could be if we are making a call across the network to get some data. If there are three pieces of data that we want then it makes more sense to get all this data in one ‘result object’ instead of making individual calls for the data.

I think this pattern is also useful in situations where there are multiple outcomes from an operation and we want to signify this in the result we return.

An example of where this might be useful could be if we want to know whether a call was successful or not and if it wasn’t then we want details about the way in which it failed. This could easily be modeled in a result object.

I’m not sure whether an Option/Maybe in functional programming could be considered to be a result object – they do return more information than other data types do although this isn’t for performance purposes.
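
Both uses of a result object described above, bundling several pieces of data from one call and reporting how a call failed, can be sketched in a few lines of Python. The CustomerResult fields and the fetch_customer function are hypothetical, and a dictionary plays the part of the remote server.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CustomerResult:
    """Bundles several pieces of data, plus the outcome, from one call."""

    succeeded: bool
    name: Optional[str] = None
    email: Optional[str] = None
    balance: Optional[int] = None
    failure_reason: Optional[str] = None


def fetch_customer(customer_id, remote_data):
    """Stand-in for a single network round trip; remote_data plays the server."""
    record = remote_data.get(customer_id)
    if record is None:
        return CustomerResult(succeeded=False, failure_reason=f"no customer {customer_id}")
    return CustomerResult(succeeded=True, **record)


remote_data = {7: {"name": "Ann", "email": "ann@example.com", "balance": 120}}
found = fetch_customer(7, remote_data)
missing = fetch_customer(8, remote_data)
```

The client pays for the convenience by having to check succeeded and then pull out the fields it wants, which is the extra work mentioned above.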

Future Object

This pattern is used when we want to perform an expensive operation and do something else while we wait for that operation to return – effectively we want to asynchronously process a result and then probably call a callback when it’s done.

This pattern is useful when we want to go and get some data via a network call but we don’t want to freeze up the user interface while we’re doing that. I think this pattern is probably more applicable for client side applications than on the web where the typical approach I’ve seen is to block the user from doing anything while an operation is being executed. Perhaps something like Gmail does make use of this pattern though, I’m not sure.

The concurrency aspects should be taken care of by the ‘future object’ in this pattern meaning that the future object will be more complicated than other code.

F# asynchronous workflows certainly seem to be an example of this pattern whereby we make use of other threads to make network calls or put data into a database before returning results to the main thread when they’re done.
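
Outside of F#, the same shape shows up in Python’s standard concurrent.futures module, so here is a small sketch using it. The slow_lookup function is a made-up stand-in for a network call, and a short sleep takes the place of the real delay.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_lookup(key):
    """Stand-in for a network call."""
    time.sleep(0.1)
    return key.upper()


completed = []

with ThreadPoolExecutor(max_workers=1) as executor:
    future = executor.submit(slow_lookup, "order-1")
    # The callback runs once the result is available.
    future.add_done_callback(lambda done: completed.append(done.result()))

    # The caller is free to carry on while the lookup runs.
    other_work = sum(range(10))

    # result() blocks only if the value is not ready yet.
    value = future.result()
```

The threading lives inside the executor and the future, so the calling code stays simple while the ‘future object’ carries the complexity.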

Lazy Object

This pattern is used when we want to return a result but we don’t know whether or not that method will actually be called – we therefore only get the data when the method is actually called.

The advantage of this is that we don’t get data unnecessarily although it can be difficult to debug since we don’t know exactly when the data is going to be fetched.

An example of this is Hibernate which by default lazy loads our data. If we later on try to access some data inside an aggregate root then we need to ensure that we have a Hibernate session open so that it is able to go and fetch the data for us.

F# also has a ‘lazy’ keyword which we can use to create lazy values which are only evaluated when specifically called:

let foo value = 
    printfn "%d" value
    value > 10
 
let fooBar = lazy foo 10    
 
> fooBar.Force();;
10
false

LINQ in C# also makes use of lazy evaluation.
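
A hand-rolled version of the same idea, sketched in Python with functools.cached_property, looks something like this. The Customer class and the load_orders function are hypothetical; the calls list is only there to show when the load actually happens.

```python
from functools import cached_property


class Customer:
    def __init__(self, customer_id, load_orders):
        self.customer_id = customer_id
        self._load_orders = load_orders

    @cached_property
    def orders(self):
        # Runs on first access only; the result is cached afterwards.
        return self._load_orders(self.customer_id)


calls = []


def load_orders(customer_id):
    """Stand-in for a database query; records each time it is invoked."""
    calls.append(customer_id)
    return ["book", "pen"]


customer = Customer(7, load_orders)
calls_before_access = len(calls)   # nothing has been fetched yet
first = customer.orders            # triggers the load
second = customer.orders           # served from the cache
```

The debugging difficulty mentioned above shows up here too: the load happens wherever orders is first touched, which may be a long way from where the Customer was created.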

In Summary

I think this is a really interesting paper and it was the first one that caught my eye from briefly skimming through the 10 that Michael Feathers listed.

I found it quite difficult explaining some of the patterns so if anything doesn’t make sense or you can think of a better way of describing a pattern then please let me know.

Book club wise it was good to get to discuss what I’d read as others always come up with ideas that you hadn’t thought of and we had some interesting discussions.

Next time we are reading ‘The Readability of Tests‘ from Steve Freeman and Nat Pryce’s upcoming book ‘Growing Object Oriented Software, guided by tests‘.

Written by Mark Needham

June 16th, 2009 at 11:37 pm