Mark Needham

Thoughts on Software Development

Archive for May, 2009

Tackling the risk early on at a task level

with one comment

I wrote previously about the idea of tackling the risky tasks in a project early on – an idea that I learnt about when reading Alistair Cockburn’s Crystal Clear.

Towards the end of the post I wondered whether we could apply this idea at a story level whereby we would identify the potentially risky parts of a story and make sure that we addressed those risks before they became problematic to us.

I define risk in this sense to mean something that we don’t know a lot about and therefore don’t know how long it is going to take, something which we think might be difficult to do or something which is likely to cause us problems.

From my experience, risk tends to cluster around boundaries with other systems, which are likely to be a black box to us, and in areas where we lack information about the best approach and therefore need to do some research/spiking first.

I’ve been trying to apply this approach and in a recent example my pair and I needed to fix a bug whereby some values on the website weren’t being correctly updated when the user changed a value in a particular field.

We started by informally working out the different changes we might need to make in order to fix this problem:

  • Javascript to handle the page refreshing with the new values
  • C# mapping data from service into JSON object
  • Service call to get back the new values

We had quite a tight timeframe to make this fix and it seemed clear that the call to the service was probably the biggest area of risk in fixing this bug – we had done some investigation which indicated that we hadn’t been sending a particular piece of data to the service and we weren’t sure what would happen when we did.

As it turned out when we did pass this data we weren’t quite getting the response that we expected but we were able to communicate with the other team and get the problem resolved really quickly.

The only thing I would have changed about our approach to this problem was that we tested whether or not the service was working by making a call to it through the UI having ensured that we were passing the correct data through to it.

We would have been able to find out whether our integration was working much more quickly if we had just written an automated test directly hitting the service and made some assertions on the results. This only became apparent to me while watching an Uncle Bob presentation which included a section about the value of testing.

Written by Mark Needham

May 11th, 2009 at 11:54 pm

Posted in Software Development


F#: Regular expressions/active patterns

with 4 comments

Josh has been teaching me how to do regular expressions in Javascript this week. Intrigued as to how you would do this in F#, I came across a couple of blog posts by Chris Smith talking about active patterns and about using regular expressions via active patterns.

As I understand them active patterns are not that much different to normal functions but we can make use of them as part of a let or match statement which we can’t do with a normal function.
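A minimal sketch of that difference, as I understand it. Both patterns here (Words and Even/Odd) are made up for illustration and do not come from Chris Smith's posts:

```fsharp
// Hypothetical examples, not taken from the posts mentioned above.

// A single-case active pattern can be used directly in a let binding.
let (|Words|) (text: string) =
    text.Split(' ') |> List.ofArray

let (Words firstTweetWords) = "learning active patterns"

// A multi-case active pattern classifies every input, so its cases can be
// used as the branches of a match expression.
let (|Even|Odd|) number =
    if number % 2 = 0 then Even else Odd

let describe number =
    match number with
    | Even -> "even"
    | Odd -> "odd"
```

Here firstTweetWords is bound to the list of words by the pattern itself, and describe never needs to know how "even" is decided.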

I wanted to create an active pattern that would be able to tell me if a Twitter status has a url in it and to return me that url. If there are no urls then it should tell me that as well.

This is therefore a partial active pattern, as it does not necessarily match its input. Adapted from Chris Smith’s blog, I ended up with the following active pattern:

open System.Text.RegularExpressions
 
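let (|Match|_|) pattern input =
    let m = Regex.Match(input, pattern)
    // The first group is the whole match, so drop it and keep only the captured groups.
    // (List.tail is the current name for what was List.tl when this was written.)
    if m.Success then Some (List.tail [ for g in m.Groups -> g.Value ]) else None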

This is a generic active pattern which takes in a regular expression and a string and returns an Option: Some containing the captured groups if there is a match, and None if there isn’t.

The ‘_’ in the active pattern definition is the partial bit – we don’t necessarily have a match.

I quite liked what Chris did on the last line of this pattern, whereby the results returned exclude the first item in the group of matches, since this contains the entirety of the matched string rather than the individual captured groups.

I was then able to make use of the active pattern to check whether or not a Tweet contains a url:

let ContainsUrl value = 
    match value with
        | Match "(http:\/\/\S+)" result -> Some(result.Head)
        | _ -> None
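To see it in use, here is a self-contained sketch. The pattern is repeated so the snippet runs on its own, List.tail and List.head stand in for the older List.tl and .Head, and both statuses are invented for illustration:

```fsharp
open System.Text.RegularExpressions

// Same partial active pattern as above, repeated so this sketch is standalone.
let (|Match|_|) (pattern: string) (input: string) =
    let m = Regex.Match(input, pattern)
    if m.Success then Some (List.tail [ for g in m.Groups -> g.Value ]) else None

let containsUrl value =
    match value with
    | Match @"(http://\S+)" result -> Some (List.head result)
    | _ -> None

// Made-up statuses, one with a link and one without.
let withUrl = containsUrl "reading http://example.com/post now"
let withoutUrl = containsUrl "no links in this one"
```

withUrl comes back as Some "http://example.com/post" and withoutUrl as None, so the caller can pattern match on the Option rather than checking for an empty string.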

Active patterns seem pretty cool from my limited playing around with them and are something that I came across by chance when looking around for ways to use regular expressions in F#.

Written by Mark Needham

May 10th, 2009 at 8:58 am

Posted in F#


C#: Using virtual leads to confusion?

with 7 comments

A colleague and I were looking through some code that I worked on a couple of months ago where I had created a one level hierarchy using inheritance to represent the response status that we get back from a service call.

The code was along these lines:

public class ResponseStatus
{
    public static readonly ResponseStatus TransactionSuccessful = new TransactionSuccessful();
    public static readonly ResponseStatus UnrecoverableError = new UnrecoverableError();
 
    public virtual bool RedirectToErrorPage
    {
        get { return true; }
    }
}
 
public class UnrecoverableError : ResponseStatus
{
 
}
 
public class TransactionSuccessful : ResponseStatus
{
    public override bool RedirectToErrorPage
    {
        get { return false; }
    }
}

Looking at it now it does seem a bit over-engineered, but the main confusion with this code is that when you click through to the definition of ‘RedirectToErrorPage’ it goes to the ResponseStatus version of that property and it’s not obvious that it is being overridden in a subclass, which is possible because of my use of the virtual keyword.

You therefore need to look in two places to work out what’s going on which isn’t so good.

A solution which we came up with which is a bit cleaner is like so:

public abstract class ResponseStatus
{
    public static readonly ResponseStatus TransactionSuccessful = new TransactionSuccessful();
    public static readonly ResponseStatus UnrecoverableError = new UnrecoverableError();
 
    public abstract bool RedirectToErrorPage { get; }
}
 
public class UnrecoverableError : ResponseStatus
{
    public override bool RedirectToErrorPage
    {
        get { return true; }
    }
}
 
public class TransactionSuccessful : ResponseStatus
{
    public override bool RedirectToErrorPage
    {
        get { return false; }
    }
}

As you add more response statuses there is a bit more duplication, but that is traded off against the improved ease of use/reading that we get.

It’s generally considered good practice to favour composition over inheritance and from what I can tell the virtual keyword is only ever going to be useful if you’re creating an inheritance hierarchy.

An interesting lesson learned.

Written by Mark Needham

May 6th, 2009 at 7:30 pm

Posted in .NET


Adding humour to Tester/Developer collaboration

without comments

Pat Kua has a recent post where he talks about the language used between testers and developers when discussing defects that testers come across when testing some functionality. While I would agree with him that the language used is important, I’ve always found that injecting some humour into the situation takes the edge off.

As Dahlia points out I think this is probably only possible if there is good rapport between the developers and testers on the team so perhaps this has been the case for the teams I’ve worked on.

I would find it quite disappointing if my first attempt at a story cleared all the way through to business sign off without a tester in the team at least coming up with some cases where it doesn’t work properly – I try to think of the scenarios that someone with a testing hat on would come up with but they are way better at that role than I am so there’s bound to be something that I’ve missed.

Now this doesn’t mean that I should keep recreating the same types of defects/bugs over and over again – that would be the waste of re-learning and doesn’t add a whole lot of value.

In all the teams I’ve worked on there has definitely been a bit of banter between the testers and the developers whereby the testers tell us off ‘tongue in cheek’ for putting so many bugs into the code and we respond by asking them not to keep on breaking the application.

I’ve always felt that this approach worked reasonably well although it should probably be pointed out that I only do that with my ThoughtWorks colleagues where we pretty much have an implicit understanding that we are not criticising each other when talking in such a (supposedly) blunt manner.

If there’s any underlying lesson from this approach then I would suggest it’s that developers would be better off assuming that a tester is probably going to find a bug in their code and that they shouldn’t assume something is finished just because it is development complete.

Testers on the other hand maybe can be less confrontational (as Pat suggests) when they find bugs – the developers didn’t put them in there deliberately! You guys just happen to be way better at using the application in a way that finds its flaws than we are.

Keeping it light hearted is also way more fun!

Written by Mark Needham

May 4th, 2009 at 11:43 pm

Pair Programming: When your pair steps away

without comments

I’ve been having a bit of a discussion recently with some of my colleagues about what we should do when pair programming and one of the people in the pair has to step away to go and help someone else or to take part in an estimation session or whatever it happens to be.

If we’re pairing in an effective way then it should be possible for the person still at the computer to keep on going on the story/task that the pair were working on alone. Obviously sometimes that isn’t the case especially if one person has been driving for the majority of the time but for this post we’ll assume that both people are capable of continuing alone.

Continuing alone doesn’t necessarily mean that you become the code fairy, which is where one of the people in a pair goes and implements the functionality of something they had been pairing on in their own favoured style.

My initial thought is that if the absence is only short term then you shouldn’t plow on too much, otherwise you need to spend time getting them back onto the same page when they return.

To give an example, a couple of weeks ago I was pairing with a colleague and we were retrieving a value from a Dictionary if it existed and creating a value in the Dictionary if it did not exist.

Dave had recently shown me quite a clean way of doing this which I wanted to discuss with my colleague in case they hadn’t seen it before – the approach we had been taking to solve this problem wasn’t along these lines before my pair was called away.

public class DictionaryExample
{
    private readonly Dictionary<string, string> values = new Dictionary<string, string>();
 
    public string FindValue(string key)
    {
        if(!values.ContainsKey(key))
        {
            values[key] = "somethingNew";
        }
        return values[key];
    }
}

When he came back I suggested this approach and he was happy to go with it.

I sometimes write down stuff I’m unsure of when pairing and I find that if my pair goes off for a short amount of time then this can be a useful time to look that up.

If we decide to keep on going during their absence then I think it’s important that we keep going down the same path that we were when we were pairing to reduce the amount of catching up our pair needs to do when they return.

If they have gone away for a longer period of time then we should treat it as them having left the pair and we can look for someone else to pair with or just code as if we were working alone.

That’s my current thinking on this – some colleagues have suggested they think it’s better if we just keep on coding regardless but I think this approach finds a happy medium.

Written by Mark Needham

May 3rd, 2009 at 7:08 pm

Posted in Pair Programming


F#: Stuff I get confused about

with 6 comments

Coming from the world of C# I’ve noticed that there are a couple of things that I sometimes get confused about when playing around with stuff in F# land.

Passing arguments to functions

The way that we pass arguments to functions seems to be a fairly constant cause of confusion at the moment especially when doing that as part of a chain of other expressions where the use of brackets starts to become necessary.

In C# I’m used to putting the arguments in parentheses but that doesn’t quite work in F#.

For example in my twitter application I was trying to append two sequences together similar to this:

let first_item = Seq.singleton("mark")
let second_item = Seq.singleton "needham"
let joined_items = Seq.append (first_item, second_item)

Which doesn’t compile with the following error message:

The type 'b * 'c' is not compatible with the type 'seq<'a>'

What we’ve done here is pass in a tuple containing ‘first_item’ and ‘second_item’ instead of passing them separately as arguments to the function.

The correct way of doing this is like so:

let joined_items = Seq.append first_item second_item
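A small sketch of why the space-separated form matters. The helper names appendToMark and appendPair are mine and only there for illustration:

```fsharp
let firstItem = Seq.singleton "mark"
let secondItem = Seq.singleton "needham"

// Arguments are separated by spaces, so Seq.append takes them one at a time.
let joinedItems = Seq.append firstItem secondItem

// Supplying only the first argument gives back a function that is still
// waiting for the second one.
let appendToMark = Seq.append firstItem
let joinedAgain = appendToMark secondItem

// A tuple is a single value, so it only works with a function written to take one.
let appendPair (first, second) = Seq.append first second
let joinedFromTuple = appendPair (firstItem, secondItem)
```

All three results contain "mark" followed by "needham"; the difference is only in how the arguments reach the function.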

Values and Expressions

As I understand it, everything that we create in F# is an expression, and when those expressions get evaluated we end up with values.

I wrote previously how we got confused about this distinction in a coding dojo a couple of weeks ago. That particular example was around how we need to create functions which take in an argument of type ‘unit’ if they are to be picked up by the XUnit.NET test runner.

Dave explains how this works in the comments of that post:

Given this code:

let should_do_something () = Assert.AreEqual(2,2)


The extra space implies that should_do_something is a function, which takes one argument which is a unit. This is more similar to the syntax for declaring a one argument function where the argument is actually a value, such as

let square_it x = x * x
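The distinction Dave describes can be made visible with a counter. This is only a sketch, and the names evaluations, asValue and asFunction are invented for the purpose:

```fsharp
// A mutable counter, used only to make the moment of evaluation visible.
let mutable evaluations = 0

// No argument: the right-hand side is evaluated once, when the value is bound.
let asValue =
    evaluations <- evaluations + 1
    evaluations

// A unit argument: the body runs every time the function is called.
let asFunction () =
    evaluations <- evaluations + 1
    evaluations

let firstRead = asValue        // reading the value does not re-run anything
let secondRead = asValue
let firstCall = asFunction ()  // each call runs the body again
let secondCall = asFunction ()
```

Both reads of asValue give 1, while the two calls to asFunction give 2 and then 3, which is why a test written without the unit argument would run once at load time instead of when the test runner calls it.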

When we put brackets around the arguments we are passing to a function they stop being passed as separate arguments, as the compiler now tries to evaluate what’s in the brackets first and pass the result to the function.

To give an example from playing around with Seq.append, if we do this:

let joined_items = Seq.append (first_item second_item)

We get a compilation error over ‘first_item’:

The value is not a function and cannot be applied

Here the compiler attempts to evaluate the function ‘first_item’ with an argument ‘second_item’ but since ‘first_item’ is actually a value and not a function this is impossible.
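Brackets are still useful when we do want a sub-expression evaluated first. A short sketch (itemCount and pipedCount are illustrative names):

```fsharp
let firstItem = Seq.singleton "mark"
let secondItem = Seq.singleton "needham"

// The bracketed expression is evaluated first and its result is passed to
// Seq.length as a single argument.
let itemCount = Seq.length (Seq.append firstItem secondItem)

// The same thing written with the pipeline operator, which avoids the brackets.
let pipedCount = Seq.append firstItem secondItem |> Seq.length
```

In both cases the count is 2; the brackets or the pipe just make it clear which call happens first.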

Referencing other types in our code

From my experiences so far it seems that F# uses one pass compilation such that you can only reference types or functions which have been defined either earlier in the file you’re currently in or appear in a file which is specified earlier in the compilation order.

This seems a bit restrictive to me although I’m sure there’s probably some benefits of this approach that I’m not yet aware of, maybe around type checking.
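Within a single file the ordering rule looks like this. It is only a sketch with made-up functions; the same 'and' keyword is also how mutually dependent types are declared together:

```fsharp
// A function can only use names that are defined above it...
let double x = x * 2
let quadruple x = double (double x)

// ...so functions that need to call each other are declared together
// with 'let rec ... and ...'.
let rec isEven n = if n = 0 then true else isOdd (n - 1)
and isOdd n = if n = 0 then false else isEven (n - 1)
```

Swapping the order of double and quadruple would stop the file compiling, whereas isEven and isOdd can refer to each other because they form one declaration.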

Written by Mark Needham

May 2nd, 2009 at 2:38 pm

Posted in F#


F#: Entry point of an application

with 3 comments

In an attempt to see whether or not the mailboxes I’ve been working on for my twitter application were actually processing messages on different threads I ran into the problem of defining the entry point of an F# application.

I thought it would be as simple as defining a function called ‘main’, but when I put this function into my code and ran the executable, nothing happened!

Googling the problem a bit led me to believe that it is possible to do but that the function needs to be the last thing that happens in the compilation sequence of the project, i.e. it needs to be at the end of the last file that gets compiled.

That just seemed wrong to me and a Stack Overflow thread suggested that it should be possible to get around this problem by using the EntryPointAttribute on the main function.

I tried this leading to the following code:

[<EntryPoint>]    
let main args = 
    printfn "in main function"
    0

It seems like if you have the EntryPointAttribute on a function that function needs to be of type ‘string array -> int’ hence the returning of 0 by this function.

This still didn’t solve my problem though and when I tried to build the project it was now failing to compile but no errors were showing up on the Visual Studio error list.

This gave me the chance to try out an F# build tool I came across a couple of weeks ago called Fake.

I set up a build file just to compile the code, based on Steffen Forkmann’s blog post, and then tried to compile the project, leading to the following error:

error FS0191: A function labelled with the 'EntryPointAttribute' atribute must be the last declaration in the last file in the compilation sequence.

Which suggests that it doesn’t actually matter whether or not you have the attribute: the main function still needs to be the last step of compilation.

I wanted to get something working so I’ve just rearranged my fsproj file so that the file with the main function in is the last one listed but it seems a ridiculous fix and I’m sure there must be a better way to do this.

Any ideas?

Written by Mark Needham

May 2nd, 2009 at 1:56 am

Posted in F#


F#: Erlang style message passing

with 3 comments

As I mentioned in my previous post about overloading methods in F# I’ve been trying to refactor my twitter application into a state where it can concurrently process twitter statuses while continuing to retrieve more of them from the twitter website.

I played around a bit with Erlang last year and one thing that I quite liked is the message passing between processes to allow operations to be performed concurrently.

I found a cool blog post by Matthew Podwysocki where he explains how we can achieve Erlang message passing in F# by using mail boxes so I decided to try and follow his example to see if I could do a similar thing with my twitter application.

As far as I understand the Erlang approach to messaging follows the actor model which is defined as follows:

An actor is a computational entity that, in response to a message it receives, can concurrently:

  • send a finite number of messages to other actors
  • create a finite number of new actors
  • designate the behavior to be used for the next message it receives.

I can definitely see the first two ideas in the solution that I’ve ended up with but I’m not sure how you would do the third.
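One way the third point might be expressed with a MailboxProcessor is to give each behaviour its own recursive function and let the current one decide which function handles the next message. A minimal sketch, where the message type, the replies and the names polite and terse are all invented for illustration:

```fsharp
// Hypothetical message type: Ping asks for a reply, Switch changes behaviour.
type ToggleMessage =
    | Ping of AsyncReplyChannel<string>
    | Switch

let agent =
    MailboxProcessor.Start(fun inbox ->
        // Each function is one behaviour; returning into the other function
        // designates the behaviour used for the next message.
        let rec polite () =
            async {
                let! msg = inbox.Receive()
                match msg with
                | Ping reply ->
                    reply.Reply "hello"
                    return! polite ()
                | Switch -> return! terse ()
            }
        and terse () =
            async {
                let! msg = inbox.Receive()
                match msg with
                | Ping reply ->
                    reply.Reply "hi"
                    return! terse ()
                | Switch -> return! polite ()
            }
        polite ())

let beforeSwitch = agent.PostAndReply(fun channel -> Ping channel)
agent.Post Switch
let afterSwitch = agent.PostAndReply(fun channel -> Ping channel)
```

Because the mailbox handles messages in the order they were posted, the first Ping is answered by polite and the one after Switch by terse.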

From reading Joe Armstrong’s Programming Erlang book and Ulf Wiger’s comment on Robert Pickering’s blog, I understand that the code we can create in F# is not exactly the same as what we can do in Erlang since in Erlang each process has its own mailbox whereas in F# a thread can handle more than one mailbox.

The reason for me wanting to do this is because the twitter API only allows me to retrieve 20 statuses at a time and if I’m getting a large number of them my original design means that we are just waiting for the statuses to be accumulated before we can do anything else with them – I want to make this a bit more real time.

This is what the code looks like at the moment:

open System
open Microsoft.FSharp.Control.CommonExtensions
open Microsoft.FSharp.Control
open System.Threading
 
type Message = Phrase of TwitterStatus | Stop
 
type LinkProcessor(callBack) =
  let agent = MailboxProcessor.Start(fun inbox ->
    let rec loop () =
      async {
              let! msg = inbox.Receive()
              match msg with
              | Phrase item ->
                callBack item
                return! loop()
              | Stop ->
                return ()
            }
    loop()
  )
 
  member x.Send(message) =
    match box message with
    | :? seq<TwitterStatus> as message -> message |> Seq.iter (fun message -> agent.Post(Phrase(message)))
    | :? TwitterStatus as message -> agent.Post(Phrase(message))
    | _ -> failwith "Unmatched message type"
 
  member x.Stop() = agent.Post(Stop)
 
let linkProcessor = new LinkProcessor(fun status -> printfn "[%s] %s, thread id: (%d)" status.User.ScreenName status.Text Thread.CurrentThread.ManagedThreadId)
 
let hasLink (message:TwitterStatus) = message.Text.Contains("http")
 
type MainProcessor() =
  let agent = MailboxProcessor.Start(fun inbox ->
    let rec loop () =
      async {
              let! msg = inbox.Receive()
              match msg with
              | Phrase item when item |> hasLink -> 
                linkProcessor.Send(item)
                return! loop()
              | Phrase item ->
                printfn "in mainprocessor, thread id: (%d)" Thread.CurrentThread.ManagedThreadId
                return! loop()
              | Stop ->
                return ()
            }
    loop()
  )
 
  member x.Send(message) =
    match box message with
    | :? seq<TwitterStatus> as message -> message |> Seq.iter (fun message -> agent.Post(Phrase(message)))
    | :? TwitterStatus as message -> agent.Post(Phrase(message))
    | _ -> failwith "Unmatched message type"
 
  member x.Stop() = agent.Post(Stop)
 
let centralProcessor = new MainProcessor()

And this is the code where we process the statuses:

let rec findStatuses (args:int64 * int * int * seq<TwitterStatus>) =
    let findOldestStatus (statuses:seq<TwitterStatus>) = 
        // Seq.sortBy and Seq.head are the current names for Seq.sort_by and Seq.hd.
        statuses |> Seq.sortBy (fun eachStatus -> eachStatus.Id) |> Seq.head
    match args with 
    | (_, numberProcessed, statusesToSearch, soFar) when numberProcessed >= statusesToSearch -> soFar |> ignore
    | (lastId, numberProcessed, statusesToSearch, soFar) ->  
        let latestStatuses = getStatusesBefore lastId
        centralProcessor.Send(latestStatuses)
        findStatuses(findOldestStatus(latestStatuses).Id, numberProcessed + 20, statusesToSearch, Seq.append soFar latestStatuses)

(The rest of the code is here)

There is certainly some duplication in there – I think it should be possible to get a BaseMailboxProcessor – and I found it annoying that I needed to have a different type of mail box processor for each of the cases so that I could have different pattern matching in each.

In Erlang that scaffolding is built into the language and you just need to care about the pattern matching which is the important thing here.
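A rough sketch of how that duplication might be factored out: a single generic processor that owns the receive loop and takes the per-message handling as a function. The Processor type and its member names are hypothetical and not part of the application's code:

```fsharp
// Generic version of the Message type, so the processor is not tied to TwitterStatus.
type Message<'T> =
    | Phrase of 'T
    | Stop

// Hypothetical base processor: the loop lives here, the behaviour is passed in.
type Processor<'T>(handle: 'T -> unit) =
    let agent =
        MailboxProcessor.Start(fun inbox ->
            let rec loop () =
                async {
                    let! msg = inbox.Receive()
                    match msg with
                    | Phrase item ->
                        handle item
                        return! loop ()
                    | Stop -> return ()
                }
            loop ())

    // Separate members replace the runtime type test on a boxed message.
    member this.Send(item: 'T) = agent.Post(Phrase item)
    member this.SendAll(items: seq<'T>) =
        items |> Seq.iter (fun item -> agent.Post(Phrase item))
    member this.Stop() = agent.Post Stop
```

With something like this, the link processor and the main processor would each be a Processor created with a different handler function, and any filtering such as hasLink would move into that handler.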

I’ve set up a callback function that’s passed to the LinkProcessor which prints out the status when it processes it. The next step is to store that somewhere so I can view them later.

Running this, though, the thread id is always the same. The console output looks like this:

in mainprocessor, thread id: (6)
in mainprocessor, thread id: (6)
in mainprocessor, thread id: (6)
in mainprocessor, thread id: (6)
[jbristowe] Beautiful morning in downtown Edmonton: http://twitpic.com/4c6n8 #YEG, thread id: (6)
[MParekh] ABC News does a fly-by correction of a critical 2007 Torture story. http://bit.ly/C9tNH, thread id: (6)

The processing of statuses doesn’t ever interleave either, so it looks like the thread is switching its attention between the two mail boxes.

I was expecting to see different threads processing each mail box but I’m not sure whether that’s a correct expectation or not?

Written by Mark Needham

May 2nd, 2009 at 1:53 am

Posted in F#
