Mark Needham

Thoughts on Software Development

Archive for the ‘F#’ tag

F#: Values, functions and DateTime


One of the things I’ve noticed recently in my playing around with F# is that when we decide to wrap calls to the .NET DateTime methods there is a need to be quite careful that we are wrapping those calls with an F# function and not an F# value.

If we don’t do this then the DateTime method will only be evaluated once and then return the same value for every call which is probably not the behaviour we’re looking for.

The following shows how we could wrap a call to get the current time in string format inside a value:

let timeNow = System.DateTime.Now.ToLongTimeString()

If we then execute ‘timeNow’ to show the current time before and after a sleep this is what we see:

printfn "The time now is %s" timeNow
System.Threading.Thread.Sleep(2000)
printfn "The time now is %s" timeNow

The time now is 2:00:29 PM
The time now is 2:00:29 PM

As we can see, the time has remained the same despite the fact that we put a sleep in between the two calls.

Looking at the C# version of this code via Reflector we can see that ‘timeNow’ is just a string:

public string timeNow;

The way to get the real current time is to define a function to get the time so that it will be reevaluated each time we ask for the time:

let timeNowUpToDate () = System.DateTime.Now.ToLongTimeString()

If we do the same test as we did above:

printfn "The time now is %s" (timeNowUpToDate())
System.Threading.Thread.Sleep(2000)
printfn "The time now is %s" (timeNowUpToDate())

We get the following results:

The time now is 2:05:13 PM
The time now is 2:05:15 PM

Which is what we were looking for in the first place!

Via Reflector again this is what that code would look like in C#:

[Serializable]
internal class timeNowUpToDate@127 : FastFunc<Unit, string>
{
    // Methods
    internal timeNowUpToDate@127()
    {
    }
 
    public override string Invoke(Unit unitVar0)
    {
        return DateTime.Now.ToLongTimeString();
    }
}

As we can see every time the function is invoked a call to the DateTime API will be made which is what we want to happen.
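The same difference can be seen without waiting on the clock. The following is a minimal sketch, assuming a hypothetical counter (‘calls’ and ‘next’ are made-up names, not part of the original code) standing in for ‘DateTime.Now’:

```fsharp
// Hypothetical counter standing in for DateTime.Now so the effect is deterministic.
let mutable calls = 0
let next () =
    calls <- calls + 1
    calls

// A value: the right-hand side runs once, when the binding is evaluated.
let asValue = next ()

// A function: the body runs again on every call.
let asFunction () = next ()

printfn "%d %d" asValue asValue                  // the same number twice
printfn "%d %d" (asFunction ()) (asFunction ())  // two different numbers
```

The unit argument ‘()’ is what turns the binding into a function, so leaving it out silently caches the first result.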

Written by Mark Needham

July 25th, 2009 at 2:10 pm

Posted in F#


F#: Active patterns for parsing xml


I decided to spend some time doing some refactoring on the FeedBurner application that I started working on last week and the first area I worked on was cleaning up the way that the xml we get from FeedBurner is parsed.

While playing around with the application from the command line I realised that it didn’t actually cover error conditions – such as passing in an invalid feed name – very well and I thought this would be a good opportunity to make use of an active pattern to handle this.

I wanted to try and test drive this bit of code so my first idea was to try and call the active pattern directly from my test – I am testing using NUnit 2.5 which now allows us to create tests without the need for a class with a [TestFixture] attribute on:

[<Test>]
let should_return_no_feed_given_invalid_xml () =
	let feedType = Xml.(|NoFeedFound|FeedBurnerFeed|) "invalid xml"
	// other code

let (|NoFeedFound|FeedBurnerFeed|) xml = 
	NoFeedFound()

The problem I ran into with this approach is that the value of feedType when this test ran was ‘Microsoft.FSharp.Core.Choice`2+_Choice1Of2’ and I couldn’t see a way to access this at compile time in order to assert against it. Either way, a test asserting that the return value was ‘Choice1Of2’ doesn’t seem to be the most expressive test anyway.

I chatted about this a bit with Dave and he suggested that it would probably be easier to test the active pattern via the function which actually makes use of it.
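To illustrate the idea of going through a function that uses the pattern, here is a small sketch. The pattern body is a hypothetical stand-in (it only checks whether the input starts with ‘<’) and ‘describe’ is a made-up helper, so this is not the FeedBurner code itself:

```fsharp
open System

// Hypothetical stand-in for the post's pattern: anything starting with "<" counts as a feed.
let (|NoFeedFound|FeedBurnerFeed|) (xml: string) =
    if xml.StartsWith("<", StringComparison.Ordinal) then FeedBurnerFeed xml
    else NoFeedFound "not xml"

// Matching names the case directly, so a test never has to mention Choice1Of2.
let describe xml =
    match xml with
    | NoFeedFound reason -> "no feed: " + reason
    | FeedBurnerFeed _ -> "feed"

printfn "%s" (describe "invalid xml")
```

A test can then assert on what ‘describe’ returns (or on an exception, as in the tests that follow) rather than on the compiler-generated choice type.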

I ended up with the following three tests:

open FeedBurnerService
 
[<Test>]
let should_throw_exception_if_feed_xml_is_invalid () =
    Assert.Throws<FailureException>(fun () -> FeedBurnerService.Parse "some broken xml" |> ignore) |> ignore
 
[<Test>]
let should_throw_exception_if_no_feed_found () =
    let feedXml = @"<?xml version=""1.0"" encoding=""utf-8"" ?>
                    <rsp stat=""fail"">
                        <err code=""1"" msg=""Feed Not Found"" />
                    </rsp>"
 
    Assert.Throws<FailureException>((fun () -> FeedBurnerService.Parse feedXml |> ignore), "Failed to process feed: Feed Not Found") |> ignore   
 
[<Test>]     
let should_retrieve_circulation_and_date_if_valid_xml () =
    let feedXml = @"<?xml version=""1.0"" encoding=""UTF-8""?>
                    <rsp stat=""ok"">
                        <feed id=""tdv0bg210cr731gc3nssn512cg"" uri=""MarkNeedham"">
                            <entry date=""2009-07-16"" circulation=""630"" hits=""1389"" reach=""629"" />
                        </feed>
                    </rsp>"
 
    let feedBurnerApi = FeedBurnerService.Parse feedXml
    let entry = feedBurnerApi |> Entries |> Seq.hd
 
    Assert.AreEqual(entry.Circulation, 630) 
    Assert.AreEqual(entry.Date, "2009-07-16")

The interesting thing here is that the ‘Assert.Throws’ method takes in a C# delegate so we need to wrap the call to ‘FeedBurnerService.Parse’ inside a function. As with xUnit.NET’s equivalent method we need to ignore the results of the function call in these tests.

module FeedBurnerService = 
    open System.Xml.Linq
    open System
 
    let xName value = XName.Get value
    let GetDescendants element (xDocument:XDocument)  = xDocument.Descendants(xName element)
    let GetAttribute element (xElement:XElement) = xElement.Attribute(xName element)   
 
    type FeedBurnerApi(entries:seq<Entry>) =
        member x.Entries = entries
    and Entry(date : string, circulation : int) =
        member x.Date = date
        member x.Circulation = circulation
 
    let Entries (feedBurnerApi:FeedBurnerApi) = feedBurnerApi.Entries 
 
    let (|NoFeedFound|FeedBurnerFeed|) xml = 
        try 
            let document = xml |> XDocument.Parse
            let entries = document |> 
                          GetDescendants "entry" |> 
                          Seq.map (fun element -> GetAttribute "circulation" element, GetAttribute "date" element) |>
                          Seq.map (fun attribute -> new Entry(circulation =  Int32.Parse((fst attribute).Value), date = (snd attribute).Value) )
 
            match Seq.length entries with 
                | 0 -> NoFeedFound((document |> GetDescendants "err" |> Seq.hd |> GetAttribute "msg").Value)
                | _ -> FeedBurnerFeed(new FeedBurnerApi(entries))
        with 
            | :? System.Xml.XmlException as ex -> NoFeedFound(ex.Message)
 
    let Parse xml =
        match xml with 
            | NoFeedFound(error) -> failwith ("Failed to process feed: " + error)
            | FeedBurnerFeed(entries) -> entries

I continued using the idea of creating F# functions to wrap C# style method calls with the ‘Entries’ function which delegates to the ‘Entries’ property on ‘FeedBurnerApi’ which reduces the need to store intermediate state. I probably could have done the same for the ‘Date’ and ‘Circulation’ properties although I couldn’t see a significant improvement in the readability of the code by doing this.

I have also made use of the ‘and’ keyword to define the ‘Entry’ type because it is referenced by the ‘FeedBurnerApi’ type and therefore needs to be defined at that stage. The other way to ensure this was the case would be to define ‘Entry’ before ‘FeedBurnerApi’ although this doesn’t seem to read as nicely to me.

We are making use of a multi case active pattern in the code which means that the input we are processing with the active pattern can be split into two different things in this case. Don Syme goes into more detail on the different types of active patterns in his paper and Chris Smith also covers them in his post.

The code for the active pattern feels a bit too imperative at the moment although I wasn’t sure of the best way to cover the different scenarios without writing it this way – no doubt there is a more functional way to do this but I can’t see it yet.

Making use of the active pattern in the code has made it much easier to work with than passing around a sequence of tuples as I was doing previously. It has also made it easy to exit from the program early if there is a problem with the data inputted.

Written by Mark Needham

July 19th, 2009 at 12:12 pm

Posted in F#


F#: Passing command line arguments to a script


I’ve been doing a bit of refactoring of my FeedBurner application so that I can call it from the command line with the appropriate arguments and one of the problems I came across is working out how to pass arguments from the command line into an F# script.

With a compiled application we are able to make use of the ‘EntryPointAttribute‘ to get access to the arguments passed in:

[<EntryPointAttribute>]
let main args =
    ShowFeedBurnerStats args
    0

Sadly this doesn’t work with a script but it was pointed out on Hub FS that we can get access to all the command line arguments by using ‘Sys.argv’ or ‘System.Environment.GetCommandLineArgs()’ which seems to be the preferred choice of the compiler.

The problem is that with that method you get every single argument passed to the command line and there are some that we don’t care about given the way you would typically call an F# script:

fsi --exec --nologo CreateFeedBurnerGraph.fsx -- "scotthanselman" "2009-03-01" "2009-07-14"

Results in the following arguments:

fsi
--exec
--nologo
CreateFeedBurnerGraph.fsx
--
scotthanselman
2009-03-01
2009-07-14

We care about everything after the ‘--’ so I wrote a little function to just gather those values:

    let GetArgs initialArgs  =
        let rec find args matches =
            match args with
            | hd::_ when hd = "--" -> List.to_array (matches)
            | hd::tl -> find tl (hd::matches) 
            | [] -> Array.empty
        find (List.rev (Array.to_list initialArgs) ) []

I’m not sure this works for every possible case (if you put ‘--’ in as an argument it wouldn’t work as expected!) but it’s doing the job so far.

An even better way of doing this, which I came across while writing this, is to use ‘fsi.CommandLineArgs’, which allows you to just get the arguments passed to the script. Even with this approach, though, the ‘--’ is still counted as one of the arguments so the function above still makes sense.

GetArgs [|"--"; "scotthanselman"; "2009-03-01"; "2009-07-14"|]
val it : string array = [|"scotthanselman"; "2009-03-01"; "2009-07-14"|]

And from the script I have the following:

let programArgs = fsi.CommandLineArgs |> GetArgs
ShowFeedBurnerStats programArgs
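As a possible alternative to the recursive ‘GetArgs’, the sketch below uses current F# core functions (‘Array.tryFindIndex’ and ‘Array.skip’). ‘argsAfterSeparator’ is a hypothetical name, and unlike ‘GetArgs’ it keeps everything after the first ‘--’ rather than the last, so a later ‘--’ passed as a real argument survives:

```fsharp
// Hypothetical alternative to GetArgs: keep everything after the first "--".
let argsAfterSeparator (args: string[]) =
    match Array.tryFindIndex (fun arg -> arg = "--") args with
    | Some index -> Array.skip (index + 1) args
    | None -> Array.empty

let sample = [| "fsi"; "--exec"; "CreateFeedBurnerGraph.fsx"; "--"; "scotthanselman"; "2009-03-01" |]
printfn "%A" (argsAfterSeparator sample)
```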

Written by Mark Needham

July 16th, 2009 at 7:40 am

Posted in F#


F#: A day writing a Feedburner graph creator


I’ve spent a bit of the day writing a little application to take the xml from my Feedburner RSS feed and create a graph showing the daily & weekly average subscribers.

What did I learn?

  • I decided that I wanted to parameterise the feedburner url so that I would be able to run the code for different time periods and against different feeds. In C# we’d probably make use of ‘string.Format()’ which has an equivalent in F# called ‘sprintf’

    My initial thought was that I would be able to do something like this:

    let ShowFeedBurnerStats feed =
        let statsUrl = "https://feedburner.google.com/api/awareness/1.0/GetFeedData?uri=%s&dates=2009-01-01,2009-07-11"
        sprintf statsUrl feed |> GetXml
        // more code

    Which actually results in the following compilation error:

    The type 'string' is not compatible with the type 'Printf.StringFormat

    After a bit of searching I found a post by Robert Pickering where he explains that the format string needs to be next to the sprintf function to work as expected:

    let ShowFeedBurnerStats feed =
        let statsUrl = sprintf "https://feedburner.google.com/api/awareness/1.0/GetFeedData?uri=%s&dates=2009-01-01,2009-07-11"
        statsUrl feed |> GetXml
        // more code

    ‘statsUrl’ therefore becomes a function taking in a ‘string’ and returning a ‘string’.

  • I’m still trying to work out the best way to decompose the code I write into functions which make sense in terms of the domain I’m working in.

    I often found myself splitting up a function along the boundary of where any I/O interaction was happening so that I could execute the I/O function and save the data before using it in another function which I would execute a lot more frequently (using F# interactive) while I was tweaking it.

  • I still haven’t come up with a completely satisfactory approach to coding these little applications – right now I’m finding that the feedback cycle is significantly quicker if I just write functions and then run them in F# interactive and then tweak anything which isn’t working as expected.

    I didn’t write any unit tests while coding this although I did find myself writing shorter functions than I originally did when writing my little twitter application. The problem of not writing the tests is that I lose the protection against regression that I would otherwise get.

  • I still have a bit of a love hate relationship with tuples – I found myself making use of them early on when I was focused on getting the code to work and I could still understand the code easily.

    Originally I was only storing ‘date’ and ‘circulation’ in the tuple but once I added a third value to the tuple (‘weeklyAverage’) it became too confusing for me to understand so I decided to introduce the ‘FeedBurnerStats’ type to simplify things for myself.

  • I ended up writing a function called ‘Join’ which is quite similar to ‘Seq.zip’ because I wanted to join two sequences together but only join items which had the same date (the ‘string’ value in the tuple).

    Therefore, if I had some data like this:

    ‘dailyStats’

    "2009-01,07", 200
    "2009-01,08", 222

    ‘weeklyAverages’

    "2009-01,07", 300
    "2009-01,08", 322

    I wanted the join of the two sequences to look like this:

    "2009-01,07", 200, 300
    "2009-01,08", 222, 322

    Which wasn’t working as expected when I used ‘Seq.zip’ – the items that were getting matched together seemed to be quite random to me.

    let Join (dailyStats:seq<decimal*string>) (weeklyAverages:seq<decimal*string>) =
        dailyStats |> Seq.map (fun d -> { Date = d |> snd; 
                                          Circulation = d |> fst;
                                          WeeklyAverage = weeklyAverages |> Seq.find (fun w -> snd d = snd w) |> fst})
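‘Seq.zip’ pairs items purely by position, so it only lines up when both sequences have the same length and order. A windowed weekly average has fewer entries than the daily data, which is probably why the matches looked random. The sketch below joins on the date instead. ‘joinByDate’ is a hypothetical name, and it swaps the repeated ‘Seq.find’ for a ‘Map’ lookup built once:

```fsharp
type FeedBurnerStats = { Date: string; Circulation: decimal; WeeklyAverage: decimal }

// Build the date -> average lookup once, then join on date rather than on position.
let joinByDate (dailyStats: seq<decimal * string>) (weeklyAverages: seq<decimal * string>) =
    let averageFor =
        weeklyAverages |> Seq.map (fun (average, date) -> date, average) |> Map.ofSeq
    dailyStats
    |> Seq.choose (fun (circulation, date) ->
        Map.tryFind date averageFor
        |> Option.map (fun average ->
            { Date = date; Circulation = circulation; WeeklyAverage = average }))
    |> Seq.toList

let daily = [ 200M, "2009-01-07"; 222M, "2009-01-08"; 250M, "2009-01-09" ]
let weekly = [ 322M, "2009-01-09"; 300M, "2009-01-08" ]
printfn "%A" (joinByDate daily weekly)
```

Days without a weekly average are dropped here, which is an assumption on my part about the desired behaviour; the ‘Seq.find’ version throws for them instead.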

I’ve included the code at the end of the post – there are some areas where I don’t really like the way I’ve solved a problem but I’m not sure of a better way at the moment.

In particular:

  • I wanted to make use of ‘Seq.windowed’ to find the rolling weekly average but I needed it to go back 7 days rather than forward 7 days which meant I needed to reverse the sequence. Right now I’ve done this by converting it to a list and using ‘List.rev’ to do so but this seems like a fairly inefficient way of doing this.

    The alternative seemed to be to write a function to change the order of the items in the sequence but again this doesn’t seem like a great approach.

  • What do you do with functions which are only used by one other area of the code? For example ‘ConvertToCommaSeparatedString’ is only used by ‘CreateGoogleGraphUri’ so I defined it inside that function – I could then pull it to a function in its own right if other areas of the code need it. I did this to reduce the clutter of functions hanging around but it then makes ‘CreateGoogleGraphUri’ more difficult to read.
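On the ‘Seq.windowed’ point, one way to avoid reversing the sequence is to leave the windows in their natural order and label each one with its last date, so every average looks back over the preceding days. This is a sketch under that assumption; ‘trailingAverage’ is a hypothetical name and it uses ‘Array.averageBy’ and ‘Array.last’ from current FSharp.Core:

```fsharp
// Each window holds `days` consecutive entries; the last entry supplies the date,
// so the average covers that date and the days before it.
let trailingAverage days (stats: seq<decimal * string>) =
    stats
    |> Seq.windowed days
    |> Seq.map (fun window ->
        let average = window |> Array.averageBy fst
        let lastDate = window |> Array.last |> snd
        average, lastDate)
    |> Seq.toList

let sampleStats = [ 1M, "day1"; 2M, "day2"; 3M, "day3"; 4M, "day4" ]
printfn "%A" (trailingAverage 2 sampleStats)
```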

I decided to run it against some blogs I follow to see what the graphs, created using Google’s Charts API, would look like:

ShowFeedBurnerStats "scotthanselman" "2009-03-01" "2009-07-11";;
ShowFeedBurnerStats "youdthinkwithallmy" "2009-03-01" "2009-07-11";;
ShowFeedBurnerStats "codinghorror" "2009-03-01" "2009-07-11";;

hanselman.png

jasonyip.png

codinghorror.png

Interestingly you can actually see the points where feedburner for some reason counted a particular day’s circulation as being 0.

And here’s the code:

open System.IO
open System.Net
open Microsoft.FSharp.Control
open System.Xml.Linq
open System
 
let downloadUrl (url:string) = async{
    let request =  HttpWebRequest.Create(url)
    let! response = request.AsyncGetResponse()
    let stream = response.GetResponseStream()
    use reader = new StreamReader(stream)
    return! reader.AsyncReadToEnd() }
 
let xName value = XName.Get value
let GetDescendants element (xDocument:XDocument)  = xDocument.Descendants(xName element)
let GetAttribute element (xElement:XElement) = xElement.Attribute(xName element)
 
let GetXml = downloadUrl >> Async.Run >> XDocument.Parse 
 
let GetDateAndCirculation (document:XDocument) = 
    document |> 
    GetDescendants "entry"  |> 
    Seq.map (fun element -> GetAttribute "circulation" element, GetAttribute "date" element)  |> 
    Seq.map (fun attribute -> Decimal.Parse((fst attribute).Value), (snd attribute).Value) 
 
let CalculateAverage days (feedStats:seq<decimal * string>) =
    let ReverseSequence (sequence:seq<_>) = sequence |> Seq.to_list |> List.rev |> List.to_seq
    feedStats |> 
    ReverseSequence |>
    Seq.windowed days |>
    Seq.map (fun x -> x |> Array.map (fun y -> y |> fst) |> Array.average, x.[0] |> snd) |>
    ReverseSequence    
 
let CalculateWeeklyAverage (feedStats:seq<decimal * string>) = CalculateAverage 7 feedStats
 
type FeedBurnerStats = { Date : string; Circulation: decimal; WeeklyAverage: decimal }
 
 
let Join (dailyStats:seq<decimal*string>) (weeklyAverages:seq<decimal*string>) =
    dailyStats |> Seq.map (fun d -> { Date = d |> snd; 
                                      Circulation = d |> fst;
                                      WeeklyAverage = weeklyAverages |> Seq.find (fun w -> snd d = snd w) |> fst})        
 
let GetFeedBurnerStats feed startDate endDate =
    let statsUrl = sprintf "https://feedburner.google.com/api/awareness/1.0/GetFeedData?uri=%s&dates=%s,%s"
    let allStats = GetDateAndCirculation (statsUrl feed startDate endDate |> GetXml)
    let weeklyAverages = allStats |> CalculateWeeklyAverage
    let dailyStats = allStats |> Seq.filter (fun x -> weeklyAverages |> Seq.exists (fun y -> snd y = snd x)) 
    Join dailyStats weeklyAverages   
 
let CreateGoogleGraphUri feed (stats:seq<FeedBurnerStats>) =
    let ConvertToCommaSeparatedString (value:seq<string>) =
        let rec convert (innerVal:List<string>) acc =
            match innerVal with
                | [] -> acc
                | hd::[] -> convert [] (acc + hd)
                | hd::tl -> convert tl (acc + hd + ",")          
        convert (Seq.to_list value) ""  
 
    let graphUrl = sprintf "http://chart.apis.google.com/chart?cht=lc&chtt=%s&&chco=000000,FF0000&chdl=WeeklyAverage|Daily&chs=600x240&chds=%s,%s&chd=t:%s|%s"
    let weeklyAverages = stats |> Seq.map (fun f -> f.WeeklyAverage.ToString("f0")) |> ConvertToCommaSeparatedString 
    let circulation = stats |> Seq.map (fun f -> f.Circulation.ToString("f0")) |> ConvertToCommaSeparatedString 
 
    let maximum = stats |> Seq.map (fun f -> f.Circulation) |> Seq.max
    let minimum = stats |> Seq.map (fun f -> f.Circulation) |> Seq.min
 
    new System.Uri(graphUrl feed (minimum.ToString("f0")) (maximum.ToString("f0")) weeklyAverages circulation)      
 
let ShowFeedBurnerStats feed startDate endDate = CreateGoogleGraphUri feed (GetFeedBurnerStats feed startDate endDate)
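The listing relies on the ‘sprintf’ behaviour described earlier: the literal has to be seen by the compiler as a format, not a plain string. The sketch below shows the partial application used above and, as a second option, a binding annotated with ‘Printf.StringFormat’. The names ‘statsFormat’, ‘viaPartial’ and ‘viaFormat’ are hypothetical:

```fsharp
// The literal sits next to sprintf, so the compiler can read the %s placeholders from it.
let statsUrl = sprintf "https://feedburner.google.com/api/awareness/1.0/GetFeedData?uri=%s&dates=%s,%s"

// To keep the format in its own binding, annotate it as a format rather than a string.
let statsFormat : Printf.StringFormat<string -> string -> string -> string> =
    "https://feedburner.google.com/api/awareness/1.0/GetFeedData?uri=%s&dates=%s,%s"

let viaPartial = statsUrl "MarkNeedham" "2009-01-01" "2009-07-11"
let viaFormat = sprintf statsFormat "MarkNeedham" "2009-01-01" "2009-07-11"
printfn "%s" viaPartial
```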

Written by Mark Needham

July 12th, 2009 at 5:14 pm

Posted in F#


F#: Wrapping .NET library calls


I’ve been spending a bit of time writing some code to parse the xml of my Feedburner RSS feed and create a graph to show both the daily and weekly average subscribers which you can’t currently get from the Feedburner dashboard.

One thing which I found while doing this is that calls to the .NET base class library don’t seem to fit in that well with the way that you would typically compose functions together in F#.

For example one of the first things I wanted to do was print the date and the circulation count to the console which I originally did like this:

open System.IO
open System.Net
open Microsoft.FSharp.Control
open System.Xml.Linq
open System
 
let xName value = XName.Get value
 
// GetXml is a function of type string -> string
 
let GetFeedBurnerStats url = 
    let feedBurnerXml = GetXml url |> XDocument.Parse
    feedBurnerXml.Descendants(xName "entry") |> 
    Seq.map (fun x -> x.Attribute(xName "circulation"), x.Attribute(xName "date")) |>
    Seq.iter (fun x -> printfn "%s %s" (fst x).Value (snd x).Value)

It’s quite annoying that we need to store the XDocument as a value before being able to call one of the methods on it to get the data that we want.

I realised that if I created a function which took in the element whose descendants I wanted to find and the XDocument I could then call the ‘XDocument.Descendants()’ method inside that function:

let xName value = XName.Get value
let GetDescendants element (xDocument:XDocument)  = xDocument.Descendants(xName element)
 
let GetFeedBurnerStats = 
    GetXml >> 
    XDocument.Parse >> 
    GetDescendants "entry" >>
    Seq.map (fun x -> x.Attribute(xName "circulation"), x.Attribute(xName "date")) >>
    Seq.iter (fun x -> printfn "%s %s" (fst x).Value (snd x).Value)

Since we no longer need to store the intermediate step of creating the XDocument we can now just chain together the functions using the functional composition operator instead of the forward operator.

We can also do this with the calls to ‘Attribute’ in the ‘Seq.map’ function, which helps simplify the code around there:

let xName value = XName.Get value
let GetDescendants element (xDocument:XDocument)  = xDocument.Descendants(xName element)
let GetAttribute element (xElement:XElement) = xElement.Attribute(xName element)
 
let GetFeedBurnerStats = 
    GetXml >> 
    XDocument.Parse >> 
    GetDescendants "entry" >>
    Seq.map (fun x -> GetAttribute "circulation" x, GetAttribute "date" x) >>
    Seq.iter (fun x -> printfn "%s %s" (fst x).Value (snd x).Value)
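To see the two operators side by side, here is a self-contained sketch that runs against an inline xml string instead of the FeedBurner download. The ‘parse’ wrapper and the two ‘circulations…’ functions are hypothetical names; ‘parse’ exists because ‘XDocument.Parse’ is overloaded and wrapping it pins down the string version:

```fsharp
open System.Xml.Linq

let xName value = XName.Get value
let parse (xml: string) = XDocument.Parse xml
let getDescendants element (document: XDocument) = document.Descendants(xName element)
let getAttribute name (element: XElement) = element.Attribute(xName name)

// Forward pipe: the input is named and pushed through each step.
let circulationsPiped (xml: string) =
    xml
    |> parse
    |> getDescendants "entry"
    |> Seq.map (fun e -> (getAttribute "circulation" e).Value)
    |> Seq.toList

// Composition: the same steps glued into one function with no named input.
let circulationsComposed =
    parse
    >> getDescendants "entry"
    >> Seq.map (fun e -> (getAttribute "circulation" e).Value)
    >> Seq.toList

let sampleXml =
    "<rsp><feed><entry date=\"2009-07-16\" circulation=\"630\" /><entry date=\"2009-07-17\" circulation=\"640\" /></feed></rsp>"
printfn "%A" (circulationsComposed sampleXml)
```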

Written by Mark Needham

July 12th, 2009 at 12:11 pm

Posted in F#


F#: Downloading a file from behind a proxy


I’ve been continuing working on a little script to parse Cruise build data and the latest task was to work out how to download my Google Graph API created image onto the local disk.

I’m using the WebClient class to do this and the code looks like this:

let DownloadGraph (fileLocation:string) (uri:System.Uri) = async {
    let webClient = new WebClient()
    webClient.DownloadFileAsync(uri, fileLocation)}

Sadly this doesn’t work when I run it from the client site where I have access to the build metrics as there is a corporate proxy sitting in the way.

I tried Googling how to do this but all the ways that I tried kept resulting in the following error:

407 proxy authentication required

Even though I was entering a user name and password!

I didn’t succeed until my colleague showed me a way of getting past the proxy in C# which I could quite easily use in my code:

let DownloadGraph (fileLocation:string) (uri:System.Uri) = async {
    let webClient = new WebClient()
    webClient.Proxy <- new WebProxy("proxyName:port", true, null, new NetworkCredential("userName", "password", "corporateDomain"))
    webClient.DownloadFileAsync(uri, fileLocation)}

One thing I was doing wrong was putting ‘http’ at the start of the proxyName, which in my case was wrong as I later learned that the proxy isn’t an HTTP one.

I’m also making use of asynchronous workflows in this example so that the actual downloading of the files will be done away from the main thread – this also gives me the option to download multiple files asynchronously if I want to.
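One caveat: ‘DownloadFileAsync’ is event based and returns as soon as the download has started, so the workflow above finishes before the file is necessarily on disk. The sketch below is one possible way to wait for completion, using the task-returning ‘DownloadFileTaskAsync’ with ‘Async.AwaitTask’. ‘downloadGraph’ and its optional ‘proxy’ parameter are my own hypothetical shape, not the original code, and in current .NET ‘WebClient’ is flagged as obsolete in favour of ‘HttpClient’:

```fsharp
open System
open System.Net

// Sketch only: pass Some proxy built as in the post (placeholder name, port and credentials)
// when behind a corporate proxy, or None otherwise.
let downloadGraph (proxy: WebProxy option) (fileLocation: string) (uri: Uri) = async {
    use webClient = new WebClient()
    proxy |> Option.iter (fun p -> webClient.Proxy <- p)
    // DownloadFileTaskAsync returns a Task, so the workflow can wait until the file is written.
    do! webClient.DownloadFileTaskAsync(uri, fileLocation) |> Async.AwaitTask }

// Nothing runs until the workflow is started, e.g. with Async.RunSynchronously or Async.Parallel.
let pendingDownload = downloadGraph None "graph.png" (Uri "http://example.invalid/chart.png")
```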

Written by Mark Needham

July 11th, 2009 at 3:20 am

Posted in F#


F#: Convert sequence to comma separated string


I’ve been continuing playing around with parsing Cruise data as I mentioned yesterday with the goal today being to create a graph from the build data.

After recommendations from Dean Cornish and Sam Newman on Twitter I decided to give the Google Graph API a try to do this and realised that I would need to create a comma separated string listing all the build times to pass to the Google API.

My initial thinking was that I could just pipe the sequence of values through ‘Seq.fold’ and add a comma after each value:

let ConvertToCommaSeparatedString (value:seq<string>) =
    let initialAttempt = value |> Seq.fold (fun acc x -> acc + x + ",") ""
    initialAttempt.Remove(initialAttempt.Length-1)

It works but you end up with a comma after the last value as well and then need to remove that on the next line which feels very imperative to me.

My next thought was that maybe I would be able to do this with a recursive function which matched the sequence on each iteration and didn’t add the comma when it reached the last value in the list.

I know how to do this for a list so I decided to go with that first:

let ConvertToCommaSeparatedString (value:seq<string>) =
    let rec convert (innerVal:List<string>) acc = 
        match innerVal with
            | [] -> acc
            | hd::[] -> convert [] (acc + hd)
            | hd::tl -> convert tl (acc + hd + ",")           
    convert (Seq.to_list value) ""

That works as well but it seems a bit weird that we need to convert everything into a list to do it.

A bit of googling revealed an interesting post by Brian McNamara where he suggests creating an active pattern which would cast the ‘seq’ to a ‘LazyList’ (which is deprecated but won’t be removed apparently) and then do some pattern matching against that instead.

The active pattern which Brian describes is like this:

let rec (|SeqCons|SeqNil|) (s:seq<'a>) =
    match s with
    | :? LazyList<'a> as l ->
        match l with
        | LazyList.Cons(a,b) -> SeqCons(a,(b :> seq<_>))
        | LazyList.Nil -> SeqNil
    | _ -> (|SeqCons|SeqNil|) (LazyList.of_seq s :> seq<_>)

This doesn’t cover the three states of the sequence which I want to match so I adjusted it slightly to do what I want:

let rec (|SeqCons|SeqNil|SeqConsLastElement|) (s:seq<'a>) =
    match s with
    | :? LazyList<'a> as l ->
        match l with
        | LazyList.Cons(a,b) -> 
            match b with
                | LazyList.Nil -> SeqConsLastElement(a)
                | LazyList.Cons(_,_) -> SeqCons(a,(b :> seq<_>))
        | LazyList.Nil -> SeqNil
    | _ -> (|SeqCons|SeqNil|SeqConsLastElement|) (LazyList.of_seq s :> seq<_>)

Our function to convert sequences to a comma separated string would now look like this:

let ConvertToCommaSeparatedString (value:seq<string>) =
    let rec convert (innerVal:seq<string>) acc = 
        match innerVal with
            | SeqNil -> acc
            | SeqConsLastElement(hd) -> convert [] (acc + hd)
            | SeqCons(hd,tl) -> convert tl (acc + hd + ",")           
    convert (value) ""

An example of this in action would be like this:

ConvertToCommaSeparatedString (seq { yield "mark"; yield "needham" });;
val it : string = "mark,needham"
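For comparison, the F# core library has ‘String.concat’, which takes a separator and a sequence of strings and only places the separator between items. The sketch below is a shortcut rather than the approach taken in the post, and the lower-case function name is hypothetical:

```fsharp
// String.concat puts the separator between items only, so there is no trailing comma to strip.
let convertToCommaSeparatedString (values: seq<string>) = String.concat "," values

printfn "%s" (convertToCommaSeparatedString (seq { yield "mark"; yield "needham" }))
```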

Written by Mark Needham

July 9th, 2009 at 10:32 pm

Posted in F#


F#: Parsing Cruise build data


I’ve been playing around a bit with the properties REST API that Cruise exposes to try and get together some build metrics and I decided it might be an interesting task to try and use F# for.

I’m making use of the ‘search’ part of the API to return the metrics of all the builds run on a certain part of the pipeline and I then want to parse those results so that I can extract just the name of the agent that ran that build and the duration of that build.

The first part of this task is to parse the data and extract just the information I’m interested in.

The data is like this:

cruise_agent,cruise_job_duration,cruise_job_id,cruise_job_result,cruise_pipeline_label,cruise_timestamp_01_scheduled,cruise_timestamp_02_assigned,cruise_timestamp_03_preparing,cruise_timestamp_04_building,cruise_timestamp_05_completing,cruise_timestamp_06_completed\n
BuildAgentOne (Sydney, PersonOne),319,14052,Passed,2223,2009-06-25 12:14:01 +1000,2009-06-25 12:14:02 +1000,2009-06-25 12:14:02 +1000,2009-06-25 12:14:35 +1000,2009-06-25 12:19:54 +1000,2009-06-25 12:19:55 +1000\n
BuildAgentTwo (Sydney, PersonTwo),422,14084,Passed,2224,2009-06-25 14:13:57 +1000,2009-06-25 14:13:58 +1000,2009-06-25 14:13:58 +1000,2009-06-25 14:14:48 +1000,2009-06-25 14:21:49 +1000,2009-06-25 14:21:50 +1000\n

I first started off trying to do this extraction all in one regular expression but after a while realised that I’d probably have more success if I ran a regular expression over each line individually.

open System.Text.RegularExpressions

type CruiseData = { Agent: string; Duration: string }

let ExtractValues (item:string) =  
   let matchBuildDuration item = Regex.Match(item, "(.*\)),([0-9]+),") 
   Regex.Split(item, "\n") |> 
   Array.map (fun item -> 
        let m = matchBuildDuration item
        if(m.Success) 
        then { Agent = m.Groups.[1].Value; Duration = m.Groups.[2].Value } 
        else { Agent = ""; Duration = ""}  ) |>
   Array.filter (fun item -> item.Agent <> "" && item.Duration <> "")

I realised when I started writing the let statement inside the Array.map function on line 4 that I was thinking about this problem way too imperatively. I actually backed out at that stage and had another go but I decided it would be interesting to see what each iteration of the solution would look like if I had actually completed it.

An improvement on that would be to not set up an empty ‘CruiseData’ like we are doing on line 8 but instead to make use of the Option type to define when we do and do not have a value:

let ExtractValues (item:string) =  
   let matchBuildDuration item = Regex.Match(item, "(.*\)),([0-9]+),") 
   Regex.Split(item, "\n") |> 
   Array.map (fun item -> 
    let m = matchBuildDuration item
    if(m.Success) 
    then Some({ Agent = m.Groups.[1].Value; Duration = m.Groups.[2].Value }) 
    else None ) |>
    Array.filter (fun item -> item.IsSome)

It’s still not great as we have imperative logic inside the Array.map function which looks pretty ugly.

At this stage I realised that I needed to exclude any lines which didn’t match the regular expression so that I wouldn’t have to care about them at all.

This was the next solution:

let ExtractValues (response:string) =  
    let matchBuildDuration item = Regex.Match(item, "(.*\)),([0-9]+),") 
    Regex.Split(response, "\n") |> 
    Array.filter (fun x -> (matchBuildDuration x).Success) |>
    Array.map (fun x -> (matchBuildDuration x).Groups) |>
    Array.map (fun group -> { Agent = (group.[1].Value); Duration = (group.[2].Value) } )

This is better although we are now calling the ‘matchBuildDuration’ function twice which is a bit wasteful.

Dave pointed out that if we run the data straight through the ‘matchBuildDuration’ function after splitting the new lines we can remove the need to call the function twice and then inline the function:

let ExtractValues (response:string) =  
    Regex.Split(response, "\n") |> 
    Array.map (fun x -> Regex.Match(x, "(.*\)),([0-9]+),"))  |>
    Array.filter (fun x -> x.Success) |>
    Array.map (fun x -> x.Groups) |>
    Array.map (fun group -> { Agent = (group.[1].Value); Duration = (group.[2].Value) } )

In all the functions we end up with the following data by executing this function:

[|{Agent = "BuildAgentOne"; Duration = "319"};
  {Agent = "BuildAgentTwo"; Duration = "422"}|]

My current thinking is that if I have more than one expression inside a function it’s very probable that there’s a better way of solving the problem and if I have conditional logic in there then I’ve gone very wrong.

I’d be interested to see if there’s an even simpler way to solve this problem.
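One possible candidate, sketched here rather than taken from the discussion above, is ‘Array.choose’, which maps and filters in a single pass by keeping only the ‘Some’ results. The record type name ‘BuildResult’ and the sample input are assumptions, since the original type definition and response aren’t shown here. It does bring back a conditional, so it trades one of the concerns above for fewer pipeline stages.

```fsharp
open System.Text.RegularExpressions

// Hypothetical record type standing in for the one used in the post.
type BuildResult = { Agent: string; Duration: string }

let extractValues (response: string) =
    Regex.Split(response, "\n")
    |> Array.choose (fun line ->
        // Same pattern as above, written as a verbatim string.
        let m = Regex.Match(line, @"(.*\)),([0-9]+),")
        if m.Success then
            Some { Agent = m.Groups.[1].Value; Duration = m.Groups.[2].Value }
        else
            None)

// Made-up sample input: two matching lines and one that should be dropped.
let sample = "AgentOne (1),319,x\nnoise\nAgentTwo (2),422,y"
let results = extractValues sample
```

Each line is matched exactly once and non-matching lines never reach the record construction.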

Written by Mark Needham

July 8th, 2009 at 10:46 pm

Posted in F#


F#: Pattern matching with the ‘:?’ operator

without comments

I’ve been doing a bit more reading of the Fake source code and one interesting thing I came across, which I hadn’t seen before, was an active pattern that makes use of the ‘:?’ operator to match the input against .NET types.

  let (|File|Directory|) (fileSysInfo : FileSystemInfo) =
    match fileSysInfo with
      | :? FileInfo as file -> File (file.Name)
      | :? DirectoryInfo as dir -> Directory (dir.Name, seq { for x in dir.GetFileSystemInfos() -> x })
      | _ -> failwith "No file or directory given."

I thought maybe this was just a wildcard operator to say that we don’t care what the value is as long as it matches ‘FileInfo’ or ‘DirectoryInfo’ respectively, but I couldn’t see it defined on the list of operators on the Microsoft Research website.

A bit of googling led me to Matthew Podwysocki’s post about pattern matching which explained the purpose of the operator (about 1/3 of the way down):

What the above example does is check for the corresponding .NET types by using the ‘:?’ operator especially reserved for this behavior.
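As a minimal sketch of that type test on its own, here is a small function that inspects a boxed value. The name ‘describe’ and its messages are made up for illustration. The value being matched needs to be typed as ‘obj’ or some other base type, because a type test against a sealed type such as ‘int’ is rejected by the compiler.

```fsharp
// Hypothetical helper: reports what kind of value was passed in.
let describe (value: obj) =
    match value with
    | :? string as s -> sprintf "a string of length %d" s.Length
    | :? int as i -> sprintf "the int %d" i
    | _ -> "something else"

let fromString = describe (box "abc")
let fromInt = describe (box 42)
let fromFloat = describe (box 1.5)
```

The ‘as’ part binds the value at its more specific type, so ‘s.Length’ is available without a separate cast.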

I’ve been playing around with a simple ‘add’ function to try and understand F#’s type inference and one thing I noticed is that if you just define it with minimal code you end up with a function which takes in 2 integers and returns an integer as the result:

let add a b = a + b
 
val add: int -> int -> int

I had thought that the signature and result of that function might remain generic due to the fact that there are more types than just ‘int’ with which you can make use of the addition operator.

For example, it is possible to add two strings together but in fact you need to be more explicit about that:

let add (a:string) (b:string) = a + b
 
val add: string -> string -> string

From what I can tell, if we wanted to write a generic add function we would need to do something like the following. I originally tried just returning ‘newA + newB’ from each of the pattern matches, but the return type of add3 then becomes ‘string’ since the first path in the pattern matching returns a ‘string’.

    let add3 a b =
        match (box a,box b) with
            | (:? string as newA),(:? string as newB) -> newA +  newB |> box
            | (:? int as newA),(:? int as newB) -> newA + newB |> box
            | (:? decimal as newA),(:? decimal as newB) -> newA + newB |> box
            | _ -> failwith "you can't add these together"

Which is slightly verbose and has a type of ‘'a -> 'b -> obj’ – I haven’t been able to work out whether it’s possible to create a generic function like this without needing to cast the result down to ‘obj’.
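To make the cost of that ‘obj’ return type concrete, here is a sketch of what calling the function looks like. The ‘add3’ definition is repeated so the snippet stands alone, and the names ‘total’ and ‘joined’ are made up. The caller has to state the type it expects and ‘unbox’ the result, which fails at runtime if the guess is wrong.

```fsharp
// add3 as defined above, repeated so this snippet is self-contained.
let add3 a b =
    match (box a, box b) with
    | (:? string as newA), (:? string as newB) -> newA + newB |> box
    | (:? int as newA), (:? int as newB) -> newA + newB |> box
    | (:? decimal as newA), (:? decimal as newB) -> newA + newB |> box
    | _ -> failwith "you can't add these together"

// The type annotation tells unbox which type to convert the obj back to.
let total : int = add3 1 2 |> unbox
let joined : string = add3 "foo" "bar" |> unbox
```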

I thought it might be possible to get rid of the boxing by making use of the downcast operator:

You can also use the downcast operator to perform a dynamic type conversion. The following expression specifies a conversion down the hierarchy to a type that is inferred from program context.

I tried surrounding the ‘newA + newB |> box’ code with a call to ‘downcast’ but that just resulted in the following error message when trying to make use of the function:

Value restriction. The value 'it' has been inferred to have generic type
	val it : '_a
Either define 'it' as a simple data term, make it a function with explicit arguments or, if you do not intend for it to be generic, add a type annotation.

I’d be intrigued to see if anyone has worked out how to do this as I’m out of ideas.
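One approach that may be worth trying, and which isn’t something I got to above, is marking the function ‘inline’. The compiler then resolves the ‘+’ operator separately at each call site instead of fixing it to ‘int’. The name ‘addGeneric’ is made up for this sketch.

```fsharp
// 'inline' lets the compiler pick the matching (+) at every call site.
let inline addGeneric a b = a + b

let ints = addGeneric 1 2
let strings = addGeneric "foo" "bar"
let decimals = addGeneric 1.5M 2.5M
```

This avoids both the boxing and the pattern match, although passing two types that can’t be added together becomes a compile-time error rather than the runtime failure in ‘add3’.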

Written by Mark Needham

July 2nd, 2009 at 11:10 pm

Posted in F#


F#: What I’ve learnt so far

with 2 comments

I did a presentation of some of the stuff that I’ve learnt from playing around with F# over the last six months or so at the most recent Alt.NET Sydney meeting.

I’ve included the slides below but there was some interesting discussion as well.

  • One of the questions asked was how you would structure code on a real project and ensure that it stays maintainable. I’m not actually sure what the answer is, as I haven’t written any F# code that’s in production, but there are certainly applications written in F# that are in production. The main one that I know a bit about is one which Amanda Laucher worked on and spoke about at the Alt.NET conference in Seattle.
  • There was some discussion about dynamic versus static languages. Phil spoke of not caring about what type something is but rather caring about what it does. I pretty much agree with this, and I think that when using languages with quite strong type inference, such as F# (and more so Haskell from what I hear), we do move more towards that situation.
  • Erik raised the point that functional languages aren’t the solution for everything, and I certainly feel that their niche is probably around operations with heavy data parsing/mining involved. I’m not sure I’d fancy doing an ASP.NET MVC application only in F#, although I’ve seen some WPF code written using F# (unfortunately I can’t remember where) which looked reasonable, so I’m not sure we should write it off just yet.

I’ve put the code that I walked through in the presentation on bitbucket.

Written by Mark Needham

June 30th, 2009 at 11:09 pm

Posted in F#
