Mark Needham

Thoughts on Software Development

F#: A day of writing a little twitter application


I spent most of the bank holiday Monday here in Sydney writing a little application to scan through my twitter feed and find just the tweets which have links in them, since for me that’s where a lot of the value of twitter lies.

I’m sure someone has done this already but it seemed like a good opportunity to try and put a little of the F# that I’ve learned from reading Real World Functional Programming to use. The code I’ve written so far is at the end of this post.

What did I learn?

  • I didn’t really want to write a wrapper on top of the twitter API so I put out a request for suggestions for a .NET twitter API. It pretty much seemed to be a choice of either Yedda or tweetsharp and since the latter seemed easier to use I went with that. In the code you see at the end I have added the ‘Before’ method to the API because I needed it for what I wanted to do.
  • I found it really difficult writing the ‘findLinks’ method – the way I’ve written it at the moment uses pattern matching and recursion which isn’t something I’ve spent a lot of time doing. Whenever I tried to think how to solve the problem my mind just wouldn’t move away from the procedural approach of going down the collection, setting a flag depending on whether we had a ‘lastId’ or not and so on.

    Eventually I explained the problem to Alex and working together through it we realised that there are three paths that the code can take:

    1. When we have processed all the tweets and want to exit
    2. The first call to get tweets when we don’t have a ‘lastId’ starting point – I was able to get 20 tweets at a time through the API
    3. Subsequent calls to get tweets when we have a ‘lastId’ from which we want to work backwards

    I think it is probably possible to reduce the code in this function to follow just one path by passing in the function to find the tweets but I haven’t been able to get this working yet.

  • I recently watched an F# video from Alt.NET Seattle featuring Amanda Laucher where she spoke of the need to explicitly state types that we import from C# into our F# code. You can see that I needed to do that in my code when referencing the TwitterStatus class – I guess it would be pretty difficult for the use of that class to be inferred but it still made the code a bit more clunky than any of the other simple problems I’ve played with before.
  • I’ve not used any of the functions on ‘Seq’ until today – from what I understand these are available for applying operations to any collections which implement IEnumerable – which is exactly what I had!
  • I had to use the following code to allow F# interactive to recognise the Dimebrain namespace:
    #r "\path\to\Dimebrain.Tweetsharp.dll"

    I thought it would be enough to reference it in my Visual Studio project and reference the namespace but apparently not.
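
A rough sketch of the single-path idea for ‘findLinks’ mentioned above, with the fetching passed in as a function so that the first call and the later calls share one branch. Everything here is an assumption made for illustration: ‘Status’, ‘makeStatus’, ‘fakeTimeline’ and ‘fetchPage’ are made-up names, the timeline is an in-memory list rather than anything from TweetSharp, and it uses the current F# function names rather than the ones available when this was written.

```fsharp
// Hypothetical stand-in for TwitterStatus; this is not the TweetSharp type.
type Status = { Id: int64; User: string; Text: string }

// Build a fake status; every tenth one contains a link.
let makeStatus (i: int64) =
    let text =
        if i % 10L = 0L then sprintf "see http://example.com/%d" i
        else sprintf "no link %d" i
    { Id = i; User = sprintf "user%d" i; Text = text }

// Fake timeline, newest first, so the sketch runs without network access.
let fakeTimeline = [100L .. -1L .. 1L] |> List.map makeStatus

let pageSize = 20

// Hypothetical fetcher: None means "latest page", Some id means "page before id".
let fetchPage (lastId: int64 option) : Status list =
    let candidates =
        match lastId with
        | None -> fakeTimeline
        | Some id -> fakeTimeline |> List.filter (fun s -> s.Id < id)
    candidates |> List.truncate pageSize

let withLinks (statuses: seq<Status>) =
    statuses |> Seq.filter (fun s -> s.Text.Contains("http"))

// One recursive path: the fetcher decides what a missing lastId means.
let rec findLinks fetch lastId processed toSearch acc =
    if processed >= toSearch then List.rev acc
    else
        match fetch lastId with
        | [] -> List.rev acc                       // nothing left to read
        | page ->
            let oldest = page |> List.minBy (fun s -> s.Id)
            let found = page |> withLinks |> List.ofSeq
            let acc' = List.rev found @ acc        // acc is kept in reverse order
            findLinks fetch (Some oldest.Id) (processed + List.length page) toSearch acc'

let linked = findLinks fetchPage None 0 100 []
linked |> List.iter (fun s -> printfn "[%s] %s" s.User s.Text)
```

Returning the matching statuses instead of printing inside the recursion is a design choice of the sketch, not something the original code does; it keeps the recursion free of side effects and leaves printing to the caller.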

The code

This is the code I have at the moment – there are certainly some areas where it can be improved but I’m not exactly sure how to go about it.

In particular:

  • What’s the best way to structure F# code? I haven’t seen any resources on how to do this so it’d be cool if someone could point me in the right direction. The code I’ve written is just a collection of functions which doesn’t really have any structure at all.
  • Reducing duplication – I hate the fact I’ve basically got the same code twice in the ‘getStatusesBefore’ and ‘getLatestStatuses’ functions – I wasn’t sure of the best way to refactor that. Maybe putting the common code up to the ‘OnFriendsTimeline’ call into a shared function and then calling that from the other two functions? I think a similar approach can be applied to ‘findLinks’ as well.
  • The code doesn’t feel that expressive to me – I was debating whether or not I should have passed a type into the ‘findLinks’ function – right now it’s only possible to tell what each part of the tuple means by reading the pattern matching code which feels wrong. I think there may also be some opportunities to use the function composition operator but I couldn’t quite see where.
  • How much context should we put in the names of functions? Most of my programming has been in OO languages where whenever we have a method its context is defined by the object on which it resides. When naming functions such as ‘findOldestStatus’ and ‘oldestStatusId’ I wasn’t sure whether or not I was putting too much context into the function name. I took the alternative approach with the ‘withLinks’ function since I think it reads more clearly like that when it’s actually used.

open Dimebrain.TweetSharp.Fluent
open Dimebrain.TweetSharp.Extensions
open Dimebrain.TweetSharp.Model
open Microsoft.FSharp.Core.Operators

// Fetch the statuses older than the given status id.
// (The rest of this fluent call chain is not shown in this listing.)
let getStatusesBefore (statusId:int64) = FluentTwitter
                                            .AuthenticateAs("userName", "password")

// Keep only the statuses whose text contains a link.
let withLinks (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) =
    statuses |> Seq.filter (fun eachStatus -> eachStatus.Text.Contains("http"))

// Print each status as "[screenName] text".
let print (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) =
    for status in statuses do
        printfn "[%s] %s" status.User.ScreenName status.Text

// Fetch the most recent statuses (this call chain is cut short in the same way).
let getLatestStatuses  = FluentTwitter
                            .AuthenticateAs("userName", "password")

// The oldest status is the one with the smallest id.
let findOldestStatus (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) =
    statuses |> Seq.sort_by (fun eachStatus -> eachStatus.Id) |> Seq.hd

let oldestStatusId = (getLatestStatuses |> findOldestStatus).Id

// args is (lastId, numberProcessed, recordsToSearch); a lastId of 0L means "no starting point yet".
let rec findLinks (args:int64 * int * int) =
    match args with
    // Return unit rather than the 'ignore' function once everything has been processed.
    | (_, numberProcessed, recordsToSearch) when numberProcessed >= recordsToSearch -> ()
    | (0L, numberProcessed, recordsToSearch) ->
        let latestStatuses = getLatestStatuses
        (latestStatuses |> withLinks) |> print
        findLinks(findOldestStatus(latestStatuses).Id, numberProcessed + 20, recordsToSearch)
    | (lastId, numberProcessed, recordsToSearch) ->
        let latestStatuses = getStatusesBefore lastId
        (latestStatuses |> withLinks) |> print
        findLinks(findOldestStatus(latestStatuses).Id, numberProcessed + 20, recordsToSearch)

let findStatusesWithLinks recordsToSearch =
    findLinks(0L, 0, recordsToSearch)

And to use it to find the links contained in the most recent 100 statuses of the people I follow:

findStatusesWithLinks 100;;
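
On the questions above about the tuple and the function composition operator, one possible shape is a record with named fields for the search state, plus ‘>>’ to join the filtering and formatting steps. This is only a sketch under assumptions: ‘Status’, ‘SearchState’, ‘isFinished’, ‘advance’, ‘describe’ and ‘linkSummary’ are hypothetical names, ‘Status’ is a stand-in for TwitterStatus, and nothing here talks to TweetSharp.

```fsharp
// Hypothetical stand-in for TwitterStatus, so the sketch runs on its own.
type Status = { Id: int64; ScreenName: string; Text: string }

// Named fields replace the (lastId, numberProcessed, recordsToSearch) tuple.
type SearchState =
    { LastId: int64 option      // None until the first page has been read
      NumberProcessed: int
      RecordsToSearch: int }

let isFinished state = state.NumberProcessed >= state.RecordsToSearch

// Advance the state after a non-empty page of statuses has been handled.
let advance (page: Status list) state =
    let oldest = page |> List.minBy (fun s -> s.Id)
    { state with
        LastId = Some (oldest.Id)
        NumberProcessed = state.NumberProcessed + List.length page }

let withLinks (statuses: Status list) =
    statuses |> List.filter (fun s -> s.Text.Contains("http"))

let describe (statuses: Status list) =
    statuses |> List.map (fun s -> sprintf "[%s] %s" s.ScreenName s.Text)

// Function composition: filter, then describe, as one reusable function.
let linkSummary = withLinks >> describe

let page =
    [ { Id = 3L; ScreenName = "a"; Text = "read http://example.com" }
      { Id = 2L; ScreenName = "b"; Text = "no link here" } ]

let start = { LastId = None; NumberProcessed = 0; RecordsToSearch = 100 }
let next = advance page start

linkSummary page |> List.iter (printfn "%s")
```

With the record, a call site reads as ‘state.NumberProcessed’ instead of relying on the position of a value inside the tuple, and the ‘0L’ sentinel for "no starting point" becomes an explicit ‘None’.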

Any advice on how to improve this will be gratefully received. I’m going to continue working this into a little DSL which can print me up a nice summary of the links that have been posted during the times that I’m not on twitter watching what’s going on.


Written by Mark Needham

April 13th, 2009 at 10:09 pm

Posted in .NET,F#
