Mark Needham

Thoughts on Software Development

F#: A day of writing a little twitter application

with 13 comments

I spent most of the bank holiday Monday here in Sydney writing a little application that scans through my twitter feed and picks out just the tweets which have links in them, since for me that’s where a lot of the value of twitter lies.

I’m sure someone has done this already but it seemed like a good opportunity to try and put a little of the F# that I’ve learned from reading Real World Functional Programming to use. The code I’ve written so far is at the end of this post.

What did I learn?

  • I didn’t really want to write a wrapper on top of the twitter API so I put out a request for suggestions for a .NET twitter API. It pretty much seemed to be a choice of either Yedda or tweetsharp and since the latter seemed easier to use I went with that. In the code you see at the end I have added the ‘Before’ method to the API because I needed it for what I wanted to do.
  • I found it really difficult to write the ‘findLinks’ function – the way I’ve written it at the moment uses pattern matching and recursion, which isn’t something I’ve spent a lot of time doing. Whenever I tried to think about how to solve the problem my mind just wouldn’t move away from the procedural approach of going down the collection, setting a flag depending on whether we had a ‘lastId’ or not, and so on.

    Eventually I explained the problem to Alex and working together through it we realised that there are three paths that the code can take:

    1. When we have processed all the tweets and want to exit
    2. The first call to get tweets when we don’t have a ‘lastId’ starting point – I was able to get 20 tweets at a time through the API
    3. Subsequent calls to get tweets when we have a ‘lastId’ from which we want to work backwards

    I think it is probably possible to reduce the code in this function to follow just one path by passing in the function to find the tweets but I haven’t been able to get this working yet.

  • I recently watched an F# video from Alt.NET Seattle featuring Amanda Laucher where she spoke of the need to explicitly state types that we import from C# into our F# code. You can see that I needed to do that in my code when referencing the TwitterStatus class – I guess it would be pretty difficult for the use of that class to be inferred but it still made the code a bit more clunky than any of the other simple problems I’ve played with before.
  • I’ve not used any of the functions on ‘Seq’ until today – from what I understand these are available for applying operations to any collections which implement IEnumerable – which is exactly what I had!
  • I had to use the following code to allow F# interactive to recognise the Dimebrain namespace:
    #r "\path\to\Dimebrain.Tweetsharp.dll"

    I thought it would be enough to reference it in my Visual Studio project and reference the namespace but apparently not.
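
The idea above of collapsing ‘findLinks’ into a single path by passing in the function that finds the tweets could look roughly like the sketch below. It is only an illustration and does not use tweetsharp: the ‘Status’ record, ‘collectLinks’, ‘fakeTimeline’ and ‘fakeFetch’ are all made-up names, and the in-memory timeline stands in for the real API so that the example runs on its own.

```fsharp
// Hypothetical stand-in for TwitterStatus; only the fields this sketch needs.
type Status = { Id: int64; ScreenName: string; Text: string }

// A single recursive path: the caller passes in how a page is fetched.
// fetchPage receives None on the first call and Some lastId afterwards.
let rec collectLinks (fetchPage: int64 option -> Status list) lastId processed limit found =
    if processed >= limit then found
    else
        match fetchPage lastId with
        | [] -> found // no older statuses left
        | page ->
            let links = page |> List.filter (fun s -> s.Text.Contains("http"))
            let oldestId = page |> List.map (fun s -> s.Id) |> List.min
            collectLinks fetchPage (Some oldestId) (processed + List.length page) limit (found @ links)

// Assumption: a fake timeline of 50 statuses, newest first, where every
// tenth status contains a link.
let fakeTimeline =
    [ for i in 50L .. -1L .. 1L do
        let text =
            if i % 10L = 0L then sprintf "link http://example.com/%d" i
            else "no link here"
        yield { Id = i; ScreenName = "someone"; Text = text } ]

// Assumption: a fake fetch that returns up to pageSize statuses older than lastId.
let fakeFetch pageSize (lastId: int64 option) =
    let older =
        match lastId with
        | None -> fakeTimeline
        | Some last -> fakeTimeline |> List.filter (fun s -> s.Id < last)
    older |> List.truncate pageSize

// Search the 40 most recent statuses, 20 at a time.
let links = collectLinks (fakeFetch 20) None 0 40 []

for s in links do
    printfn "[%s] %s" s.ScreenName s.Text
```

With the fake timeline above this collects the statuses with ids 50, 40, 30 and 20. Using an option for the last id also removes the need for the 0L marker value, and the real fetch function would be the only place that has to know about the two kinds of request.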

The code

This is the code I have at the moment – there are certainly some areas where it can be improved but I’m not exactly sure how to do it.

In particular:

  • What’s the best way to structure F# code? I haven’t seen any resources on how to do this so it’d be cool if someone could point me in the right direction. The code I’ve written is just a collection of functions which doesn’t really have any structure at all.
  • Reducing duplication – I hate the fact I’ve basically got the same code twice in the ‘getStatusesBefore’ and ‘getLatestStatuses’ functions – I wasn’t sure of the best way to refactor that. Maybe putting the common code up to the ‘OnFriendsTimeline’ call into a common function and then calling that from the other two functions? I think a similar approach can be applied to findLinks as well.
  • The code doesn’t feel that expressive to me – I was debating whether or not I should have passed a type into the ‘findLinks’ function – right now it’s only possible to tell what each part of the tuple means by reading the pattern matching code which feels wrong. I think there may also be some opportunities to use the function composition operator but I couldn’t quite see where.
  • How much context should we put in the names of functions? Most of my programming has been in OO languages where whenever we have a method its context is defined by the object on which it resides. When naming functions such as ‘findOldestStatus’ and ‘oldestStatusId’ I wasn’t sure whether or not I was putting too much context into the function name. I took the alternative approach with the ‘withLinks’ function since I think it reads more clearly like that when it’s actually used.

#light
 
open Dimebrain.TweetSharp.Fluent
open Dimebrain.TweetSharp.Extensions
open Dimebrain.TweetSharp.Model
open Microsoft.FSharp.Core.Operators 
 
let getStatusesBefore (statusId:int64) = FluentTwitter
                                            .CreateRequest()
                                            .AuthenticateAs("userName", "password")
                                            .Statuses()
                                            .OnFriendsTimeline()
                                            .Before(statusId)
                                            .AsJson()
                                            .Request()
                                            .AsStatuses()
 
let withLinks (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) = 
    statuses |> Seq.filter (fun eachStatus -> eachStatus.Text.Contains("http"))
 
let print (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) =
    for status in statuses do
        printfn "[%s] %s" status.User.ScreenName status.Text    
 
let getLatestStatuses  = FluentTwitter
                            .CreateRequest()
                            .AuthenticateAs("userName", "password")
                            .Statuses()
                            .OnFriendsTimeline()
                            .AsJson()
                            .Request()
                            .AsStatuses()                                    
 
let findOldestStatus (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) = 
    statuses |> Seq.sort_by (fun eachStatus -> eachStatus.Id) |> Seq.hd
 
let oldestStatusId = (getLatestStatuses |> findOldestStatus).Id  
 
let rec findLinks (args:int64 * int * int) =
    match args with
    // Path 1: enough statuses have been processed, so stop
    | (_, numberProcessed, recordsToSearch) when numberProcessed >= recordsToSearch -> ()
    // Path 2: first call, no 'lastId' yet, so start from the latest statuses
    | (0L, numberProcessed, recordsToSearch) ->
        let latestStatuses = getLatestStatuses
        (latestStatuses |> withLinks) |> print
        findLinks(findOldestStatus(latestStatuses).Id, numberProcessed + 20, recordsToSearch)
    // Path 3: subsequent calls work backwards from 'lastId'
    | (lastId, numberProcessed, recordsToSearch) ->
        let latestStatuses = getStatusesBefore lastId
        (latestStatuses |> withLinks) |> print
        findLinks(findOldestStatus(latestStatuses).Id, numberProcessed + 20, recordsToSearch)


let findStatusesWithLinks recordsToSearch =
    findLinks(0L, 0, recordsToSearch)

And to use it to find the links contained in the most recent 100 statuses of the people I follow:

findStatusesWithLinks 100;;
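
On the questions above about expressiveness and naming, one possible direction is sketched below. It replaces the tuple with a record so that each part has a name, groups functions into modules so that the module supplies the context (the way the core library does with ‘Seq’ and ‘List’), and uses the function composition operator for the filter-then-format step. Everything in it is hypothetical and independent of tweetsharp: ‘Tweet’, ‘SearchProgress’ and the ‘Tweets’ and ‘Progress’ modules are names invented for the example.

```fsharp
// Hypothetical stand-in for TwitterStatus with only the fields used here.
type Tweet = { Id: int64; ScreenName: string; Text: string }

// Named fields replace the (lastId, numberProcessed, recordsToSearch) tuple.
type SearchProgress =
    { LastId: int64 option // None until the first page has been fetched
      NumberProcessed: int
      RecordsToSearch: int }

// The module name carries the context, so the function names can stay short.
module Tweets =
    let withLinks (tweets: seq<Tweet>) =
        tweets |> Seq.filter (fun t -> t.Text.Contains("http"))

    let oldest (tweets: seq<Tweet>) =
        tweets |> Seq.minBy (fun t -> t.Id)

    let format (tweets: seq<Tweet>) =
        tweets |> Seq.map (fun t -> sprintf "[%s] %s" t.ScreenName t.Text)

    // Function composition: filter, then format, then force the lazy sequence.
    let linkLines = withLinks >> format >> Seq.toList

module Progress =
    let start recordsToSearch =
        { LastId = None; NumberProcessed = 0; RecordsToSearch = recordsToSearch }

    let isDone progress =
        progress.NumberProcessed >= progress.RecordsToSearch

    // Record that a page has been handled and remember its oldest id.
    let afterPage (page: seq<Tweet>) progress =
        let oldestId = (Tweets.oldest page).Id
        { progress with
            LastId = Some oldestId
            NumberProcessed = progress.NumberProcessed + Seq.length page }

// Made-up sample data for trying the functions out.
let page : seq<Tweet> =
    Seq.ofList
        [ { Id = 2L; ScreenName = "a"; Text = "read http://example.com" }
          { Id = 1L; ScreenName = "b"; Text = "no link" } ]

let lines = Tweets.linkLines page
let progress = Progress.start 100 |> Progress.afterPage page

lines |> List.iter (printfn "%s")
```

Whether this reads better than the tuple version is a matter of taste; the record costs a few extra lines but the pattern match no longer has to be read to find out what each position means.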

Any advice on how to improve this will be gratefully received. I’m going to continue working this into a little DSL which can print me up a nice summary of the links that have been posted during the times that I’m not on twitter watching what’s going on.

Written by Mark Needham

April 13th, 2009 at 10:09 pm

Posted in .NET,F#

  • http://dimebrain.com Daniel

    Thanks for the addition of Before; with the recent Twitter API change to deprecate Since, we do need to add support for MaxId, and Before is a good keyword choice for that. We’ll also be adding some filtering to “give back” some facility for the Since keyword. If you want to send your code for Before, I’d be happy to check it out, and then in.

    Your code looks great, nice and compact, it makes me want to get into F# sooner than I had planned. Also, calling AsStatuses() might end up with a null result, say if the request prior to it ended in an error and you received the error details from Twitter. You can either check for null there and then use AsError to cast the error result for reporting, or you could also check the HasError property of the FluentTwitter instance to verify if the last request was dubious, then use the Response property to get at the lower level details if needed.

    Daniel

  • http://alexy.khrabrov.net/ Alexy Khrabrov

    Greetings Mark — awesome job, I want to get started digging into tweeter with F# too! Now I’m fresh across a boot camp partition from Mac to Vista, and wonder how exactly do you set up the F# project to do that? Care to post it?

  • http://alexy.khrabrov.net/ Alexy Khrabrov

    Or, specifically — how do you structure the F# vis-a-vis Tweet#, i.e. how do you refer to the Tweet# DLLs?

  • http://www.markhneedham.com Mark Needham

    @Alexy – To set up the F# project I just installed the F# for Visual Studio pack and then I created an F# project.

    I have the tweetsharp dlls as references in my F# project and then I can reference them by using ‘open Dimebrain.TweetSharp.Fluent’, for example, where I would write ‘using Dimebrain.TweetSharp.Fluent’ if it was a C# project.

    Hope that explains what I’ve done. Let me know if not.
    Mark
