I spent most of the bank holiday Monday here in Sydney writing a little application to scan through my twitter feed and find me just the tweets which have links in them since for me that’s where a lot of the value of twitter lies.
I’m sure someone has done this already but it seemed like a good opportunity to try and put a little of the F# that I’ve learned from reading Real World Functional Programming to use. The code I’ve written so far is at the end of this post.
What did I learn?
- I didn’t really want to write a wrapper on top of the twitter API so I put out a request for suggestions for a .NET twitter API. It pretty much seemed to be a choice of either Yedda or tweetsharp and since the latter seemed easier to use I went with that. In the code you see at the end I have added the ‘Before’ method to the API because I needed it for what I wanted to do.
- I found it really difficult writing the ‘findLinks’ method – the way I’ve written it at the moment uses pattern matching and recursion which isn’t something I’ve spent a lot of time doing. Whenever I tried to think how to solve the problem my mind just wouldn’t move away from the procedural approach of going down the collection, setting a flag depending on whether we had a ‘lastId’ or not and so on.
Eventually I explained the problem to Alex and working together through it we realised that there are three paths that the code can take:
- When we have processed all the tweets and want to exit
- The first call to get tweets when we don’t have a ‘lastId’ starting point – I was able to get 20 tweets at a time through the API
- Subsequent calls to get tweets when we have a ‘lastId’ from which we want to work backwards from
I think it is probably possible to reduce the code in this function to follow just one path by passing in the function to find the tweets but I haven’t been able to get this working yet.
- I recently watched a F# video from Alt.NET Seattle featuring Amanda Laucher where she spoke of the need to explicitly state types that we import from C# into our F# code. You can see that I needed to do that in my code when referencing the TwitterStatus class – I guess it would be pretty difficult for the use of that class to be inferred but it still made the code a bit more clunky than any of the other simple problems I’ve played with before.
- I’ve not used any of the functions on ‘Seq’ until today – from what I understand these are available for applying operations to any collections which implement IEnumerable – which is exactly what I had!
- I had to use the following code to allow F# interactive to recognise the Dimebrain namespace:
I thought it would be enough to reference it in my Visual Studio project and reference the namespace but apparently not.
This is the code I have at the moment – there are certainly some areas that it can be improved but I’m not exactly sure how to do it.
- What’s the best way to structure F# code? I haven’t seen any resources on how to do this so it’d be cool if someone could point me in the right direction. The code I’ve written is just a collection of functions which doesn’t really have any structure at all.
- Reducing duplication – I hate the fact I’ve basically got the same code twice in the ‘getStatusesBefore’ and ‘getLatestStatuses’ functions – I wasn’t sure of the best way to refactor that. Maybe putting the common code up to the ‘OnFriendsTimeline’ call into a common function and then call that from the other two functions? I think a similar approach can be applied to findLinks as well.
- The code doesn’t feel that expressive to me – I was debating whether or not I should have passed a type into the ‘findLinks’ function – right now it’s only possible to tell what each part of the tuple means by reading the pattern matching code which feels wrong. I think there may also be some opportunities to use the function composition operator but I couldn’t quite see where.
- How much context should we put in the names of functions? Most of my programming has been in OO languages where whenever we have a method its context is defined by the object on which it resides. When naming functions such as ‘findOldestStatus’ and ‘oldestStatusId’ I wasn’t sure whether or not I was putting too much context into the function name. I took the alternative approach with the ‘withLinks’ function since I think it reads more clearly like that when it’s actually used.
#light open Dimebrain.TweetSharp.Fluent open Dimebrain.TweetSharp.Extensions open Dimebrain.TweetSharp.Model open Microsoft.FSharp.Core.Operators let getStatusesBefore (statusId:int64) = FluentTwitter .CreateRequest() .AuthenticateAs("userName", "password") .Statuses() .OnFriendsTimeline() .Before(statusId) .AsJson() .Request() .AsStatuses() let withLinks (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) = statuses |> Seq.filter (fun eachStatus -> eachStatus.Text.Contains("http")) let print (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) = for status in statuses do printfn "[%s] %s" status.User.ScreenName status.Text let getLatestStatuses = FluentTwitter .CreateRequest() .AuthenticateAs("userName", "password") .Statuses() .OnFriendsTimeline() .AsJson() .Request() .AsStatuses() let findOldestStatus (statuses:seq<Dimebrain.TweetSharp.Model.TwitterStatus>) = statuses |> Seq.sort_by (fun eachStatus -> eachStatus.Id) |> Seq.hd let oldestStatusId = (getLatestStatuses |> findOldestStatus).Id let rec findLinks (args:int64 * int * int) = match args with | (_, numberProcessed, recordsToSearch) when numberProcessed >= recordsToSearch -> ignore | (0L, numberProcessed, recordsToSearch) -> let latestStatuses = getLatestStatuses (latestStatuses |> withLinks) |> print findLinks(findOldestStatus(latestStatuses).Id, numberProcessed + 20, recordsToSearch) | (lastId, numberProcessed, recordsToSearch) -> let latestStatuses = getStatusesBefore lastId (latestStatuses |> withLinks) |> print findLinks(findOldestStatus(latestStatuses).Id, numberProcessed + 20, recordsToSearch) let findStatusesWithLinks recordsToSearch = findLinks(0L, 0, recordsToSearch) |> ignore
And to use it to find the links contained in the most recent 100 statuses of the people I follow:
Any advice on how to improve this will be gratefully received. I’m going to continue working this into a little DSL which can print me up a nice summary of the links that have been posted during the times that I’m not on twitter watching what’s going on.