Archive for the ‘functional-programming’ tag
Functional C#: Writing a 'partition' function
One of the more interesting higher order functions that I've come across while playing with F# is the partition function which is similar to the filter function except it returns the values which meet the predicate passed in as well as the ones which don't.
I came across an interesting problem recently where we needed to do exactly this and had ended up taking a more imperative for each style approach to solve the problem because this function doesn't exist in C# as far as I know.
In F# the function makes use of a tuple to do this so if we want to create the function in C# then we need to define a tuple object first.
public class Tuple<TFirst, TSecond> { private readonly TFirst first; private readonly TSecond second; public Tuple(TFirst first, TSecond second) { this.first = first; this.second = second; } public TFirst First { get { return first; } } public TSecond Second { get { return second; } } }
public static class IEnumerableExtensions { public static Tuple<IEnumerable<T>, IEnumerable<T>> Partition<T>(this IEnumerable<T> enumerableOf, Func<T, bool> predicate) { var positives = enumerableOf.Where(predicate); var negatives = enumerableOf.Where(e => !predicate(e)); return new Tuple<IEnumerable<T>, IEnumerable<T>>(positives, negatives); } }
I'm not sure of the best way to write this function – at the moment we end up creating two iterators to cover the two different filters that we're running over the collection which seems a bit strange.
In F# 'partition' is on List so the whole collection would be evaluated whereas in this case we're still only evaluating each item as it's needed so maybe there isn't a way to do it without using two iterators.
If we wanted to use this function to get the evens and odds from a collection we could write the following code:
var evensAndOdds = Enumerable.Range(1, 10).Partition(x => x % 2 == 0); var evens = evensAndOdds.First; var odds = evensAndOdds.Second;
The other thing that's nice about F# is that we can assign the result of the expression to two separate values in one go and I don't know of a way to do that in C#.
let evens, odds = [1..10] |> List.partition (fun x -> x % 2 = 0)
We don't need to have the intermediate variable 'evensAndOdds' which doesn't really add much to the code.
I'd be interested in knowing if there's a better way to do this than what I'm trying out.
Book Club: Why noone uses functional languages (Philip Wadler)
Our latest technical book club discussion was based around Philip Wadler's paper 'Why noone uses functional langauges' which he wrote in 1998. I came across this paper when reading some of the F# goals in the FAQs on the Microsoft website.
These are some of my thoughts and our discussion of the paper:
- One of the points suggested in the paper is that functional languages aren't used because of their lack of availability on machines but as Dave pointed out this doesn't really seem to be such a big problem these days – certainly for F# I've found it relatively painless to get it setup and running and even for a language like Ruby people are happy to download and install it on their machines and it is also pretty much painless to do so.
- Erik pointed us to an interesting article which suggests that functional programming can be very awkward for solving certain problems – I think this is definitely true to an extent although perhaps not as much as we might think. I am certainly seeing some benefit in an overall OO approach with some functional concepts mixed in which seems to strike a nice balance between code which is descriptive yet concise in places. I'm finding the problems that F# is useful for tend to be very data intensive in nature.
- Matt Dunn pointed out that an e-commerce store written by Paul Graham, which he later sold to Yahoo, was actually written in Lisp – to me this would seem like the type of problem that wouldn't be that well suited for a functional language but interestingly only part of the system was written in Lisp and the other part in C.
Viaweb at first had two parts: the editor, written in Lisp, which people used to build their sites, and the ordering system, written in C, which handled orders. The first version was mostly Lisp, because the ordering system was small. Later we added two more modules, an image generator written in C, and a back-office manager written mostly in Perl.
- The article also suggests that it takes a while for Java programmers to come to grips with functional programs – I would agree with this statement to an extent although one of the things I found really hard when first reading functional programs is the non descriptiveness of the variable names. It seems to be more idiomatic to make use of single letter variable names instead of something more descriptive which I would use in an imperative language.
I'm intrigued as to whether this will change as more people use functional languages or whether this is just something we will need to get used to.
- The author makes a very valid point with regards to the risk that a project manager would be taking if they decided to use a functional language for a project:
If a manager chooses to use a functional language for a project and the project fails, then he or she will certainly be fired. If a manager chooses C++ and the project fails, then he or she has the defense that the same thing has happened to everyone else.
I'm sure I remember a similar thing being said about the reluctance to make use of Ruby a couple of years ago – it's something of a risk and human nature is often geared towards avoiding those!
- I think the availability of libraries is probably very relevant even today – it helps F# a lot that we have access to all the .NET libraries and I imagine it's also the same for Scala with the Java libraries. I don't know a lot about the Lisp world but I'm told that people often end up rolling their own libraries for some quite basic things since there aren't standard libraries available as there are in some other languages.
- Another paper pointed out as being a good one to read was 'Functional Programming For The Rest Of Us' – I haven't read it yet but it does look quite lengthy! Wes Dyer also has a couple of articles which I found interesting – one around thinking functionally and the other around how functional programming can fit in a mixed programming environment
I think in general a lot of the points this paper raises have been addressed by some of the functional languages which are gaining prominence more recently – Erlang, F# and Scala to name a few.
It will definitely be interesting to see what role functional languages have to play in the polyglot programming era that my colleague Neal Ford foresees.
Functional Collection Parameters: A different way of thinking about collections
One of the changes that I've noticed in my coding now compared to around 7 or 8 months ago is that whenever there's some operations to be performed on a collection I am far more inclined to think of how to do those operations using a functional approach.
I've written previously about the ways I've been making use of functional collection parameters in my code but what I hadn't really considered was that the way of thinking about the problem we want to solve is slightly different.
While Dean and I were talking through the refactoring that I mentioned in my post about handling null collections, we realised that the way it was originally written followed a very sequential mindset.
public IEnumerable<Foo> MapFooMessages(IEnumerable<FooMessage> fooMessages) { var result = new List<Foo>(); if(fooMessages != null) { foreach(var fooMessage in fooMessages) { result.Add(new Foo(fooMessage)); } } return result; }
- Create a new collection
- Take an existing collection
- If it's not null then iterate over the existing collection
- Add each item to the new collection
- Return the new collection
I find when I'm thinking about doing this type of code now my thought process is more focused towards the collection as a whole rather than about the individual items in the collection.
If I do want to only apply an operation on a subset of the collection then I need to first apply another function to the whole collection that filters the collection down. I find myself thinking about what I want to happen rather than how I want to do it – a declarative approach over an imperative one in summary.
One of the LINQ C# extension methods which I sometimes find myself abusing is the 'ForEach' one which I feel is used a lot more times than is necessary and often ends up with complicated lambda blocks inside it which could be avoided by using some of the other functions.
To give a very simple example of some code I came across recently:
public IEnumerable<string> GetFooKeys(IEnumerable<Foo> foos) { var list = new List<string>(); foos.Where(foo => foo.Opted).ToList().ForEach(foo => list.Add(foo.Key)); return list; }
We are making use of functional collection parameters but we can easily do this without using the 'ForEach' method:
public IEnumerable<string> GetFooKeys(IEnumerable<Foo> foos) { return foos.Where(foo => foo.Opted).Select(foo => foo.Key); }
I think the danger with 'ForEach' is that we are creating side effects which may be unexpected. In this case there's not really that much problem as we're just adding a value to a list but it is possible to do anything else in the ForEach block as well.
I also came across a good post written by one of the 8th light guys talking about the use of ForEach in Scala and how we can use it in a way that minimises side effects.
Functional Collection Parameters: Handling the null collection
One of the interesting cases where I've noticed we tend to avoid functional collection parameters in our code base is when there's the possibility of the collection being null.
The code is on the boundary of our application's interaction with another service so it is actually a valid scenario that we could receive a null collection.
When using extension methods, although we wouldn't get a null pointer exception by calling one on a null collection, we would get a 'source is null' exception when the expression is evaluated so we need to protect ourself against this.
As a result of defending against this scenario we have quite a lot of code that looks like this:
public IEnumerable<Foo> MapFooMessages(IEnumerable<FooMessage> fooMessages) { var result = new List<Foo>(); if(fooMessagaes != null) { foreach(var fooMessage in fooMessages) { result.Add(new Foo(fooMessage)); } } return result; }
The method that we want to apply here is 'Select' and even though we can't just apply that directly to the collection we can still make use of it.
private IEnumerable<Foo> MapFooMessages(IEnumerable<FooMessage> fooMessages) { if(fooMessages == null) return new List<Foo>(); return fooMessages.Select(eachFooMessage => new Foo(eachFooMessage)); }
There's still duplication doing it this way though so I pulled it up into a 'SafeSelect' extension method:
public static class ICollectionExtensions { public static IEnumerable<TResult> SafeSelect<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, TResult> selector) { return source == null ? new List<TResult>() : source.Select(selector) ; } }
We can then make use of this extension method like so:
private IEnumerable<Foo> MapFooMessages(IEnumerable<FooMessage> fooMessages) { return fooMessages.SafeSelect(eachFooMessage => new Foo(fooMessage)); }
The extension method is a bit different to the original way that we did this as I'm not explicitly converting the result into a list at the end which means that it will only be evaluated when the data is actually needed.
In this particular case I don't think that decision will have a big impact but it's something interesting to keep in mind.
OO with a bit of functional mixed in
From my experiences playing around with F# and doing a bit of functional C# I'm beginning to think that the combination of functional and object oriented programming actually results in code which I think is more expressive and easy to work with than code written only with an object oriented approach in mind.
I'm also finding it much more fun to write code this way!
In a recent post Dean Wampler questions whether the supremacy of object oriented programming is over before going on to suggest that the future is probably going to be a mix of functional programming and object oriented programming.
I agree with his conclusion but there are some things Dean talks about which I don't really understand:
The problem is that there is never a stable, clear object model in applications any more. What constitutes a BankAccount or Customer or whatever is fluid. It changes with each iteration. It’s different from one subsystem to another even within the same iteration! I see a lot of misfit object models that try to be all things to all people, so they are bloated and the teams that own them can’t be agile. The other extreme is “balkanization”, where each subsystem has its own model. We tend to think the latter case is bad. However, is lean and mean, but non-standard, worse than bloated, yet standardized?
I don't think an object model needs to be stable – for me the whole point is to iterate it until we get something that fits the domain that we're working in.
I'm not sure who thinks it's bad for each subsystem to have its own model – this is certainly an approach that I think is quite useful. Having the same model across different subsystems makes our life significantly more difficult. There are several solutions for this outlined in Domain Driven Design.
Dean goes on to suggest that in a lot of applications data is just data and that having that data wrapped in objects doesn't add much value.
I've worked on some projects which took that approach and I found the opposite to be true – if we have some data in an application it is very likely that there is going to be some sort of behaviour associated to it, meaning that it more than likely represents some concept in the business domain. I find it much easier to communicate with team mates about domain concepts if they're represented explicitly as an object instead of just as a hash map of data for example.
Creating objects also helps manage the complexity by hiding information and from my experience it's much easier to make changes in our code base when we've managed data & behaviour in this manner.
I think there is still a place for the functional programming approach though. Functional collection parameters for example are an excellent way to reduce accidental complexity in our code and removing useless state from our applications when performing operations on collections.
I don't think using this type of approach to coding necessarily means that we need to expose the state of our objects though – we can still make use of these language features within our objects.
The most interesting thing for me about using this approach to some areas areas of coding when using C# is that you do need to change your mindset about how to solve a problem.
I typically solve problems with a procedural mindset where you just consider the next step you need to take sequentially to solve the problem. This can end up leading to quite verbose solutions.
The functional mindset seems to be more about considering the problem as a whole and then working out how we can simplify that problem from the outside in which is a bit of a paradigm shift. I don't think I've completely made but it can certainly lead to solutions which are much easier to understand.
The other idea of functional programming that I've been experimenting with is that of trying to keep objects as immutable as possible. This pretty much means that every operation that I perform on an object which would previously mutate the object now returns a new instance of it.
This is much easier in F# than in C# where you end up writing quite a lot of extra code to make that possible and can be a bit confusing if you're not used to that approach.
Sadek Drobi did a presentation at QCon London where he spoke more about taking a functional programming approach on a C# project and while he's gone further than I have with the functional approach my current thinking is that we should model our domain and manage complexity with objects but when it comes to solving problems within those objects which are more algorithmic in nature the functional approach works better.
C#: Refactoring to functional collection parameters
I wrote about a month or so ago about the functional collection parameters now available in C# and certainly one of the most fun refactorings for me is trying to get code written using a for loop into a state where it is using one of these.
With a bit of help from my colleague James Crisp, these are some of the most common refactorings that I have come across so far.
There's a little bit of repetition with regards to my previous post but I think it's interesting to see the procedural approach to solving this problems versus the functional approach.
For the sake of the examples, the following classes are common:
public class Foo { private String bar; private String baz; private readonly bool specialFlag; public Foo(string bar, string baz, bool specialFlag) { this.bar = bar; this.baz = baz; this.specialFlag = specialFlag; } public bool HasSpecialFlag() { return specialFlag; } } public class NewFoo { public NewFoo(Foo foo) { } } public class NumericFoo { public int Value { get; set; } }
Going from one collection to another conditionally
var foos = new List<Foo> {new Foo("bar1", "baz2", true), new Foo("bar2", "baz2", false)}.Where(foo => foo.HasSpecialFlag()); var newFoos = new List<NewFoo>(); foreach (var foo in foos) { newFoos.Add(new NewFoo(foo)); }
We started off well using the 'Where' method to reduce the original list but we can do even better!
var foos = new List<Foo> {new Foo("bar1", "baz2", true), new Foo("bar2", "baz2", false)}; foos.Where(foo => foo.HasSpecialFlag()) .Select(foo => new NewFoo(foo));
Not significantly less code but it removes the need for more state in our code and makes the code more expressive at the same time which can only be a good thing.
Returning just one result
I came across some code last week which was iterating through a collection and then returning the first item which matched a certain criteria.
This would be useful if we wanted to get the first Foo object which has the special flag set for example.
private Foo GetSpecialFoo(List<Foo> foos) { foreach (var foo in foos) { if(foo.HasSpecialFlag()) { return foo; } } return null; }
My initial thoughts on coming upon this code were that it should be possible to get rid of the loop but I wasn't sure how to do the return. Luckily the 'First' method solves this problem.
private Foo GetSpecialFoo(List<Foo> foos) { return foos.Where(foo => foo.HasSpecialFlag()).First(); }
Accumulating some values
Adding values in collections together is surely one of the most common operations we do and functional collection parameters can help us here too.
var numericFoos = new List<NumericFoo> {new NumericFoo {Value = 2}, new NumericFoo {Value = 5}}; int total = 0; foreach (var numericFoo in numericFoos) { total += numericFoo.Value; }
Introducing a functional approach here gives us the following code:
var numericFoos = new List<NumericFoo> {new NumericFoo {Value = 2}, new NumericFoo {Value = 5}}; int total = numericFoos.Sum(foo => foo.Value);
F# vs C# vs Java: Functional Collection Parameters
I wrote a post about a month ago on using functional collection parameters in C# and over the weekend Fabio and I decided to try and contrast the way you would do this in Java, C# and then F# just for fun.
Map
Map evaluates a high order function on all the elements in a collection and then returns a new collection containing the results of the function evaluation.
Given the numbers 1-5, return the square of each number
Java
int[] numbers = { 1,2,3,4,5}; for (int number : numbers) { System.out.println(f(number)); } private int f(int value) { return value*value; }
C#
new List<int> (new[] {1, 2, 3, 4, 5}.Select(x => x*x)).ForEach(Console.WriteLine);
F#
[1..5] |> List.map (fun x -> x*x) |> List.iter (printfn "%d");;
Filter
Filter applies a predicate against all of the elements in a collection and then returns a collection of elements which matched the predicate.
Given the numbers 1-5, print out only the numbers greater than 3:
Java
int[] numbers = { 1,2,3,4,5}; for (int number : numbers) { f(number); } private void f(int value) { if(value > 3) { System.out.println(value); } }
C#
new List<int> { 1,2,3,4,5}.FindAll(x => x > 3).ForEach(Console.WriteLine);
F#
[1..5] |> List.filter (fun x -> x > 3) |> List.iter (printfn "%d");;
Reduce
Reduce applies a high order function against all the elements in a collection and then returns a single result.
Given a list of numbers 1-5, add them all together and print out the answer
Java
int sum = 0; int[] numbers = { 1,2,3,4,5}; for (int number : numbers) { sum += number; } System.out.println(sum);
C#
Console.WriteLine(new[] {1, 2, 3, 4, 5}.Aggregate(0, (accumulator, x) => accumulator + x));
F#
[1..5] |> List.fold_left (+) 0 |> printfn "%d";;
In Summary
I was surprised that we could achieve these results in relatively few lines of Java. The C# and F# versions are still more concise but the Java version isn't too bad. The Apache Commons Library has a class which allows you to write these in a functional way but the need to use anonymous methods means it's not as clean as what you can achieve in C# and F#.
I think there is still a bit of a mindset switch to make from thinking procedurally about these things to thinking in a way that allows you to make the most of functional programming concepts.
Keeping the code as declarative as possible and reducing the amount of state in our code are the most obvious things I have learned so far from playing with F#.
F#: Partial Function Application with the Function Composition Operator
In my continued reading of F# one of the ideas I've come across recently is that of partial function application.
This is a way of allowing us to combine different functions together and allows some quite powerful syntax to be written.
The term 'currying' is perhaps a better known term for describing this although as I understand they are not exactly the same.
Currying is where we return a function that has been partially applied, in such a way that we can chain together a group of functions with a single argument.
I first came across this idea with the forward piping operator when reading about Matthew Podwysocki's FsTest project but there is an even cleaner way of chaining functions together using Function Composition.
The function composition operator (>>) is defined thus:
> (>>);;
val it : (('a -> 'b) -> ('b -> 'c) -> 'a -> 'c)We take in two functions ('a -> 'b and 'b -> 'c) and one value ('a).
We evaluate the first function ('a -> 'b) with the argument 'a and then pass the result to the second function ('b -> 'c).
The way I understand this:
- The first function takes in 'a (which is the 3rd argument passed to >>) and returns 'b
- The second function takes in 'b (which is the return value of the first function) and returns 'c (which is the return value of >>)
Chris Smith perhaps best explains this as follows:
> let inline (>>) f g x = g(f x)
val inline ( >> ) : ('a -> 'b) -> ('b -> 'c) -> 'a -> 'cGiven two functions (f, g) and a value x compute the result of f(x) and pass the result to g.
From my reading so far this operator makes it even easier to write code in a declarative way (although I suppose the functional programming approach does encourage that in the first place).
We can achieve a lot of this nice declarative style by using the forward piping operator (|>) but the function composition operator takes it one step further:
Say we want to take a list of numbers, square all of them and then only show the negative ones:
let findOddSquares = List.map (fun x-> x*x) >> List.filter (fun x -> x%2 <> 0);;
If we did this using the forward piping operator it would read like this i.e. we need to explicitly define the list:
let findOddSquares list = list |> List.map (fun x-> x*x) |> List.filter (fun x -> x%2 <> 0);;
It's not that much more verbose but there's less code for doing the same thing if we use the function composition operator. I still think the forward piping operator works nicely when we just want to switch the order of the function and the data though:
> [1..10] |> findOddSquares val it : int list = [1; 9; 25; 49; 81]
As with all my F# posts, I'm still learning so please point out if I have anything wrong. Chris Smith has a closer to real life example of how to use partial function application that probably shows the benefits more than I have here.
* Update *
As pointed out in the comments I'm actually finding the odd squares not the negative ones as I originally posted.