Archive for February, 2010
C#: Causing myself pain with LINQ’s delayed evaluation
I recently came across some code which was imperatively looping through a collection and mapping each value to something else by using an injected dependency.
I thought I’d make use of functional collection parameters to simplify the code a bit but actually ended up breaking one of the tests.
About a month ago I wrote about how I’d written a hand rolled stub to simplify a test and this was actually where I caused myself the problem!
The hand rolled stub was defined like this:
public class AValueOnFirstCallThenAnotherValueService : IService
{
    private int numberOfCalls = 0;

    public string SomeMethod(string parameter)
    {
        if(numberOfCalls == 0)
        {
            numberOfCalls++;
            return "aValue";
        }
        else
        {
            numberOfCalls++;
            return "differentValue";
        }
    }
}
The test was something like this:
[Test]
public void SomeTest()
{
    var fooOne = new Foo { Bar = "barOne" };
    var fooTwo = new Foo { Bar = "barTwo" };
    var aCollectionOfFoos = new List<Foo> { fooOne, fooTwo };

    var service = new AValueOnFirstCallThenAnotherValueService();
    var someObject = new SomeObject(service);

    var fooBars = someObject.Method(aCollectionOfFoos);

    Assert.That(fooBars[0].Other, Is.EqualTo("aValue"));
    // and so on
}
The object under test looked something like this:
public class SomeObject
{
    private IService service;

    public SomeObject(IService service)
    {
        this.service = service;
    }

    public IEnumerable<FooBar> Method(List<Foo> foos)
    {
        var fooBars = new List<FooBar>();
        foreach(var foo in foos)
        {
            fooBars.Add(new FooBar { Bar = foo.Bar, Other = service.SomeMethod(foo.Bar) });
        }

        // a bit further down
        var sortedFooBars = fooBars.OrderBy(f => f.Other);

        return fooBars;
    }
}
I decided to try and incrementally refactor the code like so:
public class SomeObject
{
    ...

    public IEnumerable<FooBar> Method(List<Foo> foos)
    {
        var fooBars = foos.Select(f => new FooBar { Bar = f.Bar, Other = service.SomeMethod(f.Bar) });

        // a bit further down
        var sortedFooBars = fooBars.OrderBy(f => f.Other);

        return fooBars;
    }
}
I ran the tests after doing this and the test I described above failed – it was expecting a return value for ‘Other’ of ‘aValue’ but was actually returning ‘differentValue’.
I was a bit confused about what was going on until I watched the test in the debugger and realised that on the ‘OrderBy’ call on line 10 the ‘Select’ call on line 7 was being reevaluated. That meant the value returned by ‘service.SomeMethod’ was ‘differentValue’, since the method was being called for the 3rd and 4th time and it’s set up to return ‘aValue’ only the 1st time.
The way to get around this problem was to force the evaluation of ‘fooBars’ to happen immediately by calling ‘ToList()’:
public class SomeObject
{
    ...

    public IEnumerable<FooBar> Method(List<Foo> foos)
    {
        var fooBars = foos.Select(f => new FooBar { Bar = f.Bar, Other = service.SomeMethod(f.Bar) }).ToList();
        ...
    }
}
In this case it was fairly easy to identify the problem but I’ve written similar code before which has ended up reordering collections with thousands of items because the query was lazily evaluated every time the collection was needed.
In Jeremy Miller’s article about functional C# he suggests memoization as an optimisation technique to stop expensive calls being made more times than they need to be, so perhaps this would be another way to solve the problem, although I haven’t tried that approach before.
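The same trap shows up in any language with lazy sequences. As a rough sketch only, here is a JavaScript analogue of the problem rather than the C# from this post: ‘lazySelect’ is a made-up helper that re-runs its mapping on every iteration, and ‘Array.from’ plays the role of ‘ToList()’.

```javascript
// Stub that returns "aValue" on the first call and "differentValue" afterwards,
// mirroring the hand rolled stub described above.
function createStubService() {
  let numberOfCalls = 0;
  return {
    someMethod(parameter) {
      numberOfCalls++;
      return numberOfCalls === 1 ? "aValue" : "differentValue";
    },
  };
}

// Hypothetical lazy "select": the mapper runs again on every iteration.
function lazySelect(items, mapper) {
  return {
    *[Symbol.iterator]() {
      for (const item of items) {
        yield mapper(item);
      }
    },
  };
}

const foos = [{ bar: "barOne" }, { bar: "barTwo" }];

// Lazy: iterating twice calls the service four times in total.
const lazyService = createStubService();
const lazyFooBars = lazySelect(foos, (f) => ({
  bar: f.bar,
  other: lazyService.someMethod(f.bar),
}));
const firstPass = Array.from(lazyFooBars).map((fb) => fb.other);
const secondPass = Array.from(lazyFooBars).map((fb) => fb.other);

// Eager: materialise once and reuse the resulting array.
const eagerService = createStubService();
const eagerFooBars = Array.from(
  lazySelect(foos, (f) => ({ bar: f.bar, other: eagerService.someMethod(f.bar) }))
);
const eagerFirst = eagerFooBars.map((fb) => fb.other);
const eagerSecond = eagerFooBars.map((fb) => fb.other);
```

In the lazy version the second pass no longer sees ‘aValue’ at all, while the materialised array gives the same values however many times it is read.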
Rules of Thumb: Don’t use the session
A while ago I wrote about some rules of thumb that I’d been taught by my colleagues with respect to software development and I was reminded of one of them – don’t put anything in the session – during a presentation my colleague Luca Grulla gave at our client on scaling applications by making use of the infrastructure of the web.
The problem with putting state in the session is that it means that requests from a specific user have to be tied to a specific server i.e. we have to use a sticky session/session affinity.
This reduces our ability to scale our system horizontally (scale out) i.e. by adding more servers to handle requests.
If, for example, we have a small number of users (whose first request went to the same server) making a lot of requests (perhaps through AJAX calls) then we may quickly put one of our servers under load while the others are sitting there idle.
In addition we have increased complexity around our deployment process.
If we want to do an incremental deployment of a new version of our website across some of our servers then we need to ensure that we create a copy of any sessions on those servers and copy them to the ones we’re not updating so that any users still on the system don’t experience loss of data.
There are no doubt products which allow us to do this more easily but that seems to me to be an unnecessary product in the first place since we can just design our application not to rely on the session.
As I understand it the web was designed to be stateless i.e. each request is independent and all the information is contained within that request and the idea of the session was only something which was added in later on.
How does the way we code change if we don’t use the session?
One thing we’ve often used the session for on projects that I’ve worked on is to store the current state of a form that the user is filling in.
When they’ve completed the form then we would probably store some representation of what they’ve entered in a database.
If we don’t use the session then we need to store this intermediate data somewhere and include a key to load it in the request.
On the project I’m working on at the moment we’re storing that data in a database but then clearing out that data every other day since it’s not needed once the user has completed the form.
An alternative perhaps could be to store it in a cache since in reality all we have is a key/value pair which we need to keep for a relatively short amount of time.
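To make the idea concrete, here is a minimal sketch of a key/value store for intermediate form state with a time to live. Everything in it is an assumption for illustration: the names ‘createDraftStore’, ‘save’ and ‘load’ are made up, and the in-memory ‘Map’ stands in for whatever shared database or cache the servers would really use.

```javascript
// Hypothetical key/value store with expiry. A real deployment would back this
// with a shared database or cache so that any server can handle the request.
function createDraftStore(timeToLiveMs, now = () => Date.now()) {
  const drafts = new Map();
  let nextId = 1;
  return {
    // Save the partially completed form and return the key to send back to
    // the client. Sequential keys are for illustration only; real keys
    // should not be guessable.
    save(formState) {
      const key = "draft-" + nextId++;
      drafts.set(key, { formState, expiresAt: now() + timeToLiveMs });
      return key;
    },
    // Load the draft for a key, or undefined if it is unknown or has expired.
    load(key) {
      const entry = drafts.get(key);
      if (!entry) return undefined;
      if (entry.expiresAt <= now()) {
        drafts.delete(key);
        return undefined;
      }
      return entry.formState;
    },
  };
}
```

Each request then carries only the key, and whichever server receives it calls ‘load’ to get the state back, so no request is tied to a particular server.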
Advantages/disadvantages of this approach
The disadvantage of this approach is that we have to make more reads and writes to the database to deal with this temporary data.
Apart from the advantages I outlined initially, we are also more protected if a server handling a user’s request goes down.
If we were using the session to store intermediate state then that information would be lost and they would have to start over.
With the approach we’re using this isn’t a problem: when the request is sent to another server we can still query the database and get whatever data the user had already saved.
As with most things there’s a trade off to be made but in this case it seems a fair one to me.
Alternative approaches
I’ve come across some alternative approaches where we avoid using the session but don’t store intermediate state in a database.
One way is to store that state in hidden fields on the form and another is to send it in the request parameters.
Neither of these approaches seems particularly clean to me and they give the user an easier way to change the intermediate data in ways that the form might not allow.
From my experience our server side code becomes more complicated since we’re always writing all of the data entered so far back into the page.
In addition the URL becomes a complete mess with the second approach.
F#: Passing an argument to a member constraint
I’ve written previously about function overloading in F# and my struggles working out how to do it, and last week I came across the concept of inline functions and statically resolved type parameters as a potential way to solve that problem.
I came across a problem where I thought I would be able to make use of this while playing around with some code parsing Xml today.
I had a ‘descendants’ function which I wanted to be applicable against ‘XDocument’ and ‘XElement’ so I originally just defined the functions separately forgetting that the compiler wouldn’t allow me to do so as we would have a duplicate definition of the function:
let descendants name (xDocument:XDocument) = xDocument.Descendants name
let descendants name (xElement:XElement) = xElement.Descendants name
I wanted to make use of the inline function to define a function which would allow any type which supported the ‘Descendants’ member:
let inline descendants name (xml:^x) = (^x : (member Descendants : XName -> seq<XElement>) (xml))
I couldn’t work out how I could pass the ‘name’ input parameter to ‘Descendants’ so I was getting the following error:
expected 2 expressions, got 1
I posted the problem to StackOverflow and ‘Brian’ pointed out the syntax that would allow me to do what I wanted:
let inline descendants name (xml:^x) = (^x : (member Descendants : XName -> seq<XElement>) (xml,name))
Tomas Petricek pointed out that in this case we could just write a function which took in ‘XContainer’ since both the other two types derive from that anyway:
let descendants name (xml:XContainer) = xml.Descendants name
In this situation that certainly makes more sense but it’s good to know how to write the version using member constraints for any future problems I come across.
F#: Unexpected identifier in implementation file
I’ve been playing around with some F# code this evening and one of the bits of code needs to make an HTTP call and return the result.
I wrote this code and then tried to make use of the ‘Async.RunSynchronously’ function to execute the call.
The code I had looked roughly like this:
namespace Twitter

module RetrieveLinks

    open System.Net
    open System.IO
    open System.Web
    open Microsoft.FSharp.Control

    let AsyncHttp (url:string) =
        async { let request = HttpWebRequest.Create(url)
                let! response = request.AsyncGetResponse()
                let stream = response.GetResponseStream()
                use reader = new StreamReader(stream)
                return! reader.AsyncReadToEnd() }

    let getData =
        let request = "http://some.url"
        AsyncHttp <| request

    Async.RunSynchronously getData
The problem was I was getting the following error on the last line:
Error 3 Unexpected identifier in implementation file
I’ve seen that error before and it often means that you haven’t imported a reference correctly and hence the compiler doesn’t know what you’re trying to refer to.
In this case I was fairly sure all my references were correct and I was still getting the same error when I used the full namespace to ‘Async.RunSynchronously’ which seemed to suggest I’d done something else wrong.
After comparing this file with another one which was quite similar but didn’t give this error I realised that I’d left off the ‘=’ after the module definition. Putting that in solved the problem.
namespace Twitter

module RetrieveLinks =
    // and so on
As I understand it if we don’t use the ‘=’ then we’ve created a top level module declaration and if we do use the ‘=’ then we’ve created a local module declaration.
You do not have to indent declarations in a top-level module. You do have to indent all declarations in local modules. In a local module declaration, only the declarations that are indented under that module declaration are part of the module.
Given this understanding another way to solve my problem would be to remove the indentation of the functions inside the module like so:
module RetrieveLinks

open System.Net
open System.IO
open System.Web
open Microsoft.FSharp.Control

// and so on until...

Async.RunSynchronously getData
That compiles as expected.
Reading the MSDN page suggests that in my first example I’d created a top level module declaration but indenting the code inside that module somehow meant that the ‘Async.RunSynchronously’ function wasn’t recognised.
I don’t quite understand why that is so if anyone can enlighten me that would be cool!
Javascript: Some stuff I learnt this week
I already wrote about how I’ve learnt a bit about the ‘call’ and ‘apply’ functions in Javascript this week but as I’ve spent the majority of my time doing front end stuff this week I’ve also learnt and noticed some other things which I thought were quite interesting.
Finding character codes
We were doing some testing early in the week where we needed to restrict the characters that could be entered into a text box.
As a result we needed to know the character codes for the banned characters. While googling to work them out we came across Uncle Jim’s CharCode Translator which allows you to type in a character and get its character code and vice versa.
I guess you could easily just call the Javascript functions in FireBug but it’s a nice little utility to save the effort.
Duck typing makes some testing much easier
Related to that we needed to be able to pass in an event object to a function which only made use of the ‘charCode’ property.
In a statically typed language we would have needed to create an event object which had all the properties that an event object needs. In Javascript we could just create the following…
var event = { charCode : 57 };
…and then pass that into the function and check that the result was as expected.
I haven’t done a lot with languages which support duck typing so this is pretty cool to me and I imagine we’d probably see the same advantages of duck typing when testing in languages like Ruby, Python and so on.
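As a small sketch of what such a test can look like, the function below is a hypothetical stand-in for the one we were testing (the name ‘isDigitKeyPress’ and the digits-only rule are assumptions); the point is only that an object literal with a ‘charCode’ property is all it needs.

```javascript
// Hypothetical function under test: allows digits only and reads nothing
// from the event apart from `charCode`.
function isDigitKeyPress(event) {
  return event.charCode >= 48 && event.charCode <= 57;
}

// A plain object literal is enough; no real DOM event is needed.
const nineKey = { charCode: 57 };   // character code for "9"
const poundKey = { charCode: 163 }; // character code for the pound sign

const nineAllowed = isDigitKeyPress(nineKey);
const poundAllowed = isDigitKeyPress(poundKey);
```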
Compressing Javascript files
One of the requirements for my project is that we need to compress all the javascript files used in our application to allow them to be downloaded more quickly by the user.
On a previous project that I worked on we made use of some Javascript minifying code written by Douglas Crockford but on this one we’re making use of the Combres library which does all this work for us and compresses CSS files as well.
I haven’t done a lot with it but so far it seems to work pretty well.
Command query separation
I find it quite intriguing how difficult we’ve sometimes found it to unit test Javascript on some of the projects I’ve worked on without ending up with really complicated tests. It seems to me that perhaps the biggest reason for this is that we’re often writing functions which violate the command query separation principle.
The idea here is that a function should either be a command, i.e. it has some side effect, which in Javascript code usually means DOM manipulation, or a query, i.e. it returns a value, probably based on the input.
Typically we might end up writing a function which validates an input in a text box and tells us whether or not it’s valid, but then also sets up the display of the error message in the same function.
I don’t think this would happen as frequently in Java or C# so perhaps it’s down to the fact that it’s so easy to reference a global variable (e.g. jQuery) that we end up doing so in our code.
It seems like if we could separate these two types of logic then it would be easier to test the query type code in unit tests and we could rely more on Selenium or manual tests to check that the page is being manipulated correctly.
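A minimal sketch of that separation, assuming a made-up amount validation rule and a hypothetical ‘display’ object whose ‘setError’ method would wrap the jQuery/DOM calls in a real page:

```javascript
// Query: pure validation with no DOM access; it only returns a result.
function validateAmount(text) {
  const valid = /^\d+(\.\d{1,2})?$/.test(text);
  return { valid, message: valid ? "" : "Please enter an amount such as 12.50" };
}

// Command: side effects only. `display` is a hypothetical object with a
// setError method; in a page it would update the error element.
function showValidationResult(display, result) {
  display.setError(result.valid ? "" : result.message);
}
```

The query can be unit tested with plain strings, and the command is small enough that a fake ‘display’ object or a Selenium test covers it.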
Javascript: Passing functions around with call and apply
Having read Douglas Crockford’s ‘Javascript: The Good Parts‘ I was already aware that making use of the ‘this’ keyword in Javascript is quite dangerous but we came across what must be a fairly common situation this week where we wanted to pass around a function which made use of ‘this’ internally.
We were writing some JSTestDriver tests around a piece of code which looked roughly like this:
function Common() {
    this.OtherMethod = function(value) {
        // do some manipulation on value
        return someMagicalNewValue;
    };

    this.Method = function(value) {
        return this.OtherMethod(value);
    };
}
In the test we were originally making the following call:
TestCase("Common", {
    testShouldDoSomeStuff:function(){
        var common = new Common();

        var result = common.Method("some value");

        assertEquals("some value", result);
    }
});
After writing a couple of tests it became clear that we were pretty much repeating the same few lines of code over and over so we decided to pull out a function:
function ShouldAssertThatValueIs(f, value, expectedValue) {
    var result = f(value);
    assertEquals(expectedValue, result);
}

TestCase("Common", {
    testShouldDoSomeStuff:function(){
        var common = new Common();

        ShouldAssertThatValueIs(common.Method, "some value", "expected value");
    }
});
When we run that code we get the following error:
TypeError: this.OtherMethod is not a function
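The value of ‘this’ has changed: because the function is now invoked as a plain ‘f(value)’ rather than as a method of ‘common’, ‘this’ no longer refers to the ‘common’ object but to the global object (or ‘undefined’ in strict mode), which doesn’t have an ‘OtherMethod’ defined on it and hence we get the error.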
Luckily we can make use of the call or apply functions to get around this problem and redefine what we want the scope of ‘this’ to be.
With both ‘call’ and ‘apply’ we pass in the object which we want ‘this’ to refer to as the first argument.
We can then pass in any other parameters for our function as an array in the case of ‘apply’ or just as a list of arguments for ‘call’.
K Scott Allen covers this in more detail in his post.
Making use of the ‘call’ function our assertion function would now look like this:
function ShouldAssertThatValueIs(common, f, value, expectedValue) {
    var result = f.call(common, value);
    assertEquals(expectedValue, result);
}

TestCase("Common", {
    testShouldDoSomeStuff:function(){
        var common = new Common();

        ShouldAssertThatValueIs(common, common.Method, "some value", "expected value");
    }
});
In this case it probably makes more sense to use ‘call’ since we only have one parameter to pass to the function. If we had an array of values then we could pass that in using ‘apply’.
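For comparison, here is a small self-contained sketch of the three ways of invoking the detached function. The upper-casing in ‘OtherMethod’ is just a stand-in for the real manipulation, which isn’t shown in this post.

```javascript
function Common() {
  this.OtherMethod = function (value) {
    // Stand-in for "some manipulation on value".
    return value.toUpperCase();
  };
  this.Method = function (value) {
    return this.OtherMethod(value);
  };
}

const common = new Common();
const detached = common.Method; // the function on its own, without its object

// Plain invocation: `this` is not `common`, so looking up OtherMethod fails.
let plainCallFailed = false;
try {
  detached("some value");
} catch (e) {
  plainCallFailed = e instanceof TypeError;
}

// call: `this` first, then the arguments one by one.
const viaCall = detached.call(common, "some value");

// apply: `this` first, then the arguments as an array.
const viaApply = detached.apply(common, ["some value"]);
```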
Looking at the test code at the end of the post as compared to the beginning I’m not too convinced that we’ve actually improved it with this refactoring although it did provide an interesting Javascript lesson for us!
I’m still very much learning Javascript so if I have anything wrong please feel free to point it out or if there’s a better way to do what I’ve described, even better!
F#: Inline functions and statically resolved type parameters
One thing which I’ve often wondered when playing around with F# is why, when writing the following function, the type of the function is inferred to be ‘int -> int -> int’ rather than allowing any values which can be added together:
> let add x y = x + y;;
val add : int -> int -> int
It turns out if you use the ‘inline’ keyword then the compiler does exactly what we want:
> let inline add x y = x + y;;
val inline add : ^a -> ^b -> ^c when ( ^a or ^b) : (static member ( + ) : ^a * ^b -> ^c)
Without the inline modifier type inference forces the function to take a specific type, in this case int. With it the function has a statically resolved type parameter which means that “the type parameter is replaced with an actual type at compile time rather than run time”.
In this case it’s useful to us because it allows us to implicitly define a member constraint on the two input parameters to ‘add’. From the MSDN page:
Statically resolved type parameters are primarily useful in conjunction with member constraints, which are constraints that allow you to specify that a type argument must have a particular member or members in order to be used. There is no way to create this kind of constraint by using a regular generic type parameter.
The neat thing about the second definition is that we can add values of any types which support the ‘+’ operator:
> add "mark" "needham";;
val it : string = "markneedham"

> add 1.0 2.0;;
val it : float = 3.0
From a quick look at the IL code in Reflector it looks like the ‘add’ function defined here makes use of the ‘AdditionDynamic‘ function internally to allow it to be this flexible.
One thing which I found quite interesting while reading about inline functions is that it sounds like it’s quite similar to duck typing in that we’re saying a function can be passed any value which supports a particular method.
Michael Giagnocavo has a post where he covers the idea of statically resolved type parameters in more detail and describes what he refers to as ‘statically typed duck typing’.
Javascript: File encoding when using string.replace
We ran into an interesting problem today when moving some Javascript code which was making use of the ‘string.replace’ function to strip out the £ sign from some text boxes on a form.
The code we had written was just doing this:
var textboxValue = $("#fieldId").val().replace(/£/, '');
So having realised that we had this code all over the place we decided it would make sense to create a common function that strips the pound sign out. These common functions reside in a different js file to the original code.
function Common() {
    this.stripPounds = function(value) {
        return value.replace(/£/, '');
    };
}
We replaced the above code with a call to that instead:
var textboxValue = new Common().stripPounds($("#fieldId").val());
Having done this we realised that the £ sign was no longer being replaced despite the fact that the code was pretty much identical.
After a lot of fiddling around Brian eventually realised that the js file containing ‘Common’ was ANSI encoded when we actually needed it to be UTF-8 encoded, probably because we created it in Visual Studio.
As a result the £ sign is presumably being read as some other character which means the replacement doesn’t happen anymore.
Converting the file to UTF-8 encoding fixed the problem for us but it’s certainly not something I’d have ever thought of.
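Another option, which isn’t what we did, is to keep non-ASCII characters out of the source altogether by writing the pound sign as the Unicode escape ‘\u00A3’. A minimal sketch, written as a standalone function rather than a method on ‘Common’, and with the ‘g’ flag added as an assumption so that every pound sign is removed rather than just the first:

```javascript
// "\u00A3" is the pound sign. Written as an escape, the source file contains
// only ASCII characters, so the file's encoding no longer affects the pattern.
function stripPounds(value) {
  // The g flag removes every occurrence; the code in this post had no flag
  // and so removed only the first one.
  return value.replace(/\u00A3/g, "");
}
```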
Functional C#: Extracting a higher order function with generics
While working on some code with Toni we realised that we’d managed to create two functions that were almost exactly the same except they made different service calls and returned collections of a different type.
The similar functions were like this:
private IEnumerable<Foo> GetFoos(Guid id)
{
    IEnumerable<Foo> foos = new List<Foo>();

    try
    {
        foos = fooService.GetFoosFor(id);
    }
    catch (Exception e)
    {
        // do some logging of the exception
    }

    return foos;
}

private IEnumerable<Bar> GetBars(Guid id)
{
    IEnumerable<Bar> bars = new List<Bar>();

    try
    {
        bars = barService.GetBarsFor(id);
    }
    catch (Exception e)
    {
        // do some logging of the exception
    }

    return bars;
}
We’re defining the empty lists so that if the service throws an exception we can make use of an empty list further on in the code. A failure of the service in this context doesn’t mean that the application should stop functioning.
My thinking here was that we should be able to pull out the service calls into a function but the annoying thing is that they return different types of collections so I initially thought that we’d be unable to remove the duplication.
Thinking about the problem later on I realised we could make the function generic in the type of collection that the service call returns.
We therefore end up with this solution:
private IEnumerable<Bar> GetBars(Guid id)
{
    return GetValues(() => barService.GetBarsFor(id));
}

private IEnumerable<Foo> GetFoos(Guid id)
{
    return GetValues(() => fooService.GetFoosFor(id));
}

private IEnumerable<T> GetValues<T>(Func<IEnumerable<T>> getValues)
{
    IEnumerable<T> values = new List<T>();

    try
    {
        values = getValues();
    }
    catch (Exception e)
    {
        // do some logging of the exception
    }

    return values;
}
I think the code is still quite readable and it’s relatively obvious what it’s supposed to be doing.
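The higher order function part of this carries over to other languages too. As a rough sketch in JavaScript, where there are no type parameters so only the shape of the refactoring is shown, and where ‘getValuesOrEmpty’ and the ‘log’ callback are made-up names:

```javascript
// Runs `getValues` and falls back to an empty array if it throws.
// `log` is a hypothetical logging callback that defaults to doing nothing.
function getValuesOrEmpty(getValues, log = () => {}) {
  try {
    return getValues();
  } catch (e) {
    log(e);
    return [];
  }
}
```

A caller would wrap each service call in a function, for example ‘getValuesOrEmpty(() => fooService.getFoosFor(id))’, with ‘fooService’ standing in for whichever service is being called.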
Willed vs Forced designs
I came across an interesting post that Roy Osherove wrote a few months ago where he talks about ‘Willed vs Forced Designs‘ and some common arguments that people give for not using TypeMock on their projects.
I’m not really a fan of the TypeMock approach to dealing with dependencies in tests because it seems to avoid the fact that the code is probably bad in the first place if we have to resort to using some of the approaches it encourages.
Having said that Roy makes the following point which I think is quite accurate:
You let an automated tool (rhino mocks, Moq etc..) tell you when your design is OK or not. That point alone should go against anything ALT.NET has ever stood for, doesn’t it? If you need a tool to tell you what is good or bad design, then you are doing it wrong.
While it is true that it’s useful to be able to know for ourselves whether our code is drifting into territory where it’s become way too complicated, I think it is useful to have the tests as a reminder that this is becoming the case.
It’s quite easy, when we have a delivery deadline and are under pressure, to stop being as observant about the quality of what we’re coding and to rush to complete our particular task.
In these situations it can be useful to be restricted by our framework to the extent that the pain we’ll feel in trying to test our code will act as an indicator that we’re doing something wrong.
What I found interesting when reading Roy’s post is that the arguments sound quite similar to the discussion a couple of years ago about whether using Mockito instead of jMock was bad because it hides design problems that you have with dependencies. Steve Freeman wrote the following comment on Dan’s post:
But, it also became clear that he wrote Mockito to address some weak design and coding habits in his project and that the existing mocking frameworks were doing exactly their job of forcing them into the open. How a team should respond to that feedback is an interesting question.
In the meantime, I’ve found that I /can/ teach using the existing frameworks if I concentrate on what they were intended for: focusing on the relationships between collaborating objects. I’ve seen quite a few students light up when they get the point. In fact, the syntactic noise in jMock really helps to bring this out, whereas it’s easy for it to get lost with easymock and mockito.
In this case I definitely prefer the style of mocking that we get with Mockito over jMock even though I’ve worked on code bases where we’ve created objects with way too many dependencies and haven’t felt the pain as much because the framework is so easy to use.
I can’t think of a compelling argument for why this is different to the TypeMock vs other mocking frameworks argument. It seems to be a similar argument around dependencies in our code.
The other thing I’m intrigued about is whether the choice of framework should be in some way linked to the level of skill of the people who are going to use it.
If someone is a Dreyfus Model novice with respect to object oriented design then it would make much more sense to use a tool which makes it really obvious that they’re doing something wrong. In that case using a perhaps more limited tool would just be a quick feedback mechanism.
Once we have a bit more skill then it would seem more appropriate to use the more powerful tool which we have the ability to abuse but hopefully now have the experience to know when we can and cannot get away with doing so.
In the end the argument seems quite similar to ones I’ve often heard about programming in Ruby and whether or not we should give programmers powerful language features because they’re liable to hang themselves.
In conclusion I’m thinking that perhaps TypeMock in experienced hands isn’t such a bad thing and could actually be useful in some select situations but would probably be quite a dangerous tool for someone new to the whole unit testing game.