Mark Needham

Thoughts on Software Development

Archive for the ‘Haskell’ Category

Haskell: Mixed type lists

with 3 comments

I’ve been continuing to work through the exercises in The Little Schemer and came across a problem which needed me to write a function to take a mixed list of Integers and Strings and filter out the Integers.

As I mentioned in my previous post I’ve been doing the exercises in Haskell but I thought I might struggle with that approach here because Haskell collections are homogeneous i.e. all the elements need to be of the same type.

I read about existentially quantified types but they seemed a bit complicated and instead I decided to use the Dynamic interface.

Using Dynamic we can define a function to strip out the numbers like this:

import Data.Dynamic
import Data.Maybe
 
noNums :: [Dynamic] -> [Dynamic]
noNums lat = cond [(null lat, []), 
                   (isNumber (head lat), noNums (tail lat)),
                   (otherwise, head lat : noNums (tail lat))]
 
justInt :: Dynamic -> Maybe Int
justInt dyn = fromDynamic dyn :: Maybe Int
 
isNumber :: Dynamic -> Bool
isNumber x = isJust $ justInt x

We can then call the function like this:

> map toString $ noNums [toDyn (5 :: Int), toDyn "pears", toDyn (6 :: Int), toDyn "prunes", toDyn (9 :: Int), toDyn "dates"]
[Just "pears",Just "prunes",Just "dates"]
toString :: Dynamic -> Maybe String
toString dyn = fromDynamic dyn

fromDynamic eventually makes a call to unSafeCoerce#:

The function unsafeCoerce# allows you to side-step the typechecker entirely. That is, it allows you to coerce any type into any other type. If you use this function, you had better get it right, otherwise segmentation faults await. It is generally used when you want to write a program that you know is well-typed, but where Haskell’s type system is not expressive enough to prove that it is well typed.

I wanted to try and make the ‘isNumber’ function handle any numeric type rather than just Ints but I haven’t quite worked out how to do that.

Obviously I’m only using Dynamic here because the exercise requires it but I’m not sure what real life situation would require its use.

If anyone has used it before or knows a use case I’d be interested to know what it is!

Written by Mark Needham

June 19th, 2012 at 11:09 pm

Posted in Haskell

Tagged with

Haskell: Writing a function that can take Ints or Doubles

with 2 comments

In my continued reading of SICP I wanted to recreate a ‘sum’ function used to demonstrate a function which could take another function as one of its parameters.

In Scheme the function is defined like this:

(define (sum term a next b)
  (if (> a b)
      0
      (+ (term a)
         (sum term (next a) next b))))

And can be used like this to sum the values between two numbers:

(define (identity x) x)
 
(define (sum-integers a b)
  (sum identity a inc b))
> (sum-integers 1 10)
55

I translated it into Haskell as the following:

sicpSum :: (Int -> Int) -> Int -> (Int -> Int) -> Int -> Int
sicpSum term a next b | a > b = 0
                      | otherwise = term a + sicpSum term (next a) next b

The ‘sum-integers’ function translates like this:

sumIntegers :: Int -> Int -> Int
sumIntegers a b = sicpSum id a inc b
> sumIntegers 1 10
55

It works fine with integers but later on in the chapter it’s used to define a function which returns a double:

(define (pi-sum a b)
  (define (pi-term x)
    (/ 1.0 (* x (+ x 2))))
  (define (pi-next x)
    (+ x 4))
  (sum pi-term a pi-next b))
> (* 8 (pi-sum 1 1000))
3.139592655589783

I tried writing my ‘piSum’ function to call the existing ‘sumIntegers':

piSum a b = sicpSum piTerm a piNext b where
	piTerm x = 1 / (x * (x + 2)) 
	piNext x = x + 4

Which unfortunately doesn’t compile because of our use of ‘/':

sicp.hs:124:22:
    No instance for (Fractional Int)
      arising from a use of `piTerm'
    Possible fix: add an instance declaration for (Fractional Int)
    In the first argument of `sicpSum2', namely `piTerm'
    In the expression: sicpSum2 piTerm a piNext b
    In an equation for `piSum':
        piSum a b
          = sicpSum2 piTerm a piNext b
          where
              piTerm x = 1 / (x * x + 2)
              piNext x = x + 4
> :t (/)
(/) :: Fractional a => a -> a -> a

I need to make ‘sicpSums’ generic enough to take in a Double or Int in order to reuse it here.

Most of the functions I’ve written have been for very specific types although a lot of the Haskell examples I’ve come across tend to have very generic type signatures.

I found quite a cool diagram on a Haskell wiki which shows which types inherit different type classes.

Haskell classes small

In this case both Int and Double derive from the Num type class so we can redefine ‘sicpSum’ in terms of that:

sicpSum :: (Ord a, Num a1) => (a -> a1) -> a -> (a -> a) -> a -> a1
sicpSum term a next b | a > b = 0
                      | otherwise = term a + sicpSum term (next a) next b

We had to also make sure that ‘a’ inherits the ‘Ord’ type class because of the comparison between a and b that we do on the second line.

‘piSum’ can then make use of the new ‘sicpSum’ in its definition:

piSum :: (Ord a1, Fractional a1) => a1 -> a1 -> a1
piSum a b = sicpSum piTerm a piNext b where
	piTerm x = 1 / (x * (x + 2)) 
	piNext x = x + 4

We can then use it like this:

> 8 * (piSum 1 1000)
3.139592655589783

Obviously this is a very simple example but I haven’t written any functions which could take in different types so I thought I’d document how I did it, especially because the diagram of the type classes is really useful!

Written by Mark Needham

June 5th, 2012 at 12:10 am

Posted in Haskell

Tagged with

Haskell: Building a range of numbers from command line arguments

without comments

I’m working through some of the SICP problems in Haskell and for problem 1.22 you need to write a function which will indicate the first 3 prime numbers above a starting value.

It is also suggested to only consider odd numbers so to find the prime numbers above 1000 the function call would look like this:

> searchForPrimes [1001,1003..]
[1009,1013,1019]

I wanted to be able to feed in the range of numbers from the command line so that I’d be able to call the function with different values and see how long it took to work it out.

I initially tried passing the above array to the program as a command line argument wrapped in a string and then parse it using read which works if you provide a complete array:

> read "[1,2]" :: [Int]
[1,2]

But not so well for ranges:

> read "[1,2..]" :: [Int]
*** Exception: Prelude.read: no parse

Instead I thought I could just provide the first two values for the proposed range and then work from there.

To help me do this I came across the enumFromThen function which has the following signature:

> :t enumFromThen
enumFromThen :: Enum a => a -> a -> [a]

From the Haskell source this function seems to call the literal range syntax which I mentioned above:

enumFromThen x y = map toEnum [fromEnum x, fromEnum y ..]

I was actually expecting there to be a literal range syntax definition which made a call to the various enumFrom… methods since the literal syntax seems more like syntactic sugar on top of the function call approach but apparently not!

I use it like this:

main = do
  args <- getArgs
  let (first:second:_) = map read args in 
    putStrLn $ show $ searchForPrimes (enumFromThen first second)

Which I suppose could just as easily be written like this:

main = do
  args <- getArgs
  let (first:second:_) = map read args in 
    putStrLn $ show $ searchForPrimes ([first, second..])

And I’ve gone full circle!

Written by Mark Needham

June 3rd, 2012 at 8:13 pm

Posted in Haskell

Tagged with

Haskell: Using type classes to generify Project Euler #31

with 2 comments

As I mentioned in my previous post I’ve been working on Project Euler #31 and initially wasn’t sure how to write the algorithm.

I came across a post on StackOverflow which explained it in more detail but unfortunately the example used US coins rather than UK ones like in the Project Euler problem.

To start with I created two versions of the function – one for US coins and one for UK coins:

combinations :: Int -> [USCoin] -> [USCoin] -> [[USCoin]]
combinations 0 current _ = [current]
combinations _  current [] = []
combinations total p (c:cs) = concatMap (\ times -> combinations (total - (times * fromEnum c)) (p ++ replicate times c) cs) 
                                        [0,1..(total `div` fromEnum c)]
combinationsUK :: Int -> [UKCoin] -> [UKCoin] -> [[UKCoin]]
combinationsUK 0 current _ = [current]
combinationsUK _  current [] = []
combinationsUK total p (c:cs) = concatMap (\ times -> combinationsUK (total - (times * fromEnum c)) (p ++ replicate times c) cs) 
                                          [0,1..(total `div` fromEnum c)]

As we can see the only difference between the two functions is the type being passed around and they are almost identical as well:

data USCoin = Quarter | Dime | Nickel | Penny deriving (Eq, Show)
instance Enum USCoin where
    fromEnum = fromJust . flip lookup usTable
    toEnum = fromJust . flip lookup (map swap usTable)
usTable = [(Quarter, 25), (Dime, 10), (Nickel, 5), (Penny, 1)]
data UKCoin = OnePence | TwoPence | FivePence | TenPence | TwentyPence | FiftyPence | OnePound | TwoPounds deriving (Eq, Show)
instance Enum UKCoin where
    fromEnum = fromJust . flip lookup ukTable
    toEnum = fromJust . flip lookup (map swap ukTable)
ukTable = [(OnePence, 1), (TwoPence, 2), (FivePence, 5), (TenPence, 10), 
           (TwentyPence, 20), (FiftyPence, 50), (OnePound, 100), (TwoPounds, 200)]

We can run those two functions like this:

> length $ combinations 200 [] (reverse $ map fst usTable)
1463
 
> length $ combinationsUK 200 [] (reverse $ map fst ukTable)
73682

After spending a lot of the past week reading about type classes I figured we could probably make use of one here so I created a ‘Coin’ type class:

class Coin a where 
  value :: a -> Int

And then implemented that with ‘USCoin’ and ‘UKCoin':

instance Coin USCoin where value coin = fromEnum coin
instance Coin UKCoin where value coin = fromEnum coin

Then we need to make some small adjustments to ‘combinations’ to make it work with ‘Coin’ instead:

combinations :: (Coin a) => Int -> [a] -> [a] -> [[a]]
combinations 0 current _ = [current]
combinations _  current [] = []
combinations total p (c:cs) = 
  concatMap (\ times -> combinations (total - (times * value c)) (p ++ replicate times c) cs) 
            [0,1..(total `div` value c)]

Instead of calling ‘fromEnum’ we call ‘value’ and the function can now be used with any types which implement the Coin type class should we wish to do that.

We can actually go even further and get rid of our Enum type class instances and just put that code onto the Coin type class:

class Eq a => Coin a where 
  table :: [(a, Int)]
 
  value :: a -> Int
  value = fromJust . flip lookup table
instance Coin USCoin where 
  table = [(Quarter, 25), (Dime, 10), (Nickel, 5), (Penny, 1)]	
 
instance Coin UKCoin where 
  table = [(OnePence, 1), (TwoPence, 2), (FivePence, 5), (TenPence, 10), (TwentyPence, 20), (FiftyPence, 50), (OnePound, 100), (TwoPounds, 200)]

We can then call the function with either of those coins:

> length $ combinations 200 [] (reverse $ map fst (table :: [(USCoin, Int)]))
1463
 
> length $ combinations 200 [] (reverse $ map fst (table :: [(UKCoin, Int)]))
73682

Written by Mark Needham

May 30th, 2012 at 12:08 pm

Posted in Haskell

Tagged with

Haskell: Java Style Enums

with 2 comments

I’ve been playing around with problem 31 of Project Euler which is defined as follows:

In England the currency is made up of pound, £, and pence, p, and there are eight coins in general circulation:

1p, 2p, 5p, 10p, 20p, 50p, £1 (100p) and £2 (200p).

It is possible to make £2 in the following way:

1 £1 + 150p + 220p + 15p + 12p + 31p

How many different ways can £2 be made using any number of coins?

Having coded way too much in Java my first thought was that the coins could be represented as an Enum but I wasn’t sure how to do that in Haskell.

As it turns out the question has been asked on Stack Overflow and there’s an Enum type class we can implement to achieve the same thing.

In this case:

data UKCoin = OnePence | TwoPence | FivePence | TenPence | TwentyPence | FiftyPence | OnePound | TwoPounds 
              deriving (Eq, Show)
instance Enum UKCoin where
    fromEnum = fromJust . flip lookup table
    toEnum = fromJust . flip lookup (map swap table)
table = [(OnePence, 1), (TwoPence, 2), (FivePence, 5), (TenPence, 10), (TwentyPence, 20), 
         (FiftyPence, 50), (OnePound, 100), (TwoPounds, 200)]

If we want to get the monetary value of a coin we use ‘fromEnum':

> fromEnum TwoPounds
200

And if we wanted to go back to the enum value we’d do this:

> (toEnum 200) :: UKCoin
TwoPounds

If we want to get a list of all the instances of ‘UKCoin’ we can just do a map over ‘table':

> let ukCoins = map fst table
> ukCoins
[OnePence,TwoPence,FivePence,TenPence,TwentyPence,FiftyPence,OnePound,TwoPounds]

Written by Mark Needham

May 30th, 2012 at 11:10 am

Posted in Haskell

Tagged with

Haskell: Finding the minimum & maximum values of a Foldable in one pass

without comments

I recently came across Dan Piponi’s blog post ‘Haskell Monoids & their Uses‘ and towards the end of the post he suggests creating monoids to work out the maximum and minimum values of a Foldable value in one pass.

The Foldable type class provides a generic approach to walking through a datastructure, accumulating values as we go.

The foldMap function applies a function to each element of our structure and then accumulates the return values of each of these applications.

A list is one example of a type which implements the Foldable type class like so:

instance Foldable [] where
    foldr = Prelude.foldr
    foldl = Prelude.foldl
    foldl' = List.foldl'
    foldr1 = Prelude.foldr1
    foldl1 = Prelude.foldl1

In this case it delegates all of the Foldable functions to previously defined functions from the Prelude or List modules.

The ‘foldMap’ function mentioned above has the following default implementation:

foldMap :: Monoid m => (a -> m) -> t a -> m
foldMap f = foldr (mappend . f) mempty

Dan shows an example where we use ‘foldMap’ to check whether any values in a Foldable have the value 1 by using the ‘Any’ monoid:

> foldMap (Any . (== 1)) [1..20]
Any {getAny = True}

I couldn’t quite understand how it worked so I expanded the ‘foldMap’ definition which results in the following:

> Data.Foldable.foldr (\x acc -> acc `mappend` (Any . (== 1)) x) mempty [1..20]
Any {getAny = True}

So we’re folding over the list with an initial seed value of ‘Any False’ as that’s what mempty will evaluate to. For each value we’re then calling ‘mappend’ with two ‘Any’ values and passing on the result.

So in this example it would go like this:

Any False `mappend` (Any . (==1)) 20 => Any False `mappend` Any False => Any False
Any False `mappend` (Any . (==1)) 19 => Any False `mappend` Any False => Any False
...
Any False `mappend` (Any . (==1)) 1 => Any False `mappend` Any True => Any True

In our case we need to define our own monoids to keep track of the maximum value in a Foldable:

import Data.Monoid
import Data.Foldable
 
data MaxSoFar a = MaxSoFar { getMaxSoFar :: a } deriving (Eq, Ord, Read, Show, Bounded)
instance (Num a,  Ord a, Bounded a) => Monoid (MaxSoFar a) where
  mempty = MaxSoFar (minBound)
  MaxSoFar x `mappend` MaxSoFar y = MaxSoFar (max x y)

We set ‘mempty’ to the minimum possible Int value so that anything Int value compared to it will automatically become the max value when it gets compared to that.

For ‘mappend’ we take the values stored in the two ‘MaxSoFar’ instances and then select the maximum and return a new MaxSoFar.

We call foldMap like so:

> foldMap (MaxSoFar) ([1..20] :: [Int])
MaxSoFar {getMaxSoFar = 20}

MinSoFar is pretty similar in design:

data MinSoFar a = MinSoFar { getMinSoFar :: a } deriving (Eq, Ord, Read, Show, Bounded)
instance (Num a,  Ord a, Bounded a) => Monoid (MinSoFar a) where
  mempty = MinSoFar maxBound
  MinSoFar x `mappend` MinSoFar y = MinSoFar (min x y)

And we can then combine them together like this:

> foldMap (\x -> (MaxSoFar x, MinSoFar x)) ([1..20] :: [Int])
(MaxSoFar {getMaxSoFar = 20},MinSoFar {getMinSoFar = 1})

This works because we can make use of what Dan calls the ‘Product Monoid’, which is a Monoid instance for a tuple:

instance (Monoid a, Monoid b) => Monoid (a,b) where
  mempty = (mempty, mempty)
  (a1,b1) `mappend` (a2,b2) = (a1 `mappend` a2, b1 `mappend` b2)

So in our example the fold would go like this:

(MaxSoFar -9223372036854775808, MinSoFar 9223372036854775807) `mappend` (MaxSoFar 20, MinSoFar 20) => (MaxSoFar 20, MinSoFar 20)
(MaxSoFar 20, MinSoFar 20) `mappend` (MaxSoFar 19, MinSoFar 19) => (MaxSoFar 20, MinSoFar 19)
...
(MaxSoFar 20, MinSoFar 2) `mappend` (MaxSoFar 1, MinSoFar 1) => (MaxSoFar 20, MinSoFar 1)

We could probably achieve the same thing without using monoids but I think this approach is pretty neat.

Written by Mark Needham

May 28th, 2012 at 11:18 am

Posted in Haskell

Tagged with

Haskell: Debugging code

without comments

In my continued attempts to learn QuickCheck, one thing I’ve been doing is comparing the results of my brute force and divide & conquer versions of the closest pairs algorithm.

I started with this property:

let prop_dc_bf xs = (length xs > 2) ==> (fromJust $ bfClosest xs) == dcClosest xs

And then ran it from GHCI, which resulted in the following error:

> quickCheck (prop_dc_bf  :: [(Double, Double)] -> Property)
*** Failed! Falsifiable (after 8 tests and 19 shrinks):     
[(-165.0,8.0),(5.0,47.0),(6.0,0.0),(8.0,0.0),(2.0,1.0),(-129.0,17.0),(-15.0,11.0)]

I wasn’t really sure where to start to work out what had gone wrong but I eventually came across Debug.Trace which has the ‘trace‘ function which I found quite useful.

trace :: String -> a -> aSource

The trace function outputs the trace message given as its first argument, before returning the second argument as its result.

For example, this returns the value of f x but first outputs the message.

trace (“calling f with x = ” ++ show x) (f x)

The initial divide and conquer algorithm read like this:

dcClosest :: (Ord a, Floating a) => [Point a] -> Pair a
dcClosest pairs
  | length pairs <= 3 = fromJust $ bfClosest pairs    
  | otherwise = foldl (\closest (p1:p2:_) -> minimumBy (compare `on` distance') [closest, Pair p1 p2])
                      closestPair 
                      (windowed 2 pairsWithinMinimumDelta)
  where sortedByX = sortBy compare pairs	      
        (leftByX:rightByX:_) = chunk (length sortedByX `div` 2) sortedByX    
        closestPair = minimumBy (compare `on` distance') [closestLeftPair, closestRightPair]
          where closestLeftPair =  dcClosest leftByX
                closestRightPair = dcClosest rightByX     
        pairsWithinMinimumDelta = sortBy (compare `on` snd) $ filter withinMinimumDelta sortedByX
          where withinMinimumDelta (x, _) = abs (xMidPoint - x) <= distance' closestPair
                  where (xMidPoint, _) = last leftByX

So I put a ‘trace’ before the ‘foldl’ function to see what was being passed to it:

dcClosest :: (Ord a, Floating a) => [Point a] -> Pair a
dcClosest pairs
  | length pairs <= 3 = fromJust $ bfClosest pairs    
  | otherwise = trace ("passed in the following: " ++ show (windowed 2 pairsWithinMinimumDelta)) 
                foldl (\closest (p1:p2:_) -> minimumBy (compare `on` distance') [closest, Pair p1 p2])
                      closestPair 
                      (windowed 2 pairsWithinMinimumDelta)
...
> dcClosest [(-165.0,8.0),(5.0,47.0),(6.0,0.0),(8.0,0.0),(2.0,1.0),(-129.0,17.0),(-15.0,11.0)]
passed in the following: []
(2.0,1.0) (6.0,0.0)

The function still works the same way but it prints out the trace message as well.

Having run this a few times I realised that I’d made a mistake with the values that I’d passed to ‘foldl’ – I was supposed to pass the combination of all the pairs in ‘pairsWithinMinimumDelta’ but had actually only passed in adjacent ones.

The following change needs to be made to fix that:

dcClosest :: (Ord a, Floating a) => [Point a] -> Pair a
dcClosest pairs
  | length pairs <= 3 = fromJust $ bfClosest pairs    
  | otherwise =  foldl (\closest (p1, p2) -> minimumBy (compare `on` distance') [closest, Pair p1 p2])
                      closestPair 
                      (combos pairsWithinMinimumDelta)
  where sortedByX = sortBy compare pairs	      
        (leftByX:rightByX:_) = chunk (length sortedByX `div` 2) sortedByX    
        closestPair = minimumBy (compare `on` distance') [closestLeftPair, closestRightPair]
          where closestLeftPair =  dcClosest leftByX
                closestRightPair = dcClosest rightByX     
        pairsWithinMinimumDelta = sortBy (compare `on` snd) $ filter withinMinimumDelta sortedByX
          where withinMinimumDelta (x, _) = abs (xMidPoint - x) <= distance' closestPair
                  where (xMidPoint, _) = last leftByX	
combos :: [a] -> [(a,a)]
combos initial = 
	[(initial !! i, initial !! j) | i <- [0..length initial - 1], j <- [i+1..length initial-1 ], i /= j]

It’s still not perfect because we end up doing too many comparisons in the ‘foldl’ part of the code.

It does now, however, give the same results as the brute force version.

> quickCheck (prop_dc_bf  :: [(Double, Double)] -> Property)
+++ OK, passed 100 tests.

Written by Mark Needham

May 27th, 2012 at 10:16 pm

Posted in Haskell

Tagged with

Haskell: Using monoids when sorting by multiple parameters

with 7 comments

On the project I’ve been working on we had a requirement to sort a collection of rows by 4 different criteria such that if two items matched for the first criteria we should consider the second criteria and so on.

If we wrote that code in Haskell it would read a bit like this:

data Row = Row { shortListed :: Bool, cost :: Float, distance1 :: Int, distance2 :: Int } deriving (Show, Eq)
import Data.Ord
import Data.List
 
compareRow :: Row -> Row -> Ordering
compareRow x y = if comparing (not . shortListed) x y == EQ then 
                    if comparing cost x y == EQ then
                      if comparing distance1 x y == EQ then
                        comparing distance2 x y
                      else
                        comparing distance1 x y
                    else
                      comparing cost x y
                  else 
                    comparing (not . shortListed) x y

We can then sort a list of rows like this:

> let shortListedRow = Row {shortListed = True, cost = 10.0, distance1 = 20, distance2 = 30 }
 
> let nonShortListedRow = Row {shortListed = False, cost = 10.0, distance1 = 20, distance2 = 30 }
 
> sortBy compareRow [nonShortListedRow, shortListedRow]
[Row {shortListed = True, cost = 10.0, distance1 = 20, distance2 = 30},
 Row {shortListed = False, cost = 10.0, distance1 = 20, distance2 = 30}]

It works but it’s messy and we couldn’t see what abstraction we should be using to simplify the code.

I was continuing with my reading of Functors, Applicative Functors and Monoids yesterday and got to the section on Monoids which showed an example for simplifying this type of code.

The definition of a Monoid from the Haskell source code is:

Types with an associative binary operation that has an identity

But I prefer Dan Piponi’s definition:

In Haskell, a monoid is a type with a rule for how two elements of that type can be combined to make another element of the same type.

To be a monoid there also needs to be an element that you can think of as representing ‘nothing’ in the sense that when it’s combined with other elements it leaves the other element unchanged.

In our case we have a bunch of things of type ‘Ordering’ and we want to combine them all together and end up with a final ‘Ordering’ which takes them all into account.

For example if we were comparing the following two rows:

> let row1 = Row {shortListed = True, cost = 10.0, distance1 = 1, distance2 = 30 }
> let row2 = Row {shortListed = True, cost = 10.0, distance1 = 100, distance2 = 30 }
> compareRow row1 row2
LT

When we compare their shortListed value we get back ‘EQ’, so we compare their cost value and get back ‘EQ’ and finally we compare their distance1 value which gives back ‘LT’ which is our final value.

We can make use of the Ordering Monoid to do this rather than all the nested if statements.

Monoid is a type class defined like so:

class Monoid a where
  mempty  :: a
  mappend :: a -> a -> a
  mconcat :: [a] -> a
  mconcat = foldr mappend mempty

‘mempty’ represents the identity value for a monoid i.e. the ‘nothing’ in Dan Piponi’s definition. If we combine anything with this then we should get that thing back.

The most interesting function here is ‘mappend’ which we use to combine together two elements of a type. Each instance of Monoid needs to define this function for themselves.

The Ordering Monoid is defined like so:

instance Monoid Ordering where
  mempty         = EQ
  LT `mappend` _ = LT
  EQ `mappend` y = y
  GT `mappend` _ = GT

What makes this work for us is that we always keep the value on the left unless it’s ‘EQ’ in which case we take the value on the right.

Therefore as soon as one of our comparisons returns a non ‘EQ’ value that will be the value that eventually gets returned.

e.g.

> GT `mappend` LT `mappend` EQ
GT

Our ‘row1’/’row2′ comparison would look like this using ‘mappend':

> EQ `mappend` EQ `mappend` LT
LT

We can then change our ‘compareRow’ function:

compareRow x y = comparing (not . shortListed) x y `mappend` comparing cost x y `mappend` comparing distance1 x y `mappend` comparing distance2 x y

We can simplify this further by making use of ‘mconcat’ which folds over a list of monoids applying ‘mappend’ each time.

For example we could replace our ‘row1’/’row2′ comparison with the following:

> mconcat [EQ, EQ, LT]
LT

And ‘compareRow’ now reads like this:

compareRow x y = mconcat [comparing (not . shortListed) x y,  comparing cost x y, comparing distance1 x y, comparing distance2 x y]

We’re still repeating the ‘comparing’ bit of code every time so I extracted that into a function:

by :: Ord a => (b -> a) -> b -> b -> Ordering
by fn x y = comparing fn x y

We then need to apply those functions to x and y to get our collection of monoids we can pass to mconcat:

compareRow x y = mconcat $ map (\fn -> fn x y) [by (not . shortListed), by cost, by distance1, by distance2]

One problem with this code is that we’re now comparing by all the parameters when we can actually stop once we’ve found a non equality.

I’m sure there’s a clever way to short circuit this but I’m not sure what it is at the moment!

As BeRewt points out in the comments what I wrote here is not actually the case, it is lazily evaluated!

Written by Mark Needham

May 23rd, 2012 at 6:44 am

Posted in Haskell

Tagged with

Scala/Haskell: A simple example of type classes

without comments

I never really understood type classes when I was working with Scala but I recently came across a video where Dan Rosen explains them pretty well.

Since the last time I worked in Scala I’ve been playing around with Haskell where type classes are much more common – for example if we want to compare two values we need to make sure that their type extends the ‘Eq’ type class.

Learn Me a Haskell’s chapter on type classes defines them like so:

Typeclasses are like interfaces.

A typeclass defines some behaviour (like comparing for equality, comparing for ordering, enumeration) and then types that can behave in that way are made instances of that typeclass.

The behaviour of typeclasses is achieved by defining functions or just type declarations that we then implement. So when we say that a type is an instance of a typeclass, we mean that we can use the functions that the typeclass defines with that type.

In that chapter we then go on to create a ‘YesNo’ type class which defines Boolean semantics for all different types.

We start by defining the type class like so:

class YesNo a where
  yesno :: a -> Bool

Any type which extends that type class can call this ‘yesno’ function and find out the truthyness of its value.

e.g.

instance YesNo Int where
  yesno 0 = False
  yesno _ = True

If we call that:

> yesno (0 :: Int)
False
 
> yesno (1 :: Int)
True

We get the expected result, but if we try to call it for a type which hasn’t defined an instance of ‘YesNo':

> yesno "mark"
 
    No instance for (YesNo [Char])
      arising from a use of `yesno'
    Possible fix: add an instance declaration for (YesNo [Char])
    In the expression: yesno "mark"
    In an equation for `it': it = yesno "mark"

In Scala we can use traits and implicits to achieve the same effect. First we define the ‘YesNo’ trait:

trait YesNo[A] {
  def yesno(value:A) : Boolean
}

Then we define an implicit value in a companion object which creates an instance of the ‘YesNo’ type class for Ints:

object YesNo {
  implicit val intYesNo = new YesNo[Int] { 
    def yesno(value:Int) = 
      value match { case 0 => false;  case _ => true } }
}

We then need to call our ‘yesno’ function and the idea is that if we’ve defined a type class instance for the type we call it with it will return us a boolean value and if not then we’ll get a compilation error.

object YesNoWriter {
  def write[A](value:A)(implicit  conv: YesNo[A]) : Boolean = {
    conv.yesno(value)
  }
}

If we call that:

> YesNoWriter.write(1)
res1: Boolean = true
 
> YesNoWriter.write(0)
res2: Boolean = false

It works as expected, but if we try to call it with a type which wasn’t defined for the ‘YesNo’ type class we run into trouble:

> YesNoWriter.write("mark")
:10: error: could not find implicit value for parameter conv: YesNo[java.lang.String]
       YesNoWriter.write("mark")

We can also define YesNoWriter like this by making use of context bounds:

object YesNoWriter {
  def write[A:YesNo](value:A) : Boolean = {
    implicitly[YesNo[A]].yesno(value)
  }
}

I think this pattern is preferred when we might just be tunnelling the implicit parameter through to another method but we can still use it here and use the ‘implicitly’ method to get access to the implicit value.

I’m still not entirely sure about the use of implicits but in this case they provide another way to implement polymorphism without having to use inheritance.

Dan Rosen goes into much more detail about type classes and implicits and I wrote about how we were using them on a project I worked on last year in an earlier blog post if you want to learn more.

Written by Mark Needham

May 22nd, 2012 at 10:26 am

Posted in Haskell,Scala

Tagged with ,

Haskell: My first attempt with QuickCheck and HUnit

without comments

As I mentioned in a blog post a few days I’ve started learning QuickCheck with the test-framework package as suggested by David Turner.

I first needed to install test-framework and some dependencies using cabal:

> cabal install test-framework
> cabal install test-framework-quickcheck
> cabal install test-framework-hunit

I thought it’d be interesting to try and write some tests around the windowed function that I wrote a few months ago:

Windowed.hs

module Windowed (windowed) where
 
windowed :: Int -> [a] -> [[a]]
windowed size [] = []
windowed size ls@(x:xs) = 
  if length ls >= size then (take size ls) : windowed size xs 
  else windowed size x

I wrote my first property like so:

import Windowed
 
import Test.Framework (defaultMain, testGroup)
import Test.Framework.Providers.QuickCheck (testProperty)
import Test.QuickCheck
 
main = defaultMain tests
 
tests = [
  testGroup "windowed" [
    testProperty "should not window if n is bigger than list size" prop_nothing_if_n_too_large
  ]
] 
 
prop_nothing_if_n_too_large n xs = n > length xs ==>  windowed n xs == []

And tried to compile the file:

ghc -package test-framework -package test-framework-quickcheck -threaded WindowedQC.hs -o Windowed

Which resulted in this compilation error:

WindowedQC.hs:16:17:
    No instance for (QuickCheck-1.2.0.1:Test.QuickCheck.Testable
                       (Gen Prop))
      arising from a use of `testProperty'
    Possible fix:
      add an instance declaration for
      (QuickCheck-1.2.0.1:Test.QuickCheck.Testable (Gen Prop))

According to the test-framework home page this error happens when we have more than one version of QuickCheck installed so we need to tell ‘ghc’ which one to use:

ghc -package test-framework -package test-framework-quickcheck -package QuickCheck-1.2.0.1 -threaded WindowedQC.hs -o Windowed

That solved the first compilation problem but I still had another:

WindowedQC.hs:16:17:
    Ambiguous type variable `a0' in the constraints:
      (Arbitrary a0) arising from a use of `testProperty'
                     at WindowedQC.hs:12:17-28
      (Show a0) arising from a use of `testProperty'
                at WindowedQC.hs:12:17-28
      (Eq a0) arising from a use of `prop_nothing_if_n_too_large'
              at WindowedQC.hs:12:80-106
    Probable fix: add a type signature that fixes these type variable(s)

The way to solve this problem is to explicitly state which types will be passed to the property like so:

prop_nothing_if_n_too_large n xs = n > length xs ==>  windowed n xs == []
    where types = (xs :: [Int], n :: Int)

I eventually ended up with the following:

WindowedQC.hs

import Windowed
 
import Test.Framework (defaultMain, testGroup)
import Test.Framework.Providers.QuickCheck (testProperty)
import Test.QuickCheck
 
import Test.Framework.Providers.HUnit
import Test.HUnit
 
main = defaultMain tests
 
tests = [
  testGroup "windowed" [
    testProperty "should not window if n is bigger than list size" prop_nothing_if_n_too_large,
    testProperty "should create sub arrays the size of window" prop_size_of_sub_arrays_is_n,
    testProperty "should group adjacent items into windows" prop_groups_adjacent_items,                
    testCase "should window a simple list" test_should_window_simple_list
  ]
] 
 
prop_groups_adjacent_items n xs = n < length xs ==>  
                                  not (null xs) ==>
                                  n > 0 ==> (last $  last $ windowed n xs) == last xs
  where types = (xs :: [Int], n :: Int)			
 
prop_nothing_if_n_too_large n xs = n > length xs ==>  windowed n xs == []
  where types = (xs :: [Int], n :: Int)
 
prop_size_of_sub_arrays_is_n n xs =  n > 0 ==> all (\subArray -> length subArray == n)  (windowed n xs)  
  where types = (xs :: [Int], n :: Int)
 
test_should_window_simple_list = 
  windowed 2 [1..10] @?= [[1,2], [2,3], [3,4], [4,5], [5,6], [6,7], [7,8], [8,9], [9,10]]

I initially only wrote QuickCheck properties but I wasn’t satisfied that just passing the properties I’ve written would guarantee the function is working as expected.

The first property does a simple check that the last item from the windowed function is the same as the last item in the list assuming that the list isn’t empty, that we’re using a positive n value and our window size is smaller than the list size.

The second checks that we end up with an empty list if we try to window to a size bigger than the list and the third checks that each of the sub arrays is the correct size.

I wasn’t sure how to write a QuickCheck property that would test the values of the individual sub arrays so I ended up writing a HUnit test instead.

If you know how I could achieve the same thing using QuickCheck please let me know.

We can then run all the tests like so:

> ./Windowed                                                                                                                    
windowed:
 
  should not window if n is bigger than list size: [OK, passed 100 tests]
 
  should create sub arrays the size of window: [OK, passed 100 tests]
 
  should group adjacent items into windows: [OK, passed 100 tests]
 
  should window a simple list: [OK]
 
          Properties  Test Cases  Total      
  Passed  3           1           4          
  Failed  0           0           0

Written by Mark Needham

May 20th, 2012 at 7:09 pm

Posted in Haskell

Tagged with