Mark Needham

Thoughts on Software Development

Archive for the ‘Clojure’ Category

Clojure: Not so lazy sequences a.k.a chunking behaviour

with 3 comments

I’ve been playing with Clojure over the weekend and got caught out by the behaviour of lazy sequences due to chunking – something which was obvious to experienced Clojurians, though not to me.

I had something similar to the following bit of code which I expected to only evaluate the first item of the infinite sequence that the range function generates:

> (take 1 (map (fn [x] (println (str "printing..." x))) (range)))
(printing...0
printing...1
printing...2
printing...3
printing...4
printing...5
printing...6
printing...7
printing...8
printing...9
printing...10
printing...11
printing...12
printing...13
printing...14
printing...15
printing...16
printing...17
printing...18
printing...19
printing...20
printing...21
printing...22
printing...23
printing...24
printing...25
printing...26
printing...27
printing...28
printing...29
printing...30
printing...31
nil)

This was annoying because I wanted to shortcut the lazy sequence using take-while, much like the poster of this StackOverflow question.

As I understand it, when we have a lazy sequence the granularity of that laziness is 32 items at a time, a.k.a. one chunk, something that Michael Fogus wrote about 4 years ago. This was a bit surprising to me, but it sounds like it makes sense for the majority of cases.

However, if we want to work around that behaviour we can wrap the lazy sequence in the following unchunk function provided by Stuart Sierra:

(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

Now if we repeat our initial code we’ll see it only prints once:

> (take 1 (map (fn [x] (println (str "printing..." x))) (unchunk (range))))
(printing...0
nil)
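
To see the difference without scrolling through printed output, here is a minimal sketch that counts how many elements map realises when we only ask for one. realized-count is a hypothetical helper name of mine, and I use a finite (range 100) on the assumption that it is chunked in the Clojure version you are running:

```clojure
(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

;; Hypothetical helper: counts calls to the mapping function
;; when only the first element of the result is requested.
(defn realized-count [coll]
  (let [counter (atom 0)]
    ;; doall forces the single element that take asks for
    (doall (take 1 (map (fn [x] (swap! counter inc) x) coll)))
    @counter))

(realized-count (range 100))            ; more than 1 when the seq is chunked
(realized-count (unchunk (range 100)))  ; => 1
```

The counter avoids relying on println, which makes the behaviour easy to check in a test.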

Written by Mark Needham

April 6th, 2014 at 10:07 pm

Posted in Clojure


Clojure: Writing JSON to a file – “Exception Don’t know how to write JSON of class org.joda.time.DateTime”

with 2 comments

As I mentioned in an earlier post I’ve been transforming Clojure hashes into JSON strings using data.json but ran into trouble while trying to serialise a hash which contained a Joda Time DateTime instance.

The date in question was constructed like this:

(ns json-date-example
  (:require [clj-time.format :as f]
            [clojure.data.json :as json]))
 
(defn as-date [date-field]
  (f/parse (f/formatter "dd MMM YYYY") date-field))
 
(def my-date 
  (as-date "18 Mar 2012"))

And when I tried to convert a hash containing that object into a string I got the following exception:

> (json/write-str {:date my-date})
 
java.lang.Exception: Don't know how to write JSON of class org.joda.time.DateTime
 at clojure.data.json$write_generic.invoke (json.clj:367)
    clojure.data.json$eval2818$fn__2819$G__2809__2826.invoke (json.clj:284)
    clojure.data.json$write_object.invoke (json.clj:333)
    clojure.data.json$eval2818$fn__2819$G__2809__2826.invoke (json.clj:284)
    clojure.data.json$write.doInvoke (json.clj:450)
    clojure.lang.RestFn.invoke (RestFn.java:425)

Luckily it’s quite easy to get around this by passing a function to write-str that converts the DateTime into a string representation before writing that part of the hash to a string.

The function looks like this:

(defn as-date-string [date]
  (f/unparse (f/formatter "dd MMM YYYY") date))
 
(defn date-aware-value-writer [key value] 
  (if (= key :date) (as-date-string value) value))

And we make use of the writer like so:

> (json/write-str {:date my-date} :value-fn date-aware-value-writer)
"{\"date\":\"18 Mar 2012\"}"

If we want to read that string back again and reify our date we create a reader function which converts a string into a DateTime. The as-date function from the beginning of this post does exactly what we want so we’ll use that:

(defn date-aware-value-reader [key value] 
  (if (= key :date) (as-date value) value))

We can then pass the reader as an argument to read-str:

> (json/read-str "{\"date\":\"18 Mar 2012\"}" :value-fn date-aware-value-reader :key-fn keyword)
{:date #<DateTime 2012-03-18T00:00:00.000Z>}
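
The writer above decides what to do based on the key. A variation, sketched here under the assumption that clj-time and data.json are on the classpath, is to check the type of the value instead, so any DateTime is converted whichever key it sits under. The names json-date-by-type, date-format and joda-aware-value-writer are mine, not part of either library:

```clojure
(ns json-date-by-type
  (:require [clj-time.format :as f]
            [clojure.data.json :as json]))

;; Same pattern as the post uses for parsing and unparsing.
(def date-format (f/formatter "dd MMM YYYY"))

;; Hypothetical writer: ignores the key and converts by value type.
(defn joda-aware-value-writer [_key value]
  (if (instance? org.joda.time.DateTime value)
    (f/unparse date-format value)
    value))

;; Should produce the same string as the key-based writer.
(def written
  (json/write-str {:date (f/parse date-format "18 Mar 2012")}
                  :value-fn joda-aware-value-writer))
```

The trade-off is that the reader side still needs to know which keys hold dates, since a JSON string carries no type information.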

Written by Mark Needham

September 26th, 2013 at 7:11 pm

Posted in Clojure


Clojure: Writing JSON to a file/reading JSON from a file

with 9 comments

A few weeks ago I described how I’d scraped football matches using Clojure’s Enlive, and the next step after translating the HTML representation into a Clojure map was to save it as a JSON document.

I decided to follow a two step process to achieve this:

  • Convert hash to JSON string
  • Write JSON string to file

I imagine there’s probably a way to convert the hash to a stream and pipe that into a file but my JSON document isn’t very large so I think this way is ok for now.

data.json seems to be the way to go to convert a hash to a JSON string and I had the following code:

> (require '[clojure.data.json :as json])
nil
 
> (json/write-str { :key1 "val1" :key2 "val2" })
"{\"key2\":\"val2\",\"key1\":\"val1\"}"

The next step was to write that into a file and this StackOverflow post describes a couple of ways that we can do this:

> (use 'clojure.java.io)
> (with-open [wrtr (writer "/tmp/test.json")]
    (.write wrtr (json/write-str {:key1 "val1" :key2 "val2"})))

or

> (spit "/tmp/test.json" (json/write-str {:key1 "val1" :key2 "val2"}))
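
For the streaming approach I wondered about earlier, data.json also has write and read functions that work directly on a writer or reader. This is a minimal sketch, assuming data.json is on the classpath and reusing the same /tmp/test.json path:

```clojure
(require '[clojure.data.json :as json]
         '[clojure.java.io :as io])

;; json/write streams the JSON straight to the writer,
;; so no intermediate string is built.
(with-open [wrtr (io/writer "/tmp/test.json")]
  (json/write {:key1 "val1" :key2 "val2"} wrtr))

;; json/read is the streaming counterpart for reading.
(def round-tripped
  (with-open [rdr (io/reader "/tmp/test.json")]
    (json/read rdr :key-fn keyword)))
```

For a small document the two step version is just as good; streaming only starts to matter when the JSON is large.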

Now I wanted to read the file back into a hash and I started with the following:

> (json/read-str (slurp "/tmp/test.json"))
{"key2" "val2", "key1" "val1"}

That’s not bad but I wanted the keys to be what I know as symbols (e.g. ‘:key1’) from Ruby land. I re-learnt that this is called a keyword in Clojure.

Since I’m not very good at reading the documentation I wrote a function to convert all the keys in a map from strings to keywords:

> (defn string-keys-to-symbols [map]
    (reduce #(assoc %1 (-> (key %2) keyword) (val %2)) {} map))
 
> (string-keys-to-symbols (json/read-str (slurp "/tmp/test.json")))
{:key1 "val1", :key2 "val2"}

What I should have done is pass the keyword function as an argument to read-str instead:

> (json/read-str (slurp "/tmp/test.json") :key-fn keyword)
{:key2 "val2", :key1 "val1"}

Simple!
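
For maps that don’t come from read-str, a hand-rolled function isn’t needed either. As a small sketch, clojure.walk (which ships with Clojure) has keywordize-keys, and it also handles nested maps, which my reduce version does not:

```clojure
(require '[clojure.walk :as walk])

;; Recursively converts string keys to keywords.
(walk/keywordize-keys {"key1" "val1" "key2" {"nested" "val2"}})
;; => {:key1 "val1", :key2 {:nested "val2"}}
```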

Written by Mark Needham

September 26th, 2013 at 7:47 am

Posted in Clojure


Clojure: Anonymous functions using short notation and the ‘ArityException Wrong number of args (0) passed to: PersistentVector’

with one comment

In the time I’ve spent playing around with Clojure one thing I’ve always got confused by is the error message you get when trying to return a vector using the anonymous function shorthand.

For example, if we want a function which creates a vector with the values 1, 2, and the argument passed into the function we could write the following:

> ((fn [x] [1 2 x]) 6)
[1 2 6]

However, when I tried to convert it to the shorthand ‘#()’ syntax I got the following exception:

> (#([1 2 %]) 6)
clojure.lang.ArityException: Wrong number of args (0) passed to: PersistentVector
                                      AFn.java:437 clojure.lang.AFn.throwArity
                                       AFn.java:35 clojure.lang.AFn.invoke
                                  NO_SOURCE_FILE:1 user/eval575[fn]
                                  NO_SOURCE_FILE:1 user/eval575

On previous occasions I’ve just stopped there and gone back to the long hand notation but this time I wanted to figure out why it didn’t work as I expected.

I came across this StackOverflow post which explained the way the shorthand gets expanded:

#() becomes (fn [arg1 arg2] (...))

which means that:

(#([1 2 %]) 6) becomes ((fn [arg] ([1 2 arg])) 6)

We are evaluating the vector [1 2 arg] as a function but aren’t passing any arguments to it. One way it can be used as a function is if we want to return a value at a specific index e.g.

> ([1 2 6] 2)
6

We don’t want to evaluate a vector as a function, rather we want to return the vector using the shorthand syntax. To do that we need to find a function which will return the argument passed to it and then pass the vector to that function.

The identity function is one such function:

> (#(identity [1 2 %]) 6)
[1 2 6]

Or if we want to be more concise the thread-first (->) works too:

> (#(-> [1 2 %]) 6)
[1 2 6]
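
We can check the expansion ourselves by asking the reader what it makes of the shorthand. This is a small sketch; expanded is just a name I picked, and the generated argument name will differ from run to run:

```clojure
;; The reader turns #(...) into (fn* [args] (...)),
;; so the vector ends up in call position.
(def expanded (read-string "#([1 2 %])"))

(first expanded)                     ; => fn*
(vector? (first (nth expanded 2)))   ; => true, the body is a call of the vector

;; A third option: call the vector function explicitly.
(#(vector 1 2 %) 6)                  ; => [1 2 6]
```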

Written by Mark Needham

September 23rd, 2013 at 9:42 pm

Posted in Clojure


Clojure/Emacs/nrepl: Stacktrace-less error messages

without comments

Ever since I started using the Emacs + nrepl combination to play around with Clojure I’ve been getting fairly nondescript error messages whenever I pass the wrong parameters to a function.

For example, if I try to update a non-existent key in a map I get a NullPointerException:

> (update-in {} [:mark] inc)
NullPointerException   clojure.lang.Numbers.ops (Numbers.java:942)

In this case it’s clear that the hash doesn’t have a key ‘:mark’ so the function blows up. However, sometimes the functions are more complicated and this type of reduced stack trace isn’t very helpful for working out where the problem lies.

I eventually came across a thread in the nrepl-el forum where Tim King suggested that adding the following lines to the Emacs configuration file should sort things out:

~/.emacs.d/init.el

(setq nrepl-popup-stacktraces nil)
(setq nrepl-popup-stacktraces-in-repl t)

I added those two lines, restarted Emacs and after calling the function again got a much more detailed stack trace:

> (update-in {} [:mark] inc)
 
java.lang.NullPointerException: 
                 Numbers.java:942 clojure.lang.Numbers.ops
                 Numbers.java:110 clojure.lang.Numbers.inc
                     core.clj:863 clojure.core/inc
                     AFn.java:161 clojure.lang.AFn.applyToHelper
                     AFn.java:151 clojure.lang.AFn.applyTo
                     core.clj:603 clojure.core/apply
                    core.clj:5472 clojure.core/update-in
                  RestFn.java:445 clojure.lang.RestFn.invoke
                 NO_SOURCE_FILE:1 user/eval9
...

From reading this stack trace we learn that the problem happens when the inc function is called with a parameter of ‘nil’. We’d see the same thing if we called it directly:

> (inc nil)
 
java.lang.NullPointerException: 
                                  Numbers.java:942 clojure.lang.Numbers.ops
                                  Numbers.java:110 clojure.lang.Numbers.inc
                                  NO_SOURCE_FILE:1 user/eval14
...
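
Once we know that the nil comes from the missing key, one way to avoid the exception is to wrap inc with fnil so that a default is used when the key isn’t there. A minimal sketch, assuming 0 is a sensible starting value:

```clojure
;; fnil wraps inc so that a nil argument is replaced by 0 before inc runs.
(update-in {} [:mark] (fnil inc 0))          ; => {:mark 1}

;; Existing values are incremented as usual.
(update-in {:mark 41} [:mark] (fnil inc 0))  ; => {:mark 42}
```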

Although Clojure error messages do baffle me at times, I hope things will be better now that I’ll be able to see on which line the error occurred.

Written by Mark Needham

September 22nd, 2013 at 11:07 pm

Posted in Clojure


Clojure/Emacs/nrepl: Ctrl X + Ctrl E leads to ‘FileNotFoundException Could not locate […] on classpath’

without comments

I’ve been playing around with Clojure using Emacs and nrepl recently and my normal workflow is to write some code in Emacs and then have it evaluated in nrepl by typing Ctrl X + Ctrl E at the end of the function.

I tried this once recently and got the following exception instead of a successful evaluation:

FileNotFoundException Could not locate ranking_algorithms/ranking__init.class or ranking_algorithms/ranking.clj on classpath: clojure.lang.RT.load (RT.java:432)

I was a bit surprised because I had nrepl running already (via (Meta + X) + Enter + nrepl-jack-in) and I’d only ever seen that exception refer to dependencies which weren’t in my project.clj file at the time I launched nrepl.

I eventually came across this StackOverflow post which suggested that you either launch nrepl using leiningen and then connect to it from Emacs or have your project.clj open when running (Meta + X) + Enter + nrepl-jack-in.

To launch nrepl from leiningen we’d run the following command from the terminal:

$ lein repl
nREPL server started on port 52265
REPL-y 0.1.0-beta10
Clojure 1.4.0
    Exit: Control+D or (exit) or (quit)
Commands: (user/help)
    Docs: (doc function-name-here)
          (find-doc "part-of-name-here")
  Source: (source function-name-here)
          (user/sourcery function-name-here)
 Javadoc: (javadoc java-object-or-class-here)
Examples from clojuredocs.org: [clojuredocs or cdoc]
          (user/clojuredocs name-here)
          (user/clojuredocs "ns-here" "name-here")

We can then connect to that nrepl server from Emacs by typing (Meta + X) + Enter + nrepl which seems to work quite nicely.

To check the nrepl-jack-in approach works when we’ve got project.clj open we need to first kill the existing server by typing (Meta + X) + Enter + nrepl-quit.

Now if we type (Meta + X) + Enter + nrepl-jack-in our functions are evaluated correctly and all is well with the world again.

Written by Mark Needham

September 22nd, 2013 at 9:23 pm

Posted in Clojure


Clojure: Stripping all the whitespace

with 8 comments

When putting together data sets to play around with, one of the more boring tasks is stripping out characters that you’re not interested in and more often than not those characters are white spaces.

Since I’ve been building data sets using Clojure I wanted to write a function that would do this for me.

I started out with the following string:

(def word " with a  little bit of space we can make it through the night  ")

which I wanted to format in such a way that there would be a maximum of one space between each word.

I started out by using the trim function but that only removes white space from the beginning and end of a string:

> (clojure.string/trim word)
"with a  little bit of space we can make it through the night"

I wanted to get rid of the space in between ‘a’ and ‘little’ as well so I wrote the following code to split on a space and filter out any excess spaces that still remained before joining the words back together:

> (clojure.string/join " " 
                       (filter #(not (clojure.string/blank? %)) 
                               (clojure.string/split word #" ")))
"with a little bit of space we can make it through the night"

I wanted to try and make it a bit easier to read by using the thread last (->>) macro but that didn’t work as well as I’d hoped because clojure.string/split doesn’t take the string in as its last parameter:

>  (->> (clojure.string/split word #" ") 
   (filter #(not (clojure.string/blank? %))) 
   (clojure.string/join " "))
"with a little bit of space we can make it through the night"

I worked around it by creating a specific function for splitting on a space:

(defn split-on-space [word] 
  (clojure.string/split word #"\s"))

which means we can now chain everything together nicely:

>  (->> word 
        split-on-space 
        (filter #(not (clojure.string/blank? %))) 
        (clojure.string/join " "))
"with a little bit of space we can make it through the night"

I couldn’t find a cleaner way to do this but I’m sure there is one and my googling just isn’t up to scratch so do let me know in the comments!
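
One candidate, sketched here rather than offered as the definitive answer, is to trim the string and then either replace each run of whitespace with a single space or split on runs of whitespace. squash-whitespace is a name I made up:

```clojure
(require '[clojure.string :as str])

(def word " with a  little bit of space we can make it through the night  ")

;; Trim the ends, then collapse each run of whitespace into one space.
(defn squash-whitespace [s]
  (-> s
      str/trim
      (str/replace #"\s+" " ")))

(squash-whitespace word)
;; => "with a little bit of space we can make it through the night"

;; Splitting on runs of whitespace removes the need for the blank? filter.
(str/join " " (str/split (str/trim word) #"\s+"))
;; => "with a little bit of space we can make it through the night"
```

Because the string is the first argument to both trim and replace, the thread-first macro works here without a helper function for the split.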

Written by Mark Needham

September 22nd, 2013 at 6:54 pm

Posted in Clojure


Clojure: Converting an array/set into a hash map

with 3 comments

When I was implementing the Elo Rating algorithm a few weeks ago one thing I needed to do was come up with a base ranking for each team.

I started out with a set of teams that looked like this:

(def teams #{ "Man Utd" "Man City" "Arsenal" "Chelsea"})

and I wanted to transform that into a map from the team to their ranking e.g.

Man Utd -> {:points 1200}
Man City -> {:points 1200}
Arsenal -> {:points 1200}
Chelsea -> {:points 1200}

I had read the documentation of array-map, a function which can be used to transform a collection of pairs into a map, and it seemed like it might do the trick.

I started out by building an array of pairs using mapcat:

> (mapcat (fn [x] [x {:points 1200}]) teams)
("Chelsea" {:points 1200} "Man City" {:points 1200} "Arsenal" {:points 1200} "Man Utd" {:points 1200})

array-map constructs a map from pairs of values e.g.

> (array-map "Chelsea" {:points 1200} "Man City" {:points 1200} "Arsenal" {:points 1200} "Man Utd" {:points 1200})
{"Chelsea" {:points 1200}, "Man City" {:points 1200}, "Arsenal" {:points 1200}, "Man Utd" {:points 1200}}

Since we have a collection of pairs rather than individual pairs we need to use the apply function as well:

> (apply array-map ["Chelsea" {:points 1200} "Man City" {:points 1200} "Arsenal" {:points 1200} "Man Utd" {:points 1200}])
{"Chelsea" {:points 1200}, "Man City" {:points 1200}, "Arsenal" {:points 1200}, "Man Utd" {:points 1200}}

And if we put it all together we end up with the following:

> (apply array-map (mapcat (fn [x] [x {:points 1200}]) teams))
{"Chelsea" {:points 1200}, "Man City" {:points 1200}, "Arsenal" {:points 1200}, "Man Utd" {:points 1200}}

It works but the function we pass to mapcat feels a bit clunky. Since we just need to create a collection of team/ranking pairs we can use the vector and repeat functions to build that up instead:

> (mapcat vector teams (repeat {:points 1200}))
("Chelsea" {:points 1200} "Man City" {:points 1200} "Arsenal" {:points 1200} "Man Utd" {:points 1200})

And if we put the apply array-map code back in we still get the desired result:

> (apply array-map (mapcat vector teams (repeat {:points 1200})))
{"Chelsea" {:points 1200}, "Man City" {:points 1200}, "Arsenal" {:points 1200}, "Man Utd" {:points 1200}}

Alternatively we could use assoc like this:

> (apply assoc {} (mapcat vector teams (repeat {:points 1200})))
{"Man Utd" {:points 1200}, "Arsenal" {:points 1200}, "Man City" {:points 1200}, "Chelsea" {:points 1200}}

I also came across the into function which seemed useful but took in a collection of vectors:

> (into {} [["Chelsea" {:points 1200}] ["Man City" {:points 1200}] ["Arsenal" {:points 1200}] ["Man Utd" {:points 1200}] ])

We therefore need to change the code to use map instead of mapcat:

> (into {} (map vector teams (repeat {:points 1200})))
{"Chelsea" {:points 1200}, "Man City" {:points 1200}, "Arsenal" {:points 1200}, "Man Utd" {:points 1200}}

However, my favourite version so far uses the zipmap function like so:

> (zipmap teams (repeat {:points 1200}))
{"Man Utd" {:points 1200}, "Arsenal" {:points 1200}, "Man City" {:points 1200}, "Chelsea" {:points 1200}}

I’m sure there are other ways to do this as well so if you know any let me know in the comments.
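
Two more that come to mind, as a quick sketch: a for comprehension feeding into, and a plain reduce with assoc. The names via-for and via-reduce are only there so the results can be compared:

```clojure
(def teams #{"Man Utd" "Man City" "Arsenal" "Chelsea"})

;; Build [team ranking] pairs with for and pour them into a map.
(def via-for
  (into {} (for [team teams] [team {:points 1200}])))

;; Add one entry per team to an initially empty map.
(def via-reduce
  (reduce (fn [acc team] (assoc acc team {:points 1200})) {} teams))

(= via-for via-reduce (zipmap teams (repeat {:points 1200})))
;; => true
```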

Written by Mark Needham

September 20th, 2013 at 9:13 pm

Posted in Clojure


Clojure: Converting a string to a date

without comments

I wanted to do some date manipulation in Clojure recently and figured that since clj-time is a wrapper around Joda Time it’d probably do the trick.

The first thing we need to do is add the dependency to our project file and then run lein deps to pull down the appropriate JARs. The project file should look something like this:

project.clj

(defproject ranking-algorithms "0.1.0-SNAPSHOT"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.4.0"]
                 [clj-time "0.6.0"]])

Now let’s load the clj-time.format namespace into the REPL since we know we’ll be parsing dates:

> (require '(clj-time [format :as f]))

The string that I want to convert into a date looks like this:

(def string-date "18 September 2012")

The first thing we should do is check whether there is an existing formatter that we can use by evaluating the following function:

> (f/show-formatters)
...
:hour-minute                            06:45
:hour-minute-second                     06:45:22
:hour-minute-second-fraction            06:45:22.473
:hour-minute-second-ms                  06:45:22.473
:mysql                                  2013-09-20 06:45:22
:ordinal-date                           2013-263
:ordinal-date-time                      2013-263T06:45:22.473Z
:ordinal-date-time-no-ms                2013-263T06:45:22Z
:rfc822                                 Fri, 20 Sep 2013 06:45:22 +0000
...

There are a lot of different built in formatters but unfortunately I couldn’t find one that exactly matched our date format so we’ll have to write our own one.

For that we’ll need to refresh our knowledge of Java date formatting patterns, where ‘dd’ is the day of the month, ‘MMM’ the month name and ‘YYYY’ the year.

We end up with the following formatter:

> (f/parse (f/formatter "dd MMM YYYY") string-date)
#<DateTime 2012-09-18T00:00:00.000Z>

It took me much longer than it should have to remember that ‘MMM’ is the pattern to match a short form of a month. Otherwise it’s just the same as what we’d have to do in Java, with some neat wrapper functions on top.

Written by Mark Needham

September 20th, 2013 at 7:00 am

Posted in Clojure


Clojure: See every step of a reduce

without comments

Last year I wrote about a Haskell function called scanl which returned the intermediate steps of a fold over a collection and last week I realised that I needed a similar function in Clojure to analyse a reduce I’d written.

A simple reduce which adds together the numbers 1-10 would look like this:

> (reduce + 0 (range 1 11))
55

If we want to see the intermediate values as well as the final one, then instead of using reduce there’s a function called reductions which gives us exactly what we want:

> (reductions + 0 (range 1 11))
(0 1 3 6 10 15 21 28 36 45 55)
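
To make the relationship between the two concrete, here is a rough sketch of getting the same values out of a plain reduce by carrying a vector of every accumulator seen so far. running-totals is a hypothetical name, and unlike reductions it is eager rather than lazy:

```clojure
;; Keep every intermediate accumulator in a vector;
;; peek returns the most recent one.
(defn running-totals [f init coll]
  (reduce (fn [acc x] (conj acc (f (peek acc) x)))
          [init]
          coll))

(running-totals + 0 (range 1 11))
;; => [0 1 3 6 10 15 21 28 36 45 55]

(= (running-totals + 0 (range 1 11))
   (reductions + 0 (range 1 11)))
;; => true
```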

I found this function especially useful when analysing my implementation of the Glicko ranking algorithm to work out whether a team’s ranking was being updated correctly after a round of matches.

I initially thought the reductions function was only useful as a debugging tool and that you’d always end up changing your code back to use reduce after you’d solved the problem but I realise I was mistaken.

As part of my implementation of the Glicko algorithm I wrote a bit of code that applied a reduce across a collection of football seasons and initially just returned the final ranking of each team:

(def initial-team-rankings { "Man Utd" {:points 1200} "Man City" {:points 1300}})
 
(defn update-team-rankings [teams year]
  (reduce (fn [ts [team _]] (update-in ts [team :points] inc)) teams teams))
 
> (reduce update-team-rankings initial-team-rankings (range 2004 2013))
{"Man City" {:points 1309}, "Man Utd" {:points 1209}}

I realised it would actually be quite interesting to see the rankings after each season, for which reductions comes in quite handy.

For example if we want to find the rankings after 3 seasons we could write the following code:

> (nth (reductions update-team-rankings initial-team-rankings (range 2004 2013)) 3)
{"Man City" {:points 1303}, "Man Utd" {:points 1203}}

Or we could join the result back onto our collection of years and create a map so we can look up the year more easily:

(def final-rankings
  (zipmap (range 2003 2013) (reductions update-team-rankings initial-team-rankings (range 2004 2013))))
 
> (get final-rankings 2006)
{"Man City" {:points 1303}, "Man Utd" {:points 1203}}

Written by Mark Needham

September 19th, 2013 at 11:57 pm

Posted in Clojure
