Mark Needham

Thoughts on Software Development

Clojure: Getting caught out by lazy collections

with one comment

Most of the work that I’ve done with Clojure has involved running a bunch of functions directly in the REPL or through Leiningen’s run target which led to me getting caught out when I created a JAR and tried to run that.

As I mentioned a few weeks ago I’ve been rewriting part of our system in Clojure to see how the design would differ and a couple of levels down the Clojure version comprises of applying a map function over a collection of documents.

The code in question originally looked like this:

(ns aim.main (:gen-class))
 
(defn import-zip-file [zipFile working-dir]
  (let [xml-files (filter xml-file? (unzip zipFile working-dir))]
    (map import-document xml-files)))
 
(defn -main [& args]
  (import-zip-file "our/file.zip", "/tmp/unzip/to/here"))

Which led to absolutely nothing happening when run like this!

$ lein uberjar && java -jar my-project-0.1.0-standalone.jar

I originally assumed that I had something wrong in the code but my colleague Uday reminded me that collections in Clojure are lazily evaluated and there was nothing in the code that would force the evaluation of ours.

In this situation we had to wrap the map with a doall in order to force evaluation of the collection:

(ns aim.main (:gen-class))
 
(defn import-zip-file [zip-file working-dir]
  (let [xml-files (filter xml-file? (unzip zip-file working-dir))]
    (doall (map import-document xml-files))))
 
(defn -main [& args]
  (import-zip-file "our/file.zip", "/tmp/unzip/to/here"))

When we run the code in the REPL or through ‘lein run’ the code is being eagerly evaluated as far as I understand it which is why we see a different behaviour than when we run it on its own.

I also got caught out on another occasion where I tried to pass around a collection of input streams which I’d retrieved from a zip file only to realise that when the code which used the input stream got evaluated the ZIP file was no longer around!

Be Sociable, Share!

Written by Mark Needham

July 31st, 2011 at 9:40 pm

Posted in Clojure

Tagged with

  • Sergey Didenko

    IMHO dorun would be more idiomatic since you don’t need the head of the seq.

    Also there is a really good convention to put the bang after the name of mutating functions. “(defn import-zip-file! …” and “import-document!” if it is mutating.