6 Apr 2014

Clojure: Not so lazy sequences a.k.a chunking behaviour

I’ve been playing with Clojure over the weekend and got caught out by the behaviour of lazy sequences due to chunking - something which was obvious to experienced Clojurians although not me.

I had something similar to the following bit of code which I expected to only evaluate the first item of the infinite sequence that the range function generates:

> (take 1 (map (fn [x] (println (str "printing..." x))) (range)))
(printing...0
printing...1
printing...2
printing...3
printing...4
printing...5
printing...6
printing...7
printing...8
printing...9
printing...10
printing...11
printing...12
printing...13
printing...14
printing...15
printing...16
printing...17
printing...18
printing...19
printing...20
printing...21
printing...22
printing...23
printing...24
printing...25
printing...26
printing...27
printing...28
printing...29
printing...30
printing...31
nil)

The reason this was annoying is because I wanted to shortcut the lazy sequence using http://clojuredocs.org/clojure_core/clojure.core/take-while, much like the poster of this StackOverflow question.

As I understand it when we have a lazy sequence the granularity of that laziness is 32 items at a time a.k.a one chunk, something that Michael Fogus wrote about 4 years ago. This was a bit surprising to me but it sounds like it makes sense for the majority of cases.

However, if we want to work around that behaviour we can wrap the lazy sequence in the following unchunk function provided by Stuart Sierra:

(defn unchunk [s]
  (when (seq s)
    (lazy-seq
      (cons (first s)
            (unchunk (next s))))))

Now if we repeat our initial code we’ll see it only prints once:

> (take 1 (map (fn [x] (println (str "printing..." x))) (unchunk (range))))
(printing...0
nil)

About the author

I'm currently working on short form content at ClickHouse. I publish short 5 minute videos showing how to solve data problems on YouTube @LearnDataWithMark. I previously worked on graph analytics at Neo4j, where I also co-authored the O'Reilly Graph Algorithms Book with Amy Hodler.