Monday, November 25, 2013

Thoughts on the Clojure threading macros

Most programmers who have dabbled in Clojure are probably familiar with the threading macros -> and ->>. For readers who aren't, here's what they do.

The threading macros let you write nested function calls in the order in which they are applied, rather than inside out. For instance, (f (g (h x))) can be written as (-> x h g f) or (->> x h g f). Here, the first argument of the macro is evaluated as usual. The other arguments are assumed to be functions.

It's also possible to use -> and ->> for functions that expect multiple arguments. As such, the code (-> a (f b) g) is translated into (g (f a b)) and (->> a (f b) g) is rewritten as (g (f b a)). That is to say, every time the pattern (function arg1 arg2) is encountered, the result of the preceding expression is inserted as the first (->) or last argument (->>) of the function call.
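To check the rewriting rules for yourself, you can ask Clojure to expand the macros. The following is a small sketch: a, b, f and g are placeholder symbols that are never evaluated, and I use clojure.walk/macroexpand-all so that the result does not depend on how many expansion steps a particular Clojure version needs.

```clojure
(require '[clojure.walk :refer [macroexpand-all]])

;; Quoted forms are expanded, not evaluated, so a, b, f and g need not exist.
(macroexpand-all '(-> a (f b) g))
; (g (f a b))
(macroexpand-all '(->> a (f b) g))
; (g (f b a))

;; With real functions the difference shows up in the result.
(-> 10 (- 3) str)
; "7"   i.e. (str (- 10 3))
(->> 10 (- 3) str)
; "-7"  i.e. (str (- 3 10))
```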

The macro ->> is perhaps the more popular threading macro. It is very useful in combination with map, reduce, or filter—functions that take a function as their first argument and a collection as their last argument. To wit, you might write the following code:

(->> [-1 0 1 2]
     (filter pos?)
     (map inc)
     (map str))

This expression first keeps only the positive elements of the vector [-1 0 1 2] (i.e. it drops -1 and 0), then increments the remaining elements, and finally converts them to strings. The result is the list ("2" "3").

The macro -> is useful for functions that perform operations on their first argument. For instance,

(-> { :a [1 2] }
    (get :a)
    (conj 3))

This evaluates to the vector [1 2 3].

Invoking object or protocol methods in Clojure follows this syntax: (method obj arg1 arg2 ...). If, in typical functional style, your methods return an updated 'obj', then the macro -> tends to be the more useful threading macro for composing method calls.
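Here is a minimal sketch of that style. The protocol Account, the record SimpleAccount and the methods deposit and withdraw are made up for this example. Each method returns an updated copy of the record, so the calls chain naturally with ->.

```clojure
;; Hypothetical protocol: every method takes the "object" first.
(defprotocol Account
  (deposit [acct amount])
  (withdraw [acct amount]))

;; Hypothetical record: each method returns an updated copy.
(defrecord SimpleAccount [balance]
  Account
  (deposit [acct amount] (update-in acct [:balance] + amount))
  (withdraw [acct amount] (update-in acct [:balance] - amount)))

(-> (->SimpleAccount 100)
    (deposit 50)
    (withdraw 30)
    :balance)
; 120
```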

Combining -> and ->> can be a bit of a pain. Nesting ->> within an -> expression works well:

(-> {:a [0 1 2 3]}
    (get :a)
    (->> (filter pos?)
         (map inc))
    reverse)
; (4 3 2)

This works because -> inserts the result of (get {:a [0 1 2 3]} :a) as the first argument of the call to ->>. Unfortunately, however, there's no obvious way to nest -> within ->>.
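Expanding both nestings shows where the asymmetry comes from. Again a sketch with placeholder symbols (a, b, f, g) that are only expanded, never evaluated:

```clojure
(require '[clojure.walk :refer [macroexpand-all]])

;; ->> inside ->: a becomes the first argument of ->>, i.e. its start value.
(macroexpand-all '(-> a (->> (f b) g)))
; (g (f b a))

;; -> inside ->>: a is appended to the inner form, giving (-> (f b) g a),
;; so (f b) becomes the start value and a ends up in function position.
(macroexpand-all '(->> a (-> (f b) g)))
; (a (g (f b)))
```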

Using only -> and ->> there doesn't seem to be a good general way of dealing with long chains of composed function calls that expect the 'object' in different positions.

Another annoyance is that -> and ->> don't allow for control flow in an obvious way. That is to say, you can't easily use if-statements within an -> expression.

To be fair, you could use lambdas like this

(-> ...
    ((fn [x] (if (pred x)
                 (foo x)
                 (bar x)))))

but this is a less than stellar solution if only because it introduces a lot of extra parentheses.

(Granted, using the special lambda form #() one pair of parentheses can be eliminated, but then the trade-off is that #()-expressions cannot be nested.)
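For comparison, here are both variants side by side with concrete stand-ins for the placeholders: pos? plays the role of pred, inc of foo and dec of bar. These choices are mine and purely illustrative.

```clojure
;; Explicit lambda: note the double opening parenthesis before fn.
(-> 5
    ((fn [x] (if (pos? x)
               (inc x)
               (dec x)))))
; 6

;; #() shorthand: one pair of parentheses fewer, but the extra
;; wrapping pair is still required so that -> calls the function.
(-> -5
    (#(if (pos? %) (inc %) (dec %))))
; -6
```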

Where the Clojure threading macros appear to fall short, the Swiss arrow macros purport to be a solution. However, there are rather many of them, and if you want other people to understand your code, I feel it's better not to use them.

Enter the macro as->. It's a built-in macro (as of Clojure 1.5) and it's pretty darn awesome. Here's the source code:

(defmacro as->
  "Binds name to expr, evaluates the first form in the lexical context
  of that binding, then binds name to that result, repeating for each
  successive form, returning the result of the last form."
  {:added "1.5"}
  [expr name & forms]
  `(let [~name ~expr
         ~@(interleave (repeat name) forms)]
     ~name))

That code is unusually short, don't you agree? Here's what it does. It transforms the code

(as-> [1 2 3 4] $
      (conj $ 5)
      (map dec $))

into

(let* [$ [1 2 3 4] 
       $ (conj $ 5) 
       $ (map dec $)] $)

and it returns (0 1 2 3 4). I think it's a real gem.
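Because the bound name can appear anywhere in each form, as-> handles chains in which the 'object' switches position. A small sketch, reusing the map from the earlier example:

```clojure
(as-> {:a [0 1 2 3]} $
      (get $ :a)        ; object first:    [0 1 2 3]
      (filter pos? $)   ; collection last: (1 2 3)
      (map inc $)       ; collection last: (2 3 4)
      (nth $ 1))        ; object first again
; 3
```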

The macro as-> allows you to introduce control flow and it works very well with the -> macro. To wit, you can write things like

(-> ...
    (as-> x (if (pred x)
                (foo x)
                (bar x)))
    ...)
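A runnable version of that pattern might look as follows. The concrete steps (sort, the count check, take) are just stand-ins I picked for the elided parts and for pred, foo and bar.

```clojure
(-> [3 1 2]
    sort
    (as-> xs (if (> (count xs) 2)
               (take 2 xs)   ; plays the role of (foo x)
               xs))          ; plays the role of (bar x)
    vec)
; [1 2]
```

Here -> passes the sorted sequence to as-> as its first argument, which as-> binds to xs.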

The nesting works the other way around as well: -> can be used inside as->. For instance, you can write

(as-> [1 2 3] x
      (map inc x)
      (-> x
          vec
          (conj 5)))
; [2 3 4 5]

Isn't that just swell?