ClojureScript Macros: A (Not So) Long Journey [Part II]

Writing Clojure/Script macros may look like a wizard-level craft reserved for senior programmers, but this article will show you that this is not the case. This is the second part of my journey to learn Clojure/Script macros, and it deals with Clojure macros.

Motivation

In Part I of this series, I described my issue, which is illustrated by this snippet:

(defn- add-bubble [appstate bubble]
  (update appstate :bubbles conj bubble))

(defn add-bubble! [bubble]
  (swap! appstate (fn [appstate_arg] (add-bubble appstate_arg bubble))))

I would like to write a macro, say BANG, which takes the add-bubble function as argument and generates the add-bubble! function, its side-effecting version.

Even though my goal is to write ClojureScript macros, I like to go one step at a time, so I thought it was a good idea to learn how to write Clojure macros first. ClojureScript originates from Clojure, and in my programming experience it's easier to build on solid basics than to jump eagerly to the last step. Also in my experience, you usually find more documentation and resources for the root language than for its derivative.

Macro: the origin

During my trial-and-error process of learning macros, I wondered: where does this idea of macros come from? As Clojure is known to be a dialect of Lisp, the path to its origin is clear.

Lisp was invented by John McCarthy and his team in the late 1950s at MIT. At that time, FORTRAN was the main programming language used on early IBM machines. When McCarthy discovered FORTRAN, he was fascinated by the idea of writing programs by "algebraic" means. But as his main topic was Artificial Intelligence, he was also convinced of the need for a different way to express a program to a computer: another programming language that would make it possible to handle symbolic expressions.

These thoughts led him to create the LISP language. If you're interested in the history of LISP, I recommend this article by Herbert Stoyan, where you'll get the historical context of McCarthy's work as well as the atmosphere at the time between the machine builders, the compiler builders and R&D interests (read: funding). From this article:

Already in 1956 it was clear that one had to work with symbolic expressions to reach the goal of artificial intelligence. As the researchers already understood numerical computation would not have much importance. McCarthy, in his aim to express simple, short description parts by short language elements, saw the composition of algebraic sub-expressions as an ideal way to reach this goal.

In this related article, also by Herbert Stoyan, you can find a list of 24 ideas from McCarthy that were new for programming languages at the time. I quote here the ideas which, in my humble opinion, are the essence of macros:

(4) extensibility of programs (incremental compiler) and changeability of programs,
(10) possibilities for manipulating symbolic quantities.

Rich Hickey, the author and maintainer of Clojure, has recently published an article titled "A History of Clojure". It explains his motivations for Clojure and some design choices he made in the language, and it naturally discusses the similarities and differences between LISP and Clojure.

On the topic of macros, Clojure macros are similar to Common Lisp macros. One difference discussed is how symbols are handled by the two languages. In Clojure, symbols are essentially simple names: their resolution to a value can be delayed. As long as you don't require the value behind a symbol, you don't know whether the symbol is bound to a value, or where its potential value comes from. During the talk "Clojure for Lisp Programmers", Rich Hickey said about Clojure macros:

macros are manipulating this name-world not this var-world

Finally, when I was looking for more documentation about macros, I discovered that there are two kinds of macros: regular macros and reader macros. Reader macros are special, low-level syntax in Clojure code, handled by the reader when it parses the source text. One example is the quote operator, '. It tells Clojure to skip the evaluation of the form which follows it.

For example, the expression 'foo produces the symbol foo: Clojure doesn't attempt to resolve the potential value behind the name foo. For me, reader macros constitute the foundation of the language itself, and in Clojure the user cannot define new reader macros, whereas it is possible to do so in Common Lisp.
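As a small illustration of the quote reader macro, the sketch below compares an evaluated symbol with its quoted form. The Var foo and its value are assumptions made up for this example.

```clojure
;; foo is a made-up Var, defined only for this illustration
(def foo 42)

foo          ;; the symbol is evaluated: its Var is looked up, giving 42
'foo         ;; the symbol is quoted: the result is the symbol foo itself
(quote foo)  ;; the form the reader produces when it reads 'foo

(= 'foo (quote foo))   ;; => true
(symbol? 'foo)         ;; => true
(read-string "'foo")   ;; => (quote foo)
```

The last line shows that the transformation happens in the reader, before any evaluation takes place.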

From this point on, the term macro will always refer to regular macros.

Macro: a "blur" definition

To handle a new concept, it's useful to look at its definition. So here is my definition of Clojure macros:

Clojure macros are code that generates code.

That's it. Thanks for coming.

I'm joking, please bear with me.

You'll find a more convincing definition from the official home page of Clojure:

Clojure has a programmatic macro system which allows the compiler to be extended by user code. Macros can be used to define syntactic constructs which would require primitives or built-in support in other languages. Many core constructs of Clojure are not, in fact, primitives, but are normal macros.

It's (almost) always simpler for me to look at a concrete example and then reason about it. Let's look at the when macro provided by the clojure.core namespace:

(source when)
;; => (defmacro when
;;      "Evaluates test. If logical true, evaluates body in an implicit do."
;;      {:added "1.0"}
;;      [test & body]
;;      (list 'if test (cons 'do body)))

At first sight, the definition of the when macro is very similar to the definition of a regular function, except for the use of defmacro instead of defn. Roughly, the macro definition skeleton is:

(defmacro <macro-name>
  <documentation-string>
  <meta-data>
  <argument-list>
  <body>)

For completeness, here is the code generated by when on a simple example:

(macroexpand '(when true 42))
;; => (if true (do 42))

The macroexpand function is especially useful to check the code generated by a given macro. From this first observation, some thoughts can emerge.

The first fundamental difference between macro code and regular code is that macro code is executed at compile time. This means that, before the Clojure compiler writes Java bytecode, macros have to be expanded to obtain the final Clojure code, which is what actually gets translated to Java bytecode.

The second main difference is that the arguments of a macro are not evaluated: the macro body receives them as plain data (symbols, lists, literals), exactly as they were written at the call site.

This point gets particular emphasis in the rest of this article.
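To make the "arguments are not evaluated" point concrete, here is a minimal sketch. The names show-arg and show-arg-fn are hypothetical helpers written for this illustration; they are not part of the application code.

```clojure
;; Hypothetical macro: it describes the form it receives.
;; Its body runs at macro-expansion time, where form is plain data.
(defmacro show-arg [form]
  (str (if (list? form) "list" "other") ": " (pr-str form)))

;; Hypothetical function doing the same job at runtime.
(defn show-arg-fn [x]
  (pr-str x))

(show-arg (+ 1 2))        ;; => "list: (+ 1 2)"  the macro saw the unevaluated list
(show-arg-fn (+ 1 2))     ;; => "3"              the function saw the computed value
(show-arg not-defined)    ;; => "other: not-defined"  the symbol is never resolved
```

The last call would fail with a regular function, since not-defined does not resolve to anything; the macro only ever sees the name.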

Macros are the tool for metaprogramming. Wikipedia defines metaprogramming as:

Metaprogramming is a programming technique in which computer programs have the ability to treat other programs as their data.

A Clojure macro takes any Clojure data as input and generates arbitrary Clojure code: full power.

The talk "Illuminated Macros" introduces Clojure macros as a hook into the compiler. Since macros are expanded at compile time, it makes sense to see them as extensions to the compiler: they allow you to define new syntax.
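As a sketch of what "defining new syntax" can look like, here is a hypothetical unless construct (clojure.core already offers when-not for this purpose, so the macro below is purely illustrative). A regular function could not do this job, because its arguments would be evaluated before the test is checked.

```clojure
;; Hypothetical macro: evaluates body only when test is logically false.
(defmacro unless [test & body]
  `(if ~test nil (do ~@body)))

(unless false :ran)   ;; => :ran
(unless true :ran)    ;; => nil

;; The body is never evaluated when the test is true,
;; so this call does not throw.
(unless true (throw (Exception. "never reached")))   ;; => nil

(macroexpand-1 '(unless false :ran))
;; => (if false nil (do :ran))
```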

Build the BANG macro

This article doesn't claim to teach you how to write Clojure macros. Many high-quality materials are available online. If you want to master the craft of creating macros and decipher all the notation involved, you should read one of these links:

With only one of these resources, you'll feel fluent writing Clojure macros by the end of the day. While learning, the main difficulty I experienced was understanding when to keep a symbol as is instead of taking its value in the macro body.

Back to my use case: I want to create a macro that generates the side-effecting version of a function which changes the state of my application (cf. the snippet at the top of this article). Here is my first attempt:

(defmacro BANG
  "Define the side-effect version of a given function 'func-name'"
  [func-name]
  (let [func-name-banged (symbol (str func-name "!"))
        arg-symbol (symbol "arg")
        appstate-arg-symbol (symbol "appstate-arg")]
    `(defn ~func-name-banged [~arg-symbol]
       (swap! appstate (fn [~appstate-arg-symbol] (~func-name ~appstate-arg-symbol ~arg-symbol))))))

Writing macros is generally not as easy as developing regular functions. Along the way, I made intensive use of the macroexpand function, which takes a form as input and gives back its full expansion:

(macroexpand '(BANG add-bubble))
;; => (def add-bubble!
;;      (clojure.core/fn
;;        ([arg]
;;          (clojure.core/swap! core/appstate
;;            (clojure.core/fn [appstate-arg]
;;              (add-bubble appstate-arg arg))))))

As many Clojure constructs are themselves macros, the result of macroexpand can be a bit confusing. macroexpand keeps expanding the top-level form until it is no longer a macro call, so it also expands defn into def and fn. This exposes implementation details of built-in macros, which are generally not relevant when you write your own macro. To hide this complexity, you can use macroexpand-1 instead, as it performs only one step of macro expansion:

(macroexpand-1 '(BANG add-bubble))
;; => (clojure.core/defn add-bubble! [arg]
;;      (clojure.core/swap! core/appstate
;;        (clojure.core/fn [appstate-arg]
;;          (add-bubble appstate-arg arg))))

The (BANG add-bubble) form expanded correctly to the definition of the target function add-bubble!: it works!
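The difference between the two expansion functions can be sketched with a pair of hypothetical macros, inner and outer, where one expands into a call to the other. The namespace prefixes shown in the comments assume the code is loaded in the user namespace.

```clojure
;; Two hypothetical macros: outer expands into a call to inner.
(defmacro inner [x] `(inc ~x))
(defmacro outer [x] `(inner ~x))

;; One step only: the result is still a macro call.
(macroexpand-1 '(outer 1))
;; => (user/inner 1)

;; Repeated until the top-level form is no longer a macro call.
(macroexpand '(outer 1))
;; => (clojure.core/inc 1)

;; Nested forms are left alone: do is not a macro, so nothing happens.
(macroexpand '(do (outer 1)))
;; => (do (outer 1))
```

If you need nested forms expanded as well, clojure.walk/macroexpand-all is the usual tool.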

To comment briefly on the body of the BANG macro, first you can see the binding of func-name-banged to (symbol (str func-name "!")). The symbol function allows you to create a symbol from an arbitrary string. In Clojure, a Symbol may or may not resolve to a Var (a Var is basically a named container for a value). Through the call (BANG add-bubble), func-name-banged stores a symbol with the name "add-bubble!":

(symbol (str "add-bubble" "!"))
;; => add-bubble!
(type (symbol (str "add-bubble" "!")))
;; => clojure.lang.Symbol
(name (symbol (str "add-bubble" "!")))
;; => "add-bubble!"
(type (name (symbol (str "add-bubble" "!"))))
;; => java.lang.String

Second, you can see the local variables arg-symbol and appstate-arg-symbol, which just store a symbol with a given name. In the body of the macro, these variables are used in the signatures of functions, and you can see in the macro expansion that they appear simply under their name (the string provided at their initialisation), without any namespace prefix.

This is a way to handle situations where you want to use a free variable in a macro. A free variable is not bound (yet) to a value when the macro is expanded, which is on purpose here, as the BANG macro generates a function together with its signature.
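The reason for building these symbols by hand can be seen by comparing two syntax-quoted forms. This is only a sketch; the namespace prefix in the first result depends on the namespace in which the code is read (user is assumed here).

```clojure
;; A plain symbol written inside syntax-quote gets a namespace prefix.
`(fn [arg] arg)
;; => (clojure.core/fn [user/arg] user/arg)

;; A symbol built beforehand and unquoted with ~ stays unqualified.
(let [arg-symbol (symbol "arg")]
  `(fn [~arg-symbol] ~arg-symbol))
;; => (clojure.core/fn [arg] arg)
```

The compiler rejects a qualified name such as user/arg in a parameter vector, so only the second form is usable as generated code.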

As this situation is recurrent, you can also use the gensym function, which returns a symbol with a unique name. If you rewrite the BANG macro using gensym, you'll get:

(defmacro BANG
  "Define the side-effect version of a given function 'func-name'"
  [func-name]
  (let [func-name-banged (symbol (str func-name "!"))
        arg-symbol (gensym)
        appstate-arg-symbol (gensym)]
    `(defn ~func-name-banged [~arg-symbol]
       (swap! appstate (fn [~appstate-arg-symbol] (~func-name ~appstate-arg-symbol ~arg-symbol))))))

(macroexpand-1 '(BANG add-bubble))
;; => (clojure.core/defn add-bubble! [G__7657]
;;      (clojure.core/swap! core/appstate
;;        (clojure.core/fn [G__7658]
;;          (add-bubble G__7658 G__7657))))

You can see that arg-symbol is expanded to G__7657 and appstate-arg-symbol to G__7658. This version of BANG is equivalent to the previous one; it takes fewer characters to write, but its macro expansion is a bit less readable. If you don't care about the particular name behind a symbol, using gensym is perfectly fine.
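Syntax-quote also offers a shorthand for this pattern: a symbol ending with # is replaced by a generated symbol, consistently within one syntax-quoted form. The sketch below is one possible variant, not the version used in the rest of this article; the name BANG-auto and the minimal appstate and add-bubble definitions are assumptions that make the example self-contained.

```clojure
;; Minimal stand-ins for the application state and the pure function
(def appstate (atom {:bubbles []}))

(defn add-bubble [state bubble]
  (update state :bubbles conj bubble))

;; Hypothetical variant of BANG relying on auto-gensym (arg#, state#)
(defmacro BANG-auto [func-name]
  (let [func-name-banged (symbol (str func-name "!"))]
    `(defn ~func-name-banged [arg#]
       (swap! appstate (fn [state#] (~func-name state# arg#))))))

(BANG-auto add-bubble)

(add-bubble! {:id 1})
@appstate
;; => {:bubbles [{:id 1}]}
```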

If you want to learn more about the use of generated symbols, I recommend this article which also gives a lot of material about the use of macro itself.

In this article, I prefer to have a BANG macro whose expansion is easy to comment on, so I'll stick with the first version.

The current definition of the BANG macro works perfectly for the add-bubble function, but it's not generic enough for my application. I have other functions which take more than one argument as input. The number of arguments accepted by a function is called its arity.

The add-bubble! function needs only one argument: the bubble to add to the global state appstate. But what if the function I deal with takes more than one argument as input? For example, the update-bubble function takes 2 arguments as input: a bubble-id and a hash map of attributes to update a given bubble.

The crux here is to find a mechanism to retrieve the number of arguments of the function passed to the BANG macro. Another formulation would be: "How do I deal with functions of arbitrary arity in the BANG macro?".

All the magic of Clojure macros lies in the fact that you can access this kind of information at compile time: the programmer has the power to handle it at their convenience.

Handle n-arity by the BANG macro

A really interesting feature of Clojure for me is the accessibility of metadata. Every time you define a variable through def, some metadata is automatically attached to it.

The official documentation describes the standard metadata keys and their meaning. You can inspect this metadata with the meta function:

(type (var add-bubble))
;; => clojure.lang.Var
(meta (var add-bubble))
;; => {:private true,
;;     :arglists ([appstate bubble]),
;;     :line 12,
;;     :column 1,
;;     :file ".../clojurescript macro not so long journey/part2/appstate/src/appstate.clj",
;;     :name add-bubble,
;;     :ns #namespace[core]}

The metadata attached by def lives on the Var, not on the function value it holds, so meta has to be called on the Var. To get a Var from a Symbol, you just have to use var. Its documentation says:

(doc var)
;; => var
;;    (var symbol)
;;    Special Form
;;    The symbol must resolve to a var, and the Var object
;;    itself (not its value) is returned. The reader macro #'x expands to (var x).
;;
;;    Please see http://clojure.org/special_forms#var
;;    The symbol must resolve to a var, and the Var object
;;    itself (not its value) is returned. The reader macro #'x expands to (var x).

What really interested me in this metadata is the :arglists field, as it gives you the list of input arguments of a given function, which is exactly what I need to handle functions of any arity within the BANG macro. Every function without side effects takes the current application state, appstate, as first argument. To generate their side-effecting versions, I need their signature without the first argument:

(-> add-bubble var meta :arglists first rest)
;; => (bubble)
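The same pipeline can be tried on a function with more arguments. In the sketch below, the body of update-bubble is a guess made for illustration (the article only describes its two arguments), and multi is a hypothetical function with several arities.

```clojure
;; Hypothetical implementation: only the argument list matters here
(defn update-bubble [state bubble-id attrs]
  (update state :bubbles
          (fn [bubbles]
            (mapv #(if (= (:id %) bubble-id) (merge % attrs) %) bubbles))))

(-> #'update-bubble meta :arglists)
;; => ([state bubble-id attrs])

(-> #'update-bubble meta :arglists first rest)
;; => (bubble-id attrs)

;; With several arities, :arglists holds one vector per arity,
;; and first keeps only the first of them.
(defn multi
  ([a] a)
  ([a b] [a b]))

(-> #'multi meta :arglists)
;; => ([a] [a b])
```

Taking only the first entry of :arglists is therefore a simplification that suits functions with a single arity.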

Voilà, now I just need to update the BANG macro, and we'll be ready to wrap everything up:

(defmacro BANG
  "Define the side-effect version of a given function 'func-name'"
  [func-name]
  (let [func-name-banged (symbol (str func-name "!"))
        appstate-arg-symbol (symbol "appstate-arg")
        func-var (var func-name)
        arg-list (-> func-var meta :arglists first rest)]
    `(defn ~func-name-banged [~@arg-list]
       (swap! appstate (fn [~appstate-arg-symbol] (~func-name ~appstate-arg-symbol ~@arg-list))))))

But when I compiled this macro, I got a weird and mysterious error message:

1. Caused by java.lang.RuntimeException
Unable to resolve var: func-name in this context

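At the time, I did not understand why I could not call var in a macro definition. In hindsight, the reason is that var is a special form which does not evaluate its argument: (var func-name) looks for a Var literally named func-name instead of using the symbol held by the func-name parameter. After tinkering with the code for a while, I got more errors, each as mysterious as the last.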

At this point, I understood that my vision of the puzzle was not complete: something was missing, and I had to dig deeper or ask for help. A moment later, I found this question on Stack Overflow, which is related to my issue. It is about getting the value of a var in the body of a macro. In particular, the given answer uses a function I was not aware of, resolve. Let's take a look at the documentation of this function:

(doc resolve)
;; => clojure.core/resolve
;;      ([sym] [env sym])
;;        same as (ns-resolve *ns* symbol) or (ns-resolve *ns* &env symbol)
(doc ns-resolve)
;; => clojure.core/ns-resolve
;;      ([ns sym] [ns env sym])
;;        Returns the var or Class to which a symbol will be resolved in the
;;        namespace (unless found in the environment), else nil.  Note that
;;        if the symbol is fully qualified, the var/Class to which it resolves
;;        need not be present in the namespace.

Given a Symbol, the resolve function looks up a Var in the current namespace (*ns*). Unlike var, resolve is a regular function, so its argument is evaluated before the lookup. In fact, this is what I needed for my BANG macro. To convince myself that this function behaves as expected, I like to write dummy examples:

(def dummy-arg 42)
(defmacro dummy-m [arg]
  (resolve arg))

(macroexpand '(defmacro dummy-m [arg]
                (resolve arg)))
;; => (do
;;      (clojure.core/defn dummy-m ([&form &env arg] (resolve arg)))
;;      (. #'dummy-m (setMacro))
;;      #'dummy-m)

(dummy-m dummy-arg)
;; => #'core/dummy-arg

What is interesting in the above snippet is the expansion of the definition of the dummy-m macro. You can see the special arguments &form and &env. You can use &form to see how a macro has been called:

(defmacro dummy-m1 [arg]
  (prn &form))

(dummy-m1 (+ 3 2 doesn't-exist))
;; => (dummy-m1 (+ 3 2 doesn't-exist))

The &env variable lets you inspect the current compiler environment of the macro:

(defmacro dummy-m2 []
  (prn &env))

(dummy-m2)
;; => nil

Here the &env variable is nil because the macro is called at the top level, with no local bindings in scope. This is not the main track of this article, so if you want to learn more about it, I recommend this article. Anyway, the only point I would like to emphasize here is that a Clojure macro is definitely a hook into the Clojure compiler, and these tiny examples give you some hints on how to tackle this topic.
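When the macro is called where local bindings exist, &env is a map whose keys are the symbols of those locals. The hypothetical local-names macro below is a small sketch that returns their names as a set of strings.

```clojure
;; Hypothetical macro: names of the locals visible at the call site
(defmacro local-names []
  (set (map name (keys &env))))

(local-names)
;; => #{}            no locals at the top level

(let [x 1
      y 2]
  (local-names))
;; => #{"x" "y"}     the let bindings are visible in &env
```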

Finally, to fix the BANG macro, I have to use the resolve function instead of var:

(defmacro BANG
  "Define the side-effect version of a given function 'func-name'"
  [func-name]
  (let [func-name-banged (symbol (str func-name "!"))
        appstate-arg-symbol (symbol "appstate-arg")
        func-var (resolve func-name)
        arg-list (-> func-var meta :arglists first rest)]
    `(defn ~func-name-banged [~@arg-list]
       (swap! appstate (fn [~appstate-arg-symbol] (~func-name ~appstate-arg-symbol ~@arg-list))))))

(macroexpand-1 '(BANG add-bubble))
;; => (clojure.core/defn add-bubble! [bubble]
;;      (clojure.core/swap! core/appstate
;;        (clojure.core/fn [appstate-arg]
;;          (add-bubble appstate-arg bubble))))

The BANG macro can now handle functions of any arity.
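To see the whole thing run end to end, here is a self-contained sketch. The shape of appstate and the body of update-bubble are assumptions made for the example; the BANG macro is the resolve-based version, lightly condensed.

```clojure
;; Assumed shape of the application state
(def appstate (atom {:bubbles []}))

(defn add-bubble [state bubble]
  (update state :bubbles conj bubble))

;; Hypothetical two-argument function (after the state argument)
(defn update-bubble [state bubble-id attrs]
  (update state :bubbles
          (fn [bubbles]
            (mapv #(if (= (:id %) bubble-id) (merge % attrs) %) bubbles))))

(defmacro BANG
  "Define the side-effect version of a given function 'func-name'"
  [func-name]
  (let [func-name-banged (symbol (str func-name "!"))
        appstate-arg-symbol (symbol "appstate-arg")
        arg-list (-> func-name resolve meta :arglists first rest)]
    `(defn ~func-name-banged [~@arg-list]
       (swap! appstate (fn [~appstate-arg-symbol]
                         (~func-name ~appstate-arg-symbol ~@arg-list))))))

(BANG add-bubble)
(BANG update-bubble)

(add-bubble! {:id 1 :color "red"})
(update-bubble! 1 {:color "blue"})

@appstate
;; => {:bubbles [{:id 1, :color "blue"}]}
```

Note that the pure functions must be defined before BANG is called on them, since resolve looks up the Var while the macro is being expanded.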

Conclusion

Clojure macros are a really powerful feature: they allow you to generate arbitrary code from Clojure data. But as you saw throughout this article, it can be a bit challenging to get them to work the way you want.

Except for the resolve trick, there is really good material on Clojure macros in various books and blog posts and, as I said, after reading one of these resources you'll be comfortable writing your own macros by the end of the day.

Learning Clojure macros was a preliminary step to tackling ClojureScript macros with peace of mind. Some subtle differences exist between the two, but that is another story, which I'll go through in the last article of this series.