In stackerizer, our + and * operations will take any number of arguments. To use the mathy word, they are variadic. But in our stacker target language, those + and * operations take only two arguments (mathy word = dyadic). So our first challenge is to figure out how to decompose a call to a variadic function into some combination of calls to a dyadic function.
Let’s set up a simple test case:
1 2 | #lang s-exp "stackerizer.rkt" (+ 1 2 3 4) |
Of course, when we run this, we’ll get an error:
1 | module: no #%module-begin binding in the module's language |
That should be totally unsurprising. Our "stackerizer.rkt" is almost blank. Let’s update it like so:
1 2 3 4 5 6 7 | #lang br/quicklang (provide + *) (define-macro (stackerizer-mb EXPR) #'(#%module-begin EXPR)) (provide (rename-out [stackerizer-mb #%module-begin])) |
First, we’ll provide our + and * functions, just as we did in stacker.
We stipulated in specification and setup that stackerizer would only take one S-expression as input. Therefore, we’ll make a stackerizer-mb that accepts one S-expression, denoted by EXPR, and passes it through. As we did in stacker, we use rename-out to export this macro with the correct #%module-begin name.
When we run "stackerizer-test.rkt", we’ll get a new result:
1 | 10 |
Very good. But our + expression has four arguments. We need to put that operation in terms of dyadic operations.
Hopefully we remember from sixth-grade math that addition is associative and commutative. That means we can group the arguments as we wish, and perform the additions in any order. For instance, this S-expression will evaluate to same result as the last one:
This new expression uses only dyadic operations. The first addition combines (+ 3 4), the next addition combines (+ 2 first-result), and so on. Furthermore, this pattern of nested additions could be extended to any number of arguments. + Being able to identify patterns in code is an essential skill for writing good macros, since a macro works by matching patterns in the code and systematically rearranging their elements.
Let’s consider another way we could decompose this expression into dyadic operations:
Also mathematically sound. But suppose that our arguments are instead 1 2 3 4 5. Where would we put the 5? The decomposition pattern above is a dead end, because it only works with an even number of arguments.
Whereas with our first pattern, it’s easy to include a new argument:
In effect, we can think of this as the specification for our variadic-to-dyadic macro: the macro needs to rewrite an expression like (+ 1 2 3 4 5) to look like (+ 1 (+ 2 (+ 3 (+ 4 5)))), for any number of arguments.
Now we can write the macro that will convert the variadic + into a dyadic version using this pattern. Let’s add this code to "stackerizer.rkt":
1 2 3 4 5 6 7 8 9 10 11 | #lang br/quicklang (provide + *) (define-macro (stackerizer-mb EXPR) #'(#%module-begin EXPR)) (provide (rename-out [stackerizer-mb #%module-begin])) (define-macro-cases + [(+ FIRST) #'FIRST] [(+ FIRST NEXT ...) #'(list 'dyadd FIRST (+ NEXT ...))]) |
This is the first time we’ve seen define-macro-cases. It works like define-macro. But instead of matching a single syntax pattern, it lets us specify a series of patterns to match. It works like a cond expression: when a left-hand syntax pattern matches, the code on the right side of the branch—also known as a syntax template, for self-evident reasons—is returned. Otherwise, the next pattern is tested.
We’re also doing something sneaky: we’re naming our macro +. This means that we’ll override the usual definition of + (= a function that adds) and turn it into a macro that does something else.
What does it do, exactly? Let’s start with the first branch:
1 | [(+ FIRST) #'FIRST] |
The pattern on the left will match a single-argument addition like (+ 42). The + in the pattern matches literally. + An identifier in a syntax pattern matches an identifier in the input “literally” when both their datums and their bindings match (also known as matching in the sense of free-identifier=?) We’ll match the argument within the pattern to the pattern variable FIRST. Within a syntax pattern, an all-caps identifier like FIRST will match anything and assign it that name.
As for the addition operation itself, (+ FIRST) is equivalent to adding nothing. So we’ll just return a syntax object containing only our FIRST pattern variable. As usual, to make a syntax object within the current lexical context, we use the #' prefix, so the result is #'FIRST. Easy.
Now the less easy second branch:
The syntax pattern in this branch will match a variadic addition like (+ 1 2 3 4 5). The first argument is matched to FIRST, and the remaining arguments are matched to NEXT ..., where the ellipsis gathers everything that follows. Note that a pattern variable with an ellipsis is allowed to match zero arguments. That means this pattern could also match something like (+ 42). But in this macro, our first branch will match that case, so (+ 42) would never reach this branch. That also means that the pattern in this branch is guaranteed to match two or more arguments.
On the right side, we decompose our variadic addition into a dyadic addition. For prototyping purposes, we’ll simulate the conversion by using the placeholder 'dyadd symbol to represent a dyadic addition. We see it has two arguments, as it must. The first argument will be FIRST.
The second argument, however, is where the magic happens. We take the rest of our arguments, which are stored in NEXT ..., and pass them as input to our + macro. This idea of a function or macro calling itself is known as recursion. We’ll be seeing a lot more of it. The core idea of recursion is breaking down a task into smaller versions of the same task. Here, every time the macro reaches this branch, it peels off the first argument, and recursively calls itself using the other arguments. Eventually, there will only be one argument left, which will be handled by the first branch.
Let’s see how this works. Save "stackerizer.rkt" and try running these test files. This time, every + in the test program will be interpreted as a call to our new + macro. When called with one argument, the macro just returns that argument:
1 2 | #lang s-exp "stackerizer.rkt" (+ 1) |
1 | 1
|
When called with more arguments, the macro calls itself recursively, thereby returning recursively nested lists with 'dyadd.
1 2 | #lang s-exp "stackerizer.rkt" (+ 1 2) |
1 | '(dyadd 1 2) |
1 2 | #lang s-exp "stackerizer.rkt" (+ 1 2 3 4 5) |
1 | '(dyadd 1 (dyadd 2 (dyadd 3 (dyadd 4 5)))) |
Notice how this last result follows the (+ 1 (+ 2 (+ 3 (+ 4 5)))) pattern we designed earlier. We’re on the right track. In fact, to get things a little closer to the goal, let’s go into our + macro and swap out 'dyadd with '+ like so:
Let’s be clear: that '+ is a symbol consisting of one literal plus sign. It is not the same as the variable + that represents our macro (just like the string "+" also is not the same as +). Let’s save "stackerizer.rkt" and try running "stackerizer-test.rkt" again:
1 | '(+ 1 (+ 2 (+ 3 (+ 4 5)))) |
That takes us even closer to the goal. We’ll also need to deal with *, of course, but that will follow the same pattern. Let’s return to that after we’ve tackled the other lingering problem we left for ourselves.
stacker only handles a flat list of arguments, but stackerizer allows nested expressions. So we have to flatten the nested structure into an equivalent list.
As before, we’ll start by working through a simple example. In the last section, we figured out how to convert a stackerizer expression like (+ 1 2 3 4) into one that looks like (+ 1 (+ 2 (+ 3 4))). This time, suppose we start with that:
How would we write that nested expression in stacker? The expression would be evaluated from innermost to outermost, so the initial (+ 3 4) expression would start our stacker program:
1 2 3 | 4 3 + |
This would leave 7 on the stack. So if we wanted to then add 2, we would write:
And then 1:
From this, the pattern emerges: we just remove the parentheses, keeping the arguments in order, and then “rotate” the program 90 degrees counterclockwise. We do that by printing them in reverse order, one per line.
This kind of transformation is best accomplished in our stackerizer-mb. Let’s update it as follows:
1 2 3 4 5 6 7 8 9 10 11 | #lang br/quicklang (provide + *) (define-macro (stackerizer-mb EXPR) #'(#%module-begin (for-each displayln (reverse (flatten EXPR))))) (provide (rename-out [stackerizer-mb #%module-begin])) (define-macro-cases + [(+ FIRST) #'FIRST] [(+ FIRST NEXT ...) #'(list '+ FIRST (+ NEXT ...))]) |
Recall that in the first version, we just passed through the EXPR argument untouched. This time, we’ll flatten it (which is equivalent to removing the inner parentheses) and reverse the order. After that, we use for-each with displayln to print each item in this list on a separate line. When we run our test program again:
1 2 | #lang s-exp "stackerizer.rkt" (+ 1 2 3 4 5) |
We’ll now get:
At this point, it might be clearer why we used the '+ symbol in our macro: when printed with displayln, it looks just like an ordinary + operator in a stacker program.
We’re one giant step closer to the goal. We just need to go back and put in our * operator.
The easy way to support our * operation would be to copy the macro we made for + and change it to use *:
But this would break the heart of any experienced Racketeer. Since we’re making a macro anyhow, we might as well make one that can generate code for both these variants.
Macros are perfectly happy to make other macros. We just have to ask. In "stackerizer.rkt", let’s delete our + macro and replace it with this new define-op macro:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 | #lang br/quicklang (provide + *) (define-macro (stackerizer-mb EXPR) #'(#%module-begin (for-each displayln (reverse (flatten EXPR))))) (provide (rename-out [stackerizer-mb #%module-begin])) (define-macro (define-op OP) #'(define-macro-cases OP [(OP FIRST) #'FIRST] [(OP FIRST NEXT (... ...)) #'(list 'OP FIRST (OP NEXT (... ...)))])) (define-op +) (define-op *) |
How did we come up with this? Let’s convince ourselves that there’s nothing mysterious happening here. We just need to follow the macro recipe we’ve used before. We start with the + macro we already made:
We wrap it inside a new define-macro form with the name define-op, taking a single argument named OP, and apply the #' prefix to turn our existing code into a syntax object:
1 2 3 4 | (define-macro (define-op OP) #'(define-macro-cases + [(+ FIRST) #'FIRST] [(+ FIRST NEXT ...) #'(list '+ FIRST (+ NEXT ...))])) |
We replace the occurrences of our specific + operation with the new pattern variable OP. And when we say 'OP, it’s shorthand for (quote OP), so the OP is replaced with its actual value before the ' is applied:
1 2 3 4 | (define-macro (define-op OP) #'(define-macro-cases OP [(OP FIRST) #'FIRST] [(OP FIRST NEXT ...) #'(list 'OP FIRST (OP NEXT ...))])) |
And finally, some fiddly but necessary housekeeping. We want our ellipsis operators to be used in the patterns of the result macros, not the define-op macro itself. So we “escape” them with the special (... ...) form:
1 2 3 4 5 | (define-macro (define-op OP) #'(define-macro-cases OP [(OP FIRST) #'FIRST] [(OP FIRST NEXT (... ...)) #'(list 'OP FIRST (OP NEXT (... ...)))])) |
More generally, the (... pat) form means “escape all ellipses that appear within pat”. So (... ...) escapes a pattern consisting of a single ellipsis. We can also write longer patterns like (begin (... (foo ... (bar ...)))) and the two nested ellipses would be escaped, just as if we had escaped them individually by writing (begin (foo (... ...) (bar (... ...)))).
All that’s left to do is invoke our new macro with the desired names:
1 2 3 4 5 6 7 8 | (define-macro (define-op OP) #'(define-macro-cases OP [(OP FIRST) #'FIRST] [(OP FIRST NEXT (... ...)) #'(list 'OP FIRST (OP NEXT (... ...)))])) (define-op +) (define-op *) |
That’s it. Each call to define-op creates a new macro using the same template. Then, both + and * will be set up correctly.
Our define-op macro will work fine as is. But let’s learn one more thing about the ellipsis operator, which turns out to be a serious macro power tool.
We saw that in a syntax pattern, an identifier followed by an ellipsis like FOO ... will match a list of elements (including possibly zero elements). Then, when used in a syntax template, FOO represents the first element of that list, and ... represents the rest.
But when we use an ellipsis in a syntax template, it’s working harder than we might suppose. The ellipsis doesn’t just mean “put the other elements here”. It means “handle the other elements the same way we handled FOO”. So in a syntax template, we’re allowed to separate FOO and its ellipsis. Whatever we do to FOO ends up as a prototype for the rest of the list, represented by ....
Let’s try a simple example to see how this works. This lister macro will match its input arguments to the ARG ... pattern variable. Then we use ARG ... to insert them into the template in the same order:
1 2 3 4 | (define-macro (lister ARG ...) #'(list ARG ...)) (lister "foo" "bar" "baz") ; '("foo" "bar" "baz") |
Now consider the wrap macro below. It’s like lister. But in the syntax template, ARG is used separately from the ellipsis. We put ARG in a sublist with 42, and then the ellipsis takes care of doing the same thing to the rest of the matched arguments:
1 2 3 4 5 | (define-macro (wrap ARG ...) #'(list '(ARG 42) ...)) (wrap "foo" "bar" "baz") ; '(("foo" 42) ("bar" 42) ("baz" 42)) |
We can even use ARG multiple times in the template, and the ellipsis will still do the right thing:
1 2 3 4 5 | (define-macro (wrap2 ARG ...) #'(list '(ARG 42 ARG) ...)) (wrap2 "foo" "bar" "baz") ; '(("foo" 42 "foo") ("bar" 42 "bar") ("baz" 42 "baz")) |
Now we can reconsider our define-op macro. Let’s replace it with a define-ops macro that will let us generate both our + and * operations with one macro call. In the real world, this would be a pointless optimization. But we’re just looking for a pretext to use another ellipsis. Here’s the new code:
What’s changed here?
First, we add an ellipsis to the initial syntax pattern for the macro, so instead of (define-op OP), we have (define-ops OP ...). This makes our macro variadic.
Then, within the syntax template, we surround everything with a new (begin ···) form. A syntax template can only contain one top-level expression, so begin is just a way of grouping multiple expressions into one.
The code in the middle for the OP macro is the same as before. But after the end of this macro code, we add the ellipsis that corresponds to the OP pattern variable. This ellipsis applies this macro-code template to the other matched arguments, just as it was applied to OP.
Finally, we invoke our new macro with (define-ops + *).
Again, in the real world, we wouldn’t bother upgrading define-op to define-ops. But now we’ve seen how the ellipsis lets us repeat part of a syntax template for each member of a list of matched arguments.
Our finished expander should look like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | #lang br/quicklang (provide + *) (define-macro (stackerizer-mb EXPR) #'(#%module-begin (for-each displayln (reverse (flatten EXPR))))) (provide (rename-out [stackerizer-mb #%module-begin])) (define-macro (define-ops OP ...) #'(begin (define-macro-cases OP [(OP FIRST) #'FIRST] [(OP FIRST NEXT (... ...)) #'(list 'OP FIRST (OP NEXT (... ...)))]) ...)) (define-ops + *) |
Now let’s run our original test programs. First, the old favorite:
When we run this module, we get:
Then we try:
This time, we get:
Exactly right.