In stacker we wrote our expander using imperative programming. In funstacker we upgraded the expander with functional programming. To nail down this contrast, we’ll do the same with bf: we’ll first write the expander in an imperative style, and then upgrade it to a functional style. (Those who already feel comfortable with functional programming can move ahead to the next section.)
Here’s where we left our expander at the end of the last section:
1 2 3 4 5 6 | #lang br/quicklang (define-macro (bf-module-begin PARSE-TREE) #'(#%module-begin PARSE-TREE)) (provide (rename-out [bf-module-begin #%module-begin])) |
We don’t need to change anything in our mandatory #%module-begin macro. So we’ll move on to the other macros and functions we need to handle the contents of PARSE-TREE.
We saw how the reader follows the grammar while parsing the source code, producing a parse tree whose nodes correspond to the names and patterns of the production rules of the grammar.
In turn, we can follow the grammar further, and use these production rules to design our expander. Once again, here’s our bf grammar:
1 2 3 4 | #lang brag bf-program : (bf-op | bf-loop)* bf-op : ">" | "<" | "+" | "-" | "." | "," bf-loop : "[" (bf-op | bf-loop)* "]" |
What this grammar tells us is that we need to handle three types of parse-tree nodes—bf-program, bf-op, and bf-loop. In the expander, these will become the names of macros or functions that will implement each type of node as standard Racket code.
Furthermore, the pattern on the right side of each rule tells us what kind of input the corresponding macro or function has to accept. So we don’t have to bumble around in the dark. The grammar gives us specific marching orders.
Let’s say it one more time, because it’s such a tremendous shortcut:
Each production rule in the grammar will have a corresponding macro or function in the expander.
The name (on the left side) of each production rule is the name of the corresponding macro or function.
The pattern (on the right side) of each production rule describes the possible input to its corresponding macro or function.
Given a certain production rule, how do we know if we should implement it with a macro or a function? The rule of thumb is to use a function where we can, and a macro where we must.
Often, a production rule does something simple, like return its arguments in a list or print them. In those cases, the rule can be implemented with a function.
But other times, the production rule needs us to rearrange the code in ways that an ordinary function won’t support. In those cases, we can escalate to a macro.
But for bf, we’ll only use macros, so we can get a little more practice with how they work.
To see how this works, let’s pencil out a tiny example. This miniature bf program:
Will be parsed into a tree that looks like this:
1 | (bf-program (bf-op "+") (bf-loop "[" (bf-op ">") "]")) |
Once we’ve written the corresponding macros, our expander will proceed as follows:
Call the bf-program macro with two input arguments:
(bf-op "+")
(bf-loop "[" (bf-op ">") "]")
Call the bf-op macro with one input argument:
"+"
Call the bf-loop macro with three input arguments:
"["
(bf-op ">")
"]"
Call the bf-op macro with one input argument:
">"
The result of this expansion needs to be a standard Racket program that can be evaluated to produce a result.
When we first used define-macro, we learned that the first line of the definition relies on a syntax pattern. A syntax pattern does for syntax objects what a regular expression does for strings: it breaks down the input into pieces so they can be manipulated & rearranged.
Let’s start with our bf-program macro. It’s easy—it doesn’t do anything. It’s just the top-level wrapper for the other bf commands that do the real work. According to the grammar, this macro has to be able to handle any number of bf-op or bf-loop arguments:
1 | bf-program : (bf-op | bf-loop)* |
So we’ll define our macro using this syntax pattern:
1 | (bf-program OP-OR-LOOP-ARG ...) |
The pattern has three parts.
bf-program denotes a literal identifier in the code, and becomes the name of the macro. Every element of a define-macro syntax pattern matches literally …
… unless it’s in ALL-CAPS, which denotes a pattern variable that can match anything. The name of a pattern variable can be whatever we like. In this case, we use OP-OR-LOOP-ARG, to remind us that our matched item is either a bf-op or bf-loop.
Finally, the ellipsis, which is similar to a * quantifier in regular expressions. When used after a pattern variable, the ellipsis gathers all the arguments that follow. Like a * quantifier, the OP-OR-LOOP-ARG ... combination can also match zero arguments.
Taken together, our syntax pattern above will define a macro called bf-program that can take any number of arguments, which are matched to OP-OR-LOOP-ARG ....
This syntax pattern will handle every possible pattern of arguments described by the bf-program production rule in the grammar. In general, as we convert the grammar into macros, we’ll essentially be translating each production rule into a syntax pattern.
Here’s the rest of the bf-program macro:
1 2 3 4 5 6 7 8 9 10 | #lang br/quicklang (define-macro (bf-module-begin PARSE-TREE) #'(#%module-begin PARSE-TREE)) (provide (rename-out [bf-module-begin #%module-begin])) (define-macro (bf-program OP-OR-LOOP-ARG ...) #'(void OP-OR-LOOP-ARG ...)) (provide bf-program) |
We recall that the return value of a macro is a syntax template for the rewritten code. In this case, because bf programs have their own printing commands, we’ll just pass our input arguments to the function void, which discards them. + Similar to piping output to /dev/null in Unixy operating systems, for those who have done that. When we write our syntax template, we use the pattern variables and ellipses from the syntax pattern. When the macro is invoked, they’ll be replaced with the actual input.
Next, our bf-loop macro. Let’s review the production rule for bf-loop, so it can guide our syntax pattern for the macro definition:
1 | bf-loop : "[" (bf-op | bf-loop)* "]" |
This is the same as the bf-program rule, but with "[" and "]" surrounding the bf-op or bf-loop arguments in the middle. So our syntax pattern for this macro will look similar to the bf-program macro, but with brackets added:
1 | (bf-loop "[" OP-OR-LOOP-ARG ... "]") |
Now we can put in the macro:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | #lang br/quicklang (define-macro (bf-module-begin PARSE-TREE) #'(#%module-begin PARSE-TREE)) (provide (rename-out [bf-module-begin #%module-begin])) (define-macro (bf-program OP-OR-LOOP-ARG ...) #'(void OP-OR-LOOP-ARG ...)) (provide bf-program) (define-macro (bf-loop "[" OP-OR-LOOP-ARG ... "]") #'(until (zero? (current-byte)) ; function coming soon OP-OR-LOOP-ARG ...)) (provide bf-loop) |
We remember from how bf works that the arguments inside a loop are evaluated repeatedly until the current byte is zero. We can implement this behavior with an until conditional, which keeps evaluating its body expressions until the current byte is zero?. To get that value, we’ll call a current-byte function that doesn’t exist yet (but we’ll add it in a minute).
Now to handle bf-op. As we saw in the grammar and sample parse trees, a bf-op node holds a character from the source code that tells us how bf-op should behave:
1 | bf-op : ">" | "<" | "+" | "-" | "." | "," |
As we did with bf-program and bf-loop, we’ll translate this production rule into a syntax pattern for the bf-op macro.
The wrinkle this time is that bf-op has six possible patterns. To handle this, we’ll use a variant of define-macro called define-macro-cases that will let us set up our macro with six cases, one for each character we might get from bf-op.
Similar to cond, define-macro-cases can have any number of branches. Each branch has a syntax pattern on the left, and a syntax template on the right. When the macro is invoked, each pattern on the left is tested in order. When a pattern matches, the syntax template on the right is returned. So if we just fill in the syntax patterns, we get this:
1 2 3 4 5 6 7 8 | (define-macro-cases bf-op [(bf-op ">") ···] [(bf-op "<") ···] [(bf-op "+") ···] [(bf-op "-") ···] [(bf-op ".") ···] [(bf-op ",") ···]) (provide bf-op) |
Our bf-op macro needs to convert these six patterns into functions that implement the bf language. For now, let’s put in the names of those functions—we’ll fill in their definitions in a moment. The finished bf-op macro looks like this:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | #lang br/quicklang (define-macro (bf-module-begin PARSE-TREE) #'(#%module-begin PARSE-TREE)) (provide (rename-out [bf-module-begin #%module-begin])) (define-macro (bf-program OP-OR-LOOP-ARG ...) #'(void OP-OR-LOOP-ARG ...)) (provide bf-program) (define-macro (bf-loop "[" OP-OR-LOOP-ARG ... "]") #'(until (zero? (current-byte)) ; function coming soon OP-OR-LOOP-ARG ...)) (provide bf-loop) (define-macro-cases bf-op [(bf-op ">") #'(gt)] ; We have [(bf-op "<") #'(lt)] ; not made [(bf-op "+") #'(plus)] ; these functions [(bf-op "-") #'(minus)] ; yet, but [(bf-op ".") #'(period)] ; we will [(bf-op ",") #'(comma)]) ; shortly. (provide bf-op) |
Now we need to add the missing functions that implement the rest of the bf language.
Recall from how bf works that when a bf program starts, it creates an array of 30,000 bytes (initialized to 0) and a pointer into that array (initialized to the 0 position). The array and the pointer are state values: they keep a record of what’s happened in the program so far.
We’ll follow this description by implementing the byte array as a Racket vector. Unlike a Racket list, a vector is not a series of linked pairs. It’s just a block of memory, so it’s typically better in situations that require random, nonsequential access to data elements. + See data structures for more about which Racket data structures are suitable for which needs.
The make-vector function takes two arguments: a size for the vector, and a default value for its cells. Our pointer can just be the number 0. Let’s add them to "expander.rkt":
1 2 | (define arr (make-vector 30000 0)) (define ptr 0) |
Unlike our macros, we don’t have to provide our array variables because they don’t need to be visible outside the expander. They’re just for internal use.
The current byte is the byte in the array at the location indicated by the pointer. So our current-byte function just gets the value from arr at the position ptr using vector-ref:
1 | (define (current-byte) (vector-ref arr ptr)) |
Other functions also need to set the current byte. For that, we use vector-set!:
1 | (define (set-current-byte! val) (vector-set! arr ptr val)) |
Then we need to make the functions used in our bf-op macro. The gt and lt functions—corresponding to the bf operations > and <—increase or decrease the pointer by one. As we did in stacker, we mutate our state variable ptr, using set!:
The plus and minus functions—corresponding to the bf operations + and -—increase or decrease the value of the current byte. We can express these operations in terms of current-byte and set-current-byte!:
The period function, corresponding to a . in bf code, writes the current byte to standard output. We use the Racket library function write-byte to help out:
1 | (define (period) (write-byte (current-byte))) |
Finally, the comma function, corresponding to a , in bf code, sets the current byte to one read from standard input. We can do this with the Racket function read-byte:
Taken together, our expander should now look like this, with the new array implementation highlighted:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 | #lang br/quicklang (define-macro (bf-module-begin PARSE-TREE) #'(#%module-begin PARSE-TREE)) (provide (rename-out [bf-module-begin #%module-begin])) (define-macro (bf-program OP-OR-LOOP-ARG ...) #'(void OP-OR-LOOP-ARG ...)) (provide bf-program) (define-macro (bf-loop "[" OP-OR-LOOP-ARG ... "]") #'(until (zero? (current-byte)) OP-OR-LOOP-ARG ...)) (provide bf-loop) (define-macro-cases bf-op [(bf-op ">") #'(gt)] [(bf-op "<") #'(lt)] [(bf-op "+") #'(plus)] [(bf-op "-") #'(minus)] [(bf-op ".") #'(period)] [(bf-op ",") #'(comma)]) (provide bf-op) (define arr (make-vector 30000 0)) (define ptr 0) (define (current-byte) (vector-ref arr ptr)) (define (set-current-byte! val) (vector-set! arr ptr val)) (define (gt) (set! ptr (add1 ptr))) (define (lt) (set! ptr (sub1 ptr))) (define (plus) (set-current-byte! (add1 (current-byte)))) (define (minus) (set-current-byte! (sub1 (current-byte)))) (define (period) (write-byte (current-byte))) (define (comma) (set-current-byte! (read-byte))) |
Let’s confirm that our test file "atsign.rkt" is ready to go:
1 2 3 | #lang reader "reader.rkt" Greatest language ever! ++++-+++-++-++[>++++-+++-++-++<-]>. |
When we run this, we’ll see the result:
1 | @ |
Great—this shows that the macros in our expander are successfully converting a bf parse tree into functions that correctly implement the bf language rules.
What if we try to run our “Hello World” bf program with our new implementation?
1 2 3 4 5 6 | #lang reader "reader.rkt" ++++++[>++++++++++++<-]>. >++++++++++[>++++++++++<-]>+. +++++++..+++.>++++[>+++++++++++<-]>. <+++[>----<-]>.<<<<<+++[>+++++<-]>. >>.+++.------.--------.>>+. |
1 | Hello, World! |
How about that factorial generator? (Warning: this program will emit factorials until you stop it. To interrupt a running program in DrRacket, use the menu options Racket → Ask the Program to Quit or Racket → Force the Program to Quit):
1 2 3 4 5 6 7 | #lang reader "reader.rkt" >++++++++++>>>+>+[>>>+[-[<<<<<[+<<<<<]>>[[-]>[<<+>+>-] <[>+<-]<[>+<-[>+<-[>+<-[>+<-[>+<-[>+<-[>+<-[>+<-[>+<- [>[-]>>>>+>+<<<<<<-[>+<-]]]]]]]]]]]>[<+>-]+>>>>>]<<<<< [<<<<<]>>>>>>>[>>>>>]++[-<<<<<]>>>>>>-]+>>>>>]<[>++<-] <<<<[<[>+<-]<<<<]>>[->[-]++++++[<++++++++>-]>>>>]<<<<< [<[>+>+<<-]>.<<<<<]>.>>>>] |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | 1 1 2 6 24 120 720 5040 40320 362880 3628800 39916800 479001600 6227020800 87178291200 1307674368000 ··· |
That’s it—we’ve successfully implemented the bf language. + … minus some fiddly portability details, which are too dull to pursue here.
In real life, we’d want to run a larger set of sample programs before drawing optimistic conclusions. But we’ll defer that kind of stress testing for a later tutorial.
In this version of the expander, we implemented bf in an imperative-programming style. We set up two state values (arr and ptr) and called functions (gt, lt, plus, and minus) that mutated them. This wasn’t wrong—our test programs worked. And it had a certain logic, as bf itself is an imperative language. Who knows—on a certain set of test programs, this may even turn out to be the fastest approach.
But it’s not idiomatic in Racket, where functional programming is the favored style. So we’d be shortchanging ourselves if we stopped here. Now that we have a mental model for how bf works as a Racket language, let’s convert our expander to a functional-programming style.