Along with the reader, the expander is the other essential component of every Racket-implemented language.
For an identifier to have a meaning, it needs a binding. Conversely, an identifier without a binding has no meaning.
For instance, if we try to run this program:
    x
We get this error:
    x: unbound identifier in module in: x
Why? When we run the program, Racket tries to evaluate the identifier x. But it can’t, because x doesn’t yet have a binding. Hence the “unbound identifier” error.
We can bind x by assigning it a value with define:
    (define x "x is bound to this string")
    x

    "x is bound to this string"
More broadly, every identifier in a Racket program needs a binding before the program can run. define is just one way we can bind an identifier. We can also import a binding from another module with require, if it was exported from that module with provide:
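A minimal sketch of this pattern, assuming plain `#lang racket` and a hypothetical submodule named `mod` (both are illustrative choices, not part of jsonic):

```racket
#lang racket

;; A submodule that binds x with define and exports it with provide.
;; The name `mod` is hypothetical.
(module mod racket
  (define x "x is bound to this string")
  (provide x))

;; Importing the submodule gives x a binding in the enclosing module,
;; even though no define for x appears at this level.
(require 'mod)
x
```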
    "x is bound to this string"
This is why Racketeers tend to speak of “binding identifiers” rather than “assigning values”. The latter phrase narrowly suggests the use of define. But there are many ways to bind an identifier.
The main job of an expander is to bind the identifiers in a parse tree, even though no define expressions appear in the parse tree itself. This is why the reader delivers code without bindings—so the expander can start with a blank slate. The expander does this job by exporting a binding (using provide) for each identifier in the parse tree.
When the reader returns its module expression, it embeds a reference to the expander. For instance, in the read-syntax function for jsonic, inside module-datum, we see the reference to jsonic/expander:
    #lang br/quicklang
    (require "tokenizer.rkt" "parser.rkt")

    (define (read-syntax path port)
      (define parse-tree (parse path (make-tokenizer port)))
      (define module-datum `(module jsonic-module jsonic/expander
                              ,parse-tree))
      (datum->syntax #f module-datum))
    (provide read-syntax)
When the code represented by module-datum is evaluated, jsonic/expander is imported and provides the initial bindings. In other words, module behaves as if there were an implicit (require jsonic/expander) on the inside. So we don’t need an explicit require.
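To see the shape of what the reader hands off, here is a small sketch in plain `racket/base`. The parse tree below is a hypothetical stand-in for what the jsonic parser might return. Because the module form is only quoted data at this point, `jsonic/expander` doesn't need to be installed for the sketch to run.

```racket
#lang racket/base

;; Hypothetical parse tree, standing in for the parser's real output.
(define parse-tree
  '(jsonic-program (jsonic-char "[")
                   (jsonic-sexp " (* 6 7) ")
                   (jsonic-char "]")))

;; Same quasiquote shape as in read-syntax. The position right after the
;; module name identifies the expander that will supply the initial bindings.
(define module-datum
  `(module jsonic-module jsonic/expander ,parse-tree))

;; Passing #f as the first argument creates a syntax object with no
;; lexical context, which is to say, no bindings.
(define module-stx (datum->syntax #f module-datum))

(syntax->datum module-stx)
```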
Again, this is mostly a recap. If the details are hazy, here’s a review.
Racket starts the expander for a language by invoking a macro called #%module-begin. Therefore, as in previous tutorials, our jsonic expander will provide a #%module-begin macro.
Beyond that, we just saw how the parse tree produced by the jsonic parser follows the production rules of the grammar. In turn, we can use these production rules to organize the rest of our expander. Let’s review the technique we first learned in bf:
The name (on the left side) of each rule is the name of the corresponding macro or function.
The pattern (on the right side) of each rule describes the possible input to that corresponding macro or function.
The phrase “macro or function” doesn’t imply that we’ll always have a choice. Certain transformations will require macros. Otherwise, we should prefer functions. But here in the jsonic expander, we’ll opt for macros only, just to get some more experience writing them.
With that in mind, let’s take another look at the grammar for jsonic:
    #lang brag
    jsonic-program : (jsonic-char | jsonic-sexp)*
    jsonic-char    : CHAR-TOK
    jsonic-sexp    : SEXP-TOK
This grammar implies that our expander should have three more macros (in addition to #%module-begin):
A jsonic-program macro that accepts any number of arguments, each of which is either a jsonic-sexp or jsonic-char.
A jsonic-char macro that accepts one argument, which is the string value from a CHAR-TOK token.
A jsonic-sexp macro that accepts one argument, which is the string value from a SEXP-TOK token.
We also know from our language specification that we need to do some housekeeping in terms of converting results to strings, and making sure we’re generating valid JSON. We’ll deal with those tasks as they arise.
That gives us enough of a roadmap to get going.
Let’s create a new expander module called "expander.rkt". Our expander must have a #%module-begin macro, so we’ll start there. The shell of the macro looks like this:
    #lang br/quicklang

    (define-macro (jsonic-mb PARSE-TREE)
      #'(#%module-begin
         ···))
    (provide (rename-out [jsonic-mb #%module-begin]))
Following our existing habit, we define the macro as jsonic-mb and then use rename-out in the provide expression, so that it doesn’t conflict with the other #%module-begin imported from br/quicklang. The input to this macro is our PARSE-TREE.
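The renaming trick isn't specific to `#%module-begin`. As a rough illustration in plain `racket/base`, with a hypothetical submodule `shapes` and hypothetical function names, `rename-out` lets a definition carry one name inside its module and another name outside:

```racket
#lang racket/base

;; Inside the submodule the function is called my-area, so it can't clash
;; with anything else named area. Outside, it is exported as area.
(module shapes racket/base
  (define (my-area w h) (* w h))
  (provide (rename-out [my-area area])))

(require 'shapes)
(area 3 4) ; 12
```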
Now let’s fill in the blanks. A major requirement of our language is that every jsonic program results in valid JSON. This implies two things: that the result of the program has to be a string, and that the string has to be validated against the JSON standard.
Let’s assume for a moment that we can convert our parse tree into a string (and we will, with our next set of macros). Since jsonic-mb is the top-level macro in our program, we can use it to validate our new string as valid JSON, and return the result:
    #lang br/quicklang
    (require json)

    (define-macro (jsonic-mb PARSE-TREE)
      #'(#%module-begin
         (define result-string PARSE-TREE)
         (define validated-jsexpr (string->jsexpr result-string))
         (display result-string)))
    (provide (rename-out [jsonic-mb #%module-begin]))
We’ll assume for now that the expansion of PARSE-TREE results in a string, which we assign to result-string.
To help us out, we import Racket’s json library. A major benefit of making languages in Racket is that all its libraries are available for every DSL, so we can be optimally lazy.
In this case, json exports a function called string->jsexpr. This function converts a string to a special kind of S-expression called a JS-expression. A JS-expression holds a representation of JSON data. If string->jsexpr succeeds, it returns a JS-expression; otherwise it raises an error.
Thus, we can confirm that the result-string of our program is valid JSON by calling string->jsexpr. For clarity, we assign its return value to validated-jsexpr. This is optional, since we ignore this result—we only care about the validation side effect. Then we just display our result-string.
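We can try this validation behavior on its own, outside the expander. The sketch below uses plain `racket/base`. The helper `valid-json?` is a hypothetical name introduced only for this demonstration. It isn't part of the `json` library or of jsonic.

```racket
#lang racket/base
(require json)

;; Hypothetical helper: #t if str parses as JSON, #f if string->jsexpr
;; raises an error along the way.
(define (valid-json? str)
  (with-handlers ([exn:fail? (lambda (e) #f)])
    (string->jsexpr str)
    #t))

(valid-json? "[1, 2, {\"key\": null}]") ; #t
(valid-json? "[1, 2")                   ; #f, because the list is never closed
```

In `jsonic-mb` we don't catch the error like this. We let it propagate, so that a program producing invalid JSON fails loudly.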
Now we’ll add the macros that correspond to the production rules in our jsonic grammar.
We’ll start with jsonic-char, because it’s easy:
    #lang br/quicklang
    (require json)

    (define-macro (jsonic-mb PARSE-TREE)
      #'(#%module-begin
         (define result-string PARSE-TREE)
         (define validated-jsexpr (string->jsexpr result-string))
         (display result-string)))
    (provide (rename-out [jsonic-mb #%module-begin]))

    (define-macro (jsonic-char CHAR-TOK-VALUE)
      #'CHAR-TOK-VALUE)
    (provide jsonic-char)
This macro is defined with a pattern that matches one item, which is the value from a CHAR-TOK token, representing a character in a JSON string. All we need to do in this case is pass through this string, which means converting it into a syntax object.
By the way, there’s nothing special about the name of the pattern variable, in this case CHAR-TOK-VALUE. It has nothing to do with the token name used in the grammar. And the macro would work the same way with any pattern-variable name.
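To convince ourselves of this, here is a sketch using `define-syntax-rule` from plain `racket/base` as a stand-in for `define-macro`. The macro names `pass-a` and `pass-b` are hypothetical. The two macros differ only in the name of the pattern variable, and they behave identically:

```racket
#lang racket/base

;; Two pass-through macros. Only the pattern-variable name differs.
(define-syntax-rule (pass-a CHAR-TOK-VALUE) CHAR-TOK-VALUE)
(define-syntax-rule (pass-b anything-at-all) anything-at-all)

(pass-a "{") ; "{"
(pass-b "{") ; "{"
```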
Then we’ll move to jsonic-program. It’s almost as easy:
    #lang br/quicklang
    (require json)

    (define-macro (jsonic-mb PARSE-TREE)
      #'(#%module-begin
         (define result-string PARSE-TREE)
         (define validated-jsexpr (string->jsexpr result-string))
         (display result-string)))
    (provide (rename-out [jsonic-mb #%module-begin]))

    (define-macro (jsonic-char CHAR-TOK-VALUE)
      #'CHAR-TOK-VALUE)
    (provide jsonic-char)

    (define-macro (jsonic-program SEXP-OR-JSON-STR ...)
      #'(string-trim (string-append SEXP-OR-JSON-STR ...)))
    (provide jsonic-program)
This macro is defined with a pattern that matches any number of items, each of which is either a jsonic-sexp or jsonic-char. In turn, each of these is going to be a macro that emits code that produces a string. As the top-level macro in the parse tree, jsonic-program needs to combine these little strings into one big string (which will become the input to string->jsexpr in jsonic-mb).
We can do this with string-append and, for tidiness, string-trim (which will lop off any leading or trailing whitespace). (As above, there’s nothing special about the pattern-variable name SEXP-OR-JSON-STR. It’s just meant to be descriptive.)
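As a quick check of what the expanded code will do at run time, here is the same combination in plain `racket/base`. The list of pieces is made up for the example. In a real jsonic program, each piece would come from a `jsonic-char` or `jsonic-sexp` expression.

```racket
#lang racket/base
(require racket/string) ; provides string-trim

;; Hypothetical little strings, including stray newlines at both ends.
(define pieces (list "\n" "[" "42" "," "\"hi\"" "]" "\n"))

;; Glue the pieces together, then lop off the surrounding whitespace.
(define result (string-trim (apply string-append pieces)))
result ; "[42,\"hi\"]"
```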
That leaves the jsonic-sexp macro, which is only slightly less easy:
    #lang br/quicklang
    (require json)

    (define-macro (jsonic-mb PARSE-TREE)
      #'(#%module-begin
         (define result-string PARSE-TREE)
         (define validated-jsexpr (string->jsexpr result-string))
         (display result-string)))
    (provide (rename-out [jsonic-mb #%module-begin]))

    (define-macro (jsonic-char CHAR-TOK-VALUE)
      #'CHAR-TOK-VALUE)
    (provide jsonic-char)

    (define-macro (jsonic-program SEXP-OR-JSON-STR ...)
      #'(string-trim (string-append SEXP-OR-JSON-STR ...)))
    (provide jsonic-program)

    (define-macro (jsonic-sexp SEXP-STR)
      (with-pattern ([SEXP-DATUM (format-datum '~a #'SEXP-STR)])
        #'(jsexpr->string SEXP-DATUM)))
    (provide jsonic-sexp)
The input is the string value from a SEXP-TOK token, representing a Racket expression. In our macro syntax pattern, we call this SEXP-STR. We need to manually convert this string into a new datum that we can insert back into the program, as if it had been there all along.
with-pattern lets us introduce new pattern variables that we can then use inside syntax templates. In this case, we want to create a new SEXP-DATUM.
To make this datum, we pass our SEXP-STR input argument to format-datum. We used this function before, in the stacker reader, to convert a source string into an S-expression. We’re using it for the same purpose here.
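For intuition about the string-to-datum step, here is a rough equivalent that uses only `racket/base`. It is a stand-in for `format-datum`, not its actual implementation, and the helper name `string->datum` is hypothetical.

```racket
#lang racket/base

;; Hypothetical helper: read one S-expression out of a string.
(define (string->datum str)
  (read (open-input-string str)))

;; The surrounding spaces are skipped by the reader.
(define datum (string->datum " (* 6 7) "))
datum ; '(* 6 7)

;; Once we have a datum, it can be evaluated like any other code.
(eval datum (make-base-namespace)) ; 42
```

In the macro we never call `eval`. We splice the datum into the syntax template, and Racket evaluates it along with the rest of the expanded program.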
A point about notation. In the syntax pattern defining the macro, we call the pattern variable SEXP-STR, but within format-datum, we write #'SEXP-STR. In both cases, the name of the pattern variable is SEXP-STR. But within the macro, we can’t use a naked pattern variable—we have to put it inside a syntax template. So when we say #'SEXP-STR, we’re creating the world’s smallest syntax template that contains only the pattern variable SEXP-STR.
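The same idea can be sketched without the `br` macros, using `syntax-case` and `with-syntax` from plain Racket. This is an analogy under our own assumptions, not how `define-macro` and `with-pattern` are implemented, and the macro name `sexp-from-string` is hypothetical. Notice that the pattern variable `sexp-str` only ever appears inside a `#'` template.

```racket
#lang racket/base
(require (for-syntax racket/base))

(define-syntax (sexp-from-string stx)
  (syntax-case stx ()
    [(_ sexp-str)
     ;; with-syntax introduces a new pattern variable, sexp-datum.
     ;; To get at the string, we wrap the pattern variable in a tiny
     ;; syntax template, #'sexp-str, and unwrap it with syntax->datum.
     (with-syntax ([sexp-datum
                    (datum->syntax
                     #'sexp-str
                     (read (open-input-string (syntax->datum #'sexp-str))))])
       ;; The new datum is inserted as if it had been in the program all along.
       #'sexp-datum)]))

(sexp-from-string "(* 6 7)") ; 42
```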
Finally, we need to return the result of our Racket expression as a JSON string. In jsonic-mb, we used string->jsexpr from the json library. This library also provides the inverse function, jsexpr->string, to convert a Racket JS-expression to a JSON string. So in our syntax template, we just pass our SEXP-DATUM to this function.
As we saw before, one benefit of relying on Racket’s json library is that we don’t have to work as hard. But even better, we add a layer of error checking to our DSL for free. When jsexpr->string runs, it will confirm that our Racket expression represents a valid JS-expression, and raise an error if it doesn’t.
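Here is a short demonstration of both behaviors in plain `racket/base`. The helper `try-convert` and its `'not-a-jsexpr` result are hypothetical, introduced only so the failure case can be shown without halting the program.

```racket
#lang racket/base
(require json)

;; A valid JS-expression converts to a compact JSON string.
(jsexpr->string (list 1 2 (hasheq 'key "value"))) ; "[1,2,{\"key\":\"value\"}]"

;; Hypothetical helper: return the JSON string, or a marker symbol if
;; jsexpr->string rejects the value.
(define (try-convert v)
  (with-handlers ([exn:fail? (lambda (e) 'not-a-jsexpr)])
    (jsexpr->string v)))

;; A vector is a fine Racket value, but it is not a JS-expression.
(try-convert (vector 1 2)) ; 'not-a-jsexpr
```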
Of course, the cost of relying on the json library is that we have to abide by its rules. For instance, for a hash table to count as a valid JS-expression, it has to use only symbols for keys. Also, the JSON value null has to be represented as the Racket symbol 'null. If we wanted jsonic to be more lenient with input, we could insert a helper function that would convert a wider range of Racket values to the narrower set accepted as JS-expressions.
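One possible shape for such a helper is sketched below. The name `lenient->jsexpr` and the particular conversions it performs (string keys to symbols, void to `'null`) are our own assumptions for illustration. They aren't part of jsonic or the `json` library.

```racket
#lang racket/base
(require json)

;; The rules in action: string keys are rejected, 'null is accepted.
(jsexpr? (hash "a" 1)) ; #f
(jsexpr? 'null)        ; #t

;; Hypothetical helper that widens what we accept:
;; - hash tables with string keys get symbol keys instead
;; - the void value becomes 'null
;; - lists are converted recursively
(define (lenient->jsexpr v)
  (cond
    [(hash? v)
     (for/hasheq ([(k val) (in-hash v)])
       (values (if (string? k) (string->symbol k) k)
               (lenient->jsexpr val)))]
    [(list? v) (map lenient->jsexpr v)]
    [(void? v) 'null]
    [else v]))

(jsexpr->string (lenient->jsexpr (hash "a" (list 1 (void))))) ; "{\"a\":[1,null]}"
```

To use something like this in the expander, we could wrap the datum in the `jsonic-sexp` template with a call to the helper before passing it to `jsexpr->string`.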
Before we test our expander, let’s check that everything is in the right place:
    #lang br/quicklang
    (require json)

    (define-macro (jsonic-mb PARSE-TREE)
      #'(#%module-begin
         (define result-string PARSE-TREE)
         (define validated-jsexpr (string->jsexpr result-string))
         (display result-string)))
    (provide (rename-out [jsonic-mb #%module-begin]))

    (define-macro (jsonic-char CHAR-TOK-VALUE)
      #'CHAR-TOK-VALUE)
    (provide jsonic-char)

    (define-macro (jsonic-program SEXP-OR-JSON-STR ...)
      #'(string-trim (string-append SEXP-OR-JSON-STR ...)))
    (provide jsonic-program)

    (define-macro (jsonic-sexp SEXP-STR)
      (with-pattern ([SEXP-DATUM (format-datum '~a #'SEXP-STR)])
        #'(jsexpr->string SEXP-DATUM)))
    (provide jsonic-sexp)
Now that we have a reader and expander for jsonic, let’s try using the language.