During the expansion phase of program evaluation, Racket wraps certain operations in an extra macro that acts as a hook for providing default behavior. These macros are known as interposition points. We can recognize them by their names, which are prefixed with #%.
For instance, #%module-begin. (The stacker tutorial introduces #%module-begin.) When Racket expands a module expression, it starts by wrapping the body of the expression in #%module-begin. So this:

```racket
(module name lang/expander
  body-exprs ···)
```
Becomes this:
```racket
(#%module-begin ;; imported from `lang/expander`
  body-exprs ···)
```
Evaluation continues from here, with #%module-begin taking the body-exprs ··· as input and returning a result.
If a needed interposition point isn’t available in the language, then the operation fails. This is why in our language tutorials, we always export a #%module-begin. Under the hood, every source file becomes a module expression. So every language also needs a #%module-begin, or evaluation will always fail.
In most language implementations, it suffices to export the default interposition points, because they give us the vanilla behavior we usually want. But they can be modified if we want to achieve certain special effects, as demonstrated below.
The easiest way to understand interposition points is to see how they affect the evaluation of a language. In particular, it’s useful to be able to recognize the errors that arise when they’re missing.
Suppose we start making a language in a source file called "lang.rkt":
```racket
#lang br
```
Don’t panic—it’s blank, except for the first line. In the same directory, we create a source file that invokes this language:
```racket
#lang reader "lang.rkt"
```
Don’t panic—it’s blank too. When we run this test file, we get an error:
```
read-syntax: cannot find reader for `#lang reader "lang.rkt"'
```

Though read-syntax is not an interposition point, it operates by a similar principle. When Racket runs the language, it automatically looks for the read-syntax function, and passes the source code as input, so it can be turned into a module expression. (The stacker tutorial introduces read-syntax.)
Thus, we can fix this error by adding a simple read-syntax function:
```racket
#lang br
(provide read-syntax)
(define (read-syntax path port)
  (datum->syntax #f
    `(module lang-mod "lang.rkt"
       ,@(for/list ([datum (in-port read port)])
           datum))))
```
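To see what this function hands to Racket, here is a minimal sketch in plain racket/base rather than br. The name demo-read-syntax and the sample source string are assumptions made for illustration; the body is otherwise the same as above.

```racket
#lang racket/base

;; demo-read-syntax is a hypothetical stand-in for the read-syntax above.
;; It reads every datum from the port and splices them into a module form.
(define (demo-read-syntax path port)
  (datum->syntax
   #f
   `(module lang-mod "lang.rkt"
      ,@(for/list ([datum (in-port read port)])
          datum))))

;; Feed it a made-up source string and strip the syntax wrapper
;; so the resulting module expression can be inspected as plain data.
(define module-datum
  (syntax->datum
   (demo-read-syntax "test.rkt" (open-input-string "42.0 (- 42 10)"))))
```

Here module-datum is the list `(module lang-mod "lang.rkt" 42.0 (- 42 10))`: each datum read from the port has become part of the module body.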
This time, when we run our test file:
```racket
#lang reader "lang.rkt"
```
We get two new errors:
```
module: no #%module-begin binding in the module's language
  in: (module lang-mod "lang.rkt")
Interactions disabled: "lang.rkt" does not support a REPL (no #%top-interaction)
```
Strictly speaking, the first error is being reported by Racket itself (as it tries to run our tiny language), and the second by DrRacket (as it tries to create a REPL prompt). Notice that both errors arise from a missing interposition point: the first from #%module-begin, and the second from #%top-interaction.
We can fix these errors the same way we fixed the missing read-syntax: by exporting these identifiers from our language. We can write our own version of each missing macro, or we can just export an existing one.

For #%module-begin, let’s export the existing one:

```racket
#lang br
(provide read-syntax)
(define (read-syntax path port)
  (datum->syntax #f
    `(module lang-mod "lang.rkt"
       ,@(for/list ([datum (in-port read port)])
           datum))))

(provide #%module-begin)
```
When we run our test file again, we only get one error:
```
Interactions disabled: "lang.rkt" does not support a REPL (no #%top-interaction)
```
#%top-interaction implicitly wraps every expression that’s entered at the REPL. Therefore, if we want the REPL to be available in our language, we need to export #%top-interaction. As before, it suffices to reuse the #%top-interaction already available:
```racket
#lang br
(provide read-syntax)
(define (read-syntax path port)
  (datum->syntax #f
    `(module lang-mod "lang.rkt"
       ,@(for/list ([datum (in-port read port)])
           datum))))

(provide #%module-begin #%top-interaction)
```
This time, when we run our test file:
```racket
#lang reader "lang.rkt"
```
We don’t get an error.
Unlike #%module-begin, there’s rarely a useful reason to write a special #%top-interaction. If we want to customize the REPL for a language, we use other tools. (See the third BASIC tutorial for an example.) But #%top-interaction is still a prerequisite for booting the REPL.
Now let’s put a value in our test file:
```racket
#lang reader "lang.rkt"
42.0
```
When we run this file, we get a new error:
```
?: literal data is not allowed;
 no #%datum syntax transformer is bound in: 42.0
```
#%datum is the interposition point that wraps every instance of literal data in a program, like numbers and strings. So 42.0 becomes (#%datum . 42.0), and "str" becomes (#%datum . "str").
If we want to be able to use literal data in a program, we need to export #%datum:
```racket
#lang br
(provide read-syntax)
(define (read-syntax path port)
  (datum->syntax #f
    `(module lang-mod "lang.rkt"
       ,@(for/list ([datum (in-port read port)])
           datum))))

(provide #%module-begin #%top-interaction #%datum)
```
With this change, our test file no longer raises an error, and instead prints our value as usual:
```
42.0
```
We can customize #%datum if we want to affect the evaluation of literal values. For example, in the BASIC tutorial, we converted inexact integers to exact integers in the expander. We could move that housekeeping into #%datum:
```racket
#lang br
(provide read-syntax)
(define (read-syntax path port)
  (datum->syntax #f
    `(module lang-mod "lang.rkt"
       ,@(for/list ([datum (in-port read port)])
           datum))))

(provide #%module-begin #%top-interaction)

(provide (rename-out [new-datum #%datum]))
(define-macro (new-datum . D)
  (with-pattern ([NEW-D (let ([val (syntax->datum #'D)])
                          (if (and (integer? val)
                                   (inexact? val))
                              (inexact->exact val)
                              val))])
    #'(#%datum . NEW-D)))
```
Notice that we use the same basic technique as when we customize #%module-begin. First, we write our new macro with a different name. Second, we pass our customized result to the default #%datum. Third, when we export our macro, we use rename-out to change its name to the expected #%datum.
This time, when we run our test file:
```racket
#lang reader "lang.rkt"
42.0
```
The literal value 42.0 gets converted to an exact integer:
```
42
```
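The same #%datum customization can be sketched without br, using only racket/base. This version is an assumption-laden illustration, not the tutorial's code: it packs the language and a test program into submodules of one file instead of using `#lang reader`, and the names exact-lang, client, and result are hypothetical.

```racket
#lang racket/base

;; exact-lang is a hypothetical module language. It exports the default
;; interposition points, except #%datum, which is replaced by new-datum.
(module exact-lang racket/base
  (require (for-syntax racket/base))
  (provide #%module-begin #%top-interaction #%app
           define provide
           (rename-out [new-datum #%datum]))
  (define-syntax (new-datum stx)
    (syntax-case stx ()
      [(_ . d)
       (let ([val (syntax->datum #'d)])
         ;; Convert inexact integers to exact; leave other data alone.
         (with-syntax ([new-d (if (and (integer? val) (inexact? val))
                                  (inexact->exact val)
                                  val)])
           ;; Hand the result to racket/base's default #%datum.
           #'(#%datum . new-d)))])))

;; A program written in exact-lang. The literal 42.0 is implicitly
;; wrapped as (#%datum . 42.0), which invokes new-datum.
(module client (submod ".." exact-lang)
  (provide result)
  (define result 42.0))

(require 'client)
```

After this runs, result should hold the exact integer 42 rather than 42.0, because the literal passed through new-datum during expansion.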
Now suppose we want to accomplish the epic feat of subtracting two integers:
```racket
#lang reader "lang.rkt"
(- 42 10)
```
Unfortunately, it doesn’t work:
```
-: unbound identifier;
 also, no #%app syntax transformer is bound in: -
```
The “unbound identifier” part of the error can be fixed by exporting - from our language (we also reset #%datum to its vanilla meaning):
```racket
#lang br
(provide read-syntax)
(define (read-syntax path port)
  (datum->syntax #f
    `(module lang-mod "lang.rkt"
       ,@(for/list ([datum (in-port read port)])
           datum))))

(provide #%module-begin #%top-interaction #%datum -)
```
With that change, we get a more descriptive error:
```
-: function application is not allowed;
 no #%app syntax transformer is bound in: (- 42 10)
```
Just as #%datum is the interposition point that wraps every instance of literal data, #%app is the form that wraps every function application. So (- 42 10) becomes (#%app - 42 10). Of course, in this example, the integers are still wrapped by #%datum, so the full transformation ends up as (#%app - (#%datum . 42) (#%datum . 10)).
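Because racket/base exports the default #%app and #%datum, the wrapped form can also be written out by hand. The following is a small sketch in plain racket/base; the names implicit and explicit are just labels chosen for this illustration.

```racket
#lang racket/base

;; The ordinary expression: the expander inserts #%app and #%datum for us.
(define implicit (- 42 10))

;; The same expression with the interposition points spelled out.
(define explicit (#%app - (#%datum . 42) (#%datum . 10)))
```

Both definitions produce the same value, since the second is what the first turns into during expansion.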
We can customize #%app if we want to affect how functions are applied to their arguments. For instance, by default function arguments are evaluated from left to right. We can reverse this behavior by customizing #%app:
```racket
#lang br
(provide read-syntax)
(define (read-syntax path port)
  (datum->syntax #f
    `(module lang-mod "lang.rkt"
       ,@(for/list ([datum (in-port read port)])
           datum))))

(provide #%module-begin #%top-interaction #%datum -)

(provide (rename-out [new-app #%app]))
(define-macro (new-app ID . ARGS)
  (with-pattern ([SGRA (reverse (syntax->list #'ARGS))])
    #'(#%app ID . SGRA)))
```
This version of #%app reverses the order of arguments, and then calls the default #%app. With this change, (- 42 10) instead behaves like (- 10 42):
```racket
#lang reader "lang.rkt"
(- 42 10)
```

```
-32
```
Alternatively, if we edit our language to use the default #%app:
```racket
#lang br
(provide read-syntax)
(define (read-syntax path port)
  (datum->syntax #f
    `(module lang-mod "lang.rkt"
       ,@(for/list ([datum (in-port read port)])
           datum))))

(provide #%module-begin #%top-interaction #%datum - #%app)
```
We get the usual result:
```
32
```
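For comparison, the argument-reversing #%app can be sketched in racket/base alone. As a simplifying assumption, the language and the program that uses it live in submodules of a single file rather than in separate files loaded through `#lang reader`; reversed-lang, client, and result are hypothetical names.

```racket
#lang racket/base

;; reversed-lang is a hypothetical module language whose #%app
;; reverses the arguments before delegating to the default #%app.
(module reversed-lang racket/base
  (require (for-syntax racket/base))
  (provide #%module-begin #%top-interaction #%datum
           define provide -
           (rename-out [new-app #%app]))
  (define-syntax (new-app stx)
    (syntax-case stx ()
      [(_ f . args)
       (with-syntax ([sgra (reverse (syntax->list #'args))])
         ;; #%app here is racket/base's version, so there is no loop.
         #'(#%app f . sgra))])))

;; In this program, (- 42 10) is implicitly (#%app - 42 10),
;; which new-app rewrites to apply - to 10 and then 42.
(module client (submod ".." reversed-lang)
  (provide result)
  (define result (- 42 10)))

(require 'client)
```

With the custom #%app in place, result should be -32; exporting racket/base's own #%app instead of new-app would restore the usual 32.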
To complete our tour of interposition points, let’s swap a variable into our math expression:
```racket
#lang reader "lang.rkt"
(- 42 x)
```
This time, we get a new error:
```
x: unbound identifier;
 also, no #%top syntax transformer is bound in: x
```
Every identifier in a program without a local binding (e.g., from let) gets wrapped with #%top, which signals “this identifier uses the current top-level binding”. #%top doesn’t know or care whether that binding exists.
It’s true, however, that every unbound identifier gets wrapped with #%top. (This stands to reason: if the identifier had some local binding, it wouldn’t be unbound.) So we can use #%top to do something about unbound identifiers:

```racket
#lang br
(provide read-syntax)
(define (read-syntax path port)
  (datum->syntax #f
    `(module lang-mod "lang.rkt"
       ,@(for/list ([datum (in-port read port)])
           datum))))

(provide #%module-begin #%top-interaction #%datum - #%app)
(provide define (rename-out [my-top #%top]))
(define-macro (my-top . ID)
  (if (identifier-binding #'ID)
      #'ID
      #'25))
```
In this case, we redefine #%top to check if the identifier is bound (with identifier-binding). If it’s bound, we pass it through. If not, we substitute the value 25.
When we run our test file again, the undefined x is converted to 25:

```racket
#lang reader "lang.rkt"
(- 42 x)
```

```
17
```

Suppose we insert an explicit binding for x with define, for instance (define x 18) on the line before (- 42 x).
Because x is no longer unbound, the result changes:
```
24
```
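The defaulting #%top can likewise be sketched in racket/base. This is an illustration under stated assumptions: the language and test program are submodules of one file, and the names default-25-lang, client, y, unbound-result, and bound-result are hypothetical. The value 18 is chosen only so that the bound case yields 24.

```racket
#lang racket/base

;; default-25-lang is a hypothetical module language whose #%top
;; substitutes 25 for any identifier that has no binding.
(module default-25-lang racket/base
  (require (for-syntax racket/base))
  (provide #%module-begin #%top-interaction #%datum #%app
           define provide -
           (rename-out [my-top #%top]))
  (define-syntax (my-top stx)
    (syntax-case stx ()
      [(_ . id)
       (if (identifier-binding #'id)
           #'id      ; bound: pass the identifier through
           #'25)]))) ; unbound: substitute the default value

(module client (submod ".." default-25-lang)
  (provide unbound-result bound-result)
  (define y 18)
  ;; x has no binding, so it is wrapped as (#%top . x) and becomes 25.
  (define unbound-result (- 42 x))
  ;; y is bound by define, so it refers to its definition as usual.
  (define bound-result (- 42 y)))

(require 'client)
```

In this sketch, unbound-result should be 17 and bound-result should be 24, mirroring the two runs above.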
But as with our other interposition points, if we just want the usual behavior of #%top, we can export the existing version:
```racket
#lang br
(provide read-syntax)
(define (read-syntax path port)
  (datum->syntax #f
    `(module lang-mod "lang.rkt"
       ,@(for/list ([datum (in-port read port)])
           datum))))

(provide #%module-begin #%top-interaction #%datum - #%app #%top)
```
Finally, if we revert to our original test case:
```racket
#lang reader "lang.rkt"
(- 42 x)
```
We get the usual error:
```
x: unbound identifier in module in: x
```
The br/quicklang dialect automatically exports the default versions of #%top, #%app, #%datum, and #%top-interaction. Why? Because in practice, these forms are rarely customized. But if you’re implementing a language starting from something other than br/quicklang (say, racket/base) then you’ll need to handle this housekeeping yourself.