Beautiful Racket: Hygiene

Beautiful Racket / explainers


Determine the bindings according to what the identifiers mean at the place where the macro was defined (aka the definition site). This is Racket’s default policy, and part of what is implied by hygiene.

Determine the bindings according to what the identifiers mean at the place where the macro was invoked (aka the calling site). Sometimes this is the behavior we want, so Racket lets us “break hygiene” when we need to.
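The difference between the two policies shows up as soon as the calling site has its own binding with the same name. The sketch below uses plain Racket's define-syntax rather than define-macro, so it runs without extra packages; the names mac and result are only for illustration.

```racket
#lang racket/base
(require (for-syntax racket/base))

(define x 42) ; binding at the macro-definition site

;; mac expands to a reference to x. Under the default (hygienic) policy,
;; that x means whatever x meant where mac was defined.
(define-syntax (mac stx)
  #'x)

;; The calling site has its own x, but the macro-generated x ignores it.
(define result
  (let ([x 84])
    (mac)))

(println result) ; 42
```

Under the calling-site policy, the same invocation would have produced 84 instead.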


Code produced by a macro adopts the lexical context of the macro-definition site. Therefore, this code can only rely on identifiers that have bindings at the definition site. Below, the mac macro can create code that refers to x, because x has a binding at the macro-definition site (albeit outside the macro):
(define x 42)
(define-macro (mac)
  #'(println x))
(mac) ; 42

Within code produced by the macro, new bindings can shadow existing bindings at the definition site (similar to how let works). For example, this updated mac macro defines its own x, which overrides the x defined outside the macro, so the result is now 84:
(define x 42)
(define-macro (mac)
  #'(begin
      (define x 84)
      (println x)))
(mac) ; 84

Bindings introduced by a macro are only visible to other code produced by that macro. In the example below, one x is defined outside the mac macro; another is defined inside. The (println x) generated by the macro refers to the x defined by the macro. The (println x) outside the macro refers to the x defined outside the macro. So we end up with two identifiers called x with different values:
(define x 42)
(define-macro (mac)
  #'(begin
      (define x 84)
      (println x)))
(mac) ; 84
(println x) ; 42

The corollary to this rule is that bindings introduced by a macro are not visible outside the macro. Every early-stage Racketeer tries to write a variation of the define-x macro below, and is flummoxed by the error:
(define-macro (define-x)
  #'(define x 42))
(define-x)
(println x) ; error: unbound identifier

But it’s not surprising: the x defined inside the macro lives in a separate lexical context, so the (println x) outside the macro can’t see it. If this still annoys you, consider an analogous example with let, which won’t surprise anyone:
(let ([x 42])
  x)
(println x) ; error: unbound identifier

Every identifier retains its binding from its original lexical context. (This is accomplished by using syntax objects throughout the macro system.) Here, we pass the outer x to our macro as an argument and assign it to the pattern variable OUTER-X. When OUTER-X appears in our macro code, it still refers to the x defined outside the macro, while the other x refers to the one defined inside the macro:
(define x 42)
(define-macro (mac OUTER-X)
  #'(begin
      (define x 84)
      (println x)
      (println OUTER-X)))
(mac x)
84
42
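
The same rule also protects code passed into a macro: it cannot be accidentally captured by bindings the macro introduces. A minimal sketch in plain Racket, where my-or and tmp are hypothetical names chosen for illustration:

```racket
#lang racket/base

;; my-or introduces a temporary binding named tmp in its expansion.
(define-syntax-rule (my-or A B)
  (let ([tmp A])
    (if tmp tmp B)))

(define tmp 5) ; the caller happens to use the same name

;; The tmp passed as B keeps its calling-site binding (5), even though it
;; ends up inside the macro's (let ([tmp ...]) ...).
(define result (my-or #f tmp))

(println result) ; 5
```

If identifiers did not retain their original bindings, the caller's tmp would be shadowed by the macro's tmp, and the result would be #f.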

(define-macro (define-$ (ID ARG ...) BODY ...)
  (define id$-datum (format-datum '~a$ (syntax->datum #'ID)))
  (with-pattern ([ID$ (datum->syntax #'ID id$-datum)])
    #'(define (ID$ ARG ...)
        BODY ...)))
(define-$ (f x) (* x x))
(f$ 5) ; 25

The unhygienic identifier is created next, with datum->syntax. The first argument of datum->syntax is the lexical context for the identifier being created. In this case, we use #'ID because it came from the calling site, and therefore has the lexical context we want to borrow. The second argument is our datum. We match this new identifier to the pattern variable ID$, so we can use it in the syntax template below. Within the template, we use ID$ in the name position of a standard define form.
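
The same technique would let the earlier define-x macro work the way its author intended. This is a sketch in plain Racket (define-syntax and with-syntax instead of define-macro and with-pattern). Because this macro takes no arguments to borrow context from, it assumes that borrowing the lexical context of the whole macro invocation, stx, is acceptable:

```racket
#lang racket/base
(require (for-syntax racket/base))

(define-syntax (define-x stx)
  ;; Build the identifier x with the lexical context of the calling form
  ;; (stx), so it behaves as if it had been written at the calling site.
  (with-syntax ([X (datum->syntax stx 'x)])
    #'(define X 42)))

(define-x)
(println x) ; 42
```

Because the new x carries calling-site context, the (println x) outside the macro can now see it.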


Lexical scope in the Racket Guide

Macro-introduced bindings in the Racket Reference
