A syntax object is a data structure native to Racket that holds everything you’d want to know about a piece of source code. Syntax objects are used extensively within Racket’s macro system. For instance, every macro takes a syntax object as input and returns another syntax object as output.
A syntax object is not an “object” in the sense of object-oriented programming. Rather, it’s just a data object, similar to a hash table.
At minimum, a syntax object consists of:
A datum that represents the literal code as it would appear in a source file. A syntax object can be flattened into a datum with syntax->datum:
```racket
(syntax->datum #'foo) ; 'foo
(syntax->datum #'(+ 1 2 3)) ; '(+ 1 2 3)
```
Metadata about the code, most importantly its lexical context and source location. Most of these metadata fields have accessor functions:
```racket
(define stx #'foo)
(syntax-line stx) ; 2
(syntax-column stx) ; 14
(syntax-span stx) ; 3
(syntax-srcloc stx) ; (srcloc 'unsaved-editor 2 14 24 3)
```
Optionally, a syntax object can also contain:
Syntax properties that are arbitrary key–value pairs, which are written and read with syntax-property:
```racket
(define stx #'foo)
(define stx+prop (syntax-property stx 'hello "world"))
(syntax? stx+prop) ; #t
(syntax-property stx+prop 'hello) ; "world"
```
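One detail worth seeing in code: `syntax-property` with three arguments returns a new syntax object rather than changing the one it was given. A minimal sketch, assuming plain `#lang racket`; the names `plain` and `tagged` are illustrative only:

```racket
#lang racket
;; syntax-property with three arguments returns a NEW syntax object;
;; the original is left untouched.
(define plain #'foo)
(define tagged (syntax-property plain 'hello "world"))

(syntax-property tagged 'hello) ; "world"
(syntax-property plain 'hello)  ; #f, because the original has no such key
```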
Other syntax objects, in place of any subdatum. These nested syntax objects retain their original syntax properties and metadata, including lexical context.
For instance, this macro assigns its argument to the pattern variable OUTER-OP, which then holds a reference to a syntax object containing the argument. The macro then uses OUTER-OP within a larger syntax object. This syntax object ends up with two variables named +: one that’s defined inside of the macro, and the other (represented by OUTER-OP) that’s defined outside. But they can coexist peacefully because they retain separate lexical contexts:
```racket
(define-macro (nonplussed OUTER-OP)
  #'(begin
      (define + -)
      (list (+ 21 21) (OUTER-OP 21 21))))
(syntax->datum (expand-once #'(nonplussed +)))
; '(begin (define + -) (list (+ 21 21) (+ 21 21)))
(nonplussed +) ; '(0 42)
```
Be cautious of using syntax->datum indiscriminately, as it discards all the metadata and properties from any nested syntax objects. Usually, it’s wiser to preserve this information.
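To make that loss concrete, here is a small sketch, assuming plain `#lang racket`. The property key `'note` and the names `inner`, `outer`, `nested`, and `flat` are made up for illustration:

```racket
#lang racket
;; Attach a property to a syntax object, then nest it inside a larger one.
(define inner (syntax-property #'x 'note "keep me"))
(define outer #`(f #,inner))

;; Pulled out with syntax->list, the nested object still has its property.
(define nested (cadr (syntax->list outer)))
(syntax-property nested 'note) ; "keep me"

;; After syntax->datum, only a plain symbol remains; the property is gone.
(define flat (syntax->datum outer))
flat                 ; '(f x)
(symbol? (cadr flat)) ; #t
```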
To create a syntax object in the current lexical context, simply wrap a datum with syntax, or equivalently, prefix the datum with #'. For instance, these syntax objects are the same (except for their source location):
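A minimal sketch of that equivalence, assuming plain `#lang racket`; the names `long-form` and `short-form` are illustrative only:

```racket
#lang racket
;; The syntax form and the #' prefix are two spellings of the same thing.
(define long-form (syntax (+ 1 2)))
(define short-form #'(+ 1 2))

(syntax? long-form)  ; #t
(syntax? short-form) ; #t
;; Same datum inside; only the recorded source locations differ.
(equal? (syntax->datum long-form) (syntax->datum short-form)) ; #t
```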
The syntax prefix #' corresponds logically to the ' prefix used to create a datum. Just as a syntax object is a datum with metadata attached, the syntax prefix #' is the datum prefix with an extra character attached.
```racket
'(+ 1 2) ; datum '(+ 1 2)
#'(+ 1 2) ; syntax object with datum '(+ 1 2) inside
(syntax->datum #'(+ 1 2)) ; '(+ 1 2)
```
To create a syntax object within a different lexical context, use datum->syntax. The first argument is a syntax object that carries the target lexical context, and the second argument is a datum. The result is a syntax object that will behave as if it were created in the other lexical context.
This is a common way of circumventing macro hygiene. Below, make-z creates an identifier z with the lexical context of OUTSIDE-ID, so z behaves as if it had been defined in the same lexical context as OUTSIDE-ID (that is, outside the macro):
```racket
(define-macro (make-z OUTSIDE-ID)
  (with-pattern ([Z-ID (datum->syntax #'OUTSIDE-ID 'z)])
    #'(define Z-ID 42)))
(make-z out-here)
z ; 42
```
This macro combines two syntax objects: the identifier z returned by datum->syntax, which is matched to the pattern variable Z-ID, and the larger syntax template #'(define Z-ID 42) in which Z-ID is then used.
Without this maneuver, z is created in the macro’s lexical context (which is separate) and thus is not visible outside, due to the protections of hygiene:
```racket
(define-macro (make-z OUTSIDE-ID)
  #'(define z 42))
(make-z out-here)
z ; error: unbound identifier
```
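For readers working without the `br` helpers, the hygiene-bending version of make-z can be sketched in plain `#lang racket`, assuming `define-syntax`, `syntax-case`, and `with-syntax` stand in for `define-macro` and `with-pattern`. The lowercase pattern-variable names here are a stylistic assumption, not part of the original example:

```racket
#lang racket
;; Plain-Racket sketch of make-z: build the identifier z with the
;; lexical context of the caller-supplied identifier.
(define-syntax (make-z stx)
  (syntax-case stx ()
    [(_ outside-id)
     (with-syntax ([z-id (datum->syntax #'outside-id 'z)])
       #'(define z-id 42))]))

(make-z out-here)
z ; 42
```

Here `out-here` is never evaluated; it only lends its lexical context to the new identifier.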
Any syntax object printed to the DrRacket REPL will be displayed with an arrow on the left. Clicking the arrow will reveal a box with the syntax object datum on the left, and metadata and properties on the right. Clicking on code within the datum on the left will reveal information about nested syntax objects. For instance, try running this in DrRacket:
```racket
(define-macro (nonplussed OUTER-+)
  #'(begin
      (define + -)
      (list (+ 21 21) (OUTER-+ 21 21))))
(expand-once #'(nonplussed +))
```
expand-once will reveal the syntax object returned by the (nonplussed +) macro and print it on the REPL. Click on the arrow to reveal the details:
Click each of the two + signs in the third row of the datum and notice that they have different line numbers (because the second retains its source location from outside the nonplussed macro).
The easiest way to deconstruct a syntax object is with a syntax pattern, which offers regexp-style matching to break a syntax object into pieces. Forms like define-macro and define-macro-cases use syntax patterns to match the input arguments. Forms like with-pattern can be used inside a macro to further disassemble syntax objects and rearrange them:
```racket
(define-macro (rearrange XS)
  (with-pattern ([(1ST 2ND 3RD) #'XS])
    #'(list 3RD 2ND 1ST 3RD)))
(rearrange (10 20 30)) ; '(30 20 10 30)
```
By the way, when a pattern variable appears in a syntax pattern, like XS in (rearrange XS) or 1ST or 2ND or 3RD, it doesn’t have a #' prefix, because it doesn’t yet represent a standalone syntax object. But when that pattern variable appears in a syntax template on the right side of a pattern clause or in the body, like #'XS, it is acting as a syntax object, so it needs the syntax prefix (and the XS is implicitly replaced with the matched value).
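The same distinction between pattern and template can be observed outside a macro. This is a sketch assuming plain `#lang racket`, where `syntax-case` plays the role of `with-pattern`; the pattern-variable names `a`, `b`, and `c` are illustrative:

```racket
#lang racket
;; In the pattern (a b c), the variables appear bare.
;; In the template, they appear under #' and are replaced by their matches.
(define input #'(10 20 30))
(define rearranged
  (syntax-case input ()
    [(a b c) #'(list c b a c)]))

(syntax->datum rearranged) ; '(list 30 20 10 30)
```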
For finer control, syntax objects can be dismantled manually, much like lists, though some extra housekeeping is necessary to use them with standard list operations. syntax->list will turn a single syntax object with a list-shaped datum into a list of syntax objects:
```racket
(syntax->list #'(10 20 30))
; '(#<syntax:2:17 10> #<syntax:2:20 20> #<syntax:2:23 30>)
```
These smaller syntax objects can then be manipulated with ordinary list operations to build new syntax objects. Below, we use reverse and list to reverse our input:
```racket
(define-macro (rev ARGS)
  (define arg-stxs (syntax->list #'ARGS))
  (with-pattern ([(REVERSED-ARG ...) (reverse arg-stxs)])
    #'(list REVERSED-ARG ...)))
(rev (10 20 30)) ; '(30 20 10)
```
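The manual route can also be tried at run time, without defining a macro. A minimal sketch, assuming plain `#lang racket`; it reassembles the pieces with `datum->syntax`, reusing the original object for both lexical context and source location, which is one reasonable choice among several:

```racket
#lang racket
(define stx #'(10 20 30))

;; Break the list-shaped syntax object into a list of syntax objects.
(define parts (syntax->list stx))
(map syntax->datum parts) ; '(10 20 30)

;; Rearrange with ordinary list operations, then wrap the result again.
(define reversed (datum->syntax stx (reverse parts) stx))
(syntax->datum reversed) ; '(30 20 10)

;; syntax->list returns #f when the datum is not list-shaped.
(syntax->list #'foo) ; #f
```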
Just as the syntax prefix #' is the syntaxed version of the basic ' prefix, the quasiquote operators (` , and ,@) also have quasisyntax equivalents (#` #, and #,@) that can build syntax objects in an analogous way. (See Lists for an introduction to quasiquote.)
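The quasisyntax operators can be sketched as follows, assuming plain `#lang racket`; the names `head` and `rest` are illustrative only:

```racket
#lang racket
(define head #'42)
(define rest (list #'1 #'2 #'3))

;; #` starts a template, #, inserts one syntax object,
;; and #,@ splices in a list of them.
(define stx #`(+ #,head #,@rest))

(syntax? stx)       ; #t
(syntax->datum stx) ; '(+ 42 1 2 3)
```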
Syntax properties are used to attach extra metadata fields to a syntax object. For instance, the read-syntax function in #lang br attaches a 'paren-shape property to indicate whether the source code used square or curly brackets to delimit a list:
```racket
(define stx1 (read-syntax #f (open-input-string "(1 2 3)")))
(syntax-property stx1 'paren-shape) ; #f
(define stx2 (read-syntax #f (open-input-string "[1 2 3]")))
(syntax-property stx2 'paren-shape) ; #\[
(define stx3 (read-syntax #f (open-input-string "{1 2 3}")))
(syntax-property stx3 'paren-shape) ; #\{
```
Why? Because parentheses, brackets, and braces are syntactically equivalent. With a syntax property, we can preserve the extra information without affecting the meaning of the syntax object. In this case, if we wanted to convert the syntax back to a string, we could still recover the original shapes:
```racket
(require syntax/to-string)
(define stx1 (read-syntax #f (open-input-string "((1))")))
(syntax->string stx1) ; "(1)"
(define stx2 (read-syntax #f (open-input-string "([2])")))
(syntax->string stx2) ; "[2]"
(define stx3 (read-syntax #f (open-input-string "({3})")))
(syntax->string stx3) ; "{3}"
```
More broadly, syntax properties are useful because macros have a limited interface: they take one syntax object and return another syntax object. Syntax properties can be used to stash other information so that macros can communicate with each other without affecting the input and output. In this example, a base macro echo is used by the twice and thrice macros with a 'count argument stashed as a syntax property, which echo uses to change its result:
```racket
(require (for-syntax racket/list))
(define-macro (echo ARG)
  (define how-many (or (syntax-property #'ARG 'count) 1))
  (with-pattern ([ARGS (make-list how-many #'ARG)]
                 [ARGLIST (cons #'list #'ARGS)])
    #'ARGLIST))
(echo "foo") ; '("foo")

(define-macro (twice ARG)
  (with-pattern ([ARGPLUS (syntax-property #'ARG 'count 2)])
    #'(echo ARGPLUS)))
(twice "bar") ; '("bar" "bar")

(define-macro (thrice ARG) ; quasisyntax notation, equivalent to above
  #`(echo #,(syntax-property #'ARG 'count 3)))
(thrice "zam") ; '("zam" "zam" "zam")
```
Syntax objects are the foundation of hygiene in Racket, which is what makes its sophisticated macro system possible. Hygiene is the idea that a code fragment floats within a bubble of bindings—provided by its lexical context—thereby permitting clean, predictable interactions between those fragments. The syntax object creates that bubble by associating a lexical context with a code fragment. Strings, by contrast, can capture the code fragment, but not the lexical context.
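A classic way to see that bubble at work is a macro whose internal temporary shares a name with a variable at the use site. The sketch below assumes plain `#lang racket` and `define-syntax-rule`; the macro `swap!` is a standard textbook illustration rather than something taken from this article:

```racket
#lang racket
;; swap! uses an internal temporary named tmp.
(define-syntax-rule (swap! a b)
  (let ([tmp a])
    (set! a b)
    (set! b tmp)))

;; The caller also has a variable named tmp. Because each tmp carries
;; its own lexical context, the two never collide.
(define tmp 1)
(define other 2)
(swap! tmp other)
(list tmp other) ; '(2 1)
```

With string-based substitution, the macro's `tmp` would shadow the caller's `tmp`, and the swap would silently do nothing.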
“If syntax objects are so great, why don’t other languages have them?” Those languages don’t have hygienic macros, so there’s no need. Instead, many languages have an eval or exec function that will treat a string as source code. For instance, in Python:
```python
>>> eval("6 + 7 * len('this string')")
83
```
You can perform an analogous operation on a Racket string with eval, using make-base-namespace to set up the default bindings:
```racket
(define str "(+ 6 (* 7 (string-length \"this string\")))")
(eval (format-datum '~a str)
      (make-base-namespace))
; 83
```
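If the `br` helper `format-datum` is not available, the string can be converted to a datum with the standard `read` function instead. A sketch assuming plain `#lang racket`; the name `datum` is illustrative:

```racket
#lang racket
(define str "(+ 6 (* 7 (string-length \"this string\")))")

;; Parse the string into a datum, then evaluate it in a fresh
;; namespace that has the racket/base bindings.
(define datum (read (open-input-string str)))
(eval datum (make-base-namespace)) ; 83
```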
But a syntax object will always be more idiomatic and compact:
```racket
(define stx #'(+ 6 (* 7 (string-length "this string"))))
(eval stx) ; 83
```
Syntax objects in the Racket Guide
Syntax-object content in the Racket Reference
Syntax-object properties in the Racket Reference