A syntax pattern is a tool for matching elements within a syntax object. Syntax patterns are used extensively in macros, especially for separating the input into named pieces so they can be rearranged. In that way, a syntax pattern does for a syntax object what a regular expression does for a string.
For instance, the macro m below accepts only the literal identifier foo as its argument:
    (define-macro (m foo) #'"match")
    (m foo) ; "match"
    (m bar) ; error: no matching case for pattern
By contrast, m2, defined with define-macro-cases, matches zero arguments, a single literal foo identifier, or a literal foo followed by anything:
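A minimal sketch of what such an m2 could look like, assuming the br library is installed (`#lang br`). The strings returned by each case are illustrative placeholders, not taken from the original listing.

```racket
#lang br
;; Sketch of m2: three cases, tried in order. Result values are hypothetical.
(define-macro-cases m2
  [(m2) "zero args"]        ; no arguments at all
  [(m2 foo) "literal foo"]  ; exactly the identifier foo
  [(m2 foo ARG) #'ARG])     ; foo followed by anything, which is returned

(m2)         ; "zero args"
(m2 foo)     ; "literal foo"
(m2 foo 42)  ; 42
;; (m2 bar)  ; uncommenting this raises a no-matching-case error
```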
Syntax patterns cooperate closely with syntax templates. A syntax template is an expression that creates a syntax object. In a syntax template, internal references to pattern variables created by earlier syntax patterns are automatically replaced with their underlying matched value. Within a macro definition or with-pattern expression, any datum wrapped in a syntax expression (or, equivalently, prefixed with #') is treated as a syntax template.
Syntax patterns are often used throughout a macro to destructure syntax objects and syntax templates. For instance, this m3 macro contains three syntax patterns and three syntax templates:
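Here is a sketch of m3, pieced together from the patterns and templates itemized below and assuming `#lang br`. The sample input and its result are illustrative.

```racket
#lang br
;; m3 as described in the text: three syntax patterns, three syntax templates.
(define-macro (m3 MID ... LAST)
  (with-pattern ([(ONE TWO THREE) (syntax LAST)] ; destructure the last argument
                 [(ARG ...) #'(MID ...)])        ; rename the middle arguments
    #'(list ARG ... THREE TWO ONE)))

(m3 25 42 ("foo" "bar" "zam")) ; '(25 42 "zam" "bar" "foo")
```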
(m3 MID ... LAST) is a syntax pattern that defines the possible input arguments to the macro, and matches them to pattern variables.
(syntax LAST) is a syntax template containing only the element matched by LAST. We could also write this as #'LAST. The contents of this element are then matched to another syntax pattern, (ONE TWO THREE).
#'(MID ...) is a syntax template containing the elements matched by MID .... We could also write this as (syntax (MID ...)). These elements are matched to the syntax pattern (ARG ...).
#'(list ARG ... THREE TWO ONE) is a syntax template containing the matched elements inside a list.
Of course, for a syntax pattern to produce a match, the input syntax has to conform to the pattern. For instance, the syntax pattern (ONE TWO THREE) needs LAST to be a list of three elements. If it isn’t, an error arises:
(m3 25 42 ("foo" "bar"))
with-pattern: unable to match pattern (ONE TWO THREE) in: ("foo" "bar")
A syntax pattern can have five possible ingredients:
A literal, which only matches itself. Numbers, strings, and symbols are always literals. In define-macro, define-macro-cases, and with-pattern, identifiers that are not in UPPERCASE are treated as literals. (In standard Racket, you need to list out literals separately; see, e.g., syntax-case. This is a chore that the br syntax functions handle automatically.)
    (define-macro-cases num
      [(num 42) "match"]
      [else "nope"])
    (num 42) ; "match"
    (num 24) ; "nope"
    (num "foo") ; "nope"

    (define-macro-cases str
      [(str "foo") "match"]
      [else "nope"])
    (str "foo") ; "match"
    (str foo) ; "nope"
    (str "bar") ; "nope"

    (define-macro-cases sym
      [(sym 'foo) "match"]
      [else "nope"])
    (sym 'foo) ; "match"
    (sym 'bar) ; "nope"
    (sym "foo") ; "nope"

    (define-macro-cases id
      [(id foo) "match"]
      [else "nope"])
    (id foo) ; "match"
    (id bar) ; "nope"
    (id "foo") ; "nope"
When matching literal identifiers, a trap awaits the unwary. A literal identifier in a syntax pattern is matched on the basis of its name and also its binding. (Racket jocks might know this as equality in the sense of free-identifier=?.) In the submodule below, mac looks like it will match the literal identifier zeta. But outside the submodule, when we import zeta from math and then pass it as input to mac, we don’t get a match:
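A sketch of that situation, assuming `#lang br` and that the math library is available. The submodule name mod is a hypothetical choice.

```racket
#lang br
;; Submodule (hypothetical name: mod) that defines mac with a literal zeta.
;; Inside mod, zeta has no binding.
(module mod br
  (provide mac)
  (define-macro-cases mac
    [(mac zeta) "match"]
    [else "nope"]))

;; Out here, zeta is bound by the math library.
(require (submod "." mod) math)
(mac zeta) ; "nope"
```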
Why not? Because the identifier zeta has no binding at the macro-definition site. (This, in turn, is because of macro hygiene: mac lives in a separate module, and can’t see the zeta binding at the calling site.) Even though the names of the literal identifiers are the same, their bindings are not. Therefore, the pattern doesn’t match, and the result is "nope".
We can make the two zeta identifiers match if we also make the same binding available at the macro-definition site, by importing math:
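A sketch of the repaired version, under the same assumptions (`#lang br`, math installed, hypothetical submodule name mod). The only change is the extra require inside the submodule.

```racket
#lang br
(module mod br
  (require math) ; now zeta has the same binding here as at the calling site
  (provide mac)
  (define-macro-cases mac
    [(mac zeta) "match"]
    [else "nope"]))

(require (submod "." mod) math)
(mac zeta) ; "match"
```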
A pattern variable (or wildcard), which can match anything (including a list of things) and assigns the matched item a name. Once a pattern variable is defined, all appearances of the pattern variable within a syntax template are replaced with the matched value.
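A small sketch, assuming `#lang br`. The macro names rev and twice are hypothetical.

```racket
#lang br
;; FIRST and SECOND are pattern variables: each matches exactly one item.
(define-macro (rev FIRST SECOND) #'(list SECOND FIRST))
(rev 1 2)            ; '(2 1)
(rev (+ 1 2) "foo")  ; '("foo" 3), because FIRST matched the whole list (+ 1 2)

;; Every appearance of ARG in the template gets the matched value.
(define-macro (twice ARG) #'(list ARG ARG))
(twice (* 2 3))      ; '(6 6)
```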
The special wildcard _ also matches anything, but it can be used any number of times in a syntax pattern, and it cannot appear in a syntax template. It’s useful for signaling that an element of the syntax datum is being ignored.
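For example, a sketch assuming `#lang br`, with a hypothetical macro named middle that keeps only its second argument:

```racket
#lang br
;; _ appears twice in the pattern; only KEEP is named and reused.
(define-macro (middle _ KEEP _) #'KEEP)
(middle 1 2 3)                  ; 2
(middle "a" (+ 1 2) (ignored))  ; 3, and the ignored elements are never evaluated
```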
A sublist pattern, which will only match elements arranged with the same parenthesization. If you know a certain element will be a list, a sublist pattern can be used to immediately match elements inside that list. Sublist patterns can be nested to any depth.
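A sketch of nested sublist patterns, assuming `#lang br`. The macro name nest is hypothetical.

```racket
#lang br
;; The pattern mirrors the parenthesization of the input, two levels deep.
(define-macro (nest (A B) (C (D))) #'(list A B C D))
(nest (1 2) (3 (4))) ; '(1 2 3 4)
```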
By the way, a sublist pattern cannot create a sublist where none exists in the input:
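To see this without halting on an error, here is a sketch that uses an else case as a fallback. It assumes `#lang br`, and the macro name pair is hypothetical.

```racket
#lang br
;; The pattern wants one argument that is itself a two-element list.
(define-macro-cases pair
  [(pair (A B)) #'(list A B)]
  [else "no match"])

(pair (1 2)) ; '(1 2)
(pair 1 2)   ; "no match": two loose arguments are not regrouped into a sublist
```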
An ellipsis, which has to follow a pattern variable, and matches as many items as it can (similar to the “greedy” * operator in regular expressions).
Even though an ordinary pattern variable will match exactly one item, a pattern variable with an ellipsis can match zero items.
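A sketch of both behaviors, assuming `#lang br`. The macro names gather and head-and-tail are hypothetical.

```racket
#lang br
;; ARG ... takes as many items as are available, including none.
(define-macro (gather ARG ...) #'(list ARG ...))
(gather 1 2 3) ; '(1 2 3)
(gather)       ; '()

;; HEAD must match exactly one item; TAIL ... takes whatever is left.
(define-macro (head-and-tail HEAD TAIL ...) #'(list HEAD (list TAIL ...)))
(head-and-tail 1 2 3) ; '(1 (2 3))
(head-and-tail 1)     ; '(1 ())
```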
If a pattern variable has an ellipsis, when the variable appears in a syntax template, the ellipsis must also appear (otherwise it’s an error):
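A sketch assuming `#lang br`, with hypothetical macro names ok and bad. The failing definition is left commented out so the rest of the file still runs.

```racket
#lang br
;; ARG has an ellipsis in the pattern, so it carries one in the template too.
(define-macro (ok ARG ...) #'(list ARG ...))
(ok 1 2 3) ; '(1 2 3)

;; Dropping the ellipsis from the template is rejected when the macro is compiled:
;; (define-macro (bad ARG ...) #'(list ARG))
```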
You can only have one ellipsis in each sublist of the pattern, including the top level (but see syntax/parse, an advanced macro-creation system that supports a richer pattern vocabulary):
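A sketch assuming `#lang br`, with hypothetical macro names flat and bad. Each ellipsis sits in its own sublist, or alone at the top level. The commented-out definition is the kind the rule above excludes.

```racket
#lang br
;; Three ellipses, but only one per sublist: two inner lists plus the top level.
(define-macro (flat (A ...) (B ...) C ...) #'(list A ... B ... C ...))
(flat (1 2) (3 4) 5 6) ; '(1 2 3 4 5 6)

;; Two ellipses at the same level of the pattern are not allowed:
;; (define-macro (bad A ... B ...) #'(list A ... B ...))
```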
A dot, which has to precede a pattern variable at the end of a list, and matches all remaining items in the list starting at the dot. The resulting pattern variable can either be used in a syntax template alone (in which case it’s treated as a list of items) or with another dot (in which case it’s spliced into the result). This explanation is more complicated than the example:
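A sketch of the two uses, assuming `#lang br`. The macro names as-list and spliced are hypothetical.

```racket
#lang br
;; REST used alone: it stands for the remaining items as one list.
(define-macro (as-list FIRST . REST) #'(quote REST))
(as-list 1 2 3) ; '(2 3)

;; REST used after a dot: the remaining items are spliced in after +.
(define-macro (spliced FIRST . REST) #'(+ . REST))
(spliced 1 2 3) ; 5
```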
“Isn’t a dot just a less flexible way of writing an ellipsis?” Pretty much. The above macros could be written with ellipses like so:
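A sketch of two ellipsis-based macros, assuming `#lang br` and using the hypothetical names quote-rest and sum-rest. One quotes every argument after the first, and the other adds them up. Neither needs a dot.

```racket
#lang br
;; (REST ...) in the template rebuilds the list of remaining items.
(define-macro (quote-rest FIRST REST ...) #'(quote (REST ...)))
(quote-rest 1 2 3) ; '(2 3)

;; REST ... splices the remaining items in after +.
(define-macro (sum-rest FIRST REST ...) #'(+ REST ...))
(sum-rest 1 2 3) ; 5
```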
You can probably go your whole career as a macro writer without using the dot. But if you come across it in someone else’s source code, you’ll know what it means. (BTW, you can’t use a dot and ellipsis in the same level of a pattern.)
As a final exam, let’s combine all our vocabulary elements into a single pattern. Hopefully the result is not surprising:
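One possible combined pattern, sketched under the assumption of `#lang br`. The macro name exam and the sample input are hypothetical. It uses a literal (foo), pattern variables, a sublist, an ellipsis inside that sublist, and a dot at the top level.

```racket
#lang br
;; foo = literal, (A B ...) = sublist with ellipsis, C = pattern variable, . D = dot
(define-macro (exam foo (A B ...) C . D)
  #'(list A (list B ...) C . D))

(exam foo (1 2 3) 4 5 6) ; '(1 (2 3) 4 5 6)
```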
As with regular expressions, you can often choose among multiple syntax patterns to match a single syntax object. Which you choose is a question of how strict you want to be about the match, and how you want to manipulate the pieces thereafter.
For instance, these are all valid ways to match (and then reassemble) the list (+ 1 2):
    (define-macro (m1 ARGS) #'ARGS)
    (m1 (+ 1 2)) ; 3
    (define-macro (m2 (ARG ...)) #'(ARG ...))
    (m2 (+ 1 2)) ; 3
    (define-macro (m3 (1ST 2ND 3RD)) #'(1ST 2ND 3RD))
    (m3 (+ 1 2)) ; 3
    (define-macro (m4 (1ST REST ...)) #'(1ST REST ...))
    (m4 (+ 1 2)) ; 3
    (define-macro (m5 (1ST . TAIL)) #'(1ST . TAIL))
    (m5 (+ 1 2)) ; 3
You cannot make a syntax pattern that captures arguments two at a time (or three at a time, etc.).
You cannot take two ellipsized pattern variables and interleave their values within a syntax template.
You cannot make matches optional, in the sense of “match zero or one occurrence of this item”. An ellipsis can approximate an optional argument, because it will match zero occurrences, but it will also match more than one:
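A sketch of that approximation, assuming `#lang br`. The macro name greet and its arguments are hypothetical.

```racket
#lang br
;; TITLE ... is meant to be an optional argument, but nothing caps it at one.
(define-macro (greet NAME TITLE ...) #'(list TITLE ... NAME))
(greet "Ada")               ; '("Ada"): zero titles, as intended
(greet "Ada" "Dr.")         ; '("Dr." "Ada"): one title, as intended
(greet "Ada" "Dr." "Prof.") ; '("Dr." "Prof." "Ada"): also matches, wanted or not
```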
If these shortcomings bum you out, the syntax/parse library supports a richer vocabulary of syntax patterns.
Syntax patterns in the Racket Reference