We remember from our language specification that every valid JSON file should be a valid jsonic program. Let’s test that proposition by running our original sample JSON under #lang jsonic:
#lang jsonic
[
  null,
  42,
  true,
  ["array", "of", "strings"],
  {
    "key-1": null,
    "key-2": false,
    "key-3": {"subkey": 21}
  }
]

[
  null,
  42,
  true,
  ["array", "of", "strings"],
  {
    "key-1": null,
    "key-2": false,
    "key-3": {"subkey": 21}
  }
]
Very good. Now let’s change the 42 to 3/5—which is an invalid JSON number—and see what happens.
#lang jsonic
[
  null,
  3/5,
  true,
  ["array", "of", "strings"],
  {
    "key-1": null,
    "key-2": false,
    "key-3": {"subkey": 21}
  }
]

string::14: string->jsexpr: error while parsing a json array
We could imagine a friendlier error message. The number 14 signals that the problem is at the 14th character of the source code. But jsonic is behaving correctly. It’s raising an error because it can’t produce a valid JSON result.
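To see where this error comes from, we can call `string->jsexpr` from Racket's `json` library directly. This is a minimal sketch, not part of jsonic itself; the helper name `try-parse` is our own invention for illustration, and the exact wording of the error message may differ between Racket versions.

```racket
#lang racket
(require json)

;; try-parse is a hypothetical helper: it returns the parsed jsexpr
;; on success, or the error message (a string) on failure.
(define (try-parse str)
  (with-handlers ([exn:fail? exn-message])
    (string->jsexpr str)))

(try-parse "[null, 42, true]")   ; a list of three JSON values
(try-parse "[null, 3/5, true]")  ; an error message, because 3/5 is not a JSON number
```

The second call fails for the same reason our jsonic program does: the parser reads the `3`, then finds a `/` where it expected a comma or a closing bracket.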
Now let’s replace our JSON values with embedded Racket expressions. Taken together, they should produce the same result:
#lang jsonic
// a line comment
[
  @$ 'null $@,
  @$ (* 6 7) $@,
  @$ (= 2 (+ 1 1)) $@,
  @$ (list "array" "of" "strings") $@,
  @$ (hash 'key-1 'null
           'key-2 (even? 3)
           'key-3 (hash 'subkey 21)) $@
]

[
  null,
  42,
  true,
  ["array","of","strings"],
  {"key-1":null,"key-3":{"subkey":21},"key-2":false}
]
That’s also correct. The whitespace is formatted differently. The keys inside the JSON object have a different order. But those differences don’t change the meaning of the resulting JSON.
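If we want to convince ourselves that whitespace and key order really don't matter, we can parse two differently formatted strings and compare the results. This is a small sketch using the `json` library; the names `pretty`, `compact`, and `same-json?` are ours, chosen for illustration.

```racket
#lang racket
(require json)

;; Two spellings of the same JSON object: different whitespace,
;; different key order.
(define pretty  "{ \"key-1\": null, \"key-2\": false }")
(define compact "{\"key-2\":false,\"key-1\":null}")

;; Both strings parse to hash tables with the same keys and values,
;; so equal? treats them as the same value.
(define same-json?
  (equal? (string->jsexpr pretty) (string->jsexpr compact)))

same-json?
```

Because JSON objects become hash tables, which are unordered, the comparison ignores the order in which the keys were written.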
As we did above, let’s break the source code. This time we replace the Racket expression (* 6 7) with (/ 3 5), which produces an invalid number:
#lang jsonic
// a line comment
[
  @$ 'null $@,
  @$ (/ 3 5) $@,
  @$ (= 2 (+ 1 1)) $@,
  @$ (list "array" "of" "strings") $@,
  @$ (hash 'key-1 'null
           'key-2 (even? 3)
           'key-3 (hash 'subkey 21)) $@
]

jsexpr->string: expected argument of type <legal JSON value>; given: 3/5
Again, we could imagine a friendlier error message. But jsonic is doing what we hoped. It raises an error when we try to embed an uncooperative S-expression.
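We can check ahead of time whether a Racket value is a legal JSON value with the `jsexpr?` predicate from the `json` library. The sketch below is for illustration only. Converting the fraction with `exact->inexact` is one possible workaround, not something jsonic does for us.

```racket
#lang racket
(require json)

(define good (* 6 7))   ; 42, an exact integer
(define bad  (/ 3 5))   ; 3/5, an exact fraction

(jsexpr? good)   ; #t
(jsexpr? bad)    ; #f, which is why jsexpr->string rejects it

;; One possible fix: convert the fraction to a floating-point number,
;; which JSON can represent.
(define approx (exact->inexact bad))
(jsexpr? approx) ; #t
(jsexpr->string approx)
```

So if we wanted the second program to work, we could write `(exact->inexact (/ 3 5))` inside the embedded expression.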
We’ve met our original requirements, so jsonic is complete.
There’s one lurking bug we should acknowledge. Below, we can see that (symbol->string '$@) is a legit Racket expression that ought to evaluate to the string "$@". So why doesn’t it?
#lang jsonic
@$ (symbol->string '$@) $@

string::20: read: unexpected `)`
The problem is that the $@ character sequence is being used both as the right-hand delimiter of the embedded Racket expression and as literal code within that expression. Our jsonic language has no way of distinguishing between these two uses. It reads the shorter code fragment @$ (symbol->string '$@ as a single token because this matches one of the lexing rules. But once it trims the (apparent) delimiters, it can’t figure out what to do with the resulting malformed Racket expression (symbol->string '. Hence the error.
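We can imitate this behavior outside of jsonic. In the sketch below, a non-greedy regular expression stands in for the lexing rule. This is an assumption made for illustration, not the tokenizer's actual code, and names like `trimmed` and `read-result` are ours.

```racket
#lang racket

(define source "@$ (symbol->string '$@) $@")

;; Non-greedy match: start at "@$" and stop at the FIRST "$@" that follows.
(define m (regexp-match #rx"@\\$(.*?)\\$@" source))
(define token   (first m))   ; "@$ (symbol->string '$@"
(define trimmed (second m))  ; " (symbol->string '"

;; Reading the trimmed text as a Racket expression fails,
;; because the expression is cut off in the middle.
(define read-result
  (with-handlers ([exn:fail:read? (λ (e) 'read-error)])
    (read (open-input-string trimmed))))

read-result
```

The match ends at the first `$@` it finds, so the real closing delimiter is never considered, and what remains between the delimiters is not a complete expression.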
Unfortunately, this syntactic problem is always with us. In any language, we need to attach special notational meaning to certain strings. But often, we still want to be able to use those strings as literal data. For instance, these programs won’t work either, for a similar reason:
#lang br
(display "this is a double quote: " ")

read-syntax: expected a closing `"`

#lang br
(display (car (regexp-match #rx"+" "+")))

read-syntax: `+' follows nothing in pattern
Now that we’ve built a lexer, we can appreciate why languages sometimes need escape characters—extra characters added to the source to signal “read these characters literally, not as special notation”. Escape characters remove ambiguity by making the intended meaning explicit in the program syntax.
For instance, we would fix the two programs above by adding the appropriate escape characters:
#lang br
(display "this is a double quote: \" ")

this is a double quote: "

#lang br
(display (car (regexp-match #rx"\\+" "+")))

+
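For regular expressions, we don't have to add the escape characters by hand. Racket's `regexp-quote` function does it for us. The sketch below uses ordinary `#lang racket` rather than `#lang br`, and the variable names are ours.

```racket
#lang racket

;; regexp-quote adds whatever escapes are needed so that the string
;; is matched literally.
(define escaped (regexp-quote "+"))   ; the two-character string \+

;; Build a regexp from the escaped string and match a literal plus sign.
(define m (regexp-match (regexp escaped) "1+1"))

(car m)
```

This has the same effect as writing `#rx"\\+"` ourselves, but it also works when the string to be matched isn't known until the program runs.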
Likewise, we could define escape sequences for the $@ and @$ delimiters in jsonic, so that they could be used in embedded Racket expressions. We could accomplish this by the usual process of refining our lexing and parsing rules to recognize the new language elements. But having noted the flaw, we won’t digress further. The details are left as an exercise for the curious.