Functions in BASIC are introduced with the def statement, and can be invoked with any expressions (including other function applications) as arguments. The result of the function is a single expression:
1 2 3 4 5 6 7 | #lang basic-demo-3 10 x = 2 : y = 3 : z = 5 20 print f(3, 4) 30 print f(f(3, g(2)), 2) 40 def f(x, y) = x * y * z 50 def g(i) = i + i 60 print y |
1 2 3 | 60 600 3 |
In ye olde days, BASIC functions had to begin with the prefix fn, for ye olde reasons. Here in the 21st century, we won’t observe this restriction. Instead, we’ll allow functions to be named with any identifier.
In an exception to the usual all-variables-are-global policy, the variables representing the arguments to the function are separate from the ones outside, though the function can refer to existing global variables. In other words, just like Racket’s define, local variables introduced by a def shadow the ones outside.
A function introduced with def can be used anywhere in the program, even before the program reaches the line. So in the example above, even though f is not defined until line 40, it can be invoked on lines 20 and 30.
We probably noticed that def is similar to let in that it assigns a value to an identifier. So we can already guess that we’ll implement def by making the usual lexer and parser updates, and then in the expander, converting it into a lambda expression that gets assigned to the function identifier. Right. Hopefully, at this point, that much seems easy.
The harder part is the last requirement: that every function defined in the program, no matter where, can be invoked right away. This means that our def statements have to circumvent the usual control flow. They have to be moved out of our list of line functions to the top of the module, so they get evaluated right away.
In a sense, it’s similar to the situation we confronted in the previous tutorial when implementing variables. In that case, we found the unique variable names throughout the program by flattening the parse tree and looking for items with the 'b-id syntax property.
But that was possible because an identifier is a standalone item, so we could afford to discard the structure of the parse tree in our search. In this case, we need to keep each of the b-def nodes intact.
If we wanted, we could pick through the parse tree a little more gingerly and hoist the b-def nodes to the top level. But this is a common enough task that Racket’s macro system has beaten us to it: as we’ll see, we can simply tell Racket to move the result of a macro to the top level of our module by using a lift. Best of all, using the lift is a lot easier than explaining why we need it.
To our lexer, we add def and , to our list of reserved-terms (we need the comma for separating the arguments in a function call):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | #lang br (require brag/support) (define-lex-abbrev digits (:+ (char-set "0123456789"))) (define-lex-abbrev reserved-terms (:or "print" "goto" "end" "+" ":" ";" "let" "=" "input" "-" "*" "/" "^" "mod" "(" ")" "if" "then" "else" "<" ">" "<>" "and" "or" "not" "gosub" "return" "for" "to" "step" "next" "def" ",")) (define basic-lexer (lexer-srcloc ["\n" (token 'NEWLINE lexeme)] [whitespace (token lexeme #:skip? #t)] [(from/stop-before "rem" "\n") (token 'REM lexeme)] [reserved-terms (token lexeme lexeme)] [(:seq alphabetic (:* (:or alphabetic numeric "$"))) (token 'ID (string->symbol lexeme))] [digits (token 'INTEGER (string->number lexeme))] [(:or (:seq (:? digits) "." digits) (:seq digits ".")) (token 'DECIMAL (string->number lexeme))] [(:or (from/to "\"" "\"") (from/to "'" "'")) (token 'STRING (substring lexeme 1 (sub1 (string-length lexeme))))])) (provide basic-lexer) |
In our parser, we create new rules to handle the def statement itself, but also to support function calls as expressions:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 | #lang brag b-program : [b-line] (/NEWLINE [b-line])* b-line : b-line-num [b-statement] (/":" [b-statement])* [b-rem] @b-line-num : INTEGER b-rem : REM @b-statement : b-end | b-print | b-goto | b-let | b-input | b-if | b-gosub | b-return | b-for | b-next | b-def b-end : /"end" b-print : /"print" [b-printable] (/";" [b-printable])* @b-printable : STRING | b-expr b-goto : /"goto" b-expr b-let : [/"let"] b-id /"=" (STRING | b-expr) b-if : /"if" b-expr /"then" (b-statement | b-expr) [/"else" (b-statement | b-expr)] b-input : /"input" b-id @b-id : ID b-gosub : /"gosub" b-expr b-return : /"return" b-for : /"for" b-id /"=" b-expr /"to" b-expr [/"step" b-expr] b-next : /"next" b-id b-def : /"def" b-id /"(" b-id [/"," b-id]* /")" /"=" b-expr b-expr : b-or-expr b-or-expr : [b-or-expr "or"] b-and-expr b-and-expr : [b-and-expr "and"] b-not-expr b-not-expr : ["not"] b-comp-expr b-comp-expr : [b-comp-expr ("="|"<"|">"|"<>")] b-sum b-sum : [b-sum ("+"|"-")] b-product b-product : [b-product ("*"|"/"|"mod")] b-neg b-neg : ["-"] b-expt b-expt : [b-expt ("^")] b-value @b-value : b-number | b-id | /"(" b-expr /")" | b-func b-func : ID /"(" b-expr [/"," b-expr]* /")" @b-number : INTEGER | DECIMAL |
The b-def rule follows the style we saw above:
1 | def f(x, y) = x * y * z |
Where we have a function name, a list of comma-separated variables between parentheses, and finally an expression:
1 | b-def : /"def" b-id /"(" b-id [/"," b-id]* /")" /"=" b-expr |
Then we also want to be able to parse function calls. Near the bottom, we add a b-func rule as a possible b-value. The b-func rule follows pattern similar to b-def, with a function identifier followed by comma-separated arguments between parentheses:
1 | b-func : ID /"(" b-expr [/"," b-expr]* /")" |
There’s a subtle maneuver here. Rather than matching the b-id rule in the first position, we’re deliberately matching a raw ID token.
Why? We recall from our implementation of variables that everything marked as a b-id in the parse tree will be hoisted into the top level of our module and defined as a new variable with value 0. Usually that’s what we want, so that a program referring to an undeclared variable will just print 0:
1 2 | #lang basic 10 print f |
But what should happen when an undeclared variable is called as a function?
1 2 | #lang basic 10 print f(42) |
If we use b-id in the pattern for b-func, we’d be signaling that the undeclared f should be initialized to 0. That means our f(42) would become 0(42). Which is meaningless.
Therefore, we’d rather trigger a standard unbound-identifier error. We do this by using ID rather than b-id, which will match the right kind of token, but won’t cause the variable to be automatically defined. That way, when we try to call an undefined function, we get this instead:
1 | f: unbound identifier in module in: f |
In so doing, we’re choosing to handle this condition as a compile-time error. It wouldn’t be wrong to just let the expression become 0(42) and then raise a run-time error. For instance, our b-func macro that will be handling function calls could notice the 0 in the function position and call raise-line-error.
But there are two arguments for doing it this way. First, it’s polite to raise an error as soon it can be noticed, rather than procrastinate. Here, the error can be detected at compile time, so we should raise it at compile time. Moreover, if we defer it into run time, it may take a long time to cause a problem. Second, we can make the behavior of basic in this situation consistent with regular Racket, with the familiar “unbound identifier” error.
Now we need to add two macros to our "expr.rkt" module to handle our new b-def and b-func nodes in the parse tree. At the top, we import "line.rkt" so we can use our raise-line-error function below. We also export our two new b-def and b-func macros:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | #lang br (require "line.rkt") (provide b-expr b-sum b-product b-neg b-expt b-def b-func) (define (b-expr expr) (if (integer? expr) (inexact->exact expr) expr)) (define-macro-cases b-sum [(_ VAL) #'VAL] [(_ LEFT "+" RIGHT) #'(+ LEFT RIGHT)] [(_ LEFT "-" RIGHT) #'(- LEFT RIGHT)]) (define-macro-cases b-product [(_ VAL) #'VAL] [(_ LEFT "*" RIGHT) #'(* LEFT RIGHT)] [(_ LEFT "/" RIGHT) #'(/ LEFT RIGHT 1.0)] [(_ LEFT "mod" RIGHT) #'(modulo LEFT RIGHT)]) (define-macro-cases b-neg [(_ VAL) #'VAL] [(_ "-" VAL) #'(- VAL)]) (define-macro-cases b-expt [(_ VAL) #'VAL] [(_ LEFT "^" RIGHT) #'(expt LEFT RIGHT)]) (define-macro (b-def FUNC-ID VAR-ID ... EXPR) (syntax-local-lift-expression #'(set! FUNC-ID (λ (VAR-ID ...) EXPR)))) (define-macro (b-func FUNC-ID ARG ...) #'(if (procedure? FUNC-ID) (FUNC-ID ARG ...) (raise-line-error (format "expected ~a to be a function, got ~v" 'FUNC-ID FUNC-ID)))) |
Our b-def macro rewrites the function definition as a lambda expression, and then uses set! to assign it to FUNC-ID.
To get this expression to the top of our module, we wrap the macro result in syntax-local-lift-expression, which is like a magic compile-time elevator that moves a syntax object to the top of its enclosing module.
Since macros are a big part of how Racket works, it includes a set of syntax-transformation tools—not usually needed, and some arcane—that simplify certain operations within macros. syntax-local-lift-expression has an imposing name, but this simplified example illustrates what it does:
1 2 3 4 5 6 7 8 9 | (define-macro (lift-me EXPR) (syntax-local-lift-expression #'EXPR)) (displayln 'outside-let) (let () (displayln 'let-start) (lift-me (displayln 'whee)) (displayln 'let-end)) |
Ordinarily, we’d expect 'whee to be displayed after 'let-start and before 'let-end. But the lift moves (displayln 'whee)) to the top level of the module, just before its enclosing let, so the code is rearranged like so:
When we run the original sample, the expressions are displayed in this lift-adjusted order:
1 2 3 4 | outside-let whee let-start let-end |
Racket won’t have a specialized syntax-transformation tool for every task. But in those cases, we can usually get the job done by manipulating the parse tree by hand (as we did with variables). Here, we lucked out, because the lift does exactly what we want.
The b-func macro is pretty easy: all we need to do is apply FUNC-ID to its list of ARGs. To be safe, we add a check to make sure that FUNC-ID is a procedure?. It will be if there was a corresponding def statement that assigned a lambda to that variable. But it’s also possible for the variable to be assigned an inappropriate value.
In that case, we use raise-line-error to stop the program with an informative message. Note how we’re using the pattern variable FUNC-ID twice in the error message:
1 2 | (format "expected ~a to be a function, got ~v" 'FUNC-ID FUNC-ID) |
The quoted form represents its literal name at compile time; the unquoted form will represent its value at run time.
We should now be able to run our original test program:
1 2 3 4 5 6 7 | #lang basic 10 x = 2 : y = 3 : z = 5 20 print f(3, 4) 30 print f(f(3, g(2)), 2) 40 def f(x, y) = x * y * z 50 def g(i) = i + i 60 print y |
1 2 3 | 60 600 3 |
Let’s also see what happens when we hide one of our function definitions with rem:
1 2 3 4 5 6 7 | #lang basic 10 x = 2 : y = 3 : z = 5 20 print f(3, 4) 30 print f(f(3, g(2)), 2) 40 rem def f(x, y) = x * y * z 50 def g(i) = i + i 60 print y |
1 | f: unbound identifier in module in: f |
And let’s see what happens when we define a function correctly but then reassign it a bad value:
1 2 3 4 5 6 7 | #lang basic 10 x = 2 : y = 3 : z = 5 20 f = "foobar" : print f(3, 4) 30 print f(f(3, g(2)), 2) 40 def f(x, y) = x * y * z 50 def g(i) = i + i 60 print y |
1 | error in line 20: expected f to be a function, got "foobar" |