Maybe it seems obvious. But let’s be precise.
As programmers, our only indispensable tool is a programming language. A programming language puts a computer—a term we’ll use to refer to any computing device—under our control by letting us translate our ideas into terms the computer will understand. This is what it means to write a program.
Computers can be programmed without a language. For instance, they can be trained. There may come a time when computers don’t need humans to explain everything to them. Or anything. But until then, programming languages will remain the essential interface.
Still, despite our dependence on programming languages, a language is usually presented to us in a black box. We’re encouraged to use the language as we wish. But we’re dissuaded from scrutinizing how it works. And that’s fine, as far as it goes. The luxury of ignoring details is one of the great pleasures of programming.
But the other great pleasure of programming is learning how things work, so we can change them to suit our needs. A programming language may start out in a black box. But we shouldn’t be intimidated about taking it out of that box. Once we do, we’ll see that there’s nothing esoteric or special about a language. After that, we can start thinking about programming languages in a new way: as open-ended and malleable.
So let’s open the box.
On a technical level, a programming language is just another program. It accepts certain input. It evaluates that input. It produces a result. The twist is that the input to a programming language is a string of text, known as source code, that describes another program. The result of the programming language is the program described by that source code. By convention, the software that implements the programming language and converts the source code is often known as a compiler or interpreter. These terms exist mostly so we can avoid tangled phrases like “the program that makes the other program”. (Yes, there is a distinction between these terms, but we can leave it aside for now.)
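To make "a language is just another program" concrete, here is a minimal sketch in Python, used only for illustration. The function `tiny_language` and its two instructions, `add` and `mul`, are hypothetical names invented for this example. The function accepts source code as a string and returns the program that the source code describes.

```python
from typing import Callable


def tiny_language(source: str) -> Callable[[float], float]:
    """Turn source text into a program: a function from a number to a number."""
    # The only two instructions this made-up language understands.
    ops = {
        "add": lambda x, n: x + n,
        "mul": lambda x, n: x * n,
    }
    steps = []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        name, arg = line.split()
        if name not in ops:
            raise SyntaxError(f"unknown instruction: {name}")
        steps.append((ops[name], float(arg)))

    def program(x: float) -> float:
        # The resulting program applies each instruction in source order.
        for op, n in steps:
            x = op(x, n)
        return x

    return program


double_then_inc = tiny_language("mul 2\nadd 1")
print(double_then_inc(5))  # prints 11.0
```

The input is text, the output is something we can run. Everything else about a language implementation is elaboration on that pattern.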
Because a compiler or interpreter for a programming language is a program, the process of writing one isn’t much different from writing any other program: we pick a language suited to the task and start working.
Note, however, that the programming language we use to write a compiler or interpreter may have nothing to do with the implemented language. For instance, much of racket—the new program now on your computer that runs Racket source code—is implemented in C. Nothing shocking. Many languages can be compiled to JavaScript. Business as usual. Likewise, we’re going to use Racket to implement new languages—like stacker—that may not look or behave like Racket.
Broad-minded readers might observe that this idea of “take some input → put it into a processing device → get a result” is a pattern seen at all levels of programming. Depending on the context, we might call the “processing device” in the middle a function, or a program, or a programming language. But these distinctions are largely arbitrary. For instance, it’s fair to think of a programming language as a special kind of function. It probably seems weird now, but it’s also fair to think of any function as a domain-specific language with a very tiny domain.
On a design level, a programming language sets the ground rules for what can happen in a program. It specifies what kind of source code is acceptable, what that source code should mean, and what kind of results should be generated.
Since a programming language can set these ground rules, it offers a wider horizon of possibilities than an ordinary program. Why? Because an ordinary program is necessarily restricted by the rules of the programming language it’s written in. But when we make a programming language instead of a program, we free ourselves from many of these constraints.
On a cultural level, a programming language is a way to explore our evolving ideas about programming—algorithms, performance, ergonomics, expressiveness, and so on. Though a programming language is obviously a tool for writing a program, it’s also a tool for discovering new ways to program.
It’s no coincidence that Racket itself emerged from the work of a team of programming-language researchers. They’ve used it as a platform for testing out new programming ideas.
“A language that doesn’t affect the way you think about programming is not worth knowing.”
—Alan Perlis
But why should researchers have all the fun? If languages were just about writing programs, we could’ve stopped with C. (And some have.) But computers and languages are interesting specifically because they’re malleable. (That is changing.) The more we expect out of programs, the more vital it is to explore new ways of making programs.
Which includes making new programming languages.
When we think about programming languages, we shouldn’t only think of large, general-purpose languages like Racket and Rust and Python and C. Languages are more diverse than that: they can be big or small, general or specialized, or anywhere in between.
This brings us back to the idea of domain-specific languages, or DSLs. Among languages, DSLs arguably have the most undiscovered potential, because they’ve been overlooked for so long. Why overlooked? Because until now, we haven’t had tools for making them efficiently.
But even if we haven’t been making DSLs, we’ve used them all the time, probably without even noticing:
DSLs that are designed to be used inside larger languages, like regular expressions, SQL, and printf-style format strings.
DSLs that stand alone, like CSS/HTML, PostScript, Coq, TeX, R, Julia, and MATLAB.
Tools descended from Unix that are essentially DSLs, like awk, bash, lex/yacc, and make.
(Also see domain-specific languages in the appendix.)
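As a small illustration of DSLs used inside a larger language, the following Python sketch uses two of them in a few lines: a regular expression and a printf-style format string. Both are written as ordinary strings, yet each has its own notation and rules. The variable names and the sample date are invented for the example.

```python
import re

# Regular expression: a small pattern language embedded in a Python string.
date_pattern = re.compile(r"(\d{4})-(\d{2})-(\d{2})")
match = date_pattern.fullmatch("2024-03-09")
year, month, day = (int(part) for part in match.groups())

# printf-style format string: another small language, also written as a string.
label = "%02d/%02d/%d" % (day, month, year)
print(label)  # prints 09/03/2024
```

Neither notation is Python, but Python hands each string to a separate little interpreter that knows what it means.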
“Hold on—HTML is a data-markup format, not a Turing-complete programming language.” Point taken. But in this book, when we talk about DSLs and other programming languages, we’re going to take a broad view of a language as any structured notation for describing data or operations on data. This view includes Turing-complete languages, but also languages that do less.
Let’s not fear simplicity. Regular expressions, for instance, aren’t Turing complete, but they’re still useful. Conversely, plenty of Turing-complete languages are still useless.
If you think programming should fill your brain and soul with feelings of power, creativity, curiosity, and fun, then you’ll probably like making programming languages.
“Writing a compiler from scratch is so satisfying, even if it is for the most lame language ever ... you feel like the master of the universe!”
—Racket bigwig Jay McCarthy
If not, then you can move along. (No hard feelings.)
Past that, we have the practical benefits:
Enlarge the solution space. It doesn’t matter which language you like best or use most—sometimes you encounter a problem that doesn’t mesh with the language idiom. When you make a language, you expand your possibilities, which also allows the solution to be carefully tailored to the problem.
Certain programs naturally describe languages. Languages are good for what we might call wide-funnel problems, where there’s a large universe of possible inputs, but a relatively small scope of outputs. (How to tell: does your program take input from arbitrarily complex files, and turn them into a simpler representation?)
Better glue. A little language can fill holes in a larger toolchain. For instance, Python doesn’t have a preprocessor like C does, but you could make one as a DSL. (Python fans, please don’t write me to explain why this is a bad idea. I’m not here to say what you should do, just what you can do.)
Better interface. Whether you work alone or in a team, a language can wrap a streamlined, easy-to-understand interface around other code.
You’ll know something others don’t. Knowing how to make a programming language will teach you ideas that you can fold into your usual programming work.
Of course, a programming language isn’t the right solution for every problem. But when creating a language is easy and inexpensive—and in Racket, it often is—then it becomes a realistic option for many more problems.
Racket is a general-purpose programming language that provides a high-level interface for making new languages. For any Racket-implemented language, we’ll proceed in three steps:
1. Design the notation and behavior of our new language.
2. Write a Racket program that takes source code written in the new language and converts its notation and behavior to an equivalent Racket program.
3. Run this new Racket program normally.
Thus, every language implemented in Racket is really just an indirect way of writing other Racket programs. What we’re making in step #2 is sometimes known as a source-to-source compiler or transcompiler.
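The three steps can be sketched by analogy in Python. This is not how Racket does it: Racket supplies its own, higher-level machinery for the job. The stack-based notation below and the function name `compile_to_python` are assumptions made up for this illustration. The point is only the shape of the process: design a notation, translate it to source code in the host language, then run the translated program.

```python
# Step 1 (design): a hypothetical notation where integers are pushed onto
# a stack, and "+" or "*" replace the top two values with their sum or product.


def compile_to_python(source: str) -> str:
    """Step 2: translate stack-language source into equivalent Python source."""
    lines = ["stack = []"]
    for token in source.split():
        if token in ("+", "*"):
            lines.append("b = stack.pop(); a = stack.pop()")
            lines.append(f"stack.append(a {token} b)")
        else:
            # int() rejects anything that is not an integer literal.
            lines.append(f"stack.append({int(token)})")
    lines.append("result = stack.pop()")
    return "\n".join(lines)


source = "4 8 + 3 *"
python_source = compile_to_python(source)

# Step 3: run the generated program like any other Python program.
namespace = {}
exec(python_source, namespace)
print(namespace["result"])  # prints 36
```

Note that `compile_to_python` never computes anything itself. It only emits Python source, which is what makes it a source-to-source compiler and not an interpreter.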
Why is this a nice approach? Once we’re able to compile our new language into Racket, the new language can rely on everything in Racket’s toolchain: the libraries, the cross-platform deployment, the packaging and testing tools, the DrRacket IDE, and so on. We don’t have to recreate it all for each language. That’s great news.
The cost, however, is that we have to learn how to make a source-to-source compiler in Racket. But that’s why this book exists. And we only have to learn it once. Then we can make pretty much any language we want.
Moreover, since the Racket implementation of a language is itself just a Racket program, these languages can do anything that Racket can. So we can make languages that behave like traditional programming languages, printing their results to the terminal. But we can also make languages that behave in unconventional ways—for instance, they might produce pictures or sounds, since Racket can do all of that too.