The Forth Dimension

17 Oct


The Forth programming language reached maturity around 1970 after more than ten years of development and experimentation by its creator, Charles H. Moore. Its first practical use was to control a radio telescope at the National Radio Astronomy Observatory, where Moore was employed at the time. From there Forth spread to other telescopes and other observatories, cementing a connection with astronomy and space science that persists to this day; in addition to controlling countless telescopes and planetariums earthside, Forth has been into space many times on probes and satellites of all descriptions. Yet already by the end of its first decade Forth had spread far beyond astronomical circles. It was being used to control the motorized cameras used to film miniature-based special-effects sequences (suddenly a booming business in the wake of Star Wars); to control automotive diagnostic equipment; as the firmware in various medical devices; to control automated agricultural equipment. Closer to our usual interests, Atari had invested a lot of money into developing a version of the language suitable for programming pinball machines and stand-up arcade games, while versions of the language were available for all of the trinity of 1977 within a year or so of their appearance. The key to Forth’s burgeoning popularity was its efficiency: it not only ran faster than just about any language short of assembly, but in the right hands it was also almost unbelievably stingy with memory. Those were good qualities to have in the late 1970s, when the average PC ran at 1 MHz and had no more than 16 K.

We’ll get into why Forth is so efficient in just a bit. But first let’s take a look at the language itself. If you’ve programmed before in just about any other language, Forth will likely seem deeply, deeply weird. Still, there’s also something kind of beautiful about it. If you’d like to follow along with the few examples I’ll give in this article, you have many free implementations of the language to choose from. A good choice, and one that has the advantage of working on Windows, Macintosh, and Linux alike, is Gforth.

Forth is an interactive programming language, like the microcomputer BASICs so many of us grew up with. This means that you can enter commands directly at the Forth console and watch them run immediately.

Forth is also a stack-based programming language, and this is the key to everything else about it. Virtually every programming language uses a stack under the hood; it’s one of the most fundamental mechanisms of computer science. But most other languages try to hide the stack from us, strain to make it so that we need not trouble ourselves over it and, indeed, don’t really need to know much about it at all. The only time many programmers even hear the word “stack” is when an infinite loop or runaway recursion causes a program to crash with a “stack overflow error.” Forth, however, doesn’t hide its stack away like something shameful. No, Forth loves its stack, sets it front and center for all to see. Forth demands that if we are to love it, we must also love its stack. Given this, it would behoove me at this point to explain just what is meant by the idea of a stack in the first place.

A stack is just that: a stack of numbers stored in a special part of memory, used for storing transient data. Adding a number to the stack is called pushing to the stack. It always goes on top. Taking a number from the stack is called popping the stack. It’s always the top number — i.e., the most recently pushed — that’s popped, after which that number is erased from the stack. A stack is, in other words, a first-in-last-out system — or, if you like, a last-in-first-out system. If you haven’t quite wrapped your head around the idea, don’t sweat it. It should become clearer momentarily.
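If it helps to see the idea in a more familiar language, here is a minimal Java sketch of pushing and popping. The class and method names (`StackDemo`, `pushThenPop`) are made up for illustration; `ArrayDeque` is simply the standard library's stack-like collection, not anything Forth itself uses.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class StackDemo {
    // Push three numbers, then pop them all; returns the order in which they come off.
    static String pushThenPop() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(1);
        stack.push(2);
        stack.push(3); // 3 was pushed last, so it is now on top
        StringBuilder popped = new StringBuilder();
        while (!stack.isEmpty()) {
            popped.append(stack.pop()).append(' '); // always removes the top value
        }
        return popped.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(pushThenPop()); // prints "3 2 1"
    }
}
```

The numbers come back in the reverse of the order they went in, which is all that "last-in-first-out" means.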

Let’s look at how we can do simple arithmetic in Forth. Let’s say we want to add 2 and 3 together and print the result. In a typical modern language like Java, we’d just do something like this:

System.out.print(2 + 3);

In Forth, we do it a bit differently. If you’ve started up a Forth environment, you can type this in and see the result immediately.

2 3 + .

If you happened to use a Hewlett-Packard scientific calculator back in the day, this notation might look familiar to you. It’s known as “postfix” or “reverse Polish” notation. Let’s unpack this piece by piece to see how exactly Forth is handling this expression.

The first thing to understand here is that Forth reads almost everything as a word — Forthese for a command. A number standing by itself is actually interpreted as a word, a command to push that number onto the stack. Therefore, assuming we started with an empty stack, the stack looks like this after the first two parts of the expression above have been processed:

3   (top of stack)
2


Now the interpreter comes to the “+,” which is also read as a command, instructing it to pop two values off the stack, add them together, and push the result back onto the stack. After doing so, the stack looks like this:

5   (top of stack)


Finally, the “.” just instructs the interpreter to pop the top value off the stack and print it.

Now let’s consider a more complicated algebraic expression, like “(4 + 5) * (6 + 7).” In Forth, it would be written like this:

4 5 + 6 7 + * .

Let’s walk through this. We push 4 and 5 onto the stack.

5   (top of stack)
4


We pop them off, add them together, and push the result to the stack.

9   (top of stack)


We push 6 and 7 onto the stack.

7   (top of stack)
6
9


We add them together and push the result.

13   (top of stack)
9


We pop the top two values on the stack, multiply them together, and push the result.

117   (top of stack)


And finally we pop and print the result.
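The walkthrough above can be mimicked in a few lines of Java. This is only a rough sketch of the idea, assuming a toy vocabulary of numbers, "+", "*", and "."; the names `PostfixSketch` and `run` are invented here, and a real Forth interpreter is organized quite differently.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class PostfixSketch {
    // Runs a whitespace-separated line of words and returns whatever "." printed.
    static String run(String line) {
        Deque<Integer> stack = new ArrayDeque<>();
        StringBuilder out = new StringBuilder();
        for (String word : line.trim().split("\\s+")) {
            switch (word) {
                case "+": stack.push(stack.pop() + stack.pop()); break; // pop two, push the sum
                case "*": stack.push(stack.pop() * stack.pop()); break; // pop two, push the product
                case ".": out.append(stack.pop()).append(' '); break;   // pop and print
                default:  stack.push(Integer.parseInt(word));           // a number pushes itself
            }
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(run("2 3 + ."));         // prints 5
        System.out.println(run("4 5 + 6 7 + * .")); // prints 117
    }
}
```

Notice that the loop never needs parentheses or operator precedence: the order of the words alone determines the order of the arithmetic.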

To this point we’ve been working interactively. The key to programming in Forth, however, is to define new words; this is Forth’s nearest equivalent to the function calls common to other languages. Let’s consider a function to cube a number, which would look like this in Java:

int cube (int num) {
   return (num * num * num);
}

In Forth, we might do it by entering the following lines at the console:

: CUBE ( N -> N. Cube a number)
   DUP DUP ( Now there are three copies)
   * * ( Get the cube)
;

Let’s again unpack this piece by piece. The colon is a word which tells the interpreter that what follows will be a new word definition, to be terminated by a semicolon. “CUBE” is the name of the new word we are creating. All text within parentheses is a comment, to be ignored by the interpreter. The “N -> N.” notation within the first set of parentheses is not required, but is considered good practice in Forth programming. It tells us that this word will pop and operate on the current topmost value on the stack, and will push a single result onto the stack. Forth words do not take arguments like functions in other languages, but operate only on the current contents of the stack. Thus it’s the programmer’s responsibility to set the stack up properly before invoking a word, and to know what that word will have done to the stack when it finishes. The two lines in the middle are the meat of the word, the actual instructions it represents.

Let’s say we call this new word “CUBE” with a 5 on top of the stack — i.e., by entering “5 CUBE .” at the console. Thus the initial stack looks like this:

5   (top of stack)


Now we’re going into the body of the word itself. The two “DUP” statements tell the interpreter to duplicate the top value on the stack twice, without destroying — i.e., without actually popping — the original value. So, we end up with:

5   (top of stack)
5
5


Now we pop the top two values, multiply them together, and push the result.

25   (top of stack)
5


Then we just do the same thing again.

125   (top of stack)


And our work is done.
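To make the notion of a defined word a little more concrete, here is a hedged Java sketch in which a new word is nothing more than a named list of existing words. Everything in it (`WordSketch`, `dictionary`, `cubeOf`) is hypothetical, and storing a word's body as text to be re-read is a simplification of my own; real Forth systems compile definitions into a far more compact form.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class WordSketch {
    static final Deque<Integer> stack = new ArrayDeque<>();
    // User-defined words: name -> body, stored as the words that make it up.
    static final Map<String, String> dictionary = new HashMap<>();

    static void run(String line) {
        for (String word : line.trim().split("\\s+")) {
            if (dictionary.containsKey(word)) {
                run(dictionary.get(word));    // a defined word just runs its body
            } else if (word.equals("DUP")) {
                stack.push(stack.peek());     // copy the top value without popping it
            } else if (word.equals("*")) {
                stack.push(stack.pop() * stack.pop());
            } else {
                stack.push(Integer.parseInt(word));
            }
        }
    }

    static int cubeOf(int n) {
        stack.clear();
        dictionary.put("CUBE", "DUP DUP * *"); // stands in for ": CUBE DUP DUP * * ;"
        run(n + " CUBE");
        return stack.pop();
    }

    public static void main(String[] args) {
        System.out.println(cubeOf(5)); // prints 125
    }
}
```

Once "CUBE" is in the dictionary it is looked up and run exactly like a built-in word, which is the property the rest of this article leans on.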

Next we’ll see how we can use this word within another word. But first let’s see how we would do that as a function in Java.

void cubes10() {
   for (int i = 0; i < 10; i ++) {
      System.out.println(i + " " + cube(i));
   }
}

Here it is as a Forth word:

: CUBES10 ( ->. Print a table of cubes of 0-9.)
   10 0 ( Indices of loop)
   DO ( Start Loop)
      CR I . I CUBE . ( Print a number and its cube.)
   LOOP ( End of loop.)
;

As the first comment indicates, the “CUBES10” word expects nothing on the stack and leaves nothing there. We begin by pushing 10 and 0 onto the stack. Now Forth’s back-asswordness really comes to the fore: the “DO” word pops these two values off the stack. It will increment a variable, always known as “I,” from the second of these (0) until it reaches the first of these (10), running the block of words contained between “DO” and “LOOP” once for each value short of that limit. Within the loop, the word “CR” simply causes the cursor to move down to the next line. Keeping in mind that “I” pushes the current value of the variable being incremented onto the stack, where it can be used just like a constant, the rest should hopefully be comprehensible. The output looks something like this:

0 0
1 1
2 8
3 27
4 64
5 125
6 216
7 343
8 512
9 729
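
For anyone who finds the order of the two loop values confusing, the following Java sketch spells out what “DO” does with them. It is an illustration under my own assumptions rather than a description of any real Forth's internals: the names `DoLoopSketch` and `cubes10` are invented, and edge cases such as a starting value equal to the limit are ignored.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class DoLoopSketch {
    static int cube(int n) { return n * n * n; }

    // Mimics "10 0 DO CR I . I CUBE . LOOP" and returns the printed table.
    static String cubes10() {
        Deque<Integer> stack = new ArrayDeque<>();
        stack.push(10);          // pushed first, so it ends up underneath
        stack.push(0);           // pushed second, so it sits on top
        int start = stack.pop(); // DO takes the top value as the starting index
        int limit = stack.pop(); // and the one beneath it as the limit
        StringBuilder out = new StringBuilder();
        for (int i = start; i < limit; i++) { // the limit itself is never reached
            out.append(i).append(' ').append(cube(i)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(cubes10());
    }
}
```

Running it produces the same ten lines as the table above, from "0 0" through "9 729".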

Forth is built entirely from words like the ones we’ve just created. In fact, calling Forth a programming language may be something of a misnomer because virtually every piece of its vocabulary is redefinable. Forth comes with a dictionary of common, useful words, but the programmer is always free to replace these with others of her own devising, to make Forth into whatever she wants it to be. The most basic words are not constructed from other Forth words but rather written as in-line assembly language. The programmer adds words to this base which do ever more complicated tasks, until finally she writes a word that subsumes the entire program. To take an example from Leo Brodie’s classic book Starting Forth (one of Forth’s chief products down through the decades has been horrid puns), a Forth program to control a washing machine might have this as its top-level word:


Each of the words referenced within “WASHER” would likely call several words of their own. “RINSE,” for instance, might look like this:


Each of these words would call still more words of its own, until we come to the level of fundamental assembly-language commands to control the CPU on its most basic level. Forth words can even create new words dynamically, resulting in programs that effectively rewrite themselves as they run to suit their environment.

Especially if you’re a programmer yourself, you may have already formed an impression by now of Forth’s strengths and weaknesses. And yes, contrary to the claims of many Forth zealots, the latter do exist in considerable numbers. Even leaving aside the strange reverse notation, which one can eventually get used to, Forth programs can be incredibly hard to actually read thanks to their reliance on pushing and popping to the stack, with the associated lack of helpful variable names. For this reason Forth has occasionally been called a “write-only” language; Forth code can be well-nigh incomprehensible even to the person who originally wrote it after just a week or so has elapsed. It’s the polar opposite of a contemporaneous language I once wrote about on this blog, Pascal, replacing the latter’s pedantic emphasis on structure and readability above all else with a love of hackerish tricks, sleight of hand, and cleverness that can sometimes come off as sort of facile. Just trying to keep a picture in your head of the current configuration of the stack, something on which absolutely everything you do in Forth depends, can be a nightmare as programs get more complicated and their possible states get more varied. If not quite the last language in the world I’d use to write a complicated modern application, it must be pretty close to it. Its “write-only” qualities make it particularly unsuitable for team projects, a problem given that most useful software long ago got too complicated for solo programmers.

Yet there’s also an uncompromising beauty about Forth that has drawn many people to it, a beauty that has occasionally been compelling enough to override people’s better judgment and make them use the language for purposes to which it really isn’t terribly suited. Whatever else you can say about it, it sticks to its philosophical guns tenaciously. There’s a fascination to building a dictionary of your own, to effectively making a programming language all your own. Grizzled Forth programmers have often replaced virtually everything that comes with the language to create something that is absolutely theirs. That’s a rare experience indeed in modern programming. People who love Forth really love it. This (in Leo Brodie’s words) “high-level language,” “assembly language,” “operating system,” “set of development tools,” and “software design philosophy” has that rare ability, like my old love the Commodore Amiga, to inspire a level of visceral, emotional commitment that smacks more of romance or religion than practicality.

If we do insist on speaking practically, within certain domains Forth excels. It’s still widely used today in extremely constrained environments where every byte and every processor cycle counts, such as, well, the firmware inside a washing machine. To understand what makes Forth so efficient, we need to first understand that those more readable Java functions I showed you above must ultimately be converted into a form pretty close to that we see in the Forth versions. By making us meet the computer halfway (or further), Forth eliminates a whole lot of shuffling about that costs precious processor time. A well-written Forth program can actually be smaller than its pure assembly-language equivalent — much less the same program written in some other high-level language — because Forth so emphasizes reusable words. And it can be surprisingly easy to port Forth programs from computer to computer; one need only re-implement that bottommost layer of words in the new machine’s assembly language, and leave the rest alone.

Of course, all of these advantages that make Forth so attractive to programmers working on embedded systems and device firmware today also made it mighty appealing to programmers of ordinary PCs of the late 1970s and 1980s, working as they were under stringent restrictions of their own. For some early PCs Forth was the only language other than the ROM-housed BASIC or assembly language that made any sense at all. Stripped down to its essentials, Forth can be tiny; for example, Cognetics Corporation, a developer we met in a recent article, worked with a version of Forth that fit into just 6 K. Thus Forth enjoyed considerable popularity, with a fair number of games and other commercial software written in the language. John Draper, the legendary “Captain Crunch” who taught Steve Wozniak and Steve Jobs how to phone phreak amidst myriad other hacking accomplishments, was a particular devotee, distributing a Forth development system for the Apple II which he also used to write the II’s first really usable word processor, EasyWriter. Many of the magazines ran columns or extended series on Forth, which was available, and generally in multiple versions, for virtually every remotely viable machine of the era. One British computer, the ill-fated but fascinating Jupiter Ace, even included Forth in ROM in lieu of BASIC. Tellingly, however, as the 1980s wore on and software got more complex Forth became less common amongst commercial application and game developers, even as it retained a dedicated cult of hobbyists who have persisted with the language to this day. According to Charles Moore, this was as it should be. Forth, he told Jerry Pournelle in Byte’s March 1985 issue, had never been intended for closed-source commercial software.

Writing big programs to be distributed in object code is a distortion of what Forth is all about. Forth is like a set of craftsman’s tools. You use it to make still more tools that work with whatever you specialize in. Then you use it to solve problems. Forth programs should always be distributed in source code. You should have Forth online at all times. Recompile whenever you want to use a program. Forth programs are tailored, they’re living and dynamic, not static object code.

“Distortion” or not, the most important Forth game, and arguably the most ambitious project ever completed in the language, would appear more than a year after those remarks. I know I’ve been teasing you with it for a while, but, with all the pieces in place at last, we’ll get there next time… really, I promise.

(Probably the best place to look to get an idea of the excitement Forth once generated, as well as a very good picture of the language itself, is the August 1980 Byte, which had Forth as its main theme. My example code in this article has its origins there, as does the picture.)



25 Responses to The Forth Dimension

  1. Matthew

    October 17, 2014 at 2:04 pm

    Heh. Never knew Atari used it for Pinball and Arcade games. For those who want to try it out just in a browser, there’s a few online versions, for example

    (I never really used it much, but I’ve played with it a little out of historical interest).

  2. Jayle Enn

    October 17, 2014 at 2:35 pm

    A variant called Multi-User Forth is still in use as a compiled language on some ‘MUCK’ servers. Access usually requires special permission or an experimenter server install of your own, so I’m not sure how similar it is to the traditional variety beyond having the stack exposed.

    • Felix

      October 17, 2014 at 5:17 pm

      MUF is higher-level than regular Forth. Off the top of my head, it has proper strings, arrays and local variables. Moreover, it has a whole bunch of words to manipulate the game database, since it’s designed for scripting multiplayer games. Otherwise it’s fairly similar to the original language, but the extra features make it easier to write readable programs in it.

      (I might add that nowadays compiling MUF code does little apart from applying basic peephole optimizations. But that can help to a surprising degree.)

  3. Felix

    October 17, 2014 at 5:09 pm

    I designed my own Forth dialect for an interpreter construction tutorial, because it was so easy to implement, and therefore ideal for teaching the subject matter. But then I found myself unable to do any serious work in my very own programming language! How’s that for irony.

    On the other hand, I know someone who learned MUF* as his first programming language, and confessed finding all other programming languages very strange afterwards. (He ended up a competent Python hacker anyway.) So yes, it’s a very strange language. But it’s a language one should know at least in theory because, much like Lisp, it changes the way one thinks about programming.

    * MUF: Multi-User Forth, a higher-level (and very specialized) dialect of the language, used for scripting MUCKs, a type of text-based virtual world.

    • Jimmy Maher

      October 17, 2014 at 7:02 pm

      Yeah, I found it interesting to play with when preparing for this article just because it’s *so* different from other languages. Nice to shift the old paradigms once in a while…

  4. ZUrlocker

    October 17, 2014 at 7:04 pm

    I remember that issue of Byte and Leo Brodie’s books. I enjoyed reading about Forth more than programming with it. Definitely a step up from assembler, but pretty rudimentary for someone used to a traditional procedural high level language like Pascal or even C. I remember there was a language by Apple II programming guru Paul Lutus called GraForth.

  5. Anthony

    October 17, 2014 at 9:43 pm

    It’s probably obvious but I’m stupidly having trouble figuring out what you mean by “all of the trinity of 1977”.

    Another interesting article, and a good tease for the next instalment!

    • Asterisk

      October 18, 2014 at 6:16 am

      He’s referring to the trio of extremely influential PCs that were announced within months of each other in 1977: the Apple II, the TRS-80, and the Commodore PET.

      These were the first personal computers that shipped in a complete, ready-to-use condition, and set the stage for the home computer revolution of the 1980s; earlier PCs were either kits that the user needed to hand-assemble, or were bare boards that required separate hardware for the most basic I/O functions.

  6. Ethan Johnson

    October 18, 2014 at 1:39 am

    I have just caught up with this blog! Now I can feel the agonizing wait until I figure out what the program written in Forth is.

    I know I’m just a baby who only knows C-based languages, but I can’t imagine that this is easy stuff to interpret or code in long stretches. I can see the efficiency though, which makes me curious as to exactly why I’d never really heard talk about it before.

    Thanks for your work, and specifically your citing of sources. This blog has been fantastic in helping me find historically relevant information for a project I’m putting together.

    Just an off question relating to a much earlier blog, was there a source for Ultima 2 having one of the earliest patches? I’m very interested in any findings you have on early PC patching.

    • Jimmy Maher

      October 18, 2014 at 8:37 am

      I don’t think there needs to be a lot of mystery about the program coded in Forth at this point, but feel free to enjoy the sweet agony anyway. ;)

      An updated version of Ultima II is mentioned in Computer Gaming World’s review in the March/April 1983 issue. John Williams also mentioned it to me during our correspondence, and the actual patch disks are listed as hosted at the University of Texas’s Richard Garriott archives, although I haven’t seen them personally. Good luck with your research!

  7. charlie pye

    October 18, 2014 at 1:52 am

    That looks like the kind of language I would absolutely despise. I mean, I admire its efficiency, but any kind of slip up and you’ve got a wrong number in the stack, which won’t give you any error message but just give the wrong result. It must be a nightmare to debug.

  8. Matthew

    October 18, 2014 at 6:03 pm

    I just realized, I had completely forgotten all about the biggest use of Forth anywhere: It wasn’t just the Jupiter Ace that had it in ROM.

    Open firmware contains a Forth interpreter. Note that all Sun Sparcs past a certain vintage have it, as did all PowerPC macs, and a bunch of other machines. Dying off now (Apple switched to EFI when they went Intel, and not many Sparcs are being sold these days), but in its time it was very big.

  9. Nate

    October 19, 2014 at 6:00 am

    It’s interesting to note that in the last decade there has been somewhat of a fringe renaissance of interest in Forth-like stack-based languages. Two in particular stand out to me:
    Manfred von Thun’s academic language Joy, which generalises Forth to Lisp/Scheme-like functional programming, and Slava Pestov’s Factor, which takes Joy and turns it into a practical actual language which really does resemble the love-child of Forth and Common Lisp.

    Oh and if you didn’t know, Postscript – you know, that page description language that runs in every laser printer – is a stack language much like Forth! And used for much the same reasons, I believe, since RAM consumption and CPU speed have always been problems in printers.

  10. James

    January 29, 2015 at 11:33 pm

    I skimmed this one, eyes slightly glazed, but am fairly sure in
    “it not only ran faster than just above any language short of assembly, but in the right”

    you meant ‘just about any language’

    • Jimmy Maher

      January 30, 2015 at 7:26 am

      I did. Thanks!

  11. Peter Piers

    December 16, 2015 at 8:41 pm

    What a beautiful-looking language (said the non-programmer who was introduced to BASIC early in school and never got to grips with I6. AGS was comfortable, though).

  12. CJGeringer

    January 12, 2016 at 5:53 pm

    Regarding the phrase “the “10CUBES” word expects nothing on the”

    The word defined by the code is actually “cubes10”

    • Jimmy Maher

      January 12, 2016 at 6:08 pm


  13. Daryl

    July 23, 2016 at 6:33 pm

    I have found that forth is the most powerful programming paradigm I have ever seen. It is the only language capable of transformation into a purely ideographic language. It not only gives the manipulation of data in words, but also defines the compiler itself in words. In no other languages is the or can the compiler be defined as part of the language.

    • DZ-Jay

      March 5, 2017 at 12:19 pm

      Hmm… That’s not quite accurate, unless I misunderstood your comment. Programming languages become truly useful and fully expressive when they reach a sort of singularity in which their compilers can indeed be written in their own tongue.

      This must be true for any so-called “systems programming language,” which must be able to express the sort of idioms which are in turn used to build other programming languages.

      As I recall, the C compiler is written in C, Pascal is written in Pascal (Anders Hejlsberg’s editions at least: Turbo Pascal, Object Pascal, Delphi, etc.). I believe other languages are as well, though admittedly many are written in C just because their expressivity is not intended to aim at systems programming, or perhaps because their designers are more familiar with C than their new language (this in itself tells you a lot about the philosophy inherent in the new language, when its own creators are unable or unwilling to apply it for their own complex work).

      In any case, I’ve heard of Forth in the very abstract, as a mythical and ancient beast of yore whose relevance passed with the Age Of Dinosaurs. However, I’ve never actually seen it. Now that I have, I’m intrigued and awestruck by its simplicity, beauty, and elegance of design.

      If such shallow introduction to the language could resonate so thunderously and powerfully within my programming soul, I feel rather embarrassed that I never cared enough before to look into it.

      I must now read more on it and I’m even dreaming about making my own compiler for my own purposes: game programming for the old Intellivision console, which as an avid home-brewer, I heretofore only did in Assembly Language.

      Thank you for the introduction to a new paradigm of programming.


      P.S. As a fellow programmer who finds himself having to teach complex concepts to the layperson, my only criticism to your fascinating treatise above is that your description of a stack was underwhelming and a bit confusing. May I suggest that instead of introducing technical terms of art like “last-in-first-out,” you use real-world analogies like the ever-clichéd “stack of trays in a cafeteria” or the like. In my experience, old and overused as they may appear, people do respond to these analogies much more powerfully. :)

      • DZ-Jay

        March 5, 2017 at 12:23 pm

        Hmmm… By the way, the first three paragraphs are a response to Daryl, the rest a general comment to the author of the article.

        I don’t know why I mixed them up. Sorry.

  14. Daryl

    July 23, 2016 at 6:52 pm

    Here’s an illustration:
    DUP is an up arrow, Drop a down arrow. Push and Pop are up and down triangles. Swap is a leftright arrow. Over is an up down arrow. Store is a left arrow. Etc.

    I have gotten it so every forth word is an ideographic telling what it does. Being one symbol, combination of symbols becomes the decomposition of symbolic words or ideographs.

    I.e. Nip is s

  15. Daryl

    July 23, 2016 at 7:20 pm

    Design decisions. Let’s say we were designing an assembler for the x86 in c. The register AX becomes a token, controlled by the assembly.

    Well in forth AX can be a word that places its bit equivalent in the output stream or something equivalent. I.e. forth paradigm eliminates both the tokenization and lexical analysis of compiler design. Awesome.

  16. Daryl

    July 24, 2016 at 3:06 am

    I dabbled with different forth threading techniques. Direct, indirect, subroutine even lisp list based threading. You can create a forth compiler on emu8086 to view the awesome power it contains.

    Instead of everything being a word, I went a step farther. Everything was a symbol. White space is a symbol that delimits words. All other symbols add to the current word until a white space.

    I am so fascinated with the versatility and power of the forth paradigm. It’s not just a programming language, it’s a paradigm far greater than object oriented programming. It is word oriented programming.

    I saw that the assembler code words that define the core forth can in fact be defined as assembler threads. Ie.


    Now take the swap word. [SI] AX and XChg are assembler words. Thus the assembler itself becomes a thread of those three words. Interpreted as a forth asm thread that creates the binary. In what other language can you do this? Lisp maybe. No, asm would have to be an s-expression. Only forth

