4tH compiler Wiki

A Forth compiler with a little difference

Brought to you by: thebeez

This is Forth

"I've seen a lot of Forth programs that look very much like C programs transliterated into Forth, and that isn't the intent. The intent is to have a fresh start."
- Chuck Moore

4tH is a derivative of Forth, that elusive programming language that had a short bout of popularity in the early 1980s before sinking back into oblivion. Forth is really a strange language. It seems really simple at first sight, but it is fiendishly difficult to master. The tiniest imbalance of the stack will lead to a fatal error - and you are bound to make an error with every single branch or iteration.

The second problem is that it doesn't resemble any other language at all. Almost every language construct imaginable is there, but if you want to write true Forth, you only resort to them when you can't make it work with the only native language construct Forth offers, the stack.

OK, you say, the stack. What about it? It's just an array with a different name. But no, it isn't. The first two items on the stack are the only ones you can easily access. The third, with a bit of trouble. Make your stack deeper than three items and you're in for a world of pain. Compare it to C. With C you can shoot yourself in the foot, with Forth you blow your head clean off.
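To make that concrete, here is a small sketch in standard Forth (not specific to 4tH) that reaches for the first, second, third and fourth item of a four-item stack. The numbers are arbitrary illustration values, and the comments show the data stack with the top item rightmost.

```forth
10 20 30 40      ( 10 20 30 40 )
dup  .           \ top item: DUP copies it, prints 40       ( 10 20 30 40 )
over .           \ second item: OVER copies it, prints 30   ( 10 20 30 40 )
rot dup .        \ third item: ROT digs it up, prints 20    ( 10 30 40 20 )
rot rot          \ ...and two more ROTs put it back         ( 10 20 30 40 )
3 pick .         \ fourth item: only by counting depth, prints 10
2drop 2drop      \ clean up                                 ( )
```

The top two items cost one word each. The third already needs three ROTs just to look at it and leave things as they were, and the fourth can only be reached by counting how deep it sits, which is where the mistakes start.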

No, I'm not going to tell you why I prefer to use Forth over C. I'm going to tell you what your life as a Forth programmer looks like by introducing you to a small and simple program that approximates PI. The algorithm is allegedly designed by Tamura and Kanada, two Japanese PI researchers.

float array a
float array b
float array c
float array y

: tamura-kanada                    ( n -- fpi )
  >r 1 s>f a f!
  1 s>f 2 s>f fsqrt f/ b f!
  1 s>f 4 s>f f/ c f!
  1 s>f
  r> 1 do
      a f@ fdup y f!
      b f@ f+ 2 s>f f/ a f!
      b f@ y f@ f* fsqrt b f!
      c f@ fover a f@ y f@ f-
      fdup f* f* f- c f! 2 s>f f*
  loop
  fdrop
  a f@ b f@ f+ fdup f* 4 s>f c f@ f* f/
;

I found this in the manual of a Forth compiler. There are as many Forth compilers as Forth programmers, since every Forth programmer writes their own. Well, maybe even more, because a Forth compiler can be made without having attended a single computer science class in your life.

So what does this tiny program tell us? It tells us this is written by someone who wrote a compiler, but never wrote a word of Forth in his life. Why? Well, not every Forth has local variables - and even then their use is controversial in the Forth world. This guy decided to allocate four global variables to compensate. Four variables for a short routine like that is unheard of. You just don't do that. You use the stack.

It is so bad you can even reconstruct the original code. In BASIC, it looks like this:

 5 n = 4
10 a = 1.0
20 b = 1/SQR(2)
30 c = 0.25
40 d = 1.0
50 FOR x = 1 TO n
60 y = a
70 a = (b+a)/2
80 b = SQR(b*y)
90 c = c - (d*(a - y)^2)
100 d = d*2
110 NEXT x
120 PRINT ((a + b)^2)/(4*c)

Now if you want to approach this the Forth way, you have to study this program carefully. First, you need to see where a variable enters and leaves a scope - because instead of in a variable, it has to reside on the stack.

Added bonus: this is a floating point program. Floating point support is kind of an "add-on" in Forth. It has its own stack (or it doesn't, depending on the compiler you're using) and hence its own stack commands.
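As a rough illustration of those floating point stack commands, here is a sketch that assumes a system where the standard floating point words (s>f, fdup, fswap, f*, f+, fsqrt, f.) are already loaded. The word fhypot is a made-up name for this example, not part of any library. The f: comments show the floating point stack with the top item rightmost.

```forth
: fhypot         ( f: x y -- sqrt[x^2+y^2] )
  fdup f*        ( f: x y^2 )
  fswap fdup f*  ( f: y^2 x^2 )
  f+ fsqrt       ( f: sqrt[x^2+y^2] )
;

3 s>f 4 s>f fhypot f.   \ prints 5, exact formatting varies by system
```

Every shuffle word has an f-prefixed twin (fdup, fswap, fover, frot), so juggling floats is the same game as juggling integers, just played on a different stack.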

Integer Forth has another trick up its sleeve, the return stack. Although that stack is primarily used to save return addresses on, most Forth programmers use it as a kind of "reserve stack". You can safely do so, as long as you clean up before that big "RETURN" comes up. If not, it will happily attempt to jump to your string length or structure address, whatever it is you left there, and make your life on earth a living hell.
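A minimal sketch of that "reserve stack" habit, in standard Forth. The word sum-outer is a hypothetical name invented for this example. It parks the top item on the return stack, works on what lies underneath, and fetches the parked item back before the word returns.

```forth
: sum-outer      ( a b c -- b a+c )
  >r             \ park c on the return stack   ( a b )    ( R: c )
  swap           \ get a on top                 ( b a )    ( R: c )
  r>             \ clean up before the return   ( b a c )  ( R: )
  +              \                              ( b a+c )
;

1 2 3 sum-outer . .   \ prints 4 2
```

Every >r is matched by an r> inside the same definition. Leave one out and the word returns to whatever address the number 3 happens to be.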

Anyway, the only item the other man left on the stack is our d variable, which on closer examination is not a floating point number at all. It's an integer that is multiplied by 2 in line 100. So that one doesn't belong on the floating point stack in the first place. y is just a temporary copy of the original value of a (line 60) and goes out of scope after line 90. Which leaves us with a, b and c. c isn't used before or after line 90.

Unfortunately, there is only room for a, b and a temporary value on the floating point stack. We just can't reach c anymore at the most crucial moment, so we just can't get rid of all the variables. On the other hand, three out of four isn't bad:

float array acc

: tamura-kanada                      ( n -- fpi)
  >r 1 s>f 4 s>f f/ acc f!
  1 s>f fdup 2 s>f fsqrt f/ fswap    ( f: b a)
  1 r> over                          ( 1 n 1)

  do                                 ( f: b a)
    >r fover fover f+ 2 s>f f/       ( f: b y a')
    frot frot fswap fover f* fsqrt   ( f: a' y b')
    frot frot fover fswap f- fdup f* ( f: b' a' [a'-y]^2)
    r@ s>f f* acc f@ fswap f- acc f! r> 2*
  loop drop

  f+ fdup f* acc f@ 4 s>f f* f/
;

Now, does it run faster? Hardly. Is it any shorter? Not much, if any. Is it easier to maintain? Only for Forth programmers, I think. So why bother? Because this is Forth.