4tH philosophy

thebeez

Why 4tH isn't "modern by design"

Introduction

Some people dismiss 4tH because it isn't "modern" and "lacks" features like associative arrays, object orientation, regular expressions, quotations, garbage collection and even something as "fundamental" as type checking. Of course, most of these can be retrofitted (try our libraries if you can't live without them) - but that is not the question here. The crucial question is: do they really make a programmer's life that much easier? And are the resulting programs really that much better in each and every respect?

Datatypes

There is a reason why 4tH is the way it is. For me there are two different data-types, bytes and words. You can wrap them into arrays if you want, but that's it. A structure is simply a special kind of array. Period.

I have very little use for other data-types, since you can reduce them all to the two I mentioned. Some may say a list is a "basic" data-type, but it isn't, since a list is an abstraction in itself. For Pete's sake, even "multidimensional" arrays are an abstraction. Note that I have nothing against abstractions, but you only bring them into existence when they serve to solve a particular problem. Otherwise they don't exist and shouldn't be a part of your vocabulary.
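To make the point about "multidimensional" arrays concrete, here is a minimal C sketch (not 4tH code). The names `cells`, `cell_offset`, `cell_store` and `cell_fetch` are made up for this illustration. It assumes row-major order; the second dimension exists only in the index arithmetic.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 3, COLS = 4 };

/* One flat block of words; nothing here is "two-dimensional". */
int cells[ROWS * COLS];

/* Map (row, col) to an offset in the flat array (row-major order). */
size_t cell_offset(size_t row, size_t col)
{
    assert(row < ROWS && col < COLS);
    return row * COLS + col;
}

/* Store a value at (row, col). */
void cell_store(size_t row, size_t col, int value)
{
    cells[cell_offset(row, col)] = value;
}

/* Fetch the value at (row, col). */
int cell_fetch(size_t row, size_t col)
{
    return cells[cell_offset(row, col)];
}
```

Storing through `cell_store(2, 1, 42)` and reading `cells[9]` directly touch the same word, which is the whole abstraction.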

The problem with types is that they tend to get in your way. Even in C you might find yourself in a position where "type safety" is more of a nuisance than a help. And no, a void declaration does not always save you. Pointers to extended structures aren't of the same type. And what about malloc()? Since malloc() is typeless, it has no idea what you're going to store, so it can either align every allocation for the worst case or do some dirty cleaning up afterwards.
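A small C sketch of the "extended structure" nuisance, under stated assumptions: `struct shape`, `struct circle` and the three functions are hypothetical names invented for this example. The address of a `struct circle` is also the address of its first member, yet the two pointer types are not interchangeable without a conversion.

```c
#include <assert.h>
#include <stdlib.h>

struct shape  { int kind; };
struct circle { struct shape base; int radius; };  /* "extends" shape */

/* Works on anything that starts with a struct shape. */
int shape_kind(const struct shape *s)
{
    return s->kind;
}

/* malloc() returns void *, aligned suitably for any object type;
   the caller decides what the bytes mean. */
struct circle *circle_new(int radius)
{
    struct circle *c = malloc(sizeof *c);
    if (c != NULL) {
        c->base.kind = 1;   /* hypothetical tag meaning "circle" */
        c->radius = radius;
    }
    return c;
}

int circle_kind(const struct circle *c)
{
    assert(c != NULL);
    /* Passing c itself to shape_kind() is a type mismatch, although the
       address is identical; C allows converting a struct pointer to a
       pointer to its first member, so the cast is needed and legal. */
    return shape_kind((const struct shape *)c);
}
```

The cast in `circle_kind` changes nothing at run time; it only exists to satisfy the type checker.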

4tH on the other hand, always knows exactly what you're doing. @C is for the Code segment, @ is for the Integer segment and C@ is for the Character segment. I don't care what abstractions you or me throw on top of them, but it always boils down to this.

Quotations

"Quotations" are simply unnamed execution tokens. If you want them in 4tH, the only thing I would have to do is map [: to :NONAME and ;] to ; (which, by the way, is exactly what happened in v3.62.2). Bottom line: if you want "quotations", you already have them. It's a fancy way to group a bunch of instructions, but I fail to see how they make life significantly easier.

Some claim quotations make it easy to make control structures. That may be, but I know at least three other ways to implement control structures. They all have their advantages and disadvantages. I decide which one to use depending on the problem I want to solve. Quotations are one way to do it, but certainly not the only way or the best way.
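As a rough analogy in C (not 4tH code): an execution token behaves much like a function pointer, and a home-made control structure is just a routine that takes one. C has no unnamed functions, so the tokens here carry names. `xt_t`, `twice`, `succ` and `times` are hypothetical names for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* An "execution token": the address of a routine that transforms one word. */
typedef int (*xt_t)(int);

int twice(int n) { return n * 2; }
int succ(int n)  { return n + 1; }

/* A home-made control structure: apply the token `count` times to `value`. */
int times(int count, xt_t xt, int value)
{
    assert(xt != NULL);
    while (count-- > 0)
        value = xt(value);   /* every pass pays for an indirect call */
    return value;
}
```

Calling `times(3, twice, 1)` doubles 1 three times. It is one way to build a loop, and the indirect call on every pass is the price.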

"Fancy stuff"

Another problem with fancy stuff is that it introduces overhead. Local variables are variables any way you look at it. They come with the overhead of fetching and storing them. Worse, you have to allocate them at the start of a subroutine and tear them down afterwards. Execution tokens are still subroutines and come with function call overhead.

The most beautiful example must be "garbage collection", which was created for "programmers" who are evidently completely unable to balance their malloc() and free() calls (let alone balance a stack in a 64K program). The consequence is that your program now comes to a grinding halt at the most inconvenient moment so it can figure out the mess you've made in the meantime and try to make sense of it. That gave rise to a whole new class of specialists who are hired to "tune" the engine so the whole thing becomes workable again.

We've seen this before when SQL was created. It was designed as a successor of COBOL to enable laymen to query a database. Never mind that it had at least a couple of dozen fundamental problems; it didn't meet its design objectives by far. Instead, we created yet another bunch of specialists who were not only able to write SQL, but also to persuade the "optimization engines" to give you some acceptable performance.

And to this very day SQL lies on top of whatever you throw at it, with no lexical binding at all, leaving you to "map" the results to whatever language you're actually using. If you're doing any web programming, you have to use a myriad of languages, which are subtly interwoven, giving the word "debugging" an entirely different meaning.

Yes, 4tH offers a tiny RDBMS in its libraries, but it is tightly embedded. A structure serves as a database buffer, so no mapping is required. When you retrieve a record, you can access it by its field name straight away. That's architecture. That's design.
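The idea of a structure doubling as the record buffer can be sketched in C. This is only an analogy and does not show the API of the 4tH library. The record layout `struct person`, the in-memory `table` and all function names are assumptions made up for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fixed-length record; the struct itself is the buffer. */
struct person { char name[16]; int age; };

enum { MAX_ROWS = 8 };

/* The "table": raw bytes, as they would sit in a file or a block. */
unsigned char table[MAX_ROWS * sizeof(struct person)];
size_t row_count = 0;

/* Fill a record, zeroing it first so no stray bytes end up in the table. */
void person_set(struct person *p, const char *name, int age)
{
    assert(p != NULL && name != NULL);
    memset(p, 0, sizeof *p);
    strncpy(p->name, name, sizeof p->name - 1);
    p->age = age;
}

/* Store a record as raw bytes. Returns the row number, or -1 when full. */
int person_insert(const struct person *p)
{
    if (row_count == MAX_ROWS)
        return -1;
    memcpy(table + row_count * sizeof *p, p, sizeof *p);
    return (int)row_count++;
}

/* Fetch a row straight into the struct. Returns 1 on success, 0 if absent. */
int person_fetch(int row, struct person *p)
{
    if (row < 0 || (size_t)row >= row_count)
        return 0;
    memcpy(p, table + (size_t)row * sizeof *p, sizeof *p);
    return 1;
}
```

After a `person_fetch`, the fields `name` and `age` are usable right away. There is no result set to unpack and no separate mapping layer.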

Finally, I don't see how they make life easier - in other words: why they're always a better solution than alternative solutions.

Forth was designed to unobfuscate everything - the stack, the concatenation, the postfix notation, etc. - and open it all up to the programmer, so he can make full use of it, and (also) to put the burden of handling it on the programmer - and not the computer. In other words, to unabstract the language.

Some people try to mix these "modern" constructs with this basic philosophy - and you can't. It simply never becomes coherent, because you can't expose what you're trying to hide.

Benefits?

Some people claim that "reuse" is only possible with object orientation - which is plain nonsense. Libraries were there long before object orientation, and the fact that they are in widespread use proves you don't need object orientation to reuse code.

Forth allows you to factor, factor and refactor. I can't imagine any language that allows for a more fine-grained reusability than Forth. 4tH itself comes with more than 400 (!) libraries.

I can write stuff faster in 4tH than in any other language. I wrote an application program with 64K of source code in five days - and took two more for the proper documentation. When a client took the functional specs and offered them to a specialized company, the word was that it would take a programmer three months to write the same thing in C.

Some say Forth is a "write only" language, proving only that they can't write proper Forth. I have maintained lots of 4tH programs and some aren't quite trivial - even if they're relatively small.

E.g. TCS has two separate dictionaries and hence two different interpreters. Needless to say, maintaining a BASIC interpreter like uBasic, which I retrofitted with a structured extension, is far from easy. And I don't think you'll find that PP4tH is a "simple" program - and that one really evolved over time!

Why 4tH is the way it is

It's like Chuck Moore says: the industry has tried to make it easy for boys to program - but boys are not cut out to do the really complex stuff, no matter what kind of tools you put in their hands. That's left to real men - all information points in that direction. (You can choose to ignore that, because programmers - like actors - are very superstitious and would rather believe in holy men, holy rituals and holy scriptures than in cold, hard facts).

In other words, the tool you choose reflects your skill level and the complexity of your problem. I like Forth because it doesn't get in my way. Whatever is on the stack, I decide what to do with it and what abstraction it represents.

Yes, Fortran-like expressions can be awkward to express in Forth (and 4tH). Yes, they are hard to follow if you don't document them, duh. Note TEONW - I documented the formulas there. That's what comments are for - don't tell me the code says (or should say) everything. And if you want to use the Fortran way to express such formulas, then use Operator Precedence Grammar/Formula Translation by David Johnson: a particular solution to a particular problem.

The Forth advantage

Concatenative languages are easier to read (less noise) and more compact, because you pass the parameters transparently by the stack. I'll show you what I mean:

strcat(strcat(strcat(strcpy(hello,"Hello"), " "), "world"), "!");

You start reading a strcat(), while the first function executed is strcpy(). You read from right to left! Now take a look at the Forth version:

Hello >r s" Hello" r@ place s"  " r@ +place s" world" r@ +place s" !" r> +place

You read its source consistently from left to right, which makes it easier to parse as well. Of course, one could add optimizations like this:

: s! dup >r place r> ;
: +s! rot dup >r +place r> ;
s" Hello" Hello s! s"  " +s! s" world" +s! s" !" +s! drop

Or even this:

include lib/concat.4th
s" Hello" s"  " s" world" s" !" 4 Hello concat

Again, particular problem, particular solution. That's why there are several libraries in 4tH with the same basic functionality, but different characteristics.
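For comparison, the same left-to-right build-up can be sketched in C. The functions `place` and `plus_place` below are hypothetical C analogues, named after the Forth words only for readability. They assume a NUL-terminated destination that the caller has made large enough, and they take their arguments in C order, not in Forth stack order.

```c
#include <assert.h>
#include <string.h>

/* Copy `len` characters into dest and terminate. Caller sizes dest. */
char *place(char *dest, const char *src, size_t len)
{
    assert(dest != NULL && src != NULL);
    memcpy(dest, src, len);
    dest[len] = '\0';
    return dest;
}

/* Append `len` characters to the terminated string in dest. */
char *plus_place(char *dest, const char *src, size_t len)
{
    place(dest + strlen(dest), src, len);
    return dest;
}

/* Build the greeting both ways; returns 1 when the results agree. */
int same_greeting(void)
{
    char nested[16], stepwise[16];

    /* Nested calls: the innermost strcpy() runs first. */
    strcat(strcat(strcat(strcpy(nested, "Hello"), " "), "world"), "!");

    /* One step per line, executed in reading order. */
    place(stepwise, "Hello", 5);
    plus_place(stepwise, " ", 1);
    plus_place(stepwise, "world", 5);
    plus_place(stepwise, "!", 1);

    return strcmp(nested, stepwise) == 0
        && strcmp(stepwise, "Hello world!") == 0;
}
```

Both versions produce the same string. The stepwise one needs a named buffer on every line, which is exactly the noise the stack removes.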

Conclusion

Forth (and 4tH) is down to earth. You can put another abstraction layer on top of it - if and when you see fit - until you can express your problem in high level words. That's also why I never worry about "stack acrobatics" in low level words: it boosts performance (stack instructions are quite fast) and it doesn't clobber the main application logic.

If you work your way top down (top of the listing that is) you see the creation of these abstraction layers until they end up in a single word that says (and does) it all. If you work bottom up, you see this same word fall apart into basic Forth and basic data-types.

Sure, I helped you by pre-creating certain abstraction layers, which is the library. If you don't want or need to use it - it's all the same to me. But however you look at it, they remain separate entities, each one of them at its own discrete level - and there it stays.

I think the ultimate expression of that philosophy was implementing and breaking down Object Orientation, so you guys could see and understand what is really there - and frankly: it's not much.

  • Like a pointer is simply a variable holding an address;
  • Like a structure is simply an array with offsets;
  • Like a string is simply a character array with a terminator or a count.
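The three points above can be checked in a few lines of C. This is a minimal sketch, not 4tH code; `struct point`, `fetch_y`, `terminated_length` and `counted_length` are names invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct point { int x; int y; };

/* Fetch field y without naming it: base address plus offset. */
int fetch_y(const struct point *p)
{
    int value;
    const unsigned char *base = (const unsigned char *)p;  /* just an address */

    assert(p != NULL);
    memcpy(&value, base + offsetof(struct point, y), sizeof value);
    return value;
}

/* Length of a terminated string: scan for the 0 byte. */
size_t terminated_length(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Length of a counted string: the first byte holds the count. */
size_t counted_length(const unsigned char *s)
{
    return s[0];
}
```

`fetch_y` returns the same value as `p->y`; the field name is only a label for an offset.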

IMHO that's how you make programmers: not by hiding what is there, but by showing what is there. If you want to fix cars, you have to know how they work and what's under the hood. For programmers it is no different.