It's not easy being a BDFL

Languages develop. That's a fact of life. If they didn't, we would only see 1.0 versions with very high patch levels - and that's hardly the case. Contrary to popular belief, the bulk of the changes do not originate from functional enhancements. If you've modeled your language after an existing one, achieving and maintaining compatibility may play a big role as well. And you may be forced to rectify flawed design decisions you made earlier.

In any other program that is usually not much of a problem. OK, maybe you chose a different file format, but that's something you can fix with a trivial conversion program. However, if you design a language you don't have that luxury. Why? Because now the existing programs of your users may fail to compile or run - or even worse: your changes may introduce subtle bugs in a program that worked fine before. That's called "breaking code" - and it's considered a big "no-no".

If you violate this "social contract" between you and your users the repercussions can be quite serious. People may refuse to upgrade and prefer to use a previous incarnation of your language. Or they may fork it - which is the equivalent of splitting up your country, instantly turning all of your citizens into refugees - who would rather apply for asylum in the newly created realm than live under your rule for a day longer.

So, yes, you may be a ruler with absolute power, but that doesn't mean you shouldn't wield that power carefully - and in a somewhat predictable way.

Character

Every language has a certain character. PHP and Perl are "have we thrown in the kitchen sink yet?" languages, which are not particularly elegant. Pascal and Ada are the computer science equivalents of "Sam, the American eagle" - very rigid, very formal, very reliable. Java is an overprotective mother, who doesn't allow her children to play in a playground that isn't plastered with rubber tiles, certified by a dozen agencies and guaranteed to be absolutely and completely safe. Forth and Lisp are minimalist languages, created by brilliant minds to be used and loved by brilliant people. C is merely a pragmatic, very high-level assembly language.

When you create a language, you have to determine its character, since it will help you to make more consistent decisions in the future. 4tH was meant to be Forth, but simplified, with an added safety net and behaving more like a traditional language like BASIC or C. I still hate Forth's "counted strings", the ridiculous behavior of words like CMOVE and CMOVE>, the vast number of (slightly) different integer types and the subtle differences in behavior that come with them - and the absurd, bolted-on C-like ANS-94 wordlists like "LOCALS" or "FILE".

On the other hand I knew that people - myself included - would port Forth code to 4tH. So at least some compatibility was required. DO..LOOP has always been problematic in Forth - and it still is. I tried to fix it as much as I could. Others tried as well - but any real fix would break existing code. Lots of code. So not much has changed in the last 40-odd years.

One of the efforts to "fix" the awkward behavior of DO..LOOP was the invention of ?DO. You see, in "normal" languages, when the starting conditions of a counted loop imply that there should be no loop at all, no loop is entered. Not even one single iteration is performed. Forth, however, enters the loop anyway. So some genius invented a ?DO that skips the loop, but only if the start value equals the limit. A start value beyond the limit still enters the loop. So it didn't even begin to cover the problem at hand.
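To make the difference concrete, here is a toy model in Python rather than Forth. It is only a sketch under stated assumptions: it assumes ANS-style semantics, where LOOP adds one to the index and stops when the index reaches the limit, and it uses an 8-bit cell so the wrap-around stays countable. The name `iterations` and the constant `CELL_BITS` are made up for this illustration; this is not how 4tH or any Forth implements its loops.

```python
CELL_BITS = 8                  # toy cell width (assumption); real cells are wider
MASK = (1 << CELL_BITS) - 1


def iterations(limit, start, question_do=False):
    """Count the passes a DO..LOOP (or ?DO..LOOP) makes in this toy model."""
    limit &= MASK
    index = start & MASK
    # ?DO skips the loop entirely, but only when start equals limit.
    if question_do and index == limit:
        return 0
    count = 0
    while True:
        count += 1                   # the body runs before the test
        index = (index + 1) & MASK   # LOOP adds 1, wrapping around
        if index == limit:           # stop once the index reaches the limit
            return count


print(iterations(5, 0))                     # 5: the ordinary case
print(iterations(0, 0))                     # 256: DO wraps through the whole range
print(iterations(0, 0, question_do=True))   # 0: ?DO catches start = limit
print(iterations(3, 5, question_do=True))   # 254: start beyond limit still loops
```

In this model ?DO only rescues the one case where start and limit are equal; a start value beyond the limit still runs through almost the entire number range.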

I contemplated that one for a very long time. Especially since it required its own token, which is a very limited resource in 4tH. Words would kill their own mother if that would get them their own token. Of course, I could simply replace DO by ?DO. But this choice could introduce some very obscure bugs if a loop was actually meant to be executed at least once. And there were already some differences between 4tH's and Forth's DO..LOOP. So, one fine day I heaved a deep sigh and implemented it.

The main reason was that it was really hard to emulate ?DO..LOOP in code:

: ?do-loop over over = if drop drop else DO LOOP then ;

It's a lot of code, it's dead ugly and it's very hard to understand. I know. I did it a few times. I didn't like it.

Sometimes, it's just that the correct solution eludes you. You might ask what's wrong with S". Still, it took some major design work to get that one in. The reason is that string constants reside in a different area than string variables - and in the early days there were special words in 4tH to transfer them over from one area to the other. The trick I applied can still be found in 4tH today.

In Forth, PAD is a landing space for user-defined temporary strings. In 4tH, it is a circular buffer that accommodates words like S". In 4tH, S" silently transfers the string constant to PAD and returns that address, taking care not to clutter other temporary strings needlessly.

Another thing I highly value is intentionality and readability. So, when I found myself using DUP XOR a lot, I created >ZERO. Furthermore, I find expressions containing 0= hard to read. Hence, the birth of EXCEPT (as a replacement for WHILE 0=) and UNLESS (as a replacement for 0= IF). Now that I think of it, NOT was a lot easier to read - but hey - I didn't write the ANS-94 standard.

But these are the easy ones. They're just syntactic sugar. If you hate 'em - don't use 'em. The old way still works fine. But what if you really, really don't like something? I fought the introduction of CASE..ENDCASE in 4tH for over a quarter of a century, claiming tables were the better solution. And I still believe they are. Until I found out that the selection of a small number of integers required a CASE..ENDCASE-like solution if you wanted the very best performance.

For a long time I racked my brain trying to find something different - as long as it wasn't anything like that cursed CASE..ENDCASE! In the end, I concluded that every solution I could come up with would be non-standard and awkward. So I gave in.

Changes to the core - those are really hard. If you can solve the problem with a library, that's much easier - since you can choose to use the library or not. So, yes, nowadays 4tH supports double words, mixed words and even floating point. But only if you use the appropriate libraries. Yes, 4tH even supports Object Orientation - but not by implementing anything in the core. It's just one huge bag of syntactic sugar, which is generously applied by using the preprocessor.

There shall be only one?

In short, I like to give users an (informed) choice. I don't believe in "there should be one - and only one obvious way to do things". Not every painter uses the same brush - so if there is a real choice, I'd like to make it available. As long as the character of 4tH remains intact. However, if you do embrace that paradigm, you make your life a lot harder. As Guido van Rossum found out the hard way.

If you do not know the guy, he is a Dutchman who designed the Python scripting language. That may explain the "out-of-the-box" choices we make, since we both are Dutch. I can't say I always agree with his choices, but I find some of them pretty interesting and original. Guido also openly appreciates the character of his language.

And that's why PEP 572 was his worst idea ever. In short, the "walrus operator" turns an assignment into an expression, which returns the assigned value. It allows you to chain several expressions together, resulting in longer lines. Which is not very Python-like. It also violates Guido's own "there should be one obvious way to do things" paradigm. And finally, it's just an ugly "C-ism".
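For readers who haven't met it, here is a minimal sketch of the same task written both ways, assuming Python 3.8 or later (the first release with `:=`). The function names are made up for this illustration.

```python
import re


def first_number_classic(text):
    """Assignment as a statement: bind first, test on the next line."""
    match = re.search(r"\d+", text)
    if match:
        return int(match.group())
    return None


def first_number_walrus(text):
    """Assignment as an expression: bind and test on one line."""
    if (match := re.search(r"\d+", text)):
        return int(match.group())
    return None


# The assignment expression yields the value it just assigned.
total = (n := 3) + 1   # n is 3, total is 4
```

Both functions do exactly the same thing, which is the "two obvious ways" objection in a nutshell.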

I don't mean it's bad in C - on the contrary, it fits C perfectly. But in Python, it's just an ugly bolt-on kludge. In short, the PEP was accepted and Guido quit his job as BDFL. Maybe for the better, because when a leader loses his authority - since he obviously violates his own principles - it's really time to go.

I, on the other hand, still like this job, and although I also approach the ripe old age of 62, I have no desire to retire. Which means I give every decision I make a lot of thought. And I try to limit the number of changes I introduce with each release. Ideally, you recompile your code and continue where you left off. If only life were that simple...

But if you can't avoid drastic changes, allow people a little time to adjust. Support the previous syntax and behavior for a little bit longer if you can. If all that fails, make sure there is a real benefit to these new features. An improvement in readability. A vast amount of brand new functionality. Something to make it worthwhile. Something to balance out the effort of fixing what shouldn't have been broken in the first place.

Yes, the best solution is always to make things faster, smarter and more versatile - without affecting the original functionality. Add functionality - without changing anything else. Transparent improvement.

But sometimes you're faced with a real dilemma. Recently I had to entertain three possibilities:
1. Consistent, fast, but more cumbersome and less elegant;
2. Consistent, slow, but concise and elegant;
3. Inconsistent, but versatile - since it comprises both solutions.

I'm sure there are good arguments for each choice. But being a BDFL means I have to make that choice. All things considered, the difference between the first two solutions and the last solution was only one single line. So I left the choice to the user - and implemented all three separate solutions. One of 'em being a walrus operator.

The other, a SET() function. Which opened up an interesting question. What if Guido hadn't gone for a murky 'walrus' operator, but for a function? Would that have eradicated all the controversy? We'll never know.
