ECMA-55 Minimal BASIC compiler BASICK_git
BASIC Compiler and Interpreter
Brought to you by:
gatewood
| File | Date | Author | Commit |
|---|---|---|---|
| examples | 2013-05-01 | gatewood <> | [1f1e67] add power operator example |
| xtra | 2016-11-25 | gatewood <> | [e2550d] Add self-contained demos showing returning mult... |
| COPYING | 2012-01-30 | gatewood <> | [35b4f8] add license information |
| GRAMMAR.TXT | 2016-11-25 | gatewood <> | [601b88] add missing IDENTIFIER to dimtail production |
| README.TXT | 2016-11-16 | gatewood <> | [3af5e2] Document that four versions of the interpreter ... |
| STATUS.TXT | 2016-11-25 | gatewood <> | [b7904e] Lua >= 5.3 really supports integers |
| basick.lua | 2018-06-20 | gatewood <> | [083ec4] Update version number |
| basick.php | 2018-06-20 | gatewood <> | [083ec4] Update version number |
| basick.py | 2018-06-20 | gatewood <> | [083ec4] Update version number |
| basick.rb | 2018-06-20 | gatewood <> | [083ec4] Update version number |
BASICK is an integer BASIC interpreter in a very old-school style. It requires line numbers. It only supports integer math. It has no string support at all.

This implementation has four versions of the interpreter. These are for python, lua, ruby, and php.

This dialect includes END, REM, LET, IF, GOTO, GOSUB, RETURN, PRINT, INPUT, and DIM. The arrays created with DIM are integer-indexed single-dimension arrays of integers, with the first element in slot 0. The IF statement is a single-line IF statement that can have any single valid statement after the THEN, including another IF statement. The GOTO and GOSUB statements take an integer expression, allowing nice computed jumps and removing the need for the ON..GOTO and ON..GOSUB found in other BASIC dialects.

The arithmetic operators are +, -, /, *, and % for addition, subtraction, integer division, multiplication, and modulo. There is also support for unary minus. Parentheses are supported to override the normal operator precedence. The comparison operators are only <, =, and > at this time.

BASICK is case-insensitive, and converts everything to upper case when it is processing. Variable names are a single letter, optionally followed by a single digit. A variable declared with the DIM statement is a one-dimensional array as described above. Otherwise the variable is an integer scalar value.

Here is an example program demonstrating the amazing features of BASICK:

    10 REM **************************************************
    20 REM * GENERATE THE FIRST N FIBONACCI NUMBERS         *
    30 REM * N IS THE NUMBER OF ELEMENTS IN THE SEQUENCE TO *
    40 REM * GENERATE, I IS OUR LOOP INDEX, AND WE CALL THE *
    50 REM * SUBROUTINE ON LINE 240 TO GENERATE FIB(I).     *
    60 REM * X IS THE INPUT, Y IS THE RESULT OF THE FIB()   *
    70 REM * SUBROUTINE.                                    *
    80 REM **************************************************
    90 INPUT N
    100 LET I=1
    110 IF I>N THEN GOTO 170
    120 LET X=I-1
    130 GOSUB 240
    140 PRINT Y
    150 LET I=I+1
    160 GOTO 110
    170 END
    180 REM **************************************************
    190 REM * Y=FIB(X) FOR X>=0                              *
    200 REM * A,B,C AND J ARE USED AS SCRATCH VARIABLES.     *
    210 REM * ENTRY POINT IS LINE 240                        *
    220 REM * INPUT PARAMETER IS IN X, OUTPUT RESULT IS IN Y *
    230 REM **************************************************
    240 IF X>1 THEN GOTO 270
    250 LET Y=X
    260 GOTO 380
    270 LET A=0
    280 LET B=1
    290 LET C=A+B
    300 LET J=2
    310 IF J>X-1 THEN GOTO 370
    320 LET A=B
    330 LET B=C
    340 LET C=A+B
    350 LET J=J+1
    360 GOTO 310
    370 LET Y=C
    380 RETURN

Note well that this program (and all the examples shown here) also runs in the brandy dialect of BASIC. However, the programs below do not run on BASICK unless stated otherwise.

Studying the old BASIC code does explain why the old BASIC dialects get a bad reputation. When written as above, with the necessary REM statements and only one statement per line, the code is easy to read, understand, and use. The 'spaghetti BASIC' reputation was mostly a result of limited memory. On a machine with 16KB, you cannot spend any bytes on REM statements like the ones shown here. The comments that make the subroutine easy to use, and make it easy to avoid problems with global variable conflicts, could not be used when coding larger programs.

Here is the same program written in a more terse style, using a similar dialect, but still limited to only one statement per line:

    10 INPUT N
    20 I=1
    30 IF I>N THEN END
    40 X=I-1
    50 GOSUB 90
    60 PRINT Y
    70 I=I+1
    80 GOTO 30
    90 IF X>1 THEN GOTO 120
    100 Y=X
    110 RETURN
    120 A=0
    130 B=1
    140 C=A+B
    150 J=2
    160 IF J>=X THEN GOTO 220
    170 A=B
    180 B=C
    190 C=A+B
    200 J=J+1
    210 GOTO 160
    220 Y=C
    230 RETURN

This version does not waste any bytes on REM statements, directly returns on line 110 saving a GOTO, and directly ends on line 30 saving another GOTO.
This version also does not require LET for assignment statements. More speed, less memory, but at the cost of readability.

Let's take the next step back in time and look at how the program would actually be entered on some machine with say 16KB total RAM, using ':' to allow multiple statements per line:

    10 INPUT N:I=1
    20 IF I>N THEN END
    30 X=I-1:GOSUB 40:PRINT Y:I=I+1:GOTO 20
    40 IF X>1 THEN GOTO 70
    50 Y=X
    60 RETURN
    70 A=0:B=1:C=A+B:J=2
    80 IF J>=X THEN GOTO 100
    90 A=B:B=C:C=A+B:J=J+1:GOTO 80
    100 Y=C:RETURN
    110 RETURN

Programs like this would be published in magazines (remember, no WWW yet) and on BBS download sites, and would be typed in by hand. However, this program would be considered 'bad' because it is too long. Here is a 'better' version:

    10 INPUT N:I=1
    20 IF I>N END
    30 X=I-1:GOSUB 40:PRINT Y:I=I+1:GOTO 20
    40 IF X<2 Y=X:RETURN
    50 A=0:B=1:C=1:J=2
    60 IF J>=X Y=C:RETURN
    70 A=B:B=C:C=A+B:J=J+1:GOTO 60

First, notice the 'THEN' keywords are not used, since they were considered unnecessary 'syntactic sugar'. At this point it still works (try it in brandy) but it is quite difficult to determine what it does. Each line of text uses memory, and by reducing the number of lines and filling them up as much as possible (reducing the number of line numbers that must be tracked by the interpreter) memory use was minimized. Also on line 50, since A and B are constants, I go ahead and do the A+B to set C=1 to save one add. On those early machines (even the first IBM PC ran at only 4.77 MHz) every add mattered for speed. We have to have lines 20, 40, and 60 since they are targets of GOTO and/or GOSUB statements.

This would have been considered 'good' BASIC, because it reduced memory usage and ran fast. This is why BASIC has a bad reputation. On today's computers, which have enough memory, if you write the code as shown in the first version, BASIC is easy to read and understand.
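
Since the compressed seven-line listing is hard to follow, a rough Python transliteration may help when checking what it computes. This is only a sketch: the function name `fib` and the list comprehension standing in for the INPUT/PRINT loop are assumptions made for illustration, not part of BASICK or brandy.

```python
def fib(x):
    """Mirror of the subroutine at line 40 of the compressed listing."""
    if x < 2:                      # 40 IF X<2 Y=X:RETURN
        return x
    a, b, c, j = 0, 1, 1, 2        # 50 A=0:B=1:C=1:J=2
    while j < x:                   # 60 IF J>=X Y=C:RETURN
        a = b                      # 70 A=B:B=C:C=A+B:J=J+1:GOTO 60
        b = c
        c = a + b
        j += 1
    return c


# Lines 10-30: for I = 1..N, compute FIB(I-1). N=7 stands in for INPUT N.
first_seven = [fib(i - 1) for i in range(1, 8)]
print(first_seven)  # [0, 1, 1, 2, 3, 5, 8]
```

Written this way the loop and the early return are obvious, which is exactly the structure the packed BASIC lines hide.
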
With the RAM and CPU speed of modern systems, BASIC runs just fine even with the REM statements and the rule of only 1 statement per line. Amazingly, the 'best' version for brandy can be smaller:

    1INPUTN:I=1
    2IFI>N END
    3X=I-1:GOSUB4:PRINTY:I=I+1:GOTO2
    4IFX<2 Y=X:RETURN
    5A=0:B=1:C=1:J=2
    6IFJ>=X Y=C:RETURN
    7A=B:B=C:C=A+B:J=J+1:GOTO6

The silly spaces on lines 2, 4, and 6 are required so you know the end of the conditional. If there is no space on line 2, then the machine thinks you said IF I>NEND and you forgot to have any statement after the condition. We have less than 10 lines, so we can use single-digit line numbers and save another 10 bytes. The initial 'nice' version source requires 1165 bytes on linux, and the 'best' version needs 137 bytes on linux. I think you can already begin to see the size difference on this short program. And yes, Mr. Ham wrote code like this, just like all the good programmers did on those machines. Sadly brandy lacks the shortcut most machines had for print (use '?' instead of 'PRINT') so that cost me 4 bytes. At least now you know why nobody used long (2 byte) variable names until they had used all 26 1 byte names, and the smaller you could make the source code, the better you were. Trust me, 16KB is not much.

BASICK can run this bloated 220 byte version of the code:

    1 INPUT N:LET I=1
    2 IF I>N THEN END
    3 LET X=I-1:GOSUB 4:PRINT Y:LET I=I+1:GOTO 2
    4 IF X<2 THEN LET Y=X:RETURN
    5 LET A=0:LET B=1:LET C=1:LET J=2
    6 IF J>X-1 THEN LET Y=C:RETURN
    7 LET A=B:LET B=C:LET C=A+B:LET J=J+1:GOTO 6

If we had string variables (they end with a '$' symbol to mark them as strings) then we could write code like this:

    10 REM *****> FORTUNE TELLER <*****
    20 A0=0
    30 N0=0
    40 PRINT "1. Enter name"
    50 PRINT "2. Enter age"
    60 PRINT "3. Get fortune"
    70 PRINT "4. Exit program"
    80 PRINT "What do you want to do?"
    90 INPUT I
    100 IF I>0 THEN IF I<5 THEN GOTO 120
    110 GOTO 90
    120 IF I=4 THEN END
    130 ON I GOSUB 150,190,230
    140 GOTO 40
    150 PRINT "What is your name";
    160 INPUT N$
    170 N0=1
    180 RETURN
    190 PRINT "What is your age";
    200 INPUT A
    210 A0=1
    220 RETURN
    230 IF A0=1 THEN GOTO 260
    240 PRINT "You did not enter your age yet"
    250 RETURN
    260 IF N0=1 THEN 290
    270 PRINT "You did not enter your name yet"
    280 RETURN
    290 IF A>30 THEN GOTO 350
    300 PRINT N$;" has an opportunity to improve"
    310 RETURN
    320 IF A>30 THEN GOTO 350
    330 PRINT N$;" will suffer as the new person at work"
    340 RETURN
    350 IF A>40 THEN GOTO 380
    360 PRINT N$;" will gain weight and debt"
    370 RETURN
    380 IF A>60 THEN GOTO 410
    390 PRINT N$;" will be filled with regrets"
    400 RETURN
    410 IF N$="John" THEN GOTO 440
    420 PRINT N$;" will be fighting with his/her teenage children"
    430 RETURN
    440 PRINT "Poor John will be living alone as a homeless person"
    450 RETURN

This uses strings and string literals. It also uses ON..GOSUB, which is how BASIC supports a switch statement. Most BASIC dialects have both ON..GOTO and ON..GOSUB, but obviously ON..GOSUB is just a convenience. The A0 and N0 variables are flags, with the convention that 0 is false and 1 is true. BASIC has no boolean type, but this convention works fine.

If you have true computed GOTO, you can do this:

    10 REM *****> FORTUNE TELLER <*****
    20 A0=0
    30 N0=0
    31 DIM A(3)
    32 A(1)=150
    33 A(2)=190
    34 A(3)=230
    40 PRINT "1. Enter name"
    50 PRINT "2. Enter age"
    60 PRINT "3. Get fortune"
    70 PRINT "4. Exit program"
    80 PRINT "What do you want to do?"
    90 INPUT I
    100 IF I>0 THEN IF I<5 THEN GOTO 120
    110 GOTO 90
    120 IF I=4 THEN END
    125 I=A(I)
    130 GOSUB (I)
    140 GOTO 40
    150 PRINT "What is your name";
    160 INPUT N$
    170 N0=1
    180 RETURN
    190 PRINT "What is your age";
    200 INPUT A
    210 A0=1
    220 RETURN
    230 IF A0=1 THEN GOTO 260
    240 PRINT "You did not enter your age yet"
    250 RETURN
    260 IF N0=1 THEN 290
    270 PRINT "You did not enter your name yet"
    280 RETURN
    290 IF A>30 THEN GOTO 350
    300 PRINT N$;" has an opportunity to improve"
    310 RETURN
    320 IF A>30 THEN GOTO 350
    330 PRINT N$;" will suffer as the new person at work"
    340 RETURN
    350 IF A>40 THEN GOTO 380
    360 PRINT N$;" will gain weight and debt"
    370 RETURN
    380 IF A>60 THEN GOTO 410
    390 PRINT N$;" will be filled with regrets"
    400 RETURN
    410 IF N$="John" THEN GOTO 440
    420 PRINT N$;" will be fighting with his/her teenage children"
    430 RETURN
    440 PRINT "Poor John will be living alone as a homeless person"
    450 RETURN

This actually runs in brandy BASIC. However, it isn't really pretty since you need line 125. In BASICK we can replace lines 125-130 with

    130 GOSUB A(I)

since GOSUB and GOTO take an arbitrary arithmetic expression as the jump target. Brandy has two distinct GOTO/GOSUB statement syntaxes:

    GOTO integer
    GOSUB integer
    GOTO (var)
    GOSUB (var)

so we can get the job done, but the syntax isn't quite as clean. The first style (with an integer target) is supported by all old-school BASIC interpreters. Computed GOTO isn't supported in many of them, like bywater BASIC, and they require the more rigid ON..GOTO or ON..GOSUB statements. As you can see, we can use whatever is provided, but the most flexible choice is to allow true computed addresses, and that is what BASICK does.
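
To make the idea of a computed jump target concrete, here is a minimal Python sketch of a toy interpreter loop in which the GOSUB operand is an arbitrary expression rather than a literal line number. It is not taken from basick.py: the `run` function, the tuple encoding of statements, and the `env` dictionary are all assumptions made for illustration.

```python
def run(program, env):
    """Run a toy program given as {line_number: (op, arg)}.

    Returns the list of values printed. Only GOSUB, RETURN, PRINT and END
    are modelled; this is a hypothetical sketch, not BASICK's own code.
    """
    order = sorted(program)                      # execution order of line numbers
    index = {n: i for i, n in enumerate(order)}  # line number -> position
    out, stack, pc = [], [], 0
    while pc < len(order):
        op, arg = program[order[pc]]
        pc += 1
        if op == "GOSUB":
            target = arg(env)        # the target is any integer expression
            stack.append(pc)         # remember the line after the GOSUB
            pc = index[target]       # KeyError if the line does not exist
        elif op == "RETURN":
            pc = stack.pop()
        elif op == "PRINT":
            out.append(arg)
        elif op == "END":
            break
    return out


program = {
    130: ("GOSUB", lambda env: env["A"][env["I"]]),  # 130 GOSUB A(I)
    140: ("END", None),
    150: ("PRINT", "name"),
    180: ("RETURN", None),
    190: ("PRINT", "age"),
    220: ("RETURN", None),
    230: ("PRINT", "fortune"),
    250: ("RETURN", None),
}
table = [0, 150, 190, 230]   # A(1)=150, A(2)=190, A(3)=230

print(run(program, {"A": table, "I": 2}))  # ['age']
```

Because the operand is evaluated like any other expression, the same loop handles a literal target, a variable, or an array lookup, so no separate ON..GOSUB statement is needed in this model.
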