Scott Franco - 2014-07-24

According to my information, the Pascal-P compiler enjoyed its 40th anniversary last year (2013). This also means revised Pascal itself is 40 years old. For comparison (from Wikipedia):

Fortran - 1957
Algol - 1958
Pascal - 1970
C - 1972
C++ - 1983
Java - 1995
C# - 1999
Ada - 1980

I included Wikipedia's guesstimate for Pascal as it provides a good baseline. That is, the other language start dates are probably equally good or bad.

So how is it working with a compiler that is 40 years old? To me there is not that much difference. I use a Pascal with extensions (IP Pascal), but before that I used several ISO 7185 Pascals, including the original version of IP Pascal (Pascal for Z80), then ProPascal, then SVS Pascal, then back to the 80386 version of IP Pascal. To enable life with different Pascal versions that all obeyed the ISO 7185 standard, I had a library I would carry around called "basicio.pas". It contained a series of routines I considered essential, such as open file by name, check file existence, delete file by name and routines to open and process binary files (not as straightforward as you would think!).

Basicio.pas started when I left Z80 Pascal for ProPascal on the 8086. It was loosely based on the built-in system procedures of Z80 IP Pascal, which made moving existing code easier. When I moved to a new compiler, usually less than a day of recoding a new basicio.pas would get all of my code online. I was pretty happy with this method, and by the time I reached SVS Pascal, high powered ISO 7185 Pascal compilers with excellent optimization were available. Note that this was in the late 1980s, when Borland compilers were still 16 bit, with little optimization. This ended when SVS went out of business along with virtually all of the other compiler makers. SVS was stuck in DOS, albeit DOS with an extender. I suppose it is a great tribute to the work of SVS that their compiler still runs in a DOS box. The DPMI standard still runs, or did the last time I checked.

With the 80386 version of IP Pascal, there are lots of built-in system functions, libraries available, and dynamic array types. However, my Pascal coding methods have not changed much. I don't use variable length string or array handling in most code because I don't really need it. I use dynamic string routines in libraries because that is far easier to do than declaring a lot of different length string handling routines. I.e., I want to declare a few different size strings using fixed lengths, but I can use general purpose routines to process them.

The P5 compiler could have been written yesterday as far as I am concerned. Outside of object orientation, there is not much new to be done there. The compiler suffers from lack of dynamic strings, mainly because the ISO 7185 specification says it has to handle any size identifier. That requirement didn't apply to the original compiler.

On the other hand, even on a fixed, 8 character identifier limited implementation, constant strings must be processed, and Pascal-P all the way through P4 shows that limitation. If you feed a source program into it with lots of strings, it is going to waste space. That is one reason the original compiler featured only numbers for errors, not strings, and where error strings did exist, they were short. The fix for that, fully variable length strings, is both compatible with ISO 7185 Pascal and reasonably clear in the source code. If you want an example of implementing variable length, any size character strings (up to the memory size of the machine), all implemented in standard ISO 7185 Pascal, there you go.
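To make the space argument concrete, here is a minimal sketch of one way to get any-length strings out of nothing but fixed-length arrays and pointers: a linked list of small fixed-size chunks. It is written in Python for brevity rather than Pascal, it is not taken from the P5 source, and the names (`Chunk`, `make_string`, `CHUNK`) and the chunk size of 10 are assumptions made up for illustration.

```python
CHUNK = 10  # characters per chunk; an arbitrary size chosen for this sketch


class Chunk:
    """One fixed-length, space-padded piece of a string plus a link to the next."""

    def __init__(self):
        self.chars = [" "] * CHUNK  # stands in for a packed array [1..CHUNK] of char
        self.next = None            # stands in for a pointer to the next chunk


def make_string(text):
    """Build a chunk list holding text; returns the head chunk (None if empty)."""
    head = None
    tail = None
    for i, ch in enumerate(text):
        if i % CHUNK == 0:          # current chunk is full (or none exists yet)
            node = Chunk()
            if tail is None:
                head = node
            else:
                tail.next = node
            tail = node
        tail.chars[i % CHUNK] = ch
    return head


def length(head):
    """Length of the string, ignoring the trailing space padding."""
    last_nonspace = 0
    pos = 0
    node = head
    while node is not None:
        for ch in node.chars:
            pos += 1
            if ch != " ":
                last_nonspace = pos
        node = node.next
    return last_nonspace


def to_text(head):
    """Flatten the chunk list back into an ordinary string."""
    parts = []
    node = head
    while node is not None:
        parts.append("".join(node.chars))
        node = node.next
    return "".join(parts)[: length(head)]


def count_chunks(head):
    """Number of chunks allocated for the string."""
    n = 0
    node = head
    while node is not None:
        n += 1
        node = node.next
    return n
```

A short identifier costs one chunk, a long one costs as many as it needs, and nothing is ever truncated. The price is that comparison and copying have to walk the list, and, in this simple space-padded form, trailing spaces in the text itself are not preserved.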

Perhaps another limitation of age was Pascal-P's reliance on having instructions packed into a CDC 6000 compatible word size. It's an interesting limitation of the P1-P4 series compilers in that it does not stop those compilers from working on a current ISO 7185 implementation, but it does make them fairly inefficient. Fixing that limitation was not difficult, but it did change the source code in pint.pas quite a bit. I could have just rounded up the 60 bit word size of the CDC 6000 machines to 64 bits and relied on 8 byte basic instruction sizes, but I knew it would be far more efficient to embrace the byte orientation of CPUs that began with the PDP series computers from DEC, and became the rule from then on. It works both ways. The "modern" byte implementation of P5 would be horribly inefficient if ported backwards to a CDC 6000 class machine, since it would be trying to pick apart words.
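The size difference between the two layouts is easy to see in a toy example. The sketch below, in Python rather than Pascal, encodes the same four-instruction program once with every instruction rounded up to an 8 byte word and once as a byte stream where each opcode is followed only by the operand bytes it needs. The opcode numbers, operand widths, and function names are hypothetical and do not match the real P-code tables in pint.pas.

```python
import struct

# Hypothetical opcodes for illustration only; not the real P-code numbering.
OP_LDCI = 1  # load constant integer: takes one 4-byte operand
OP_ADI = 2   # add the top two integers: no operand
OP_STOP = 3  # halt: no operand

OPERAND_BYTES = {OP_LDCI: 4, OP_ADI: 0, OP_STOP: 0}


def encode_words(program):
    """One instruction per 64-bit word: opcode in the top byte, operand in the low 32 bits."""
    out = bytearray()
    for op, arg in program:
        out += struct.pack("<Q", (op << 56) | (arg & 0xFFFFFFFF))
    return bytes(out)


def encode_bytes(program):
    """Byte-oriented form: a 1-byte opcode followed only by the operand bytes it needs."""
    out = bytearray()
    for op, arg in program:
        out.append(op)
        n = OPERAND_BYTES[op]
        if n:
            out += arg.to_bytes(n, "little", signed=True)
    return bytes(out)


def run_bytes(code):
    """Tiny stack interpreter for the byte-oriented form; returns the final stack."""
    stack = []
    pc = 0
    while True:
        op = code[pc]
        pc += 1
        if op == OP_LDCI:
            stack.append(int.from_bytes(code[pc:pc + 4], "little", signed=True))
            pc += 4
        elif op == OP_ADI:
            b = stack.pop()
            a = stack.pop()
            stack.append(a + b)
        elif op == OP_STOP:
            return stack
        else:
            raise ValueError("unknown opcode %d" % op)


program = [(OP_LDCI, 2), (OP_LDCI, 40), (OP_ADI, 0), (OP_STOP, 0)]
word_form = encode_words(program)   # 4 instructions * 8 bytes each
byte_form = encode_bytes(program)   # 5 + 5 + 1 + 1 bytes
```

Here the word form takes 32 bytes and the byte form 12, and the gap widens as the share of operand-free instructions grows. On a word-addressed machine the trade reverses, because every fetch in `run_bytes` would mean shifting and masking bytes out of a larger word.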

For the rest, you can divide the improvements of P5 over P4 into two groups. The first is changes needed to bring the compiler from being a subset of Pascal to a full implementation. The second is changes to bring it to full ISO 7185 compatibility. I'm sure that the former constituted the majority of the changes. It's hard to say now because the code was never broken out that way.

Could P5 have been done in 1980? Well, actually it was, and it was done twice. The first time, arguably, was the UCSD implementation, which changed P2 into a byte oriented VM, and then followed up on that by making the VM the center of implementation on microprocessors, including running a full operating system and tools within the VM. It was a stunning achievement, and they did it long before I ever got to use any kind of computer (it was first ported on a PDP-11 knockoff). It was rapidly doomed by its implementation as an interpreter and perhaps also by the idea that local hardware and concerns outside the VM were uninteresting. Both of those problems could have been solved, and were solved. But too late to help UCSD.

The second "P5" was the Model Implementation of Pascal, which was issued in conjunction with the ISO 7185 standard. Few people have heard of it, and no further compilers were based on it. It vanished quickly because the standards bodies apparently thought they were going to make a lot of money on it, and locked it into a vault to die.

So the answer is yes, P5 could have been written and run along with the first versions of the ISO 7185 standard, or perhaps before.

Does Pascal-P make the basis for a good compiler (down to assembly code) implementation?

It actually was used this way several times, the first time being to create the CDC 6000 compiler itself. This was documented, and you can see this for yourself at:

http://www.standardpascal.com/CDC6000pascal.html

In order to see this, you have to go back to the original versions of both the Pascal-P compiler and the CDC 6000 compiler to see the similarities. To make the CDC 6000 compiler, the back end was thrown in the trash, and the front end recoded to directly output machine code. This was the famous "one pass compiler" that Wirth espoused.

Several other compilers created for Pascal either modified the Pascal-P system in that way, or used the CDC 6000 compiler as the working basis of a new compiler. It's worth asking: why were so few implementations based on using the Pascal-P compiler as an intermediate based two pass compiler?

I suspect it came down to two factors:

  1. Pascal-P shares a lot of machine specific information between the front end and back end, such as word lengths and other information (for a complete list, see the machine parameters block at the top of P5's pcom.pas).

  2. Pascal-P discards its type and symbol information and concentrates on generating code for a VM.

The first factor makes it difficult to keep pcom.pas as a truly machine and implementation independent program. The second means that Pascal-P isn't applicable to advanced intermediate graphing techniques such as those used in highly optimizing compilers like GCC. However, neither of these applies to the CDC 6000 and similar compilers. There, the single pass compiler front end is littered with machine specifics, and machine optimization is still driven by serial examination of the code.

I would argue that Pascal-P is still a good basis for a compiler if you don't assume that it will be a high optimization compiler. Its best attribute is simplicity. For contrast, IP Pascal, and similar compilers such as GCC, are towers of code complexity. P5 makes a very good starting basis for a compiler class project. If that is not simple enough, P2 still makes a very compact and simple to understand compiler, albeit a very non-standard one.

To sum up, here you have Pascal-P5, a new version of a 40 year old compiler from another time. It's been relatively simple to update it to current use of Pascal, even if a bit late. That work was both possible and valuable 10 years into its life.

So what can you do with P5? I know of one instance where it was heavily adapted to other uses. There was brief talk of using it to replace the defunct GPC project, but that was dismissed on the grounds that its optimization possibilities are limited. On the other hand, a live, medium level optimization compiler beats a dead one any day.

For Borland aficionados, you can add Borland features to P5 and get a crossover compiler if that is what you want. I suspect the reason this is not attractive is that it would be redundant with the FPC project.

Perhaps the most fun, and the most valuable thing to do with P5, is to add your own constructs and make your own custom language. And in the process learn a lot about compilers and languages in general.

Cheers,

Scott Franco
San Jose (Silicon Valley), CA

 
