Scott Franco - 2018-03-16

Quick overview of P6 modularity

(for the actual mechanics of P6 modularity, see the commits)

P6 uses a very simple plan for modularity inherited from IP Pascal.
I considered it to be implementation specific, and so didn't document
it in the Pascaline specification (and it will remain so).

The idea of P6 modularity is of a stack of modules:

-------------
|  Program  |
-------------
|   Mod 2   |
-------------
|   Mod 1   |
-------------
|  Startup  |
-------------

This picture is conceptually accurate, but looks better in its
command line format, which lists the stack from bottom to top:

p6 Mod1 mod2 Program

The module "startup", an analog of the (famous or infamous) C "crt0"
module, isn't specified on the p6 command line, but it does exist.
It is simply generated by pint itself. Long story short, it is only
about 4 instructions, and I didn't think it was worth the bother of
having to specify it each time, even for old style Pascal programs
without separate modules.

Each module is built on the basic structure:

Module N;
Call Startup
Call Next
Call Shutdown
Return

Startup:
Return

Shutdown:
Return

Next:

That is, each module in the "module stack" runs its own startup
code block, then calls the module above it, then calls its shutdown
block, then exits back to its caller, the module below it.

This corresponds to a skeleton module as follows:

module n;

begin

{ startup }

end;

begin

{ shutdown }

end.

Pascaline describes the shutdown block as optional, but p6 actually
generates it anyway. It's just a dummy block that does nothing. This
is because p6 cannot really look forward much, so it does not know
at the start of the module if the shutdown block is going to appear.
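
As a rough illustration, the call chain can be modeled in a few lines
of Python. This is only a model of the behavior described above, not
p6 or pint code; the function name and the (name, has_shutdown) pairs
are made up for the example. The last entry stands for the program,
which here simply runs and returns.

```python
def run_stack(modules):
    """Trace a p6-style module stack, listed bottom to top.

    Each entry is (name, has_shutdown). The last entry is the program,
    which just runs and returns.
    """
    trace = []

    def run(index):
        name, has_shutdown = modules[index]
        if index == len(modules) - 1:
            trace.append(f"run {name}")
            return
        trace.append(f"startup {name}")       # Call Startup
        run(index + 1)                         # Call Next
        # The shutdown call is always made; a module without a
        # shutdown block gets a dummy that does nothing.
        if has_shutdown:
            trace.append(f"shutdown {name}")   # Call Shutdown
        # Return to the module below.

    run(0)
    return trace


trace = run_stack([("mod1", True), ("mod2", False), ("program", False)])
print(trace)
```

The trace shows startups running bottom to top and shutdowns running
top to bottom, with mod2 contributing nothing on the way back down.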

The other module types fit into this same system. Programs are
considered to be a special type of module in Pascaline, one that has
only a startup block and no shutdown block. In the original IP Pascal
implementation, program modules even call the next block in the stack,
and IP uses a "cap" module to terminate the series. It usually just
contains a single return. In p6 the program module is its own cap
module, which makes sense. The program is the last module to run, and
when it exits the program run is over. In IP Pascal, the fact that
program modules obey the chain system means that multiple program
modules will run in series, one after the other, even though they
use common modules. However, I will admit that in the 25 years
IP has implemented this system, I have never used this capability.
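
The IP Pascal variant, where programs also call the next module and a
cap module ends the chain, might be modeled as below. Again this is a
simplified sketch of the behavior described, not IP Pascal code; the
names run_ip_stack, prog_a and prog_b are hypothetical.

```python
def run_ip_stack(modules):
    """Trace an IP Pascal style stack, listed bottom to top.

    Each entry is (name, kind) with kind "module" or "program".
    A cap module (just a return) is implied above the last entry.
    """
    trace = []

    def run(index):
        if index == len(modules):
            return                      # the cap module: a single return
        name, kind = modules[index]
        if kind == "program":
            trace.append(f"run {name}")
        else:
            trace.append(f"startup {name}")
        run(index + 1)                  # programs call the next module too
        if kind == "module":
            trace.append(f"shutdown {name}")

    run(0)
    return trace


trace = run_ip_stack([("mod1", "module"),
                      ("prog_a", "program"),
                      ("prog_b", "program")])
print(trace)
```

Both programs run in series over the same mod1, which is started once
and shut down once.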

From there, the other possible modules fall out naturally. Share
modules don't use startup or shutdown blocks, because they have no
globals to initialize or shut down. Process modules only implement
a startup block, and use it to start a new thread that runs the
process, which then runs a startup block that is private to it.
A process module has no shutdown, and is like a program module in
all other respects. A channel module theoretically uses no startup,
since it has no globals, but in reality it does, because it needs
to manage locking code.

Unlike C, where pragmas assign priorities (or some other method is
used) to set the run order of startup/shutdown code, here the run
order is specified simply by the module order in the linker stack.
This startup/shutdown order is perfect for the way modules are used
in Pascaline. The modules in the stacking order are arranged by
dependencies. In the original example:

p6 Mod1 mod2 Program

mod1 has no dependencies, but mod2 can have dependencies on mod1.
The program module can have dependencies on both mod1 and mod2.
Thus (for example), when mod2 or the program calls routines in
mod1, mod1 has had the opportunity to set up any and all of its
data before that happens, and likewise any calls from program
to mod1 or mod2. Similarly, if mod2 or program overrides routines
in mod1, it can't do that unless mod1 has initialized itself.
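
The rule that every module must come after the modules it depends on
can be checked mechanically. The following is a small sketch under the
assumption that the dependencies are available as a simple table; p6
itself does not necessarily perform such a check, and the function
name is made up.

```python
def order_violations(order, depends_on):
    """Return (module, dependency) pairs where the dependency does not
    appear earlier in the stacking order than the module using it."""
    started = set()
    problems = []
    for name in order:
        for dep in depends_on.get(name, ()):
            if dep not in started:
                problems.append((name, dep))
        started.add(name)
    return problems


deps = {"mod2": ["mod1"], "program": ["mod1", "mod2"]}

good = order_violations(["mod1", "mod2", "program"], deps)
bad = order_violations(["mod2", "mod1", "program"], deps)
print(good)
print(bad)
```

With the order from the example the list of problems is empty. With
mod1 and mod2 swapped, mod2 would start before mod1 has initialized.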

In IP Pascal, which is the only existing compiler for Pascaline
that generates machine code, that system in fact goes far past
simple Pascaline modules. The support libraries are implemented
as a series of override modules, and even the assembly language
modules are formatted as modules. This is why, for example, the
system module that implements things like write/read
can plug into a terminal mode or even graphical mode module
simply by changing the stacking order.

In IP Pascal, it was often necessary to have loops in the module
order. For example, in the stack:

p6 Mod1 mod2 Program

If mod1 has an error, and needs the program to exit, then it must
contain a loop to route the goto back to program, usually by
a routine like "abort", that contains a goto to the main
block. Mod1 can implement its own error handler, and it can
use halt to stop the program run, but in order to give each
module above mod1 the chance to clean up, it must use gotos
to ripple up to the top.

Of course, as you have probably guessed by now, this was from
the days when Pascaline didn't implement structured exception
handling. That feature completely negates the need for loops
in module structure, and indeed the IP pc/linker program has
the ability to flag module dependency loops as an error.
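
Flagging a dependency loop amounts to finding a cycle in the module
dependency graph. The sketch below uses an ordinary depth-first
search; it is not the algorithm the IP linker uses (that is not
described here), and the table format and function name are
assumptions for the example.

```python
def find_dependency_loop(depends_on):
    """Return one loop as a list of module names, or None if the
    dependency table contains no loop."""
    IN_PROGRESS, DONE = 1, 2
    state = {}
    path = []

    def visit(name):
        state[name] = IN_PROGRESS
        path.append(name)
        for dep in depends_on.get(name, ()):
            if state.get(dep) == IN_PROGRESS:
                # dep is already on the current path: a loop.
                return path[path.index(dep):] + [dep]
            if dep not in state:
                loop = visit(dep)
                if loop:
                    return loop
        path.pop()
        state[name] = DONE
        return None

    for name in list(depends_on):
        if name not in state:
            loop = visit(name)
            if loop:
                return loop
    return None


clean = find_dependency_loop(
    {"program": ["mod1", "mod2"], "mod2": ["mod1"], "mod1": []})
looped = find_dependency_loop(
    {"program": ["mod1"], "mod1": ["program"]})
print(clean)
print(looped)
```

The second table models the old "abort" arrangement, where mod1 needs
to reach back up to the program.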

So how do modules, and indeed, separate compilation, get
implemented in a system like p6, which is only designed to
handle one input file at a time? The secret is to
concatenate the intermediate files into a larger intermediate
that is loaded into pint as one step. This is actually similar
to the way a linker loader works, since its final job is to
group all the binaries into a single deck.

For this reason, only the program module generates the "q"
or "end of intermediate" instruction. Other modules are not
designed to be loaded and run in pint alone. The program
generated intermediate is, and it can be used either as
a standalone intermediate file or combined with the other
module intermediates, in which case it serves as the terminator.
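
The concatenation step might look like the following sketch. It is
not the actual p6 tooling. It assumes, purely for illustration, that
an intermediate is a list of text lines and that the end instruction
is a line consisting of "q"; the other lines are placeholders, not
real intermediate code.

```python
def is_end(line):
    # Assumed representation of the "end of intermediate" instruction.
    return line.strip() == "q"


def link_intermediates(parts):
    """Concatenate intermediates given in stack order, bottom to top.

    parts is a list of (name, lines). Only the last part, the
    program, may carry the end instruction, and it must do so.
    """
    if not parts:
        raise ValueError("no intermediates given")
    combined = []
    for index, (name, lines) in enumerate(parts):
        is_last = index == len(parts) - 1
        has_end = any(is_end(line) for line in lines)
        if has_end and not is_last:
            raise ValueError(f"{name}: end instruction before the program")
        if is_last and not has_end:
            raise ValueError(f"{name}: program has no end instruction")
        combined.extend(lines)
    return combined


deck = link_intermediates([
    ("mod1", ["<mod1 code>"]),
    ("mod2", ["<mod2 code>"]),
    ("program", ["<program code>", "q"]),
])
print(deck)
```

A program intermediate on its own passes the same check, which matches
its use as a standalone file.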