pascal p5c Wiki

fast, portable Pascal compiler using gnu c as intermediate code

Brought to you by: trevorbl

Using the p5c pascal compiler

Compiler Options

Options are as in p5 (and p4 before that, and many other pascal compilers),
eg {$X+,Y-,Z+}: a comment with '$' as its first character, followed by a sequence of option letters, each followed by '+' or '-'. There is an example near the top of pcom.pas.
(*$X+*) comments can be used instead of {$X+} comments.

More than one option can be in the same comment if they are separated by a comma, eg

    {$l+,w+,t- }

There should be no spaces between the options.
Upper case letters may be used instead of lower case letters.
p5c issues a warning whenever an unrecognised option is encountered.

More details can be found in the p5 documentation.

Listings are turned on with the 'l' character followed by a '+' in a comment like this:
{$l+}

and turned off with a '-' as in this comment:
{$l-}

Debugging can be turned on and off with the d character, eg
{$d+} {debug on}
{$d-} {debug off}

Debugging performs lots of runtime checks on the code for various programming errors that otherwise can be hard to find.
eg

    a: array[1..9] of real;
       ....
       i := 10;
       if a[i] > 0.0 then  { error i too big for a }
          ...

With debugging enabled, unexpected behavior in your program is reported exactly where it happens. With debugging disabled, there is no checking so the code will be faster, but should an error occur, the behaviour of your code will be undefined. In plain English, that means it could blow up in your face.

The debug checks include arithmetic, file, memory, and range checks.
They are described in more detail below.

Variant and tag checking is controlled with the {$v+} & {$v-} options.
These checks can find problems when a variant record is likely to contain the wrong data, as in this example:

     var   r : record
                   case t: boolean of
                     true:  ( size : integer );
                     false: ( value: real );
                 end;
       ....

       r.t := true;  { r contains size }

       ....

       if r.value > 100 then    { !!! error - r holds size, not value }

A warning option has been added, so to turn warnings off use a comment like this:
{$w-}
and to turn them back on again, use this:
{$w+}

There is an option that embeds pascal line numbers in the generated c code:
{$n+}
You probably don't need to use this option often, but it might be useful if you need to browse the generated c code and relate it to the corresponding pascal code.

Also, if you use the gnu c tools (as the pascal analyser, pan, does), the {$n+} option forces the generated c code to refer to the pascal line numbers.

Line numbers are normally turned off, so if you want them you must explicitly turn them on.

The p5c pascal tools pan & rv turn line numbers on automatically, so you don't need to do this yourself.

There is also the {$z+} option which favours increasing code size over memory size in evaluating expressions for large sparse sets. The default is {$z-} (off).
It is an advanced option and is described in the section on set implementation later in this document.

The other options from p5 do nothing or have been removed.

File Header Parameters

File variables listed in the program header are bound to external files.
They must also be declared as file variables at the program level (ie not in a procedure or function).

For example, if my program's header looked like this

    program myprog( input, output, file1, file2 );
    var file1, file2 : text;
    ... etc ...

and myprog was called with parameters something like this

myprog data res

then variable file1 in myprog refers to the file called data, and file2 refers to the file called res.

The files are bound in the same order as they appear in the header.
The files input and output don't count because they are already bound to standard input and standard output.

If the program is called with too few parameters, the extra files in the program header are assigned temporary files.

There is a test program called tfile.pas which illustrates header file parameters: it reads a message from the first file argument, and writes a response to the second.
Run it with a sequence of commands like this:

  echo 'message 1' > file1
  ./r tfile file1 file2

The program reads and displays the contents of file1, and writes a message to file2. You can show what it wrote with a command like this:

    cat file2

There is also a benchmark program, dhry.pas, that uses /proc/uptime as a file header parameter and then uses this to provide timing information.
Compile and run with:

            ./r dhry /proc/uptime

This assumes the /proc filesystem is available (eg linux, cygwin); other systems may need suitable modifications.

files are closed when they go out of scope

Files declared in a procedure or function are closed when that procedure or function returns. This is true whether it returns normally to its caller, or if it executes a goto statement to an outer procedure or function.

Files declared in dynamic memory (ie in a pointer via new) are closed when the memory is disposed.

Short Circuit boolean conditions

In p5c, all boolean expressions are short circuited, ie boolean expressions are always evaluated left to right and only as far as necessary to determine the result.

Consider, for example, this statement:

    var a: array[1..last] of real;
     ...
     while (i <= last) and (a[i] <> c) do
     ....

If i > last, the left subexpression is false, so the whole expression must be false, whatever the value of the right subexpression. So it is not necessary to evaluate the right subexpression.
Besides saving a few instructions of execution time, a[i] will be accessed only if (i <= last), so avoiding the danger of accessing the array when i is larger than the last index.

In other words, the rhs of the and operator is evaluated if and only if the lhs is true.

Conversely, to evaluate the rhs if and only if the lhs is false, use the or operator. eg with an expression like this:

     while (p = nil) or (p^.size=0) do
     ....

Note that there could be trouble in expressions where the rhs of a boolean expression is a function that has side effects. For example if doSomething(x) is a function, then in this statement

    if (weather = sunny) and doSomething(x) then ...

function doSomething will never be called when the weather is not sunny. If you need doSomething to be called always, it should not be part of a boolean expression.

The pascal standard does not say whether boolean expressions should be short circuit or fully evaluated.
(And where boolean expressions are fully evaluated, there is no guarantee whether the left or right hand sides are evaluated first.)

Pascal compilers that evaluate both sides of a boolean expression should produce a run time error for the above examples.

So code that relies on the behavior of boolean expressions is not portable.

Mod & Div

p5c correctly implements the mod operator according to the iso pascal standard.
The mod operator in pascal is not exactly equivalent to the remainder operator.
The difference is when negative numbers are used, eg
(-15) mod 4 = +1,
ie the result of the mod operator should be >= 0 and < 4.
On the other hand, for the rem operator, (-15) rem 4 is -3, ie -(15 rem 4)

Why does this make sense? Think about, say, 5 mod 4, which is +1. Now add 4 to 5, or subtract 4 from 5 and then take the same mod again, ie
9 mod 4 = 1, and 1 mod 4 = 1.
In fact we can add or subtract 4 as many times to 5 as we want; the result after taking the mod is always +1. This is still true if we subtract 4 enough times to get a negative number, like -15. ie
5 - (5x4) = -15, and (-15) mod 4 = +1.

By the rules of pascal, -22 mod 3 is interpreted as -(22 mod 3), ie with the mod done first. If you need to evaluate (-22) mod 3, use brackets.

sets (in particular, size of sets)

p5c allows sets to contain negative numbers, and have arbitrary size subject to any limits imposed by your computer's hardware and/or gcc.

A set will use only as many bytes as it needs, subject to some rounding at the ends,
eg the set s1 declared as follows will occupy exactly 2 bytes:

    s1: set of 0..15;

A set of integer is assumed to be too big and is truncated to a default size of [-255..+255].
The default size is only used when the true set size is unknown at compile time, for example in an expression like:

    if [a1,a2,a3] = [b1..b2] then ....

The p5c compiler will issue a warning whenever it cannot determine the size of a set expression.

A suitable workaround, should it be needed, is to use set intersection to make the set size known:

    if [a1,a2,a3]*[-10000..10000] = [b1..b2]*[-10000..10000] then ....

where in this case the sizes of the sets [a1,a2,a3] and [b1..b2] are known to lie within the bounds -10000..10000. The set limits need to be constants (so the compiler knows what they are). These constants don't need to be the same on each side of the comparison, and are necessary only when the compiler cannot determine the size of a set expression.

Notes:

- in this example, 2 large temporary sets will be constructed - this is likely to be a problem only in systems where memory is severely constrained.
- [first..last] is a constant limit if first and last are defined as constants.
- p5c can now see an expression like [first..first+100] as a constant (though it's really an expression because the limits are not simple constants on their own). This ability is limited to expressions containing integer constants and the integer arithmetic operators (+, -, *, div, mod) along with parentheses.
- see also the {$z+} option which can be used when the set size is unknown

There are some notes about set size in the implementation section of this document.

conformant arrays

Conformant arrays are fully implemented in p5c.
Conformant arrays allow arrays of any size (but known index type and component type) to be passed to a procedure or function.

here's an example:

    { return the dot product of 2 vectors a.b }
    function dotproduct( a,b: array[lo..hi:integer] of real): real;
    var
       i: integer;
       x: real;
    begin
       x := 0.0;
       for i := lo to hi do
          x := x + a[i]*b[i];
       dotproduct := x;
    end;

    procedure test1;
    var
      vector1, vector2: array[1..20] of real;
    begin
     ...
     if dotproduct(vector1,vector2) = 0 then
        writeln( 'vectors are orthogonal' );
     ...
    end;

    procedure test2;
    var
      myvector: array[0..9] of real;
      magnitude: real;
    begin
     ...
      magnitude := sqrt(dotproduct(myvector,myvector));
      writeln( 'magnitude of myvector is', magnitude:1:3 );
     ...
    end;

Notes:

- in the function dotproduct the parameters a and b are conformant arrays. (a and b are called formal parameters in the literature)
- the index range of the conformant array is lo..hi.

lo and hi are called the bound identifiers and apart from using their value, their use is quite limited:
- they are not constants, so can't be used to define new types, eg
  
  var i : lo..hi; { ILLEGAL }
- they are not variables, so cannot have values assigned to them, eg
```
 lo := 1; { ILLEGAL }
```
- the bounds must be identifiers (that is they must have names), so this is not allowed:
```
 a,b: array[1..hi:integer] of real ;   { ILLEGAL, low bound cannot be a constant}
```
- dotproduct is called by test1 and test2 with arrays of different sizes. (the arrays in the test procedures are called actual parameters in the literature)
- The arrays in the test procedures must have index types compatible with the index type of a and b in the dotproduct function, ie integer in this case.
- The component types of the arrays in the test procedures must match the component type of a and b, ie real.
- a and b are in a group, ie they share the same type definition, so the two arrays passed in any one call (as in test1 and test2) must have the same type. If we needed a and b to have different sizes, then we would need an extra conformant array parameter.

Multi dimensional arrays can be defined like this:

         procedure matrixMultiply( var a,b,c: array[lo1..hi1:integer;
                                              lo2..hi2:integer] of real );

this is exactly equivalent to

   procedure matrixMultiply( var a,b,c: array[lo1..hi1:integer] of
                                        array[lo2..hi2:integer] of real );

Conformant arrays can even go inside function and procedure parameters:

procedure p(function ff(aa:array[lb..ub:char] of real):boolean);

When calling p, make sure the function you provide exactly matches ff.

packing

The packed keyword is recognised as expected, but only records are stored packed. All other types are stored the same, whether declared packed or not.
The standard transfer procedures pack() and unpack() are implemented in p5c.

page procedure

This procedure outputs a form feed to the text file.
The existing line is terminated with a writeln if necessary.

differences from the pascal standard

There are just a few of these, mainly to remove restrictions imposed by the pascal standard. (Let's face it, sometimes it's easier to implement a feature than it is to block it because the standard says so, then test that the block is correct.) Some are inherited from p5 (and p4 before that).

write format field width can be negative

a negative field width causes the output to be left justified, ie any padding is added to the right.
This feature is provided for free by gcc (and any standard c compiler).
NB: the precision field width of a real number must be >= 0

write(pointer)

write the value of a pointer to a file.
The output format is system dependent - in fact it is determined by gcc.

for example:

      new(p);
      writeln( 'pointer is', p:12 );

this might result in something like:
pointer is 0x15b6010

string index

The pascal standard implies that the upper string index must be strictly greater than one. In p5c (and many other pascal compilers), it is possible to have a string of length one.

conformant array passing

To make compiler writing easier, the pascal standard allows some restrictions on passing conformant arrays to procedures with conformant array parameters.
In p5c, the hard work is passed on to gcc, so conformant arrays can always be passed to procedures with conformant array parameters with no restrictions.

mod -ve numbers

for b<0, (a mod b) = -(a mod (-b)), and b < -(a mod (-b)) <= 0
for example (-5) mod (-3) is -2
so for the mod operator, if r := a mod b, then r is always between zero and b (including zero, excluding b)

calculating exponentials of the form exp( ln(x) * y )

This and the similar exp( y * ln(x) ) are a way to evaluate x ** y (expressed x^y in some other languages).
p5c recognises these expressions as a special case and optimises them to the (almost) equivalent but far more accurate
c function pow(x,y).
Even if ln(x) has the smallest possible theoretical error, when used as an argument to the exp function, this error becomes amplified.
Using the pow() function avoids this, so is more accurate.

ord(pointer)

create an integer value from a pointer.
this feature is inherited from the original p4 & p5 compilers.
Its results are not defined or predictable, ie it's allowed, and sometimes it might work, but in other cases it might not. If you're thinking of using this so you can write out the value of a pointer, p5c allows you to write the value of pointers directly and correctly. Consider doing that instead.

external directive

p5c can access external functions and procedures with the external directive (just like nearly all other pascal compilers).
For example, suppose you have a C source file with the function myProc defined as follows:

 cfile.c:

     void myProc( int a )
     {
       ....
     }

Your pascal source could then call myProc() via a declaration like this:

procedure myProc( arg1 : integer ); external; { case of myProc matters }

Notes:
- myProc must be a C function, or appear like a C function.
This means that external functions can't use names that clash with C reserved keywords, eg

          procedure switch; external;   {won't work, switch is reserved by C }

A function called Switch will work, because it uses an upper case letter.
Additionally, p5c appends _n (where n is 1,2,3, etc) to your pascal variable names when it generates c code - this guarantees you can't use c reserved words in your pascal code. So having an external function called a_1 in code like this will also generate an error:

    var a : integer;            {translated to a_1 in gcc}
       procedure a_1; external;    {name clash}

Notes:

- The names of the arguments (arg1 in the above example) are arbitrary.
- the case of the function or procedure name in the external declaration must exactly match the case of the external C function. So myProc() in the above example will not link to a c function called MyProc() or myproc(), etc.
- Pascal code that calls myproc() does not need to match the case of the declaration (this is what you would expect). In other words, the case in the external declaration matters, but the case doesn't matter when calling the function (or procedure). So, you can write code like
```
              myproc(13);   { OK, not part of the declaration }
              MYPROC(23);   { this is OK too }
              etc
```

don't forget to link your C object file with your pascal object file, eg

              p5c myprog.pas myprog.c > myprog.lst
              cc -I. -lm myprog.c cfile.c -o myprog

embedding C code in generated code stream
==============================================

p5c has a feature that enables your own C code to be embedded into the emitted code stream that is generated by p5c. Simply add your own code inside special comments like this:

   {@@ your ... c code ... here @@}

Pascal identifiers need to be changed when they are emitted in c code - this makes c
obey pascal's scope rules, and prevents name clashes with c reserved words.
To access pascal identifiers from c code, prefix the pascal name with a double @@,
like this:

    var
        myInt : integer;
        ...
    begin
       myInt := 123;
    {@@
       printf("%d\n", @@myInt); // access pascal name from c code
    @@}

The sample program dirDemo.pas uses embedded c code to access system level functions to get a directory listing.

For another example, to make a function inline, do this:

  procedure myProc(arg1 : type1)  {@@ inline @@};

'inline' is a c keyword that asks gcc to inline calls to the function, and the {@@ inline @@} comment above simply adds the inline keyword to the emitted c code.
Consult your gcc documentation for function attributes that you might like to use in the same way.
There are examples of using this feature in the tp5c.pas test file and in the clib.inc.pas c library file.

Notes:

- you can use (*@@ ... @@*) comments
- there is no space allowed between the comment and @@ markers, ie '{ @@' will be treated as a normal comment.
- any end of comment (ie '}' or '*)' ) between the {@@ and @@} is assumed to be part of your c code and does not terminate the comment.
- the code between the {@@ and @@} markers is emitted as soon as it is parsed by the compiler. The c code emitted by the compiler that corresponds to your pascal code might be built up, stored internally and emitted later. Check the final c code is what you expect it to be. In the above example, p5c parses the procedure declaration before it sees the 'inline' keyword. On the other hand, p5c doesn't emit the c code for the procedure declaration until after the 'inline' keyword.
- if you use the c preprocessor to preprocess your pascal code, it will strip out c style comments before the code is compiled. The comments won't make it into the final c code. Also, preprocessor lines such as #if and #define could be processed when the pascal code is preprocessed. To ensure c preprocessor commands are seen by the c compiler, use something like this:
```
 {@@#define MYDEF  something  @@}
```

The c preprocessor
==================

Although the c preprocessor is not part of pascal, it is possible to use it with p5c. This allows features such as

- include files
- macro definitions
- conditional compilation

It is important to remember that the c preprocessor processes your file before it is compiled. It simply manipulates the text in your file without knowing anything about the program code.

An include file will typically contain code that you need to share across many separate programs - a collection of constants, perhaps, or some special functions that you need to use over and over.

Say you have a library that can print data in some special format and you need to use it in many programs.

Put your code in a separate file, say myLib.inc.pas, and include this in each of your programs with a line like this

   # include "myLib.inc.pas"

Note:

- the '#' char must be in the left margin. This is true for all c preprocessor directives.
- the file myLib.inc.pas must be in the current directory. It is possible to have it in another directory if you include the file name in <angle brackets> instead of "quotes", and if you also add a -I option to your build command line. Consult your gcc documentation for specific details.

Macro definitions are useful for small repeated sections of text, with or without parameters.

The file cdefs.inc.pas contains macro definitions for your system's real numbers, eg the number of significant digits a real number has is defined like this:

   #define REAL_DIGITS 15

As an example, here's a small program that uses this file to test your system's real numbers:

    program testReal(output);

    #include "cdefs.inc.pas"

    begin
           {field width is real digits + sign + '.' + exp width}
           writeln('the biggest real is ', REAL_MAX:REAL_DIGITS+7);
    end.

Notes:

- the c preprocessor definitions, eg REAL_DIGITS, are known to the c preprocessor only, they are not pascal variables. Unlike in pascal, the case matters, so Real_Digits cannot be used instead of REAL_DIGITS, and the '_' is allowed.
- the file cdefs.inc.pas is auto generated at the same time as p5c to match whatever values your system provides.

Here's an example of a macro with parameters:

      program testMac(output);
      #define MAX(a,b)  (((a) + (b) + abs((a) - (b)))/2)
      begin
          writeln(' max(3,5) is ', MAX(3,5) );
      end.

Similarly, we could have written a MIN macro by replacing the addition with a subtraction, as follows:

   #define MIN(a,b)  (((a) + (b) - abs((a) - (b)))/2)

So this can be a very useful feature, but watch out. Consider this macro to generate the cube of a number:

    #define CUBE(z)  z*z*z

Now let's suppose we want this

   n := 2;
   writeln( '(2n+1) cubed is ', CUBE(2*n+1) );

We expect the answer 5 cubed = 125, but get 13.
Why? Looking at the list file created by p5c, we see that the c preprocessor has generated the statement

    writeln( '(2n+1) cubed is ', 2*n+1*2*n+1*2*n+1 );

and since multiply has a higher precedence than +, it is evaluated like this:

   writeln( '(2n+1) cubed is ', (2*n) + (1*2*n) + (1*2*n) + 1 );

So we should always use parentheses with macros, so our revised CUBE macro becomes:

    #define CUBE(z)  ((z)*(z)*(z))

There is yet another pitfall to be wary of. Consider a function f that has side effects - say it prompts the user to enter a number. Let's now suppose we want to print the cube of this number. We might be tempted to write:

   function getNum: integer;
   var i:integer;
   begin
       write('please enter a number ');
       read(i);
       getNum := i;
   end;
       ...
       write( CUBE(getNum) );  {get a number from user, cube it}

What happens? The user is prompted 3 times for the number! With macros, it is not always so obvious when something is not behaving as we expect.

We can write instead:

    n := getNum;             { get a number }
    write( CUBE(n) );   { cube it }

Here's an example of conditional compilation. It includes or excludes some part of the program according to whether a particular feature has been enabled.

    #define HAS_FEATURE 1

    program p(...)
     {... code ...}


    #if HAS_FEATURE
     { code to implement feature }
    #else
     writeln( 'feature not implemented' );
    #endif

There's a small example in the test file tp5c.pas which optionally compiles code for interactive testing of standard input when TEST_STDIN is defined as a preprocessor symbol, eg by running it with the command:

        ./r -DTEST_STDIN tp5c

Normally, the code is omitted so that it can run unattended.

The c preprocessor provides other features:

cpp strips out // style comments and /* ... */ comments, so with the c preprocessor, you can use comments like this in pascal:

         writeln( 'hello world' );   // comment til end of line

or this

        /** provide answer to life, the universe and everything */
        function answer : integer;
        begin
           answer := 42;
        end;

Also, there is a line number macro called __LINE__ that you can use like this:

          if somethingBad then
             writeln( 'problem found at line ', __LINE__ );

Note that you can't use all the features of cpp, for example __DATE__ produces illegal pascal code (it expands to a c string in double quotes).

See the cpp documentation for gcc or just about any other c language description for more details.

If you use the r build script, you can force using the preprocessor by starting one of the first 10 lines of your pascal program with a '#' char.

Debug Checks
============

As mentioned earlier, with the debug option turned on the p5c compiler generates extra code to self check your program as it runs.
This will locate most range errors, memory errors, etc without you needing to figure it out later.

Except for memory debugging, the options can be turned on or off as required.
For example to check a particular fragment of code:

    {$d+}
    { generate an error if i is outside the array bounds }
    if a[i] > 20 then
    {$d-}

    { no checks here }

Memory debugging needs to check the whole program, not small fragments of code. So it is global, ie it is turned on for the whole program, or it is off for the whole program.
To turn it off, put a {$d-} comment before the program heading.
If there is a {$d+} or nothing before the program heading then memory debugging is enabled. eg in this example, memory debug is enabled, but all other debugging is disabled:

    {$d+}
    program myprog(input, output);
    {$d-}
      ...

When Memory debug is enabled, the p5c compiler gathers information about memory as it is allocated and then checks it as pointers are used.
Finally, at the end of the program, it reports any memory that has not been disposed (ie memory leaks).

These errors issue a warning:

not disposing memory
Memory allocated with new but then not disposed causes a memory leak.
This becomes important only when your program is likely to run out of memory.

Other errors are programming errors so are fatal:

accessing an invalid pointer
The pointer is not pointing to memory that has been allocated with new().
This can be caused by a pointer being used before it is initialised with new, or being corrupted, say as part of a variant record.
accessing the nil pointer
pointer could be nil because no value was ever assigned to it.
attempting to dispose memory twice
attempting to use disposed memory
for example

            dispose(p);
            dispose(p^.next); {pointer p has been disposed, so cannot be used again}

using a file that is not yet defined
for example

         program test(output);
         var f:text;
          begin
              reset(f);  { f is not defined since nothing has been written to it }

Use rewrite(f) to create a file before reset(f).

writing to a file opened for reading
reading from a file opened for writing
file errors when reading char, integer or real
this can be caused when the program is trying to read data that is in the wrong format, or trying to read past the end of the file, so there is not enough data. A hardware error could also cause this problem.
For an example of data in the wrong format consider reading an integer from a text file containing "abc". This causes a fatal error since an integer cannot be read.
accessing an array outside its bounds
for example

       a : array[1..9] of integer;
       ...
       z := a[10];  { error, there is no a[10] }

result of sqr(n), trunc, round > maxint
for example

    x := 20.0 * maxint; { OK, if x is real }
    i := trunc(x);      { error, since trunc(x) > maxint }

result of chr() larger than last character
bounds overrun or underrun errors for pack()
the pack function copies components of an unpacked array to a packed array.
It copies enough components to exactly fill the packed array, starting from a given index in the unpacked array. If this index is too large, there will be too few components in the unpacked array to fill the packed array.
for example

      a  : array[1..9] of integer;
      pa : packed array [1..4] of integer;
      ...
      pack( a, 7, pa); { error, copies a[7], a[8], a[9] & a[10], ...
                         ... but there is no a[10] }

bounds overrun or underrun errors for unpack()
see comments for pack()
range errors in assignment
for example

       var percent : 1..100;
       ....
         percent := 0;  { 0 is out of range of the values in variable percent }

range errors in parameter substitution
see comments for range error in assignment
out of bounds errors for read integer or char from a file
see comments for range error in assignment

case statement with no case constant matching the case expression
for example, a statement like this will cause a fatal error:

    {$d+}   {-- turn debugging on if it isn't already on}
    case 13 of       {error, there is no case for 13}
      1: write( 1 );
      2: write( 2 );
      3: write( 3 );
    end;

debug must be turned on at the point that the case keyword is seen, for example

    {$d+} case {$d-} e of  {will detect a case error}
    ...
    end;

whereas in this case statement, debug is turned off where the case keyword is, so there is no debug checking:

    {$d-} case {$d+} e of  {will not detect a case error}
    ...
    end;

no return value assigned to a function
for example

    {$d+}
    function f : integer;
    begin
      if someCondition then
         f := 99;
    end; {error: if someCondition is false then f has no return value}

this option is enabled for the function if debug is enabled at the point of the function keyword. For example, this function is not checked for unassigned return value:

    {$d-} function {$d+} notest: integer;
    begin
      writeln( 'unassigned return value not detected because debug disabled' );
    end; { notest }

division by zero (including mod)
pred & succ extend a variable outside its declared range
for example

       type colour = (red, yellow, green);
       ...
       next := succ(green)  { error, there is no value that follows green }

sqrt of a number less than zero
natural logarithm of a number less than or equal to zero
exp(x) exceeds the largest representable real number (typically about 1.8e308)
integer overflow
This happens when the result of any integer add, subtract or multiply cannot be represented by an integer.
ie, the result is bigger than maxint, or smaller than -maxint (or more likely -maxint-1).
for example

        var i,j: integer;
        ....
          i := maxint - 2;
          j := i+3;            { this is bigger than maxint }

record variant mismatches tag
this is controlled by the {$v+} option, see comments above

p5c Utility Programs

Build Tools

There is a bash shell script called pc that compiles a pascal program.

To compile tp5c.pas, for example, type something like this

     ./pc tp5c

Before compiling, the script checks whether tp5c.pas has been updated since it was last compiled; if not, there is no need to recompile it.

It is set to assume that the p5c compiler and p5c.h are in the directory defined by the P5CDIR shell variable.

To get a quick help message, run the script with no arguments,

    ./pc

If you need to compile with optimisation turned on, you can add an optimise option, eg

    ./pc -O1 tp5c

Other optimisation options are available, depending on your version of the gcc compiler.

This script can be called with -cpp as an option,

    ./pc -cpp tp5c

This runs the c preprocessor (cpp) on the pascal source file.

Alternatively, the pc script will run the c preprocessor if one of the first 10 lines of the pascal source file starts with a '#' character.
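The first-10-lines rule described above can be sketched as a small shell function. This is only an illustration of the idea, not the code in the pc script. The function name uses_cpp is made up, and the sketch assumes the '#' must be the very first character of the line (no leading spaces).

```shell
# Hypothetical helper, not taken from the pc script.
# uses_cpp FILE
#   succeeds (status 0) if one of the first 10 lines of FILE starts with '#'
uses_cpp() {
    # head keeps only the first 10 lines, grep -q tests them silently;
    # '^#' matches a '#' in column 1 only (an assumption of this sketch)
    head -n 10 "$1" | grep -q '^#'
}
```

A script could then decide with something like `if uses_cpp tp5c.pas; then ...` whether to run cpp before the compiler.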

If the pascal file has include files, it will be recompiled if one of the include files has been updated.

You can define preprocessor symbols on the command line with the -Dsomething option. There are 2 forms, for example:

    ./pc -DFOO tp5c
    ./pc -DSIZE=100 tp5c

The first form defines a preprocessor symbol FOO with a default value of 1.
You can use this form for code like this:

    #if defined FOO
       ... code that depends on FOO
    #endif

The second form defines a preprocessor symbol SIZE with a value of 100.

Defining a value on the command line also forces use of the c preprocessor.

Any other options on the command line are passed directly to the gnu c compiler. See your gcc documentation for details. For example, the -v option shows the phases of the gcc compilation:

    ./pc -v tp5c

The script has logic that determines whether tp5c.pas needs to be compiled or not, so can avoid recompilation altogether.
Recompilation is necessary on any of the following conditions:

the executable file does not exist
the source file has been updated
the compile options have changed
any of the include files in the source have changed
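The four conditions above can be sketched as a shell function. This is a guess at the shape of the logic, not the code in the pc script. The name needs_rebuild and the idea of recording the previous compile options in a separate file are assumptions made for this example.

```shell
# Hypothetical sketch, not taken from the pc script.
# needs_rebuild EXE SRC OPTS_FILE OPTS [INCLUDE...]
#   succeeds (status 0) when the program must be recompiled
#   OPTS_FILE is assumed to hold the options used for the last build
needs_rebuild() {
    local exe=$1 src=$2 opts_file=$3 opts=$4 inc
    shift 4

    # 1. the executable file does not exist
    if [ ! -e "$exe" ]; then return 0; fi

    # 2. the source file has been updated (it is newer than the executable)
    if [ "$src" -nt "$exe" ]; then return 0; fi

    # 3. the compile options have changed since the last build
    if [ ! -f "$opts_file" ] || [ "$(cat "$opts_file")" != "$opts" ]; then
        return 0
    fi

    # 4. any of the include files is newer than the executable
    for inc in "$@"; do
        if [ "$inc" -nt "$exe" ]; then return 0; fi
    done

    return 1    # nothing changed, recompilation can be skipped
}
```

The checks run from cheapest to most expensive, so a missing executable is detected without looking at any time stamps.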

There is another bash shell script called r that similarly compiles and then runs the pascal program.

To run tp5c.pas, for example, type something like this

    ./r tp5c

To get a quick help message, run the script with no arguments,

    ./r

pan - the pascal analyser

pan uses gcc's deep code analysis to find bugs and potential bugs in your code, eg uninitialised variables, overflow, array bounds errors.
It also reports unused variables, functions etc.
It can't identify all errors, but can still quickly spot otherwise hard to find bugs.

Call it like this:

    pan -N myprog

where N is a number between 0 and 3 and specifies the level of detail (and accuracy) in the output report. 'myprog' is your pascal source code, without the extension (ie .pas or .p).

level 0 looks for uninitialised variables only.
level 1 does the same and also looks for unused variables, parameters and functions, etc. It uses a deeper level of gcc's code analysis, so can find errors missed by level 0, but now the line numbers are less accurate.
level 2 looks for even more errors and potential errors.
level 3 gives an even more detailed report, but it has a higher chance of reporting false positives (ie code that is merely suspicious but otherwise correct).

Level 1 is the default level.
If one of the first 10 lines of your pascal code starts with a '#', the code is passed through the c preprocessor before being compiled.
pan also accepts an option to run the c pre-processor over the code before doing the analysis, eg

    pan -cpp myprog

Notes:

pan automatically uses the {$n+} option to enable the correct (or approximate) line numbers, so you don't need to set it yourself.
all global variables are initialised to 0 by gcc, so pan will never report an uninitialised global variable.
array components that are not initialised are reported like this:
```
   ‘a[+1]’ is used uninitialized ...
```
gcc assumes all arrays start at index 0, and a[+1] means index 1 relative to the first index.
For example, suppose we have an array
```
 var ac: array['a'..'z'] of integer;
```
and pan reports something like this
```
  ‘ac[+1]’ is used uninitialized
```
then the +1 means that ac['b'] has no initial value, since ac['a'] in pascal corresponds to ac[0] in gcc.
details of the inner workings of this can be found in the documentation for your version of gcc. Look for the section on warning options.

ppr - the pascal printer

ppr is a simple script that uses the a2ps program to format a pascal source file and send it to the local printer.
The settings, like landscape or portrait mode, can be changed by editing the ppr script.
Consult your a2ps documentation or type 'man a2ps' or 'info a2ps' for details of available settings.

rv - the pascal memory checker

rv is a shell script that finds uninitialised data and memory that is not disposed in your program.

rv is called and behaves almost the same as the r build and run script.
The difference is it runs your program in a virtual machine (called valgrind) that checks every memory reference. Any references that use uninitialised data are reported.

Notes:
- you may need to install valgrind
- valgrind is not available on all operating systems

coverage analysis

How do you know how good your test code is?
One indicator is code coverage - ideally your tests should exercise every line of code in your program.
In practice this is very difficult, so >90% coverage is regarded as good.

The tcov.sh shell script is a template to help you analyse code coverage.
It cannot know how to run your tests, so you need to edit this script to run them. Everything else should already be there.

There are four steps to find code coverage:

clear the counters. This step is optional: if the counters are not cleared they just count up from the previous run.
build an instrumented version of your program.
run the tests. This is the part of tcov.sh that needs to be modified for the program under test.
show the results. tcov.sh looks for a program called lcov to turn the results into an html page to display with the konqueror web browser. If lcov does not exist the results are displayed as text.
You may want to modify this part of the script to use your favourite display tools rather than lcov and konqueror. gcovr may be a good alternative, see http://gcovr.com/
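The four steps above might look roughly like the outline below. It is not the text of tcov.sh. It assumes that pc hands --coverage on to gcc like any other option it does not recognise, that the gcov data files (*.gcda) end up in the current directory, and that the generated C file is called myprog.c. The helper names run and coverage_steps are made up for this example. Set DRY_RUN=1 to print the commands instead of running them.

```shell
# Hypothetical outline of a coverage run, not the contents of tcov.sh.

# run CMD...: execute CMD, or just print it when DRY_RUN=1
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# coverage_steps PROG: PROG is the pascal program name without extension
coverage_steps() {
    local prog=$1

    # 1. clear the counters (optional): deleting the data files resets them
    run rm -f ./*.gcda

    # 2. build an instrumented program (assumes pc passes --coverage to gcc)
    run ./pc --coverage "$prog" || return 1

    # 3. run the tests: replace this line with your own test commands
    run "./$prog"

    # 4. show the results: html via lcov if installed, else text via gcov
    if command -v lcov >/dev/null 2>&1; then
        run lcov --capture --directory . --output-file "$prog.info"
        run genhtml "$prog.info" --output-directory "$prog-coverage"
    else
        run gcov "$prog.c"    # assumes the generated C file is kept
    fi
}
```

Calling `DRY_RUN=1 coverage_steps myprog` lists the commands without touching any files, which is a safe way to check the outline before adapting it.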

Contents

the pascal language
using the p5c compiler
p5x - pascal extensions
implementation notes
