Manual and notes for TPGetOpt
GNU getopt (from GNU textutils 1.12) originally ported to Turbo Pascal
by Håkon Løvdal, May 1996.
(Håkon is adamant you use codepage 865 for reading his name correctly).
Formatted; increased Pascalization; adjustments for recent compilers; fixes
and manual by Bloodbat, May 2023.
This unit is covered by the GNU LIBRARY GENERAL PUBLIC LICENSE; read the file
LICENSE.TXT for details.
The original 1996 port can be found in getopt10.zip with a simple Google
search.
I moved the manual and a lot of notes here because, honestly, I find them
distracting: most of the source speaks for itself and wading through pages of
comments is not my thing.
The note below is from the original porter; some statements may no longer be
true, particularly when they concern keeping original C code or interfaces.
While nothing has been redacted, it has been proofread and corrected by me,
and variable names have been made consistent. My own note follows.
-Bloodbat
------------------------------------------------------------------------------
Håkon writes:
"After being annoyed by having to 'reinvent the wheel' each time I was writing
a program that used command line options, I finally decided to port getopt.
The Pascal code is kept as close to the original C as possible.
All string handling is, however, changed into Pascal strings.
(There are a few -insignificant- things that could be changed, like
'OptIndex' being initialized to 1 twice, but I haven't done it in order to be
consistent with the original C code).
There is some room for improvements, like not using a fixed size array for
the arguments, but instead dynamically allocating only the space needed.
Expanding wildcards would also be a nice feature. But I don't plan to
implement that or to do any further development. As long as the function
processes the arguments, it serves my needs."
------------------------------------------------------------------------------
Bloodbat's note:
I found Håkon's port looking for a Turbo Pascal version of GetOpt. Free Pascal
has its own version but it requires some units DOS Borland compilers don't
have.
Code internals are still quite similar to the C original; but some changes have
been made and the interface is simplified.
The parameters array allocates only required memory; limits have been
increased; if compiled under DOS it uses less memory than Håkon's version
because it accounts for DOS' command line limits. A bug from the original,
where dynamic memory wasn't freed when options were no longer needed, is
fixed (this is important for Free Pascal's protected mode). Keep in mind that
the user is now required to call InitGetOpt before processing arguments and
DoneGetOpt to release the used heap when done. THIS IS IMPORTANT!
The characters returned when an unknown option is found or a required argument
is missing can be set by the unit's user via CharInvalidOption and
CharMissingArgument, unlike in GNU's GetOpt where, to this day, they can't be
changed without recompiling the source (NOT FUN). Just assign a new character
to either or both.
I fixed the unit not returning CharMissingArgument when a long option required
an argument that's nowhere to be found and ':' was the first character in
OptString.
When a long option had a missing argument error LongInd wasn't updated: that
has been fixed as well.
This manual is also fixed to include information on how to get a
CharMissingArgument response when required arguments are missing (going
only by the comments in the original source, finding that out was a PITA;
the current GNU manual for the C library is crystal clear in that regard).
The unit still won't expand wildcards... but, really, does it need to? I
believe that's a job for the main program.
OptIndex is set to "1" only once.
When compiled for Microsoft DOS and Windows, both "/" and "-" are valid
switch characters. This behavior can be disabled by defining "NODOSSWITCHES"
at compile time.
The unit compiles and works fine with Borland version 7 compilers and Free
Pascal.
------------------------------------------------------------------------------
MANUAL
DESCRIPTION
Using TPGetOpt makes your program accept command-line options in several
different ways. The following examples are all equivalent:
program -a -b -c --longoption value nonoption
program -abc --longoption=value nonoption
program nonoption -c --longoption value -ba
The programming interface is as follows:
Call GetOpt repeatedly until it returns EndOfOpts. Each time it returns
one option. GetOpt is called with a string describing the valid options.
A character followed by a ':' requires an argument (returned in the
variable 'OptArgument').
Example:
"GetOpt('f:vh')" accepts three options:
'-f' (e.g. file) that requires an argument,
'-v' (e.g. verbose) and
'-h' (e.g. help).
If GetOpt encounters an unknown option it returns CharInvalidOption (default:
'?'). If an option requires an argument that is not present, it returns
CharMissingArgument (default: ':'), but only when the first character of the
valid options string is ':'.
The faulty option can be obtained by reading the 'OptOption' variable.
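The calling sequence described above can be sketched as follows. This is a
minimal sketch, not one of the shipped examples: the program name ShortDemo
and the variables Verbose and FileName are made up for illustration, and the
unit is assumed to be usable as "uses TPGetOpt". A leading ':' is added to the
option string so a missing argument is reported with CharMissingArgument.

```pascal
program ShortDemo;
{ Hypothetical demo; only the TPGetOpt identifiers come from this manual. }
uses
  TPGetOpt;
var
  Opt: char;
  Verbose: boolean;
  FileName: string;
begin
  Verbose := False;
  FileName := '';
  InitGetOpt;                       { required before any other call }
  if PtrArgs = nil then             { see FAQ 1: allocation may have failed }
  begin
    WriteLn('Not enough memory for the parameter array.');
    Halt(1);
  end;
  Opt := GetOpt(':f:vh');
  while Opt <> EndOfOpts do
  begin
    case Opt of
      'f': FileName := OptArgument; { -f requires an argument }
      'v': Verbose := True;
      'h': WriteLn('usage: shortdemo [-v] [-h] [-f file]');
    else
      { CharInvalidOption and CharMissingArgument can be reassigned, }
      { so they are compared here instead of used as case labels.    }
      if Opt = CharMissingArgument then
        WriteLn('Option needs an argument: ', OptOption)
      else if Opt = CharInvalidOption then
        WriteLn('Unknown option: ', OptOption);
    end;
    Opt := GetOpt(':f:vh');
  end;
  WriteLn('verbose = ', Verbose, ', file = ', FileName);
  DoneGetOpt;                       { required to release the heap }
end.
```

The same option string is passed on every call; only the unit's own state
changes between calls.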
This version of 'GetOpt' is, to the caller, similar to the standard Unix
'GetOpt' (see below: calling is simplified) but it behaves differently for the
user: it allows the user to intersperse the options with the other arguments.
As 'GetOpt' works it permutes command line parameters so, when it is done,
all the options precede everything else. Thus all application programs are
extended to handle flexible argument order.
Setting the environment variable POSIXLY_CORRECT disables permutation.
Then behavior is completely standard (the interface is not).
Programs can use a third alternative mode in which they can distinguish the
relative order of options and other arguments.
INTERFACE PROCEDURES AND FUNCTIONS
function GetOpt(OptString: string): char;
Call repeatedly to find valid short options contained in OptString.
function GetOptLong(Options: string; LongOptions: array of TOption;
var OptIndex: integer): char;
Call repeatedly to find valid short and long options contained in
Options and LongOptions respectively.
function GetOptLongOnly(OptString: string; LongOptions: array of TOption;
var OptIndex: integer): char;
Call repeatedly to find only valid long options contained in LongOptions.
procedure InitGetOpt;
Prepare TPGetOpt internal structures. Must be called before any other
function.
procedure DoneGetOpt;
Free memory used by TPGetOpt internal structures. Must be called when
option processing is finished.
INTERFACE VARIABLES
PtrArgs: PArrayArgs;
A pointer to the parameters array. Can be used, for example, to get
non-options.
ArgCount: integer;
The total number of arguments as passed on the command line.
OptIndex: integer;
Index of the next element to be scanned in PtrArgs.
This is used for communication to and from the caller and for communication
between successive calls to 'GetOpt'.
On entry to 'GetOpt', 0 means this is the first call and it should be
initialized to 1 (as stated in POSIX 1003.2).
When 'GetOpt' returns EndOfOpts, this is the index of the first of the
non-option elements that the caller should scan itself.
'OptIndex', otherwise, communicates from one call to the next how much of
PtrArgs has been scanned so far.
OptError: boolean;
Controls whether built-in error messages are used.
OptArgument: string;
For communication from 'GetOpt' to the caller.
When 'GetOpt' finds an option that takes an argument, the argument value
is returned here.
If 'OrderingType' is otReturnInOrder, each non-option parameter is
returned here.
OptOption: char;
Set to the unrecognized option character.
CharInvalidOption: char = '?';
The character returned by GetOpt when an invalid option is found.
CharMissingArgument: char = ':';
The character returned by GetOpt when a required argument is missing.
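A rough sketch of how these variables might be used together follows. It rests
on assumptions the manual does not spell out: that PArrayArgs points to an
array of strings indexable by an integer, and that the non-options left after
scanning sit at indices OptIndex to ArgCount - 1 (suggested by the remark
that 'OptIndex' equals 'ArgCount' once every parameter has been consumed).
Check both against tpgetopt.pas before relying on them. NonOptDemo and I are
hypothetical names.

```pascal
program NonOptDemo;
{ Hypothetical demo; only the TPGetOpt identifiers come from this manual. }
uses
  TPGetOpt;
var
  Opt: char;
  I: integer;
begin
  InitGetOpt;
  OptError := False;          { no built-in error messages }
  CharInvalidOption := '!';   { report unknown options as '!' instead of '?' }
  repeat
    Opt := GetOpt('vh');
    if Opt = CharInvalidOption then
      WriteLn('Unknown option: ', OptOption);
  until Opt = EndOfOpts;
  { Assumption: what is left is PtrArgs^[OptIndex] .. PtrArgs^[ArgCount - 1]. }
  I := OptIndex;
  while I < ArgCount do
  begin
    WriteLn('Non-option: ', PtrArgs^[I]);
    Inc(I);
  end;
  DoneGetOpt;
end.
```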
LONG OPTIONS
TOption = record
Name: string;
ArgumentKind: TArgumentKind;
Flag: ^char;
Value: char;
end;
The LongOptions argument to GetOptLong or GetOptLongOnly is an array
of TOption records terminated by an element where the Name field is an
empty string.
The field 'ArgumentKind' can be:
akNone : If the option does not take an argument.
akRequired : If the option requires an argument.
akOptional : If the option takes an optional argument.
If the field 'Flag' is not "nil", it points to a variable that is set
to the value given in the field 'Value' when the option appears, but
left unchanged if the option is not present.
To have a long-named option do something other than set a 'char' to
a compiled-in constant, such as set a value from 'OptArgument', set the
option's 'Flag' field to "nil" and its 'Value' field to a valid character
(the equivalent single-letter option character, if there is one). For long
options that have a "nil" 'Flag' field, 'GetOpt' returns the contents of the
'Value' field.
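The record layout above lends itself to a typed-constant table. The following
is a sketch under assumptions: the program name LongDemo, the table LongOpts
and the variable LongIndex are invented for illustration, and every entry
uses a "nil" 'Flag' so the call returns the 'Value' character, which lets
short and long spellings share one branch.

```pascal
program LongDemo;
{ Hypothetical demo; only the TPGetOpt identifiers come from this manual. }
uses
  TPGetOpt;
const
  { The last entry has an empty Name and terminates the table. }
  LongOpts: array[0..2] of TOption = (
    (Name: 'file';    ArgumentKind: akRequired; Flag: nil; Value: 'f'),
    (Name: 'verbose'; ArgumentKind: akNone;     Flag: nil; Value: 'v'),
    (Name: '';        ArgumentKind: akNone;     Flag: nil; Value: #0));
var
  Opt: char;
  LongIndex: integer;
begin
  InitGetOpt;
  LongIndex := 0;
  Opt := GetOptLong(':f:v', LongOpts, LongIndex);
  while Opt <> EndOfOpts do
  begin
    case Opt of
      'f': WriteLn('file = ', OptArgument);   { -f x, --file x, --file=x }
      'v': WriteLn('verbose on');             { -v or --verbose }
    else
      if Opt = CharMissingArgument then
        WriteLn('Missing argument')
      else
        WriteLn('Unknown option');
    end;
    Opt := GetOptLong(':f:v', LongOpts, LongIndex);
  end;
  DoneGetOpt;
end.
```

An entry whose 'Flag' points to a 'char' variable would instead have that
variable set to 'Value' by the unit, with no branch needed in the loop.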
INTERNALS
TOrderingType = (otRequireOrder, otPermute, otReturnInOrder);
Describe how to deal with options that follow non-option parameters.
If the caller did not specify anything and the environment variable
POSIXLY_CORRECT is defined, otRequireOrder is the ordering type; otherwise
it is otPermute.
otRequireOrder means don't recognize them as options: stop option processing
when the first non-option is seen.
This is what Unix does.
This mode of operation is selected by either setting the environment
variable POSIXLY_CORRECT, or using '+' as the first character of the list
of option characters.
otPermute is the default for this unit. Parameters are permuted as they
are scanned, so that, eventually, all the non-options are at the end.
This allows options to be given in any order, even with programs that
were not written to expect this.
otReturnInOrder is an option available to programs that were written to
expect options and other parameters in any order and that care about the
ordering of the two; each non-option parameter is described as if it
were the argument of an option with character code 1 (#1).
Using '-' as the first character of the list of option characters selects
this mode of operation.
The special argument '--' forces the end of option-scanning regardless of
the value of 'Ordering'. When 'Ordering' is otReturnInOrder, only '--'
can cause 'GetOpt' to return 'EndOfOpts' with 'OptIndex' <> 'ArgCount'.
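As an illustration of otReturnInOrder, the sketch below prefixes the option
string with '-' and handles the #1 pseudo-option. OrderDemo is a hypothetical
program name; the behavior shown is only what the description above states,
not something verified against the unit's source.

```pascal
program OrderDemo;
{ Hypothetical demo; only the TPGetOpt identifiers come from this manual. }
uses
  TPGetOpt;
var
  Opt: char;
begin
  InitGetOpt;
  { The leading '-' selects otReturnInOrder. }
  Opt := GetOpt('-f:v');
  while Opt <> EndOfOpts do
  begin
    case Opt of
      #1:  WriteLn('non-option: ', OptArgument);
      'f': WriteLn('option f, argument: ', OptArgument);
      'v': WriteLn('option v');
    else
      WriteLn('problem with option: ', OptOption);
    end;
    Opt := GetOpt('-f:v');
  end;
  DoneGetOpt;
end.
```

Run as "orderdemo one -v two -f x", one would expect the branches to be
visited in command line order: non-option, v, non-option, f.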
function FindOpt(PtrArgs: PArrayArgs; OptString: string;
LongOpts: array of TOption; var LongInd: integer;
LongOnly: boolean): char;
Scans parameters (length ArgCount) for the option characters given in
OptString.
If an element of PtrArgs starts with the switch characters, and is not
exactly "-" or "--", then it is an option element. The characters of this
element (other than the initial switch character) are option characters.
If 'GetOpt' is called repeatedly, it returns successively each of the
option characters from each of the option elements.
If 'GetOpt' finds another option character, it returns that character,
updating 'OptIndex' and 'NextChar' so that the next call to 'GetOpt' can
resume the scan with the following option character or parameter.
If there are no more option characters, 'GetOpt' returns 'EndOfOpts'.
Then 'OptIndex' is the index in PtrArgs of the first PtrArgs-element that
is not an option. (The PtrArgs-elements have been permuted so that those
that are not options now come last.)
OptString is a string containing legitimate option characters.
If an option character is seen that is not listed in OptString,
CharInvalidOption (default: '?') will be returned after printing an error
message. If 'OptError' is set to False, the error message is suppressed
but CharInvalidOption is still returned.
If a character in 'OptString' is followed by a colon it requires an
argument, so the text following it in the same parameter, or the text
of the following parameter, is returned in 'OptArgument'. Two colons mean
an option that can have an optional argument; if there is text in the
current PtrArgs-element, it is returned in 'OptArgument', otherwise
'OptArgument' is set to an empty string.
If 'OptString' starts with '-' or '+', it requests different methods of
handling non-option parameters.
See the comments above about otReturnInOrder and otRequireOrder.
Long-named options begin with '--' instead of the valid switch characters.
Their names may be abbreviated as long as the abbreviation is unique or is
an exact match for some defined option. If they have an argument, it
follows the option name in the same PtrArgs-element, separated from the
option name by a '=', or else in the next PtrArgs-element.
When 'GetOpt' finds a long-named option, it returns #0 if that option's
'Flag' field is a valid pointer or the value of the option's 'Value' field
if the 'Flag' field is "nil".
'LongOpts' is an array of 'TOption' records terminated by an element
containing a name which is an empty string.
'LongInd' returns the index in 'LongOpts' of the long-named option found.
It is only valid when a long-named option has been found by the most recent
call.
If LongOnly is true, '-' as well as '--' can introduce long-named options.
Exchange
Exchanges two adjacent subsequences of parameters.
One subsequence is elements [FirstNonOpt,LastNonOpt), which contain all the
non-options that have been skipped so far.
The other is elements [LastNonOpt,OptIndex), which contain all the options
processed since those non-options were skipped.
(Both ranges are half-open: the first index is included, the last is not.)
'FirstNonOpt' and 'LastNonOpt' are relocated so that they describe the new
indices of the non-options in parameters after they are moved.
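To make the effect of Exchange concrete, here is a standalone sketch that
swaps two adjacent blocks of a small array. It does not use the unit and is
not the unit's algorithm: it uses the classic three-reversal trick, chosen
for brevity. It assumes half-open ranges as in the GNU C original, so
the non-options are elements First to Mid - 1 and the options are elements
Mid to Last - 1. All names except FirstNonOpt and LastNonOpt are made up.

```pascal
program ExchangeDemo;
{ Standalone illustration; does not use TPGetOpt. }
const
  MaxArgs = 5;
type
  TArg = string[15];
  TArgs = array[1..MaxArgs] of TArg;
var
  Args: TArgs;
  First, Mid, Last, FirstNonOpt, LastNonOpt, I: integer;

{ Reverses elements Lo..Hi (both inclusive) in place. }
procedure Reverse(var A: TArgs; Lo, Hi: integer);
var
  T: TArg;
begin
  while Lo < Hi do
  begin
    T := A[Lo];
    A[Lo] := A[Hi];
    A[Hi] := T;
    Inc(Lo);
    Dec(Hi);
  end;
end;

begin
  Args[1] := 'in.txt';  Args[2] := 'out.txt';           { skipped non-options }
  Args[3] := '-a';  Args[4] := '-b';  Args[5] := '-c';  { processed options }
  First := 1;  { plays the role of FirstNonOpt }
  Mid := 3;    { plays the role of LastNonOpt  }
  Last := 6;   { plays the role of OptIndex    }

  { Reversing each block, then the whole span, swaps the two blocks }
  { while keeping the order inside each block.                      }
  Reverse(Args, First, Mid - 1);
  Reverse(Args, Mid, Last - 1);
  Reverse(Args, First, Last - 1);

  { The non-options moved right by the length of the options block. }
  FirstNonOpt := First + (Last - Mid);
  LastNonOpt := Last;

  for I := 1 to MaxArgs do
    WriteLn(Args[I]);
  WriteLn('Non-options now occupy [', FirstNonOpt, ',', LastNonOpt, ')');
end.
```

The program prints -a, -b, -c, in.txt and out.txt on separate lines, followed
by "Non-options now occupy [4,6)".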
LIMITATIONS
The unit does not handle more than 128 parameters.
(That limit can be changed with iMaxArgs in tpgetopt.pas).
Under DOS the maximum command line length is 128 and that's the maximum
string length set for the unit under that platform (saves memory).
N.B.: The DOS command line limit of 128 characters can be less if
it is constructed in certain specific ways (for an example, check the
April 1994 issue of PC Magazine); also DOS versions lower than 6.22
may have lower character limits.
The maximum string length is 255 for other platforms: this is imposed by
Pascal itself. If for some reason you need a unit that can process more
characters, use Free Pascal's version of GetOpt: it uses AnsiStrings. The
unit's name is getopts and won't compile with Borland products.
EXAMPLES
The unit ships with four working examples: test.pas, testlong.pas,
shortest.pas and tptest.pas. While simple, the programs show some of the
unit's capabilities and how to use it.
FAQ
1. My program fails with runtime error 216! What's wrong?
A: Most likely causes are:
a) InitGetOpt was not called before trying to parse any options. Call
InitGetOpt.
b) InitGetOpt was called but you're trying to dereference a "nil"
pointer.
It is likely the compiler is set not to produce runtime errors when
the heap can't grow. Check if PtrArgs is nil.
2. My program fails with runtime error 203! Why?
A: The heap has reached its limit: check the $M directive, the memory limits
for the host OS and the available free memory.
3. My program fails with runtime error 204! What can I do?
A: Possible causes include:
a) There's not enough memory to allocate the dynamic structure:
the program needs more free RAM to run.
b) Calling DoneGetOpt more than once or calling DoneGetOpt before calling
InitGetOpt.
c) Check question 1, item b.
4. Heathen! How dare you! You changed my beloved C interface!
A: First and foremost... that's not a question. Second: Take a breath, you
modern-world hater, you. Third: I really dislike your "beloved" C
interface. It's old and self-documentation is kinda lacking. I like clear
variable names and simple calls. If you really miss it... you can
always use Free Pascal's getopts unit or the original port from
getopt10.zip.
5. This unit is kinda DOS centric, don't you think?
A: Yes, I do. That's the point. While I try to accommodate other systems, my
main focus for updating this unit is using it for my old DOS Borland
programs and games. If you want something more familiar and modern use
the getopts unit Free Pascal includes.
6. Is this unit supported?
A: Not really; it does what I need it to do. If you want it to do something
specific, the sources are available for you to play with.
If I run into something that breaks it or my programs... I'll probably
fix it. If you fix something, let me know so code can be analyzed and
merged.