|
From: <kin...@us...> - 2025-09-24 21:20:44
|
Revision: 7484
http://sourceforge.net/p/teem/code/7484
Author: kindlmann
Date: 2025-09-24 21:20:39 +0000 (Wed, 24 Sep 2025)
Log Message:
-----------
still hacking, and may have fixed longstanding bugs: never enforced that unflagged variable parameter options had min > 0, and, may not have checked that unflagged options without defaults were satisfied by command-line
Modified Paths:
--------------
teem/trunk/src/hest/README.md
teem/trunk/src/hest/adders.c
teem/trunk/src/hest/methodsHest.c
teem/trunk/src/hest/parsest.c
teem/trunk/src/hest/test/tparse.c
Modified: teem/trunk/src/hest/README.md
===================================================================
--- teem/trunk/src/hest/README.md 2025-09-24 18:49:34 UTC (rev 7483)
+++ teem/trunk/src/hest/README.md 2025-09-24 21:20:39 UTC (rev 7484)
@@ -16,8 +16,8 @@
- The set of arguments that logically belong together (often following a flag) in the service of setting a variable are called _parameters_ (or _parms_). There is some slippage of terminology between the `char *` string that communicates the parameter, and the value (such an `int`) parsed from the parameter string.
- Separately, and possibly confusingly, `hest`'s behavior has many knobs and controls, stored in the `hestParm` struct. The pointer-to-struct is always named `hparm` in the code, to try to distinguish it from the parameters appearing on the command-line.
- An _option_ determines how to set one C variable. In the C code, one `hestOpt` struct stores everything about how to parse one option, _and_ the results of that parsing. An array of `hestOpt` structs (not pointers to structs) is how a `hest`-using program communicates what it wants to learn from the command-line. The `hestOpt` array is usually built up by calls to one of the `hestOptAdd` functions.
-- On the command-line, the option may be defined by a flag and its associated parms; this is a _flagged_ option. Options may also be _unflagged_, or what others call "positional" arguments, because which C variable is set by parsing that option is disambiguated by the option's position on the command-line, and the corresponding ordering of `hestOpt` structs.
-- The typical way of using `hest` is to process _all_ the args in the `argv` you give it. In this way `hest` is more like Python's `argparse` that tries to make sense of the entire command-line, rather than, say, POSIX `getopt` which sees some parts of `argv` as flag-prefixed options but the rest as "operands". `hest` doesn't know what an operand is, and tries to slot every `argv` element into an argument of some (possibly unflagged) option.
+- On the command-line, the option may be specified by a flag and its associated parms; this is a _flagged_ option. Options may also be _unflagged_, or what others call "positional" arguments, because which C variable is set by parsing that option is disambiguated by the option's position on the command-line, and the corresponding ordering of `hestOpt` structs in the `hestOpt` array.
+- `hest`'s goal is to process _all_ the args in the `argv` you give it. In this way `hest` is more like Python's `argparse` that tries to make sense of the entire command-line, rather than, say, POSIX `getopt` which sees some parts of `argv` as flag-prefixed options but leaves the rest as "operands" for something else downstream to interpret. `hest` doesn't know what an operand is, and tries to slot every `argv` element into an argument of some (possibly unflagged) option.
- An option may have no parms, one parm, a fixed number of parms, or a variable number of parms. Unflagged options must have one or more parms. With `mv *.txt dir`, the `*.txt` filenames could be parsed as a variable number of parms for an unflagged option, and `dir` would be a fixed single parm for a second unflagged option. Flagged options can appear in any order on the command-line, and the same option can be repeated: later appearances over-ride earlier appearances.
- Sometimes multiple command-line options need to be saved and re-used together, over a time span longer than one shell or any variables set it. Command-line options can thus be stored in _response files_, and the contents of response files effecively expanded into the command-line. Response files can have comments, and response files can name other response files.
- The main `hest` function that does the parsing is `hestParse`. Its job is to set one variable (which may have multiple components) for every `hestOpt`. Information for setting each variable can come from the command-line, or from the default string set in the `hestOpt`, but it has to come from somewhere. Essentially, if no default string is given, then the option _must_ be set on the command-line (or a response file named there). In this sense, `hest`'s "options" are badly named, because they are not really optional.
@@ -57,7 +57,7 @@
min > 0 1 2
```
-The `kind` of option is independent of whether it is flagged or unflagged, and independent of being optional (due to the `hestOpt` having a default string) versus required (when no default is given). Note that users of `hest` may not ever need to worry about `kind`, and it certainly is not part of the API calls to create options and parse command-lines.
+The `kind` of option is mostly independent of whether it is flagged or unflagged, and independent of being optional (due to the `hestOpt` having a default string) versus required (when no default is given). The one wrinkle is that unflagged options must have at least one parameter (i.e. `min` > 0), either by the command-line or via a default string. An unflagged option allowed to have zero parameters has no textual existence and so is unactionable. Thus for unflagged options, `kind`s 1 and 4 are ruled out; kind 5 is possible with `min` >= 1. This is likely already too much low-level information: users of `hest` will likely not ever need to worry about `kind`, and certainly `kind` is not part of the API calls to create options and parse command-lines.
Some **concrete examples** may be helpful for understanding `hest`'s utility ... (IN PROGRESS) ...
... Give some specific examples, flagged and unflagged ...
Modified: teem/trunk/src/hest/adders.c
===================================================================
--- teem/trunk/src/hest/adders.c 2025-09-24 18:49:34 UTC (rev 7483)
+++ teem/trunk/src/hest/adders.c 2025-09-24 21:20:39 UTC (rev 7484)
@@ -22,21 +22,19 @@
/*
Since r6184 2014-03-17, GLK has noted (in ../TODO.txt):
- (from tendGlyph.c): there needs to be an alternative API for hest
- that is not var-args based (as is hestOptAdd). You can't tell when
- you've passed multiple strings for the detailed usage information by
- accident. GLK had accidentally inserted a comma into my multi-line
- string for the "info" arg, relying on the automatic string
- concatenation, and ended up passing total garbage to hestOptAdd for
- the airEnum pointer, causing him to think that the tenGlyphType airEnum
- was malformed, when it was in fact fine ...
+ (from tendGlyph.c): there needs to be an alternative API for hest that is not var-args
+ based (as is hestOptAdd). You can't tell when you've passed multiple strings for the
+ detailed usage information by accident. GLK had accidentally inserted a comma into my
+ multi-line string for the "info" arg, relying on the automatic string concatenation,
+ and ended up passing total garbage to hestOptAdd for the airEnum pointer, causing him
+ to think that the tenGlyphType airEnum was malformed, when it was in fact fine ...
This motivated the r7026 2023-07-06 addition of non-var-args hestOptAdd_nva, which
would have caught the above error.
-The underlying issue there, though, is the total lack of type-checking associated with
-the var-args functions. Even without var-args, the "void*" type of the value storage
-pointer is still a problem. Therefore, the functions in this file help do as much
-type-checking as possible with hest. These functions cover nearly all uses of hest
+The underlying issue there, though, is MORE than the total lack of type-checking
+associated with the var-args functions. Even without var-args, the "void*" type of the
+value storage pointer is still a problem. Therefore, the functions in this file help do
+as much type-checking as possible with hest. These functions cover all uses of hest
within Teem (and in GLK's SciVis class), in a way that is specific to the type of the
value storage pointer valueP, which is still a void* even in hestOptAdd_nva. Many of the
possibilities here are unlikely to be needed (an option for 4 booleans?), but are
@@ -52,14 +50,14 @@
type (like void* is the generic pointer type), and, we're not doing compile-time checks
on the non-NULL-ity of hestCB->destroy. So it all devolves back to plain void*. Still,
the hestOptAdd_*_Other function are generated here to slightly simplify the hestOptAdd
-call, since there is no more NULL and NULL for sawP and enum. The only way around this
-particular type-checking black hole is still extreme attentiveness.
+call, since there is no more NULL and NULL for sawP and enum, respectively. The only way
+around this particular type-checking black hole is still extreme attentiveness.
Two other design points of note:
(1) If we're moving to more type checking, why not make the default values be something
other than a mere string? Why isn't the default typed in a way analogous to the newly
-typed valueP? This is a great idea, and it was actually briefly tried, but then
+typed valueP? This is a great idea, and it was actually briefly tried in 2023, but then
abandoned. One of the original good ideas that made hest work (when it was created) was
the recognition that if the point is to get values out of argv strings collected from the
command-line, then you are absolutely unavoidably in the business of parsing values from
Modified: teem/trunk/src/hest/methodsHest.c
===================================================================
--- teem/trunk/src/hest/methodsHest.c 2025-09-24 18:49:34 UTC (rev 7483)
+++ teem/trunk/src/hest/methodsHest.c 2025-09-24 21:20:39 UTC (rev 7484)
@@ -396,6 +396,11 @@
and the 99 non-var-args hestOptAdd_* functions also all call this.
Like hestOptAdd has done since 2013: returns UINT_MAX in case of error.
+
+NOTE that we do NOT do here ANY error checking on the validity of the arguments passed,
+e.g. enforcing that we have a non-NULL sawP if min != max (a variable parameter option),
+or that without a flag (`flag` is NULL) we must have min > 0. All of that is done later,
+in _hestOPCheck.
*/
unsigned int
hestOptAdd_nva(hestOpt **optP, const char *flag, const char *name, int type,
@@ -491,30 +496,28 @@
return NULL;
}
-/*
- * _hestOPCheck
- *
- * new biff-based container for all logic that used to be in _hestOptCheck (which is
- * the 2025 rename of _hestPanic): the validation of the given hestOpt array `opt` itself
- * (but *not* anything about the command-line or its parsing), relative to the given
- * (non-NULL) hestParm `hparm`.
- *
- * Pre-2025, hest did not depend on biff, and this instead took a 'char *err' that
- * somehow magically had to be allocated for the size of any possible error message
- * generated here. The 2025 re-write recognized that biff is the right way to accumulate
- * error messages, but the use of biff is internal to biff, and not (unusually for Teem)
- * part of the the expected use of biff's API. Thus, public functions hestOptCheck() and
- * hestOptParmCheck(), which are the expected way to access the functionality herein,
- * take a `char **errP` arg into which a message is sprintf'ed, after allocation.
- *
- * The shift to using biff removed how this function used to fprintf(stderr) some
- * messages like "panic 0.5" which were totally uninformative. Now, hestOptCheck() and
- * hestOptParmCheck(), which both call _hestOPCheck, will fprintf(stderr) the informative
- * biff message.
- *
- * Prior to 2023 code revisit: this used to set the "kind" in all the opts, but now that
- * is more appropriately done at the time the option is added.
- */
+/* _hestOPCheck
+New biff-based container for all logic that originated in _hestOptCheck (which is the
+2025 rename of _hestPanic): the validation of the given hestOpt array `opt` itself (but
+*not* anything about the command-line or its parsing), relative to the given (non-NULL)
+hestParm `hparm`.
+
+Pre-2025, hest did not depend on biff, and this instead took a 'char *err' that somehow
+magically had to be allocated for the size of any possible error message generated here.
+The 2025 re-write recognized that biff is the right way to accumulate error messages, but
+the use of biff is internal to biff, and not (unusually for Teem) part of the the
+expected use of biff's API. Thus, public functions hestOptCheck() and hestOptParmCheck(),
+which are the expected way to access the functionality herein, take a `char **errP` arg
+into which a message is sprintf'ed, after allocation.
+
+The shift to using biff removed how this function used to fprintf(stderr) some messages
+like "panic 0.5" which were totally uninformative. Now, hestOptCheck() and
+hestOptParmCheck(), which both call _hestOPCheck, will fprintf(stderr) the informative
+biff message.
+
+Prior to 2023 code revisit: this used to set the "kind" in all the opts, but now that is
+more appropriately done at the time the option is added.
+*/
int
_hestOPCheck(const hestOpt *opt, const hestParm *hparm) {
if (!(opt && hparm)) {
@@ -552,7 +555,7 @@
biffAddf(HEST,
"%s%sopt[%u] (%s) is type \"enum\", but no "
"airEnum pointer given",
- _ME_, opi, opt[opi].flag ? opt[opi].flag : "?");
+ _ME_, opi, opt[opi].flag ? opt[opi].flag : "unflagged");
return 1;
}
}
@@ -561,7 +564,7 @@
biffAddf(HEST,
"%s%sopt[%u] (%s) is type \"other\", but no "
"callbacks given",
- _ME_, opi, opt[opi].flag ? opt[opi].flag : "?");
+ _ME_, opi, opt[opi].flag ? opt[opi].flag : "unflagged");
return 1;
}
if (!(opt[opi].CB->size > 0)) {
@@ -670,8 +673,14 @@
}
*/
free(tbuff);
+ } else { // ------ end of if (opt[opi].flag)
+ // opt[opi] is unflagged
+ if (!opt[opi].min) {
+ biffAddf(HEST, "%s%sunflagged opt[%u] (name %s) must have min >= 1, not 0", _ME_,
+ opi, opt[opi].name ? opt[opi].name : "not set");
+ return 1;
+ }
}
- // ------ end of if (opt[opi].flag)
if (1 == opt[opi].kind) {
if (!opt[opi].flag) {
biffAddf(HEST, "%s%sopt[%u] flag must have a flag", _ME_, opi);
Modified: teem/trunk/src/hest/parsest.c
===================================================================
--- teem/trunk/src/hest/parsest.c 2025-09-24 18:49:34 UTC (rev 7483)
+++ teem/trunk/src/hest/parsest.c 2025-09-24 21:20:39 UTC (rev 7484)
@@ -600,7 +600,7 @@
if (hparm->responseFileEnable && tharg->str[0] == RESPONSE_FILE_FLAG) {
if (hestSourceDefault == topHin->source) {
biffAddf(HEST,
- "%s%s(iter %u, on %s) %s response file not handled in this source",
+ "%s%s(iter %u, on %s) %s response files not handled in this source",
_ME_, iters, srcstr, tharg->str);
return 1;
} else {
@@ -697,12 +697,19 @@
return ident;
}
-// havecTransfer moves `num` args from `hvsrc` (starting at `srcIdx`) to `hvdest`
+/* havecTransfer
+(if `num`) moves `num` args from `hvsrc` (starting at `srcIdx`) to `opt->havec`
+This takes `hestOpt *opt` instead of `opt->havec` so that we can also take this
+time to set `opt->source` according to the incoming `hvsrc->harg[]->source`. To
+minimize cleverness, we set `opt->source` with every transferred argument, which
+means that the per-option source remembered is the source of the *last* argument of the
+option.
+*/
static int
-havecTransfer(hestArgVec *hvdst, hestArgVec *hvsrc, uint srcIdx, uint num,
+havecTransfer(hestOpt *opt, hestArgVec *hvsrc, uint srcIdx, uint num,
const hestParm *hparm) {
- if (!(hvdst && hvsrc)) {
- biffAddf(HEST, "%s%sgot NULL dst %p or src %p", _ME_, AIR_VOIDP(hvdst),
+ if (!(opt && hvsrc)) {
+ biffAddf(HEST, "%s%sgot NULL opt %p or hvsrc %p", _ME_, AIR_VOIDP(opt),
AIR_VOIDP(hvsrc));
return 1;
}
@@ -717,10 +724,12 @@
hvsrc->len, num, srcIdx);
return 1;
}
- // okay now do the work, starting with empty destination
- hestArgVecReset(hvdst);
+ // okay now do the work, starting with empty destination havec
+ hestArgVecReset(opt->havec);
for (uint ai = 0; ai < num; ai++) {
- hestArgVecAppendArg(hvdst, hestArgVecRemove(hvsrc, srcIdx));
+ hestArg *harg = hestArgVecRemove(hvsrc, srcIdx);
+ hestArgVecAppendArg(opt->havec, harg);
+ opt->source = harg->source;
}
}
return 0;
@@ -765,8 +774,8 @@
As a result of this work, the passed `havec` is shortened: all args associated with
flagged opts are removed, so that later work can extract args for unflagged opts.
-The sawP information is not set here, since it is better set at the final value parsing
-time, which happens after defaults are enstated.
+For variable parameter options, the sawP information is not set here, since it is better
+set at the final value parsing time, which happens after defaults are enstated.
This is where, thanks to the action of whichOptFlag(), "--" (and only "--" due to
VAR_PARM_STOP_FLAG) is used as a marker for the end of a flagged variable parameter
@@ -858,19 +867,14 @@
printf("%s: ________ argv[%d]=|%s|: optIdx %u |%s| followed by %u parms\n",
__func__, argIdx, havec->harg[argIdx]->str, optIdx, theOpt->flag, parmNum);
}
- // remember from whence this option came
- theOpt->source = havec->harg[argIdx]->source;
- // lose the flag argument
if (hparm->verbosity > 1) {
hestArgVecPrint(__func__, "main havec as it came", havec);
}
+ // remember from whence this flagged option came
+ theOpt->source = havec->harg[argIdx]->source;
+ // lose the flag argument
hestArgNix(hestArgVecRemove(havec, argIdx));
- if (hparm->verbosity > 1) {
- char info[AIR_STRLEN_HUGE + 1];
- sprintf(info, "main havec after losing argIdx %u", argIdx);
- hestArgVecPrint(__func__, info, havec);
- }
- if (havecTransfer(theOpt->havec, havec, argIdx, parmNum, hparm)) {
+ if (havecTransfer(theOpt, havec, argIdx, parmNum, hparm)) {
biffAddf(HEST, "%s%strouble transferring %u args for %s", _ME_, parmNum,
identStr(ident1, theOpt));
return 1;
@@ -904,7 +908,7 @@
}
// if needs to be set but hasn't been
if (needing && hestSourceUnknown == theOpt->source) {
- biffAddf(HEST, "%s%sdidn't get required %s", _ME_, identStr(ident1, theOpt));
+ biffAddf(HEST, "%s%sdidn't see required %s", _ME_, identStr(ident1, theOpt));
return 1;
}
}
@@ -946,7 +950,7 @@
}
if (!unflagNum) {
/* no unflagged options; we're ~done */
- goto anythingleft;
+ goto finishingup;
}
uint unflag1st = nextUnflagged(0, opt);
if (hparm->verbosity) {
@@ -958,8 +962,15 @@
unflagVar < optNum;
unflagVar = nextUnflagged(unflagVar + 1, opt)) {
if (opt[unflagVar].kind > 3) {
- // kind 4 = single variable parm; kind 5 = multiple variable parm
- break;
+ if (4 == opt[unflagVar].kind) {
+ // (should have been enforced by _hestOPCheck)
+ biffAddf(HEST, "%s%sopt[%u] is (disallowed) unflagged single var parm (kind 4)",
+ _ME_, unflagVar);
+ return 1;
+ } else {
+ // kind 5 = multiple variable parm; we allow one of these
+ break;
+ }
}
}
/* now, if there is a variable parameter unflagged opt (NOTE that _hestOPCheck()
@@ -972,14 +983,14 @@
}
/* grab parameters for all unflagged opts before opt[unflagVar] */
- for (uint opi = nextUnflagged(0, opt); opi < unflagVar;
+ for (uint opi = nextUnflagged(0, opt); //
+ opi < unflagVar;
opi = nextUnflagged(opi + 1, opt)) {
if (hparm->verbosity) {
printf("%s: looking at opi = %u kind %d\n", __func__, opi, opt[opi].kind);
}
- opt[opi].source = havec->harg[0]->source;
- if (havecTransfer(opt[opi].havec, havec, 0, opt[opi].min /* min == max */, hparm)) {
- biffAddf(HEST, "%s%strouble getting args for %s", _ME_,
+ if (havecTransfer(opt + opi, havec, 0, opt[opi].min /* min == max */, hparm)) {
+ biffAddf(HEST, "%s%strouble getting args for unflagged %s", _ME_,
identStr(ident, opt + opi));
return 1;
}
@@ -1005,10 +1016,8 @@
the sole variable parameter unflagged option, so snarf them up */
for (uint opi = nextUnflagged(unflagVar + 1, opt); opi < optNum;
opi = nextUnflagged(opi + 1, opt)) {
- opt[opi].source = havec->harg[nvp]->source;
- if (havecTransfer(opt[opi].havec, havec, nvp, opt[opi].min /* min == max */,
- hparm)) {
- biffAddf(HEST, "%s%strouble getting args for %s", _ME_,
+ if (havecTransfer(opt + opi, havec, nvp, opt[opi].min /* min == max */, hparm)) {
+ biffAddf(HEST, "%s%strouble getting args for unflagged %s", _ME_,
identStr(ident, opt + opi));
return 1;
}
@@ -1025,7 +1034,7 @@
printf("%s: unflagVar=%u: min, nvp, max = %u %d %d\n", __func__, unflagVar,
opt[unflagVar].min, nvp, _hestMax(opt[unflagVar].max));
}
- /* we'll do error checking for unexpected args later */
+ /* we'll do error checking for unexpected args next */
if (nvp) {
/* pre-2023: this check used to be done regardless of nvp, but that incorrectly
triggered this error message when there were zero given parms, but the default
@@ -1036,17 +1045,27 @@
identStr(ident, opt + unflagVar), nvp);
return 1;
}
- opt[unflagVar].source = havec->harg[0]->source;
- if (havecTransfer(opt[unflagVar].havec, havec, 0, nvp, hparm)) {
- biffAddf(HEST, "%s%strouble getting args for %s", _ME_,
+ if (havecTransfer(opt + unflagVar, havec, 0, nvp, hparm)) {
+ biffAddf(HEST, "%s%strouble getting args for unflagged %s", _ME_,
identStr(ident, opt + unflagVar));
return 1;
}
}
- } else {
- hestArgVecReset(opt[unflagVar].havec);
}
-anythingleft:
+
+ /* make sure that unflagged options without default were given */
+ for (uint opi = nextUnflagged(0, opt); //
+ opi < optNum;
+ opi = nextUnflagged(opi + 1, opt)) {
+ if (!(opt[opi].dflt) && hestSourceUnknown == opt[opi].source) {
+ char ident1[AIR_STRLEN_HUGE + 1];
+ biffAddf(HEST, "%s%sdidn't see required (unflagged) %s", _ME_,
+ identStr(ident1, opt + opi));
+ return 1;
+ }
+ }
+
+finishingup:
if (hparm->verbosity) {
optAllPrint(__func__, "end of havecExtractUnflagged", opt);
hestArgVecPrint(__func__, "end of havecExtractUnflagged", havec);
@@ -1053,9 +1072,10 @@
}
if (havec->len) {
biffAddf(HEST,
- "%s%safter handling %u unflagged opts, have %u unexpected args "
- "(starting with \"%s\")",
- _ME_, unflagNum, havec->len, havec->harg[0]->str);
+ "%s%safter getting %u unflagged opts, have %u unexpected arg%s "
+ "%s\"%s\"",
+ _ME_, unflagNum, havec->len, havec->len > 1 ? "s," : "",
+ havec->len > 1 ? "starting with " : "", havec->harg[0]->str);
return 1;
}
return 0;
@@ -1067,7 +1087,8 @@
processed (by transferring the arguments to per-option opt->havec arrays), but we need to
ensure that every option has information from which to set values. The per-option
opt->dflt string is what we look to now, to finish setting per-option opt->havec arrays
-for all the options for which opt->havec have not already been set.
+for all the options for which opt->havec have not already been set. We use
+`hestSourceUknown == opt->source` as the indicator of not already being set.
*/
static int
optProcessDefaults(hestOpt *opt, const hestParm *hparm) {
Modified: teem/trunk/src/hest/test/tparse.c
===================================================================
--- teem/trunk/src/hest/test/tparse.c 2025-09-24 18:49:34 UTC (rev 7483)
+++ teem/trunk/src/hest/test/tparse.c 2025-09-24 21:20:39 UTC (rev 7484)
@@ -46,7 +46,9 @@
&slen);
int *unpB;
unsigned int sawB;
- hestOptAdd_Nv_Int(&opt, NULL, "B B", 1, -1, &unpB, NULL, "unflagged B", &sawB);
+ hestOptAdd_Nv_Int(&opt, NULL, "B B", 1, -1, &unpB, "BBBB", "unflagged B", &sawB);
+ /* int unpB[2];
+ hestOptAdd_2_Int(&opt, NULL, "B B", unpB, NULL, "unflagged B"); */
int flag;
hestOptAdd_Flag(&opt, "b,bingo", &flag, "a flag");
int glaf;
This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
|