From: <kin...@us...> - 2025-09-17 10:18:52
Revision: 7442
http://sourceforge.net/p/teem/code/7442
Author: kindlmann
Date: 2025-09-17 10:18:49 +0000 (Wed, 17 Sep 2025)
Log Message:
-----------
beginning to create machinery for re-written parsing; for now this lives alongside the old working code
Modified Paths:
--------------
teem/trunk/src/hest/CMakeLists-v2.txt
teem/trunk/src/hest/GNUmakefile
teem/trunk/src/hest/hest.h
Added Paths:
-----------
teem/trunk/src/hest/README.md
teem/trunk/src/hest/argvHest.c
teem/trunk/src/hest/test/argv.c
Modified: teem/trunk/src/hest/CMakeLists-v2.txt
===================================================================
--- teem/trunk/src/hest/CMakeLists-v2.txt 2025-09-17 10:08:23 UTC (rev 7441)
+++ teem/trunk/src/hest/CMakeLists-v2.txt 2025-09-17 10:18:49 UTC (rev 7442)
@@ -1,6 +1,6 @@
_Teem_add_library(${CMAKE_CURRENT_SOURCE_DIR}
SOURCES
- adders.c defaultsHest.c methodsHest.c parseHest.c usage.c
+ adders.c argvHest.c defaultsHest.c methodsHest.c parseHest.c usage.c
# private header
privateHest.h
PUBLIC_HEADERS
Modified: teem/trunk/src/hest/GNUmakefile
===================================================================
--- teem/trunk/src/hest/GNUmakefile 2025-09-17 10:08:23 UTC (rev 7441)
+++ teem/trunk/src/hest/GNUmakefile 2025-09-17 10:18:49 UTC (rev 7442)
@@ -46,9 +46,9 @@
$(L).PublicHdr = hest.h
$(L).PrivateHdr = privateHest.h
$(L).Obj = $(patsubst %.c,%.o, \
- adders.c defaultsHest.c methodsHest.c parseHest.c usage.c \
+ adders.c argvHest.c defaultsHest.c methodsHest.c parseHest.c usage.c \
)
-$(L).Test = ex1 ex2 ex3 ex4 ex5 ex6 strings bday tmpl
+$(L).Test = argv ex1 ex2 ex3 ex4 ex5 ex6 strings bday tmpl
####
####
####
Added: teem/trunk/src/hest/README.md
===================================================================
--- teem/trunk/src/hest/README.md (rev 0)
+++ teem/trunk/src/hest/README.md 2025-09-17 10:18:49 UTC (rev 7442)
@@ -0,0 +1,22 @@
+# `hest`: command-line parsing
+
+## Intro
+
+The purpose of `hest` is to bridge the `int argc`, `char *argv[]` command-line arguments and a set of C variables that need to be set for a C program to run. The variables can be of most any type (boolean, `int`, `float`, `char *` strings, or user-defined types), and the variables can hold single values (such as `float thresh`) or multiple values (such as `float RGBA[4]`).
+
+`hest` was created in 2002 out of frustration with how limiting other C command-line parsing libraries were, and has become essential for the utility of tools like `unu`. To the extent that `hest` bridges the interactive command-line with compiled C code, it has taken on some of the roles that in other contexts are served by scripting languages with C extensions. The `hest` code was revisited in 2023 to add long-overdue support for `--help`, and to add typed functions, like `hestOptAdd_4_Float`, for specifying options. Re-revisiting the code in 2025 finally fixed long-standing bugs with how quoted strings were handled and how response files were parsed, and added `-{`, `}-` comments.
+
+## Terminology and concepts
+
+`hest` has possibly non-standard terminology for the elements of command-line parsing. Here is a bottom-up description of the command-line and what `hest` can do with it. Note that `hest` does not follow POSIX conventions (or terminology) for command-line descriptions, because those conventions don't empower the kind of expressivity and flexibility that motivated `hest`'s creation. POSIX certainly isn't determinative for the scientific visualization contexts that Teem was built for.
+
+- What `main()` gets as `char *argv[]` is the vector of _arguments_ or _args_; each one is a `char*` string. An arg can contain spaces and other arbitrary characters if the user quoted strings or escaped characters; that is between the user and shell (the shell is responsible for taking the command-line and tokenizing it into `char *argv[]`). `hest` processes all the args in the `argv` you give it.
+- Arguments like `-v` and `-size`, which identify the variable to be set, are called _flags_.
+- Some flags are really just flags; no further information is given beyond their presence or absence. Other flags introduce subsequent arguments that together supply information for setting one variable.
+- The set of arguments that logically belong together (often following a flag) in the service of setting a variable are called _parameters_ (or _parms_). There is some slippage of terminology between the `char *` string that communicates the parameter, and the value (such as an `int`) parsed from the parameter string.
+- Separately, and possibly confusingly, `hest`'s behavior has many knobs and controls, stored in the `hestParm` struct. The pointer-to-struct is always named `hparm` in the code, to try to distinguish it from the parameters appearing on the command-line.
+- An _option_ determines how to set one C variable. In the C code, one `hestOpt` struct stores everything about how to parse one option, _and_ the results of that parsing. An array of `hestOpt` structs (not pointers to structs) is how a `hest`-using program communicates what it wants to learn from the command-line. The `hestOpt` array is usually built up by calls to one of the `hestOptAdd` functions.
+- On the command-line, the option may be defined by a flag and its associated parms; this is a _flagged_ option. Options may also be _unflagged_, or what others call "positional" arguments, because which C variable is set by parsing that option is disambiguated by the option's position on the command-line, and the corresponding ordering of `hestOpt` structs.
+- An option may have no parms, one parm, a fixed number of parms, or a variable number of parms. Unflagged options must have one or more parms. With `mv *.txt dir`, the `*.txt` filenames could be parsed as a variable number of parms for an unflagged option, and `dir` would be a fixed single parm for a second unflagged option. Flagged options can appear in any order on the command-line, and the same option can be repeated: later appearances over-ride earlier appearances.
+- Sometimes multiple command-line options need to be saved and re-used together, over a time span longer than one shell or any variables set in it. Command-line options can thus be stored in _response files_, and the contents of response files effectively expanded into the command-line. Response files can have comments, and response files can name other response files.
+- The main `hest` function that does the parsing is `hestParse`. Its job is to set one variable (which may have multiple components) for every `hestOpt`. Information for setting each variable can come from the command-line, or from the default string set in the `hestOpt`, but it has to come from somewhere. Essentially, if no default string is given, then the option _must_ be set on the command-line (or a response file named there). In this sense, `hest`'s "options" are badly named, because they are not really optional.
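To make the README's terminology concrete, here is a minimal stand-alone sketch (not part of this commit, and not using the `hest` API) of two flagged options: `-v`, a flag with no parms, and `-size`, a flag followed by two parms. The names `sketchOpts` and `sketchParse` are hypothetical. In `hest` itself this information would be described by an array of `hestOpt` structs and parsed by `hestParse`. The sketch only illustrates that flagged options may come in any order and that a later appearance overrides an earlier one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result struct; hest itself keeps results with each hestOpt */
typedef struct {
  int verbose; /* set by the flag -v, which takes no parms */
  int size[2]; /* set by the flag -size, which takes two parms */
} sketchOpts;

/* Scans argv[1..argc-1]; the caller pre-loads *out with default values.
   Returns 0 on success, 1 on an unknown flag or a flag missing its parms. */
int
sketchParse(sketchOpts *out, int argc, const char **argv) {
  int ai = 1; /* skip the program name in argv[0] */
  while (ai < argc) {
    if (!strcmp(argv[ai], "-v")) {
      out->verbose = 1;
      ai += 1;
    } else if (!strcmp(argv[ai], "-size")) {
      if (ai + 2 >= argc) {
        return 1; /* fewer than two parms remain after the flag */
      }
      /* a repeated -size simply overwrites what an earlier one set */
      out->size[0] = atoi(argv[ai + 1]);
      out->size[1] = atoi(argv[ai + 2]);
      ai += 3;
    } else {
      return 1; /* this sketch has no unflagged options */
    }
  }
  return 0;
}
```

Given the args `-size 3 4 -v -size 5 6`, the sketch leaves `size` as `{5, 6}` and `verbose` as 1; the values pre-loaded by the caller play the role of the default string in a `hestOpt`.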
Added: teem/trunk/src/hest/argvHest.c
===================================================================
--- teem/trunk/src/hest/argvHest.c (rev 0)
+++ teem/trunk/src/hest/argvHest.c 2025-09-17 10:18:49 UTC (rev 7442)
@@ -0,0 +1,165 @@
+/*
+ Teem: Tools to process and visualize scientific data and images
+ Copyright (C) 2009--2025 University of Chicago
+ Copyright (C) 2005--2008 Gordon Kindlmann
+ Copyright (C) 1998--2004 University of Utah
+
+ This library is free software; you can redistribute it and/or modify it under the terms
+ of the GNU Lesser General Public License (LGPL) as published by the Free Software
+ Foundation; either version 2.1 of the License, or (at your option) any later version.
+ The terms of redistributing and/or modifying this software also include exceptions to
+ the LGPL that facilitate static linking.
+
+ This library is distributed in the hope that it will be useful, but WITHOUT ANY
+ WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
+ PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
+ You should have received a copy of the GNU Lesser General Public License
+ along with this library; if not, see <https://www.gnu.org/licenses/>.
+*/
+
+#include "hest.h"
+#include "privateHest.h"
+
+#include <assert.h>
+
+#define INCR 32
+
+/* dereferences as char *, sets to '\0' */
+static void
+setNul(void *_c) {
+ char *c = (char *)(_c);
+ c[0] = '\0';
+ return;
+}
+
+static void
+hargInit(void *_harg) {
+ airPtrPtrUnion appu;
+ hestArg *harg;
+ harg = (hestArg *)_harg;
+ harg->str = NULL;
+ harg->len = 0;
+ appu.c = &(harg->str);
+ harg->strArr = airArrayNew(appu.v, &(harg->len), 1 /* unit */, INCR);
+ airArrayStructCB(harg->strArr, setNul, NULL);
+ /* initialize with \0 so that harg->str is "" */
+ airArrayLenIncr(harg->strArr, 1);
+ /* now harg->str = {0:'\0'} and harg->len = 1; */
+ return;
+}
+
+hestArg *
+hestArgNew(void) {
+ hestArg *harg;
+
+ harg = AIR_CALLOC(1, hestArg);
+ assert(harg);
+ hargInit(harg);
+ return harg;
+}
+
+hestArg *
+hestArgNix(hestArg *harg) {
+ if (harg) {
+ if (harg->str) {
+ /* If caller wants to keep harg->str around,
+ they need to have copied it (the pointer) and set harg->str to NULL */
+ free(harg->str);
+ }
+ airArrayNix(harg->strArr); /* leave the underlying str alone */
+ }
+ return NULL;
+}
+
+void
+hestArgAddChar(hestArg *harg, char cc) {
+ assert(harg);
+ airArrayLenIncr(harg->strArr, 1);
+ /* if this was first call after hestArgNew, we have
+ harg->str = {0:'\0', 1:'\0'} and harg->len = 2 */
+ harg->str[harg->len - 2] = cc;
+ return;
+}
+
+void
+hestArgAddString(hestArg *harg, const char *str) {
+ uint len, si;
+ assert(harg && str);
+ len = AIR_UINT(strlen(str));
+ for (si = 0; si < len; si++) {
+ hestArgAddChar(harg, str[si]);
+ }
+ return;
+}
+
+typedef union {
+ hestArg **harg;
+ hestArgVec **havec;
+ void **v;
+} hestPtrPtrUnion;
+
+hestArgVec *
+hestArgVecNew(void) {
+ hestPtrPtrUnion hppu;
+ hestArgVec *havec;
+ havec = AIR_CALLOC(1, hestArgVec);
+ assert(havec);
+ havec->harg = NULL;
+ havec->len = 0;
+ hppu.harg = &(havec->harg);
+ havec->hargArr = airArrayNew(hppu.v, &(havec->len), sizeof(hestArg), INCR);
+ airArrayStructCB(havec->hargArr, hargInit, NULL);
+ return havec;
+}
+
+void
+hestArgVecAppendString(hestArgVec *havec, const char *str) {
+ uint idx;
+ idx = airArrayLenIncr(havec->hargArr, 1);
+ hestArgAddString(havec->harg + idx, str);
+}
+
+void
+hestArgVecPrint(const hestArgVec *havec) {
+ uint idx;
+ printf("hestArgVec %p has %u args:\n", (const void *)havec, havec->len);
+ for (idx = 0; idx < havec->hargArr->len; idx++) {
+ const hestArg *harg;
+ harg = havec->harg + idx;
+ printf(" %u:<%s>", idx, harg->str);
+ }
+ printf("\n");
+}
+
+hestInput *
+hestInputNew(void) {
+ hestInput *hin;
+ hin = AIR_CALLOC(1, hestInput);
+ assert(hin);
+ hin->source = hestSourceUnknown;
+ hin->dflt = NULL;
+ hin->argc = 0;
+ hin->argv = NULL;
+ hin->argIdx = 0;
+ hin->fname = NULL;
+ hin->file = NULL;
+ return hin;
+}
+
+#if 0
+
+/* what is the thing we're currently processing to build up the arg vec */
+typedef struct {
+ int source; /* from the hestSource* enum */
+ /* ------ if source == hestSourceDefault ------ */
+ const char *dflt;
+ /* ------ if source == hestSourceCommandLine ------ */
+ int argc;
+ const char **argv;
+ unsigned int argIdx;
+ /* ------ if source == hestSourceResponseFile ------ */
+ char *fname;
+ FILE *file;
+} hestInput;
+
+#endif
\ No newline at end of file
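The `hestArg` code above relies on Teem's `airArray` to grow the string, with `len` counting the `'\0'` terminator, so each new character lands at index `len - 2` once the length has been incremented. The following stand-alone sketch shows the same bookkeeping using plain `realloc` instead of `airArray`. The names `sketchArg`, `sketchArgInit`, `sketchArgAddChar`, `sketchArgAddString`, and `sketchArgDone` are hypothetical and are not part of `hest`, and the doubling growth policy is an assumption made for brevity (the commit grows by a fixed `INCR`).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for hestArg */
typedef struct {
  char *str;
  size_t len; /* bytes in use, INCLUDING the '\0' terminator */
  size_t cap; /* bytes allocated */
} sketchArg;

/* Start out holding the empty string "", so len is 1, not 0 */
void
sketchArgInit(sketchArg *a) {
  a->cap = 32;
  a->str = calloc(a->cap, 1);
  assert(a->str);
  a->len = 1;
}

/* Append one character, keeping the string terminated */
void
sketchArgAddChar(sketchArg *a, char cc) {
  if (a->len + 1 > a->cap) {
    a->cap *= 2; /* assumed growth policy for this sketch */
    a->str = realloc(a->str, a->cap);
    assert(a->str);
  }
  a->len += 1;
  a->str[a->len - 2] = cc;   /* overwrites the old terminator */
  a->str[a->len - 1] = '\0'; /* writes the new terminator */
}

/* Append a whole string, one character at a time */
void
sketchArgAddString(sketchArg *a, const char *s) {
  for (; *s; s++) {
    sketchArgAddChar(a, *s);
  }
}

/* Release the buffer; the struct itself belongs to the caller */
void
sketchArgDone(sketchArg *a) {
  free(a->str);
  a->str = NULL;
  a->len = a->cap = 0;
}
```

Writing the terminator explicitly in `sketchArgAddChar` replaces the `setNul` callback that the commit registers through `airArrayStructCB`, which initializes each newly added element to `'\0'`.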
Modified: teem/trunk/src/hest/hest.h
===================================================================
--- teem/trunk/src/hest/hest.h 2025-09-17 10:08:23 UTC (rev 7441)
+++ teem/trunk/src/hest/hest.h 2025-09-17 10:18:49 UTC (rev 7442)
@@ -222,6 +222,34 @@
disable this behavior entirely. */
} hestParm;
+/* for building up and representing one argument */
+typedef struct {
+ char *str;
+ unsigned int len; /* NOT strlen; this includes '\0'-termination */
+ airArray *strArr;
+} hestArg;
+
+/* for building up a "vector" of arguments */
+typedef struct {
+ hestArg *harg;
+ unsigned int len;
+ airArray *hargArr;
+} hestArgVec;
+
+/* what is the thing we're currently processing to build up the arg vec */
+typedef struct {
+ int source; /* from the hestSource* enum */
+ /* ------ if source == hestSourceDefault ------ */
+ const char *dflt;
+ /* ------ if source == hestSourceCommandLine ------ */
+ int argc;
+ const char **argv;
+ unsigned int argIdx;
+ /* ------ if source == hestSourceResponseFile ------ */
+ char *fname;
+ FILE *file;
+} hestInput;
+
/* defaultsHest.c */
HEST_EXPORT int hestDefaultVerbosity;
HEST_EXPORT int hestDefaultRespFileEnable;
@@ -241,6 +269,16 @@
HEST_EXPORT char hestDefaultVarParamStopFlag;
HEST_EXPORT char hestDefaultMultiFlagSep;
+/* argvHest.c */
+HEST_EXPORT hestArg *hestArgNew(void);
+HEST_EXPORT hestArg *hestArgNix(hestArg *harg);
+HEST_EXPORT void hestArgAddChar(hestArg *harg, char cc);
+HEST_EXPORT void hestArgAddString(hestArg *harg, const char *str);
+HEST_EXPORT hestArgVec *hestArgVecNew(void);
+HEST_EXPORT void hestArgVecAppendString(hestArgVec *havec, const char *str);
+HEST_EXPORT void hestArgVecPrint(const hestArgVec *havec);
+HEST_EXPORT hestInput *hestInputNew(void);
+
/* methodsHest.c */
HEST_EXPORT const int hestPresent;
HEST_EXPORT int hestSourceUser(int src);
Added: teem/trunk/src/hest/test/argv.c
===================================================================
--- teem/trunk/src/hest/test/argv.c (rev 0)
+++ teem/trunk/src/hest/test/argv.c 2025-09-17 10:18:49 UTC (rev 7442)
@@ -0,0 +1,71 @@
+/*
+ Teem: Tools to process and visualize scientific data and images
+ Copyright (C) 2009--2023 University of Chicago
+ Copyright (C) 2005--2008 Gordon Kindlmann
+ Copyright (C) 1998--2004 University of Utah
+
+ This library is free software; you can redistribute it and/or modify it under the terms
+ of the GNU Lesser General Public License (LGPL) as published by the Free Software
+ Foundation; either version 2.1 of the License, or (at your option) any later version.
+ The terms of redistributing and/or modifying this software also include exceptions to
+ the LGPL that facilitate static linking.
+
+ This library is distributed in the hope that it will be useful, but WITHOUT ANY
+ WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
+ PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details.
+ You should have received a copy of the GNU Lesser General Public License
+ along with this library; if not, see <https://www.gnu.org/licenses/>.
+*/
+
+#include "../hest.h"
+
+int
+main(int argc, const char **argv) {
+
+ AIR_UNUSED(argc);
+ printf("%s: yo\n", argv[0]);
+ hestArg *harg = hestArgNew();
+ printf("%s: harg = %p\n", argv[0], (void *)harg);
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, 'c');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, 'a');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, 't');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, 'a');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, 's');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, 't');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, 'r');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, 'o');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, 'p');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, 'h');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, 'e');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddChar(harg, '!');
+ printf("%s: |%s|\n", argv[0], harg->str);
+ hestArgAddString(harg, "bingo bob lives\n");
+ printf("%s: |%s|\n", argv[0], harg->str);
+
+ hestArgVec *havec = hestArgVecNew();
+ hestArgVecPrint(havec);
+ hestArgVecAppendString(havec, "this");
+ hestArgVecPrint(havec);
+ hestArgVecAppendString(havec, "is");
+ hestArgVecPrint(havec);
+ hestArgVecAppendString(havec, "totally");
+ hestArgVecPrint(havec);
+ hestArgVecAppendString(havec, "");
+ hestArgVecPrint(havec);
+ hestArgVecAppendString(havec, "bonkers");
+ hestArgVecPrint(havec);
+
+ exit(0);
+}