You can subscribe to this list here.
2007 |
Jan
|
Feb
|
Mar
|
Apr
|
May
|
Jun
(8) |
Jul
(6) |
Aug
(5) |
Sep
(5) |
Oct
(2) |
Nov
(1) |
Dec
|
---|
From: <li...@mi...> - 2007-06-08 13:10:51
|
On Thu, Jun 07, 2007 at 05:06:33PM +0200, Michael Geng wrote: > On Thu, Jun 07, 2007 at 02:44:38PM +0200, Jim Meyering wrote: > > - one of the generated .c files I looked at calls strdup but doesn't > > check for a NULL return value or (less important) attempt to avoid > > the leak when the corresponding --backup=S option appears twice or more. > > I guess Jim looked into the mv example > (http://genparse.sourceforge.net/examples/mv_clp.c). > > I propose not to allocate any memory dynamically in Genparse at all and use > the stack instead. malloc() and strdup() might not be fast enough in all > applications. For instance in the mv command I can hardly imagine that it > is acceptable to have a call to malloc() for allocating the arg_t struct > which would have to be executed with every call to mv no matter what > command line switches or arguments were specified. By the way, the result > of this call to malloc() is also not checked for failure. > > In principle the Coreutils implementation of the mv command does it > similarly: The optind pointer which is returned by getopt_long() is first > stored in a variable on the heap and later the content of the string is > copied to a string with strdup() which allocates from the heap. In order > to leave the freedom to do so to the user of Genparse it is required that > Genparse uses only the stack. The problem with not using malloc() and strdup() is that this would break compatibility of Genparse. For this reason I suggest to add a new command line switch -s/--noheap to Genparse. If this switch is not set (the default) then Genparse will stay compatible to previous versions. If it is set then the parser interface will change: ----------------------------- The compatible version of the generated parser of the mycopy3 example (--noheap not set) will look as follows. I marked the lines which have changed with '>'. struct arg_t *Cmdline(int argc, char *argv[]) { extern char *optarg; extern int optind; int c; struct arg_t *my_args; int errflg = 0; static struct option long_options[] = { {"iterations", 1, 0, 'i'}, {"outfile", 1, 0, 'o'}, {"longopt", 0, 0, 256}, {"help", 0, 0, 'h'}, {"version", 0, 0, 'v'}, {0, 0, 0, 0} }; my_args = (struct arg_t *) malloc (sizeof(struct arg_t)); > if (my_args == NULL) > { > fprintf(stderr, "error: out of memory\n"); > return NULL; > } my_args->i = 1; my_args->o = NULL; my_args->longopt = false; my_args->h = false; my_args->v = false; optind = 0; while ((c = getopt_long(argc, argv, "i:o:hv", long_options, &optind)) != EOF) { switch(c) { case 'i': my_args->i = atoi(optarg); if (my_args->i < 0) { fprintf(stderr, "parameter range error: i must be >= 0\n"); errflg++; } break; case 'o': > if (my_args->o != NULL) > free(my_args->o); > if ((my_args->o = strdup(optarg)) == NULL) > { > fprintf(stderr, "error: out of memory\n"); > errflg++; > } break; case 256: my_args->longopt = true; break; case 'h': my_args->h = true; usage(argv[0]); break; case 'v': my_args->v = true; break; default: usage(argv[0]); } } /* while */ if (errflg) usage(argv[0]); if (optind >= argc) my_args->optind = 0; else my_args->optind = optind; return my_args; } ----------------------------- The new version which doesn't use malloc() and strdup() (--noheap set) will look as follows. void Cmdline(struct arg_t *my_args, int argc, char *argv[]) { extern char *optarg; extern int optind; int c; int errflg = 0; static struct option long_options[] = { {"iterations", 1, 0, 'i'}, {"outfile", 1, 0, 'o'}, {"longopt", 0, 0, 256}, {"help", 0, 0, 'h'}, {"version", 0, 0, 'v'}, {0, 0, 0, 0} }; my_args->i = 1; my_args->o = NULL; my_args->longopt = false; my_args->h = false; my_args->v = false; optind = 0; while ((c = getopt_long(argc, argv, "i:o:hv", long_options, &optind)) != EOF) { switch(c) { case 'i': my_args->i = atoi(optarg); if (my_args->i < 0) { fprintf(stderr, "parameter range error: i must be >= 0\n"); errflg++; } break; case 'o': > my_args->o = optarg; break; case 256: my_args->longopt = true; break; case 'h': my_args->h = true; usage(argv[0]); break; case 'v': my_args->v = true; break; default: usage(argv[0]); } } /* while */ if (errflg) usage(argv[0]); if (optind >= argc) my_args->optind = 0; else my_args->optind = optind; return my_args; } ----------------------------- The latter version is clearly the most simple one. For this reason I would recommend to always set the --noheap switch. I would also like to update all the examples including those in the info pages to use --noheap. I found another issue with the present version of Genparse: If a string has a default value then setting this parameter will always cause a memory leak because it is initialized with a strdup() result and when the parameter is actually processed it is overwritten with another strdup() result without freeing the old value. The same happens if you specify the same string parameter twice on the command line as Jim mentioned. Both cases will be solved in the code above no matter if Genparse is called with or without --noheap. Michael |
From: <li...@mi...> - 2007-06-07 15:07:49
|
Hi Mike, I suggested to use Genparse for command line parsing in the GNU Coreutils. I very quickly got a reply from Jim Meyering from the Coreutils team. See http://lists.gnu.org/archive/html/bug-coreutils/2007-06/msg00035.html for the thread on the Coreutils mailing list. I don't think it is appropriate to post all the Genparse problems to the Coreutils mailing list. But I'm posting to the "genparse-users" mailing list in order to make it available to a broader audience. In this email I'm focusing on only 1 of Jim's concerns. See my comments below. On Thu, Jun 07, 2007 at 02:44:38PM +0200, Jim Meyering wrote: > li...@mi... (Michael Geng) wrote: > > would it be an option to use Genparse (http://genparse.sourceforge.net/) > > for command line parsing in the GNU Coreutils? > > > > I'm one of the developers of Genparse and I recently used some of the > > well known Coreutils as an exercise for testing Genparse (see > > http://genparse.sourceforge.net/examples.html). Using Genparse for > > generating the command line parsing code could reduce the amount of > > coreutils source code because the input to Genparse is a short config > > file only. The overhead of writing the parser code would be delegated > > to the tool then. > > > > The Genparse generated parsers call getopt() (or getopt_long()) exactly > > the same way the Coreutils's command line parsers do it today. This > > wouldn't be changed. So the code Genparse generates will be very similar > > to the existing hand written parsers of the individual Coreutils tools. > > But calling getopt() is only part of the work, preparing and evaluating > > the results of getopt() also causes coding work which could be delegated > > to Genparse. Genparse also automatically generates a help screen which > > would no longer have to be done manually. > > Hi Michael, > > Genparse looks promising. > I like the examples. But there are almost 100 programs in the > coreutils. If genparse can really handle all of those use cases > without causing any significant degradation in the tools, then > it will be hard to object. > > How does it deal with internationalization? > > However, before I even consider it seriously, it'll need > some improvements: > > - it must detect any and all write failures[*] > - packaging so that I can run ./configure && make && make check, and > if I don't happen to have cppunit infrastructure, gcj or something > similar, it should tell me about it directly, rather than causing > harder-to-diagnose build failures. > - one of the generated .c files I looked at calls strdup but doesn't > check for a NULL return value or (less important) attempt to avoid > the leak when the corresponding --backup=S option appears twice or more. I guess Jim looked into the mv example (http://genparse.sourceforge.net/examples/mv_clp.c). I propose not to allocate any memory dynamically in Genparse at all and use the stack instead. malloc() and strdup() might not be fast enough in all applications. For instance in the mv command I can hardly imagine that it is acceptable to have a call to malloc() for allocating the arg_t struct which would have to be executed with every call to mv no matter what command line switches or arguments were specified. By the way, the result of this call to malloc() is also not checked for failure. In principle the Coreutils implementation of the mv command does it similarly: The optind pointer which is returned by getopt_long() is first stored in a variable on the heap and later the content of the string is copied to a string with strdup() which allocates from the heap. In order to leave the freedom to do so to the user of Genparse it is required that Genparse uses only the stack. > - not exactly essential, but highly recommended: it should work > with the latest autoconf and automake > > > [*] with genparse-0.6.3, I get this: > > $ strace -o log -e write ./genparse -o /full/tmp/foo ../examples/ls.gp \ > && echo exit status 0 > creating /full/tmp/foo.h...done > creating /full/tmp/foo.c...done > exit status 0 > $ tail -2 log > write(3, "/*******************************"..., 80) = -1 ENOSPC (No space left on device) > write(1, "creating /full/tmp/foo.c...done\n", 32) = 32 > > The two files it claims to have created are both empty, > and genparse exited successfully. > This means genparse is not detecting write or close failures. > > Note that in the example above, /full/tmp is a full file system that still > has free inodes, so open can create new files, but writes always fail. > > > _______________________________________________ > Bug-coreutils mailing list > Bug...@gn... > http://lists.gnu.org/mailman/listinfo/bug-coreutils |