[Flex-devel] Progress on a re-written skeleton
From: Joe K. <kr...@ni...> - 2008-10-10 15:37:34
I have an extensively redesigned skeleton file. It passes all tests except for C++, which is an area that needs some re-development anyhow. My initial goal is to move as much C-generated code as possible into the skeleton file (as mentioned in the TODO), organize the skeleton into logical sections, and improve the M4 processing to minimize complexities in the skeleton C sources. I also put most of the M4 macro setup at the top, to minimize mixing of M4 and C code.

I have modified essentially none of the actual C sources in the skeleton, other than a global search-and-replace on the YY_G() macro. I modified code in gen.c, main.c and misc.c to move generation of code into the skeleton where possible, and reorganized some of the option management. Previously, M4 macros were set in many different places; I tried to group these together. I also removed all of the directives from misc.c, except for '%%' break points.

Overall, I think it is a big improvement. Getting things better organized also revealed some bugs and inconsistencies. For example, I found that the '--main' and '--nomain' flags were broken, and that %option 'line' and 'noline' are missing from "scan.l".

One problem is that checks for option consistency are not well organized. Option side effects are not consistent between the lexer and the command-line parser. For example, "%option main" clears the do_yywrap flag, but "--main" does not. My idea is that all options (or just the ones with dependencies) should start in an "unspecified" state; then all lexer and command-line options are processed, and consistency checks and implied side effects are applied afterwards.

Also, there are many places in the skeleton where option dependencies are simply enforced with no user warnings. It makes sense for some (or even most) of the option checks to be in the skeleton, because many of these dependencies come directly from the skeleton's design. However, the skeleton should have a proper mechanism for issuing warnings and errors.
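The "unspecified" starting state described above could be sketched roughly as follows. This is only an illustration, not code from flex: the names opt_state, struct options, set_opt() and resolve_options() are hypothetical, and the default of yywrap being on when nothing is specified is an assumption made for the sketch.

```c
#include <stdio.h>

/* Hypothetical tri-state for an option; these names are not from flex. */
typedef enum { OPT_UNSPEC = -1, OPT_OFF = 0, OPT_ON = 1 } opt_state;

struct options {
    opt_state do_main;    /* "%option main" or "--main" */
    opt_state do_yywrap;  /* "%option yywrap" or "--yywrap" */
};

/* Record an explicit setting from either the lexer file or the command
 * line. No side effects are applied here; the last setting wins. */
static void set_opt(opt_state *opt, int value)
{
    *opt = value ? OPT_ON : OPT_OFF;
}

/* Run once, after all "%option" lines and command-line flags have been
 * processed. Applies implied side effects in one place, so both sources
 * behave the same, and returns the number of warnings issued. */
static int resolve_options(struct options *o)
{
    int warnings = 0;

    if (o->do_main == OPT_UNSPEC)
        o->do_main = OPT_OFF;

    if (o->do_main == OPT_ON) {
        /* "main" implies "noyywrap"; tell the user about a conflict. */
        if (o->do_yywrap == OPT_ON) {
            fprintf(stderr,
                    "warning: 'main' implies 'noyywrap'; ignoring 'yywrap'\n");
            warnings++;
        }
        o->do_yywrap = OPT_OFF;
    } else if (o->do_yywrap == OPT_UNSPEC) {
        o->do_yywrap = OPT_ON;  /* assumed default for this sketch */
    }
    return warnings;
}
```

With this shape, "%option main" and "--main" would both just call set_opt(), and the do_yywrap side effect would live only in resolve_options(), which is also a natural place to report conflicts instead of silently overriding them.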
Another problem is that many of the test examples expect to be processed by the M4 macros. M4 processing is not part of the API, so user code should never be exposed to M4. I modified the quoting to exclude M4 processing. However, there should be a mechanism to handle the reentrant globals in a way that lets a given lexer source be compiled either way. So, I added CPP macros corresponding to the LAST_ARG and ONLY_ARG M4 macros, and designed the reentrant-globals CPP macros to avoid the need for "M4_YY_DECL_GUTS_VAR()". My reentrant CPP macros are defined like:

    #define yyin (((yyguts_t*)yyscanner)->yyin)

This way, only the variable "void *yyscanner" has to be available, and only those LAST_ARG and ONLY_ARG macros are needed. If people have written scanners that actually use M4, there could be an option to allow M4 processing of user code.

There are also some issues with the code for external tables. It does not use the function declaration/prototype macros that allow for pre-ANSI code. (Is that feature really still needed?) Also, yydmap[] is not part of the reentrant structure. This means it is only reentrant-safe after the external tables are loaded. It might be good to add some safety checks.

Joe Krahn
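A minimal sketch of the reentrant-globals macros described above, to show why only "void *yyscanner" needs to be in scope. The struct here is a cut-down stand-in for the real yyguts_t, and the DEMO_ONLY_ARG and DEMO_LAST_ARG macros and the demo_* functions are hypothetical names used only for this illustration, not the names used in the skeleton.

```c
#include <stdio.h>

/* Cut-down stand-in for the scanner state; the real yyguts_t has many
 * more members. It must be defined before the macros below so that the
 * member names are not macro-expanded. */
typedef struct yyguts_t {
    FILE *yyin;
    int   yylineno;
} yyguts_t;

/* Reentrant "globals": each name expands to a member access through
 * yyscanner, so no local guts variable has to be declared first. */
#define yyin      (((yyguts_t *)yyscanner)->yyin)
#define yylineno  (((yyguts_t *)yyscanner)->yylineno)

/* Hypothetical CPP counterparts of the ONLY_ARG and LAST_ARG M4 macros.
 * In a non-reentrant build they would be "void" and empty, and yyin and
 * yylineno would be ordinary global variables. */
#define DEMO_ONLY_ARG  void *yyscanner
#define DEMO_LAST_ARG  , void *yyscanner

/* The same source text works for both builds: the body only uses the
 * "global" names, never the scanner structure directly. */
static int demo_get_lineno(DEMO_ONLY_ARG)
{
    return yylineno;
}

static void demo_set_in(FILE *in DEMO_LAST_ARG)
{
    yyin = in;
}
```

One consequence of this approach is that code following the macro definitions cannot name the struct members directly (for example g.yyin), since the member name would itself be macro-expanded; all access has to go through a yyscanner pointer.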