Re: [Flex-devel] [Flex-help] Suggestions for improving Flex

Aaron Stone wrote:
> Replying back onto the flex-devel mailing list, so that we can track the 
> conversation. There are definitely some cases where if you're not 
> playing by the (often unwritten) rules of writing a flex grammar, the 
> code will bite you. Polishing edges is very welcome!
> 
> On Oct 3, 2008, at 2:42 PM, Joe Krahn wrote:
> 
>> Aaron Stone wrote:
>>> There's active maintenance right now, but not active development. 
>>> That's just a function of available time and not a function of lack 
>>> of interest -- although the C++ side could use a serious C++ 
>>> programmer's input. The current maintainers are all C programmers.
>>> Aaron
>> ...
>> I program mostly in C, but am trying to use it with C++ right now. 
>> However, there are a lot of problems that are not limited to just C++.
>>
>> For example, the --main and --nomain flags don't work. It appears that 
>> Flex moved away from manual skeleton processing to using the m4 
>> preprocessor instead, and the conversion is sort of only half 
>> completed. The --main option flags set a cpp macro, but the skeleton 
>> only honors the m4 macro.
> 
> Oh, that's no good.
> 
>>
>> Similarly, the skeleton file has some header sections marked with 
>> m4_ifdef, and some marked with %ok-for-header and %not-for-header. 
>> There are also a few unmatched %if sections, which only succeed 
>> because someone added an extra push-true at the beginning of skelout().
>>
>> I can make an attempt at working on some improvements, but any sort of 
>> update will surely lead to errors, even if it is best for the long run.
> 
> There are pretty good tests in the tree, so feel free to (carefully) 
> mess with things and post patches to the list and/or to sourceforge bugs.
> 
>>
>> Maybe I should just proceed with some experimental code updates, post 
>> my initial results, and see what people think?
> 
> For sure!
> 
> Aaron

OK, I have done some initial hacking. It almost works, but will take 
some reviewing and debugging. The changes may seem a bit drastic for 
something that is mostly stable, but the current state of 
disorganization is leading to poor maintainability. I hope that other 
flex developers agree that it needs a general clean-up in skeleton 
processing.

Here is what I have done so far. Comments are very welcome.

My current design gets rid of the initial preprocessing m4 stage. 
Instead, it expects the m4_include files to be available at run time. 
The M4_GEN_PREFIX macro was updated to work with a single m4 pass. This 
makes it easier to work with an external skeleton file. (Bison works 
this way.)

I moved most of the code that was generated from the C source into the 
skeleton file, and added a few more M4 macro option definitions for the 
extra logic needed in the skeleton.

I replaced the %if/%endif conditionals from misc.c with m4 conditionals. 
Instead of the messy m4_ifdefs, I added some defined macros like 
"m4_if_c_only()".  The misc.c conditional processing is now essentially 
empty except for %# comment processing.

I reorganized the skeleton into sensible groups where possible: header, 
non-header, static non-reentrant globals, etc. The header parts are 
still divided into two parts, before and after user section 1, to ensure 
compatibility with existing code.

I replaced all of the YY_G() macros with m4 substitution macros, similar 
to what was already done for function prefixes. This keeps the skeleton 
code simpler. (I am assuming that user code never uses the YY_G() macro.)

The reentrant state object was renamed from yyguts_t to yyobject_t.  The 
struct members no longer have the yy prefix, because it is not needed 
when they are encapsulated in a struct. (Ideally, the C++ and 
yyobject_t names should all match, but I have not compared them.)
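
A minimal sketch of the idea, not the actual patch: the member names 
(text, leng, lineno) and the SCANNER_VAR macro are hypothetical, and the 
cpp macro only stands in for the m4 substitution, which would really be 
expanded when the skeleton is processed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: the members drop the "yy" prefix because the
 * struct itself already scopes them. Names are illustrative only. */
struct yyobject_t {
    const char *text;   /* current token text */
    int leng;           /* length of the current token */
    int lineno;         /* current line number */
};

/* Stand-in for the m4 substitution macro. In the design described
 * above this would be expanded by m4, not by the C preprocessor. */
#define SCANNER_VAR(obj, name) ((obj)->name)

/* Skeleton-style code reads a member through the substitution. */
int token_length(const struct yyobject_t *yy)
{
    assert(yy != NULL);
    return SCANNER_VAR(yy, leng);
}
```

With this shape, skeleton code never spells out how the state is 
reached, so the same source line could serve both the reentrant and the 
non-reentrant case once the substitution is changed.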

There are now two prefix macros, for names starting with "yy_" versus 
"yy". For the yyobject_t variables, this avoids names with a leading 
underscore. For functions and non-reentrant globals, this could be used 
to make a C++ namespace prefix instead of a simple name prefix, in which 
case it would also be nice to exclude leading underscores. For now, the 
underscore is always retained.

Bison has much nicer m4 macros for traditional versus ANSI prototype 
generation. They take variable argument lists, instead of having one 
macro for each argument-list size.

I think it would have been much better not to put the _param suffix on
yylex arguments in the reentrant version, because it does not work well 
with a user-defined YY_DECL. Instead, macros to rename them should come 
just after the start of yylex, but before the user code is inserted. 
That allows a user-defined YY_DECL to work with normal parameter names. 
In addition, the current skeleton initializes the lloc and lval pointers 
after the user-code section, leading to segfaults unless the user code 
knows to use the undocumented _param suffix.  Unfortunately, changing 
this will affect code that has already adapted. Maybe there should be a 
cpp macro or %option to name the yylval and yylloc args?

Another problem with reentrant mode is that yyset_lval and yyset_lloc
are useless, because yylex sets them every time. An updated yylex should
allow for YY_DECL not to have lval and lloc args, but instead allow use
of the set/get functions. Maybe the above mentioned yylval/yylloc naming
options can also disable one or both, so the automatic pointer-copying
code can adapt.
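
The problem can be modelled in a few lines of C. This is a simplified 
stand-in, not flex's generated code: SEMANTIC, struct scanner, set_lval, 
get_lval, lex_with_arg and lex_without_arg are all hypothetical names 
chosen for the illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef int SEMANTIC;                 /* stand-in for YYSTYPE */

struct scanner {                      /* stand-in for the reentrant state */
    SEMANTIC *lval;
};

/* Stand-ins for the yyset_lval / yyget_lval accessors. */
void set_lval(struct scanner *s, SEMANTIC *v) { s->lval = v; }
SEMANTIC *get_lval(const struct scanner *s) { return s->lval; }

/* Models the current behavior: the pointer passed as an argument
 * overwrites whatever set_lval() stored, on every call. */
int lex_with_arg(SEMANTIC *lval_param, struct scanner *s)
{
    s->lval = lval_param;             /* earlier set_lval() is lost here */
    if (s->lval != NULL)
        *s->lval = 42;                /* pretend a token value was produced */
    return 1;
}

/* Models the proposed alternative: no lval argument, so the pointer
 * installed through set_lval() is the one that gets used. */
int lex_without_arg(struct scanner *s)
{
    assert(s->lval != NULL);          /* caller must have called set_lval() */
    *s->lval = 42;
    return 1;
}
```

In the first variant a preceding set_lval() call has no visible effect, 
which is the sense in which the setter is useless; in the second variant 
it is the only way to tell the scanner where to store the value.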

I also think the %top section is designed incorrectly. It should terminate
with '%}' instead of trying to count braces. But, how to fix it without 
breaking existing code? Maybe there could be a new code section called 
`%header{ ... %}' to emphasize that it is the place to put macros that 
affect the header section?
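
To illustrate why an explicit terminator is more robust than brace 
counting, here is a simplified model in C. It is not flex's scanner 
code: both helper functions are hypothetical, and the brace counter is 
deliberately naive (it does not skip strings, character literals or 
comments).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 'text' starts just after the opening '{'.
 * Returns the offset of the '}' that balances it, or -1 if the
 * braces never balance. */
long end_by_brace_count(const char *text)
{
    int depth = 1;
    assert(text != NULL);
    for (size_t i = 0; text[i] != '\0'; i++) {
        if (text[i] == '{')
            depth++;
        else if (text[i] == '}' && --depth == 0)
            return (long)i;
    }
    return -1;
}

/* Hypothetical helper: returns the offset of the first line that
 * begins with "%}", or -1 if there is none. */
long end_by_marker(const char *text)
{
    assert(text != NULL);
    const char *p = text;
    while (p != NULL && *p != '\0') {
        if (p[0] == '%' && p[1] == '}')
            return (long)(p - text);
        p = strchr(p, '\n');          /* move to the start of the next line */
        if (p != NULL)
            p++;
    }
    return -1;
}
```

A block body containing a single unmatched '{' (for example inside a 
character literal or a comment) never balances for the naive counter, 
while the line-based marker search still finds the end of the block.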

After the changes I've made so far, I am working on getting it to pass 
all of the tests.

Joe Krahn
