flex-devel Mailing List for flex: the fast lexical analyser (Page 10)
flex is a tool for generating scanners
Brought to you by:
wlestes
From: Yuri <yu...@ra...> - 2009-08-08 11:21:58
|
It seems logical and beneficial to have UTF-8 support in flex. I found a patch that claims to support it: http://xqilla.sourceforge.net/flex/flex-2.5.4a-unicode-patch. Are there any plans to integrate this patch or otherwise support UTF-8? Yuri |
From: Will E. <wl...@us...> - 2009-04-07 13:29:44
|
This does seem like the right approach, particularly when combined with the notion that we want to disentangle C-generated and skeleton code.

On Tuesday, 21 October 2008, 11:04 am -0700, Aaron Stone <aa...@se...> wrote:
>
> On Oct 21, 2008, at 9:23 AM, Joe Krahn wrote:
>
> > Aaron Stone wrote:
> >> On Oct 14, 2008, at 8:45 AM, Joe Krahn wrote:
> >>> What part of the generated code is supposed to be visible to user code from section 1? My understanding is that section 1 is meant for including extra headers, and inserting cpp #defines to modify cpp-based options. However, it apparently does not come early enough in the generated code, because somebody added the %top{} feature.
> >>>
> >>> My first guess is just to avoid moving any code to the other side of the current section 1 code insertion, to avoid breaking things. But, there should be some specific rules. In any case, the %top{} feature is useful because it gets written to the public header.
> >> Huh, ok. I seem to recall that stuff in the top section came very nearly the beginning, if not actually the beginning, of the output file. I wonder if there are assumptions of this in many scanners.
> > The problem is that location of section 1 code is ill-defined. It was created mainly to include headers, prototypes and macros needed in the actions. Those declarations should all include their own prerequisite headers. For that reason, section 1 could be inserted at the very top, which would avoid the need for %top{}. In fact, that makes the most sense to me, because Flex documentation never claims to make anything available to section 1 code, and is also how Flex's internal scanner source is written. But, that would surely break a lot of scanners, which are usually built by trial-and-error.
> >
> > Other people may have different opinions that certain CPP definitions should be available, even if undocumented, based on experience with Flex. Even if the current plan is to just avoid changes that can break scanners, we should come up with a more precise definition of where section 1 code is inserted. My thinking is that section 1 code should really not depend on anything from Flex, except perhaps the version macros and standard I/O headers.
>
> The right thing to do is clearly define where the section 1 code goes in the output and revolve around that.
>
> I am in favor of section 1 being in the absolute top of the scanner. If the user wants to put scanner-definition-dependent code in section 1, they can generate a matching header file and include it, then add their own stuff. If we set up the include guards correctly, an early-inclusion / double-inclusion of the generated header will have no ill effect.
>
> Aaron

--
Will Estes (wl...@us...)
Flex Project Maintainer
http://flex.sourceforge.net/
From: Jan E. <je...@me...> - 2009-03-24 21:20:12
|
Hi,

when one uses GCC's -Wredundant-decls, an ugly extra warning appears:

    cc1: warnings being treated as errors
    <stdout>:3279: error: redundant redeclaration of `isatty'
    /usr/include/unistd.h:757: error: previous declaration of `isatty' was here
    make[1]: *** [scanner.o] Error 1

Since the .l file in question does not contain any use of isatty, I wonder what that extra declaration that flex puts in is there for.

    #ifndef __cplusplus
    extern int isatty (int );
    #endif /* __cplusplus */
From: Joe K. <kr...@ni...> - 2008-12-30 19:54:05
|
My original intent was better option and code organization, which could not easily be handled as small patches. In the process, I developed an understanding of the Flex code and made changes that are worth contributing before trying to make significant changes to the overall skeleton and code generation process.

Here is a list of changes I made that may be useful for the current Flex code. It would be easier to avoid splitting each change into a separate patch. Maybe I can start by grouping the most favorable small changes into a single patch. A few changes were mainly aesthetic, like renaming the "guts" structure. Some are useful, but not so important, like adding run-time NLS support.

I hope that line-wrapping doesn't make this too ugly.

Joe

------------------------------------------------------------------------

* New macro: YY_ISATTY(). This allows you to define your own method to check for an interactive session, because isatty() is very non-portable.

* NLS support for messages in generated code, in the text domain "flex-runtime". This is how the newer Bison code works.

* Updated generation of main(): It now handles external tables, and supports all scanner types, and the optional use of C++ <iostream> for I/O.

* Added tests for successful malloc/realloc where they were absent. A new function was added to report allocation errors by function name, to avoid the need for multiple NLS-translation strings.

* Renamed the reentrant state object from yyguts_t to yyobject_t. The struct members no longer have the yy prefix, because it is not needed when encapsulating them in a struct. (Ideally, the C++ and yyobject_t names should all match.)

* YY_DECL is never defined by the flex code. Instead, the header has a conditional prototype, and the function body has a conditional declaration. This makes the main lex functions easier to read, and avoids adding "#undef YY_DECL" to the header file. Something like this:

  header:

      #ifndef YY_DECL
      int yylex(...);
      #endif

  function source:

      #ifdef YY_DECL
      YY_DECL
      #else
      int yylex(...)
      #endif
      {
      ...

* A new #define YY_ISATTY() macro was added, because 'isatty()' is the least portable function, but the user's code may be able to provide another means of checking for interactive input. The macro could instead be named "YY_IS_INTERACTIVE()".

* An option to use C++ stream I/O was added. This allows C++ users to use a standard lexer with C++ I/O, and it allows use of the C++ class with standard C I/O.

* Converted K&R syntax to ANSI in all source code.

* Removed the K&R option for the generated code. For now, flex still accepts the option keywords, but reports that they are obsolete. This also avoids the need for the yyconst macros.

Bugs Fixed:

* Flex was mis-detecting the use of REJECT and yywrap by matching sub-strings instead of whole words. (Bug added when m4-quote escaping was implemented.)

* [1544933] input() and unput() now work before calling yylex(). An init function was created to hold some of the set-up code at the beginning of yylex, which is called by the input() and unput() handlers if needed. The newer POSIX specs are a bit unclear, but seem to indicate that calling input() before yylex() is valid.

* [2043583] Avoid signed/unsigned comparisons.

* [1990170] Check the return value of write() in ECHO and issue an error on failures, similar to input().

* [2125513] Replaced direct calls to yy_fatal_error() with the macro version.

* [2203641] Added `default:' to all switch statements (for purists).

* [2178663] Added "#include <cstdio>" for the C++ scanner.

* [1783536] Fixed yywrap definition to include an argument only for the reentrant scanner. (This is really just to avoid warnings, and allow for broken compilers that don't like an empty macro argument value.)

* [2040664] Use YY_CURRENT_BUFFER_LVALUE instead of YY_CURRENT_BUFFER in cases where NULL makes no sense (i.e. left of `->'). This avoids warnings about possible use of NULL pointers with strict arg checking.

Internal design changes:

* The 2D yynxt_tbl was converted to 1D, for better uniformity. This was already the case for the serialized version of the 2D array, so it actually simplifies a number of the code-generation conditionals. The 2D table code is a bit easier to read, but it's not meant to be read.

* New function yylex_init_state() initializes I/O buffers, and allows input() and unput() to be used before the first call to yylex().

* The internal reentrant struct is now called `yyobject_t' instead of `yyguts_t'. The skeleton no longer uses the YY_G() macro. Instead, m4 processing converts the code, and avoids the need for declaring the yyobject pointer by expressing a global as:

      (((struct yyobject_t*)yyscanner)->global)

* The internal scan.l is now more careful at handling the 'no' prefix.
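For readers unfamiliar with the 2D-to-1D table change listed under "Internal design changes", the idea comes down to row-major indexing. The following is only a minimal sketch with a toy 3-state, 4-symbol table; the names `demo_nxt_2d`, `demo_nxt_1d`, `demo_next_2d` and `demo_next_1d` and the table contents are invented for illustration and are not taken from flex's generated code or from the patch.

```c
#include <assert.h>
#include <stddef.h>

enum { DEMO_NSTATES = 3, DEMO_NSYMS = 4 };

/* Toy transition table in the 2D layout: one row per state. */
static const short demo_nxt_2d[DEMO_NSTATES][DEMO_NSYMS] = {
    { 0, 1, 2, 0 },
    { 1, 1, 0, 2 },
    { 2, 0, 2, 1 },
};

/* The same data stored row-major in a single 1D array. */
static const short demo_nxt_1d[DEMO_NSTATES * DEMO_NSYMS] = {
    0, 1, 2, 0,
    1, 1, 0, 2,
    2, 0, 2, 1,
};

/* Look up the next state using the 2D layout. */
int demo_next_2d(int state, int sym)
{
    assert(state >= 0 && state < DEMO_NSTATES);
    assert(sym >= 0 && sym < DEMO_NSYMS);
    return demo_nxt_2d[state][sym];
}

/* Look up the next state using the flattened layout:
 * index = state * (symbols per row) + symbol. */
int demo_next_1d(int state, int sym)
{
    assert(state >= 0 && state < DEMO_NSTATES);
    assert(sym >= 0 && sym < DEMO_NSYMS);
    return demo_nxt_1d[(size_t)state * DEMO_NSYMS + (size_t)sym];
}
```

Both lookups return the same value for every (state, symbol) pair, so a code generator only needs to emit one array shape.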
From: Will E. <wl...@us...> - 2008-12-05 22:33:08
|
Yes, we do want to keep bug fixes separate from reorgs and enhancements and the like.

On Friday, 5 December 2008, 11:23 am -0800, Aaron Stone <aa...@se...> wrote:
> On Tue, 2008-12-02 at 10:36 -0500, Joe Krahn wrote:
> > Aaron Stone wrote:
> > > In the code generated for a reentrant scanner, two functions call yy_fatal_error directly instead of going through the YY_FATAL_ERROR macro. I believe this to be an omission. Any objections to changing it?
> > ...
> >
> > Actually, I had changed this in my mega-patch.
>
> Oh cool, well then that makes two people who noticed this and made the same change, so I'll see about committing it.
>
> > My modifications are fairly extreme, so maybe it would be good if I collected general bug-fixes to incorporate into the current code?
>
> Ah, yes splitting things up is really good practice. Also, I had started to review your patches but then lost the message list I had open. Lame excuse, I know, but FYI that reading over your work is on my todo list.
>
> > Another thing I fixed is the yywrap() macro. The yywrap function has 1 arg in the reentrant scanner, but zero otherwise. A blank argument is valid for a macro, so yywrap(n) is actually valid code for an expression using yywrap with either zero or one arguments.
> >
> > It turns out that at least one broken compiler considers this an error, claiming that a 1-argument macro was given zero args. There are also non-broken compilers that give warnings. In my modified code, I switched to using alternate definitions for yywrap, on the basis that it is reasonable to emit warnings in this situation, even though it really is valid code.
>
> Interesting, I didn't know that!
>
> Aaron

--
Will Estes (wl...@us...)
Flex Project Maintainer
http://flex.sourceforge.net/
From: Aaron S. <aa...@se...> - 2008-12-05 19:28:02
|
On Tue, 2008-12-02 at 10:36 -0500, Joe Krahn wrote:
> Aaron Stone wrote:
> > In the code generated for a reentrant scanner, two functions call yy_fatal_error directly instead of going through the YY_FATAL_ERROR macro. I believe this to be an omission. Any objections to changing it?
> ...
>
> Actually, I had changed this in my mega-patch.

Oh cool, well then that makes two people who noticed this and made the same change, so I'll see about committing it.

> My modifications are fairly extreme, so maybe it would be good if I collected general bug-fixes to incorporate into the current code?

Ah, yes splitting things up is really good practice. Also, I had started to review your patches but then lost the message list I had open. Lame excuse, I know, but FYI that reading over your work is on my todo list.

> Another thing I fixed is the yywrap() macro. The yywrap function has 1 arg in the reentrant scanner, but zero otherwise. A blank argument is valid for a macro, so yywrap(n) is actually valid code for an expression using yywrap with either zero or one arguments.
>
> It turns out that at least one broken compiler considers this an error, claiming that a 1-argument macro was given zero args. There are also non-broken compilers that give warnings. In my modified code, I switched to using alternate definitions for yywrap, on the basis that it is reasonable to emit warnings in this situation, even though it really is valid code.

Interesting, I didn't know that!

Aaron
From: Joe K. <kr...@ni...> - 2008-12-02 15:36:59
|
Aaron Stone wrote:
> In the code generated for a reentrant scanner, two functions call yy_fatal_error directly instead of going through the YY_FATAL_ERROR macro. I believe this to be an omission. Any objections to changing it?
...

Actually, I had changed this in my mega-patch. My modifications are fairly extreme, so maybe it would be good if I collected general bug-fixes to incorporate into the current code?

Another thing I fixed is the yywrap() macro. The yywrap function has 1 arg in the reentrant scanner, but zero otherwise. A blank argument is valid for a macro, so yywrap(n) is actually valid code for an expression using yywrap with either zero or one arguments.

It turns out that at least one broken compiler considers this an error, claiming that a 1-argument macro was given zero args. There are also non-broken compilers that give warnings. In my modified code, I switched to using alternate definitions for yywrap, on the basis that it is reasonable to emit warnings in this situation, even though it really is valid code.

Joe Krahn
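The "alternate definitions" approach for yywrap can be pictured roughly as follows. This is a sketch under stated assumptions, not the patch itself: `DEMO_REENTRANT`, `demo_wrap`, `DEMO_WRAP_CALL` and `demo_at_eof` are hypothetical names, and the macro bodies simply evaluate to 1 as a stand-in for "no more input".

```c
#include <assert.h>
#include <stddef.h>

/* Made-up switch: defined here to select the one-argument (reentrant) form.
 * Remove this line to get the zero-argument form instead. */
#define DEMO_REENTRANT 1

#ifdef DEMO_REENTRANT
/* Reentrant form: the macro takes the scanner handle. */
#define demo_wrap(scanner) ((void)(scanner), 1)
#define DEMO_WRAP_CALL(scanner) demo_wrap(scanner)
#else
/* Non-reentrant form: the macro takes no argument at all, so no call site
 * ever has to pass an empty macro argument. */
#define demo_wrap() 1
#define DEMO_WRAP_CALL(scanner) demo_wrap()
#endif

/* Returns 1 when scanning should stop at end of input. */
int demo_at_eof(void *scanner)
{
    (void)scanner; /* unused in the non-reentrant build */
    return DEMO_WRAP_CALL(scanner);
}
```

The point of the second definition is that a compiler which objects to an empty argument for a one-parameter macro never sees one.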
From: Aaron S. <aa...@se...> - 2008-11-26 06:19:48
|
In the code generated for a reentrant scanner, two functions call yy_fatal_error directly instead of going through the YY_FATAL_ERROR macro. I believe this to be an omission. Any objections to changing it?

    void libsieve_addrset_lineno (int line_number , yyscan_t yyscanner)
    {
        struct yyguts_t * yyg = (struct yyguts_t*)yyscanner;

        /* lineno is only valid if an input buffer exists. */
        if (! YY_CURRENT_BUFFER )
            yy_fatal_error( "libsieve_addrset_lineno called with no buffer" , yyscanner);

        yylineno = line_number;
    }

And...

    void libsieve_addrset_column (int column_no , yyscan_t yyscanner)
    {
        struct yyguts_t * yyg = (struct yyguts_t*)yyscanner;

        /* column is only valid if an input buffer exists. */
        if (! YY_CURRENT_BUFFER )
            yy_fatal_error( "libsieve_addrset_column called with no buffer" , yyscanner);

        yycolumn = column_no;
    }

Cheers,
Aaron
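Why the macro matters: a user can replace an error macro, but cannot intercept a direct function call. The sketch below illustrates that with invented names (`DEMO_FATAL_ERROR`, `demo_fatal_error`, `demo_set_lineno`, `struct demo_scanner`); it mirrors the shape of the functions quoted above but is not flex's generated code, and the override that records the message instead of exiting is an assumption made so the effect can be observed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical user override: record the message instead of exiting. */
static const char *demo_last_error = NULL;
#define DEMO_FATAL_ERROR(msg) (demo_last_error = (msg))

/* Default handler, analogous in spirit to yy_fatal_error(): print and exit. */
void demo_fatal_error(const char *msg)
{
    fprintf(stderr, "%s\n", msg);
    exit(2);
}

/* The generated code supplies this default only if the user did not. */
#ifndef DEMO_FATAL_ERROR
#define DEMO_FATAL_ERROR(msg) demo_fatal_error(msg)
#endif

struct demo_scanner {
    const char *buffer; /* NULL when no input buffer exists */
    int lineno;
};

/* Because this goes through the macro, the user's override is honoured.
 * A direct call to demo_fatal_error() here would bypass it. */
void demo_set_lineno(struct demo_scanner *s, int line_number)
{
    assert(s != NULL);
    if (s->buffer == NULL) {
        DEMO_FATAL_ERROR("demo_set_lineno called with no buffer");
        return;
    }
    s->lineno = line_number;
}
```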
From: Will E. <wl...@us...> - 2008-11-16 13:43:48
|
Joe,

Thanks for your two posts on unicode and flex. Given the direction that technology is going, flex needs to have some more coherent answer to the unicode question than "we don't do that." I'll look at the packages you mention and see if I can draw out some sensible goals for flex to implement.

On Saturday, 15 November 2008, 1:16 pm -0500, Joe Krahn <kr...@ni...> wrote:
> Supporting Unicode is actually more complex than supporting locale attributes, because there is no "Unicode" locale that allows matching type attributes like "alpha" for multiple locales. In looking for information on handling Unicode, I found an alternative lexer: Quex, hosted on SourceForge. Maybe Flex should avoid Unicode support, at least for now, and just mention Quex to help users looking for Unicode.
>
> Quex claims to be much faster than flex, because it uses actual code with conditionals and gotos instead of tables. If speed is not the main goal, tables can be quite useful, so I don't see it as a general replacement for Flex. However, it is being actively developed, and it may be a good idea for Flex and Quex to collaborate to maximize compatibility between them.
>
> Joe Krahn

--
Will Estes (wl...@us...)
Flex Project Maintainer
http://flex.sourceforge.net/
From: Joe K. <kr...@ni...> - 2008-11-15 18:22:28
|
Supporting Unicode is actually more complex than supporting locale attributes, because there is no "Unicode" locale that allows matching type attributes like "alpha" for multiple locales. In looking for information on handling Unicode, I found an alternative lexer: Quex, hosted on SourceForge. Maybe Flex should avoid Unicode support, at least for now, and just mention Quex to help users looking for Unicode. Quex claims to be much faster than flex, because it uses actual code with conditionals and gotos instead of tables. If speed is not the main goal, tables can be quite useful, so I don't see it as a general replacement for Flex. However, it is being actively developed, and it may be a good idea for Flex and Quex to collaborate to maximize compatibility between them. Joe Krahn |
From: Joe K. <kr...@ni...> - 2008-11-11 17:12:35
|
After investing time into a re-write of the skeleton file processing for flex, I was looking into how to add support for Unicode. I was comparing flex to Sun's open-source lex, available at http://heirloom.sourceforge.net/. It is based on the original AT&T lex, but has added support for wide character sets using yytext as wchar_t, or by leaving yytext as char and using multi-byte encoding. There is also an existing "unicode" patch for flex, but it is really somewhat of a hack. I think it must have been written by an MS Windows user, because it really is only 16-bit, which is the size of wchar_t on Windows, and therefore not really Unicode. It also uses raw 2-byte file I/O instead of the proper C library wcs functions. It is also very inefficient, because it just sets CSIZE to 65536, creating rather large tables. The right way to do large character set is to make a sparse table of characters that are actually referenced in the grammar. It turns out that this is essentially what flex does with the ECS option, so it will be fairly easy to implement large character sets correctly by making ECS mode a requirement. OTOH, the lower 16-bit part of the Unicode character sets covers all of the common written languages in modern use, so a 16-bit limited version as implemented in the current "unicode" patch would be a good start. It will also work for other 16-bit encodings that are still widely used. So, I have implemented a variation of the 16-bit patch to provide a general-purpose --16bit mode. The main difference is that I converted all flex CSIZE tables to dynamic allocation, so that the large tables required for this approach will not make flex a memory hog for all of the 8-bit users. I also added a "--with-wchar" option to configure.in so it can be an experimental feature. By 'wchar' I mean general wide-character support, and not 'wchar_t'. Maybe another name would avoid possible confusion? 
Using hints from Sun's lex code, it was actually fairly easy to get a quick initial implementation. The problem is that I use American English, and am not a Unicode or NLS user, so I am somewhat clueless about actually using wide-character support, other than inserting special characters into some of my print messages. I mainly did it because I know a lot of other users have expressed a strong interest in getting Unicode support. I suspect there must be some test examples that use wide-char support in Sun's lex, or maybe examples that work with the existing unicode patch. Joe Krahn |
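To make the "sparse table of characters that are actually referenced in the grammar" idea concrete, here is a deliberately simplified sketch. It assumes a 16-bit character set and gives every referenced code unit its own class, with everything else sharing class 0; real equivalence classes would, as far as I understand them, also merge characters that are always used identically, which this sketch does not attempt. `DEMO_CSIZE` and `demo_build_classes` are hypothetical names, not flex internals.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_CSIZE 65536u /* number of code units in a 16-bit character set */

/*
 * Build a character-to-class map. Each code unit listed in `used` gets its
 * own class (1, 2, ...); every other code unit shares class 0. Returns the
 * number of classes, or 0 on allocation failure. The caller frees *map_out.
 */
size_t demo_build_classes(const unsigned short *used, size_t nused,
                          unsigned int **map_out)
{
    unsigned int *map;
    size_t nclasses = 1; /* class 0: "not mentioned in the grammar" */
    size_t i;

    assert(used != NULL || nused == 0);
    assert(map_out != NULL);

    /* Allocated dynamically so 8-bit users never pay for the large map. */
    map = calloc(DEMO_CSIZE, sizeof *map);
    if (map == NULL)
        return 0;

    for (i = 0; i < nused; i++) {
        if (map[used[i]] == 0) /* skip characters already assigned */
            map[used[i]] = (unsigned int)nclasses++;
    }
    *map_out = map;
    return nclasses;
}
```

With such a map, transition tables need one column per class rather than one per code unit, so a grammar that mentions a few dozen characters gets a few dozen columns instead of 65536.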
From: Joe K. <kr...@ni...> - 2008-10-29 20:20:43
|
I added a special token to allow flex's internal scanner to support --options=OPTIONS, where OPTIONS is anything valid in %option. The modified scan.l also keeps boolean options separate, so that the "no" prefix is not silently accepted where it is invalid. I modified the full/fast options to only set the specific full/fast flags, which is in 3 places, because the command-line has individual flags and the grouped -C<opt> flag. I then modify useecs, usemecs, and use_read in check_options(), but only if the user didn't specify them. They are initially set to 'unspecified'. I added %options to exclude generating the remaining optional functions, but didn't make command-line versions. I added --include="PATH" to define the m4 include path to access skeleton files, and a matching one-letter option "-Y" which Sun's lex uses. Maybe the long option should be --skel-path=PATH. The main skeleton file is also expected to be there, but it checks the current directory first. The %option noline was missing, possibly removed because #line marks could be inserted before that option was read. I fixed this by modifying the linedir filter function to remove any #lines when noline is active, and added noline back to scan.l. I modified the prefix option to remove leading underscores if the prefix is blank. I added a yynamespace option for compiling the C scanner as C++, which can be used with a blank prefix, giving yy::lex() as the lexer function, for example. |
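The "initially set to 'unspecified'" handling can be pictured as a tri-state flag that a later check fills in only when the user left it alone. A minimal sketch, assuming invented names (`enum demo_tristate`, `struct demo_options`, `demo_check_options`); the particular defaults chosen for the full-table case are illustrative only and are not a statement of what flex or the patch actually selects.

```c
#include <assert.h>
#include <stddef.h>

/* Tri-state value: the user said nothing, said "no", or said "yes". */
enum demo_tristate { DEMO_UNSPECIFIED = -1, DEMO_OFF = 0, DEMO_ON = 1 };

struct demo_options {
    int fulltbl;                 /* full-table flag, set directly by the user */
    enum demo_tristate useecs;   /* equivalence classes */
    enum demo_tristate use_read; /* read() instead of stdio */
};

/* Fill in only the options the user left unspecified. Explicit choices are
 * never overwritten, whatever the full-table flag says. */
void demo_check_options(struct demo_options *o)
{
    assert(o != NULL);
    if (o->useecs == DEMO_UNSPECIFIED)
        o->useecs = o->fulltbl ? DEMO_OFF : DEMO_ON;
    if (o->use_read == DEMO_UNSPECIFIED)
        o->use_read = o->fulltbl ? DEMO_ON : DEMO_OFF;
}
```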
From: Joe K. <kr...@ni...> - 2008-10-29 17:26:53
|
There are several issues in the flex source code that should be addressed, but which don't have an obvious answer. The stdinit option is supposed to initialize I/O to stdin and stdout, although those may not be compile-time constants on some compilers. The strange thing is that yylex() starts by setting yyin,yyout to stdin,stdout if they are NULL. The only thing that stdinit could do is allow for using input() to get stdin data before calling yylex, but that doesn't work in flex anyhow (bug #618177). So, this option seems useless unless the input and unput functions are updated to work outside of yylex(). On a related issue, flex uses yyinput() instead of input() for C++ scanners, to avoid a name conflict with the 'input' stream. However, there is no such stream name. Maybe it was from an early implementation of C++. Besides, all C++ names are protected by namespace prefixes. Also, input() is a macro in traditional lex, although POSIX says it cannot be redefined. (Does anyone have access to the full POSIX specs?) My idea is to make the actual function name yyinput(), so it has the yy-prefix namespace, and make input() a macro. C++ code that uses yyinput() will still work, and input() will then be a macro that can be renamed or even skipped. According to documentation, the lex-compat option is supposed to make unput() non-redefinable, but this is not actually any change to unput() in the generated code. I think the idea of POSIX disallowing redefinition of unput, input, and output is mainly because there is no specification on how to redefine them. So, I still think using macros as with the original lex is a good idea, but that the docs should warn that changing them is non-portable. We could also add the output() feature to be more POSIX compatible. The docs say that yylineno is only supported with the lex-compat option, but this seems no longer the case. But, yylineno and yycolumn both need some work to make them fully functional. 
Flex has gettext NLS macros everywhere except the generated code. It would be good to implement them there, but it means adding an NLS option to flex itself. Joe Krahn |
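The input()-before-yylex() problem and the yyinput()/input() naming idea from this message can be sketched together. Everything here is hypothetical (`struct demo_scanner`, `demo_init_state`, `demo_yyinput`, `demo_input`); it only illustrates moving the "default to stdin/stdout if NULL" set-up into a function that the input routine can call on demand, with the unprefixed name provided as a macro.

```c
#include <assert.h>
#include <stdio.h>

struct demo_scanner {
    FILE *in;
    FILE *out;
    int initialized;
};

/* Set-up that would otherwise sit at the top of the lex function:
 * default the streams to stdin/stdout when the user has not set them. */
void demo_init_state(struct demo_scanner *s)
{
    assert(s != NULL);
    if (s->initialized)
        return;
    if (s->in == NULL)
        s->in = stdin;
    if (s->out == NULL)
        s->out = stdout;
    s->initialized = 1;
}

/* The real function carries the prefixed name and initializes on demand,
 * so it is usable before the first call to the lex function. */
int demo_yyinput(struct demo_scanner *s)
{
    demo_init_state(s);
    return getc(s->in);
}

/* The unprefixed name is only a macro, which user code could rename or
 * leave undefined. */
#define demo_input(s) demo_yyinput(s)
```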
From: Joe K. <kr...@ni...> - 2008-10-27 16:40:38
|
Aaron Stone wrote: > I would also like to see less mixing of skeleton-generated and > C-generated output. They're too deeply intertwingled right now, and I am > concerned that people with custom skeletons will end up relying on hacky > internal functions. > > Aaron ... OK, I did some more work on my modified skeleton. Now, the only code written directly from C are the data tables. The only way to reduce C-generated code further is to define M4 macros for just the raw table data, i.e. "{1,2,3...}" and let the skeleton declare their variable names as well. Actually, that would probably be a good idea, because it would allow for skeletons written in something other than C. This makes it impossible to use the old skeleton, but it still is possible to follow the old skeleton's order of putting section 1 code in the middle of header parts, if people want to avoid incompatibilities. I wanted to see how traditional lex works, and found that Sun has released their lex source in "The Heirloom Development Tools" on sourceforge. It is essentially the basic traditional lex, and still has an output option for ratfor source-code. It does have an extension to parse wchar_t input, which might be good to add to flex. Sun's lex actually puts user code from section 1 near the end of the "header" stuff, so you would have to either use CPP options on the command line to set some of the options like YYLMAX, or undefine then redefine them in the user code. I think most flex users depend on being able to set those macros, with few expectations for predefined types. One exception that I found in a few places in test code is to declare variables of type "yyscan_t" in the reentrant scanner. 
They can instead just be declared "void*", but it also works to read in the output header, so you could do something like this:

    %option header="scanner.h"
    %top{
    #define YYLMAX 256
    }
    %{
    #include "scanner.h"
    yyscan_t my_scanner;
    %}

This will put YYLMAX into the header so it will be valid for external code, and still allows section 1 user code to access header typedefs, etc.

If users don't want to be required to write out a header file, flex could also support a new %bottom{} code tag that allows user code to be inserted after the header definitions. Or, just define a %header command that indicates "insert header stuff here", so that all subsequent section 1 code comes after the header block.

Joe Krahn
From: Michael J. <jam...@gm...> - 2008-10-22 07:27:39
|
Hello All: I posted a patch to fix a bug that was reported a while ago in the tracker. Since the ticket seems to be dead I am putting a note here in the hope that someone will notice and integrate into the main branch. For reference, the bug report is titled "noyywrap generates incorrect define - ID: 1783536," the patch is attached to this e-mail and also as a comment on the bug report. Kind regards, Michael James P.S. Apologies if this makes it to the list twice. First attempt bounced. |
From: Aaron S. <aa...@se...> - 2008-10-21 18:05:07
|
On Oct 21, 2008, at 9:23 AM, Joe Krahn wrote: > Aaron Stone wrote: >> On Oct 14, 2008, at 8:45 AM, Joe Krahn wrote: >>> What part of the generated code is supposed to be visible to user >>> code >>> from section 1? My understanding is that section 1 is meant for >>> including extra headers, and inserting cpp #defines to modify cpp- >>> based >>> options. However, it apparently does not come early enough in the >>> generated code, because somebody added the %top{} feature. >>> >>> My first guess is just to avoid moving any code to the other side >>> of the >>> current section 1 code insertion, to avoid breaking things. But, >>> there >>> should be some specific rules. In any case, the %top{} feature is >>> useful >>> because it gets written to the public header. >> Huh, ok. I seem to recall that stuff in the top section came very >> nearly the beginning, if not actually the beginning, of the output >> file. I wonder if there are assumptions of this in many scanners. > The problem is that location of section 1 code is ill-defined. It > was created mainly to include headers, prototypes and macros needed > in the actions. Those declarations should all include their own > prerequisite headers. For that reason, section 1 could be inserted > at the very top, which would avoid the need for %top{}. In fact, > that makes the most sense to me, because Flex documentation never > claims to make anything available to section 1 code, and is also how > Flex's internal scanner source is written. But, that would surely > break a lot of scanners, which are usually built by trial-and-error. > > Other people may have different opinions that certain CPP > definitions should be available, even if undocumented, based on > experience with Flex. Even if the current plan is to just avoid > changes that can break scanners, we should come up with a more > precise definition of where section 1 code is inserted. 
My thinking > is that section 1 code should really not depend on anything from > Flex, except perhaps the version macros and standard I/O headers. The right thing to do is clearly define where the section 1 code goes in the output and revolve around that. I am in favor of section 1 being in the absolute top of the scanner. If the user wants to put scanner-definition-dependent code in section 1, they can generate a matching header file and include it, then add their own stuff. If we set up the include guards correctly, an early- inclusion / double-inclusion of the generated header will have no ill effect. Aaron |
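The include-guard point can be shown in a single translation unit by pasting the text of a hypothetical generated header twice, once where section 1 code would include it early and once where the scanner body would include it again. `DEMO_SCANNER_H`, `struct demo_buffer_state`, `demo_scan_t` and the other names are invented for this sketch and do not come from flex's output.

```c
#include <assert.h>
#include <stddef.h>

/* ---- first inclusion, e.g. from section 1 user code ---- */
#ifndef DEMO_SCANNER_H
#define DEMO_SCANNER_H
struct demo_buffer_state { int lineno; };
typedef void *demo_scan_t;
#endif /* DEMO_SCANNER_H */

/* ---- section 1 user code that relies on the header's types ---- */
static demo_scan_t demo_my_scanner;

/* ---- second inclusion, e.g. from the generated scanner body ----
 * Without the guard, the struct below would be a redefinition error. */
#ifndef DEMO_SCANNER_H
#define DEMO_SCANNER_H
struct demo_buffer_state { int lineno; };
typedef void *demo_scan_t;
#endif /* DEMO_SCANNER_H */

/* Returns 1 while the user's scanner handle has not been set up. */
int demo_scanner_is_unset(void)
{
    return demo_my_scanner == NULL;
}
```

The second copy is skipped entirely by the preprocessor, which is what makes the early inclusion harmless.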
From: Aaron S. <aa...@se...> - 2008-10-21 18:03:27
|
I would also like to see less mixing of skeleton-generated and C-generated output. They're too deeply intertwingled right now, and I am concerned that people with custom skeletons will end up relying on hacky internal functions.

Aaron

On Oct 21, 2008, at 9:35 AM, Joe Krahn wrote:
> My primary goal for reorganizing the skeleton is to reduce bugs and improve maintainability. However, any reorganization is likely to cause problems for some users. It might be better as a first step to convert C-generated code to skeleton-generated code, and replace the simple skelout() conditionals with m4 processing, but without the reorganized header. In that case, the actual sources generated for all of the test cases can be checked with diff, and should have only trivial changes.
>
> The reorganized header could then be distributed as an experimental version that people can try with the --skel= flag, and hopefully get some good user feedback.
>
> Joe
From: Joe K. <kr...@ni...> - 2008-10-21 16:35:42
|
My primary goal for reorganizing the skeleton is to reduce bugs and improve maintainability. However, any reorganization is likely to cause problems for some users. It might be better as a first step to convert C-generated code to skeleton-generated code, and replace the simple skelout() conditionals with m4 processing, but without the reorganized header. In that case, the actual sources generated for all of the test cases can be checked with diff, and should have only trivial changes. The reorganized header could then be distributed as an experimental version that people can try with the --skel= flag, and hopefully get some good user feedback. Joe |
From: Joe K. <kr...@ni...> - 2008-10-21 16:23:18
|
Aaron Stone wrote: > > On Oct 14, 2008, at 8:45 AM, Joe Krahn wrote: > >> What part of the generated code is supposed to be visible to user code >> from section 1? My understanding is that section 1 is meant for >> including extra headers, and inserting cpp #defines to modify cpp-based >> options. However, it apparently does not come early enough in the >> generated code, because somebody added the %top{} feature. >> >> My first guess is just to avoid moving any code to the other side of the >> current section 1 code insertion, to avoid breaking things. But, there >> should be some specific rules. In any case, the %top{} feature is useful >> because it gets written to the public header. > > Huh, ok. I seem to recall that stuff in the top section came very nearly > the beginning, if not actually the beginning, of the output file. I > wonder if there are assumptions of this in many scanners. The problem is that the location of section 1 code is ill-defined. It was created mainly to include headers, prototypes and macros needed in the actions. Those declarations should all include their own prerequisite headers. For that reason, section 1 could be inserted at the very top, which would avoid the need for %top{}. In fact, that makes the most sense to me, because Flex documentation never claims to make anything available to section 1 code, and that is also how Flex's internal scanner source is written. But, that would surely break a lot of scanners, which are usually built by trial-and-error. Other people may hold the opinion that certain CPP definitions should be available, even if undocumented, based on experience with Flex. Even if the current plan is to just avoid changes that can break scanners, we should come up with a more precise definition of where section 1 code is inserted. My thinking is that section 1 code should really not depend on anything from Flex, except perhaps the version macros and standard I/O headers. Joe |
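Why the insertion point matters for "cpp #defines to modify cpp-based options" can be shown with a small C model. It is a sketch under assumptions: `DEMO_BUF_SIZE`, `demo_buffer` and `demo_buffer_size()` are made-up names, and the three numbered regions only model one possible ordering (top block, then generated code, then section 1), not the layout Flex actually emits.

```c
#include <stddef.h>

/* (1) Model of where %top{} code lands: before any generated code,
 *     so an override placed here is seen by everything below. */
#define DEMO_BUF_SIZE 64

/* (2) Stand-in for generated scanner code: a default that applies
 *     only if the user has not already defined the macro. */
#ifndef DEMO_BUF_SIZE
#define DEMO_BUF_SIZE 16384
#endif
static char demo_buffer[DEMO_BUF_SIZE];

/* (3) Model of a section 1 position that comes after (2).
 *     Defining DEMO_BUF_SIZE here would be too late: demo_buffer
 *     has already been sized by the time this point is reached. */
size_t demo_buffer_size(void)
{
    return sizeof demo_buffer;
}
```

If section 1 were defined to sit at position (1), the same override would work from there and a separate top block would not be needed, which is the trade-off discussed in this message.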
From: Aaron S. <aa...@se...> - 2008-10-20 19:54:03
|
I think C89 should be the new minimum requirement, and code should be C99-clean (for example, C99 removed implicit function declarations, so a function can no longer be used before its declaration and assumed to return 'int'). Aaron On Oct 18, 2008, at 8:38 AM, Will Estes wrote: > I don't have strong feelings either way, but I would say that insisting > on ANSI C is pretty reasonable these days. > > On Friday, 17 October 2008, 11:41 am -0400, Joe Krahn <kr...@ni... > > wrote: > >> I added M4 macros to my modified skeleton to handle C and C++ >> functions >> with a single macro, instead of one macro for each argument count >> to add >> the reentrant argument. The idea was to simplify the skeleton body by >> having better M4 macros. >> >> My current implementation is derived from Bison's macros. It doesn't >> look as pretty here because Flex uses double brackets, and results >> in a >> syntax that is not really nicer looking than the previous method. It >> will be easier to simplify the macros if K&R support is dropped. Does >> anyone in the world still need it? If so, they can always use an >> older >> release of Flex. >> >> Joe > > -- > Will Estes (wl...@us...) > Flex Project Maintainer > http://flex.sourceforge.net/ |
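For readers who have not met the K&R versus ANSI distinction, here is a minimal C sketch of what "C89 minimum, C99-clean" means for function declarations. The function names `scale()` and `twice()` are invented for illustration and have nothing to do with Flex's skeleton. The K&R form is shown only inside a comment, because recent compilers may reject it.

```c
/* ANSI (C89) prototype: declares the return type and argument types
 * before first use, so the compiler can check every call. */
int scale(int value, int factor);

/* The same function in K&R style, kept as a comment only:
 *
 *     int scale(value, factor)
 *         int value, factor;
 *     {
 *         return value * factor;
 *     }
 *
 * With no prototype in scope, C89 would also accept a call to an
 * undeclared function and assume it returns int. C99 removed that
 * implicit declaration, so the prototype above is required. */

int twice(int value)
{
    /* Valid in C99 because scale() has already been declared. */
    return scale(value, 2);
}

int scale(int value, int factor)
{
    return value * factor;
}
```

Supporting both styles from one skeleton is what forces the extra macro layer; with K&R dropped, only the prototype form has to be generated.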
From: Will E. <wl...@us...> - 2008-10-18 15:39:58
|
I don't have strong feelings either way, but I would say that insisting on ANSI C is pretty reasonable these days. On Friday, 17 October 2008, 11:41 am -0400, Joe Krahn <kr...@ni...> wrote: > I added M4 macros to my modified skeleton to handle C and C++ functions > with a single macro, instead of one macro for each argument count to add > the reentrant argument. The idea was to simplify the skeleton body by > having better M4 macros. > > My current implementation is derived from Bison's macros. It doesn't > look as pretty here because Flex uses double brackets, and results in a > syntax that is not really nicer looking than the previous method. It > will be easier to simplify the macros if K&R support is dropped. Does > anyone in the world still need it? If so, they can always use an older > release of Flex. > > Joe -- Will Estes (wl...@us...) Flex Project Maintainer http://flex.sourceforge.net/ |
From: Joe K. <kr...@ni...> - 2008-10-17 15:41:43
|
I added M4 macros to my modified skeleton to handle C and C++ functions with a single macro, instead of one macro for each argument count to add the reentrant argument. The idea was to simplify the skeleton body by having better M4 macros. My current implementation is derived from Bison's macros. It doesn't look as pretty here because Flex uses double brackets, and results in a syntax that is not really nicer looking than the previous method. It will be easier to simplify the macros if K&R support is dropped. Does anyone in the world still need it? If so, they can always use an older release of Flex. Joe |
From: Will E. <wl...@us...> - 2008-10-16 18:34:19
|
On Thursday, 16 October 2008, 11:03 am -0400, Joe Krahn <kr...@ni...> wrote: > Also, there are a few more functions that could be disabled by a > no<func_name> option. Instead of making command-line options even more > numerous, maybe it would be a good idea to add a flag --options=LIST, > where LIST is a string containing options processed exactly as in the > %option directive. Yes, that does make sense. |
From: Will E. <wl...@us...> - 2008-10-16 18:27:42
|
On Thursday, 16 October 2008, 10:58 am -0700, Aaron Stone <aa...@se...> wrote: > We'll have to see if any of those options are relied upon by > Makefiles, but otherwise I like your idea. I don't recall if --nofoo > is something that's part of lex-ancient or flex-2.5.4 or not. > No, they're not a part of the older incarnations. |
From: Aaron S. <aa...@se...> - 2008-10-16 17:58:14
|
We'll have to see if any of those options are relied upon by Makefiles, but otherwise I like your idea. I don't recall if --nofoo is something that's part of lex-ancient or flex-2.5.4 or not. Aaron On Oct 16, 2008, at 8:03 AM, Joe Krahn wrote: > I noticed that the option processor allows a 'no' prefix even where it > doesn't make sense. In most cases, it probably is not that important, > but someone may get confused if they try to use an option like > "noextra-type". > > Also, there are a few more functions that could be disabled by a > no<func_name> option. Instead of making command-line options even more > numerous, maybe it would be a good idea to add a flag --options=LIST, > where LIST is a string containing options processed exactly as in the > %option directive. > > Joe Krahn |
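The two ideas in this thread (rejecting a meaningless 'no' prefix, and feeding one LIST string through the same option processing) could look roughly like the C sketch below. Everything here is hypothetical: the `--options=LIST` flag is only a proposal in this discussion, and the option table, `classify_option()` and `process_option_list()` are invented for illustration and are not Flex's option parser. The list is split on spaces and tabs only, which is a simplification.

```c
#include <string.h>

/* Hypothetical option table: name, and whether "no<name>" is meaningful. */
struct opt_desc {
    const char *name;
    int negatable;
};

static const struct opt_desc demo_opts[] = {
    { "reentrant",  1 },
    { "yywrap",     1 },
    { "extra-type", 0 }   /* takes a value, so "noextra-type" is rejected */
};
#define DEMO_OPT_COUNT (sizeof demo_opts / sizeof demo_opts[0])

/* Returns 1 (enable), 0 (disable), or -1 (unknown name or invalid "no"). */
int classify_option(const char *word)
{
    size_t i;

    /* Try an exact match first, so a name that happens to start
     * with "no" is never misread as a negation. */
    for (i = 0; i < DEMO_OPT_COUNT; i++)
        if (strcmp(word, demo_opts[i].name) == 0)
            return 1;

    if (strncmp(word, "no", 2) == 0)
        for (i = 0; i < DEMO_OPT_COUNT; i++)
            if (strcmp(word + 2, demo_opts[i].name) == 0)
                return demo_opts[i].negatable ? 0 : -1;

    return -1;
}

struct opt_summary {
    int enabled;
    int disabled;
    int rejected;
};

/* Split LIST on spaces and tabs and classify each word. */
struct opt_summary process_option_list(const char *list)
{
    struct opt_summary s = { 0, 0, 0 };
    const char *p = list;
    char word[64];

    while (*p != '\0') {
        size_t len;

        p += strspn(p, " \t");        /* skip separators */
        len = strcspn(p, " \t");      /* length of the next word */
        if (len == 0)
            break;                    /* only trailing separators remained */
        if (len >= sizeof word) {     /* too long to be a known option */
            s.rejected++;
            p += len;
            continue;
        }
        memcpy(word, p, len);
        word[len] = '\0';
        p += len;

        switch (classify_option(word)) {
        case 1:  s.enabled++;  break;
        case 0:  s.disabled++; break;
        default: s.rejected++; break;
        }
    }
    return s;
}
```

Calling `process_option_list("reentrant noyywrap noextra-type")` with this table counts one enabled, one disabled and one rejected option, so a user who types "noextra-type" gets an error instead of silent acceptance.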