flex-devel Mailing List for flex: the fast lexical analyser (Page 5)
flex is a tool for generating scanners
Brought to you by:
wlestes
From: Mighty Jo <mig...@gm...> - 2015-10-25 00:24:26
|
Hi all, I'm a C++ developer (among other things) who's been working with Flex. I noticed some C quirks in the skeleton and I'm planning to ameliorate them. If I'm reading the mailing list archives correctly, I shouldn't be violating any designs you've set out, but I'll write to the list before overhauling anything major. I'll be sticking to the C++ specific code to begin with (e.g. changing pointers to std::istream into references wherever the lexer doesn't take ownership). Cheers! -Joe Langley |
From: Will E. <wes...@gm...> - 2015-08-06 13:32:37
|
Thanks for your patches. They've been applied and pushed to origin/master. They will be included in the next released version of flex (which I am planning on doing this weekend).

On Thursday, 6 August 2015, 12:37 pm +0300, Jaska Uimonen <jas...@he...> wrote:
> Hello,
>
> I was assigned a task to fix some static analysis issues
> in certain sw components. There were a couple of issues found
> in flex.
>
> I have to say that I'm not totally sure these are real bugs
> and will ever manifest themselves. Some functions in flex
> are really long and it is really difficult to follow if the
> issues are false positives or not. OTOH I don't think the
> fixes will do any harm either...
>
> First patch is just initializing an array, because the
> analysis says it might be used uninitialized (some 500
> lines later). I'm not sure it is correct either to
> use the array initialized to 0 i.e. should the code exit
> earlier with an error if the array is not populated.
> Anyway this silences the warnings.
>
> Second patch is about not freeing yynultrans_tbl in some
> case. Again the function is really long so I have a little
> bit of trouble following the code flow.
>
> If there are issues with the patches or they are considered
> "non-issues", please let me know.
>
> br,
> Jaska Uimonen
>
>
>
> From 69e1c063e830f91cac4001ed8d0d264de539c03c Mon Sep 17 00:00:00 2001
> From: Jaska Uimonen <jas...@he...>
> Date: Mon, 27 Jul 2015 10:59:58 +0300
> Subject: [PATCH 1/2] fix possible uninitialized array values
>
> ---
>  src/dfa.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/src/dfa.c b/src/dfa.c
> index c16a010..0a68e3a 100644
> --- a/src/dfa.c
> +++ b/src/dfa.c
> @@ -400,7 +400,7 @@ void ntod ()
>      * from 1 to CSIZE, so their size must be CSIZE + 1.
>      */
>     int duplist[CSIZE + 1], state[CSIZE + 1];
> -   int targfreq[CSIZE + 1], targstate[CSIZE + 1];
> +   int targfreq[CSIZE + 1] = {0}, targstate[CSIZE + 1];
>
>     /* accset needs to be large enough to hold all of the rules present
>      * in the input, *plus* their YY_TRAILING_HEAD_MASK variants.
> --
> 1.9.3
>
> From 0c897db0dab3bca5f96d4c4fb1f6f20ed1024c0c Mon Sep 17 00:00:00 2001
> From: Jaska Uimonen <jas...@he...>
> Date: Mon, 27 Jul 2015 11:20:05 +0300
> Subject: [PATCH 2/2] fix possible resource leak with yynultrans_tbl
>
> ---
>  src/gen.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/src/gen.c b/src/gen.c
> index 049cbfe..81e7c27 100644
> --- a/src/gen.c
> +++ b/src/gen.c
> @@ -1525,7 +1525,7 @@ void make_tables (void)
>  {
>     int i;
>     int did_eof_rule = false;
> -   struct yytbl_data *yynultrans_tbl;
> +   struct yytbl_data *yynultrans_tbl = NULL;
>
>
>     skelout (); /* %% [2.0] - break point in skel */
> @@ -1755,9 +1755,13 @@ void make_tables (void)
>                 0)
>             flexerror (_
>                    ("Could not write yynultrans_tbl"));
> +       }
> +
> +       if (yynultrans_tbl != NULL) {
>             yytbl_data_destroy (yynultrans_tbl);
>             yynultrans_tbl = NULL;
> -       }
> +       }
> +
>         /* End generating yy_NUL_trans */
>     }
>
> --
> 1.9.3
>
> ------------------------------------------------------------------------------
> _______________________________________________
> Flex-devel mailing list
> Fle...@li...
> https://lists.sourceforge.net/lists/listinfo/flex-devel |
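The two patches above apply two common defensive patterns: zero-initializing a local array at its declaration, and initializing a pointer to NULL so that a single cleanup block outside the conditional can release it on every path. The following is a minimal standalone sketch of both patterns, not flex code. The names `struct table`, `table_create`, `table_destroy` and `build`, and the value chosen for `CSIZE`, are assumptions made up for illustration.

```c
#include <stdlib.h>

#define CSIZE 256  /* assumed size, for illustration only */

/* Hypothetical stand-in for a heap-allocated table such as yynultrans_tbl. */
struct table {
    int *data;
};

static struct table *table_create(int n)
{
    struct table *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->data = calloc((size_t)n, sizeof *t->data);
    if (t->data == NULL) {
        free(t);
        return NULL;
    }
    return t;
}

static void table_destroy(struct table *t)
{
    free(t->data);
    free(t);
}

/* Returns the sum of targfreq[] after optionally building a table. */
int build(int want_table)
{
    int targfreq[CSIZE + 1] = {0};  /* pattern 1: every element starts at 0 */
    struct table *tbl = NULL;       /* pattern 2: NULL until really allocated */
    int i, sum = 0;

    if (want_table) {
        tbl = table_create(CSIZE + 1);
        if (tbl != NULL)
            targfreq[1] = 1;        /* pretend some work was recorded */
    }

    /* Cleanup sits outside the branch, so every path releases tbl. */
    if (tbl != NULL) {
        table_destroy(tbl);
        tbl = NULL;
    }

    for (i = 0; i <= CSIZE; i++)
        sum += targfreq[i];
    return sum;
}
```

Whether reading zeros from the array is actually the right behaviour, rather than an early error exit, is the open question Jaska raises; the sketch only shows that the read becomes well defined.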
From: Jaska U. <jas...@he...> - 2015-08-06 09:37:20
|
Hello,

I was assigned a task to fix some static analysis issues in certain sw components. There were a couple of issues found in flex.

I have to say that I'm not totally sure these are real bugs and will ever manifest themselves. Some functions in flex are really long and it is really difficult to follow if the issues are false positives or not. OTOH I don't think the fixes will do any harm either...

First patch is just initializing an array, because the analysis says it might be used uninitialized (some 500 lines later). I'm not sure it is correct either to use the array initialized to 0 i.e. should the code exit earlier with an error if the array is not populated. Anyway this silences the warnings.

Second patch is about not freeing yynultrans_tbl in some case. Again the function is really long so I have a little bit of trouble following the code flow.

If there are issues with the patches or they are considered "non-issues", please let me know.

br,
Jaska Uimonen |
From: Will E. <wes...@gm...> - 2015-02-28 15:10:03
|
The listing of po/Makefile.in in configure.ac is correct usage. It's a gettext thing. The manual points out the difference and explains it, saying that the distributed file from gettext is Makefile.in.in. It comes from generalizing so as not to assume things about the toolchain, I think.

On Saturday, 28 February 2015, 8:10 am -0500, SenseiC <cgo...@gm...> wrote:
> While slowly digging through the files I noticed something (almost
> certainly just an oversight) in the file configure.ac that probably
> needs correction.
> Line 122: po/Makefile.in
> I believe that should have the trailing ".in" removed. Otherwise when
> autogen.sh gets executed it creates the following file:
> $ ls -a po | grep Makefile
> Makefile.in.in
> Probably not the intended result.
> SenseiC bows out. |
From: SenseiC <cgo...@gm...> - 2015-02-28 13:10:50
|
While slowly digging through the files I noticed something (almost certainly just an oversight) in the file configure.ac that probably needs correction. Line 122: po/Makefile.in I believe that should have the trailing ".in" removed. Otherwise when autogen.sh gets executed it creates the following file: $ ls -a po | grep Makefile Makefile.in.in Probably not the intended result. SenseiC bows out. |
From: Will E. <wes...@gm...> - 2015-02-27 15:41:29
|
Thanks for writing. Yes, since your question involves the distribution and building of flex itself, it's best directed here.

You can clone the most recent sources off of github at https://github.com/westes/flex/.

I do vaguely remember removing a bunch of things from the source tree a while back in an effort to clean up and remove cruft. I haven't used OpenVMS in going on a decade, so I can't offer any help on the specifics of that.

There are some files in the source tree that are intentionally not distributed. If you point out something that would be helpful to have in the distribution, it's easy enough to include more files.

Is OpenVMS a target that autoconf supports? If so, if I need to use some more recent tools in the GNU/autotools toolset, point me at the appropriate versions.

--Will

On Friday, 27 February 2015, 10:23 am -0500, SenseiC <cgo...@gm...> wrote:
> Going on the assumption that since my question involves the content of the
> actual Flex tarball... I have sent this query to this distribution list.
> If it belongs in the "help" list, please let me know and I will resend it
> there.
> Earlier this week I set out to build an Open Source product on OpenVMS.
> As seems the norm building one thing requires building several other
> "supporting" applications first. While I CAN obtain pre-built Flex
> binaries for OpenVMS (based on the 2.5.4 code base), I noticed that it
> appears the \misc folder no longer appears in the tarball.
> Did it actually "get removed" or do I need to pull the source from
> somewhere OTHER than the SourceForge site???
> Assuming it DID get removed...
> At what point did this get removed?
> Did it get removed for a particular reason?
> Has anyone ever produced a "standard" Makefile for building Flex on
> OpenVMS? Unfortunately I have yet to figure out a way to coerce the
> blasted configure script to generate an appropriate Makefile, but alas
> that frustration belongs to a different group :)
> SenseiC |
From: SenseiC <cgo...@gm...> - 2015-02-27 15:23:30
|
Going on the assumption that since my question involves the content of the actual Flex tarball... I have sent this query to this distribution list. If it belongs in the "help" list, please let me know and I will resend it there. Earlier this week I set out to build an Open Source product on OpenVMS. As seems the norm building one thing requires building several other "supporting" applications first. While I CAN obtain pre-built Flex binaries for OpenVMS (based on the 2.5.4 code base), I noticed that it appears the \misc folder no longer appears in the tarball. Did it actually "get removed" or do I need to pull the source from somewhere OTHER than the SourceForge site??? Assuming it DID get removed... At what point did this get removed? Did it get removed for a particular reason? Has anyone ever produced a "standard" Makefile for building Flex on OpenVMS? Unfortunately I have yet to figure out a way to coerce the blasted configure script to generate an appropriate Makefile, but alas that frustration belongs to a different group :) SenseiC |
From: Will E. <wes...@gm...> - 2014-06-14 14:45:46
|
On Saturday, 14 June 2014, 11:39 am +0200, Mariusz Pluciński <plu...@gm...> wrote:
> On 06/14/2014 02:38 AM, Will Estes wrote:
> >So one of the things I'm doing is rewriting the test suite to use
> >automake's parallel test suite support as it makes the test suite
> >much easier to maintain, assuming I can get all the tests
> >rewritten. I've got about 20 tests rewritten, with the caveat that
> >I'm rewriting the easy tests first.
>
> Just for sake of experiment, I've ported flex build system from
> autotools to CMake. As far as I checked, it works quite good- for
> both building and testing. Do you think that such change would be
> beneficial for mainline flex too?

I don't think so at this point.

> The parts of table that are not actively used by scanner programmer,
> usually all contain the same value. This is especially true in long
> not-yet-assigned Unicode planes. I suppose that even simple RLE may
> give great difference here.

I'd rather see a solution that works and is refinable than wait for something more perfect.

> Don't worry, my code is currently on top of your master branch.
> Okay, I'll then split my commits into smaller branches and start
> with pull requests.

Go for it!

--Will |
From: Mariusz P. <plu...@gm...> - 2014-06-14 10:02:06
|
At the moment, no. The conversion table is hardcoded in flex.skl and supports only a common subset of EBCDIC. But as I've written, I am going to add a way for the user to specify his own conversion routines and/or fall back to iconv for unspecified ones. I suppose it will fulfill your needs.

On 06/14/2014 10:32 AM, John P. Hartmann wrote:
> Does your EBCDIC support allow the specification of the particular
> EBCDIC code page? If it supports 1047 you have saved me from
> maintaining my own hack. Once your branch is merged into the base, that is.

--
Best Regards,
Mariusz Pluciński
http://www.mplucinski.com/ |
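One way to picture the "user-specified conversion routine" idea from this exchange is a function-pointer interface: the scanner calls whatever decoder it is handed, one character at a time. The sketch below is an assumption about how such an interface could look, not the interface in the fork. The names `decode_fn`, `decode_ebcdic_subset` and `decode_all` are hypothetical, and the table covers only space, the digits and the letters A to I, which sit at the same positions in the common EBCDIC code pages.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoder signature: read one character from in[0..len),
 * store its Unicode code point in *out, return bytes consumed (0 = error). */
typedef size_t (*decode_fn)(const unsigned char *in, size_t len,
                            uint_least32_t *out);

/* Tiny EBCDIC subset: space, digits 0-9 and upper-case A-I only. */
size_t decode_ebcdic_subset(const unsigned char *in, size_t len,
                            uint_least32_t *out)
{
    unsigned char c;

    if (len == 0)
        return 0;
    c = in[0];
    if (c == 0x40)
        *out = 0x20;                               /* space */
    else if (c >= 0xF0 && c <= 0xF9)
        *out = 0x30 + (uint_least32_t)(c - 0xF0);  /* '0'..'9' */
    else if (c >= 0xC1 && c <= 0xC9)
        *out = 0x41 + (uint_least32_t)(c - 0xC1);  /* 'A'..'I' */
    else
        return 0;                                  /* not in this subset */
    return 1;
}

/* Decode a whole buffer with whichever routine the caller plugs in.
 * Returns the number of code points written to out[]. */
size_t decode_all(decode_fn fn, const unsigned char *in, size_t len,
                  uint_least32_t *out, size_t max_out)
{
    size_t n = 0;

    while (len > 0 && n < max_out) {
        size_t used = fn(in, len, &out[n]);
        if (used == 0)
            break;              /* undecodable input: stop here */
        in += used;
        len -= used;
        n++;
    }
    return n;
}
```

A code-page-specific decoder (1047, say) would then be one more function with the same signature, which is roughly what John's question asks for.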
From: Mariusz P. <plu...@gm...> - 2014-06-14 09:39:45
|
On 06/14/2014 02:38 AM, Will Estes wrote:
> So one of the things I'm doing is rewriting the test suite to use
> automake's parallel test suite support as it makes the test suite much
> easier to maintain, assuming I can get all the tests rewritten. I've
> got about 20 tests rewritten, with the caveat that I'm rewriting the
> easy tests first.

Just for sake of experiment, I've ported flex build system from autotools to CMake. As far as I checked, it works quite good- for both building and testing. Do you think that such change would be beneficial for mainline flex too?

>> - output binary size - as character classes array now may have up
>> to 65536*17+1 elements - on 64-bit platform it gives almost 9 MB of
>> data in final binary. Not mentioning intermediate .c file...
> Yeah but what are options to reduce the size of the output that don't require a lot of code complexity?

The parts of table that are not actively used by scanner programmer, usually all contain the same value. This is especially true in long not-yet-assigned Unicode planes. I suppose that even simple RLE may give great difference here.

> The standard thing to do is to submit a pull request and interested
> persons can comment. You'll have a lot of rebasing to do if you aren't
> following all the 2.6.0 changes, but don't let that hold you back.

Don't worry, my code is currently on top of your master branch. Okay, I'll then split my commits into smaller branches and start with pull requests.

Thanks,
Mariusz |
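For scale: 65536*17+1 is 1,114,113 entries, and at an assumed 8 bytes per entry that comes to 8,912,904 bytes, which matches the "almost 9 MB" figure. The run-length encoding Mariusz suggests can be sketched as below. This is only an illustration of the general technique, not code from the fork; `struct run`, `rle_encode` and `rle_lookup` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* One run: `count` consecutive table entries that all hold `value`. */
struct run {
    uint32_t count;
    int32_t  value;
};

/* Encode table[0..n) into runs[]. Returns the number of runs written,
 * or 0 if max_runs is too small (or n is 0). */
size_t rle_encode(const int32_t *table, size_t n,
                  struct run *runs, size_t max_runs)
{
    size_t nruns = 0;
    size_t i = 0;

    while (i < n) {
        size_t j = i + 1;
        while (j < n && table[j] == table[i])
            j++;                      /* extend the run of equal values */
        if (nruns == max_runs)
            return 0;
        runs[nruns].count = (uint32_t)(j - i);
        runs[nruns].value = table[i];
        nruns++;
        i = j;
    }
    return nruns;
}

/* Look up entry `index` by walking the runs. This is linear in the
 * number of runs; a scanner's hot path would likely want something
 * faster, such as a binary search over run start offsets. */
int32_t rle_lookup(const struct run *runs, size_t nruns, size_t index)
{
    size_t start = 0;
    size_t r;

    for (r = 0; r < nruns; r++) {
        if (index < start + runs[r].count)
            return runs[r].value;
        start += runs[r].count;
    }
    return -1;                        /* index out of range */
}
```

The trade-off is the one Will hints at: the table shrinks when long stretches share a value, but every lookup now costs more than a plain array index.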
From: John P. H. <jph...@gm...> - 2014-06-14 08:32:58
|
Does your EBCDIC support allow the specification of the particular EBCDIC code page? If it supports 1047 you have saved me from maintaining my own hack. Once your branch is merged into the base, that is. |
From: Will E. <wes...@gm...> - 2014-06-14 00:39:00
|
On Saturday, 14 June 2014, 2:20 am +0200, Mariusz Pluciński <mpl...@mp...> wrote:
> I'm writing, because I have recently done some hacking in flex. My
> goal was to add Unicode support (the feature that I see as very
> important these days). Yesterday, I published the result as a fork
> of your repository on Github. It is available under address:
> https://github.com/mplucinski/flex

Awesome.

>
> Using my fork, it is possible to write rules in ".l" files that
> capture non-ASCII characters (the source must be encoded in UTF-8).
> Generated scanners are theoretically possible to handle any existing
> character encoding, but for now only ASCII, UTF-8 and EBCDIC are
> available (the last one has been added just for testing purposes).
> The example of such scanner is available in
> https://github.com/mplucinski/flex/blob/master/tests/test-unicode-nr
> : scanner.l is able to properly parse test.input file.

So one of the things I'm doing is rewriting the test suite to use automake's parallel test suite support as it makes the test suite much easier to maintain, assuming I can get all the tests rewritten. I've got about 20 tests rewritten, with the caveat that I'm rewriting the easy tests first.

>
> My code does not inherit from version that is in flex's
> "to.do/unicode" directory, as that one uses wchar_t (which I see as

That's an old solution kept around for purely historical reasons until something better comes along.

> a wrong way), and does not provide any way to deal with various
> character encodings. My version uses char32_t, which makes it
> possible to deal with all 17 Unicode planes on all modern platforms
> (as far as I know, at least one of popular operating systems still
> defines wchar_t as 16-bit long). Also, my version makes it quite
> simple to support virtually any character encoding, even ones with
> variable character size.
>
> What I would like to accomplish, is to get my changes integrated
> with main line of flex project. However, I am aware that my code is
> not yet mature enough - I'm pretty sure there are big and small
> issues with new features, as well as regressions (despite that all
> existing test cases pass).

Starting with the test suite is good though.

>
> I currently see a few things that needs an urgent improvement:
>
> - support for more encodings - at least UTF-32 and both variants of
> UTF-16 (these should be builtins, I think). For others, the
> interface for programmer to add his own conversion function would be
> sufficient. Optional support of "iconv" may be another approach.

This all sounds reasonable, understanding that I'm not anything like a unicode expert.

>
> - output binary size - as character classes array now may have up
> to 65536*17+1 elements - on 64-bit platform it gives almost 9 MB of
> data in final binary. Not mentioning intermediate .c file...

Yeah but what are options to reduce the size of the output that don't require a lot of code complexity?

>
> - generation speed - I'm not sure if it may be greatly improved,
> but generating even simplest Unicode scanner takes a long time (over
> 30 seconds on my machine). This definitely needs at least profiling.

Sure.

>
> I will be working on those issues in upcoming weeks, but meanwhile:
>
>
> What do you think, would it be good idea to introduce my changes
> into official flex release? If yes, may I ask you for looking into
> my commits and point out issues that should be resolved before such
> merge?

The standard thing to do is to submit a pull request and interested persons can comment.

You'll have a lot of rebasing to do if you aren't following all the 2.6.0 changes, but don't let that hold you back.

>
> Your review would be a great help for me. I would be especially
> happy to hear about some corner cases I missed, or solutions that
> does not fit very well with general flex approach.
>
>
> I'm looking forward to hearing from you.

Thanks for your work and interest in flex. We'll see about your changes.

>
> (if anyone on the mailing list would like to also take a look there,
> it would be great to hear from you too!)

Amen to this; many hands and all that.

>
>
> Regards,
> Mariusz Pluciński
> http://www.mplucinski.com/

--
Will Estes (wes...@gm...)
Flex Project Maintainer
http://flex.sourceforge.net/ |
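The point about a 32-bit character type can be made concrete with a small UTF-8 decoder: code points above U+FFFF need more than 16 bits, so a 16-bit wchar_t cannot hold them in one unit. The sketch below is a generic decoder written for illustration and is not taken from the fork. The name `utf8_decode` and the typedef `codepoint` are assumptions; `uint_least32_t` is used because C11 defines char32_t as that same type.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint_least32_t codepoint;  /* the type C11 uses for char32_t */

/* Decode one UTF-8 sequence from s[0..len). Stores the code point in
 * *out and returns the number of bytes consumed, or 0 on malformed
 * input (truncated, overlong, surrogate, or above U+10FFFF). */
size_t utf8_decode(const unsigned char *s, size_t len, codepoint *out)
{
    codepoint cp, min;
    size_t need, i;

    if (len == 0)
        return 0;

    if (s[0] < 0x80) {
        cp = s[0];        need = 0; min = 0;
    } else if ((s[0] & 0xE0) == 0xC0) {
        cp = s[0] & 0x1F; need = 1; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {
        cp = s[0] & 0x0F; need = 2; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {
        cp = s[0] & 0x07; need = 3; min = 0x10000;
    } else {
        return 0;                     /* stray continuation or bad lead */
    }

    if (len < need + 1)
        return 0;                     /* truncated sequence */

    for (i = 1; i <= need; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;                 /* not a continuation byte */
        cp = (cp << 6) | (codepoint)(s[i] & 0x3F);
    }

    if (cp < min)
        return 0;                     /* overlong encoding */
    if (cp > 0x10FFFF)
        return 0;                     /* beyond the 17 Unicode planes */
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                     /* UTF-16 surrogate range */

    *out = cp;
    return need + 1;
}
```

A four-byte sequence such as F0 9F 98 80 decodes to U+1F600, which is outside the Basic Multilingual Plane and therefore does not fit a 16-bit unit.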
From: Will E. <wes...@gm...> - 2014-04-30 00:36:31
|
On Tuesday, 29 April 2014, 7:29 pm -0500, Nathan Royce <na...@ho...> wrote:
> I don't know why it was rejected. I didn't think it was appropriate for "flex-help" since I wasn't using it yet for any other applications.

You're likely not a member of the list.

>
> > Subject: Configure CPP With ARM Hard-Float
> From: fle...@li...
> To: na...@ho...
> Date: Tue, 29 Apr 2014 14:17:34 +0000
>
> You are not allowed to post to this mailing list, and your message has
> been automatically rejected. If you think that your messages are
> being rejected in error, contact the mailing list owner at
> fle...@li....
>
>
>
> --Forwarded Message Attachment--
> From: na...@ho...
> To: fle...@li...
> Subject: Configure CPP With ARM Hard-Float
> Date: Tue, 29 Apr 2014 09:17:26 -0500
>
>
>
>
> I was having an issue with configuring flex and found that CPP was being set to "/lib/cpp". Looking at the line 7150 "for CPP in "$CC -E" "$CC -E -traditional-cpp" "/lib/cpp"", I see the test does not take into account any flags and the particular one in CFLAGS is "mfloat-abi=hard", otherwise it looks for a missing "stubs-soft.h"
> I had a "make check" problem with lzlib and plzip with the same issue and submitted to that as well, and is fixed in the development releases. The neat trick the developer told me before it was fixed was to set "LDFLAGS='$CFLAGS $LDFLAGS'" which I used in this flex case by setting CC='$CC $CFLAGS'. I have a feeling I'm going to encounter this with many packages.

Thanks for your report. I'm rewriting the test suite for a number of reasons, as it happens, so I'll watch out for this sort of thing. $CPPFLAGS, $CFLAGS and $LDFLAGS are all different variables that should be set independently. |
From: Ayan G. <ay...@ay...> - 2013-12-06 13:50:09
|
On 12/06/2013 08:19 AM, Will Estes wrote:
> All,
>
> For a variety of reasons, I'm transitioning flex over to be hosted at github. The initial push is there now. The documentation needs edited to reflect the change too.
>
> https://github.com/westes/flex
>

Thank you! Excellent move. |
From: Will E. <wes...@gm...> - 2013-12-06 13:19:30
|
All, For a variety of reasons, I'm transitioning flex over to be hosted at github. The initial push is there now. The documentation needs edited to reflect the change too. https://github.com/westes/flex Share and enjoy, --Will -- Will Estes (wl...@us...) Flex Project Maintainer http://flex.sourceforge.net/ |
From: Will E. <wes...@gm...> - 2013-02-04 18:29:54
|
Long time done. And it's been great: folks have had an easier time sending patches. And I've had an easier time reviewing/applying patches, working on feature branches--you know, everything everybody likes about git. On Monday, 4 February 2013, 12:58 pm -0500, Ayan George <ay...@ay...> wrote: > > I recall there was talk about moving the flex SCM to git. How is that > going? Is there any progress? > > -ayan > > _______________________________________________ > Flex-devel mailing list > Fle...@li... > https://lists.sourceforge.net/lists/listinfo/flex-devel -- Will Estes (wl...@us...) Flex Project Maintainer http://flex.sourceforge.net/ |
From: Ayan G. <ay...@ay...> - 2013-02-04 18:21:04
|
I recall there was talk about moving the flex SCM to git. How is that going? Is there any progress? -ayan |
From: John P. H. <jph...@gm...> - 2013-02-03 11:26:50
|
I like the re-entrant scanner, but I have a requirement for a get function to be able to determine if the push stack is empty so I can issue a meaningful diagnostic when trying to pop something that isn't pushed. This is often the user supplying unbalanced parentheses. I currently test yy_start_stack_ptr, but I'm only interested in null/not null. Thanks, j. |
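The kind of accessor requested here can be sketched in plain C. This is only an illustration under stated assumptions: `sc_stack`, `sc_stack_is_empty`, `sc_push` and `sc_pop` are hypothetical names standing in for the scanner's start-condition stack, and none of them are part of flex's generated API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a scanner's start-condition stack.
   Not flex's own data structure; the names are made up for this sketch. */
#define SC_STACK_MAX 16

struct sc_stack {
    int states[SC_STACK_MAX];
    size_t depth;               /* number of pushed states; 0 means empty */
};

/* The "get" function asked for: is there anything to pop? */
static int sc_stack_is_empty(const struct sc_stack *s)
{
    assert(s != NULL);
    return s->depth == 0;
}

/* Push a state; returns 0 on success, -1 if the stack is full. */
static int sc_push(struct sc_stack *s, int state)
{
    if (s->depth == SC_STACK_MAX)
        return -1;
    s->states[s->depth++] = state;
    return 0;
}

/* Pop into *state; returns 0 on success, -1 if nothing was pushed,
   so the caller can report e.g. an unbalanced ')' instead of aborting. */
static int sc_pop(struct sc_stack *s, int *state)
{
    if (sc_stack_is_empty(s))
        return -1;
    *state = s->states[--s->depth];
    return 0;
}
```

A caller would check `sc_stack_is_empty` (or the return value of `sc_pop`) before popping and emit its own diagnostic when the stack is empty.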
From: Will E. <wes...@gm...> - 2012-10-01 16:20:46
|
This report is reproducible. I'll have to look at what the right fix is, although I'm considering a revert of that particular commit until I can do further testing. Anyone have thoughts? --Will On Saturday, 8 September 2012, 2:11 am +0400, "Dmitry V. Levin" <ld...@al...> wrote: > Hi, > > Commit flex-2.5.37-10-gec2fdb8 "put user code after yyguts init; resolves #1744516" > introduced a regression: YY_USER_INIT is now put before user's declarations, > which looks wrong and breaks already existing code. For example, bison-2.6.2 > fails to build after this change: > > In file included from scan-gram-c.c:3:0: > scan-gram.c: In function 'gram_lex': > scan-gram.c:1294:3: error: 'code_start' undeclared (first use in this function) > > I suppose the previous behavior (when user's declarations were put before > initialization) was correct. > > > -- > ldv -- Will Estes (wl...@us...) Flex Project Maintainer http://flex.sourceforge.net/ |
From: Dmitry V. L. <ld...@al...> - 2012-09-07 22:11:16
|
Hi, Commit flex-2.5.37-10-gec2fdb8 "put user code after yyguts init; resolves #1744516" introduced a regression: YY_USER_INIT is now put before user's declarations, which looks wrong and breaks already existing code. For example, bison-2.6.2 fails to build after this change: In file included from scan-gram-c.c:3:0: scan-gram.c: In function 'gram_lex': scan-gram.c:1294:3: error: 'code_start' undeclared (first use in this function) I suppose the previous behavior (when user's declarations were put before initialization) was correct. -- ldv |
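The ordering problem in this report can be illustrated with a small, self-contained C sketch. It is not flex's skeleton or bison's scanner: `SKETCH_USER_INIT` and `sketch_lex` are hypothetical names, and `code_start` is given type `int` here purely for illustration.

```c
#include <assert.h>

/* Hypothetical stand-in for a user's YY_USER_INIT: it assigns to a
   variable that the user declares at the top of the scanning function. */
#define SKETCH_USER_INIT code_start = 42;

static int sketch_lex(void)
{
    /* The user's declarations come first ... */
    int code_start = 0;

    /* ... so the one-time initialization can refer to them. */
    SKETCH_USER_INIT

    /* If the macro were expanded above the declaration instead, the
       compiler would reject it with "'code_start' undeclared", which is
       the kind of error quoted in the report. */
    return code_start;
}
```

With the declaration placed before the macro expansion, the function compiles and returns the initialized value.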
From: Will E. <wes...@gm...> - 2012-08-04 23:55:10
|
Thanks, Mike. Applied and pushed up to SourceForge. --Will On Saturday, 4 August 2012, 3:51 pm -0400, Mike Frysinger <va...@ge...> wrote: > Signed-off-by: Mike Frysinger <va...@ge...> > --- > flexdef.h | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/flexdef.h b/flexdef.h > index 0e81410..046dd9a 100644 > --- a/flexdef.h > +++ b/flexdef.h > @@ -908,6 +908,9 @@ extern void lerrif PROTO ((const char *, int)); > /* Report an error message formatted with one string argument. */ > extern void lerrsf PROTO ((const char *, const char *)); > > +/* Like lerrsf, but also exit after displaying message. */ > +extern void lerrsf_fatal PROTO ((const char *, const char *)); > + > /* Spit out a "#line" statement. */ > extern void line_directive_out PROTO ((FILE *, int)); > > -- > 1.7.9.7 -- Will Estes (wl...@us...) Flex Project Maintainer http://flex.sourceforge.net/ |
From: Mike F. <va...@ge...> - 2012-08-04 19:51:51
|
Signed-off-by: Mike Frysinger <va...@ge...> --- flexdef.h | 3 +++ 1 file changed, 3 insertions(+) diff --git a/flexdef.h b/flexdef.h index 0e81410..046dd9a 100644 --- a/flexdef.h +++ b/flexdef.h @@ -908,6 +908,9 @@ extern void lerrif PROTO ((const char *, int)); /* Report an error message formatted with one string argument. */ extern void lerrsf PROTO ((const char *, const char *)); +/* Like lerrsf, but also exit after displaying message. */ +extern void lerrsf_fatal PROTO ((const char *, const char *)); + /* Spit out a "#line" statement. */ extern void line_directive_out PROTO ((FILE *, int)); -- 1.7.9.7 |
From: Peter M. <pet...@gm...> - 2012-07-23 23:44:50
|
I'm not sure what you mean by the traditional un-flagged behavior. Without the option utf8 (or the (?u: modifier), it should behave exactly as before. If it doesn't, there's a bug in my logic. On Mon, Jul 23, 2012 at 7:39 PM, Paul <pa...@pr...> wrote: > I suppose that maintaining the traditional un-flagged behavior of flex > was part of my thinking. > > Paul > > > On 12-07-23 07:27 PM, Peter Martini wrote: > > With 8bit but without utf8, it shouldn't do the extra utf8 processing - in > particular, a 3-byte char will match "..." rather than "." > > -- sent from my phone, please excuse my brevity > On Jul 23, 2012 7:12 PM, "Paul" <pa...@pr...> wrote: > >> Looking at Peter Martini's work, I think that when set as 8 bit, it will >> accept both ascii and utf-8. >> I would prefer to only accept utf-8 with a flex flag like: %option utf8. >> I believe a generated scanner can be more efficient if it does not have >> to deal with utf-8. >> This seems to be a philosophical issue, so I would like opinions. >> I may be off the rails here. >> >> Paul Neelands >> > > |
From: Paul <pa...@pr...> - 2012-07-23 23:40:07
|
I suppose that maintaining the traditional un-flagged behavior of flex was part of my thinking. Paul On 12-07-23 07:27 PM, Peter Martini wrote: > > With 8bit but without utf8, it shouldn't do the extra utf8 processing > - in particular, a 3-byte char will match "..." rather than "." > > -- sent from my phone, please excuse my brevity > > On Jul 23, 2012 7:12 PM, "Paul" <pa...@pr...> wrote: > > Looking at Peter Martini's work, I think that when set as 8 bit, it will > accept both ascii and utf-8. > I would prefer to only accept utf-8 with a flex flag like: %option utf8. > I believe a generated scanner can be more efficient if it does not have > to deal with utf-8. > This seems to be a philosophical issue, so I would like opinions. > I may be off the rails here. > > Paul Neelands > |
From: Peter M. <pet...@gm...> - 2012-07-23 23:27:28
|
With 8bit but without utf8, it shouldn't do the extra utf8 processing - in particular, a 3-byte char will match "..." rather than "." -- sent from my phone, please excuse my brevity On Jul 23, 2012 7:12 PM, "Paul" <pa...@pr...> wrote: > Looking at Peter Martini's work, I think that when set as 8 bit, it will > accept both ascii and utf-8. > I would prefer to only accept utf-8 with a flex flag like: %option utf8. > I believe a generated scanner can be more efficient if it does not have > to deal with utf-8. > This seems to be a philosophical issue, so I would like opinions. > I may be off the rails here. > > Paul Neelands > |
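The byte-versus-character distinction above can be made concrete with a short C sketch. It is not taken from the patch under discussion; `utf8_seq_len` and `count_code_points` are hypothetical helpers. They only inspect lead bytes and do not validate the encoding (overlong forms and stray continuation bytes are not rejected).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Number of bytes in the UTF-8 sequence introduced by `lead`,
   or 0 if `lead` cannot start a sequence (e.g. a continuation byte). */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80)           return 1;  /* 0xxxxxxx: ASCII */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx */
    return 0;
}

/* Count characters the way a UTF-8-aware "." would see them.
   A byte-oriented "." sees strlen(s) units instead. */
static size_t count_code_points(const char *s)
{
    size_t n = 0;
    while (*s != '\0') {
        size_t len = utf8_seq_len((unsigned char)*s);
        if (len == 0)
            len = 1;            /* treat a stray byte as one unit */
        /* Step over the sequence, stopping early if it is truncated. */
        while (len > 0 && *s != '\0') {
            s++;
            len--;
        }
        n++;
    }
    return n;
}
```

For the euro sign, encoded as the bytes E2 82 AC, a byte-oriented scanner needs three "." matches, while a UTF-8-aware one would need a single match.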