flex-devel Mailing List for flex: the fast lexical analyser (Page 7)
flex is a tool for generating scanners
Brought to you by:
wlestes
From: Peter M. <pet...@gm...> - 2012-07-10 20:59:32
|
I'm not familiar with ARM, although it seems like it would be a good idea to start looking into it now. Is that bit setting something that can change at runtime?

On Tue, Jul 10, 2012 at 4:50 PM, Paul <pa...@pr...> wrote:
> I'm not aware of a fully general compile-time check. I use flex on PCs
> with Linux & Windows, and cross-compiled to small micros (ARM) that can be
> either little- or big-endian depending on a bit setting.
>
> Paul
>
> On 07/10/2012 03:59 PM, Peter Martini wrote:
>> if we do the check at compile time |
From: Peter M. <pet...@gm...> - 2012-07-10 20:55:53
|
I'm sorry if I wasn't clear, I meant converting the mainline to git - this thread was really aimed at Will, to see if he's willing to do the work. I've done a dry run and it seems simple enough, but obviously it could only be truly done by a committer.

On Tue, Jul 10, 2012 at 4:53 PM, Paul <pa...@pr...> wrote:
> I really would rather not effectively fork flex. I would prefer to do
> whatever is necessary to integrate with the mainline.
>
> Paul
>
> On 07/10/2012 04:43 PM, ruertar wrote:
> > On 07/10/2012 04:33 PM, Peter Martini wrote:
> >> Any thoughts on converting to Git? Sourceforge already has a
> >> (empty) git repository for flex, and it's not that difficult to
> >> convert the CVS repo to Git while preserving all of the history,
> >> branches, and tags. (I've done it locally already).
> >>
> > I would like this as well.
> >
> > ------------------------------------------------------------------------------
> > Live Security Virtual Conference
> > Exclusive live event will cover all the ways today's security and
> > threat landscape has changed and how IT managers can respond. Discussions
> > will include endpoint security, mobile security and the latest in malware
> > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> > _______________________________________________
> > Flex-devel mailing list
> > Fle...@li...
> > https://lists.sourceforge.net/lists/listinfo/flex-devel |
From: Paul <pa...@pr...> - 2012-07-10 20:53:25
|
I really would rather not effectively fork flex. I would prefer to do whatever is necessary to integrate with the mainline.

Paul

On 07/10/2012 04:43 PM, ruertar wrote:
> On 07/10/2012 04:33 PM, Peter Martini wrote:
>> Any thoughts on converting to Git? Sourceforge already has a
>> (empty) git repository for flex, and it's not that difficult to
>> convert the CVS repo to Git while preserving all of the history,
>> branches, and tags. (I've done it locally already).
>>
> I would like this as well. |
From: Paul <pa...@pr...> - 2012-07-10 20:50:54
|
I'm not aware of a fully general compile-time check. I use flex on PCs with Linux & Windows, and cross-compiled to small micros (ARM) that can be either little- or big-endian depending on a bit setting.

Paul

On 07/10/2012 03:59 PM, Peter Martini wrote:
> if we do the check at compile time |
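For readers following the compile-time question, here is a minimal sketch of what such a check could look like. It assumes GCC or Clang, which predefine `__BYTE_ORDER__`; as noted in the message above, this is not fully general, and a target whose byte order is chosen by a bit setting would still need a runtime test. The names `swap16`, `U16_FROM_LE`, and `U16_FROM_BE` are hypothetical and are not part of flex.

```c
#include <stdint.h>

/* Hypothetical helper (not part of flex): swap the two bytes of a code unit. */
static inline uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* __BYTE_ORDER__ is predefined by GCC and Clang. Other compilers may need
 * a different test, so this sketch refuses to guess. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
#  define U16_FROM_LE(v) ((uint16_t)(v))   /* native order: no swap */
#  define U16_FROM_BE(v) swap16(v)
#elif defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#  define U16_FROM_LE(v) swap16(v)
#  define U16_FROM_BE(v) ((uint16_t)(v))   /* native order: no swap */
#else
#  error "byte order unknown at compile time; use a runtime check instead"
#endif
```

With macros like these, input that is already in native order costs nothing at run time, which is the performance argument made in the thread.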
From: ruertar <ru...@gm...> - 2012-07-10 20:44:07
|
On 07/10/2012 04:33 PM, Peter Martini wrote:
> Any thoughts on converting to Git? Sourceforge already has a
> (empty) git repository for flex, and it's not that difficult to
> convert the CVS repo to Git while preserving all of the history,
> branches, and tags. (I've done it locally already).
>
I would like this as well. |
From: Peter M. <pet...@gm...> - 2012-07-10 20:33:37
|
Any thoughts on converting to Git? Sourceforge already has a (empty) git repository for flex, and it's not that difficult to convert the CVS repo to Git while preserving all of the history, branches, and tags. (I've done it locally already). |
From: Peter M. <pet...@gm...> - 2012-07-10 20:31:54
|
Certainly enough to get started, thanks!

On Tue, Jul 10, 2012 at 4:25 PM, Paul <pa...@pr...> wrote:
> No, I don't. I'm working against a CVS checkout of flex from 1 July 2012.
> I would like to merge with flex fairly soon before the repository version
> changes.
> I could post a diff against CVS.
>
> Paul Neelands
>
> On 07/10/2012 03:59 PM, Peter Martini wrote:
>> Do you have a git repository |
From: Paul <pa...@pr...> - 2012-07-10 20:25:53
|
No, I don't. I'm working against a CVS checkout of flex from 1 July 2012. I would like to merge with flex fairly soon before the repository version changes. I could post a diff against CVS.

Paul Neelands

On 07/10/2012 03:59 PM, Peter Martini wrote:
> Do you have a git repository |
From: Peter M. <pet...@gm...> - 2012-07-10 20:00:00
|
Do you have a git repository (or something) where I can follow along with the changes?

By the way - thanks for setting the options to --utf16le and --utf16be; is there anything we can do about -U though? I do a lot of web and UNIX work, so my default frame of mind for Unicode is UTF-8.

Going to the official Unicode FAQ, at http://unicode.org/faq/utf_bom.html, I think for --utf16 it should only ever stay in utf16 mode for a single character. If that character is a BOM, switch to the appropriate BE or LE, and otherwise switch to the native encoding (although technically plain utf-16 is specified as BE, I kind of think most UTF-16 processing will be done on Wintel and native encoding makes more sense for us).

Is the byte swap check done at program start up time or compile time? It may be a bit uglier to generate, but for performance if we do the check at compile time and use macros for the byte swapping functions, there will be no penalty at all when running the code if it's using native endianness, and we can muck around with using htons etc on platforms where it's available (if that makes it faster, which I don't know).

I have a branch on github with my UTF-8 work, which is a little stale (from 2011), and I'd like to try to merge our efforts. The pattern classes are compiled as a list of ranges in the first phase, and then translated to a byte sequence as the second phase, so the things I was working on with pattern matching (particularly having '.' exclude the surrogates but still match characters above 0xFFFF) would automatically work with double byte. An important test is to try to match "0xD800 0xDC00" - "." should match, and ".." should not.

When I get some time, I want to add an optional dependency on libicu, which we could use to turn named characters, properties, and classes into character ranges for use in character classes. After all, just ask any Perl guy who's worked with Unicode, '\w' and '\d' have a severely different meaning when Unicode is in effect ...

On Tue, Jul 10, 2012 at 3:36 PM, Paul <pa...@pr...> wrote:
> The flex unicode version has now been changed so that:
> %option utf16 generates a scanner that accepts the native utf of the
> machine.
> %option utf16le generates a scanner that accepts UTF-16LE regardless of
> the machine byte order.
> %option utf16be generates a scanner that accepts UTF-16BE regardless of
> the machine byte order.
>
> This means that when utf16le or utf16be is an option the scanner tests
> the byte order of the machine. This code is not generated otherwise or
> for utf16. The test is done at startup, but checked at each read to
> decide to swap bytes. The byte swap is also only generated with utf16le
> or utf16be. Thus a Flex scanner.c may be moved between machines with
> hopefully least astonishment.
>
> There are separate tests for C, C++, reentrant and non-reentrant
> scanners, with options utf16le & utf16be.
> There are now 105 tests all of which pass.
>
> The flag -U or --utf is the same as %option utf16.
> The flag --utf16le is the same as %option utf16le.
> The flag --utf16be is the same as %option utf16be.
>
> Suggestions?
>
> Paul Neelands |
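The surrogate test proposed above ("0xD800 0xDC00" is one character, so "." matches and ".." does not) rests on standard UTF-16 pair decoding. A minimal sketch follows, assuming native-endian code units; `utf16_decode` is a hypothetical helper written for illustration and is not taken from either patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not flex code): decode one code point from a buffer
 * of native-endian UTF-16 code units. Returns the number of units consumed
 * (1 or 2), or 0 for empty input or an unpaired surrogate. */
static size_t utf16_decode(const uint16_t *s, size_t len, uint32_t *cp)
{
    if (len == 0)
        return 0;
    if (s[0] < 0xD800 || s[0] > 0xDFFF) {      /* ordinary BMP code unit */
        *cp = s[0];
        return 1;
    }
    if (s[0] <= 0xDBFF && len >= 2 &&
        s[1] >= 0xDC00 && s[1] <= 0xDFFF) {    /* high + low surrogate */
        *cp = 0x10000u + (((uint32_t)(s[0] - 0xD800) << 10) |
                          (uint32_t)(s[1] - 0xDC00));
        return 2;
    }
    return 0;                                  /* lone surrogate */
}
```

For the pair 0xD800 0xDC00 this consumes both units and yields U+10000, a single character, which is why a one-character pattern should cover the whole input.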
From: Paul <pa...@pr...> - 2012-07-10 19:36:24
|
The flex unicode version has now been changed so that:
%option utf16 generates a scanner that accepts the native utf of the machine.
%option utf16le generates a scanner that accepts UTF-16LE regardless of the machine byte order.
%option utf16be generates a scanner that accepts UTF-16BE regardless of the machine byte order.

This means that when utf16le or utf16be is an option the scanner tests the byte order of the machine. This code is not generated otherwise or for utf16. The test is done at startup, but checked at each read to decide to swap bytes. The byte swap is also only generated with utf16le or utf16be. Thus a Flex scanner.c may be moved between machines with hopefully least astonishment.

There are separate tests for C, C++, reentrant and non-reentrant scanners, with options utf16le & utf16be.
There are now 105 tests all of which pass.

The flag -U or --utf is the same as %option utf16.
The flag --utf16le is the same as %option utf16le.
The flag --utf16be is the same as %option utf16be.

Suggestions?

Paul Neelands |
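The generated code is not shown in the thread, so the following is only a sketch of the scheme as described: test the host byte order once, then consult the result on each read to decide whether to swap. The function names and the `host_le` flag are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch, not the generated scanner code.
 * Returns 1 on a little-endian machine, 0 on a big-endian one. */
static int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Convert a buffer of UTF-16LE code units to native order in place.
 * The caller runs the host test once (at startup) and passes the result
 * in host_le; only that flag is checked on each read. */
static void utf16le_to_native(uint16_t *buf, size_t n, int host_le)
{
    size_t i;
    if (host_le)
        return;                 /* already native: nothing to do */
    for (i = 0; i < n; i++)
        buf[i] = (uint16_t)((buf[i] << 8) | (buf[i] >> 8));
}
```

A utf16be variant would be the mirror image, swapping only when the host is little-endian.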
From: Peter M. <pet...@gm...> - 2012-07-09 20:06:20
|
Do you have a forked repo somewhere?

Will - would you be willing to convert from CVS to git? There's actually a conversion procedure that keeps the history; I've done it locally and it seems to work. Git has the lovely property that I can commit to my repo for tracking purposes, and then send commits to the master later, so we don't have to trade megapatches.

-- sent from my phone, please excuse my brevity

On Jul 9, 2012 2:21 PM, "Paul" <pa...@pr...> wrote:
> A unicode escape sequence \uxxx has been implemented in unicode flex.
> For example \u391 is the Greek ALPHA.
>
> Paul Neelands |
From: Paul <pa...@pr...> - 2012-07-09 18:21:08
|
A unicode escape sequence \uxxx has been implemented in unicode flex. For example \u391 is the Greek ALPHA. Paul Neelands |
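How the patched flex stores an escape such as `\u391` internally is not described here. Assuming the value ends up as UTF-16 code units, the standard encoding step can be sketched as below; `utf16_encode` is a hypothetical helper, not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not flex code): encode one Unicode code point as
 * UTF-16 code units. Returns the number of units written to out[] (1 or 2),
 * or 0 if cp is a surrogate or lies beyond U+10FFFF. */
static size_t utf16_encode(uint32_t cp, uint16_t out[2])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not characters */
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;          /* BMP: a single code unit */
        return 1;
    }
    if (cp > 0x10FFFF)
        return 0;                       /* outside the Unicode range */
    cp -= 0x10000;
    out[0] = (uint16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

Under that assumption, the Greek ALPHA example (0x391) fits in a single code unit, while anything above 0xFFFF needs a surrogate pair.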
From: Peter M. <pet...@gm...> - 2012-07-09 15:25:13
|
Am I correct in thinking that if this is compiled on a big-endian machine, it would accept UTF-16BE? (I have an old SPARC box somewhere I can boot up to test).

I think the only sane thing we can do is explicitly support only UTF-16BE and UTF-16LE and allow a function to toggle which is which as a result of processing tokens. The reason I think that is, after double checking the Unicode FAQ, the BOM is only valid at the start of a text stream (and is otherwise a zero width no-break space) - and it must be the user's responsibility to define the start and end of a text stream. What do you think?

On Mon, Jul 9, 2012 at 11:08 AM, Paul <pa...@pr...> wrote:
> Right now the utf16 flex accepts UTF-16LE.
> It does not handle BOM or endianness.
> Presently in my own work, that is handled by the YY_INPUT macro.
> I suspect there are varying philosophical positions here.
> Would like some input suggestions for this.
>
> Paul Neelands |
From: Paul <pa...@pr...> - 2012-07-09 15:09:01
|
Right now the utf16 flex accepts UTF-16LE. It does not handle BOM or endianness. Presently in my own work, that is handled by the YY_INPUT macro. I suspect there are varying philosophical positions here. Would like some input suggestions for this. Paul Neelands |
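The YY_INPUT-based handling mentioned above is not shown in the thread. As a rough illustration of the BOM part only, a custom YY_INPUT could call something like the following on the first bytes of a stream; the enum and `detect_bom` are hypothetical names, and what to do when no BOM is present is left to the caller.

```c
#include <stddef.h>

enum utf16_order { UTF16_UNKNOWN, UTF16_LE, UTF16_BE };

/* Hypothetical helper: inspect the first two bytes of a stream for a
 * UTF-16 byte order mark (U+FEFF). */
static enum utf16_order detect_bom(const unsigned char *p, size_t len)
{
    if (len >= 2) {
        if (p[0] == 0xFF && p[1] == 0xFE)
            return UTF16_LE;    /* FF FE: little-endian BOM */
        if (p[0] == 0xFE && p[1] == 0xFF)
            return UTF16_BE;    /* FE FF: big-endian BOM */
    }
    return UTF16_UNKNOWN;       /* no BOM: caller picks a default */
}
```

Because a BOM is only meaningful at the start of a text stream, a check like this would run once per stream rather than on every read.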
From: Paul <pa...@pr...> - 2012-07-09 13:58:39
|
Have now changed the flex option & 47 tests to utf16.
Passes 99/0

Paul

On 07/09/2012 01:50 AM, Peter Martini wrote:
> Paul,
>
> Would you mind changing the flag/option in your work to utf16 rather
> than unicode? That will leave us room to add utf8 handling in the future.
>
> Also, I haven't looked deeply enough to see how you handle BOMs /
> endianness. Right now, if I copy a file from Windows to a
> Solaris/SPARC machine, will it process properly (or can it be made to
> process properly, maybe by specifying utf16be or utf16le?)
>
> Peter |
From: Paul <pa...@pr...> - 2012-07-09 11:58:01
|
Sure, I will happily change the unicode flag to utf16.
At the moment unicode flex handles UTF-16LE files.

Paul Neelands

On 07/09/2012 01:50 AM, Peter Martini wrote:
> Paul,
>
> Would you mind changing the flag/option in your work to utf16 rather
> than unicode? That will leave us room to add utf8 handling in the future.
>
> Also, I haven't looked deeply enough to see how you handle BOMs /
> endianness. Right now, if I copy a file from Windows to a
> Solaris/SPARC machine, will it process properly (or can it be made to
> process properly, maybe by specifying utf16be or utf16le?)
>
> Peter |
From: Paul <pa...@pr...> - 2012-07-09 11:56:07
|
With the -Cm & Unicode combination banished, the unicode version of flex now passes all 99 tests.

Paul Neelands |
From: Peter M. <pet...@gm...> - 2012-07-09 05:50:31
|
Paul,

Would you mind changing the flag/option in your work to utf16 rather than unicode? That will leave us room to add utf8 handling in the future.

Also, I haven't looked deeply enough to see how you handle BOMs / endianness. Right now, if I copy a file from Windows to a Solaris/SPARC machine, will it process properly (or can it be made to process properly, maybe by specifying utf16be or utf16le?)

Peter |
From: Paul <pa...@pr...> - 2012-07-08 22:44:24
|
The unicode version of flex core dumps with the flags -U & -Cm. Since in general the unicode version does not compress tables, I would like to banish this particular flag combination. Presently -U & -Ca have been banished since version 2.5.4.U. Would like some direction on this though. Paul Neelands |
From: Paul <pa...@pr...> - 2012-07-08 14:25:38
|
a. Passes 92 of 93 tests.
b. Have added one test to match some unicode (greek). (passes)
c. Still have not fixed the failing utest-table-opts. (fails).
d. Have been using this now in my own work for about 6 months and works reliably.

Paul Neelands |
From: Paul <pa...@pr...> - 2012-07-06 17:49:29
|
I think now that this is a C problem & not a flex problem.

Paul

On 07/06/2012 12:34 PM, Paul wrote:
> using gcc & glibc <stdint.h>
> uint16_t *x=UINT16_C("ababab");
> compiles to: in hex:
> 6261, 6261, 6261,0
>
> Thus the initialization problem asked about is still not solved.
>
> Paul
>
> On 07/06/2012 12:11 PM, Peter Martini wrote:
>> Sorry, sent from my phone, I did mean uint16_t :-) And I thought that
>> it was mandated by C99 - is there something in particular that
>> doesn't support it?
>>
>> On Fri, Jul 6, 2012 at 12:08 PM, Paul <pa...@pr...> wrote:
>>
>> No Uint16t in gcc. Is a Microsoft-ism.
>>
>> glibc has <stdint.h> which has uint16_t but not universal.
>>
>> Paul
>>
>> On 07/06/2012 11:36 AM, Peter Martini wrote:
>>
>> Uint16t? |
From: Paul <pa...@pr...> - 2012-07-06 16:34:08
|
using gcc & glibc <stdint.h>
uint16_t *x=UINT16_C("ababab");
compiles to: in hex:
6261, 6261, 6261,0

Thus the initialization problem asked about is still not solved.

Paul

On 07/06/2012 12:11 PM, Peter Martini wrote:
> Sorry, sent from my phone, I did mean uint16_t :-) And I thought that
> it was mandated by C99 - is there something in particular that doesn't
> support it?
>
> On Fri, Jul 6, 2012 at 12:08 PM, Paul <pa...@pr...> wrote:
>
> No Uint16t in gcc. Is a Microsoft-ism.
>
> glibc has <stdint.h> which has uint16_t but not universal.
>
> Paul
>
> On 07/06/2012 11:36 AM, Peter Martini wrote:
>
> Uint16t? |
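The 6261 values reported above are what one would expect when a narrow string is read two bytes at a time: `UINT16_C` is meant for integer constants and does not widen a string literal, so 'a' (0x61) and 'b' (0x62) end up sharing one 16-bit unit. The small illustration below uses a hypothetical `first_unit` helper and `memcpy` to show the same effect without the pointer cast.

```c
#include <stdint.h>
#include <string.h>

/* Illustration only: copy the first two bytes of a narrow string into one
 * 16-bit unit, which is what reading a char string through a uint16_t
 * pointer amounts to. */
static uint16_t first_unit(const char *s)
{
    uint16_t u;
    memcpy(&u, s, sizeof u);   /* 'a' (0x61) and 'b' (0x62) share one unit */
    return u;
}
```

On a little-endian machine `first_unit("ababab")` gives 0x6261, matching the reported output; a big-endian machine would give 0x6162 instead.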
From: Peter M. <pet...@gm...> - 2012-07-06 16:11:13
|
Sorry, sent from my phone, I did mean uint16_t :-) And I thought that it was mandated by C99 - is there something in particular that doesn't support it?

On Fri, Jul 6, 2012 at 12:08 PM, Paul <pa...@pr...> wrote:
> No Uint16t in gcc. Is a Microsoft-ism.
>
> glibc has <stdint.h> which has uint16_t but not universal.
>
> Paul
>
> On 07/06/2012 11:36 AM, Peter Martini wrote:
>>
>> Uint16t? |
From: Paul <pa...@pr...> - 2012-07-06 16:08:29
|
No Uint16t in gcc. Is a Microsoft-ism.

glibc has <stdint.h> which has uint16_t but not universal.

Paul

On 07/06/2012 11:36 AM, Peter Martini wrote:
>
> Uint16t? |
From: Peter M. <pet...@gm...> - 2012-07-06 15:36:14
|
Uint16t?

On Jul 6, 2012 11:29 AM, "Paul" <pa...@pr...> wrote:
> The unicode16 version of flex uses unsigned short.
> Without opening a large can of worms re types, to pass the tests
> utest-posix
> and utest-posixly-correct (nothing to do with posix compatibility) with
> the -U flag on, I have altered the line:
>
> char * tests[NUM_TESTS] = { "ababab"}; /* non unicode */
> to (simplified):
> unsigned short testu[7]={'a','b','a','b','a','b',0}; /* altered for unicode pn */
>
> The test then passes with testu OK.
>
> This makes initializing a pain since:
> a. wchar_t is 4 bytes wide in gcc
> b. L"ababab" is also 4 bytes wide.
> whereas unsigned short & unicode 16 is 2 bytes wide.
> I will note that char16_t is not defined in C, only C++.
>
> Thoughts please.
>
> Paul |
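One way around the initialization pain described above, sketched under the assumption that the test strings are plain ASCII, is to keep the narrow literals and widen them at run time. `widen_ascii` is a hypothetical test helper, not something from the flex test suite.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical test helper: widen a NUL-terminated ASCII string into
 * 16-bit code units. dst must have room for cap units (cap >= 1).
 * Returns the number of units written, excluding the terminator. */
static size_t widen_ascii(uint16_t *dst, size_t cap, const char *src)
{
    size_t i = 0;
    while (i + 1 < cap && src[i] != '\0') {
        dst[i] = (uint16_t)(unsigned char)src[i];
        i++;
    }
    dst[i] = 0;                 /* always terminate the output */
    return i;
}
```

This keeps the existing `"ababab"` literals usable for both the narrow and the 16-bit tests; it does not help with non-ASCII test data, which would still need explicit code unit values.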