Re: [Flex-help] How to Match 0 or More Whitespace at Line Beginning?
From: Lene N. <neu...@gm...> - 2010-12-31 05:50:00
Sorry, I made a mistake; it seems flex can match tokens of 0 characters after all.
Thanks a lot!

2010/12/30 John P. Hartmann <jph...@gm...>

>     ^[ ]+      return INDENTED;
>     ^[^ \n]+   yylval.string=strdup(yytext); return NOTINDENTEDWORD;
>
> You could also match a null string, but you'd need start conditions to
> avoid matching it forever:
>
>     ^""/[^ \n] return ...
>
> And don't forget to exclude newlines when you are excluding blanks.
>
> On 30 December 2010 07:00, Lene Neuron <neu...@gm...> wrote:
> > Hi
> >
> > Thanks for your advice, but it seems that the rule
> >     ^/[^ ] { return NO_INDENT; }
> > doesn't work. Flex still prints the message ``unrecognized rule''.
> >
> > I guess flex will not match a token of 0 characters, so the following
> > rule
> >     a* { printf("%d\n", yyleng); }
> > has the same effect as
> >     a+ { printf("%d\n", yyleng); }
> >
> > I have bypassed this problem in syntax analysis, by defining ``indent''
> > as
> >     indent
> >         : INDENT { return yyleng; }
> >         | { return 0; }
> >         ;
> >
> > Neuron Teckid
> >
> > 2010/12/30 Alexandre Bique <biq...@gm...>
> >
> >> On Thu, Dec 30, 2010 at 3:09 AM, Lene Neuron <neu...@gm...> wrote:
> >> > Hi, all:
> >> > I would like to use *flex* to parse *Python* source code. The problem
> >> > is that I cannot match 0 whitespace characters at the beginning of a
> >> > line, which indicates that the line is not indented.
> >> >
> >> > I tried this at first:
> >> >
> >> >     ^[ ]* { return INDENT; }
> >> >
> >> > but it didn't work when there was no whitespace at the beginning of
> >> > the line. Then I changed it to
> >> >
> >> >     ^[ ]+ { return INDENT; }
> >> >     ^ { return INDENT; }
> >> >
> >> > but I was told that the latter rule is an "unrecognized rule".
> >> >
> >> > Would you please give me some ideas? Thanks in advance.
> >> >
> >> > Neuron Teckid
> >>
> >> Hi,
> >>
> >> I am not sure, but you can do something like this:
> >>
> >>     ^/[^ ] { return NO_INDENT; }
> >>
> >> See the explanation from the manual:
> >>
> >> Chapter 6 "Patterns":
> >>
> >> [...]
> >>
> >> `r/s'
> >>      an `r' but only if it is followed by an `s'. The text matched by
> >>      `s' is included when determining whether this rule is the longest
> >>      match, but is then returned to the input before the action is
> >>      executed. So the action only sees the text matched by `r'. This
> >>      type of pattern is called "trailing context". (There are some
> >>      combinations of `r/s' that flex cannot match correctly. *Note
> >>      Limitations::, regarding dangerous trailing context.)
> >>
> >> `^r'
> >>      an `r', but only at the beginning of a line (i.e., when just
> >>      starting to scan, or right after a newline has been scanned).
> >>
> >> [...]
> >>
> >> But, from my point of view, it would be better to count the
> >> indentation level rather than match the "no indentation" case.
> >> You can also have multiple states in your lexer (see chapter 10 "Start
> >> Conditions").
> >>
> >> Good luck.
> >>
> >> --
> >> Alexandre Bique
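
For anyone reading this thread later: Alexandre's suggestion to count the indentation level, rather than match the "no indentation" case as a separate token, can be sketched in plain C outside of flex. This is only an illustration under stated assumptions, not code from anyone in the thread: the names `indent_width` and `line_is_blank` are hypothetical, only spaces are counted (tabs are ignored), and each line is assumed to be a NUL-terminated string that may end in a newline.

```c
#include <stddef.h>

/* Hypothetical helper (not part of flex): number of leading spaces on the
 * line starting at `line`. An unindented line simply yields 0, so no
 * zero-length token is needed to represent "no indentation". */
size_t indent_width(const char *line)
{
    size_t n = 0;
    while (line[n] == ' ')
        n++;
    return n;
}

/* Hypothetical helper: returns 1 if the line holds nothing but spaces
 * before its newline (or end of string), else 0. A Python-style scanner
 * would normally skip such lines when tracking indentation. */
int line_is_blank(const char *line)
{
    const char *p = line + indent_width(line);
    return *p == '\n' || *p == '\0';
}
```

A caller would run `indent_width` once per line, for example from the action that handles a newline or from the code feeding lines to the scanner, and compare the result with the previous level to decide whether the block got deeper or shallower. The explicit newline check in `line_is_blank` follows John's reminder to exclude newlines when excluding blanks.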