Re: [jflex-users] Issue in Tokenizing
From: Denis W. <ddw...@gm...> - 2009-12-29 23:35:48
Thanks, Steve. It was really helpful.

Best Regards,
Denis Weerasiri

On Mon, Dec 28, 2009 at 8:41 PM, Steve Rowe <sa...@od...> wrote:
> Hi Denis,
>
> You appear to be directly using the syntax from the XML specification,
> and this won't work. The first issue I can see is the use of '-' as a
> regular language set subtraction operator - JFlex does not support this
> syntax. Check out the documentation, and look for the '!' operator and
> a description of using it to do something similar.
>
> The Symbol class's toString() method is:
>
>     public int sym;
>     [...]
>     public String toString() { return "#"+sym; }
>
> and it sounds like you want to print out the Symbol name rather than the
> integer code for it, but AFAICT, there is no out-of-the-box way to do
> this. Maybe you could subclass Symbol and add the int->String mappings
> there, along with a toString() method that references the mappings?
>
> When you create symbols, I think you should be using the constructor
> that carries a value, and pass in the matched text, when you're
> interested in the text.
>
> I again encourage you to take a look at the IntelliJ IDEA xml grammar I
> sent the link for in my last email - it's Apache2 licensed, so you are
> free to use it for anything you want.
>
> Steve
>
> Denis Weerasiri wrote:
> > My Flex file is attached here.
> >
> > My required output is something like the following, for the test code
> > I mentioned in my previous mail.
> >
> > stringtype : ANYCONTENTS
> > id : VALUE
> > h : PREFIX
> > NAME : VALUE
> >
> > Best Regards,
> > Denis
> >
> > On Mon, Dec 28, 2009 at 7:16 PM, Steve Rowe <sa...@od...
> > <mailto:sa...@od...>> wrote:
> >
> >     Hi Denis,
> >
> >     For some reason your grammar file attachment didn't come through -
> >     can you resend it inline?
> >
> >     It's not clear to me exactly what the problem is - can you show what
> >     you *want* the output to be?
> >
> >     You may find it useful to see other implementations of JFlex XML
> >     lexers - here's the one used by IntelliJ IDEA's Community Edition:
> >
> >     <http://git.jetbrains.org/?p=idea/community.git;a=blob_plain;f=xml/impl/src/com/intellij/lexer/_XmlLexer.flex>
> >
> >     Steve
> >
> >     Denis Weerasiri wrote:
> >
> >         Hi all,
> >         I wrote a .flex to tokenize an XML document. Tokens will be like
> >         elements, attributes, values etc. in the XML document. My .flex
> >         is attached here.
> >         I used the following code to test the input. I would be happy if
> >         anyone could give me a clue on how to resolve this issue.
> >         Best Regards,
> >         Denis.
> >
> >         String text = "<stringtype id=\"h:NAME\">\n" +
> >                       " </stringtype>";
> >         try {
> >             InputStream is = new ByteArrayInputStream(text.getBytes("UTF-8"));
> >             Scanner sc = new Scanner(is);
> >             try {
> >                 for (int i = 0; i < 45; i++)
> >                     System.out.println(sc.yytext() + ":" + sc.next_token());
> >             } catch (Exception ex) {
> >                 ex.printStackTrace();
> >             }
> >         } catch (UnsupportedEncodingException e) {
> >             e.printStackTrace();
> >         } catch (java.io.IOException e) {
> >             e.printStackTrace();
> >         }
> >
> >         The output is always like this (I want to show that tokenizing
> >         happens at character level):
> >
> >         :#106
> >         <:#104
> >         s:#104
> >         t:#104
> >         r:#104
> >         i:#104
> >         n:#104
> >         g:#104
> >         t:#104
> >         y:#104
> >         p:#104
> >         e:#104
> >         i:#104
> >         d:#106
> >         =:#106
> >         ":#104
> >         h:#106
> >         ::#104
> >         N:#104
> >         A:#104
> >         M:#104
> >         E:#106
> >         ":#106
> >         >:#106
> >         <:#106
> >         /:#104
> >         s:#104
> >         t:#104
> >         r:#104
> >         i:#104
> >         n:#104
> >         g:#104
> >         t:#104
> >         y:#104
> >         p:#104
> >         e:#106
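
Editorial sketch (not part of the original thread): Steve's suggestion of subclassing Symbol, adding int->String mappings, and carrying the matched text as the symbol's value could look roughly like the following. Everything here is an assumption made for illustration. The Symbol class is a small stand-in that mirrors only the sym and value fields, the (int, Object) constructor, and the toString() quoted above, so the example compiles without CUP on the classpath. NamedSymbol, NamedSymbolDemo, and the integer codes 1 to 3 are hypothetical; only the token names ANYCONTENTS, VALUE and PREFIX come from Denis's desired output.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for java_cup.runtime.Symbol (assumption: only the members used below).
class Symbol {
    public int sym;       // integer token code
    public Object value;  // payload, here the matched text

    public Symbol(int id, Object value) {
        this.sym = id;
        this.value = value;
    }

    @Override
    public String toString() {
        return "#" + sym; // same format as the toString() quoted in the thread
    }
}

// Hypothetical subclass holding the int->String mappings.
class NamedSymbol extends Symbol {
    // Hypothetical codes; a real project would use the constants of its generated sym class.
    static final int ANYCONTENTS = 1;
    static final int VALUE = 2;
    static final int PREFIX = 3;

    private static final Map<Integer, String> NAMES = new HashMap<>();
    static {
        NAMES.put(ANYCONTENTS, "ANYCONTENTS");
        NAMES.put(VALUE, "VALUE");
        NAMES.put(PREFIX, "PREFIX");
    }

    NamedSymbol(int id, String matchedText) {
        super(id, matchedText); // the constructor that carries a value
    }

    @Override
    public String toString() {
        String name = NAMES.get(sym);
        // Fall back to the "#<code>" form for codes without a registered name.
        return name == null ? super.toString() : value + " : " + name;
    }
}

class NamedSymbolDemo {
    public static void main(String[] args) {
        // Prints the four lines from Denis's desired output.
        System.out.println(new NamedSymbol(NamedSymbol.ANYCONTENTS, "stringtype"));
        System.out.println(new NamedSymbol(NamedSymbol.VALUE, "id"));
        System.out.println(new NamedSymbol(NamedSymbol.PREFIX, "h"));
        System.out.println(new NamedSymbol(NamedSymbol.VALUE, "NAME"));
    }
}
```

With the real CUP runtime, the stand-in class would be dropped in favour of java_cup.runtime.Symbol, and the lexer actions would presumably construct the subclass with yytext() as the value. Carrying the text inside the symbol also sidesteps a pairing problem in the test loop above: Java evaluates sc.yytext() before sc.next_token() in that expression, so each printed text belongs to the previously matched token, which is why the first output line has an empty text.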