Re: [jflex-users] [Help] Misbehaving JFLex rules - wrong rule matched


Thank you, that solved it! In fact, the JFlex 1.4.* manual states that it doesn't really support characters outside the 16-bit range unless you define them as macros (http://jflex.de/manual.html#SECTION000101000000000000000). I suspect this means that a character range definition covering such characters is out of the question, unless I'm missing something.

Either way, I've documented that gotcha in my source code, so other developers at least have a starting point for debugging issues should they occur.

Thanks,
Ruslan

________________________________
From: Martin Walch <wal...@we...>
To: jfl...@li...; Ruslan Dimov <rus...@ya...> 
Sent: Friday, August 30, 2013 5:26 AM
Subject: Re: [jflex-users] [Help] Misbehaving JFLex rules - wrong rule matched

Hi,

> If you wouldn't mind, I'd rather point you to the question I posted today on
> StackOverflow:
> http://stackoverflow.com/questions/18520420/misbehaving-jflex-rules-wrong-rule-matched

I am not a JFlex expert, but I'll give it a shot.

Your code says:

> han   = [\u3400-\u9fff\uf900-\ufaff\u2f800-\u2fa1f]

My guess is that handling Unicode characters above 65535 (U+FFFF), such as 
the range \u2f800-\u2fa1f, is not that easy in JFlex.
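
One way to see why the wrong rule may end up matching: assuming the scanner generator reads exactly four hex digits after a backslash-u (an assumption about JFlex 1.4, not something confirmed in this thread), the five-digit escape in the han macro would be cut after the fourth digit. A minimal Java sketch of that reading follows; the class and the helper readFourDigitEscape are hypothetical and only mimic such a reader, they are not part of JFlex.

```java
public class EscapeWidth {
    // Hypothetical helper: consume exactly four hex digits after the
    // backslash and 'u', and leave everything after them untouched.
    static String readFourDigitEscape(String spec) {
        int codeUnit = Integer.parseInt(spec.substring(2, 6), 16);
        return (char) codeUnit + spec.substring(6);
    }

    public static void main(String[] args) {
        // The five-digit escape from the han macro, as literal text.
        String parsed = readFourDigitEscape("\\u2f800");
        System.out.println(parsed.length());                       // 2
        System.out.println(Integer.toHexString(parsed.charAt(0))); // 2f80
        System.out.println(parsed.charAt(1));                      // 0
    }
}
```

Under that assumption the tail of the character class would be read as U+2F80, then a range from the digit 0 up to U+2FA1, then the letter f, which covers far more than the intended CJK block.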

The manual states:

> %unicode 
> %16bit 
> Both options cause the generated scanner to use the full 16 bit Unicode
> input character set that Java supports natively (character code points
> 0-65535).

Maybe you can work around this by splitting each of those characters into 
two 16-bit units and matching them one after the other. You will probably 
need an additional scanner state for this.
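
As a rough illustration of the splitting idea, the Java sketch below computes the UTF-16 surrogate pairs for the two ends of the range U+2F800 to U+2FA1F and prints them in four-digit escape form. The class name SurrogateSplit and the helpers escape and splitRange are made up for this example; whether the resulting expression behaves as intended in a given JFlex version would still need to be tested.

```java
public class SurrogateSplit {
    // Format one UTF-16 code unit as a backslash-u escape with four hex digits.
    static String escape(char unit) {
        return String.format("\\u%04X", (int) unit);
    }

    // Rewrite a supplementary code point range as a high surrogate followed
    // by a class of low surrogates. Only handles ranges whose two ends share
    // the same high surrogate, which is the case for the range in question.
    static String splitRange(int first, int last) {
        char hiFirst = Character.highSurrogate(first);
        char hiLast = Character.highSurrogate(last);
        if (hiFirst != hiLast) {
            throw new IllegalArgumentException("range spans several high surrogates");
        }
        return escape(hiFirst)
                + "[" + escape(Character.lowSurrogate(first))
                + "-" + escape(Character.lowSurrogate(last)) + "]";
    }

    public static void main(String[] args) {
        System.out.println(splitRange(0x2F800, 0x2FA1F));
    }
}
```

For this range both ends share the high surrogate D87E, and the low surrogates run from DC00 to DE1F, so the program prints the high surrogate escape followed by a bracketed low surrogate range. That two-unit pattern could then stand in for the five-digit range in the macro.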

Regards
Martin Walch
-- 
