From: Peter M. <pet...@gm...> - 2012-07-18 15:29:29
Not nearly as thoroughly as you did. I confirmed that, with or without UTF-8 parsing enabled, ASCII-only input generates exactly the same tables, and I added tests to make sure that the Unicode escapes and the '.' work as designed. "As designed" is a very broad definition, though.

Take a look at https://github.com/PeterMartini/flex/commits/master. I made a lot of small commits, so you can see the changes I was making. The biggest structural change was replacing the CCL bitmap with a linked list of ranges, to speed up processing the full set of allowed values up to 0x10FFFF, and then adding a secondary step to convert those Unicode characters to an equivalent 8-bit pattern. My commits also track some housekeeping I did to get rid of warnings and get those C++ tests to pass, and I think some of that has been merged into the main branch now.

On Mon, Jul 16, 2012 at 5:03 PM, Paul <pa...@pr...> wrote:
> How completely have you tested your utf8 version of flex?
>
> Paul
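
For illustration, the range-list idea described in the message might look roughly like the following in C. This is a minimal sketch, not the code from the linked branch: every name here (`ccl_range`, `ccl_add`, `ccl_contains`, `utf8_encode`) is hypothetical, and the real second step presumably turns whole ranges into 8-bit patterns, whereas this sketch only encodes a single code point as UTF-8 bytes.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; these are not flex's actual structures. */
typedef struct ccl_range {
    uint32_t lo, hi;          /* inclusive code point bounds */
    struct ccl_range *next;   /* next range, sorted ascending by lo */
} ccl_range;

/* Add [lo, hi] to a sorted list, merging any ranges that overlap or
 * touch it. Returns the (possibly new) head of the list. */
ccl_range *ccl_add(ccl_range *head, uint32_t lo, uint32_t hi)
{
    assert(lo <= hi && hi <= 0x10FFFF);
    ccl_range **pp = &head;

    /* Skip ranges that end before lo and are not adjacent to it. */
    while (*pp && (*pp)->hi + 1 < lo)
        pp = &(*pp)->next;

    /* Absorb every range that overlaps or touches [lo, hi]. */
    while (*pp && (*pp)->lo <= hi + 1) {
        ccl_range *dead = *pp;
        if (dead->lo < lo) lo = dead->lo;
        if (dead->hi > hi) hi = dead->hi;
        *pp = dead->next;
        free(dead);
    }

    ccl_range *node = malloc(sizeof *node);
    if (!node)
        abort();
    node->lo = lo;
    node->hi = hi;
    node->next = *pp;
    *pp = node;
    return head;
}

/* Return 1 if cp falls inside any range of the sorted list, else 0. */
int ccl_contains(const ccl_range *r, uint32_t cp)
{
    for (; r && r->lo <= cp; r = r->next)
        if (cp <= r->hi)
            return 1;
    return 0;
}

/* Encode one code point as UTF-8. Returns the byte count (1 to 4),
 * or 0 for surrogates and values above 0x10FFFF. */
int utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

The appeal of a range list over a bitmap is that a class such as "everything except newline" costs two nodes rather than a bitmap with more than a million bits, and adjacent additions like [a-f] followed by [g-z] collapse into a single node.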