From: Paul <pa...@pr...> - 2012-07-19 22:55:41
Have the barest bones of an integration of flex-cvs, flex utf-16, &
flex utf-8. Passes 106 of 107 tests. The last is a problem with ccl
lists in utf16-mode. Besides this fix it needs many more tests.

I'm wondering if the change from the ccl bit map to lists is ok in
general in all modes. I'm using it for the tests and there is no
perceptible speed difference. Having a bit map for non utf-8 and a
list for it would be a pain.

Paul

On 12-07-19 05:25 AM, Peter Martini wrote:
> I'd be thrilled, actually. I started trying to integrate the changes
> that flex had gone through and the patches you were sending into my
> branch directly, and ended up deferring the work each time when I got
> intimidated by the scale of the diffs :-)
>
> Hopefully the commits in my branch are discrete enough to be useful.
>
> On Thu, Jul 19, 2012 at 5:21 AM, Paul <pa...@pr...> wrote:
>
>     Would you mind if I took your changes, made them conditional on
>     the utf8 flag and integrated them into the unicode16 version.
>     I would also like to add more tests.
>
>     Paul
>
>     On 12-07-18 11:29 AM, Peter Martini wrote:
>>     Not nearly as thoroughly as you did. I confirmed that with or
>>     without UTF-8 parsing enabled, if its only ASCII text, the same
>>     exact tables are generated, and I added tests to make sure that
>>     the unicode escapes and the '.' work as designed. As designed is
>>     a very broad definition though.
>>
>>     Take a look at
>>     https://github.com/PeterMartini/flex/commits/master - I made a
>>     lot of small commits, so you can see the changes I was making.
>>     The biggest change structurally was I changed the CCL bitmap to
>>     be a linked list of ranges, to speed up processing the 0x10 FFFF
>>     allowed values, and then added a secondary step to convert those
>>     unicode characters to an equivalent 8-bit pattern.
>>
>>     But my commits also track some housekeeping I did to get rid of
>>     warnings and get those C++ tests to pass, and I think some of
>>     that has been merged into the main branch now.
>>
>>     On Mon, Jul 16, 2012 at 5:03 PM, Paul <pa...@pr...> wrote:
>>
>>         How completely have you tested your utf8 version of flex?
>>
>>         Paul
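
For readers following the ccl discussion above, here is a rough sketch in C of what "a linked list of ranges" in place of a bitmap could look like, plus the kind of secondary step that turns a code point into 8-bit (UTF-8) units. It is not taken from either branch: the names ccl_range, ccl_add_range, ccl_contains and utf8_encode are hypothetical, and the sorted, merged, inclusive-range layout is an assumption made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical types and names; this is NOT flex's actual ccl code. */
typedef struct ccl_range {
    uint32_t lo, hi;          /* inclusive code point range */
    struct ccl_range *next;   /* sorted by lo, ranges never overlap or touch */
} ccl_range;

/* Add [lo, hi] to the list, merging overlapping or adjacent ranges.
 * Returns the new head, or NULL on allocation failure (list unchanged). */
ccl_range *ccl_add_range(ccl_range *head, uint32_t lo, uint32_t hi)
{
    assert(lo <= hi && hi <= 0x10FFFF);
    ccl_range *node = malloc(sizeof *node);
    if (!node)
        return NULL;

    ccl_range **link = &head;
    /* Skip ranges that end before lo and are not adjacent to it. */
    while (*link && (*link)->hi + 1 < lo)
        link = &(*link)->next;
    /* Absorb every range that overlaps or touches [lo, hi]. */
    while (*link && (*link)->lo <= hi + 1) {
        ccl_range *dead = *link;
        if (dead->lo < lo) lo = dead->lo;
        if (dead->hi > hi) hi = dead->hi;
        *link = dead->next;
        free(dead);
    }
    node->lo = lo;
    node->hi = hi;
    node->next = *link;
    *link = node;
    return head;
}

/* Membership test: walk the sorted list until a range could contain cp. */
int ccl_contains(const ccl_range *head, uint32_t cp)
{
    for (; head && head->lo <= cp; head = head->next)
        if (cp <= head->hi)
            return 1;
    return 0;
}

/* Encode one code point as UTF-8. Returns the number of bytes written
 * (1 to 4), or 0 for surrogates and values above 0x10FFFF. */
int utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

With a layout like this, a class such as [A-Z] plus a large CJK block is two list nodes, where a bitmap covering every value up to 0x10FFFF would need roughly 136 KB per class. Membership is a linear walk over the ranges, which fits the observation above that small classes show no perceptible speed difference.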