From: Eric B. <er...@go...> - 2008-06-29 21:03:06
Ted wrote:
> I read somewhere Gobo Regexp doesn't support Unicode yet.
> After some investigation. I found some minor changes would make it
> support Unicode (See following patch).
>
> Can someone confirm? And merge it into the library if possible.

I just committed your modifications in SVN. However, in addition to
Colin's remark about case-insensitivity, I also noticed that character
classes (e.g. "[a-z]") will not work if they contain characters with
code greater than 255.

For your information, this regexp library in Gobo was born out of a
translation to Eiffel of the code of the PCRE C library. The version
of the C library was 3.9 if I remember correctly, and it was at a very
early stage of its support for Unicode. The version number today is
7.7 and they claim to support Unicode now, supposedly with a regexp
pattern syntax and behavior compatible with Perl's regexp. It might be
worth looking at the PCRE C library:

   http://www.pcre.org/

in order to implement Unicode support in a way that does not depart
too much from the original PCRE library and that at the same time
saves us from having to reinvent the wheel.

I suggest that we move this discussion to the gobo-eiffel-develop
mailing list.

--
Eric Bezault
mailto:er...@go...
http://www.gobosoft.com
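
To illustrate the character class limitation mentioned above, here is a
minimal sketch in Python. It assumes that a class such as "[a-z]" is
stored as a fixed table with one bit per character code from 0 to 255,
which is a common layout in byte-oriented regexp engines. This is an
assumption made for illustration only, not the actual Gobo or PCRE code,
and the names make_class and in_class are hypothetical.

```python
# Hypothetical sketch: a character class stored as a 256-bit map,
# one bit per character code 0..255. Not Gobo's or PCRE's actual code.

CLASS_BITS = 256  # assumed size of the class table


def make_class(ranges):
    """Build a bitmap from (low, high) code ranges.

    Codes of 256 and above have no slot in the table and are dropped.
    """
    bitmap = bytearray(CLASS_BITS // 8)
    for low, high in ranges:
        for code in range(low, min(high, CLASS_BITS - 1) + 1):
            bitmap[code // 8] |= 1 << (code % 8)
    return bitmap


def in_class(bitmap, char):
    """Test membership; a code outside the table can never match."""
    code = ord(char)
    if code >= CLASS_BITS:
        return False
    return bool(bitmap[code // 8] & (1 << (code % 8)))


latin = make_class([(ord("a"), ord("z"))])   # "[a-z]"
greek = make_class([(0x03B1, 0x03C9)])       # Greek alpha to omega
```

With such a table, in_class(latin, "m") is true, but the Greek class
ends up empty, so no Greek letter can ever match it. Supporting codes
above 255 would need some additional representation, for example a list
of code ranges kept next to the bitmap.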