From: Peter J. <pj...@wa...> - 2003-08-26 16:53:53
Hi Nickolay,

> AFAIU, uppercasing is locale-insensitive operation. Case (none, uppercase,
> lowercase or titlecase) is the property of character and may be changed
> appropriately. Files UnicodeData.txt, SpecialCasing.txt and CaseFolding.txt
> from Unicode standard clearly specify case conversion rules. This means that
> default case conversion may be safely added to all Firebird character sets
> as they are all have direct UNICODE mappings.

Generally speaking I agree with you, and changing the uppercasing
to match the default non-normative UNICODE rules will be the next step
after code cleanup. Wait some minutes, ...err... days to see this happen.

But...

1a. There are a few cases of locale-sensitive uppercasing, involving
letters like LATIN SMALL LETTER DOTLESS I and LATIN CAPITAL LETTER I
WITH DOT ABOVE.

1b. There are rumours about locales where accents should go away
when uppercasing, e.g. "French Traditional" if I guess right.

2. What about deployed databases with column constraints "s = UPPER(s)"
that unfortunately have some "ä" inside? Is it fair to go south on
such data?

3. You cannot (now) localize UNICODE_FSS; see such gems as

    case ttype_none:
    case ttype_ascii:
    case ttype_unicode_fss:
        dest = src;
        while (len--) {
            *dest++ = UPPER7(*src);
            src++;
        }
        break;

(in jrd/intl.cpp)

Regards,
Peter Jacobi
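
To make points 2 and 3 concrete, here is a minimal sketch in C of what a 7-bit uppercasing loop does to a Latin-1 string containing "ä" (byte 0xE4). The UPPER7 definition below is an assumption modelled on the macro's name, not a copy of the Firebird source, and upper7_buffer and satisfies_upper_constraint are hypothetical helpers introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed definition: uppercase only the ASCII letters a-z and leave
   every other byte value unchanged. */
#define UPPER7(ch) (((ch) >= 'a' && (ch) <= 'z') ? ((ch) - 'a' + 'A') : (ch))

/* Hypothetical helper shaped like the loop quoted from jrd/intl.cpp,
   but writing into a separate destination buffer. */
static void upper7_buffer(unsigned char *dest, const unsigned char *src,
                          size_t len)
{
    while (len--) {
        *dest++ = (unsigned char) UPPER7(*src);
        src++;
    }
}

/* Hypothetical helper: returns 1 when s equals its own 7-bit uppercase
   form, i.e. when a constraint like s = UPPER(s) would hold. */
static int satisfies_upper_constraint(const unsigned char *s, size_t len)
{
    unsigned char buf[64];

    assert(len <= sizeof buf);   /* sketch only: small fixed buffer */
    upper7_buffer(buf, s, len);
    return memcmp(buf, s, len) == 0;
}
```

Under these assumptions the Latin-1 bytes 'K', 0xE4, 'S', 'E' pass the check, because the lowercase "ä" is never touched. If UPPER() started mapping 0xE4 to 0xC4 ("Ä"), the same stored value would no longer equal its uppercase form, which is the situation point 2 worries about.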