From: Jeff D. <da...@da...> - 2002-10-28 21:29:11
> Do you think it would cause any problems to just commit this change of
> FieldSeparator to '\xff'?

Yes it would. '\xff' is a printing character in the ISO-8859-x encodings. (In iso-8859-1 it's a "latin small letter y with diaeresis".)

> I'm going to leave it this way on my Wiki for a while to see how it
> works out as a permanent change. If this is not a good idea, maybe add
> a check in config.php for CHARSET==utf-8 and set FieldSeparator
> accordingly?

That would be okay, I think. Switching to one of the non-used ASCII control characters (in the 0x01 - 0x1f range) would probably be okay, too (as long as care is taken to strip that character from form input, etc...)

If you want to do anything, it's probably safer to stick with our current setting except for UTF-8.

But, note that in the current CVS code, lib/transform.php is no longer used. (Though the code is still there.) Both old and new markup get run through the new markup engines (old markup goes through a pre-processor to hack it up into new markup...) IIRC, the new markup code doesn't use any magic marker characters (FieldSeparator), so the issue is mostly moot.

> So far the only (minor) problems I see running PhpWiki CVS in utf-8 are
> the field names on the DebugInfo page and character translation problem
> on the Sign In page.

Since PhpWiki was never designed with multi-byte character encodings in mind (really it's only been well(?) tested under iso-8859-1), I suspect that numerous small problems will show up (most, probably, could be easily fixed).

There may be less minor problems with the searching functionality. PHP regexps are not unicode aware... (does the latest PHP have unicode support yet?) MySQL knows nothing about unicode, so any pattern/string matching done in MySQL queries is problematic.
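To make the separator trade-off above concrete, here is a rough sketch in Python (not PhpWiki code, and not PHP). It works on byte strings, since PHP strings are byte strings. All function names are hypothetical, and the "current" separator is passed in as a parameter because its actual value is not stated in this thread.

```python
# Illustrative sketch only; every name below is hypothetical.


def is_printing_in_latin1(byte_value: int) -> bool:
    """True if the byte decodes to a printable character in ISO-8859-1."""
    return bytes([byte_value]).decode("iso-8859-1").isprintable()


def can_occur_in_utf8(byte_value: int) -> bool:
    """Bytes 0xC0, 0xC1 and 0xF5-0xFF never appear in well-formed UTF-8."""
    return not (byte_value in (0xC0, 0xC1) or byte_value >= 0xF5)


def pick_field_separator(charset: str, current: bytes) -> bytes:
    """Keep the current separator unless the charset is UTF-8.

    Mirrors the suggested config.php check on CHARSET; `current` stands in
    for whatever the existing FieldSeparator setting is (an assumption here).
    """
    if charset.lower() in ("utf-8", "utf8"):
        return b"\xff"
    return current


def strip_separator(form_input: bytes, separator: bytes) -> bytes:
    """Remove the separator byte from user-supplied input."""
    return form_input.replace(separator, b"")


if __name__ == "__main__":
    # 0xFF is a real, printable letter in ISO-8859-1 ...
    print(repr(bytes([0xFF]).decode("iso-8859-1")))
    print(is_printing_in_latin1(0xFF))
    # ... but it can never be part of valid UTF-8 text.
    print(can_occur_in_utf8(0xFF))
    # The same letter encoded as UTF-8 does not contain the 0xFF byte.
    print(b"\xff" in "\u00ff".encode("utf-8"))
```

The point of the sketch is that 0xFF is safe as a marker only when the page data is UTF-8. Under ISO-8859-x it collides with a real letter, so a control character that is stripped from form input is the safer choice there.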