Re: [Phpwiki-talk] Re: phpWiki and utf-8/double-byte

On Monday, October 28, 2002, at 04:29 pm, Jeff Dairiki wrote:

> IIRC, the new markup code doesn't use any magic marker characters
> (FieldSeparator), so the issue is mostly moot.

Hehe, I didn't realize this, thanks for pointing it out. So I changed 
that magic marker back and of course it has no effect and UTF-8 is 
still working in the latest CVS. ^^

> There may be less minor problems with the searching functionality.
> PHP regexps are not unicode aware... (does the latest PHP have
> unicode support yet?)  MySQL knows nothing about unicode, so any
> pattern/string matching done in MySQL queries is problematic.

Yes the regexp stuff definitely could be a problem. I'll look at the 
PHP website to see how the mb functions are coming along.
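
To illustrate what "not unicode aware" means in practice, here is a 
rough sketch. It is written in Python rather than PHP, purely as a 
stand-in: a byte-oriented regexp sees each UTF-8 byte as a separate 
"character", while a unicode-aware one sees whole characters. The 
helper name count_dot_matches is made up for this example and has 
nothing to do with PhpWiki's code.

```python
import re

word = "日本語"              # three characters
raw = word.encode("utf-8")   # the same text as UTF-8 bytes


def count_dot_matches(pattern, subject):
    """Count how many times `pattern` matches in `subject`."""
    return len(re.findall(pattern, subject))


# Unicode-aware matching: "." consumes one character at a time.
chars_seen = count_dot_matches(r".", word)

# Byte-oriented matching: "." consumes one byte at a time, so each
# Japanese character is split into several separate matches.
bytes_seen = count_dot_matches(rb".", raw)
```

A literal substring search can still work on raw bytes, which may be 
why plain searches appear fine, but character classes, case folding 
and "any single character" patterns are where byte-wise matching is 
likely to go wrong.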

Preliminary informal tests so far (read: goofing around) with utf-8 
show that searching for Japanese words works, and surprisingly the 
search-term highlighting in a FullText search looks just fine too.

Logins with utf-8 Japanese text don't work, probably due to 
bumpywords checking. In diffs, the line prefixes for unchanged lines 
and the line endings show a garbage character, but this might just be 
my browser. Otherwise the only issue I noticed is that square brackets 
must be used to link Japanese text. This is expected because Japanese 
has no BumpyWords, so it's not really a problem.
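
A minimal sketch of why the brackets are needed, again in Python as a 
stand-in and not PhpWiki's actual code. The two patterns and the 
find_links helper are assumptions made up for illustration: 
auto-linking that depends on upper/lower case letters can never fire 
on Japanese text, while explicit bracket links don't care what is 
inside them.

```python
import re

# Hypothetical patterns for illustration only; PhpWiki's real ones differ.
BUMPY_WORD = re.compile(r"\b(?:[A-Z][a-z]+){2,}\b")
BRACKET_LINK = re.compile(r"\[([^\[\]]+)\]")


def find_links(text):
    """Return link targets found via BumpyWords or [bracket] syntax."""
    bracketed = BRACKET_LINK.findall(text)
    # Blank out bracketed spans first so their contents are not
    # picked up a second time by the BumpyWord pattern.
    remainder = BRACKET_LINK.sub(" ", text)
    bumpy = BUMPY_WORD.findall(remainder)
    return bumpy + bracketed
```

For example, find_links("See RecentChanges and [日本語]") picks up 
both RecentChanges and 日本語, whereas the same Japanese word without 
brackets yields nothing.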

I don't want to jump to any conclusions because I can't really read 
Japanese at all. I'm interested in how well this works for other people 
using the latest CVS PhpWiki. Change your CHARSET to utf-8 in index.php 
so your browser knows which charset to use. Try pasting some text in 
from http://www.yahoo.co.jp/ or something and linkify a few words with 
[ ].

Carsten