From: Matthew M. <ma...@tu...> - 2006-12-11 14:32:58
Thank you for testing. Could you please send me a database dump so I can
test locally?

Matt

On Sat, 2006-12-09 at 13:17 +0100, Yves Kuendig wrote:
> Hi Matt
>
> I'm not familiar enough with the fallout code to present a real fix, but I
> played around a bit with the conversion and the UTF-8 problem.
>
> The answers I found (maybe not complete) are the following:
> The database table's charset and collation did not affect my results.
> Your conversion script worked almost fine. The characters were stored in
> the database whether the table used the latin1_german2_ci or the
> utf8_unicode_ci collation.
> But the text displayed on the web page was garbage.
> I found that the page was displayed OK if I changed the browser to
> ISO-8859-1 encoding?!
> So I started to investigate the connection and the output (layout).
>
> Things I tried:
> I sent "SET NAMES utf8" to MySQL directly after 'connect', in
> convert/class/Convert.php and also in pear/DB/mysql.php.
> This is probably not needed (but not yet investigated).
> I added this line:
>     $text = utf8_encode($text); //yok
> at line 66 in layout/class/Layout.php, before:
>     Layout::_loadBox($text, $module, $content_var);
>
> Thereafter the output on the website (only webpage tested) was fine.
>
> But PhpWebSite is still not fixed. If I now enter text inside phpws
> (webpages), the output is garbage. The database content shows &auml;
> instead of the real lowercase a-umlaut (ä). This makes me believe the
> content is HTML-encoded and not UTF-8.
>
> Conclusion:
> The database does not seem to be the problem if MySQL is newer than 4.1.x.
> But the MySQL server has to know which encoding is used on the client side.
> It handles the tables and their collation whatever the setup. This is IMHO
> good news, because lots of users out there get a preconfigured DB from
> their host and are not able to change the charset or the collation.
> Since the server knows the client encoding, it translates the DB content
> to it (e.g. UTF-8).
> But what we have to do now is make sure that all incoming and outgoing
> content inside phpws is also correctly encoded as UTF-8.
>
> Maybe we have to use a function to check whether the content is already
> encoded?!
> E.g. (not mine, from somewhere on the net):
>
> /**
>  * Checks if a string is UTF-8 encoded
>  * @param string $string string to check
>  * @return boolean
>  */
> function is_utf8($string)
> {
>     // preg_match() returns 1, 0 or false; compare so we return a boolean
>     return preg_match('%^(?:
>           [\x09\x0A\x0D\x20-\x7E]            # ASCII
>         | [\xC2-\xDF][\x80-\xBF]             # non-overlong 2-byte
>         | \xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
>         | [\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
>         | \xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
>         | \xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
>         | [\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
>         | \xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
>     )*$%xs', $string) === 1;
> }
>
> and convert if not:
>
> /**
>  * Encodes a string to UTF-8
>  * @param string $string
>  * @return string
>  */
> function cms_utf8_encode($string)
> {
>     if (is_utf8($string))
>     {
>         return $string;
>     } else {
>         if (function_exists('mb_convert_encoding'))
>         {
>             // Name the source encoding explicitly; ISO-8859-1 matches
>             // what utf8_encode() assumes in the fallback below.
>             return mb_convert_encoding($string, 'UTF-8', 'ISO-8859-1');
>         } else {
>             return utf8_encode($string);
>         }
>     }
> }
>
> We should also double-check the headers and meta tags of the output:
> e.g.: <meta http-equiv="content-type"
>       content="application/xhtml+xml;charset=utf-8" />
> e.g.: header('content-type: text/html; charset=utf-8');
> and maybe also in CSS???
> e.g.: @charset "utf-8";
>
> And last but not least: to work with forms, the charset should be defined:
> <form accept-charset="utf-8" method= ...>
>
> All this is 'only' some kind of brainstorming. But maybe it is the
> direction to go to handle different languages, charsets and encodings...
>
> Regards
> Yves
>
> _______________________________________________
> Phpwebsite-developers mailing list
> Php...@li...
> https://lists.sourceforge.net/lists/listinfo/phpwebsite-developers

--
Matthew McNaney
Electronic Student Services
Appalachian State University
http://phpwebsite.appstate.edu
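
The check-then-convert idea in the quoted message can be illustrated with a small, self-contained sketch. It is not phpWebSite code: the helper name ensure_utf8() is hypothetical, it assumes the mbstring extension is loaded, and it assumes that any input which is not valid UTF-8 is ISO-8859-1. The last line shows why an unconditional conversion (like the utf8_encode() call tried in Layout.php) is risky: text that is already UTF-8 gets encoded a second time.

```php
<?php
// Hypothetical helper (not part of phpWebSite): return $text as UTF-8.
// Assumption: anything that is not already valid UTF-8 is ISO-8859-1.
function ensure_utf8(string $text, string $assumedSource = 'ISO-8859-1'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text; // already valid UTF-8, leave untouched
    }
    return mb_convert_encoding($text, 'UTF-8', $assumedSource);
}

$latin1 = "\xE4";     // lowercase a-umlaut as a single ISO-8859-1 byte
$utf8   = "\xC3\xA4"; // the same character as two UTF-8 bytes

echo bin2hex(ensure_utf8($latin1)), "\n"; // c3a4 (converted)
echo bin2hex(ensure_utf8($utf8)), "\n";   // c3a4 (unchanged)

// Converting unconditionally double-encodes text that is already UTF-8:
echo bin2hex(mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1')), "\n"; // c383c2a4
```

The validity check is only a heuristic: some ISO-8859-1 byte sequences also happen to be valid UTF-8, so knowing the real source encoding (connection charset, form accept-charset, HTTP header) is more reliable than guessing from the bytes.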