Re: [Plog-general] i18n and character encoding in pLog
From: Oscar R. <os...@re...> - 2003-08-06 10:47:21
Hi!

First of all, I think that I was worried too early and that I should have
done some more tests. The first thing I tried was the attached PHP script, to
see how it would work... And then I noticed that the only way I could
properly display accented characters, for example, was to utf8_encode them.
My first idea was that we would have to automatically encode everything
coming from the database. But hey, I was wrong ;) I just tried to add a new
post to the database using the same characters and it worked fine. The thing
I find most surprising is that somewhere on the way between the browser and
the PHP server, those characters got "transformed" into Unicode, since that
is what I can see in the database if I do a query. Excuse my ignorance
regarding these topics, but is it so that the browser automatically converts
the characters in the HTML form? Because if so, then there is not much we
have to do! :) But please correct me if I am wrong...

Your pointers were also very helpful. Now I understand what Unicode is and
what UTF-8, UTF-16 and so on are... I will also have a look at how we could
convert the language files to Unicode. If anybody knows of a good Unicode
editor for Linux, let me know ;)

I have already found somebody who can translate the texts to Russian.
Actually, I didn't have to go that far to get that person: my girlfriend :D
She's working on them and I hope I can get them today or tomorrow!

Thanks for your help and always valuable information.

Regards,

Oscar.

> I have to take a look at how smarty handles output, will take a look at
> it tonight (GMT+7).
> In my experience, we don't really have to deal with any special encoding
> methods via PHP, as long as the user is using a browser that can handle
> charset selection. As long as the browser is capable (and I can't think of
> any current ones since 1998 that are not, well, maybe for folks still
> using NS4.7), Unicode should display the internal gibberish as readable
> characters. I will also dig into phpbb2 since they have this issue down
> pat already.

The thing is that if I use any of my browsers (Firebird 0.6, Mozilla 1.3, and
Konqueror/khtml 3.1.2) to type in Spanish texts, what I get in the database
needs to be encoded to UTF-8 if I want to see it correctly.

> One thing we might want to look at is to convert the text file formats
> to U8-Unix. Since I'm working on a wintel platform, I use UltraEdit as
> my coding and text file handling program, save the text file in utf8
> unix format, and upload it to my hosting server running Red Hat. I haven't
> encountered any situation at all where the output comes out as gibberish,
> other than when inputting Chinese characters using the Big5 charset and
> forgetting to convert them to UTF-8.
>
> Going to install the latest release and do some testing and will suggest
> what can be done with the accented characters issue. Do we have anyone
> who knows Russian? We should test out the Russian characters as well, since
> my understanding is that Spanish, German and French's only big
> difference in characters with English is the use of many accented
> characters, while Russian, Chinese and other double byte languages have
> massive differences.
>
> FYI I use Mozilla 1.5a, Firebird 0.6.1 and IE6 as my testing browsers.
> No access to Konqueror until I buy a PowerBook this xmas and can use
> Safari as well for khtml based browsers to test.
>
> PS Oscar did you get the list of URLs I sent in my email.
> There is some stuff there that might help, but I did not dig through it
> carefully since I was digging out those URLs over my lunch break.
>
> > -----Original Message-----
> > From: plo...@li...
> > [mailto:plo...@li...] On Behalf
> > Of Oscar Renalias
> > Sent: Wednesday, August 06, 2003 2:43 AM
> > To: plo...@li...
> > Subject: [Plog-general] i18n and character encoding in pLog
> >
> >
> > Hi all (but especially Warren)
> >
> > I have a question when it comes to enabling pLog to use UTF-8 as the
> > basic character encoding... I was doing some tests with PHP and the
> > problem is that... let's see if I can explain it O:)
> >
> > Internally the application works in iso-8859-1, which is the default
> > character encoding used by PHP. The thing is that if we set the
> > character set of the HTML template to utf-8 by doing something like:
> >
> > <meta http-equiv="content-type" content="text/html; charset=utf-8">
> >
> > The Chinese texts work fine... but how about, let's say, Spanish texts?
> > I did some tests with the string "áéíóúàèìòù" and if the charset of the
> > template was set to utf-8, I could only get all the characters right
> > when doing:
> >
> > print(utf8_encode("áéíóúàèìòù"));
> >
> > So, I basically had to manually encode the characters, because they were
> > stored in a different character encoding. One thing that could be done
> > is to automatically encode anything that is going to be output to the
> > user, but what happens if the text has already been encoded? If I run
> > utf8_encode on a Chinese string that is already in UTF-8, I get
> > gibberish. How does it work in, for example, a Chinese browser? In which
> > encoding are the texts in HTML forms sent to pLog? What would pLog
> > receive in the request if a user adds a new post in Chinese? Would we
> > need to utf8_encode that text again or not?
> >
> > I am a bit confused with these things, sorry if it doesn't make so much
> > sense but some help would be appreciated!
> >
> > Regards,
> >
> > Oscar.
> >
> >
> > -------------------------------------------------------
> > This SF.Net email sponsored by: Free pre-built ASP.NET sites
> > including Data Reports, E-commerce, Portals, and Forums are
> > available now. Download today and enter to win an XBOX or
> > Visual Studio .NET.
> > http://aspnet.click-url.com/go/psa00100003ave/direct;at.aspnet_072303_01/01
> _______________________________________________
> Plog-general mailing list
> Plo...@li...
> https://lists.sourceforge.net/lists/listinfo/plog-general
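
The double-encoding question raised in this thread (what happens when utf8_encode is run on text that is already UTF-8) can be illustrated with a small sketch. It is written in Python rather than PHP purely for illustration. `latin1_to_utf8` is a hypothetical stand-in that mimics what PHP's `utf8_encode` does (read each input byte as an ISO-8859-1 character and write it out as UTF-8), and `to_utf8_once` is a hypothetical guard, not something pLog is known to contain. The guard assumes the input is either valid UTF-8 or ISO-8859-1, which is a heuristic: a Latin-1 byte sequence that happens to be valid UTF-8 would be left untouched.

```python
def latin1_to_utf8(data: bytes) -> bytes:
    """Mimic PHP's utf8_encode: read bytes as ISO-8859-1, write UTF-8."""
    return data.decode("iso-8859-1").encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def to_utf8_once(data: bytes) -> bytes:
    """Hypothetical guard: convert only if the bytes are not already UTF-8."""
    return data if is_valid_utf8(data) else latin1_to_utf8(data)


# The Spanish test string from the thread, as stored in ISO-8859-1.
spanish_latin1 = "áéíóúàèìòù".encode("iso-8859-1")

# A Chinese string that already arrived as UTF-8 (example text is an assumption).
chinese_utf8 = "中文".encode("utf-8")

# Encoding the already-UTF-8 bytes a second time yields different bytes,
# which a UTF-8 page renders as gibberish.
double_encoded = latin1_to_utf8(chinese_utf8)

# The guard converts the Latin-1 text and leaves the UTF-8 text alone.
spanish_fixed = to_utf8_once(spanish_latin1)
chinese_fixed = to_utf8_once(chinese_utf8)
```

Browsers generally submit form data in the encoding of the page that contains the form, which would explain why posts entered through a UTF-8 template reach the database already in UTF-8 and must not be encoded again, while strings written into ISO-8859-1 source or language files still need one conversion.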