From: Boris Z. <bz...@2b...> - 2006-01-08 22:18:43
Hi All,

sorry, I'm somewhat late here. For some unknown reason I did not read the list
for a while.

The answer is that a browser is free to change the charset of the data it
sends back, even if you know the charset you used to send the form. There is
an attribute on the form tag (accept-charset) to hint a charset to the
browser, but in my tests it did not work.

The only solution that always worked for me is to add a hidden field to the
form with a character or word that is encoded differently in UTF-8 and in your
preferred charset(s). In my case I use UTF-8 and Latin-1.

Then look at the length or the value of the string in your hidden field. If
that string is in UTF-8, all other form values are also in UTF-8. That's the
whole trick.

And best of all, to always do this on the fly, just subclass
Apache::Request::PageKit and add request_class = "MyCharsetFunPackage" to
your config.

On Wednesday, 23 November 2005 14:23, Erik Günther wrote:
> Hi
>
> I have played with PageKit for some time now, and now I would like to be
> able to have a site that uses UTF-8 internally. But how do I do that? The
> easy part is to have all files in UTF-8, save to the DB in UTF-8 and so
> on. But PageKit is smart and sends the page in the encoding the browser
> prefers. That is not a problem. But how do I handle the input from a
> form?
>
> I mean, how do I know what character encoding the web browser is sending?
> I can't trust the outgoing encoding because that is trivial to change in
> any browser. AFAIK there is no certain way to tell the encoding by just
> looking at the string.
>
> What are you doing to fix this? On my previous site I "converted"
> everything to Latin-1, but that was just an ugly hack. utf8::is_utf8() and
> Encode::is_utf8() won't help; they return false for every string passed by
> Apache. :/
>
> One way is to block PageKit and send everything in UTF-8, because most
> often the browser will send the reply in UTF-8... but that solution
> isn't bulletproof. The user can still send in e.g. Latin-1, or the browser
> does not handle UTF-8 (rare).
>
> Any ideas?

--
Boris
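
For illustration, the hidden-field check described in the reply could be sketched roughly as follows. This is a language-neutral sketch in Python, not PageKit code. The field name charset_probe, the sentinel character "ü", and the helpers detect_charset and decode_form are all assumptions made for this example. It also assumes the raw, undecoded bytes of each form value are available.

```python
# Sentinel placed in a hidden form field (hypothetical choice): "ü" (U+00FC)
# is one byte in Latin-1 (0xFC) but two bytes in UTF-8 (0xC3 0xBC).
SENTINEL = "\u00fc"

# Charsets the site is prepared to accept, as in the reply (UTF-8 and Latin-1).
CANDIDATES = ("utf-8", "iso-8859-1")


def detect_charset(raw_sentinel):
    """Return the candidate charset whose encoding of SENTINEL matches
    the raw bytes received in the hidden field, or None if none match."""
    for charset in CANDIDATES:
        if raw_sentinel == SENTINEL.encode(charset):
            return charset
    return None


def decode_form(raw_fields, sentinel_field="charset_probe"):
    """Decode every raw form value using the charset revealed by the
    hidden sentinel field. The field name is an assumption."""
    charset = detect_charset(raw_fields.get(sentinel_field, b""))
    if charset is None:
        raise ValueError("sentinel field missing or in an unexpected charset")
    return {
        name: value.decode(charset)
        for name, value in raw_fields.items()
        if name != sentinel_field
    }
```

In a PageKit setup the same comparison would presumably live in the Apache::Request::PageKit subclass mentioned in the reply, so that every request is decoded before the application sees the parameters.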