Re: [Simpleweb-Support] What encoding of String that Request.getParameter() return?
From: Martin N. <mar...@gm...> - 2005-12-07 14:32:24
On 12/7/05, Carfield Yim <car...@gm...> wrote:
> > I don't quite understand this. Java Strings are charset-neutral as far
> > as I know (and always stored internally as UTF-16), so there is no
> > need to "convert" a string to anything, as the string does not retain
> > charset information. The only conversion is done at the Input/Output
> > stream level or when encoding to other formats (such as URLEncoder).
>
> In some servlet containers (Tomcat and Jetty, as far as I know), if you
> submit multibyte characters in an HTML form, the container incorrectly
> assumes that the data is "ISO-8859-1" and returns an incorrectly decoded
> string from the request.getParameter() method. In order to get back the
> correct string, I need to do the above.

I see how it works now. Anyway, might it have something to do with the
fact that SimpleWeb actually tries to decode query parameters submitted
in UTF-8, as I see the ParameterParser and URIParse classes do?

I would think this causes a problem if the input has already been decoded
as UTF-8: when you call inString.getBytes("ISO-8859-1") you get the
ISO-8859-1 representation of the characters (for example, the bytes of the
string "Björnbär", which is already correct), and that gets messed up when
you try to interpret those bytes as UTF-8 again (which will result in
"Bj?rnb?").

/Martin
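
To make the two cases concrete, here is a small sketch of what I think is going on. It is not SimpleWeb or servlet container code: the class and method names (CharsetRoundTrip, misdecode, recover) are made up for illustration, and I am assuming that "the above" workaround is the usual new String(s.getBytes("ISO-8859-1"), "UTF-8") idiom. It uses StandardCharsets, so it needs a newer JDK than the ones in common use today.

```java
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Hypothetical stand-in for a container that wrongly decodes
    // UTF-8 form bytes as ISO-8859-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // The assumed workaround: get the raw bytes back via ISO-8859-1
    // (lossless for every byte value) and decode them as UTF-8.
    static String recover(String s) {
        byte[] raw = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Bj\u00f6rnb\u00e4r"; // "Björnbär"

        // Case 1: the string really was mis-decoded, so the workaround helps.
        String garbled = misdecode(original);
        System.out.println(recover(garbled).equals(original));

        // Case 2: the string was already decoded correctly. The ISO-8859-1
        // bytes 0xF6 and 0xE4 are not valid UTF-8 here, so the decoder
        // substitutes replacement characters and the text is damaged.
        String damaged = recover(original);
        System.out.println(damaged.equals(original));
    }
}
```

If that assumption holds, the first line printed is true and the second is false, which matches the behaviour described above: the workaround is only safe when the parameter really was decoded as ISO-8859-1 in the first place.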