Re: [Simpleweb-Support] What encoding of String that Request.getParameter() return?

On 12/7/05, Carfield Yim <car...@gm...> wrote:
> >
> > I don't quite understand this. Java Strings are charset-neutral as far
> > as I know (and always stored internally as UTF-16), so there is no
> > need to "convert" a string to anything, as the string does not retain
> > charset information. The only conversion is done at the Input/Output
> > stream level or when encoding to other formats (such as URLEncoder).
> >
> In some servlet containers (Tomcat and Jetty, as far as I know), if you
> submit multibyte characters in an HTML form, the container incorrectly
> assumes the data is "ISO-8859-1" and returns an incorrectly decoded
> string from the request.getParameter() method. In order to get back the
> correct string, I need to do the above.

I see how it works now. Could it have something to do with the
fact that SimpleWeb actually tries to decode query parameters
submitted in UTF-8, as I see the ParameterParser and URIParse classes
do? I would think this causes a problem if the input has already been
decoded as UTF-8: when you do inString.getBytes("ISO-8859-1") you get
the ISO-8859-1 bytes of the characters (for example, those of the string
"Björnbär", which is already correct), and these get mangled when you
try to decode them as UTF-8 again (which results in "Bj?rnb?").
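
To make the two cases concrete, here is a minimal sketch in plain Java.
The class and method names (CharsetRoundTrip, reinterpretAsUtf8) are made
up for illustration and are not part of SimpleWeb or any servlet
container, and StandardCharsets assumes Java 7 or later (on older JVMs
you would pass the charset names as strings, as in the thread). It shows
that the getBytes("ISO-8859-1") round trip repairs a string whose UTF-8
bytes were misread as ISO-8859-1, but damages a string that was already
decoded correctly.

```java
import java.nio.charset.StandardCharsets;

public class CharsetRoundTrip {

    // Hypothetical helper: take the chars of s back to single bytes via
    // ISO-8859-1, then decode those bytes as UTF-8.
    static String reinterpretAsUtf8(String s) {
        byte[] raw = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "Bj\u00f6rnb\u00e4r"; // "Björnbär"

        // Case 1: the container read UTF-8 bytes as ISO-8859-1.
        // Every byte became one char, so nothing was lost and the
        // round trip recovers the original string.
        String misdecoded = new String(
                original.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);
        String repaired = reinterpretAsUtf8(misdecoded);
        System.out.println("repaired equals original: "
                + repaired.equals(original));

        // Case 2: the string was already decoded correctly. Its
        // ISO-8859-1 bytes (0xF6, 0xE4) are not valid UTF-8, so the
        // decoder substitutes replacement characters.
        String damaged = reinterpretAsUtf8(original);
        System.out.println("damaged equals original: "
                + damaged.equals(original));
    }
}
```

So the workaround is only safe when you know the parameter was decoded
as ISO-8859-1 in the first place; applied to input that the parser has
already decoded as UTF-8, it loses the non-ASCII characters.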

/Martin