From: Johnson, C. <cla...@vi...> - 2001-07-05 17:24:39
Technically, ISO-8859-1 does not assign code points 0x80-0x9f. I suppose an
eager implementation may refuse to map these code points, although I would
expect it to substitute 0x3f ('?') rather than return nothing. I'm curious
what you would see if you tried the following, which explicitly casts each
char to a byte rather than naming an encoding:
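    String original = request.getParameter("original");
    String newString = original;  // fall back to the container's decoding
    if (request.getCharacterEncoding() == null) {
        // No charset declared: presume the form data was UTF-8 and the
        // container decoded it byte-for-byte, so every char is <= 0xff.
        byte[] bytes = new byte[original.length()];
        for (int i = 0; i < original.length(); i++) {
            bytes[i] = (byte) original.charAt(i);
        }
        // new String(byte[], String) declares UnsupportedEncodingException.
        newString = new String(bytes, "UTF-8");
    }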
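
To compare the two approaches outside a servlet container, here is a minimal standalone sketch. It assumes a JDK with java.nio.charset.StandardCharsets (Java 7 or later), and the class and method names (Latin1RoundTrip, viaGetBytes, viaCast) are made up for illustration. It simulates a container that decodes UTF-8 form data as ISO-8859-1, then recovers the text both ways. The sample includes the euro sign, whose UTF-8 encoding (E2 82 AC) contains a byte in the 0x80-0x9f range.

```java
import java.nio.charset.StandardCharsets;

public class Latin1RoundTrip {
    // Recover the text by re-encoding as ISO-8859-1, then decoding as UTF-8.
    public static String viaGetBytes(String misdecoded) {
        byte[] raw = misdecoded.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Recover the text by casting each char to a byte, without naming
    // the intermediate encoding. Assumes every char is <= 0xff.
    public static String viaCast(String misdecoded) {
        byte[] raw = new byte[misdecoded.length()];
        for (int i = 0; i < misdecoded.length(); i++) {
            raw[i] = (byte) misdecoded.charAt(i);
        }
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9 \u20ac";  // e-acute and the euro sign

        // Simulate the container: UTF-8 bytes decoded as ISO-8859-1.
        String misdecoded = new String(
                text.getBytes(StandardCharsets.UTF_8),
                StandardCharsets.ISO_8859_1);

        System.out.println(viaGetBytes(misdecoded).equals(text));
        System.out.println(viaCast(misdecoded).equals(text));
    }
}
```

If the two methods disagree inside your container, or either one returns an empty string, that would suggest the container's own ISO-8859-1 decoding is dropping or replacing the 0x80-0x9f bytes before your code ever sees them.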
> -----Original Message-----
> From: RUCH,SCOTT (HP-NewJersey,ex2) [mailto:sco...@hp...]
> Sent: Wednesday, July 04, 2001 9:52 PM
> To: 'icu...@ww...'
> Subject: [ICU4J-discussion] Converting form data from an HTTP POST
> request in a servlet
>
>
>
> Not ICU4J-specific, but I figure there's a few ;-)
> Java I18n / J2EE experts lurking out here that might
> have an opinion on this:
>
> One of the ways that people deal with the fact that
> there is no information in the POST request specifying
> the underlying encoding of the form data is to let
> the servlet container apply the default ISO 8859-1
> encoding to the data and then convert to the desired
> encoding as such:
>
> newString = new String(original.getBytes("8859_1"), "desiredEncoding")
>
> I was testing this in a simple JSP with a sampling of
> text from various languages encoded in UTF-8. I found
> that depending on the original string content, the
> conversion would fail sometimes. (Failure = zero-length
> string. I interpret a zero-length string as a "transcoding
> failure".)
>
> Consequently, I'm suspicious of this technique. Is there
> a reasonable explanation for why UTF-8 and ISO 8859-1 are
> incompatible? Intuitively, it would seem that this
> should always work, but I've observed it failing...
>
> Thanks,
>
> Scott
>
> _______________________________________________
> ICU4J-discussion mailing list
> ICU...@ww...
> http://www-124.ibm.com/developerworks/opensource/mailman/listinfo/icu4j-discussion