#1 I18N support

Status: open
Owner: nobody
Labels: None
Priority: 5
Updated: 2014-08-17
Created: 2002-06-14
Private: No

This message is from "ZHANG,GUOYONG (HP-NewJersey,ex2)"
<guoyong_zhang@hp.com>. He adds support for I18N to
Anteater.

From: "ZHANG,GUOYONG (HP-NewJersey,ex2)" <guoyong_zhang@hp.com>
To: "JONES,BILL (HP-NewJersey,ex2)" <bill_jones@hp.com>,
"PREDESCU,OVIDIU (HP-Cupertino,unix1)" <ovidiu_predescu@hp.com>
Subject: I18N patches for anteater
Date: Thu, 9 May 2002 11:34:49 -0700

Hi, Ovidiu,

To patch anteater to work with our I18N testing, I made changes to
two classes:

org.apache.anteater.util.Utils:
line 43: changed from new InputStreamReader(is) to new
InputStreamReader(is, "ISO-8859-1").
Constructing a string from a URL stream correctly is actually
impossible, because the encoding is unknown. If no encoding is
specified, InputStreamReader uses the platform's default encoding.
Instead, I used "ISO-8859-1", which creates one character for each
byte. It may not be the right encoding for what's really on the wire,
but later I can correctly recover the bytes from this string using
the same encoding.
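
The round trip described above can be checked with a small sketch. The class and method names here are hypothetical and are not part of Anteater; the sketch only assumes the JDK's ISO-8859-1 charset, which maps each byte value 0x00-0xFF to the character with the same code point.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Anteater.
final class Latin1RoundTrip {

    /** Decodes the bytes as ISO-8859-1, then re-encodes the resulting string. */
    static byte[] roundTrip(byte[] wire) {
        String carrier = new String(wire, StandardCharsets.ISO_8859_1);
        return carrier.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Returns an array holding every possible byte value once. */
    static byte[] allByteValues() {
        byte[] all = new byte[256];
        for (int i = 0; i < all.length; i++) {
            all[i] = (byte) i;
        }
        return all;
    }

    public static void main(String[] args) {
        // Every byte value survives the string detour unchanged.
        byte[] all = allByteValues();
        System.out.println(Arrays.equals(all, roundTrip(all))); // true

        // UTF-8 bytes carried through a Latin-1 string can still be
        // decoded with the right encoding afterwards.
        String original = "\u65e5\u672c\u8a9e";
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        String recovered = new String(roundTrip(utf8), StandardCharsets.UTF_8);
        System.out.println(original.equals(recovered)); // true
    }
}
```

The intermediate string is only a carrier for the bytes; its characters are not meaningful text unless the wire encoding really was ISO-8859-1.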

org.apache.anteater.test.HttpMessage:
line 84: changed from new InputStreamReader(is) to new
InputStreamReader(is, "ISO-8859-1").
Same reason as above.
line 335: changed from
document = DocumentHelper.parseText(body);
to
SAXReader reader = new SAXReader();
document = reader.read(new
ByteArrayInputStream(body.getBytes("ISO-8859-1")));
Since we know that the string body was constructed using
"ISO-8859-1", we can correctly get the bytes back from the string
using the same encoding.
Note: using the platform's default encoding to create a string from
a stream and then create bytes from that string doesn't necessarily
work, because some bytes might be lost during the double conversion,
depending on the encoding. "ISO-8859-1" does guarantee the
correct result.
line 420: changed from new OutputStreamWriter(os) to new
OutputStreamWriter(os, "ISO-8859-1").
Same reason as above.
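
The line 335 change can be illustrated with a self-contained sketch. The patch itself uses dom4j's SAXReader; to keep the example runnable without extra libraries, this sketch swaps in the JDK's DocumentBuilder as a stand-in. The class name, method names, and sample document are assumptions made for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.SAXException;

// Hypothetical demo class; uses the JDK parser instead of dom4j.
final class ParseBodyFromBytes {

    /**
     * Recovers the wire bytes from a body string that was built with
     * ISO-8859-1 and hands them to the parser as a stream, so the parser
     * detects the document's real encoding itself.
     */
    static Document parseBody(String body) {
        byte[] wire = body.getBytes(StandardCharsets.ISO_8859_1);
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(wire));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse body", e);
        }
    }

    /** Simulates a UTF-8 response that was read into a string as ISO-8859-1. */
    static String sampleBody() {
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
                + "<greeting>\u3053\u3093\u306b\u3061\u306f</greeting>";
        byte[] wire = xml.getBytes(StandardCharsets.UTF_8);
        return new String(wire, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        Document doc = parseBody(sampleBody());
        String text = doc.getDocumentElement().getTextContent();
        // The parser decoded the element text using the declared UTF-8 encoding.
        System.out.println("\u3053\u3093\u306b\u3061\u306f".equals(text)); // true
    }
}
```

Parsing the string directly would hand the parser characters that were decoded with the wrong encoding; going back to bytes first lets the XML declaration decide.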

As you may have noticed, I used a double conversion to get the right
bytes from a stream or vice versa, which is not strictly necessary. I
did that just so that I don't have to change any method
signatures.
If you want to refactor Anteater to be internationalized, you don't
have to do what I did.
My experience with I18N in the HTTP and XML area tells me:
1) Avoid using InputStreamReader and OutputStreamWriter. If you
have to use them (for convenience), use them in a controlled way:
don't rely on the default encoding.
2) The same rule applies to String.getBytes() and new String(byte[]).
3) Feed the XML parser a stream, not a reader.
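
A rough sketch of rules 1 and 2 follows, assuming nothing beyond the JDK. The class and method names are hypothetical. It reads raw bytes without any Reader, and it shows how a byte that is not valid in the chosen charset is lost when bytes are turned into a string and back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of Anteater.
final class ExplicitCharsets {

    /** Rule 1: copy the stream as raw bytes; no Reader, so no decoding happens. */
    static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
        return out.toByteArray();
    }

    /** Rule 2: reports whether bytes survive a bytes -> String -> bytes trip in the given charset. */
    static boolean survives(byte[] wire, Charset charset) {
        String decoded = new String(wire, charset);
        return Arrays.equals(wire, decoded.getBytes(charset));
    }

    public static void main(String[] args) throws IOException {
        // 0xE9 is a valid ISO-8859-1 byte but not a valid UTF-8 sequence on its own.
        byte[] wire = readAll(new ByteArrayInputStream(new byte[] {(byte) 0xE9}));

        System.out.println(survives(wire, StandardCharsets.UTF_8));      // false
        System.out.println(survives(wire, StandardCharsets.ISO_8859_1)); // true
    }
}
```

The platform default encoding behaves like the first case whenever it cannot represent the incoming bytes, which is why the charset should always be named explicitly.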

Thanks,
Guoyong

-----Original Message-----
From: JONES,BILL (HP-NewJersey,ex2)
Sent: Thursday, May 09, 2002 9:05 AM
To: PREDESCU,OVIDIU (HP-Cupertino,unix1);
ZHANG,GUOYONG (HP-NewJersey,ex2)
Subject: anteater

Ovidiu,

I have been tied up recently and unable to commit our anteater
mods to SourceForge. If GZ sends you our codebase (or you grab
it from StarTeam), can you do the diffs and consult with GZ to
understand the changes? I can probably get some time this
afternoon to help GZ if necessary.

In addition to GZ's i18n updates, I added support for namespaces
within the response matching code...

Cheers,
Bill.

_____

Bill Jones
Development Manager
HP Middleware / Web Services

<http://www.hpmiddleware.com/>

Phone:++1-856 638-6007
Fax: ++1-856 638-6190
Telnet 638-6007
bill_jones@hp.com <mailto:bill_jones@hp.com>
_____

Discussion

  • Ovidiu Predescu

    Ovidiu Predescu - 2002-06-14
     
  • Ovidiu Predescu

    Ovidiu Predescu - 2002-06-14
     