When PHP runs with an internal_encoding other than
ISO-8859-1 (e.g., UTF-8), characters outside 7-bit ASCII
(i.e., code points > 127) are handled badly.
There are several reasons:
WHEN SENDING DATA:
* HessianWriter.writeStringData uses utf8_encode, which
always treats its input as ISO-8859-1 when converting to
UTF-8, whatever the internal encoding really is. Instead use:
$this->stream .=
mb_convert_encoding($value,"UTF-8",mb_internal_encoding());
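To illustrate the difference, here is a minimal sketch, assuming the internal encoding is UTF-8. The helper name toWireUtf8 is made up for this example and is not part of HessianPHP. The ISO-8859-1 conversion stands in for what utf8_encode does.

```php
<?php
// Hypothetical helper (not from HessianPHP): convert a string from the
// current internal encoding to UTF-8, as the proposed fix does.
function toWireUtf8(string $value): string
{
    return mb_convert_encoding($value, 'UTF-8', mb_internal_encoding());
}

mb_internal_encoding('UTF-8');
$text = "\xC3\xA9"; // "é" already encoded as UTF-8 (2 bytes)

// Same behaviour as utf8_encode: the input bytes are read as ISO-8859-1,
// so each byte of the UTF-8 sequence gets encoded again.
$doubleEncoded = mb_convert_encoding($text, 'UTF-8', 'ISO-8859-1');

echo bin2hex($doubleEncoded), "\n";    // c383c2a9 (double-encoded)
echo bin2hex(toWireUtf8($text)), "\n"; // c3a9 (unchanged)
```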
WHEN READING DATA:
* Handling binary data with a multi-byte internal
encoding set and mbstring.func_overload including the
string-function flag (2) scrambles the stream, because
HessianParser->read uses the substr function, which then
counts characters instead of bytes.
To work around this, replace the read method with:
function read($num){
    $res = "";
    for ($i = 0; $i < $num; $i++) {
        $res .= $this->stream[$this->pos++];
    }
    return $res;
}
Because it uses [] to index into the stream, the
bytes are returned as-is, without any character encoding
being taken into account.
* Then the method HessianParser->readString uses
utf8_decode to force the conversion of the Hessian UTF-8
characters to ISO-8859-1. Instead use this:
return
mb_convert_encoding($string,mb_internal_encoding(),"UTF-8");
This converts the UTF-8 characters received by
Hessian into the current internal encoding.
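The two reading fixes can be tried outside the parser with a rough sketch like the one below. The function names readBytes and fromWireUtf8 and the sample byte string are invented for the example; mb_substr stands in for what substr becomes once it is overloaded.

```php
<?php
// Hypothetical stand-alone helpers; the names are not from HessianPHP.

// Copy $num raw bytes starting at $pos and advance $pos.
// String offsets ([]) always address single bytes.
function readBytes(string $stream, int &$pos, int $num): string
{
    $res = '';
    for ($i = 0; $i < $num; $i++) {
        $res .= $stream[$pos++];
    }
    return $res;
}

// Convert UTF-8 payload bytes to the current internal encoding.
function fromWireUtf8(string $bytes): string
{
    return mb_convert_encoding($bytes, mb_internal_encoding(), 'UTF-8');
}

mb_internal_encoding('UTF-8');
$stream = "xxh\xC3\xA9"; // two placeholder bytes, then "hé" in UTF-8

$pos = 2;
$twoBytes = readBytes($stream, $pos, 2); // exactly 2 bytes
$twoChars = mb_substr($stream, 2, 2);    // 2 characters, i.e. 3 bytes
echo bin2hex($twoBytes), "\n"; // 68c3
echo bin2hex($twoChars), "\n"; // 68c3a9

$pos = 2;
$payload = readBytes($stream, $pos, 3);  // the whole "hé" sequence
mb_internal_encoding('ISO-8859-1');
$decoded = fromWireUtf8($payload);
echo bin2hex($decoded), "\n";  // 68e9
```

The character-based call returns one byte more than was asked for, which is how the stream position drifts when substr is overloaded.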
Finally, with those 3 fixes it runs smoothly, sending
and receiving Chinese characters or regular Western
European accented characters (like é à ...).
Bye
vincent
Logged In: YES
user_id=43277
Hi there,
there is another problem in the method HessianParser.setStream:
change
$this->len = strlen($stream);
to
$this->len = mb_strlen($stream, 'latin1');
// counts the number of bytes, regardless of the internal
// encoding (an overloaded strlen would count the number of
// characters instead)
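A quick way to check the difference, as a sketch assuming a UTF-8 internal encoding (the sample string is made up; '8bit' is another mbstring encoding name that counts raw bytes):

```php
<?php
mb_internal_encoding('UTF-8');
$stream = "h\xC3\xA9"; // "hé" in UTF-8: 2 characters, 3 bytes

// What an overloaded strlen would report: characters.
$charCount = mb_strlen($stream);

// Single-byte encodings make mb_strlen count bytes.
$byteCountLatin1 = mb_strlen($stream, 'latin1');
$byteCount8bit   = mb_strlen($stream, '8bit');

echo $charCount, "\n";       // 2
echo $byteCountLatin1, "\n"; // 3
echo $byteCount8bit, "\n";   // 3
```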
Bye
vincent