I have a client that receives a wstring as the result of a
request to a CORBA object developed with Orbacus.
The OMG GIOP specification (the "Character Types" section)
says that for wstrings (and wchars) encoded with
UTF-16, there is a BOM after the wstring length
indication that must be used in order to decode the
wstring. If the BOM is not present, the byte
ordering defaults to big-endian.
Note that the byte ordering specified for the
encapsulation can be different from the UTF-16 byte
ordering used. The byte ordering specified in the
encapsulation should be used to decode the wstring
length, for instance, and perhaps the BOM itself (?),
whereas the byte ordering indicated by the BOM should
be used to decode the bytes of the wstring value.
I agree this looks confusing.
Looking at the code of the CdrStreamXXXXXEndianReadOP
classes, the byte ordering specified in the
encapsulation seems to be used to decode the Unicode
strings, which I don't think is correct. In my opinion,
the Encoding class for UTF-16 should check the BOM to
decide which byte ordering to use to decode the wstring.
In my case, the encapsulation specified little-endian,
but the UTF-16-encoded wstring has no BOM, so
big-endian should be used, which is not what currently happens.
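To illustrate what I mean, here is a minimal sketch (not Orbacus code; the class and method names are mine) of how I think the decoding should work: the encapsulation's byte order is used for the length field, while the value's byte order comes from the BOM, defaulting to big-endian when the BOM is absent. It assumes GIOP 1.2 semantics where the wstring length is in octets.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class Utf16WstringDecoder {

    // Hypothetical sketch of BOM-aware wstring decoding.
    // 'encapOrder' is the byte order declared by the enclosing
    // encapsulation; it governs the length field. The BOM, if
    // present, governs the UTF-16 code units of the value; with
    // no BOM, big-endian is assumed.
    public static String decode(byte[] buf, ByteOrder encapOrder) {
        ByteBuffer bb = ByteBuffer.wrap(buf).order(encapOrder);
        int len = bb.getInt();          // wstring length in octets (GIOP 1.2)
        byte[] value = new byte[len];
        bb.get(value);

        ByteOrder valueOrder = ByteOrder.BIG_ENDIAN; // default when no BOM
        int offset = 0;
        if (len >= 2) {
            int b0 = value[0] & 0xff;
            int b1 = value[1] & 0xff;
            if (b0 == 0xFE && b1 == 0xFF) {
                valueOrder = ByteOrder.BIG_ENDIAN;    // BOM: FE FF
                offset = 2;
            } else if (b0 == 0xFF && b1 == 0xFE) {
                valueOrder = ByteOrder.LITTLE_ENDIAN; // BOM: FF FE
                offset = 2;
            }
        }
        ByteBuffer vb = ByteBuffer.wrap(value, offset, len - offset)
                                  .order(valueOrder);
        return vb.asCharBuffer().toString();
    }
}
```

With this scheme, a little-endian encapsulation carrying the BOM-less value `00 41` would still decode as "A" (big-endian), which is the behavior I believe the spec requires in my case.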