#65 Incorrect decoding of UTF-16 encoded wstrings

closed-fixed
9
2006-03-26
2006-01-18
No

I have a client that receives a wstring in response to a
request to a CORBA object developed with Orbacus.

The OMG GIOP specification (15.3.1.6 Character Types)
says that for wstrings (and wchars) encoded with
UTF-16, there is a BOM after the wstring length
indication that must be used in order to decode the
wstring. If the BOM is not present, then the byte order
defaults to big-endian.
Note that the byte-ordering specified for the
encapsulation can be different from the UTF-16 byte
ordering used. The byte-ordering specified in the
encapsulation should be used to decode the wstring
length for instance, and maybe the BOM (???), whereas
the byte ordering indicated by the BOM should be used
to decode the bytes of the wstring value.

I agree this looks confusing.

Looking at the code of the CdrStreamXXXXXEndianReadOP
classes, the byte-ordering specified in the
encapsulation seems to be used to decode the Unicode
strings, which I don't think is correct. In my opinion
the Encoding class for UTF-16 should check the BOM to
decide which byte ordering to use to decode the wstring.

In my case, the encapsulation specified little-endian,
but the UTF-16 encoded wstring does not have a BOM, so
big-endian should be used, which is not the case currently.
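
To make the rule described above concrete, here is a minimal sketch in Python (not the IIOP.NET code). The function name decode_giop_wstring and its parameters are hypothetical. It assumes the wstring is laid out as a 4-byte length in octets followed by the UTF-16 data with no terminator, and it treats the BOM as raw octets (FE FF or FF FE), so the stream byte order only affects the length field.

```python
import struct


def decode_giop_wstring(buf: bytes, stream_little_endian: bool) -> str:
    """Decode a length-prefixed UTF-16 wstring (hypothetical helper).

    Assumed layout: 4-byte unsigned length in octets, then the UTF-16 data.
    """
    # The length prefix follows the byte order of the stream/encapsulation.
    fmt = "<I" if stream_little_endian else ">I"
    (length,) = struct.unpack_from(fmt, buf, 0)

    payload = buf[4:4 + length]
    if len(payload) != length:
        raise ValueError("buffer is shorter than the declared wstring length")

    # The character data follows the BOM, not the stream byte order.
    if payload[:2] == b"\xfe\xff":
        return payload[2:].decode("utf-16-be")
    if payload[:2] == b"\xff\xfe":
        return payload[2:].decode("utf-16-le")

    # No BOM: default to big-endian, whatever the stream byte order is.
    return payload.decode("utf-16-be")


# The case from this report: little-endian stream, no BOM, big-endian text.
sample = struct.pack("<I", 4) + "Hi".encode("utf-16-be")
print(decode_giop_wstring(sample, stream_little_endian=True))  # Hi
```

Decoding the same payload with the stream byte order (little-endian) instead would yield the wrong code units, which matches the undecodable characters I am seeing.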

Discussion

  • Logged In: YES
    user_id=858238

    By the way, if I force the use of the big-endian
    UnicodeEncodingExt, the wstring is properly decoded. Otherwise
    I get a series of "?" (the characters could not be decoded).

     
    • priority: 5 --> 9
    • assigned_to: nobody --> dullmann
     
  • Logged In: YES
    user_id=660259

    Hi

    Thank you very much for finding this bug.

    You're right, the implementation is not correct. Sadly,
    these details were first described in version 2.4 of the
    standard, and I've used mostly 2.3 for the implementation.

    I've committed a fix to cvs, which should solve the issue.
    You can check out the iiop-net-1-9-0-perfopt branch from cvs
    in a few hours (because of the synchronization delay between
    developer and anonymous cvs).
    See also: http://sourceforge.net/cvs/?group_id=80227

    The following two files contain the fix:
    CodeSetConversion.cs in version 1.8.2.3
    CodeSetService.cs in version 1.20.2.1

    Please tell me if the fix works for you.

    As a thank you, I would like to add your name to the
    IIOP.NET hall of fame, if you agree.
    http://iiop-net.sourceforge.net/faq.html#faq8_2

    Best regards!

     
  • Logged In: YES
    user_id=858238

    Thanks for the fix. I'll do the test and let you know.
    As for the Hall of Fame, I'll be glad to appear in the list.

    Regards

     
  • Logged In: YES
    user_id=858238

    Sorry for the delay.

    I have not been able to check out the cvs code using the
    instructions in your link. I was not even able to log in. I
    get a cvs error:

    cvs [login aborted]: End of file received from server

    It could be a problem with the server being overloaded (I
    found that on the internet). I will try again this weekend.

     
    • status: open --> closed-fixed
     
  • Logged In: YES
    user_id=660259

    Hi

    Sorry for this delay.

    Did this fix work for you?
    I've just created a new release (1.9.0 beta3), which
    contains this fix.
    Therefore, I am closing this bug report now.

    If the fix does not work for you, please reopen this bug
    report.

    Thank you
    Best regards!