#4 int_daisy_unicodeTranscoder non unicode output

Milestone: 200610 release
Status: open
Priority: 5
Updated: 2007-09-29
Created: 2006-08-21
Private: No

Transcoding with int_daisy_unicodeTranscoder gives an
erroneous result when the output encoding is anything other
than utf-8 or utf-16. To reproduce, use a document that
contains characters on which windows-1252 and iso-8859-1
differ, and output it to each of these two encodings: the
two outputs are bit-for-bit identical.
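
As a point of comparison for the reproduction above, the following is a minimal sketch using only the JDK's own charset encoders, not the transcoder or any StAX writer. The class name `EncodingDiffCheck` and the sample string are made up for illustration. It shows that text containing U+20AC (the euro sign, byte 0x80 in windows-1252 and absent from iso-8859-1) cannot encode to the same bytes in both encodings. Note that `String.getBytes` substitutes `?` for an unmappable character, whereas an XML serializer would normally be expected to write a numeric character reference instead; either way the two outputs should not be identical.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class EncodingDiffCheck {
    // Hypothetical sample text. U+20AC EURO SIGN is byte 0x80 in
    // windows-1252 and has no representation in ISO-8859-1.
    static final String SAMPLE = "price: \u20AC";

    // Encode text with the named charset using the JDK encoder.
    // Unmappable characters are replaced with the charset's default
    // replacement byte ('?' for ISO-8859-1).
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    // True when the windows-1252 and ISO-8859-1 byte sequences differ.
    static boolean outputsDiffer(String text) {
        return !Arrays.equals(encode(text, "windows-1252"),
                              encode(text, "ISO-8859-1"));
    }

    public static void main(String[] args) {
        System.out.println("outputs differ: " + outputsDiffer(SAMPLE));
    }
}
```

Running the same comparison on the two files produced by the transcoder would be one way to confirm the bug: identical files for input like this indicate that the requested output encoding is not being honoured.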

Discussion

  • Markus Gylling

    Markus Gylling - 2006-08-21

    Note that the UCharReplacer functionality of this
    transformer works ok, and the bug does not surface as long
    as output encoding is set to utf8 or utf16.

    The problem is caused by the currently used version of StAX
    (Woodstox 2.0.x).

    Probable solution: a) check whether Woodstox 3.x solves the
    problem; if not, b) try the SJSXP output writer. The latest
    version of SJSXP, however, seems not to represent non-BMP
    characters correctly following the JDK 5 change from char to
    int or char[] for representing non-BMP code points; this
    needs to be double-checked.

     
  • Markus Gylling

    Markus Gylling - 2007-09-29
    • milestone: --> 200610 release
     
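
For readers unfamiliar with the JDK 5 change mentioned in the first comment, here is a small sketch of how a non-BMP code point is represented in Java. It does not exercise Woodstox or SJSXP; the class name `NonBmpCheck` and the choice of U+1D11E are illustrative assumptions. A code point above U+FFFF does not fit in a single `char`, so it is carried as an `int` or as a surrogate pair in a `char[]`, and an output writer that processes text one `char` at a time can mishandle it.

```java
public class NonBmpCheck {
    // U+1D11E MUSICAL SYMBOL G CLEF, chosen only as an example of a
    // code point outside the Basic Multilingual Plane.
    static final int G_CLEF = 0x1D11E;

    // Build a String from an int code point (API available since JDK 5).
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Number of Unicode code points, as opposed to UTF-16 char units.
    static int countCodePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String s = fromCodePoint(G_CLEF);
        System.out.println("char units:  " + s.length());
        System.out.println("code points: " + countCodePoints(s));
        System.out.println("first code point: U+"
                + Integer.toHexString(s.codePointAt(0)).toUpperCase());
    }
}
```

A document containing such a character, written through each candidate output writer, would be a reasonable test input for the double check suggested above.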
