From: Patrick S. <Pat...@qu...> - 2002-10-24 15:38:40
In trying to figure out how to get an XML document containing Unicode characters correctly serialized to an output file, I discovered that Jython and Python behave differently when attempting to write Unicode characters to files. Is this a known issue? (Not that it actually helps solve my original problem, but hey...)

> In Jython, Unicode chars get written as bogus characters to the output stream:
>
> Jython 2.1 on java1.3.1_04 (JIT: null)
> Type "copyright", "credits" or "license" for more information.
> >>> s = u'ABC\u03a3DEF'
> >>> s
> u'ABC\u03A3DEF'
> >>> import sys
> >>> sys.stdout.write(s)
> ABC?DEF>>>
> >>> ^Z
>
> Note that this really is an ASCII '?' character - if you write to a file in text mode,
> you get the following from 'od':
>
> D:\pds\jython>od -Ax -c -b foo-text-mode
> 000000   A   B   C   ?   D   E   F
>        101 102 103 077 104 105 106
> 000007
>
> If you write the file in binary mode, you get the low-order byte of the Unicode
> char (0xA3 == 0243):
>
> D:\pds\jython>od -Ax -c -b foo-binary-mode
> 000000   A   B   C   ú   D   E   F
>        101 102 103 243 104 105 106
> 000007
>
> In Python you get an exception:
>
> Python 2.2.1 (#34, Apr 9 2002, 19:34:33) [MSC 32 bit (Intel)] on win32
> Type "help", "copyright", "credits" or "license" for more information.
> >>> s = u'ABC\u03a3DEF'
> >>> s
> u'ABC\u03a3DEF'
> >>> import sys
> >>> sys.stdout.write(s)
> Traceback (most recent call last):
>   File "<stdin>", line 1, in ?
> UnicodeError: ASCII encoding error: ordinal not in range(128)
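
For comparison, here is a minimal sketch of one way to sidestep the implicit conversion in both transcripts above: encode the string explicitly and write the resulting bytes in binary mode. It assumes UTF-8 is the encoding wanted for the XML output, and the file name 'foo-utf8' is made up for illustration. It is written against CPython; whether Jython 2.1 handles the encode step the same way is not something the transcripts above confirm.

```python
import os
import tempfile

s = u'ABC\u03a3DEF'

# Pick the encoding explicitly instead of relying on the stream's default.
# In UTF-8, U+03A3 becomes the two bytes 0xCE 0xA3.
data = s.encode('utf-8')

# Hypothetical output path, used only for this sketch.
path = os.path.join(tempfile.gettempdir(), 'foo-utf8')

# Binary mode, so no newline or character translation touches the bytes.
f = open(path, 'wb')
try:
    f.write(data)
finally:
    f.close()

# Read the raw bytes back to see exactly what landed on disk.
f = open(path, 'rb')
try:
    raw = f.read()
finally:
    f.close()

print(repr(raw))
```

Running 'od -Ax -c -b' on the resulting file should then show eight bytes rather than seven, since the sigma takes two bytes in UTF-8.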