From: Pekka L. <pe...@ik...> - 2006-08-10 22:51:38
Hello all,

In our project we are implementing a test automation framework that runs on both Python and Jython. We recently found some differences in how these two platforms handle unicode literals. The following examples should demonstrate the issue.

    Jython 2.2a1 on java1.5.0_04 (JIT: null)
    Type "copyright", "credits" or "license" for more information.
    >>> u = u'Hyv\u00E4'
    >>> u
    'Hyv\xE4'
    >>> type(u)
    <type 'str'>
    >>> unicode(u)
    Traceback (innermost last):
      File "<console>", line 1, in ?
    UnicodeError: ascii decoding error: ordinal not in range(128)
    >>> str(u)
    'Hyv\xE4'

    Python 2.4.3 (#69, Mar 29 2006, 17:35:34) [MSC v.1310 32 bit (Intel)] on win32
    Type "help", "copyright", "credits" or "license" for more information.
    >>> u = u'Hyv\u00E4'
    >>> u
    u'Hyv\xe4'
    >>> type(u)
    <type 'unicode'>
    >>> unicode(u)
    u'Hyv\xe4'
    >>> str(u)
    Traceback (most recent call last):
      File "<stdin>", line 1, in ?
    UnicodeEncodeError: 'ascii' codec can't encode character u'\xe4' in position 3: ordinal not in range(128)

We want to convert all the test data we get from users into unicode internally, so this difference causes problems for us. As a workaround we are planning to use the following utility function.

    import os

    def unic(text):
        if os.name == 'java':
            # Jython: u'...' literals are plain str, so str() is safe
            return str(text)
        else:
            # CPython: unicode() returns a unicode object
            return unicode(text)

Is this a Jython bug (we submitted bug [1] anyway), or are we doing something wrong? Furthermore, do you think our workaround utility really works?

[1] http://sourceforge.net/tracker/index.php?func=detail&aid=1538001&group_id=12867&atid=112867

Cheers,
    .peke
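
A more defensive variant of the unic() workaround can be sketched as follows. It is an illustration only, not a tested fix. The `encoding` parameter and its utf-8 default are assumptions that are not part of the original utility, and the os.name check assumes that Jython behaves as in the 2.2a1 session above, which may not hold for other releases. Unlike the original, it decodes byte strings explicitly instead of relying on the implicit ascii codec, and it falls back to str where the unicode builtin is not defined.

```python
import os

# Pick the text and byte-string types without naming `bytes` up front,
# since Python 2.4 does not have it. Where `unicode` is not defined
# (Python 3), fall back to str and bytes.
try:
    text_type, byte_type = unicode, str   # Python 2 and Jython
except NameError:
    text_type, byte_type = str, bytes     # Python 3

def unic(text, encoding='utf-8'):
    """Return text as the platform's text type (hypothetical variant)."""
    if os.name == 'java':
        # Per the Jython 2.2a1 session above: u'...' literals are str
        # and unicode() fails on non-ASCII input, so use str().
        return str(text)
    if isinstance(text, text_type):
        return text                       # already text, nothing to do
    if isinstance(text, byte_type):
        # `encoding` is an assumption: bytes carry no encoding of their own.
        return text.decode(encoding)
    return text_type(text)                # numbers and other objects
```

With this variant, unic('Hyv\xc3\xa4') on CPython 2.4 decodes the UTF-8 bytes instead of raising a UnicodeDecodeError, provided the input really is UTF-8.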