From: Rose P. <ros...@or...> - 2008-12-18 00:37:31

Hi, Jython gurus:

I need some help running Jython 2.2.1 with multi-byte strings. Jython 2.2.1 does not pass a unicode string correctly to a function defined in a py script: the value of the parameter is converted to a different, \x-escaped form. This did not happen in Jython 2.1.

To reproduce it, create a script file, test.py, that defines a function called create() which simply returns the value of its parameter:

======= start of test.py ======
def create(name):
    return name
======= end of test.py =====

Then start Jython 2.1 and run the function create() from the py file:

    java -classpath jython.jar.2.1 org.python.util.jython
    Jython 2.1 on java1.6.0_05 (JIT: null)
    execfile("test.py")
    create('\u4f7f\u7528')    <-- input Japanese characters
    u'\u4F7F\u7528'           <-- returns the same unicode value, representing the Japanese characters, with length 2

The create() function returns a two-character unicode string, which Java's System.out.println() method displays correctly.

Then try Jython 2.2.1 with the same steps:

    java -classpath jython.jar.2.2.1 org.python.util.jython
    Jython 2.2.1 on java1.6.0_05
    execfile("test.py")
    create('\u4f7f\u7528')    <-- input Japanese characters
    '\xBB\xC8\xCD\xD1'        <-- returns a different value, with length 4

Java does not recognize \xBB\xC8\xCD\xD1 as the original characters, so we always get "????" when we print the value with System.out.println().

This is a regression in Jython 2.2.1, and it is going to affect all customer-written py files. Is there a workaround for this in Jython? Jython 2.5 seems to have the same issue.

Thanks,
Rose
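
P.S. In case it helps narrow this down: the four bytes look to me like the same two characters in a legacy multi-byte encoding rather than random data. The sketch below is my own check, not anything taken from Jython itself. It runs on a current CPython 3 (not on Jython 2.x), the helper name matching_codecs is made up for this mail, and the list of candidate codecs is only my guess at likely platform encodings. I would expect EUC-JP to be the match, which might mean the string is being handed over as bytes in the platform encoding, but that last part is a guess.

```python
# Sketch: find which codec turns the two characters into the four bytes
# that Jython 2.2.1 printed. Written for CPython 3, not for Jython 2.x.

TEXT = "\u4f7f\u7528"            # the two characters passed to create()
OBSERVED = b"\xbb\xc8\xcd\xd1"   # the bytes shown by Jython 2.2.1

# Assumption: these candidates are only a guess at likely platform encodings.
CANDIDATES = ["utf-8", "shift_jis", "euc_jp", "iso2022_jp", "gbk", "big5"]


def matching_codecs(text, observed, candidates):
    """Return the codecs under which `text` encodes to exactly `observed`."""
    matches = []
    for name in candidates:
        try:
            encoded = text.encode(name)
        except UnicodeEncodeError:
            continue  # this codec cannot represent the characters at all
        if encoded == observed:
            matches.append(name)
    return matches


if __name__ == "__main__":
    for name in matching_codecs(TEXT, OBSERVED, CANDIDATES):
        # Decoding with a matching codec gives back a string of length 2.
        print(name, len(OBSERVED.decode(name)))
```

If a codec does match, decoding the returned value with that codec before handing it to Java might serve as a stopgap, though I have not verified that on Jython 2.2.1.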