From: Jim B. <jb...@zy...> - 2013-10-03 01:16:16
The API of PythonInterpreter probably needs to be reviewed more closely, since it is surprising without a deep understanding of Jython internals and of how it passes the arg of exec through. With that said, here's how to get #1 and #2 resolved. I don't have enough info to understand your requirement for #3 precisely, but it probably also comes down to carefully following encoding/decoding.

So with this test code:

    import org.python.core.Py;
    import org.python.util.PythonInterpreter;

    public class TestUnicode {
        public static void main(String[] args) {
            try {
                PythonInterpreter runtime = new PythonInterpreter();
                // #1: encode the source as UTF-8 and declare that encoding on its first line
                runtime.exec((Py.newUnicode("# encoding=UTF-8\nprint u'ā'")).encode("UTF-8"));
                runtime.exec((Py.newUnicode("# encoding=UTF-8\nprint ord(u'ā')")).encode("UTF-8"));
                // #2: an ASCII-only source string containing a Python unicode escape
                runtime.exec("print u'\\u0101'");
            } catch (Exception ex) {
                System.err.println("Exception: " + ex);
            }
        }
    }

I get the following output:

    $ java TestUnicode
    ā
    257
    ā

which should be what you wanted.

On Wed, Oct 2, 2013 at 3:42 PM, Pāvils Jurjāns <pas...@gm...> wrote:

> Hello,
>
> I use Jython as a scripting solution for a Java app. The Python code that's executed in Jython is read from a Unicode database field. So, before the interp.exec(myPyCode) call the code is stored in a Unicode string.
>
> Unfortunately I found that all three functions I need are failing if some Unicode characters are involved:
>
> 1) Using unicode characters in the Python code:
>
> PythonInterpreter interp = new PythonInterpreter(null, new PySystemState());
> interp.exec("print ord(\"ā\")");
>
> gives the ordinal of "?", or 63, while 257 was expected. How do I pass a string that contains Unicode characters to the interpreter without messing them up?
>
> (not sure if the special character "ā" will survive the mailing list. It's a unicode character with ordinal 257, you can easily get it in Java by this:
> String uniStr = Character.toString((char) 257);
>
> 2) Entering unicode literals with escaping:
>
> interp.exec("print u\"\\u0101\"");
>
> This code causes this Python error:
> UnicodeEncodeError: 'ascii' codec can't encode character u'\u0101' in position 0: ordinal not in range(128)
>
> So, what would be the correct way of instantiating the interpreter so that the Unicode literals would be happily processed, just as they are processed with the Python interpreter in Ubuntu?
>
> 3) Reading unicode characters at standard output
>
> I can't seem to find a way for the code executed by Jython to pass Unicode characters nicely to the standard output. I have my custom OutputStream class's write(int code) method waiting for data (to store it in a Unicode string), but it never receives anything with charcode above 127. Maybe the problem is that I never really managed to get any Unicode string working in the Jython interpreter in the first place.
>
> Unfortunately, I work in a country where there are plenty of Unicode characters in frequent use, so I can't really cope with this problem by using only the ANSI table.
>
> Thanks in advance for helping!
>
> _______________________________________________
> Jython-users mailing list
> Jyt...@li...
> https://lists.sourceforge.net/lists/listinfo/jython-users
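
On #3 in the quoted question, here is a minimal sketch of what following encoding/decoding could look like on the Java side. It is plain Java with no Jython involved: Utf8Capture is a hypothetical helper name, and the PrintStream in main only stands in for whatever writes the script's output. It assumes the bytes arriving at the stream are UTF-8 encoded. The point it illustrates is that OutputStream.write(int) delivers one byte per call, never a whole character, so a character like ā (ordinal 257) arrives as two bytes that have to be buffered and decoded together.

```java
import java.io.ByteArrayOutputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of Jython): buffers raw bytes and
// decodes them as UTF-8 only when the text is requested.
class Utf8Capture extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

    @Override
    public void write(int b) {
        // Each call carries a single byte (low 8 bits of b), not a character.
        buffer.write(b);
    }

    // Decode everything written so far. Assumption: the writer emitted UTF-8.
    String text() {
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        Utf8Capture capture = new Utf8Capture();
        // Stand-in for the script's output: a PrintStream that encodes as UTF-8.
        PrintStream out = new PrintStream(capture, true, "UTF-8");
        out.print("\u0101"); // the character with ordinal 257
        out.flush();
        String s = capture.text();
        System.out.println((int) s.charAt(0)); // prints 257
    }
}
```

If the output turns out to be in some other encoding, the charset used in text() would have to match it; I have not verified what Jython emits by default here.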