From: Jeff A. <ja...@fa...> - 2012-12-06 23:08:00
On 06/12/2012 16:59, Chris Clark wrote:
> On Thursday 2012-12-06 08:53 (-0800), Akshay Kini
> <kga...@gm...> wrote:
>> I was using Jython 2.2 and it defaulted to UTF-8 I suppose and when I
>> passed text to PythonInterpreter.exec() or .eval() it was encoded
>> correctly.
>>
>> Since I recently moved to Jython 2.5, I realise that Python has
>> changed its encoding to ASCII by default. *I need this encoding
>> changed to UTF-8 before my first call to .eval("<some python code
>> here>").*
>>
>> Doing:
>>     interpreter.exec("# coding=utf-8");
>>     interpreter.exec("print " + japaneseString);
>>
>> *is not working.*
>>
>> I thought this is a Jython bug and filed
>> http://bugs.jython.org/issue1992 (It might be as well?)
>> But if you need a sample program, screenshots etc. you can refer to
>> the bug.
>>
> That doesn't look correct to me. Java strings are UTF-16, there is no way
> you could be sending in UTF-8 "strings" from Java. Unless you are sending
> in byte arrays (I checked the bug test case for 1992 and it is a String,
> not a byte array).
> ...

If they really are UTF-8, that is to say the char values in the string
are all 0..255, and these bytes encode characters as UTF-8, as they might
be if read from a stream using Jython's io, then I think you can address
this when you create the interpreter, like this:

    interpreter = new PythonInterpreter() {
        {
            cflags = new CompilerFlags(CompilerFlags.PyCF_SOURCE_IS_UTF8);
        }
    };

This is untried by me. But looking at the code, the protected cflags
member seems to be the thing that controls how the compiler reads the
text.

If the japaneseString is actually a java.lang.String containing the
characters themselves, then I think it would have worked as expected.
Failing that, recode it as UTF-8. :-(

Jeff Allen
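
To make the two cases concrete, here is a minimal sketch in plain Java, with no Jython involved. It shows the difference between a String that holds the characters themselves and one whose char values are the UTF-8 bytes (all in 0..255), and one way to recode between them. The names Utf8Recode, toUtf8ByteString and fromUtf8ByteString are made up for illustration and are not part of any Jython API. Whether Jython 2.5 then handles the recoded string as hoped is not tested here.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only; not part of Jython.
final class Utf8Recode {

    // Encode the characters as UTF-8, then map each byte to one char
    // with the same value (ISO-8859-1 is a 1:1 mapping for 0..255).
    static String toUtf8ByteString(String chars) {
        byte[] utf8 = chars.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverse direction: treat each char as one byte, then decode
    // those bytes as UTF-8 to recover the real characters.
    static String fromUtf8ByteString(String byteString) {
        byte[] utf8 = byteString.getBytes(StandardCharsets.ISO_8859_1);
        return new String(utf8, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Three Japanese characters, written as escapes so the source
        // file encoding does not matter.
        String japaneseString = "\u65e5\u672c\u8a9e";
        String asBytes = toUtf8ByteString(japaneseString);

        System.out.println(japaneseString.length()); // 3 chars
        System.out.println(asBytes.length());        // 9 chars: three UTF-8 bytes per character
        System.out.println(fromUtf8ByteString(asBytes).equals(japaneseString));
    }
}
```

The first form is what a normal java.lang.String looks like. The second is the "UTF-8 in a String" form that the PyCF_SOURCE_IS_UTF8 flag appears to be meant for.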