From: Jose M. R. <jm...@in...> - 2003-07-11 07:26:32
I tried this in a Jython session:

>>> c = unichr(225)
>>> print ord(c), '->', c
225 -> á
>>> c = 'á'
>>> print ord(c), '->', c
225 -> á

And the same embedding Jython:

import org.python.util.PythonInterpreter;

public class Test {
    public static void main(String[] args) {
        PythonInterpreter interp = new PythonInterpreter();
        interp.exec("c = unichr(225)");
        interp.exec("print ord(c), '->', c");
        interp.exec("c = 'á'");
        interp.exec("print ord(c), '->', c");
    }
}

But now the output is:

225 -> á
65533 -> (bad character)

This is strange because the Jython console uses the same PythonInterpreter.

Also the web documentation says "Jython have only one string type which
support full two-byte Unicode characters and the functions in the string
module are Unicode-aware. The u"" string modifier is optional and completely
ignored if specified". But in a Jython session:

>>> print "\u00E1"
\u00E1
>>> print u"\u00E1"
á

So the u"" modifier is needed here; I guess it is for compatibility with
CPython.

Olloh, Annette wrote:

>Jose,
>
>Came across this problem once with currency symbols. Do you want to try
>printing out the Unicode character of it instead?
>
>E.g.
>
>System.out.println(String.valueOf('\u00E1'));
>
>--Annette.
>
>-----Original Message-----
>From: Jose M. Rus [mailto:jm...@in...]
>Sent: Thursday, July 10, 2003 4:19 PM
>To: jyt...@li...
>Subject: [Jython-users] problem with encoding
>
>
>Hi,
>
>I'm trying to extend InteractiveInterpreter but I have some problems
>with encoding. For example,
>if I compile the following test class:
>
>public class Test {
>    public static void main(String[] args) {
>        System.out.println("Test: á");
>        PythonInterpreter interp = new PythonInterpreter();
>        interp.exec("print 'á'");
>    }
>}
>
>The second printed line shows a wrong character for 'á', but the Jython
>interactive console prints
>the right char. I think maybe it is due to Readers using the platform
>encoding (UTF-8 in my case)
>and InputStreams using the Java default encoding (UTF-16?).
>
>Any suggestions?
>
>Thanks.
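
A note on the 65533 in the embedded output: that value is U+FFFD, the Unicode replacement character, which Java substitutes when it decodes bytes that are not valid in the chosen charset. The sketch below is not taken from the Jython sources. The class and method names are made up for illustration, and it only assumes that the 'á' literal reached the interpreter as a single Latin-1 byte that was then decoded as UTF-8, which may or may not be what happens inside PythonInterpreter.exec.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {
    // Hypothetical helper: decode raw bytes as UTF-8 and return the first code point.
    public static int firstCodePointAsUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).codePointAt(0);
    }

    public static void main(String[] args) {
        // 'á' (U+00E1) encoded as Latin-1 is the single byte 0xE1.
        byte[] latin1 = "\u00E1".getBytes(StandardCharsets.ISO_8859_1);
        // The same character encoded as UTF-8 is the two bytes 0xC3 0xA1.
        byte[] utf8 = "\u00E1".getBytes(StandardCharsets.UTF_8);

        // A lone 0xE1 is malformed UTF-8, so the decoder substitutes U+FFFD.
        System.out.println(firstCodePointAsUtf8(latin1)); // prints 65533
        // Correctly encoded bytes round-trip to the original code point.
        System.out.println(firstCodePointAsUtf8(utf8));   // prints 225
    }
}
```

If this assumption holds, the character is being lost in a byte-to-char conversion with mismatched charsets rather than in print itself.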