Re: [Jython-users] problem with encoding

I tried this in a Jython session:

 >>> c = unichr(225)
 >>> print ord(c), '->', c
225 -> á
 >>> c = 'á'
 >>> print ord(c), '->', c
225 -> á

And the same when embedding Jython:

import org.python.util.PythonInterpreter;

public class Test {

    public static void main(String[] args) {
        PythonInterpreter interp = new PythonInterpreter();
        interp.exec("c = unichr(225)");
        interp.exec("print ord(c), '->', c");
        interp.exec("c = 'á'");
        interp.exec("print ord(c), '->', c");
    }
}

But now the output is:
225 -> á
65533 ->  (bad character)

This is strange because the Jython console uses the same PythonInterpreter.
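
For what it's worth, 65533 is U+FFFD, the Unicode replacement character, which Java's decoders substitute for bytes they cannot decode. Below is a minimal sketch of how a charset mismatch can produce it. The class and method names are made up for illustration, StandardCharsets needs Java 7 or later, and I am only assuming that a mismatch of this kind is what happens inside the embedded interpreter:

```java
import java.nio.charset.StandardCharsets;

public class ReplacementCharDemo {

    // Encode the text as ISO-8859-1, then (wrongly) decode those bytes as UTF-8.
    static String mismatchedRoundTrip(String text) {
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(latin1, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The character is written as an escape so the source file encoding does not matter.
        String original = "\u00E1";
        String garbled = mismatchedRoundTrip(original);
        System.out.println((int) original.charAt(0)); // 225
        System.out.println((int) garbled.charAt(0));  // 65533
    }
}
```

The single byte 0xE1 is not a complete UTF-8 sequence, so the decoder replaces it with U+FFFD, which is the same number the embedded interpreter printed.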

Also the web documentation says "Jython have only one string type which
support full two-byte Unicode characters and the functions in the string
module are Unicode-aware. The u"" string modifier is optional and
completely ignored if specified". But in a Jython session:

 >>> print "\u00E1"
\u00E1
 >>> print u"\u00E1"
á

So the u"" modifier is needed here; I guess it is for compatibility with
CPython.
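
Since u"\u00E1" does print correctly in the session above, one possible workaround when embedding is to hand the interpreter an ASCII-only script. This is a rough sketch of a helper that turns a Java String into a u'...' literal; the class and method names are hypothetical, and I have not verified that it cures the embedded case:

```java
public class PyUnicodeLiteral {

    // Build an ASCII-only Python literal of the form u'...' for the given text.
    static String toLiteral(String text) {
        StringBuilder sb = new StringBuilder("u'");
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch == '\\' || ch == '\'') {
                // Backslash and single quote get a backslash in front.
                sb.append('\\').append(ch);
            } else if (ch >= 0x20 && ch < 0x7F) {
                // Printable ASCII passes through unchanged.
                sb.append(ch);
            } else {
                // Everything else becomes a four-digit hex escape.
                sb.append(String.format("\\u%04x", (int) ch));
            }
        }
        return sb.append('\'').toString();
    }

    public static void main(String[] args) {
        System.out.println("c = " + toLiteral("\u00E1"));
    }
}
```

With this, the third exec call in the test class would become something like interp.exec("c = " + PyUnicodeLiteral.toLiteral("á")), so the script text that reaches Jython contains no non-ASCII characters.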

Olloh, Annette wrote:

>Jose,
>
>Came across this problem once with currency symbols. Do you want to try
>printing out the Unicode character of it instead? 
>
>E.g.
>
>System.out.println(String.valueOf('\u00E1'));
>
>--Annette.
>
>-----Original Message-----
>From: Jose M. Rus [mailto:jm...@in...] 
>Sent: Thursday, July 10, 2003 4:19 PM
>To: jyt...@li...
>Subject: [Jython-users] problem with encoding
>
>
>Hi,
>
>I'm trying to extend InteractiveInterpreter but I have some problems 
>with encoding. For example,
>if I compile the following test class:
>
>import org.python.util.PythonInterpreter;
>
>public class Test {
>    public static void main(String[] args) {
>        System.out.println("Test: á");
>        PythonInterpreter interp = new PythonInterpreter();
>        interp.exec("print 'á'");
>    }
>}
>
>The second printed line shows a wrong character for 'á', but the Jython 
>interactive console prints
>the right char. I think maybe it is due to Readers using the platform 
>encoding (UTF-8 in my case)
>and InputStreams using the Java default encoding (UTF-16?).
>
>Any suggestions?
>
>Thanks.
>
>  
>