From: Yuji Y. <Yam...@og...> - 2011-11-14 19:45:39
What encoding do you use on the command prompt? I got a different result
with the English version of Windows XP with East Asian Language Support.

C:\Documents and Settings\yyamano>chcp
Active code page: 932

C:\Documents and Settings\yyamano>c:\opt\jython2.5.2\bin\jython.bat
Jython 2.5.2 (Release_2_5_2:7206, Mar 2 2011, 23:12:06)
[Java HotSpot(TM) Client VM (Sun Microsystems Inc.)] on java1.6.0_06
Type "help", "copyright", "credits" or "license" for more information.
>>> print 'test kanji - こんにちは'
...
...
...
LookupError: unknown encoding 'ms932'

On Mon, 14 Nov 2011 10:11:17 -0800, Joey Jarosz <jo...@ca...> wrote:
> Hi Alan,
>
> Java 6u17
> sys.version_info == (2, 5, 2, 'final', 0)
> Windows XP
>
> I am actually using the interpreter embedded in my Java application. I have verified in my debugger that I am passing the UTF-8 characters down to the interpreter without losing anything.
>
> If I access an attribute that I exposed from the Java level that returns a string, it indeed returns things correctly as shown below.
>
> >>> p.notes
> '\u3053\u3093\u306b\u3061\u306f'
>
> >>> print p.notes
> こんにちは
>
> But if I import the following function it does not work.
>
> # -*- coding: UTF-8 -*-
> def testUTF8():
>     print 'こんにちは'
>
> I can open the above file in several different text editors that support UTF-8 and they display it correctly.
>
> I am getting myself really confused.
>
> -----Original Message-----
> From: ala...@gm... [mailto:ala...@gm...] On Behalf Of Alan Kennedy
> Sent: Friday, November 11, 2011 10:40 AM
> To: Joey Jarosz
> Cc: jyt...@li...
> Subject: Re: [Jython-users] [jython-users] UTF-8 support for interactive input
>
> [Joey]
> > I cannot get Jython 2.5.2 to correctly handle UTF characters when executing
> > the following sort of lines. I can get Kanji to output correctly if I call one
> > of my Java routines that returns a string that contains Kanji – so it seems
> > like an “input” issue. Any ideas?
> >
> >>>>print 'test kanji - こんにちは'
> >
> > test kanji - ã “ã‚“ã «ã ¡ã ¯
>
> A few questions we need to find the answers to before getting to the
> bottom of it.
>
> 1. By the look of the above, you're specifying this Kanji string
> in the interactive interpreter. If so,
> - What java version are you using?
> - What (precise) jython version are you using?
> - What operating system are you using?
> - What is the input encoding of your console?
> - What is the output encoding of your console?
>
> The encoding of your console matters.
>
> If I enter your test string in an encoding independent manner, e.g.
>
> >>> s = u'test kanji \u3053\u3093\u306b\u3061\u306f'
>
> Then I can't print it on my windows CP437 console
>
> >>> print s
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "C:\jython252\Lib\encodings\cp1252.py", line 12, in encode
>     return codecs.charmap_encode(input,errors,encoding_table)
> UnicodeEncodeError: 'charmap' codec can't encode character u'\u3053'
> in position 11: character maps to <undefined>
>
> But I can print by escaping the contents so that they display in
> ascii-friendly encoding
>
> >>> print s.encode("ascii", "xmlcharrefreplace")
> test kanji &#12371;&#12435;&#12395;&#12385;&#12399;
>
> >>>> execfile('C:/temp/こんにちは/test.py')
>
> And this case is different again, because you're using kanji in a
> filename. Operating systems treat unicode filenames differently.
> (Although I'm guessing from your pathname that you're using windows).
>
> Try this
>
> >>> execfile(u'C:/temp/\u3053\u3093\u306b\u3061\u306f/test.py')
>
> Alan.
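
For reference, the garbled output quoted above ("ã “ã‚“...") looks like what you get when the UTF-8 bytes of the string are read back as Windows-1252. Below is a minimal sketch of that effect. It assumes CPython 3 rather than the Jython 2.5.2 used in this thread, and `utf8_as_cp1252` is a hypothetical helper name, not part of Jython or any library. It illustrates the symptom only and is not a diagnosis of the embedded interpreter.

```python
# Sketch (CPython 3, not the Jython 2.5.2 from the thread): reproduce the
# kind of mojibake shown in the quoted output by decoding UTF-8 bytes with
# the wrong codec. `utf8_as_cp1252` is a hypothetical helper name.

def utf8_as_cp1252(text):
    """Encode text as UTF-8, then misread those bytes as Windows-1252."""
    raw = text.encode("utf-8")
    # Byte 0x81 has no mapping in Python's cp1252 codec, so "replace"
    # substitutes U+FFFD instead of raising UnicodeDecodeError.
    return raw.decode("cp1252", "replace")


kanji = "\u3053\u3093\u306b\u3061\u306f"  # the greeting used in the thread
garbled = utf8_as_cp1252(kanji)

# Each of these kanji is three UTF-8 bytes, so 5 characters become 15.
print(len(kanji), len(garbled))

# Print ASCII-safe escapes so the result is visible on any console.
print(garbled.encode("ascii", "backslashreplace").decode("ascii"))

# The escaping approach from Alan's reply, which avoids the console encoding:
print(kanji.encode("ascii", "xmlcharrefreplace").decode("ascii"))
```

Each kanji turns into an "ã" followed by two more characters, which matches the pattern in the quoted output. That suggests the bytes reaching the interpreter are intact and that the mismatch is in how they are decoded.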