On Mon, Nov 26, 2001 at 03:58:39PM +0000, Finn Bock wrote:
Short version:
jython silently produces no output when running scripts encoded in
latin1 with non-ASCII chars in them while $LANG selects UTF-8.
| >What difference does it make to jython whether a (python) source file
| >is saved in latin1 or utf-8? In any case, I think it is a gross error
| >to simply terminate with no message when encountering a file that it
| >doesn't like.
| Sure. Normally jython doesn't. So what is special about woody?
See below. I have now figured out the source of this problem.
| >I started the conversion to utf-8 from main.py,
| I have now removed the latin-1 copyright character in the CVS version.
Cool. That will certainly fix all portability problems since ASCII is
a common subset of all encodings AFAIK (latin1 and utf-8 for sure).
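
To illustrate the subset point, here is a small sketch in Python 3 (not
something run under jython; the sample strings are made up). ASCII-only
text comes out as the same bytes in latin1 and utf-8, while the
copyright sign does not:

```python
# Hypothetical sample comment lines: one pure ASCII, one with the copyright sign.
ascii_text = "# Copyright (c) example"
latin_text = "# Copyright \u00a9 example"

# ASCII-only text encodes to identical bytes in both encodings.
ascii_same = ascii_text.encode("latin-1") == ascii_text.encode("utf-8")

# Text containing the copyright sign encodes differently.
latin_differs = latin_text.encode("latin-1") != latin_text.encode("utf-8")

# The sign is one byte (0xA9) in latin1 but two bytes (0xC2 0xA9) in utf-8.
copyright_latin1 = "\u00a9".encode("latin-1")
copyright_utf8 = "\u00a9".encode("utf-8")

print(ascii_same, latin_differs)
print(copyright_latin1.hex(), copyright_utf8.hex())
```

So a lone 0xA9 byte is valid latin1 but is not a valid utf-8 sequence on
its own.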
| >The interesting thing about jythonc's source files is that they all
| >have the copyright symbol in a comment at the top of the file. In
| >'latin1' this is character 0xa9.
| The python source files are read as text files with an InputStreamReader
| using the default encoding for the platform. Normally that is a good way
| to read text files, but a side effect is that python source programs with
| non-ascii characters aren't portable to other platforms with a different
| I don't know what the cause is, but these experiments might help shed
| light on it.
| What file encoding is used in your setup of woody?
| >>> import java
| >>> java.lang.System.getProperty("file.encoding")
The woody machine I have at work had no problems running jythonc, just
my machine at home. I remembered late last night that I had set $LANG
to en_US.UTF-8 at home. Now that I am at work, I checked with that
machine and it has $LANG set to the default of "C". If I tried
"LANG=en_US.UTF-8 jythonc --help" it failed the same as it was doing
With LANG=C, the encoding used by java is "ISO-8859-1"; with
LANG=en_US.UTF-8 the encoding is "UTF-8".
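
For anyone comparing names: the "ISO-8859-1" that java reports is the
same encoding this thread calls latin1. A quick check in Python 3 (used
only for illustration; this is not jython output):

```python
import codecs

# Both spellings resolve to the same codec in Python's codec registry.
java_name = codecs.lookup("ISO-8859-1").name
thread_name = codecs.lookup("latin1").name
same_codec = java_name == thread_name

print(same_codec)
```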
| Whatever the encoding used is, it may be unable to handle 0xA9
Perhaps, and perhaps java is broken?
I created "hello world" with the copyright symbol in a comment. I did
this with both latin1 and utf-8.
$ LANG=en_US python2.2 hello_latin1.py
$ LANG=en_US python2.2 hello_utf-8.py
$ LANG=en_US.UTF-8 python2.2 hello_latin1.py
$ LANG=en_US.UTF-8 python2.2 hello_utf-8.py
$ LANG=en_US jython hello_latin1.py
$ LANG=en_US jython hello_utf-8.py
$ LANG=en_US.UTF-8 jython hello_latin1.py
$ LANG=en_US.UTF-8 jython hello_utf-8.py
As you can see, CPython (2.2b1) has no problems with the script
regardless of environment and file encoding. Java, however, can't
handle a latin1 file with the environment set to UTF-8.
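
The same four combinations can be mimicked at the byte level. This is
only a sketch in Python 3 with strict decoding, assuming the reader
rejects malformed input; the sample source text is a made-up stand-in
for the two hello scripts, and it does not reproduce what jython or
java actually do internally:

```python
# Hypothetical stand-in for the two test scripts: a comment with the
# copyright sign plus a print statement, as raw bytes in each encoding.
source = "# \u00a9 test\nprint('hello world')\n"
files = {
    "hello_latin1.py": source.encode("latin-1"),
    "hello_utf-8.py": source.encode("utf-8"),
}

results = {}
for name, raw in files.items():
    for reader_encoding in ("latin-1", "utf-8"):
        try:
            raw.decode(reader_encoding)  # strict decoding: bad bytes raise
            results[(name, reader_encoding)] = "ok"
        except UnicodeDecodeError:
            results[(name, reader_encoding)] = "decode error"

for key, outcome in sorted(results.items()):
    print(key, outcome)
```

Only the latin1 file read as utf-8 fails. The utf-8 file read as latin1
decodes without error, although the copyright sign turns into two
unrelated characters inside the comment.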
I should do some experiments at the Java level and see what it does in
that situation. Maybe it causes a problem in Jython's parsing (i.e. the
comment ends up extending to the end of the file) or maybe there is
some error that is silently ignored.
| >>> from java import io
| >>> s = io.FileOutputStream("foo")
| >>> s.write("\xA9")
| >>> s.close()
| >>> s = io.FileReader("foo")
| >>> print hex(s.read())
| >>> s.close()
I just did a quick test using jython (interactive coding is very
$ LANG=en_US.UTF-8 jython
Jython 2.1a1 on java1.3.1 (JIT: null)
>>> from java.io import *
>>> f = InputStreamReader( FileInputStream( "hello_latin1.py" ) )
>>> while 1 : print f.read()
Traceback (innermost last):
File "<console>", line 1, in ?
at java.lang.reflect.Method.invoke(Native Method)
I'll attach the file so you can see it for yourself. It looks like
jython catches this exception, but silently ignores it. Perhaps it
would be a good idea to try and fall back to latin1, then display an
error message if that fails too.
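
A rough sketch of that fallback idea, written in Python 3 rather than
as a patch to jython. The helper name read_source and its parameters
are hypothetical, not anything in jython's code:

```python
def read_source(raw, preferred="utf-8", fallback="latin-1"):
    """Decode raw source bytes, trying the preferred encoding first.

    Hypothetical helper: returns (text, encoding_used).
    """
    try:
        return raw.decode(preferred), preferred
    except UnicodeDecodeError as exc:
        # Report the problem instead of failing silently.
        print("warning: cannot decode as %s (%s); trying %s"
              % (preferred, exc.reason, fallback))
    # If the fallback fails too, the exception propagates as the error.
    return raw.decode(fallback), fallback


# A latin1-encoded script read in a utf-8 environment.
text, used = read_source(b"# \xa9 test\nprint('hello world')\n")
print(used)
```

One caveat: every byte value is valid in latin1, so a latin1 fallback
never fails by itself. The second error path only matters if the
fallback is some other encoding.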
| >I use (g)vim 6.0 as my editor. As
| >you may already know it has two variables, 'enc' and 'fenc'.
| You could change the file encoding of the source files. You would then
| have to change the encoding used by java as well. But I strongly doubt
It was already changed -- changing the encoding of the files caused
them to match the encoding java was using.
| that you want to go there. If latin1 is suitable for your country and
| language, stick with that.
I suppose maybe I should. At least I know what to look for now if it
happens again :-).
Even youths grow tired and weary,
and young men stumble and fall;
but those who hope in the Lord
will renew their strength.
They will soar on wings like eagles;
they will run and not grow weary,
they will walk and not be faint.