From: dman <ds...@ri...> - 2001-11-25 01:57:55
On one of my Debian woody boxes jythonc stopped working a while ago. Jython still worked, but running 'jythonc' would give no output. I have now solved the problem, but I think it involves a bug in jython.

I traced through how jythonc was supposed to be run -- it is pretty straightforward: jython is run with /usr/share/jython/Tools/jythonc/jythonc.py as the first argument (and any other arguments are passed to the script). I added a print to the top of jythonc.py, but it wouldn't get printed. It was really strange because I could create a "hello world" program and it would work.

As I took a deeper look, looking at main.py I noticed that there were several Form Feed characters in it. I removed those (from the other source files as well) but those had no bearing on my problem. (I don't think there is a reason to have form feeds anyways, unless perhaps one intends to "cat <source> > /dev/lp0" with an old printer)

The solution, as it turned out, was to open each of the source files, convert them to utf-8 and save them again. What difference does it make to jython whether a (python) source file is saved in latin1 or utf-8? In any case, I think it is a gross error to simply terminate with no message when encountering a file that it doesn't like.

I started the conversion to utf-8 from main.py, and tried running jythonc after each file was changed. It would give me "ImportError" or "AttributeError" when an import of a non-converted file was encountered. Once I had converted all files jythonc worked properly.

The interesting thing about jythonc's source files is that they all have the copyright symbol in a comment at the top of the file. In 'latin1' this is character 0xa9.

I use (g)vim 6.0 as my editor. As you may already know it has two variables, 'enc' and 'fenc'. 'enc' is the global encoding specifier. I can set it to "latin1" or "utf-8" (and probably others, but I haven't tried them).
'fenc' is a setting that is local to the current buffer and specifies what encoding the file should be written as. I can set that to "latin1" or "utf-8" also.

I created 4 files containing only the copyright symbol, each file with a different combination of 'enc' and 'fenc' settings. Interestingly enough, both files with 'fenc' set to "latin1" contained only 0xa9 0xa0 (when viewed with a hex editor). The file with enc=latin1, fenc=utf-8 contained 0xc2 0xa9 0xa0. The file with enc=utf-8, fenc=utf-8 contained 0x00 0x70 0xa0.

I think this copyright character and its encoding may be the source of the whole problem. I'll check with the vim folks too regarding the differences in the two utf-8 files. Hmm, when I open them again, the utf8-utf8 file is messed up (shows ^@p) but the latin1-utf8 file is correct. I used latin1-utf8 as the settings when I converted the jythonc sources.

-D
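P.S. As a cross-check of the byte values above, here is a minimal sketch in plain Python 3 (not Jython, and not what jythonc itself does). It shows how the copyright symbol is encoded in latin1 and in utf-8, leaving aside whatever trailing byte the editor appends, and it re-saves a latin1 file as utf-8 in the same spirit as the manual conversion described above. The names hex_bytes and latin1_to_utf8 are made up for illustration.

```python
COPYRIGHT = "\u00a9"  # the copyright symbol as a Unicode code point


def hex_bytes(text, encoding):
    """Return the bytes of `text` in `encoding` as a list of hex strings."""
    return ["0x%02x" % b for b in text.encode(encoding)]


def latin1_to_utf8(path):
    """Re-save the file at `path` as utf-8, assuming it is currently latin1.

    Hypothetical helper: it trusts the caller that the file really is
    latin1. Running it twice on the same file would double-encode it.
    """
    with open(path, "rb") as f:
        raw = f.read()
    # latin1 maps every byte value to a code point, so decoding cannot fail.
    text = raw.decode("latin-1")
    with open(path, "wb") as f:
        f.write(text.encode("utf-8"))


print(hex_bytes(COPYRIGHT, "latin-1"))  # ['0xa9']
print(hex_bytes(COPYRIGHT, "utf-8"))    # ['0xc2', '0xa9']
```

The 0xa9 and 0xc2 0xa9 results match what the hex editor showed for the fenc=latin1 files and the latin1-utf8 file. The 0x00 0x70 seen in the utf8-utf8 file is not a valid utf-8 encoding of that symbol.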