From: Philip J. <pj...@un...> - 2013-12-03 21:28:46
On Nov 30, 2013, at 4:08 PM, Jeff Allen wrote:

> I'm looking at a last test failure in test_file2k, which fails because
> we do not support PYTHONIOENCODING (and for an incidental problem with
> subprocess.Popen), but it asks more than a simple bug-fix or two. It
> alerts us to a missing feature.
>
> PYTHONIOENCODING was new in CPython 2.6, so 2.7 is the first version of
> Jython that should support it. I think the implications are:
>
> 1. Initialisation should read PYTHONIOENCODING to set the encoding
> applicable to std(in|out|err) and the error policy applicable to
> std(in|out). I can tell from the test this applies even if these streams
> are not terminals.
>
> 2. PYTHONIOENCODING is applied during PySystemState initialisation (so
> it applies to the embedded interpreter) not just when using the command
> org.python.util.jython. As a property of the stream it could not apply
> to streams replaced with setIn, setOut, etc. e.g. under JSR-223
> interpreters.
>
> 3. We remove the -C <encoding> option from the Jython command line.
> (It's not present in CPython 2.6 onwards: I can't find any documentation
> for the CPython 2.5 command but assume it was superseded ... if it ever
> existed. It's not documented for Jython 2.5 even.)

It's a really old Jython-specific argument (8b2add647610). It looks like it
was originally '-E' but I changed it to '-C' in 5b29613a1add for
compatibility w/ CPython 2.5's -E.

It didn't seem to bother anyone when it was changed. Maybe we should
deprecate it, but since the -C change was a non-event I'd be fine with
removing it for 2.7.

>
> 4. The Java property python.console.encoding if given would have
> priority over PYTHONIOENCODING on console streams, that is where the
> stream is a console according to isatty().

This property seems to just be the registry equivalent of -C (-E). So maybe
we can kill it as well?

>
> 5. The file type should have an errors property, and the errors and
> encoding properties should be consulted when a PyUnicode is written,
> whether to a pipe or the console. If it is a console, the console
> handler will then decode (all) bytes written, using the same decoding to
> present characters.
>
> 6. file.encoding is not consulted when reading. If from a pipe, the pipe
> supplies bytes and the read calls return them; if from the console, the
> console handler has already encoded the keystrokes to bytes.
>
> Does this sound about right?

LGTM

--
Philip Jenvey
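
For reference, point 1 of the quoted proposal could be sketched roughly as below. This is only an illustration in plain Python, not Jython's actual initialisation code. It assumes the CPython syntax encodingname[:errorhandler] for PYTHONIOENCODING. The helper names parse_io_encoding and stream_settings are hypothetical, and so are the fallback defaults ("ascii", "strict") and the stderr policy ("backslashreplace"), which the thread does not specify.

```python
import os


def parse_io_encoding(value, default_encoding="ascii", default_errors="strict"):
    """Split an 'encodingname[:errorhandler]' string into (encoding, errors).

    Hypothetical helper: the defaults are assumptions, not Jython behaviour.
    """
    if not value:
        # Variable unset or empty: fall back to the assumed defaults.
        return default_encoding, default_errors
    encoding, _, errors = value.partition(":")
    # Either part may be missing; substitute the assumed default for it.
    return encoding or default_encoding, errors or default_errors


def stream_settings(value, stream_name, stderr_errors="backslashreplace"):
    """Return (encoding, errors) for 'stdin', 'stdout' or 'stderr'.

    Per point 1, the encoding applies to all three streams but the error
    policy only to stdin and stdout. The stderr policy used here is an
    assumption made for the sake of the sketch.
    """
    encoding, errors = parse_io_encoding(value)
    if stream_name == "stderr":
        errors = stderr_errors
    return encoding, errors


if __name__ == "__main__":
    setting = os.environ.get("PYTHONIOENCODING")
    for name in ("stdin", "stdout", "stderr"):
        print(name, stream_settings(setting, name))
```

Whether python.console.encoding should override this on console streams (point 4) is left out, since that property may be removed.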