On Sep 21, 2012, at 1:32 PM, Jeff Allen wrote:
> Thanks for that Philip. I'll start with _io.FileIO, as you suggest, and
> its hierarchy up to _IOBase made to look like the corresponding CPython
> classes. I hope to populate this basically by abstraction from your
> PyFileIO. I've just drawn out the inheritance hierarchy of _io in
> CPython, and there are a lot of classes there.
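For reference, the slice of that hierarchy behind FileIO can be listed from CPython itself. This is only a quick sketch run against CPython's io module; the Jython classes discussed here do not exist yet, so nothing below describes them.

```python
import io

# Walk the method resolution order of FileIO to see which base
# classes a Java implementation would need to mirror.
mro_names = [cls.__name__ for cls in io.FileIO.__mro__]

for name in mro_names:
    print(name)
```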
> On 20/09/2012 21:32, Philip Jenvey wrote:
>> On Sep 18, 2012, at 1:07 AM, Jeff Allen wrote:
>> Hey Jeff, I haven't thought about all this in too much detail but I should point out (as the author of core.io):
>> o It's "loosely" based on PEP 3116 because there are some differences between py2 and py3 file:
>> - universal newlines mode is more configurable (you can choose the 'newline' to use) and it now supports writing
>> - since the buffer/raw layers aren't exposed in Py 2 file, I didn't bother making them threadsafe (PyFile is responsible for the locking)
>> - no 'encoding' arg to open() functionality was needed for Py 2
>> - other small things (like some of the exceptions IOBase raises should be different in Py 3) and probably other things I've forgotten
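To make the 'newline' point concrete, here is a small sketch using CPython's io.TextIOWrapper rather than core.io (whose Java API is not shown in this thread). The helper names read_lines and write_text are made up for the illustration.

```python
import io

RAW = b"one\r\ntwo\rthree\n"

def read_lines(data, newline):
    # Wrap an in-memory byte stream in a text layer with the chosen newline mode.
    text = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline=newline)
    return text.readlines()

def write_text(s, newline):
    # Write through a text layer and return the bytes that reached the raw layer.
    buf = io.BytesIO()
    text = io.TextIOWrapper(buf, encoding="ascii", newline=newline)
    text.write(s)
    text.flush()
    data = buf.getvalue()
    text.detach()  # keep buf open and independent of the wrapper
    return data

universal = read_lines(RAW, None)     # every line ending is translated to "\n"
untranslated = read_lines(RAW, "")    # all endings recognised, none translated
only_crlf = read_lines(RAW, "\r\n")   # only "\r\n" ends a line
written = write_text("a\nb\n", "\r\n")  # "\n" is translated on output
```

The last call is the "now supports writing" part: the chosen newline is applied when writing, not just when reading.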
> In mentioning Py 2 and Py 3 here, do you mean that your implementation
> is Py 3 in character and needs some down-shifting to match the Py 2
> capability? Obviously I'm targeting the Py 2.7 tests.
Yep, but I mention the differences just as a warning if we're to start using these classes for the io module, particularly the text layers where most of the differences are. Then core.io would somehow have to straddle Py 2 (when used w/ file) functionality vs Py 3 (when used via io).
Right now you're going to start with only FileIO, so you probably won't run into any big 2 vs 3 differences except for the differing exception types raised.
>> o I was hoping PyString might eventually be based on bytes instead of char.
> I think so too, for Jython 3. BaseBytes.java was written with this in
> mind, although I think refactoring that hierarchy is overdue.
This will definitely have to happen for 3 but it could also benefit 2. We'd save the bytes -> String conversion in core.io. Plus, right now if you have 1 MB of bytes as a Jython 2 str it takes 2 MB of memory (a Java char is two bytes) =[
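A rough way to see the doubling from plain Python: the sketch below emulates a char-backed string by encoding one-byte-per-character data as UTF-16 (two bytes per code unit, like a Java char). It only models the arithmetic; it does not measure Jython's actual heap usage.

```python
# 1 MiB of raw bytes, as they would arrive from a file.
payload = bytes(1024 * 1024)

# Emulate storing each byte in a 16-bit char: latin-1 maps every byte to
# one code point, and UTF-16-LE spends two bytes on each of those.
as_chars = payload.decode("latin-1").encode("utf-16-le")

ratio = len(as_chars) // len(payload)
print(len(payload), len(as_chars), ratio)
```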
>> I also thought that *possibly* future Py_Buffer support in Jython might be based on, or somehow integrate with, java.nio.Buffer (I'm not sure that's even a great goal though; you might have some insight. Integrating with a ByteBuffer is simple if you have an underlying Java byte array somewhere).
> I agree and I know where it would fit: it can't *be* a PyBuffer, nor be
> extended, but it could (probably should) replace the thing I called
> BufferPointer that encapsulates a byte array and an offset into it. I think
> revisiting readinto() etc. will make me do this.
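As a Python-side analogy for the "byte array plus offset" idea, a memoryview slice over a bytearray lets readinto() fill part of an existing buffer without copying. This is only a sketch using CPython's io.BytesIO; the names storage, offset and window are illustrative and are not the fields of BufferPointer or of any Jython class.

```python
import io

source = io.BytesIO(b"abcdefgh")

storage = bytearray(16)   # the backing array
offset = 4                # where the writable window starts

# A view onto storage[offset:offset + 8]; writes go straight into storage.
window = memoryview(storage)[offset:offset + 8]
count = source.readinto(window)

print(count, bytes(storage))
```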
>> - So note that core.io is well optimized right now, though it could actually gain a slight speedup in Py2 if we got rid of the extra bytes->String (for PyString) conversion
> I think I noticed some of this unwelcome conversion.
>> Basically, adapting core.io to _io will take some doing and I'm not sure how to handle the 2 vs 3 differences. We should also keep in mind that the work shouldn't affect Py2 file performance negatively as that's ultimately more important to Py 2 code. Adding locks to all of the layers could hurt (though maybe Java 6 escape analysis/lock coarsening helps here)
>> In fact, I'm not sure the io module is very heavily used in Py 2 code at all (probably just in some cases of Py3 compat)? You might want to consider doing the bare minimum to get it working for now, and leave optimizing it until later (maybe even until Jython 3). Then you can basically defer on all the points I'm worrying about =]
> I was going for correct first.
>>> 3. There should be a static open() function in
>>> org.python.modules._io._io.java .
>>> 4. fileno() should return something the Python user treats as an opaque
>>> handle, and that open() and the constructors of streams will have to
>>> accept, where currently their CPython implementations expect an int. I
>>> read the discussion around the proper return type of fileno()
>>> (http://comments.gmane.org/gmane.comp.lang.jython.devel/3994 and refs
>> We should have this already unless I'm missing something
> fileno() returns it, but open() doesn't accept it (only a string); nor,
> if I comment out the type test in open(), does the FileIO constructor.
> It shouldn't be too difficult to fix.
builtin open shouldn't accept it, but os.open should (and already does) and io.open allows it too.
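For comparison, this is roughly what the io side looks like on CPython, where the handle happens to be an int: a descriptor obtained from os.open() is passed to io.open(), and fileno() hands the same value back. A sketch only; on Jython the handle would be whatever opaque object fileno() returns.

```python
import io
import os
import tempfile

# Create a scratch file with some content.
fd, path = tempfile.mkstemp()
try:
    os.write(fd, b"hello")
    os.close(fd)

    # os.open() yields a low-level handle...
    fd = os.open(path, os.O_RDONLY)

    # ...which io.open() accepts in place of a file name.
    with io.open(fd, "rb") as f:
        data = f.read()
        same_handle = f.fileno() == fd
    # Leaving the with-block closes the handle (closefd defaults to True).
finally:
    os.remove(path)

print(data, same_handle)
```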
>>> 5. I can make these changes progressively by ditching _io.py (clone of
>>> _pyio.py) and replacing the current CPython io.py with one that
>>> delegates to _pyio.py initially. Then class by class, I change its
>>> delegation from _pyio to _io (Java implementation). In the end, we go
>>> back to the CPython io.py.
>> It was a little simpler in 2.6 in that the bare minimum you needed to implement for the pure Python version of io to work was _fileio.FileIO. The 2.7 _pyio is a little strange in that it refers to io.IO/RawIO/Buffered/TextIOBase to register as ABCs (which requires _io).
>> You can probably get away with implementing just io.FileIO and SEEK_SET/CUR/END (as a builtin that'd replace io.py). Then comment out the ABC registration calls in _pyio.
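For context, the ABC registration calls in question are of the following shape. The sketch runs against CPython's io module, and JavaBackedFileIO is a hypothetical placeholder class, not something that exists in Jython.

```python
import io

class JavaBackedFileIO(object):
    """Hypothetical stand-in for a class implemented outside io.py."""
    def readable(self):
        return True

# Registering makes the class a virtual subclass of the ABC, which is
# why the io ABCs must exist before _pyio can be imported.
io.RawIOBase.register(JavaBackedFileIO)

is_raw = isinstance(JavaBackedFileIO(), io.RawIOBase)
is_iobase = issubclass(JavaBackedFileIO, io.IOBase)

# The seek constants that a minimal builtin io would also need to export.
seek_constants = (io.SEEK_SET, io.SEEK_CUR, io.SEEK_END)
print(is_raw, is_iobase, seek_constants)
```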
> I plan to keep io.py and mirror exactly the hierarchy in CPython. That
> way I expect to be able to use existing Python implementations of
> classes I haven't implemented in Java. My understanding of how classes and
> modules cross that boundary should improve a lot! By which I mean I'll
> be asking whether I've done it right.
Cool, sounds like we're on the same page. If you have any quick questions you can also find me and others on IRC (irc.freenode.net #jython)