Re: [Jython-dev] Jython buffer protocol


Hey Jeff,

> This is working out reasonably well.

Sounds like good news! Could you put a draft up, e.g. on GitHub, once it is
in a reasonably sane state?

> ByteBuffer getByteBuffer(int... indices);
I wonder what this is supposed to do; afaik ByteBuffer supports no
multi-index logic. (Correct me if I'm wrong).
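
For reference, a multi-index can only reach a ByteBuffer after being reduced
to a single byte offset. A minimal sketch of that reduction, assuming the
exporter knows its strides in bytes; MultiIndex, offset and positioned are
hypothetical names for illustration, not Jython API:

```java
import java.nio.ByteBuffer;

final class MultiIndex {
    /** Byte offset of an item: index0 + sum over k of indices[k] * strides[k]. */
    static int offset(int index0, int[] strides, int... indices) {
        if (indices.length != strides.length) {
            throw new IllegalArgumentException("expected " + strides.length + " indices");
        }
        int p = index0;
        for (int k = 0; k < indices.length; k++) {
            p += indices[k] * strides[k];
        }
        return p;
    }

    /** A new ByteBuffer sharing the storage, positioned at the addressed item. */
    static ByteBuffer positioned(ByteBuffer storage, int index0, int[] strides, int... indices) {
        ByteBuffer bb = storage.duplicate();
        bb.position(offset(index0, strides, indices));
        return bb;
    }

    public static void main(String[] args) {
        // 3 x 4 array of bytes, row-major: item (r, c) holds 10*r + c.
        ByteBuffer storage = ByteBuffer.allocate(12);
        for (int i = 0; i < 12; i++) {
            storage.put(i, (byte) (10 * (i / 4) + i % 4));
        }
        int[] strides = {4, 1};
        ByteBuffer bb = positioned(storage, 0, strides, 2, 1);
        System.out.println(bb.position() + " " + bb.get());
    }
}
```

The ByteBuffer itself still sees only one index; the multi-dimensional part
lives entirely in the offset calculation.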

> // Extract column c
> ByteBuffer bb = pybuf.getNIOByteBuffer(PyBUF.FULL);
> for (int r=0; r<x.length; r++)
>      x[r] = bb.getFloat( pybuf.index(r,c) );

This looks slow, because method calls are slow (compared to array access)
and it requires at least two calls per index.
Maybe the JIT applies some magic here, but I would not count on it.
However, I guess it's presented fairly out of context, so I may have got the
wrong impression. (Which is why I'm looking forward to a complete
draft, as mentioned above.)
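
To make the concern concrete, here is a sketch of the same column extraction
with the index arithmetic hoisted out of the loop, leaving one ByteBuffer
call per element. It assumes a 2-D array of 4-byte floats whose strides are
known in bytes; ColumnExtract and its parameters are hypothetical and stand
in for whatever the PyBuffer would actually expose:

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

final class ColumnExtract {
    /**
     * Copy column c of a 2-D float array into x, given the byte offset of
     * item (0, 0) and the row and column strides in bytes.
     */
    static void column(ByteBuffer bb, int index0, int rowStride, int colStride,
            int c, float[] x) {
        int p = index0 + c * colStride;   // byte offset of item (0, c)
        for (int r = 0; r < x.length; r++) {
            x[r] = bb.getFloat(p);        // absolute get: position() is untouched
            p += rowStride;               // step to the next row
        }
    }

    public static void main(String[] args) {
        int rows = 3, cols = 4;
        ByteBuffer bb = ByteBuffer.allocate(rows * cols * Float.BYTES);
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                bb.putFloat((r * cols + c) * Float.BYTES, 10f * r + c);
            }
        }
        float[] x = new float[rows];
        column(bb, 0, cols * Float.BYTES, Float.BYTES, 2, x);
        System.out.println(Arrays.toString(x));
    }
}
```

Whether this is measurably faster than calling an index method per element
would have to be benchmarked; the sketch only shows that the second call is
avoidable.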

> I'm following the deprecation route at the moment, but bearing in mind
> Jim's view that breaking change is acceptable by virtue of low adoption,

I wonder whether there is any evidence of how low adoption currently is.
Are there any publicly known projects using this?

Best

Stefan

> Sent: Wednesday, 11 May 2016 at 08:55
> From: "Jeff Allen" <ja...@fa...>
> To: jyt...@li...
> Subject: Re: [Jython-dev] Jython buffer protocol
>
> This is working out reasonably well. It results in widespread change, but
> mostly downwards in complexity. There is less reason to give special 
> cases a fast path when the underlying storage is indirect. Hopefully 
> bulk sequential operations on ByteBuffer implementations are well-optimised.
> 
> getByteBuffer(int index), which is actually still getNIOByteBuffer(int 
> index), does not seem to do as much for me as the Pointer equivalent. I 
> think requiring a PyBuffer to offer you its index calculation is the way 
> to go. Something like:
> 
> assert pybuf.getNdim() == 2;
> assert pybuf.getShape()[0] == x.length;
> 
> // Extract column c
> ByteBuffer bb = pybuf.getNIOByteBuffer(PyBUF.FULL);
> for (int r=0; r<x.length; r++)
>      x[r] = bb.getFloat( pybuf.index(r,c) );
> 
> 
> Jeff Allen
> 
> On 07/05/2016 00:50, Jeff Allen wrote:
> > I'm following the deprecation route at the moment, but bearing in mind
> > Jim's view that breaking change is acceptable by virtue of low adoption,
> > this may only be a transient arrangement. I don't want to maintain two
> > approaches to storage. I favour adding:
> >
> > ByteBuffer getByteBuffer(); // = getNIOByteBuffer
> > ByteBuffer getByteBuffer(int index);
> > ByteBuffer getByteBuffer(int... indices);
> >
> > which differ only in the position() of the returned buffer. Each returns
> > a new ByteBuffer, so that clients may call the incremental get and put
> > methods without interfering. An alternative is to have only the first,
> > but expose the index calculation helpers so one can set the position
> > easily in complex cases.
> >
> > Writing the test code first was quite helpful in this case.
> >
> > The current getNIOByteBuffer attempts to set the buffer limit according
> > to the actual data extent in the view (which is not the whole underlying
> > byte array when it's a slice). This seems unnecessary, and is only
> > useful in the contiguous case. I figure you should always get the whole
> > thing, then work out how many items to read and write from the
> > navigation, not from ByteBuffer.remaining().
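
A small sketch of why remaining() is a poor guide to the item count in a
strided view. The slice parameters are made up for illustration, and
StridedCount is a hypothetical name:

```java
import java.nio.ByteBuffer;

final class StridedCount {
    /** Sum the items of a 1-D byte view, taking the count from the shape, not the buffer. */
    static int sum(ByteBuffer bb, int index0, int stride, int length) {
        int total = 0;
        for (int i = 0, p = index0; i < length; i++, p += stride) {
            total += bb.get(p);
        }
        return total;
    }

    public static void main(String[] args) {
        ByteBuffer whole = ByteBuffer.wrap(new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
        // View equivalent to the Python slice a[1:8:3], i.e. items at 1, 4 and 7.
        int index0 = 1, stride = 3, length = 3;
        System.out.println("remaining() = " + whole.remaining());
        System.out.println("items = " + length + ", sum = "
                + sum(whole, index0, stride, length));
    }
}
```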
> >
> > Ok, in CPython 2.7 the reference to the underlying object is present in
> > the code, just missing from the documentation. I think we can
> > accommodate it.
> >
> >    Jeff Allen
> >
> > On 28/04/2016 11:06, Stefan Richthofer wrote:
> >>> this is a breaking change to the API
> >> I think this can be achieved without a breaking API-change (detailed comments below).
> >> (However if you prefer a slight break to achieve a cleaner API I won't complain.)
> >>    
> >> The possibility that an array-storage access cannot be provided is already contained in the API.
> >> If the flag AS_ARRAY is not set, the current API already doesn't guarantee to offer array-access (via PyBuffer.Pointer).
> >>
> >> Does a type-change of storage field in BaseBuffer count as breaking API change, given that it is
> >> protected? Third-parties that extend BaseBuffer might be affected, which can be avoided by
> >> option 1), i.e. adding ByteBuffer view as a separate field, e.g. "storageBufferView".
> >> We could start with option 1), declare the byte[]-storage field as deprecated and remove it in
> >> 2.7.4 or so. This would provide a smooth transition to variant 2).
> >>    
> >> Replacing PyBuffer.Pointer by ByteBuffer would be a breaking change, but could be avoided too. In Java
> >> fashion PyBuffer.Pointer and corresponding API/methods can be kept as @deprecated. (Or just kept -
> >> I am actually +0 about replacing PyBuffer.Pointer with ByteBuffer)
> >>
> >>
> >>> Concerning the pointer to object member in CPython Py_Buffer, it seems to be a 3.x feature,
> >>> which may be why I haven't replicated it.
> >>    
> >> Taking another look, it seems this feature was actually backported to Python 2. The Py_buffer declaration in
> >> object.h of Python 2.7 is titled /* Py3k buffer interface */. Either way, for the sake of compatibility it would
> >> be best to support it in Jython, given that it is (presumably) easy to add.
> >>
> >>
> >> Am I missing some aspect?
> >>
> >> -Stefan
> >>    
> >>
> >> Sent: Wednesday, 27 April 2016 at 23:57
> >> From: "Jim Baker" <jim...@py...>
> >> To: "Jeff Allen" <ja...@fa...>
> >> Cc: "Stefan Richthofer" <Ste...@gm...>, "Jython Developers" <jyt...@li...>
> >> Subject: Re: [Jython-dev] Jython buffer protocol
> >>
> >> On Wed, Apr 27, 2016 at 4:36 PM, Jeff Allen <ja...@fa...> wrote:
> >> I'm giving serious consideration to idea 2, that is, the storage implementation is j.n.ByteBuffer, always, and *may* wrap a byte[] object. I'd need to try this out to ensure there is no fatal flaw.
> >> *Jim:* this is a breaking change to the API. Do we need to be more careful of possible users? I suspect we are only breaking our own work here: how about you?
> >>
> >> We should mention such a breaking change. Necessarily we have been very conservative on various aspects of our Java API - there is certainly usage out there. But that has been seen in 2.5 or earlier API definitions. I don't see a problem here - any users will be sophisticated and can readily adapt.
> >>    
> >> We would be saying in this that the Jython PyBuffer is allowed to be less like the CPython one than I've been aiming for. This consistency may be less important than Stefan's use case. The CPython protocol promises efficient access to the storage of an object via a pointer, and we would be saying "only as efficient as a j.n.ByteBuffer" ... although it may turn out there's a backing array. j.n.ByteBuffer does not replace PyBuffer, because it cannot describe strided access or the get-release behaviour.
> >> I think this leads to an API in which what I've tried to do with PyBuffer.Pointer we now do by handing out ByteBuffer slices. So Pointer goes away. In that case getBuf() and getNIOByteBuffer() are probably the same thing. I do not think it is safe to hand out the actual storage: it is almost unavoidable clients would manipulate the internal state (position, limit), surprising each other and the PyBuffer implementation if it relies on them, as I think it should.
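
The point about not handing out the actual storage can be sketched as
follows. This is only an illustration using ByteBuffer.duplicate();
SharedStorage and its getByteBuffer() are hypothetical stand-ins, not the
PyBuffer implementation:

```java
import java.nio.ByteBuffer;

final class SharedStorage {
    private final ByteBuffer storage = ByteBuffer.allocate(8);

    /** Hand out a fresh view: same bytes, independent position, limit and mark. */
    ByteBuffer getByteBuffer() {
        return storage.duplicate();
    }

    public static void main(String[] args) {
        SharedStorage s = new SharedStorage();
        ByteBuffer a = s.getByteBuffer();
        ByteBuffer b = s.getByteBuffer();
        a.position(4);
        a.put((byte) 42);   // relative put advances only a's position
        // b sees the byte written through a, but its own position is unchanged.
        System.out.println(a.position() + " " + b.position() + " " + b.get(4));
    }
}
```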
> >> Concerning the pointer to object member in CPython Py_Buffer, it seems to be a 3.x feature, which may be why I haven't replicated it. It seems easy to add. (I'd be rewriting all the constructors anyway.) In CPython it's null when there's a buffer but no object.
> >>    
> >> Jeff Allen
> >>
> >> On 24/04/2016 15:36, Stefan Richthofer wrote:
> >> Jeff,
> >>
> >> good to hear that you can help with this stuff and also that your answer implies you don't have concerns with the new feature itself. Thinking it through again, I think the following way would be cleanest to add this functionality:
> >>
> >> Add a ByteBuffer-type storage, either exclusively or in addition to byte[] storage.
> >>
> >>
> >>
> >> 1) Version with additional field java.nio.ByteBuffer bufferStorage:
> >>
> >> Case byte[]-backed PyBuffer:
> >> (buffer storage must be view on storage, i.e. backed by it and must always point to first element)
> >>
> >> storage is byte[]
> >> bufferStorage is ByteBuffer.wrap(storage)
> >>
> >> getNIOByteBuffer() can use bufferStorage and needn't call ByteBuffer.wrap every time again.
> >>
> >>
> >> Case direct ByteBuffer (likely not having backing array):
> >>
> >> storage is null or if the JVM happens to be capable of providing direct ByteBuffer with byte[] backend: bufferStorage.array()
> >>
> >> bufferStorage is ByteBuffer.allocateDirect(capacity)
> >>
> >> Methods that used to access elements of storage directly are enriched by a fallback for case storage == null. The fallback would directly operate on bufferStorage.
> >>
> >>
> >>
> >>
> >> 2) Version with exclusive Buffer-storage:
> >>
> >> storage type is java.nio.ByteBuffer instead of byte[]
> >>
> >>
> >> Case byte[]-backed PyBuffer:
> >>
> >> storage is ByteBuffer.allocate(capacity) (i.e. non-Direct, so buffer will have backing array!)
> >>
> >> getNIOByteBuffer() can use storage and needn't call ByteBuffer.wrap.
> >>
> >> Methods that used to access elements of storage directly now do this on storage.array() rather than on storage itself (should be doable by a simple search/replace refactoring more or less).
> >>
> >>
> >> Case direct ByteBuffer (likely not having backing array):
> >>
> >> storage is ByteBuffer.allocateDirect(capacity)
> >>
> >> Methods that used to access elements of storage directly are enriched by a fallback for case storage.hasArray() == false. The fallback would directly operate on storage's ByteBuffer methods.
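
A minimal sketch of the kind of fallback described above, assuming a single
ByteBuffer storage field as in version 2. StorageAccess and byteAt are
hypothetical names; the real methods in BaseBuffer would differ:

```java
import java.nio.ByteBuffer;

final class StorageAccess {
    /** Read the byte at index, using the backing array when there is one. */
    static byte byteAt(ByteBuffer storage, int index) {
        if (storage.hasArray()) {
            // Fast path: heap buffer (from allocate or wrap) with an accessible array.
            return storage.array()[storage.arrayOffset() + index];
        } else {
            // Fallback: no accessible array, so go through the ByteBuffer API.
            return storage.get(index);
        }
    }

    public static void main(String[] args) {
        byte[] data = {10, 20, 30, 40};
        ByteBuffer heap = ByteBuffer.wrap(data);            // shares data as its backing array
        ByteBuffer direct = ByteBuffer.allocateDirect(4);   // typically has no backing array
        direct.put(data);
        System.out.println(byteAt(heap, 2) + " " + byteAt(direct, 2));
    }
}
```

Note the arrayOffset() term: a heap buffer obtained by slicing does not
necessarily start at element 0 of its backing array.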
> >>
> >>
> >> I can do the work of writing the fallbacks or help with it up to your discretion.
> >>
> >>
> >> Then another thing: I noticed that CPython's counterpart to PyBuffer contains a reference to the PyObject that exported it, so you can always find the origin of a given PyBuffer. I don't see how this would be feasible with Jython's current PyBuffer implementation. So from JyNI's perspective I can store (as a mapping) the exporter in case it is known for some reason, e.g. because the PyBuffer was converted from a native CPython-like variant.
> >> However there could be situations where the buffer comes from Jython and the origin would be unknown. In that case I would (currently) just provide a NULL-value or PyNone for this field and hope to get away with it for the important extensions. Maybe we could attach a PyBuffer's origin in Jython too...? (e.g. as a JyAttribute only if some global flag is set, which JyNI would then set on load).
> >>
> >> Best
> >>
> >> Stefan
> >>
> >>
> >>
> >> Sent: Saturday, 23 April 2016 at 20:14
> >> From: "Jeff Allen" <ja...@fa...>
> >> To: "Stefan Richthofer" <Ste...@gm...>
> >> Cc: jim...@py...
> >> Subject: Re: [Jython-dev] Jython buffer protocol
> >>
> >> Hi Stefan.
> >>
> >> Refreshing my memory about how these classes work, I can see that I took
> >> at face value the CPython view that the purpose of the buffer interface
> >> is to give clients access to the underlying array of bytes, so
> >> abstraction of the storage always gave way to what I thought would be
> >> efficient. (Abstraction of the unit to be something other than byte is
> >> sketched but clarity and a use case eluded me.)
> >>
> >> I always feel I've failed if I have to cast. My instinct is for option a.
> >>
> >> But I think you would not create a "Direct" parallel to BaseBuffer,
> >> since it contains a lot of helper methods independent of the storage
> >> implementation. Rather, factor it into two layers, the first being
> >> either BaseBuffer or AbstractBuffer (depending on what causes least
> >> pain) and the next layer being two base classes, one the revised
> >> BaseBuffer containing:
> >>        protected byte[] storage;
> >> and the other containing:
> >>        protected ByteBuffer storage;
> >> And into each you migrate whatever it seems natural should come along
> >> with these declarations.
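
The two-layer arrangement might look roughly like this. It is a sketch only:
the class names carry a "Sketch" suffix because they are invented for
illustration and do not match the real BaseBuffer hierarchy, and the view is
reduced to one dimension to keep it short:

```java
import java.nio.ByteBuffer;

/** First layer: navigation helpers that do not depend on how bytes are stored. */
abstract class AbstractBufferSketch {
    protected final int index0, stride, length;

    AbstractBufferSketch(int index0, int stride, int length) {
        this.index0 = index0;
        this.stride = stride;
        this.length = length;
    }

    /** Storage-independent index calculation. */
    int byteIndex(int i) {
        return index0 + i * stride;
    }

    /** Storage-dependent access, supplied by the second layer. */
    abstract byte byteAtImpl(int byteIndex);

    byte byteAt(int i) {
        return byteAtImpl(byteIndex(i));
    }
}

/** Second layer, array flavour. */
class ArrayBackedSketch extends AbstractBufferSketch {
    protected byte[] storage;

    ArrayBackedSketch(byte[] storage, int index0, int stride, int length) {
        super(index0, stride, length);
        this.storage = storage;
    }

    @Override
    byte byteAtImpl(int byteIndex) {
        return storage[byteIndex];
    }
}

/** Second layer, java.nio flavour. */
class NIOBackedSketch extends AbstractBufferSketch {
    protected ByteBuffer storage;

    NIOBackedSketch(ByteBuffer storage, int index0, int stride, int length) {
        super(index0, stride, length);
        this.storage = storage;
    }

    @Override
    byte byteAtImpl(int byteIndex) {
        return storage.get(byteIndex);
    }
}

final class TwoLayerDemo {
    public static void main(String[] args) {
        byte[] data = {0, 1, 2, 3, 4, 5};
        ByteBuffer direct = ByteBuffer.allocateDirect(data.length);
        direct.put(data);
        AbstractBufferSketch a = new ArrayBackedSketch(data, 1, 2, 3);
        AbstractBufferSketch n = new NIOBackedSketch(direct, 1, 2, 3);
        for (int i = 0; i < 3; i++) {
            System.out.println(a.byteAt(i) + " " + n.byteAt(i));
        }
    }
}
```

Neither second-layer class needs a cast, which is the attraction over
storing the data as Object.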
> >>
> >> I've been meaning to get back to Jython: I could do this groundwork if
> >> that would not be confusing.
> >>
> >> Jeff
> >>
> >> Jeff Allen
> >>
> >> On 22/04/2016 21:50, Stefan Richthofer wrote:
> >> Hello Jeff,
> >>
> >> I'm reviving this old thread, because I am about to start actual work on JyNI's support
> >> for buffer-protocol / the PyBuffer builtin type.
> >> I'd like to point you to my recent pull request https://github.com/jythontools/jython/pull/39.
> >> It's a preliminary step for adding support for direct java.nio.ByteBuffers. After establishing this flag
> >> I am going to add some actual support for it. I see basically two ways to go for this
> >>
> >> a) Create a parallel class hierarchy to BaseBuffer et al, backed by direct ByteBuffers. E.g.
> >> call everything with "Direct": DirectBaseBuffer, DirectSimpleBuffer etc.
> >> Then let BufferProtocol implementers check for the flag and use Direct counterpart of the
> >> usually used Buffer-Class accordingly.
> >>
> >> or
> >>
> >> b) Modify existing BaseBuffer such that storage is Object rather than byte[]. Then according to
> >> flags it will be byte[] or ByteBuffer. This variant will involve more explicit type casting than
> >> a), but would involve fewer new classes.
> >>
> >> What is your opinion about this?
> >>
> >> Best
> >>
> >> Stefan
> >>
> >
> > _______________________________________________
> > Jython-dev mailing list
> > Jyt...@li...
> > https://lists.sourceforge.net/lists/listinfo/jython-dev
> >
> 
> 