From: Stefan R. <Ste...@gm...> - 2014-05-29 13:17:50
|
Hello,

I am currently working on support for garbage collection and weak references in JyNI. For weak references, I would like to use the _weakref module that is already included in Jython, but I need to tweak it a bit, because JyNI needs to know whether a PyCPeer (i.e. a subclass of PyObject that JyNI uses to wrap native objects in some cases) is weakly referenced. The reason is, roughly speaking, that in the case of weak references the native object should keep the peer alive (i.e. use a global ref in JNI), while normal references should use a peer that keeps the native object alive and not vice versa (i.e. use a weak global ref in JNI). So there will potentially be two versions of PyCPeer with the keep-alive relation

    PyCPeer ---keeps alive---> native object ---keeps alive---> PyCPeer

Normal references should refer to the leftmost one, while weak references should refer to the rightmost one (the Python-level equality function will report them as equal).

Of course the default implementation of _weakref is not aware of this. So I came up with roughly two (three) competing ideas for how to do it.

0) The official way to get the weak references that point to a specific object is WeakrefModule.getweakrefs(...). However, I would have to poll that method, which is an imprecise and inefficient solution. (So I don't count it as a solution.) Once a new weak reference to a PyCPeer showed up, I would replace its referent with the right version. Since GlobalRef extends java.lang.ref.WeakReference, which does not allow the referent to be modified, this would involve creating a new GlobalRef and replacing the old one. This is rather hacky, since the GlobalRef is stored in a private field (i.e. I would use reflection with setAccessible(true), or let native code do the operation).

1) Use org.python.modules._weakref.GlobalRef.references to keep track. I would write a List implementation that wraps some other list as its backend but reports modifications to a listener. I would use the original org.python.modules._weakref.GlobalRef.references as the backend and insert my custom list into the org.python.modules._weakref.GlobalRef.references field. Since it is a private field, this would be hacky again, but it could be done the same way as mentioned above. The rest would be like in "solution" 0), but the polling would be avoided.

2) Write an adjusted version of the _weakref module (i.e. one that checks for PyCPeer BEFORE creating GlobalRefs, so it works completely without fumbling with private fields and avoids useless recreation of GlobalRefs). I tried this first, because I consider it the least hacky solution, but ran into problems I can't resolve. For testing and as a proof of concept, I took the _weakref source code, put it into a new package JyNI._weakref and only modified the doc-string to be able to distinguish the modules.

Then I adjusted the JyNI initializer as follows:

    public void initialize(Properties preProperties, Properties postProperties, String[] argv, ClassLoader classLoader, ExtensiblePyObjectAdapter adapter)
    {
        //Customize the _weakref module to JyNI-needs:
        String blti = (String) postProperties.get("python.modules.builtin");
        postProperties.put("python.modules.builtin", blti == null ? "_weakref:JyNI._weakref.WeakrefModule" : blti+",_weakref:JyNI._weakref.WeakrefModule");

        //init PySystemState:
        PySystemState initState = PySystemState.doInitialize(preProperties, postProperties, argv, classLoader, adapter);

        //further initialization stuff...

This should override the _weakref builtin with my custom version. If I do

    import _weakref
    print _weakref.__doc__

it works nicely and indeed prints the doc-string of my custom module. But if I do

    from _weakref import ref
    print ref
    bla = "bla"
    test = ref(bla)
    print test

original Jython outputs:

    <type 'weakref'>
    <weakref at 0x2; to 'str' at 0x3>

(i.e. it works fine), while the output using my initializer is:

    <type 'JyNI._weakref.ReferenceType'>
    Traceback (most recent call last):
      File "/home/stefan/eclipseWorkspace/JyNI/JyNI-Demo/src/JyNIWeakRefTest.py", line 41, in <module>
        test = ref(bla)
    TypeError: JyNI._weakref.ReferenceType(): expected 2-3 args; got 1

So obviously my tweak fails to set up the right type for ref and yields no appropriate MRO. However, I have no idea why this happens, since Jython should initialize the custom module in just the same way as it does org.python.modules._weakref.WeakrefModule: it is absolutely identical apart from the doc-string and the package path (and the latter is adjusted in the properties).

Can someone tell me why approach 2) fails and maybe how to fix it? Or should I work out approach 1) instead? Or do you even have a better idea for how to reach my goal?

Thanks in advance.

-Stefan |
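The list wrapper described in approach 1) could look roughly like the sketch below. It is only an illustration under stated assumptions: `ListListener` and `NotifyingList` are hypothetical names that exist in neither Jython nor JyNI, and the step of installing the wrapper into the private `GlobalRef.references` field (via reflection or native code) is deliberately left out.

```java
import java.util.AbstractList;
import java.util.List;

// Hypothetical listener interface; not part of Jython or JyNI.
interface ListListener<E> {
    void elementAdded(E element);
    void elementRemoved(E element);
}

// Hypothetical List implementation that delegates to a backing list
// and reports every modification to a listener.
class NotifyingList<E> extends AbstractList<E> {
    private final List<E> backend;
    private final ListListener<E> listener;

    NotifyingList(List<E> backend, ListListener<E> listener) {
        this.backend = backend;
        this.listener = listener;
    }

    @Override public E get(int index) { return backend.get(index); }

    @Override public int size() { return backend.size(); }

    // AbstractList routes add(E) through this method as well.
    @Override public void add(int index, E element) {
        backend.add(index, element);
        listener.elementAdded(element);
    }

    // AbstractList routes remove(Object) and iterator removal through this method.
    @Override public E remove(int index) {
        E removed = backend.remove(index);
        listener.elementRemoved(removed);
        return removed;
    }

    @Override public E set(int index, E element) {
        E old = backend.set(index, element);
        listener.elementRemoved(old);
        listener.elementAdded(element);
        return old;
    }
}
```

Because the wrapper extends `AbstractList`, only the four positional methods need to be overridden; the inherited convenience methods funnel into them, so the listener sees every change made through the wrapper. Changes made directly to the backing list would go unnoticed.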
From: Jim B. <jim...@py...> - 2014-05-29 22:59:51
|
Stefan,

Glad to hear of the progress on JyNI. Getting weakrefs working is going to be essential to JyNI supporting the CPython memory model in conjunction with Java's.

We likely need another solution, one that's closer to approach #0, but without the poll. Basically it should be possible for using code to register callbacks, with RefReaperThread calling each of the registered callbacks for each GlobalRef object. A callback could enqueue (onto a LinkedBlockingQueue for example) or do some other processing. This should be general enough for your management purposes.

Note that the current implementation causes a recurring issue we have seen in how threads can interact with class loaders: http://bugs.jython.org/issue2127 This simply means we can fix this problem in revisiting this support!

Interestingly, Jiwon Seo requested a somewhat similar notification mechanism here, but with respect to generated bytecode, https://bitbucket.org/jython/jython/pull-request/41/adding-bytecode-notification-mechanism This makes sense to me - we are now trying to build more sophisticated integrations with Jython. (Hopefully we will get that merged in soon, sorry that has taken so long, Jiwon!)

- Jim |
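The callback scheme suggested in this reply might be sketched as follows. This is not Jython's actual RefReaperThread: `RefCallback` and `CallbackReaper` are hypothetical names, and the sketch assumes only that the reaper drains a `java.lang.ref.ReferenceQueue` and hands each reference it finds to every registered callback.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical callback type; Jython's _weakref module does not define it.
interface RefCallback {
    void refCleared(Reference<?> ref);
}

// Hypothetical stand-in for RefReaperThread with a callback registry.
class CallbackReaper extends Thread {
    private final ReferenceQueue<Object> queue;
    // Copy-on-write keeps iteration safe while callbacks are being registered.
    private final List<RefCallback> callbacks = new CopyOnWriteArrayList<RefCallback>();

    CallbackReaper(ReferenceQueue<Object> queue) {
        this.queue = queue;
        setDaemon(true); // do not keep the JVM alive just for reaping
    }

    void register(RefCallback callback) {
        callbacks.add(callback);
    }

    // Invoke every registered callback for one reference taken from the queue.
    void dispatch(Reference<?> ref) {
        for (RefCallback callback : callbacks) {
            callback.refCleared(ref);
        }
    }

    @Override public void run() {
        try {
            while (true) {
                dispatch(queue.remove()); // blocks until a reference is enqueued
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // exit quietly when interrupted
        }
    }
}
```

A callback that merely enqueues, as proposed above, could then be registered with `reaper.register(pending::add)` for some `LinkedBlockingQueue<Reference<?>> pending`, leaving the actual processing to whichever thread consumes that queue.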
From: Stefan R. <Ste...@gm...> - 2014-05-30 07:01:54
|
Hello Jim,

thanks for the response. I also thought about using weakref's callback mechanism. The problem is that callbacks are processed when the referent is collected, which is too late for my purpose. Without the adjustment, the PyCPeers (when acting as referents) might be collected too early, i.e. while the native object is in fact still available (kept alive by native pointers). So my adjustment must take place before the CPeer is collected, ideally right when a weak reference is created.

I also thought about reviving the weakref (i.e. GlobalRef) at the AbstractReference level once the callback is received. The problem here is that other callbacks might be registered for the CPeer. These would be called by the RefReaperThread just like my own callback, receive false alerts and do all sorts of cleanup too early. I don't see a good way to tell the RefReaperThread which callbacks to call and which not. So I would prefer solution #1. I am still keen to understand why #2 fails. Any clue?

- Stefan |
From: Jim B. <jim...@py...> - 2014-05-30 14:44:19
|
Stefan,

Rather than monkeypatching a custom _weakref module from JyNI (solution #2), it would seem better if we developed a better API, much as I tried to do with the callback scheme in my previous email. This would then let you build an efficient mechanism for your solution #1 and without any problematic "tweaks".

But in doing so, we can also revisit the old scheme of the private static field references being an ArrayList in _weakref.GlobalRef, and the management of AbstractRef objects in it. There easily seem to be additional O(n) and O(n^2) factors in this code that we could eliminate, not to mention possibly also removing the synchronized access, while also providing support for your needs. In part, I'm pretty sure this is due to the fact that weakref was apparently part of Jython 2.1 (anyone want to verify this?), and we really haven't looked at its performance in subsequent releases, just correctness. (I even touched it at one point to change references from Vector to ArrayList, but I did so as part of a general refactoring of the Jython codebase at the end of 2008.)

In general, I'm a big fan of Google Guava collections, which we already use extensively in Jython's implementation; I'm pretty sure we can find a better mapping.

I'm on vacation today through the weekend, so I'm sure I'll have more analysis next week. Looking forward to a good conversation.

- Jim |
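To make the idea of a "better mapping" a little more concrete, here is one possible shape for it, written against the JDK only rather than Guava. Everything in it is an assumption for illustration: `IdentityWeakKey` and `WeakKeyRegistry` are hypothetical names, and the sketch does not mirror how Jython's GlobalRef is actually organized. It shows how a hash map keyed by referent identity finds the canonical entry for an object without scanning a list and without a global lock.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical weak key that hashes and compares by referent identity.
final class IdentityWeakKey extends WeakReference<Object> {
    private final int hash; // cached so the key stays findable after the referent is cleared

    IdentityWeakKey(Object referent, ReferenceQueue<Object> queue) {
        super(referent, queue);
        this.hash = System.identityHashCode(referent);
    }

    @Override public int hashCode() { return hash; }

    @Override public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof IdentityWeakKey)) return false;
        Object mine = get();
        return mine != null && mine == ((IdentityWeakKey) other).get();
    }
}

// Hypothetical registry holding one canonical key per live referent.
class WeakKeyRegistry {
    private final ReferenceQueue<Object> queue = new ReferenceQueue<Object>();
    private final ConcurrentMap<IdentityWeakKey, IdentityWeakKey> keys =
            new ConcurrentHashMap<IdentityWeakKey, IdentityWeakKey>();

    // Expected constant-time lookup or insertion, with no list scan.
    IdentityWeakKey keyFor(Object referent) {
        IdentityWeakKey fresh = new IdentityWeakKey(referent, queue);
        IdentityWeakKey existing = keys.putIfAbsent(fresh, fresh);
        return existing != null ? existing : fresh;
    }

    // Drop entries whose referent has been collected; a reaper thread could call this.
    void expunge() {
        Reference<?> cleared;
        while ((cleared = queue.poll()) != null) {
            keys.remove(cleared); // found via the cached hash and object identity
        }
    }

    int size() { return keys.size(); }
}
```

Guava's weak-keyed maps offer comparable behaviour with less code; the hand-rolled key is used here only to keep the example free of external dependencies.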
From: Stefan R. <Ste...@gm...> - 2014-05-30 15:49:50
|
<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;"><div> <div>Alright.</div> <div>>>it would seem better if we developed a better API</div> <div>I'm not sure whether this would result in a better API, since JyNI's purpose is very special and opening an official gate for this</div> <div>might encourage nasty or even malicious hacks. So it might be better to simply keep it semi-closed and use hacky solutions in</div> <div>the rare cases when it is needed. That said, let me state what API I would need.</div> <div> </div> <div>I would need an interface like</div> <div> </div> <div> <div>public interface WeakrefListener {</div> <div> public void weakrefCreated(AbstractReference ref);</div> <div> public void weakrefDisposed(AbstractReference ref); // I would not need this one, but it would make the interface more consistent</div> <div>}</div> </div> <div> </div> <div>Any PyObject implementing this interface would be notified by the _weakref module once it becomes a referent or stops being one.</div> <div>However, this would not yet remove the hack necessary to replace the referent with a kind of proxy (which I need in order to fix keep-alive relations</div> <div>with native objects). To fix even this, I would need something like</div> <div> </div> <div>public interface WeakrefDelegate {</div> <div> PyObject getReferentProxy(PyObject referent);</div> <div>}</div> <div> </div> <div>which is even more radical. When creating a weak reference, the _weakref module would first check whether the referent implements</div> <div>this interface and, if so, would use the obtained proxy as the referent rather than the original one.</div> I don't want to imagine what bad stuff this would allow. 
Maybe one should constrain the resulting object to have the <div>same class as the original or something.</div> <div>Note that I would not need WeakrefListener if WeakrefDelegate were part of the API, so you would not have to include both ideas.</div> <div> </div> <div>I think I will work out #1 for now - it is the closest one to my suggested API extension and it would be easy for me to adapt the solution</div> <div>to a potential API change one day.</div> <div> </div> <div>- Stefan</div> <div> </div> <div> </div> <div> <div name="quote" style="margin:10px 5px 5px 10px; padding: 10px 0 10px 10px; border-left:2px solid #C3D9E5; word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space;"> <div style="margin:0 0 10px 0;"><b>Sent:</b> Friday, 30 May 2014 at 16:38<br/> <b>From:</b> "Jim Baker" <jim...@py...><br/> <b>To:</b> "Stefan Richthofer" <Ste...@gm...><br/> <b>Cc:</b> "Jython Developers" <jyt...@li...><br/> <b>Subject:</b> Re: [Jython-dev] weak reference support in JyNI</div> <div name="quoted-content"> <div>Stefan, <div> </div> <div>Rather than monkeypatching a custom _weakref module from JyNI (solution #2), it would seem better if we developed a better API, much as I tried to do with the callback scheme in my previous email. This would then let you build an efficient mechanism for your solution #1 and without any problematic "tweaks".</div> <div> </div> <div>But in doing so, we can also revisit the old scheme of the private static field references being an ArrayList in _weakref.GlobalRef, and the management of AbstractRef objects in it. There easily seem to be additional O(n) and O(n^2) factors in this code that we could eliminate, not to mention possibly also removing the synchronized access, while also providing support for your needs. 
In part, I'm pretty sure this is due to the fact that weakref was apparently part of Jython 2.1 (anyone want to verify this?), and we really haven't looked at its performance in subsequent releases, just correctness. (I even touched it at one point to change references from Vector to ArrayList, but I did so as part of a general refactoring of the Jython codebase at the end of 2008.)</div> <div> </div> <div>In general, I'm a big fan of Google Guava collections, which we already use extensively in Jython's implementation; I'm pretty sure we can find a better mapping.</div> <div> </div> <div>I'm on vacation today through the weekend, so I'm sure I'll have more analysis next week. Looking forward to a good conversation.</div> <div> </div> <div>- Jim</div> <div> </div> </div> <div class="gmail_extra"> <div class="gmail_quote">On Fri, May 30, 2014 at 1:01 AM, Stefan Richthofer <span><<a href="Ste...@gm..." target="_parent">Ste...@gm...</a>></span> wrote: <blockquote class="gmail_quote" style="margin: 0 0 0 0.8ex;border-left: 1.0px rgb(204,204,204) solid;padding-left: 1.0ex;"> <div> <div style="font-family: Verdana;font-size: 12.0px;"> <div> <div>Hello Jim,</div> <div>thanks for the response. I also thought about using weakref's callback-mechanism. The problem here is that callbacks are processed when the referent is collected. However this is too late for my purpose. Without the adjustment, the PyCPeers (if in the role of referents) might be collected too early, i.e. when the native object is still available in fact (but kept alive by native pointers). So my adjustment must take place before the CPeer is collected, optimally directly when a weak reference is created. I also thought about reviving the weakref (i.e. GlobalRef) on AbstractReference-level once the callback is received. Problem here is that other callbacks might be registered for the CPeer. 
These would be called by the RefReaperThread just like my own callback, receive wrong alerts and do all sorts of cleanup too early. I don't see a good way to tell the RefReaperThread which callbacks to call and which not. So I would prefer solution #1. I am still keen on understanding why #2 fails. Any clue?</div> <div> </div> <div>- Stefan</div> <div> <div style="margin: 10.0px 5.0px 5.0px 10.0px;padding: 10.0px 0 10.0px 10.0px;border-left: 2.0px solid rgb(195,217,229);"> <div style="margin: 0 0 10.0px 0;"><b>Sent:</b> Friday, 30 May 2014 at 00:59<br/> <b>From:</b> "Jim Baker" <<a href="jim...@py..." target="_parent">jim...@py...</a>><br/> <b>To:</b> "Stefan Richthofer" <<a href="Ste...@gm..." target="_parent">Ste...@gm...</a>><br/> <b>Cc:</b> "Jython Developers" <<a href="jyt...@li..." target="_parent">jyt...@li...</a>><br/> <b>Subject:</b> Re: [Jython-dev] weak reference support in JyNI</div> <div> <div> <div class="h5"> <div>Stefan, <div> </div> <div>Glad to hear of the progress on JyNI. Getting weakrefs working is going to be essential to JyNI supporting the CPython memory model in conjunction with Java's.</div> <div> </div> <div>We likely need another solution, one that's closer to approach #0, but without the poll. Basically it should be possible for using code to register callbacks, with RefReaperThread calling each of the registered callbacks for each GlobalRef object. A callback could enqueue (onto a LinkedBlockingQueue for example) or do some other processing. 
This should be general enough for your management purposes.</div> <div> </div> <div>Note that the current implementation causes a recurring issue we have seen in how threads can interact with class loaders: <a href="http://bugs.jython.org/issue2127" target="_blank">http://bugs.jython.org/issue2127</a> This simply means we can fix this problem in revisiting this support!</div> <div> </div> <div>Interestingly, Jiwon Seo requested a somewhat similar notification mechanism here, but with respect to generated bytecode, <a href="https://bitbucket.org/jython/jython/pull-request/41/adding-bytecode-notification-mechanism" target="_blank">https://bitbucket.org/jython/jython/pull-request/41/adding-bytecode-notification-mechanism</a> This makes sense to me - we are now trying to build more sophisticated integrations with Jython. (Hopefully we will get that merged in soon, sorry that has taken so long, Jiwon!)</div> <div> </div> <div>- Jim</div> <div> </div> </div> <div class="gmail_extra"> <div class="gmail_quote">On Thu, May 29, 2014 at 7:17 AM, Stefan Richthofer <span><<a href="http://Ste...@gm..." target="_blank">Ste...@gm...</a>></span> wrote: <blockquote class="gmail_quote" style="margin: 0 0 0 0.8ex;border-left: 1.0px rgb(204,204,204) solid;padding-left: 1.0ex;">Hello,<br/> <br/> I am currently working out the support of garbage collection and weak references in JyNI. For weak references, I would use the _weakref module that is already included in Jython, but I need to tweak it a bit, because JyNI needs to know, whether a PyCPeer (i.e. a subclass of PyObject that JyNI uses to wrap native objects in some cases) is weakly referenced. The reason for this is - roughly speaking - that in case of weak references, the native object should keep the peer alive (i.e. use a global ref in JNI), while normal references should use a peer that keeps the native object alive and not vise versa (i.e. use a weak global ref in JNI). 
So there will be potentially 2 versions of PyCPeer having the keep-alive relation<br/> PyCPeer ---keeps alive---> native object ---keeps alive---> PyCPeer<br/> Normal references should refere to the leftmost one while weak references should refere to the rightmost one (the python level equality function will report them as equal).<br/> <br/> Of course the default implementation of _weakref is not aware of this. So I made up roughly two (three) competing ideas how to do it.<br/> <br/> 0) The official way to get those weak references that point to a specific object is WeakrefModule.getweakrefs(...)<br/> However I would have to poll that method, which is an unprecise and unefficient solution. (So I don't count it as solution)<br/> Once a new weak reference to a PyCPeer would show up, I would replace its referent by the right version. Since GlobalRef extends java.lang.ref.WeakReference, which does not allow to modify the referent, this would involve creating a new GlobalRef and replacing the old one. This is rather tweaky since that is stored in a private field (i.e. I would use reflection with setAccessible(true); or let native code do the operation).<br/> <br/> 1) Use org.python.modules._weakref.GlobalRef.references to keep track.<br/> I would write a List implementation that wraps some other list as backend but reports modifications to a listener. I would use the original org.python.modules._weakref.GlobalRef.references as backend and insert my custom list into the org.python.modules._weakref.GlobalRef.references-field. Since it is a private field, this would be tweaky again, but could be done the same way as mentioned above. The rest would be like in "solution" 0) but the polling would be avoided.<br/> <br/> 2) Write an adjusted version of the _weakref-module (i.e. 
that checks for PyCPeer BEFORE creating GlobalRefs - so it works completely without fumbling with private fields and avoids useless recreation of GlobalRefs).<br/> I tried this first, because I consider it the least tweaky solution, but ran into problems I can't resolve. For testing and a proof of concept, I took the _weakref source code, put it into a new package JyNI._weakref and only modified the doc-string to be able to distinguish the modules.<br/> <br/> Then I adjusted the JyNI-initializer as follows:<br/> <br/> public void initialize(Properties preProperties, Properties postProperties, String[] argv, ClassLoader classLoader, ExtensiblePyObjectAdapter adapter)<br/> {<br/> //Customize the _weakref module to JyNI-needs:<br/> String blti = (String) postProperties.get("python.modules.builtin");<br/> postProperties.put("python.modules.builtin", blti == null ? "_weakref:JyNI._weakref.WeakrefModule" : blti+",_weakref:JyNI._weakref.WeakrefModule");<br/> <br/> //init PySystemState:<br/> PySystemState initState = PySystemState.doInitialize(preProperties, postProperties, argv, classLoader, adapter);<br/> <br/> //further initialization stuff...<br/> <br/> <br/> This would overwrite the _weakref-builtin with my custom version. If I do<br/> <br/> import _weakref<br/> print _weakref.__doc__<br/> <br/> it works nicely and indeed prints the doc-string of my custom module.<br/> But if I do<br/> <br/> from _weakref import ref<br/> print ref<br/> bla = "bla"<br/> test = ref(bla)<br/> print test<br/> <br/> original Jython outputs:<br/> <br/> <type 'weakref'><br/> <weakref at 0x2; to 'str' at 0x3><br/> <br/> (i.e. 
it works fine), while the output using my initializer is:<br/> <br/> <type 'JyNI._weakref.ReferenceType'><br/> Traceback (most recent call last):<br/> File "/home/stefan/eclipseWorkspace/JyNI/JyNI-Demo/src/JyNIWeakRefTest.py", line 41, in <module><br/> test = ref(bla)<br/> TypeError: JyNI._weakref.ReferenceType(): expected 2-3 args; got 1<br/> <br/> <br/> So obviously, my tweak fails to make up the right type for ref and yields no appropriate mro. However I have no clue how this comes, since Jython should initialize the custom module just the same way it would do with or.python.modules._weakref.WeakrefModule - it is absolutely identical apart from the doc-string and the package path (but that is adjusted in the properties).<br/> <br/> Can someone tell me why approach 2) fails and maybe how to fix it? Or should I work out approach 1) instead? Or do you even have a better idea how to reach my goal?<br/> <br/> Thanks in advance.<br/> <br/> -Stefan<br/> <br/> ------------------------------------------------------------------------------<br/> Time is money. Stop wasting it! Get your web API in 5 minutes.<br/> <a href="http://www.restlet.com/download" target="_blank">www.restlet.com/download</a><br/> <a href="http://p.sf.net/sfu/restlet" target="_blank">http://p.sf.net/sfu/restlet</a><br/> _______________________________________________<br/> Jython-dev mailing list<br/> <a href="http://Jyt...@li..." target="_blank">Jyt...@li...</a><br/> <a href="https://lists.sourceforge.net/lists/listinfo/jython-dev" target="_blank">https://lists.sourceforge.net/lists/listinfo/jython-dev</a></blockquote> </div> </div> </div> </div> ------------------------------------------------------------------------------ Time is money. Stop wasting it! Get your web API in 5 minutes. 
<a href="http://www.restlet.com/download" target="_blank">www.restlet.com/download</a> <a href="http://p.sf.net/sfu/restlet_______________________________________________" target="_blank">http://p.sf.net/sfu/restlet_______________________________________________</a> Jython-dev mailing list <a href="Jyt...@li..." target="_parent">Jyt...@li...</a> <a href="https://lists.sourceforge.net/lists/listinfo/jython-dev" target="_blank">https://lists.sourceforge.net/lists/listinfo/jython-dev</a></div> </div> </div> </div> </div> </div> </blockquote> </div> </div> ------------------------------------------------------------------------------ Time is money. Stop wasting it! Get your web API in 5 minutes. <a href="http://www.restlet.com/download" target="_blank">www.restlet.com/download</a> <a href="http://p.sf.net/sfu/restlet_______________________________________________" target="_blank">http://p.sf.net/sfu/restlet_______________________________________________</a> Jython-dev mailing list Jyt...@li... <a href="https://lists.sourceforge.net/lists/listinfo/jython-dev" target="_blank">https://lists.sourceforge.net/lists/listinfo/jython-dev</a></div> </div> </div> </div></div></body></html> |
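Solution #1 from the message above (a List that forwards to the original list and reports modifications to a listener, installed into a private field via reflection) could be sketched roughly as follows. This is only an illustration under stated assumptions: RefHolder is a hypothetical stand-in for the class that owns the private static field, the names NotifyingList and install are invented here, and the sketch assumes the field's declared type is List. If the real field is declared as ArrayList, a subclass of ArrayList would be needed instead of a wrapper.

```java
import java.lang.reflect.Field;
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class NotifyingListSketch {

    /** List wrapper that forwards to a backend list and reports every insertion. */
    static class NotifyingList<E> extends AbstractList<E> {
        private final List<E> backend;
        private final Consumer<? super E> onAdd;

        NotifyingList(List<E> backend, Consumer<? super E> onAdd) {
            this.backend = backend;
            this.onAdd = onAdd;
        }

        @Override public E get(int index) { return backend.get(index); }
        @Override public int size() { return backend.size(); }
        @Override public E set(int index, E element) { return backend.set(index, element); }
        @Override public E remove(int index) { return backend.remove(index); }

        // AbstractList routes add(E) and addAll(...) through this method.
        @Override public void add(int index, E element) {
            backend.add(index, element);
            onAdd.accept(element);
        }
    }

    /** Hypothetical stand-in for a class keeping its list in a private static field. */
    static class RefHolder {
        private static List<String> references = new ArrayList<>();
        static void register(String ref) { references.add(ref); }
    }

    /** Replaces the private field with a notifying wrapper around the original list. */
    @SuppressWarnings("unchecked")
    static void install(Consumer<String> listener) {
        try {
            Field field = RefHolder.class.getDeclaredField("references");
            field.setAccessible(true); // the "tweaky" part mentioned in the thread
            List<String> original = (List<String>) field.get(null);
            field.set(null, new NotifyingList<>(original, listener));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        install(ref -> System.out.println("new weak reference: " + ref));
        RefHolder.register("ref-1");
    }
}
```

The original list stays in place as the backend, so code that already holds entries keeps seeing them; only insertions made through the field after install are reported.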
From: Jim B. <jim...@py...> - 2014-06-02 17:00:15
|
First, let me say that the JyNI project is fantastic work, and we generally try to support such projects in Jython. This is even more so when they have minimal integration requirements for Jython core; and this is letting us actually support the C extension API!

On Fri, May 30, 2014 at 9:49 AM, Stefan Richthofer <Ste...@gm...> wrote:

> Alright.
> >>it would seem better if we developed a better API
> I'm not sure, whether this would result in better API, since the JyNI
> purpose is very special and opening an official gate for this
> might encourage nasty or even malicious hacks.

Given that it's easy to reflect on Jython's runtime internals, much like Python in general, and do other crazy stuff, this is not something that should preclude this type of API. In other words, a language that lets one say True = 0 in someone else's namespace, but actually doesn't have this problem in practice, is going to be fine with this sort of hook requirement.

> So it might be better to simply keep it semi-closed and use tweaky
> solutions in
> the rare cases when it is needed. That said, let me state, what API I
> would need.
>
> I would need an interface like
>
> public interface WeakrefListener {
>     public void weakrefCreated(AbstractReference ref);
>     public void weakrefDisposed(AbstractReference ref); (<- I would not
> need this one, but it would be a more consequent realization of this
> functionality)
> }
>
> Any PyObject implementing this interface would be notified by the _weakref
> module once it becomes a referent or stops being one.

Fair enough, it's a very cheap operation to check for a specific interface (possibly no cost, depending on inlining iirc), and is similar to other customization hooks we already have.

> However this would not yet fix the tweak necessary to replace the
> referent by kind of a proxy (what I need to fix keep-alive relations
> with native objects). To fix even this, I would need something like
>
> public interface WeakrefDelegate {
>     PyObject getReferentProxy(PyObject referent);
> }
>
> which is even more radical. When creating a weak reference, the _weakref
> module would first check whether the referent implements
> this interface and - if true - would use the obtained proxy as referent
> rather than the original one.

Also reasonable.

> Not to imagine what bad stuff this would allow for. Maybe one should
> constrain the resulting object to have the
> same class as the original or something.
> Note that I would not need WeakrefListener if WeakrefDelegate were part of
> the API, so you would not have to include both ideas.
>
> I think I will work out #1 for now - it is the closest one to my suggested
> API extension and it would be easy for me to adopt the solution
> to a potential API change one day.

Let's put together a bitbucket branch with these ideas in it. This beta phase is a good time to try these ideas out; otherwise we will have to wait until 2.7.1. Also we can still revisit the implementation details of using GlobalRef.references.

- Jim |
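The WeakrefDelegate hook discussed in this message could look roughly like the sketch below. It is an illustration only, not Jython code: PyObject here is a minimal stand-in rather than org.python.core.PyObject, Peer and createWeakref are hypothetical names, and java.lang.ref.WeakReference stands in for GlobalRef. The only part taken from the thread is the interface itself and the rule that the weakref factory checks for it before wrapping the referent.

```java
import java.lang.ref.WeakReference;

class WeakrefDelegateSketch {

    /** Minimal stand-in for org.python.core.PyObject (assumption, not the real class). */
    static class PyObject {
        final String name;
        PyObject(String name) { this.name = name; }
    }

    /** The hook proposed in the thread. */
    interface WeakrefDelegate {
        PyObject getReferentProxy(PyObject referent);
    }

    /** Hypothetical peer that hands out a separate object for weak references. */
    static class Peer extends PyObject implements WeakrefDelegate {
        final PyObject weakProxy = new PyObject("proxy-of-" + name);

        Peer(String name) { super(name); }

        @Override
        public PyObject getReferentProxy(PyObject referent) {
            return weakProxy;
        }
    }

    /** What a weakref factory could do before wrapping the referent. */
    static WeakReference<PyObject> createWeakref(PyObject referent) {
        PyObject target = referent;
        if (referent instanceof WeakrefDelegate) {
            // Use the proxy supplied by the referent instead of the referent itself.
            target = ((WeakrefDelegate) referent).getReferentProxy(referent);
        }
        return new WeakReference<>(target);
    }

    public static void main(String[] args) {
        Peer peer = new Peer("peer");
        PyObject seen = createWeakref(peer).get();
        System.out.println(seen == peer.weakProxy ? "proxy used" : "original used");
    }
}
```

Objects that do not implement the interface take the unchanged path, so the check costs one instanceof test per weak reference creation.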
From: Stefan R. <Ste...@gm...> - 2014-07-18 14:01:42
|
<html><head></head><body><div style="font-family: Verdana;font-size: 12.0px;"><div> <div>Dear Jeff,</div> <div> </div> <div>- Your suggestion for #1 with the new PyBUF-flag seems fine for me. My only worry would be that in most situations PyBuffers would support PyBuffer.Pointer even if requested without the flag. This itself is okay, but since programmers are not familiar with the new flag, they might get used to get away with not using it. Maybe with a clear warning in the doc this would go fine.</div> <div> </div> <div>- I was not aware that package-private stuff prevents one from extending ByteBuffer. Stack-Overflow says this allows the JVM to do significant optimization, but indeed ByteBuffers appear less useful to me than before.</div> <div> </div> <div>- Your suggestion for #1 made me recall that also CPython's BufferProtocol takes into account that an exporter might raise an error if it can't deal with the requested PyBUF-flags. I can use this behaviour in #2 to deal with the case that JNI can't provide direct access to an array's elements. However the downside of this "solution" would be inconsistent behavior between Jython's and JyNI's BufferProtocols - PyBUF flags that work on Jython might fail on native side.</div> <div> </div> <div>- I agree on your concerns for #2 regarding efficiency (although I can imagine, that ByteBuffers would still perform rather well as far as one sticks to bulk get/put-operations; ByteBuffers are said to be well optimized by the JVM). However, I had an implementation in mind that distinguishes direct ByteBuffer vs byte[]-backed buffer case early, i.e. at Jython startup (via the proposed config parameter). Jython could choose the appropriate implementations of BufferProtocol-supporting builtin types and assign them to the builtins in the builtin-type-dictionary. I admit this is a bunch of work, but it is not so urgent and I would offer to provide the alternative implementations or at least help with it. 
However I would postpone this until an actual use-case comes up (i.e. an extension appears that could not be made to work cleanly without it). With your suggestion for #1/new PyBUF-flag, Jython's BufferProtocol would be flexible enough to insert such enhancements later.</div> <div> </div> <div>- One more thing. I recently noticed that the method "irepeat" in PyByteArray (used by bytearray___imul__) replaces the storage backend with a new array of changed size. But it does not call resizeCheck(). Is this a bug, or am I missing something? (I admit I am currently not looking at the newest beta-code, so please ignore this if it was already fixed.)</div> <div> </div> <div>-Stefan</div> <div> </div> <div> <div style="margin: 10.0px 5.0px 5.0px 10.0px;padding: 10.0px 0 10.0px 10.0px;border-left: 2.0px solid rgb(195,217,229);"> <div style="margin: 0 0 10.0px 0;"><b>Sent:</b> Thursday, 17 July 2014 at 22:18<br/> <b>From:</b> "Jeff Allen" <ja...@fa...><br/> <b>To:</b> "Stefan Richthofer" <Ste...@gm...><br/> <b>Cc:</b> "Jython Developers" <jyt...@li...>, "Jim Baker" <jim...@py...><br/> <b>Subject:</b> Re: Jython buffer protocol</div> <div> <div style="background-color: rgb(255,255,255);"> <div class="moz-cite-prefix">Dear Stefan:<br/> <br/> I did indeed mean move it to jython-dev, not the other place. Thanks for the careful exposition of your ideas.<br/> <br/> In pursuit of idea #1, I think it would not be too difficult to present a ByteBuffer as a PyBuffer, even when there is no byte[] behind it. There are obvious implementations for most of the "abstract" API, and I imagine getBufferSlice could be done with ByteBuffer.slice().<br/> <br/> We could add a getNIOByteBuffer to PyBuffer with a default implementation to wrap a Pointer. That is the most significant use of Pointer in the core already. I think I even added it once and reverted it as gold-plating. 
I see a reason for it now: an object not presenting byte[] access, could implement that method its own way.<br/> <br/> I am not keen to lose the direct access to the underlying byte[] that Pointer gives, in general. It seems to me that this is the essence of the CPython Py_Buffer: it gives you a char* and some dimensions, and it lets you have at the data directly, in whatever pattern of access you need, and quickly. A PyBuffer.Pointer is the Java equivalent of the char *buf member. However, I think it would be acceptable for certain objects not to implement the operations that return a Pointer, or (like PyString) to implement them expensively, but defer the cost. I appreciate that implementation may be practical only for a read-only object.<br/> <br/> We could add a PyBUF-flag, whereby a client says whether it expects to make Pointer access, so getBuffer(int) could fail early and helpfully, rather than the client fail later.<br/> <br/> Idea #2 sounds very difficult to pull off at all (at the JyNI end). And to make it possible you imply that all core objects that implement BufferProtocol would be rewritten to store their data in a ByteBuffer. BaseBytes has a lot in common with ByteBuffer, it is true, but the difference is that inside BaseBytes I can get at the bytes directly, and the clients can do so through a PyBuffer.Pointer. By comparison, using ByteBuffer feels like working through a keyhole. It cannot even be extended, since its constructor is package-private. It is possible to do this, I would say, but to forbid Jython efficient access to data as byte[], in order to afford C-code direct access to it as char*, seems perverse. <pre class="moz-signature">Jeff Allen</pre> On 17/07/2014 13:40, Stefan Richthofer wrote:</div> <blockquote> <div style="font-family: Verdana;font-size: 12.0px;"> <div> <div>Dear Jeff,</div> <div> </div> <div>the true goal I want to achieve is a clean BufferProtocol support in JyNI. 
That means two things in my opinion.</div> <div> </div> <div>1) If a CPython extension features types that support BufferProtocol and these are passed to Jython via JyNI, they shall appear as Jython PyObjects that support Jython's BufferProtocol. The exact memory-data that the CPython object exposes shall be exposed in Java by the Jython object. Optimally this should work for reading and writing and as directly as possible (for efficiency, minimizing memory requirements and -most important- to guarantee sync between Java-view and native view on the data). Using direct ByteBuffers this could be achieved, at least for JVMs that support these. On other JVMs, I would use byte[] as a fallback and try to obtain native access to the memory. However in this case, JNI does not guarantee to provide access to the actual memory of the array. It might only offer a copy, which would carry the risk of losing sync. I see some techniques to avoid that somehow, but they would be neither very efficient nor elegant, nor absolutely safe.</div> <div> </div> <div>2) If a Jython object implements Jython's BufferProtocol in Java and JyNI is used to pass a PyObject from Jython down to a CPython extension, the native variant of the object shall support CPython's BufferProtocol. The object shall expose the corresponding memory from the JVM to the extension via this protocol. This shall work for reading and writing and as directly as possible for the same reasons as above. I admit, this will be hard for Jython objects where a user implements PyBuffer in his own fashion. However, I think at least Jython's built-in types could and should support this. It can also be done with direct ByteBuffers.<br/> I know that direct ByteBuffers also have their disadvantages. They prohibit the JVM from doing memory optimization. Additionally the GC does not consider their memory when it determines what to delete or when to run (it would still clear the memory, if it collects a buffer). 
So direct ByteBuffers should only be used when really needed. I would propose a configuration-parameter for Jython that tells BaseBytes and BaseBuffer to use direct ByteBuffers. If this parameter is turned off, they could use ByteBuffer.wrap to fall back to the current implementation.</div> <div> </div> <div>The doc of Jython's BufferProtocol could tell potential other implementers also to look at this parameter and store their data as direct or ordinary ByteBuffer accordingly. However, I would build fallbacks into JyNI to deal as good as possible with situations where implementers don't stick to this.</div> <div> </div> <div> </div> <div>Both scenarios would mean that the backend in BaseBytes and BaseBuffer had to be of a type that unifies byte[] and ByteBuffer. One variant would be to use ByteBuffer as type and ByteBuffer.wrap for byte[] case. Another variant would be to have the storage of type Object. The PyBuffer implementation would know whether it is byte[] or ByteBuffer, or maybe even String and work accordingly (this would involve lots of explicit type casts though). However PyBuffer.pointer should not provide a byte[] and pretend it to be the actual backend in either variant.</div> <div> </div> <div>I think the solution from PyString, i.e. SimpleStringBuffer only works well for read-only scenarios, since it would lead to asynchronity between byte[] and String backends if the user uses the obtained byte[] for writing (assuming a mutable String variant like StringBuffer). Additionally, such an approach potentially doubles the memory requirements, which might be significant in some situations.</div> <div> </div> <div>I mentioned my thoughts about support for (>2^31 bytes)-arrays, because -afaIk- a CPython extension might expose such long data via BufferProtocol and I was wondering how to deal with this. Then I mentioned it as a maybe misleading example. 
I could imagine providing a kind of LongPyBuffer that enhances PyBuffer with long-index methods and allows one to obtain ordinary int-sized slices of portions of the data. But these are future thoughts - for a first step I would just document it as a JyNI-limitation to support BufferProtocol only for buffers smaller than 2^31 bytes.</div> <div><br/> Again - I would be fine with postponing this decision if the public API was adjusted to be open for more variants than byte[]. I would also offer to help with implementing the adjustments and -if accepted- the refining of Jython's BufferProtocol.</div> <div>I hope this explains my intentions a bit better.</div> <div><br/> Cheers</div> <div> </div> <div>Stefan</div> <div> </div> <div> </div> <div>P.S. Feel free to move the discussion to jython-dev (assuming 'python-dev' was a typo^^)</div> <div> <div style="margin: 10.0px 5.0px 5.0px 10.0px;padding: 10.0px 0 10.0px 10.0px;border-left: 2.0px solid rgb(195,217,229);"> <div style="margin: 0 0 10.0px 0;"><b>Sent:</b> Thursday, 17 July 2014 at 10:26<br/> <b>From:</b> "Jeff Allen" <a class="moz-txt-link-rfc2396E"><ja...@fa...></a><br/> <b>To:</b> "Stefan Richthofer" <a class="moz-txt-link-rfc2396E"><Ste...@gm...></a><br/> <b>Cc:</b> "Jim Baker" <a class="moz-txt-link-rfc2396E"><jim...@py...></a><br/> <b>Subject:</b> Re: Jython buffer protocol</div> <div> <div style="background-color: rgb(255,255,255);"> <div class="moz-cite-prefix">Stefan:<br/> <br/> I see why ByteBuffer is useful for referencing bytes that might not be in a byte[]. I don't yet see why you want to do this through the PyBuffer interface. I imagine you want to define a PyObject that offers BufferProtocol to represent the data. I think we could make it work, but I don't know if what I'm imagining meets your need. Could you give some toy examples?<br/> <br/> If the killer application is to represent very large arrays (>2^31 bytes) then you are blocked, since nothing else in the API will work with long indices. 
So only normal-sized objects, or normal-sized slices of large data objects can be handled.<br/> <br/> PyString provides an example of bytes that are not stored in a byte[]. This is perhaps the model we should use. It *does* give you a Pointer to a real byte[], if you insist, but it tries hard to avoid creating one. Most clients accessing its PyBuffer don't provoke this action. I think it would be tolerable for a private-use object to throw instead of creating the massive array.<br/> <br/> BTW, I'm ok with this appearing on python-dev. This is an e-mail address I'm content to expose there. Jim was the one to answer you previously, because Jim was the one who knew the answers. <pre class="moz-signature">Jeff Allen</pre> On 16/07/2014 13:40, Stefan Richthofer wrote:</div> <blockquote> <div style="font-family: Verdana;font-size: 12.0px;"> <div> <div>Dear Jeff,</div> <div>I agree with most of your concerns. Safety is not the main argument (since safety in Java is mainly an illusion anyway). I mainly mentioned it to enhance the pros section a bit. Enforcing some buffer properties would however benefit debugging purposes I think.</div> <div> </div> <div>Also agree that Java tends to overdo checking constraints etc and that ByteBuffer has its issues like many things in Java.<br/> And I absolutely agree that ByteBuffer is not sufficient to replace PyBuffer, because of the striding features etc.<br/> It should be used as a backend and maybe as a replacement for PyBuffer.Pointer, which is up to you.</div> <div> </div> <div>The killer-feature is that ByteBuffer fundamentally offers additional functionality over byte[] - it can potentially point to memory outside the JVM. This makes it the truest notion of a "pointer" that Java can offer. 
Just to give some non-JyNI example what could be done with it:</div> <div>I'm not 100% sure, but I believe one could even write a ByteBuffer subclass that offers long-index access methods overcoming the (2^31)−1 limitation of array size (in fact even (2^31)−6). (One would have to manage allocation and memory natively). So ByteBuffers would be the best chance to allow working with big data and stuff. In general, direct ByteBuffers would be the way to provide a buffer protocol that even extends to C-level.</div> <div><br/> Let me put it that way - you don't have to decide this now. Neither in a year. I just ask you to change the public API such that it is not tied to the plain array variant. Hide the storage field in PyBuffer.Pointer. Offer a getStorage()-method instead and state that it is not guaranteed to provide the actual backing array. Add a boolean method that tells whether it does. If you like, you can even guarantee access to the backing array in default configuration of Jython (i.e. provide a flag in the future that leverages advanced buffer functionality). These are minor changes, but they must be done before the beta-phase ends.</div> <div> </div> <div>Thanks for considering my proposal!</div> <div> </div> <div>Cheers</div> <div> </div> <div>Stefan</div> <div> <div style="margin: 10.0px 5.0px 5.0px 10.0px;padding: 10.0px 0 10.0px 10.0px;border-left: 2.0px solid rgb(195,217,229);"> <div style="margin: 0 0 10.0px 0;"><b>Gesendet:</b> Mittwoch, 16. 
Juli 2014 um 10:10 Uhr<br/> <b>Von:</b> "Jeff Allen" <a class="moz-txt-link-rfc2396E"><ja...@fa...></a><br/> <b>An:</b> "Jim Baker" <a class="moz-txt-link-rfc2396E"><jim...@py...></a>, "Stefan Richthofer" <a class="moz-txt-link-rfc2396E"><Ste...@gm...></a><br/> <b>Betreff:</b> Re: Jython buffer protocol</div> <div> <div style="background-color: rgb(255,255,255);"> <div class="moz-cite-prefix">Hi Jim, and Stefan - thanks for your careful reading of my work.<br/> <br/> I have considered java.nio.ByteBuffer a lot during this development, both as a possible substitute for the API as a whole (but it isn't close enough to CPython's API) and as a candidate for Pointer. It also gives me clues about how the API should extend to elements other than byte, which I'd like to do, and I imagine we need for NumPy. (The precursors of this exist in the API, but may be incorrect.)<br/> <br/> So java.nio.Buffer is good, and I'm trying to remember why it didn't make it as Pointer.<br/> <br/> I've definitely thought more than once I would like easily to get a ByteBuffer from a PyBuffer, when looking into io and codecs. Adding that to the API seemed to burden the implementer unnecessarily when the client can so easily call ByteBuffer.wrap() on information the Pointer gives. But the idea here seems to be that an object that is a java.nio.ByteBuffer already, should be able to back a PyBuffer. I'm not sure this works.<br/> <br/> The CPython API that I'm copying absolutely gives the client a pointer to bytes directly, with the purpose of efficient access, and all the attendant risks accepted. So I'm not convinced by the "safety" argument.<br/> <br/> I recall looking at the ptr:limit range-checking and readonly checks in ByteBuffer and thinking I those were at contrary to the intentions of the API. (It's bad enough that the array bounds are checked!) 
Efficiency is largely a matter of taste, however, and one's faith in the optimiser.<br/> <br/> Many operations on ByteBuffer move the pointers and limit around in ways that may surprise. I remember feeling that it was too rich and overly-encapsulated when what I wanted really was just a holder for two/three quantities: array base, offset and length.<br/> <br/> I'm trying to remember if there was a real show-stopper. The hard case, I predict is, given a PyBuffer backed by an (opaque) ByteBuffer, can I create a PyBuffer slice that returns a ByteBuffer having the properties a client would expect.<br/> <br/> I'll think about it. <pre class="moz-signature">Jeff Allen</pre> On 16/07/2014 07:12, Jim Baker wrote:</div> <blockquote> <div>Adding Jeff Allen to the discussion, since he's the author of this support, and therefore has substantially more insight about PyBuffer and its subtleties than I do. <div> </div> <div>In general, using ByteBuffer as our basis, as wrapped by PyBuffer, makes sense, given the pros and cons below. Something comparable is seen in <a href="http://netty.io/4.0/api/io/netty/buffer/ByteBuf.html" target="_blank">http://netty.io/4.0/api/io/netty/buffer/ByteBuf.html</a>, although Netty's ByteBuf does wrap byte[] directly as well. In particular, Python's buffer protocol is heavily influenced by the needs of NumPy support, so our own efficient support of NumPy via JyNI is an important consideration. We also know that at JNI has strict requirements for safety that ByteBuffer helps ameliorate. 
(Although I would be also curious: to what extent can we sidestep via sun.misc.Unsafe, as we already are doing in JNR?)</div> <div> </div> <div>My naive reading of PyBuffer.Pointer is that this potentially could make certain indexed ops more expensive, but it's also not clear how much we are using such ops today.</div> <div> </div> <div>I have probably muddied the discussion at this point, but hopefully we can start a good discussion on the best approach.</div> <div> </div> <div>- Jim</div> </div> <div class="gmail_extra"> <div class="gmail_quote">On Sat, Jul 12, 2014 at 9:11 AM, Stefan Richthofer <span><<a>Ste...@gm...</a>></span> wrote: <blockquote class="gmail_quote" style="margin: 0 0 0 0.8ex;border-left: 1.0px rgb(204,204,204) solid;padding-left: 1.0ex;">Hey Jim,<br/> <br/> I recently had a look into Jython's implementation of the Buffer protocol (just needed a change from gc stuff and work). It uses a plain byte[] array as storage backend. Thinking of it as a default backend for the PyBuffer interface might be okay, since it could be changed on demand without breaking external code. But there is the result type PyBuffer.Pointer, which contains a public reference to the backing byte[] of the PyBuffer. This makes it obligatory to use byte[] as PyBuffer backends. (BaseBytes and BaseBuffer also expose their byte[]-backends, but at least they use the "protected" access modifier.)<br/> <br/> However I see good reasons to use a ByteBuffer from java.nio as storage backend instead of a plain byte[] array and highly recommend to change at least the field in PyBuffer.Pointer to the type java.nio.ByteBuffer. 
This causes no regressions, since one can still have byte[] as backend and use java.nio.ByteBuffer.wrap to create a ByteBuffer on top of the array.<br/> <br/> Pros:<br/> - Using ByteBuffer allows for a wider range of possible backends without significantly restricting functionality or efficiency.<br/> <br/> - The Buffers from java.nio are Java's equivalent of Python's buffer protocol. They were made for this. So Jython's buffer protocol should be built on top of them. They are optimized for sharing memory, even with native code.<br/> <br/> - The ByteBuffer can be constructed such that it enforces some of the properties defined by flags from PyBUF. For instance, if the flags indicate a read-only buffer, a corresponding read-only ByteBuffer can be exposed to the user.<br/> <br/> - Last but not least, the use of direct ByteBuffers would allow me to emulate CPython's buffer protocol in JyNI. Unfortunately it is not obligatory for JVMs to support direct buffers, but for those that do, I believe I could produce almost 100% CPython behaviour with JyNI. On other JVMs it would always remain an issue to detect changes in the buffer and sync them back to Java, leading to low efficiency and lack of success guarantees (I will still have to deal with this as a fallback).<br/> As you might guess, this is honestly the reason why I propose this. However, I think the other advantages of using ByteBuffer still hold and there are hardly any disadvantages on the other hand.<br/> <br/> <br/> <br/> Cons:<br/> - The only thing that would be lost is that writers of external Java code interfacing with Jython had the guarantee that they can have write-access to the buffer via a byte array. But in my opinion they should not have this guarantee anyway. Think of a slice of a PyBuffer. By letting the users have the byte[] backend they would have access to the sliced-out data. Or maybe also to read-only data.<br/> <br/> - It would break existing code that interfaces with Jython via the buffer protocol. 
However it would be trivial to fix such code. The protocol was introduced with Jython 2.7, which is still beta, and -hopefully- there is currently little or no code out there that uses this feature. So this kind of change should be done urgently. To ease fixing external code one could add a method to PyBuffer.Pointer that returns the buffer values as byte[] (with the doc stating that it might be a copy, if the underlying ByteBuffer has no backing array).<br/> <br/> <br/> <br/> So let me summarize my proposal:<br/> <br/> Step 1 (urgent):<br/> In PyBuffer.Pointer change the type of the "storage"-field from byte[] to java.nio.ByteBuffer.<br/> Fix the implementation of the constructor of PyBuffer.Pointer by using java.nio.ByteBuffer.wrap.<br/> <br/> Step 2 (not so urgent):<br/> Change the default backends in BaseBytes and BaseBuffer to be java.nio.ByteBuffer instead of byte[].<br/> Fix current constructors the same way as done with that of PyBuffer.Pointer.<br/> Add new constructors that directly take ByteBuffer (also to PyBuffer.Pointer).<br/> <br/> Step 3:<br/> Provide a hook or startup-parameter/flag that tells Jython to use direct ByteBuffers as default backends for PyByteArray and BaseBuffer. These can prohibit some memory optimization of the JVM, so providing it as a flag allows me or others to turn it on in the JyNI case (or in case someone else wants to use the buffer protocol via JNI in native code), i.e. only when it is really needed.<br/> <br/> <br/> <br/> Cheers<br/> <br/> <span class="HOEnZb"><font color="#888888">Stefan</font></span></blockquote> </div> </div> </blockquote> </div> </div> </div> </div> </div> </div> </blockquote> </div> </div> </div> </div> </div> </div> </blockquote> </div> </div> </div> </div> </div></div></body></html> |
From: Jeff A. <ja...@fa...> - 2014-07-19 12:44:15
|
I'm editing for size in the hope it makes it to the list this time.

On 18/07/2014 15:01, Stefan Richthofer wrote:
> Dear Jeff,
> - Your suggestion for #1 with the new PyBUF-flag seems fine to me. My
> only worry would be that in most situations PyBuffers would support
> PyBuffer.Pointer even if requested without the flag. This itself is
> okay, but since programmers are not familiar with the new flag, they
> might get used to getting away with not using it. Maybe with a clear
> warning in the doc this would go fine.

Yes, the flags are confusing, even for me. I take the point that we're
now asking the client to understand another. I see it working in the
same sense as READONLY: the client signals that it will *not* be using a
feature the exporter *might not* be willing to provide. Maybe it is
simpler if the implementation simply throws when a Pointer is requested.

> - I was not aware that package-private stuff prevents one from
> extending ByteBuffer. Stack Overflow says this allows the JVM to do
> significant optimization, but indeed ByteBuffers appear less useful to
> me than before.

You have to call super() explicitly, as there is no default constructor,
but the declared ones are not visible.

> - Your suggestion for #1 made me recall that CPython's BufferProtocol
> also takes into account that an exporter might raise an error if it
> can't deal with the requested PyBUF-flags. I can use this behaviour in
> #2 to deal with the case that JNI can't provide direct access to an
> array's elements. However the downside of this "solution" would be
> inconsistent behavior between Jython's and JyNI's BufferProtocols:
> PyBUF flags that work on Jython might fail on the native side.

The flags don't mean quite the same things anyway. Jython's mean "I can
cope with", where CPython's mean "I can cope with and am going to use".
The difference is that clients using only the abstracted API can cope
with any buffer organisation, since they don't actually use (for
example) the strides array.

> - I agree with your concerns about #2 regarding efficiency (although I
> can imagine that ByteBuffers would still perform rather well as long
> as one sticks to bulk get/put operations; ByteBuffers are said to be
> well optimized by the JVM). However, I had an implementation in mind
> that distinguishes the direct ByteBuffer case from the byte[]-backed
> case early, i.e. at Jython startup (via the proposed config
> parameter). Jython could choose the appropriate implementations of
> BufferProtocol-supporting builtin types and assign them to the
> builtins in the builtin-type dictionary. I admit this is a bunch of
> work, but it is not so urgent and I would offer to provide the
> alternative implementations or at least help with it. However I would
> postpone this until an actual use case comes up (i.e. an extension
> appears that could not be made to work cleanly without it). With your
> suggestion for #1/new PyBUF-flag, Jython's BufferProtocol would be
> flexible enough to insert such enhancements later.

The JUnit tests on PyByteArray (or is it BaseBytes?) also time
insertion, appending and deleting, which would help in judging the
performance achieved.

Do we know how CPython would deal with an object claiming to support the
buffer protocol, but that couldn't furnish a char* buf member?

> - One more thing. I recently noticed that the method "irepeat" in
> PyByteArray (used by bytearray___imul__) replaces the storage backend
> by a new array of changed size. But it does not call resizeCheck(). Is
> this a bug, or am I missing something? (I admit I am currently not
> looking at the newest beta code, so please ignore this if it was
> already fixed.)

Good spot.
That's a bug:

>>> b = bytearray('hello')
>>> m = memoryview(b)
>>> b.append(' ')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
BufferError: Existing exports of data: object cannot be re-sized
>>> b*=3 # should raise the same error
>>> b
bytearray(b'hellohellohello')
>>> bytearray(m)
bytearray(b'hello')

> -Stefan
> *Sent:* Thursday, 17 July 2014 at 22:18
> *From:* "Jeff Allen" <ja...@fa...>
> *To:* "Stefan Richthofer" <Ste...@gm...>
> *Cc:* "Jython Developers" <jyt...@li...>, "Jim
> Baker" <jim...@py...>
> *Subject:* Re: Jython buffer protocol
> Dear Stefan:
>
> I did indeed mean move it to jython-dev, not the other place. Thanks
> for the careful exposition of your ideas.
>
> In pursuit of idea #1, I think it would not be too difficult to
> present a ByteBuffer as a PyBuffer, even when there is no byte[]
> behind it. There are obvious implementations for most of the
> "abstract" API, and I imagine getBufferSlice could be done with
> ByteBuffer.slice().
>
> We could add a getNIOByteBuffer to PyBuffer with a default
> implementation to wrap a Pointer. That is the most significant use of
> Pointer in the core already. I think I even added it once and reverted
> it as gold-plating. I see a reason for it now: an object not
> presenting byte[] access could implement that method its own way.
>
> I am not keen to lose the direct access to the underlying byte[] that
> Pointer gives, in general. It seems to me that this is the essence of
> the CPython Py_Buffer: it gives you a char* and some dimensions, and
> it lets you have at the data directly, in whatever pattern of access
> you need, and quickly. A PyBuffer.Pointer is the Java equivalent of
> the char *buf member. However, I think it would be acceptable for
> certain objects not to implement the operations that return a Pointer,
> or (like PyString) to implement them expensively, but defer the cost.
> I appreciate that implementation may be practical only for a read-only
> object.
>
> We could add a PyBUF-flag, whereby a client says whether it expects to
> make Pointer access, so getBuffer(int) could fail early and helpfully,
> rather than the client fail later.
>
> Idea #2 sounds very difficult to pull off at all (at the JyNI end).
> And to make it possible you imply that all core objects that implement
> BufferProtocol would be rewritten to store their data in a ByteBuffer.
> BaseBytes has a lot in common with ByteBuffer, it is true, but the
> difference is that inside BaseBytes I can get at the bytes directly,
> and the clients can do so through a PyBuffer.Pointer. By comparison,
> using ByteBuffer feels like working through a keyhole. It cannot even
> be extended, since its constructor is package-private. It is possible
> to do this, I would say, but to forbid Jython efficient access to data
> as byte[], in order to afford C-code direct access to it as char*,
> seems perverse.
> Jeff Allen
> On 17/07/2014 13:40, Stefan Richthofer wrote:
>
> Dear Jeff,
> the true goal I want to achieve is clean BufferProtocol support
> in JyNI. That means two things in my opinion.
> 1) If a CPython extension features types that support
> BufferProtocol and these are passed to Jython via JyNI, they shall
> appear as Jython PyObjects that support Jython's BufferProtocol.
> The exact memory data that the CPython object exposes shall be
> exposed in Java by the Jython object. Optimally this should work
> for reading and writing and as directly as possible (for efficiency,
> minimizing memory requirements and, most important, to guarantee
> sync between the Java view and the native view of the data). Using
> direct ByteBuffers this could be achieved, at least for JVMs that
> support these. On other JVMs, I would use byte[] as a fallback and
> try to obtain native access to the memory. However in this case, JNI
> does not guarantee to provide access to the actual memory of the
> array. It might only offer a copy, which would hold the risk of
> losing sync.
> I see some techniques to avoid that somehow, but it won't be very
> efficient or elegant, nor absolutely safe.
> 2) If a Jython object implements Jython's BufferProtocol in Java
> and JyNI is used to pass a PyObject from Jython down to a CPython
> extension, the native variant of the object shall support
> CPython's BufferProtocol. The object shall expose the
> corresponding memory from the JVM to the extension via this
> protocol. This shall work for reading and writing and as directly as
> possible for the same reasons as above. I admit this will be hard for
> Jython objects where a user implements PyBuffer in his own
> fashion. However, I think at least Jython's built-in types could
> and should support this. It can also be done with direct ByteBuffers.
> I know that direct ByteBuffers also have their disadvantages. They
> prohibit the JVM from doing memory optimization. Additionally the
> GC does not consider their memory when it determines what to
> delete or when to run (it would still clear the memory, if it
> collects a buffer). So direct ByteBuffers should only be used when
> really needed. I would propose a configuration parameter for
> Jython that tells BaseBytes and BaseBuffer to use direct
> ByteBuffers. If this parameter is turned off, they could use
> ByteBuffer.wrap to fall back to the current implementation.
> The doc of Jython's BufferProtocol could tell potential other
> implementers also to look at this parameter and store their data
> as direct or ordinary ByteBuffer accordingly. However, I would
> build fallbacks into JyNI to deal as well as possible with
> situations where implementers don't stick to this.
> Both scenarios would mean that the backend in BaseBytes and
> BaseBuffer had to be of a type that unifies byte[] and ByteBuffer.
> One variant would be to use ByteBuffer as type and ByteBuffer.wrap
> for the byte[] case. Another variant would be to have the storage of
> type Object.
> The PyBuffer implementation would know whether it is
> byte[] or ByteBuffer, or maybe even String, and work accordingly
> (this would involve lots of explicit type casts though). However
> PyBuffer.Pointer should not provide a byte[] and pretend it to be
> the actual backend in either variant.
> I think the solution from PyString, i.e. SimpleStringBuffer, only
> works well for read-only scenarios, since the byte[] and String
> backends would get out of sync if the user uses the obtained byte[]
> for writing (assuming a mutable String variant like StringBuffer).
> Additionally, such an approach potentially doubles the memory
> requirements, which might be significant in some situations.
> I mentioned my thoughts about support for (>2^31 bytes)-arrays,
> because, as far as I know, a CPython extension might expose such
> long data via BufferProtocol and I was wondering how to deal with
> this. Then I mentioned it as a maybe misleading example. I could
> imagine providing a kind of LongPyBuffer that enhances PyBuffer with
> long-index methods and allows one to obtain ordinary int-sized
> slices of portions of the data. But these are future thoughts; as a
> first step I would just document it as a JyNI limitation that
> BufferProtocol is supported only for buffers smaller than 2^31
> bytes.
>
> Again, I would be fine with postponing this decision if the
> public API was adjusted to be open for more variants than byte[].
> I would also offer to help with implementing the adjustments and,
> if accepted, the refining of Jython's BufferProtocol.
> I hope this explains my intentions a bit better.
>
> Cheers
> Stefan
> |
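The "fail early" idea discussed in this message (a request flag by which the client announces that it wants array access, so that the export can be refused at request time) might be sketched as follows. This is only an illustration under stated assumptions: the class name EarlyFailExport, the flag value and the use of a plain Java exception are made up for the example and are not Jython's API. A read-only heap buffer is used as the "no array" case because its hasArray() is guaranteed to be false.

```java
import java.nio.ByteBuffer;

/** Illustrative only: the names and the flag value are assumptions, not Jython's API. */
final class EarlyFailExport {
    /** Hypothetical request flag: "I intend to access the data as a byte[]". */
    static final int AS_ARRAY = 0x1;

    /**
     * Returns a view of the storage, or fails at request time if the client
     * asked for array access that this storage cannot provide.
     */
    static ByteBuffer getBuffer(ByteBuffer storage, int flags) {
        if ((flags & AS_ARRAY) != 0 && !storage.hasArray()) {
            // Jython would raise a Python-level error here; a plain Java
            // exception stands in for it in this sketch.
            throw new UnsupportedOperationException("no byte[] access for this export");
        }
        return storage.duplicate(); // shares content, independent position/limit
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.wrap(new byte[] {1, 2, 3});
        System.out.println(getBuffer(heap, AS_ARRAY).array().length);

        // A read-only view never exposes its array, so hasArray() is false.
        ByteBuffer readOnly = heap.asReadOnlyBuffer();
        try {
            getBuffer(readOnly, AS_ARRAY);
        } catch (UnsupportedOperationException e) {
            System.out.println("refused early: " + e.getMessage());
        }
        // Without the flag, access through the abstract API still works.
        System.out.println(getBuffer(readOnly, 0).get(1));
    }
}
```

A client that never passes the flag is unaffected by the kind of storage behind the export, which is the property the thread is after.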
From: Jeff A. <ja...@fa...> - 2014-08-05 22:42:57
|
Stefan:

I've added a method to get a ByteBuffer and one for testing the presence
of array access. I've implemented this in the base classes we use to add
buffers, and I've used it in core objects where previously I used
PyBuffer.Pointer.

Rather than push this to the project repository immediately, I've put it
here for comment:
https://bitbucket.org/tournesol/jython-ja/commits/330839dc597a4ba3cb8168e8bd859ad91b9f4dec

I think BaseBuffer needs refactoring now to separate the layer that
assumes a real backing array from a base that doesn't. For now I'm
minimising the changes.

I've never really settled in my mind how I should deal with arrays of
types other than byte. I keep looking at ByteBuffer.getInt() etc. and at
IntBuffer, thinking those should be some sort of model. Looking at how
PyArray works, it almost seems that also should be based on the
sub-classes of java.nio.Buffer. I think a proper, typed solution would
make the itemsize property unnecessary. Do you have any thoughts about
itemsize and non-byte data?

Jeff Allen |
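For readers following along, the pair of methods described here (one returning a ByteBuffer, one reporting whether array access is available) could look roughly like the sketch below. It is not the code from the linked commit: the interface BufferView and both implementing classes are hypothetical names invented for this example, and only the method names hasArray() and getNIOByteBuffer() are taken from the thread.

```java
import java.nio.ByteBuffer;

/** Hypothetical stand-in for a PyBuffer-like export; not Jython's actual interface. */
interface BufferView {
    /** True if the exported data is directly accessible as a byte[]. */
    boolean hasArray();

    /** A ByteBuffer over the exported bytes; each caller gets its own position/limit. */
    ByteBuffer getNIOByteBuffer();
}

/** Export backed by a region of a plain byte[] (the common case). */
final class ArrayBackedView implements BufferView {
    private final byte[] storage;
    private final int offset;
    private final int length;

    ArrayBackedView(byte[] storage, int offset, int length) {
        this.storage = storage;
        this.offset = offset;
        this.length = length;
    }

    @Override
    public boolean hasArray() {
        return true;
    }

    @Override
    public ByteBuffer getNIOByteBuffer() {
        // wrap() shares the array; slice() makes index 0 the first exported byte.
        return ByteBuffer.wrap(storage, offset, length).slice();
    }
}

/** Export backed by a direct buffer, which typically has no accessible array. */
final class DirectView implements BufferView {
    private final ByteBuffer storage;

    DirectView(int capacity) {
        this.storage = ByteBuffer.allocateDirect(capacity);
    }

    @Override
    public boolean hasArray() {
        return storage.hasArray();
    }

    @Override
    public ByteBuffer getNIOByteBuffer() {
        return storage.duplicate(); // same memory, independent position/limit
    }
}
```

A client would test hasArray() before attempting the fast byte[] path and otherwise go through getNIOByteBuffer(), which works for both kinds of storage.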
From: Stefan R. <Ste...@gm...> - 2014-08-06 01:28:10
|
Jeff,

I just quickly scanned the changes and everything looks fine as far as I
can see. PyByteArray and BaseBytes may need adjustments too (on this
occasion remember to add resizeCheck() in irepeat).

>> I've never really settled in my mind how I should deal with arrays of
>> types other than byte. I keep looking at ByteBuffer.getInt() etc. and
>> at IntBuffer, thinking those should be some sort of model.

In my opinion, the API (at least in its current design) should stick
exclusively to bytes, for basically two reasons: direct buffer
functionality is only available for ByteBuffers, and a ByteBuffer can
create an IntBuffer, LongBuffer, etc. view on the data. I assume this is
the most efficient way that Java offers to assemble ints, longs etc.
from bytes in bulk. Unfortunately, it does not offer a way to obtain a
ByteBuffer view on an IntBuffer or LongBuffer etc. (unless the starting
point was a ByteBuffer and one kept it). So ByteBuffer (wrapping byte[]
or not) yields the most functionality and should be the only data
baseline for the buffer protocol. If a user has int[] data, I suppose
one could create a ByteBuffer with an IntBuffer view and insert the data
via bulk write methods.

>> Looking at how PyArray works, it almost seems that also should be
>> based on the sub-classes of java.nio.Buffer. I think a proper, typed
>> solution would make the itemsize property unnecessary. Do you have
>> any thoughts about itemsize and non-byte data?

Maybe the best idea would be to use ByteBuffer and keep the itemsize to
tell the user which asXXBuffer method should be used on it.

I also discussed the BufferProtocol question with Jim (rather briefly,
since our main focus was gc) and he suggested it might be a good idea to
introduce another level of indirection between PyBuffer and
ByteBuffer/byte[]. Due to the inlining capabilities of the JVM, this
would not have a notable performance cost, but might solve some
problems. In what follows I refer to the intermediate layer as
"AbstractBuffer".
Although ByteBuffer can be used to wrap byte[], another level would give
us control over features and allow more implementations, overcoming the
issue that ByteBuffer cannot be extended. Why not do this directly with
PyBuffer? It seems reasonable to have the multi-dimensional logic above
the data representation, i.e. putting it into AbstractBuffer would make
AbstractBuffer too complex.

Note that PyArray already uses such an intermediate layer, i.e.
AbstractArray. One would have to enhance it to support ByteBuffers or
maybe other buffers too. And of course one would have to adjust the
other built-ins and the API to use it consistently.

Some ideas about what could be done with it:

- AbstractBuffer could change its backend on the fly. Maybe byte[] was
  appropriate at first and suddenly a direct ByteBuffer is needed. The
  user would still have an AbstractBuffer object, and would not even
  notice that the backend changed. However he/she might have stored a
  byte[] he/she extracted from it, so there should be something in
  AbstractBuffer that tells the user that some view was invalidated
  (isValid(byte[] b) or something, maybe even offer an invalidation
  listener). For write access there should be methods to lock and
  release the backend.

- AbstractBuffer could have int[] and long[] backends. One would always
  use the given backend directly and only convert the backend on demand,
  if there is a good reason. I think one would provide format info that
  tells the user the type of backend. If she decides to access the data
  in another form, this can be provided at a loss of efficiency, i.e. by
  converting the data view back and forth. So I imagine AbstractBuffer
  with a bunch of "asXXBuffer" and "asXXArray" access methods. All would
  be functional but not all would be efficient, depending on the actual
  backend. Format info tells the user which are the efficient access
  methods for a specific buffer.
- In future, AbstractBuffer could provide long-index access to native
  big data sources (maybe via a subclass LongAbstractBuffer or
  something).

- In general an intermediate layer under Jython's control would be the
  most flexible approach to needs and issues that are currently still
  unknown.

I know careful thinking would be needed on how to design such an
intermediate layer, and it would be a drastic change to the current API.
I just wanted to make you aware of this idea. Maybe one could approach
this in Jython 3 or so, or work out a minimal implementation of it that
is open to more advanced use in the future.

Stefan |
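The point made in this message about typed views (a ByteBuffer can hand out an IntBuffer view, and int[] data can be inserted through that view with a bulk put) can be tried out with the standard java.nio API alone. The class and method names below are made up for the illustration; the byte order is chosen explicitly because the view inherits it from the ByteBuffer at the moment the view is created.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Illustration of typed views over a byte-oriented store; names are hypothetical. */
final class IntViewDemo {
    /** Copies int[] data into a new direct ByteBuffer through an IntBuffer view. */
    static ByteBuffer toBytes(int[] values, ByteOrder order) {
        ByteBuffer bytes = ByteBuffer.allocateDirect(values.length * Integer.BYTES);
        bytes.order(order);                 // must be set before creating the view
        bytes.asIntBuffer().put(values);    // bulk write; the view has its own position
        return bytes;                       // position is still 0, limit == capacity
    }

    public static void main(String[] args) {
        ByteBuffer bytes = toBytes(new int[] {1, 258}, ByteOrder.BIG_ENDIAN);
        System.out.println(bytes.remaining());          // 8 bytes for two ints
        System.out.println(bytes.get(7));               // low byte of 258 (0x00000102)
        System.out.println(bytes.asIntBuffer().get(1)); // read back through a view
    }
}
```

Going the other way (obtaining a ByteBuffer from an IntBuffer that was not created as a view) is not offered by java.nio, which is the asymmetry the message describes.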
From: Jeff A. <ja...@fa...> - 2014-08-07 19:50:08
|
Jeff Allen

On 06/08/2014 02:28, Stefan Richthofer wrote:
> Jeff,
>
> I just quickly scanned the changes and everything looks fine as far as
> I see. PyByteArray and BaseBytes may need adjustments too (on this
> occasion remember to add resizeCheck() in irepeat).

Thanks for looking that over. I was rather asking whether the use case
was served adequately by the addition of PyBUF.AS_ARRAY, hasArray() and
getNIOByteBuffer(), since you obviously have a clear idea of it.

I took a second/third look but didn't find anything to change in
BaseBytes and PyByteArray: when argument objects have the Buffer API
they are accessed through the abstract API (not as byte[]). PyByteArray
obviously can support AS_ARRAY, and that seems to be covered in
SimpleBuffer.

I fixed the irepeat bug in the previous change set: it's a distinct
issue, so it gets its own change set. Thanks for spotting.

> I know there would be careful thinking needed on how to design such
> intermediate layer and it would be a drastic change of the current
> API. I just wanted to make you aware of this idea. Maybe one could
> approach this in Jython 3 or so, or work out a minimal implementation
> of it, being open for more advanced use in the future.
>

I think it would make sense to have a layer below BaseBytes that
contained all those mechanisms that work without assuming a byte[]
storage. This would help you implement PyBuffer in an object unable to
export a byte[]. That wouldn't change the API and is likely harmless to
efficiency. But more radical ideas, I agree, need more careful thought.
(The present design has had a lot of thought.)

Jeff |
From: Jeff A. <ja...@fa...> - 2016-04-27 21:51:04
|
I'm giving serious consideration to idea 2, that is, the storage
implementation is j.n.ByteBuffer, always, and *may* wrap a byte[]
object. I'd need to try this out to ensure there is no fatal flaw.

*Jim:* this is a breaking change to the API. Do we need to be more
careful of possible users? I suspect we are only breaking our own work
here: how about you?

We would be saying in this that the Jython PyBuffer is allowed to be
less like the CPython one than I've been aiming for. This consistency
may be less important than Stefan's use case. The CPython protocol
promises efficient access to the storage of an object via a pointer, and
we would be saying "only as efficient as a j.n.ByteBuffer" ... although
it may turn out there's a backing array. j.n.ByteBuffer does not replace
PyBuffer, because it cannot describe strided access or the get-release
behaviour.

I think this leads to an API in which what I've tried to do with
PyBuffer.Pointer we now do by handing out ByteBuffer slices. So Pointer
goes away. In that case getBuf() and getNIOByteBuffer() are probably the
same thing. I do not think it is safe to hand out the actual storage: it
is almost unavoidable clients would manipulate the internal state
(position, limit), surprising each other and the PyBuffer implementation
if it relies on them, as I think it should.

Concerning the pointer to object member in CPython Py_Buffer, it seems
to be a 3.x feature, which may be why I haven't replicated it. It seems
easy to add. (I'd be rewriting all the constructors anyway.) In CPython
it's null when there's a buffer but no object.

Jeff Allen

On 24/04/2016 15:36, Stefan Richthofer wrote:
> Jeff,
>
> good to hear that you can help with this stuff and also that your
> answer implies you don't have concerns with the new feature itself.
> Thinking it through again, I think the following way would be cleanest
> to add this functionality:
>
> Add a ByteBuffer-type storage, either exclusively or in addition to
> byte[] storage.
>
>
> 1) Version with additional field java.nio.ByteBuffer bufferStorage:
>
> Case byte[]-backed PyBuffer:
> (bufferStorage must be a view on storage, i.e. backed by it, and must
> always point to the first element)
>
> storage is byte[]
> bufferStorage is ByteBuffer.wrap(storage)
>
> getNIOByteBuffer() can use bufferStorage and needn't call
> ByteBuffer.wrap every time again.
>
>
> Case direct ByteBuffer (likely not having a backing array):
>
> storage is null or, if the JVM happens to be capable of providing a
> direct ByteBuffer with a byte[] backend: bufferStorage.array()
>
> bufferStorage is ByteBuffer.allocateDirect(capacity)
>
> Methods that used to access elements of storage directly are enriched
> by a fallback for the case storage == null. The fallback would operate
> directly on bufferStorage.
>
>
> 2) Version with exclusive Buffer-storage:
>
> storage type is java.nio.ByteBuffer instead of byte[]
>
>
> Case byte[]-backed PyBuffer:
>
> storage is ByteBuffer.allocate(capacity) (i.e. non-direct, so the
> buffer will have a backing array!)
>
> getNIOByteBuffer() can use storage and needn't call ByteBuffer.wrap.
>
> Methods that used to access elements of storage directly now do this
> on storage.array() rather than on storage itself (should be doable by
> a simple search/replace refactoring, more or less).
>
>
> Case direct ByteBuffer (likely not having a backing array):
>
> storage is ByteBuffer.allocateDirect(capacity)
>
> Methods that used to access elements of storage directly are enriched
> by a fallback for the case storage.hasArray() == false. The fallback
> would operate directly on storage's ByteBuffer methods.
>
>
> I can do the work of writing the fallbacks or help with it, at your
> discretion.
>
>
> Then another thing: I noticed CPython's counterpart of PyBuffer
> contains a reference to the PyObject that exported it, so you can
> always find the origin of a given PyBuffer. I don't see how this would
> be feasible with Jython's current PyBuffer implementation.
> So from the JyNI perspective I can store (as a mapping) the exporter
> in case it is known for some reason, e.g. because the PyBuffer was
> converted from a native CPython-like variant.
> However there could be situations where the buffer comes from Jython
> and the origin would be unknown. In that case I would (currently) just
> provide a NULL value or PyNone for this field and hope to get away
> with it for the important extensions. Maybe we could attach a
> PyBuffer's origin in Jython too...? (e.g. as a JyAttribute only if
> some global flag is set, which JyNI would then set on load).
>
> Best
>
> Stefan
>
>
>
>> Sent: Saturday, 23 April 2016 at 20:14
>> From: "Jeff Allen" <ja...@fa...>
>> To: "Stefan Richthofer" <Ste...@gm...>
>> Cc: jim...@py...
>> Subject: Re: [Jython-dev] Jython buffer protocol
>>
>> Hi Stefan.
>>
>> Refreshing my memory about how these classes work, I can see that I took
>> at face value the CPython view that the purpose of the buffer interface
>> is to give clients access to the underlying array of bytes, so
>> abstraction of the storage always gave way to what I thought would be
>> efficient. (Abstraction of the unit to be something other than byte is
>> sketched but clarity and a use case eluded me.)
>>
>> I always feel I've failed if I have to cast. My instinct is for option a.
>>
>> But I think you would not create a "Direct" parallel to BaseBuffer,
>> since it contains a lot of helper methods independent of the storage
>> implementation. Rather, factor it into two layers, the first being
>> either BaseBuffer or AbstractBuffer (depending on what causes least
>> pain) and the next layer being two base classes, one the revised
>> BaseBuffer containing:
>> protected byte[] storage;
>> and the other containing:
>> protected ByteBuffer storage;
>> And into each you migrate whatever seems natural to come along
>> with these declarations.
>>
>> I've been meaning to get back to Jython: I could do this groundwork if
>> that would not be confusing.
>> >> Jeff >> >> Jeff Allen >> >> On 22/04/2016 21:50, Stefan Richthofer wrote: >>> Hello Jeff, >>> >>> I'm warming up this old thread, because I am about to start actual work on JyNI's support >>> for buffer-protocol / the PyBuffer builtin type. >>> I'd like to point you to my recent pull request https://github.com/jythontools/jython/pull/39. >>> It's a preliminary step for adding support for direct java.nio.ByteBuffers. After establishing this flag >>> I am going to add some actual support for it. I see basically two ways to go for this >>> >>> a) Create a parallel class hierarchy to BaseBuffer et al, backed by direct ByteBuffers. E.g. >>> call everything with "Direct": DirectBaseBuffer, DirectSimpleBuffer etc. >>> Then let BufferProtocol implementers check for the flag and use Direct counterpart of the >>> usually used Buffer-Class accordingly. >>> >>> or >>> >>> b) Modify existing BaseBuffer such that storage is Object rather than byte[]. Then according to >>> flags it will be byte[] or ByteBuffer. This variant will involve more explicit type casting than >>> a), but would involve fewer new classes however. >>> >>> What is your opinion about this? >>> >>> Best >>> >>> Stefan >>> >>> >>>> Gesendet: Donnerstag, 07. August 2014 um 21:49 Uhr >>>> Von: "Jeff Allen" <ja...@fa...> >>>> An: "Stefan Richthofer" <Ste...@gm...> >>>> Cc: "Jython Developers" <jyt...@li...> >>>> Betreff: Re: [Jython-dev] Jython buffer protocol >>>> >>>> >>>> Jeff Allen >>>> >>>> On 06/08/2014 02:28, Stefan Richthofer wrote: >>>>> Jeff, >>>>> >>>>> I just quickly scanned the changes and everything looks fine as far as I see. PyByteArray and BaseBytes may need adjustments too (on this occasion remember to add resizeCheck() in irepeat). >>>> Thanks for looking that over. I was rather asking whether the use case >>>> was served adequately by the addition of PyBUF.AS_ARRAY, hasArray() and >>>> getNIOByteBuffer(), since you obviously have a clear idea of it. 
>>>> >>>> I took a second/third look but didn't find anything to change in >>>> BaseBytes and PyByteArray: when arguments objects have the Buffer API >>>> they are accessed through the abstract API (not as byte[]). PyByteArray >>>> obviously can support AS_ARRAY, and that seems to be covered in >>>> SimpleBuffer. >>>> >>>> I fixed the irepeat bug in the previous change set: it's a distinct >>>> issue, so it gets its own change set. Thanks for spotting. >>>>> I know there would be careful thinking needed on how to design such intermediate layer and it would be a drastic change of the current API. I just wanted to make you aware of this idea. Maybe one could approach this in Jython 3 or so, or work out a minimal implementation of it, being open for more advanced use in the future. >>>>> >>>> I think it would make sense to have a layer below BaseBytes that >>>> contained all those mechanisms that work without assuming a byte[] >>>> storage. This would help you implement PyBuffer in an object unable to >>>> export a byte[]. That wouldn't change the API and is likely harmless to >>>> efficiency. But more radical ideas, I agree, need more careful thought. >>>> (The present design has had a lot of thought.) >>>> >>>> Jeff >>>> >>>> ------------------------------------------------------------------------------ >>>> Infragistics Professional >>>> Build stunning WinForms apps today! >>>> Reboot your WinForms applications with our WinForms controls. >>>> Build a bridge from your legacy apps to the future. >>>> http://pubads.g.doubleclick.net/gampad/clk?id=153845071&iu=/4140/ostg.clktrk >>>> _______________________________________________ >>>> Jython-dev mailing list >>>> Jyt...@li... >>>> https://lists.sourceforge.net/lists/listinfo/jython-dev >>>> >> |
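The two-layer factoring suggested in the quoted 23 April message (a storage-independent base, then one base class per storage type) could look roughly like the following. This is only a sketch under assumptions: the class names AbstractStorageBuffer, ArrayBackedBuffer and NIOBackedBuffer and the methods byteAtImpl, storeAtImpl and copyTo are hypothetical and are not taken from Jython's source.

```java
import java.nio.ByteBuffer;

// Hypothetical names throughout; these are not Jython's actual classes.
// Layer 1: helpers that never touch the storage directly.
abstract class AbstractStorageBuffer {
    protected final int length;

    AbstractStorageBuffer(int length) {
        this.length = length;
    }

    // Storage-specific primitives, supplied by each layer-2 class.
    protected abstract byte byteAtImpl(int index);
    protected abstract void storeAtImpl(byte value, int index);

    // Storage-independent helper built only on the primitives above.
    final void copyTo(byte[] dest, int destPos) {
        for (int i = 0; i < length; i++) {
            dest[destPos + i] = byteAtImpl(i);
        }
    }
}

// Layer 2a: byte[] storage, no casts needed.
final class ArrayBackedBuffer extends AbstractStorageBuffer {
    protected final byte[] storage;

    ArrayBackedBuffer(byte[] storage) {
        super(storage.length);
        this.storage = storage;
    }

    @Override protected byte byteAtImpl(int index) { return storage[index]; }
    @Override protected void storeAtImpl(byte value, int index) { storage[index] = value; }
}

// Layer 2b: ByteBuffer storage, which may be direct and have no backing array.
final class NIOBackedBuffer extends AbstractStorageBuffer {
    protected final ByteBuffer storage;

    NIOBackedBuffer(ByteBuffer storage) {
        super(storage.limit());
        this.storage = storage;
    }

    // Absolute get/put leave the buffer's position untouched.
    @Override protected byte byteAtImpl(int index) { return storage.get(index); }
    @Override protected void storeAtImpl(byte value, int index) { storage.put(index, value); }
}
```

The point of the split is that helpers such as copyTo are written once, while neither concrete class has to cast an Object-typed storage field.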
From: Jim B. <jim...@py...> - 2016-04-27 22:30:00
|
On Wed, Apr 27, 2016 at 4:36 PM, Jeff Allen <ja...@fa...> wrote:

> I'm giving serious consideration to idea 2, that is, the storage
> implementation is j.n.ByteBuffer, always, and *may* wrap a byte[] object.
> I'd need to try this out to ensure there is no fatal flaw.
>
> *Jim:* this is a breaking change to the API. Do we need to be more careful
> of possible users? I suspect we are only breaking our own work here: how
> about you?

We should mention such a breaking change. Necessarily we have been very conservative on various aspects of our Java API - there is certainly usage out there. But that has been seen in 2.5 or earlier API definitions. I don't see a problem here - any users will be sophisticated and can readily adapt.

> We would be saying in this that the Jython PyBuffer is allowed to be less
> like the CPython one than I've been aiming for. This consistency may be
> less important than Stefan's use case. The CPython protocol promises
> efficient access to the storage of an object via a pointer, and we would be
> saying "only as efficient as a j.n.ByteBuffer" ... although it may turn out
> there's a backing array. j.n.ByteBuffer does not replace PyBuffer, because
> it cannot describe strided access or the get-release behaviour.
>
> I think this leads to an API in which what I've tried to do with
> PyBuffer.Pointer we now do by handing out ByteBuffer slices. So Pointer
> goes away. In that case getBuf() and getNIOByteBuffer() are probably the
> same thing. I do not think it is safe to hand out the actual storage: it is
> almost unavoidable clients would manipulate the internal state (position,
> limit), surprising each other and the PyBuffer implementation if it relies
> on them, as I think it should.
>
> Concerning the pointer to object member in CPython Py_Buffer, it seems to
> be a 3.x feature, which may be why I haven't replicated it. It seems easy
> to add. (I'd be rewriting all the constructors anyway.) In CPython it's
> null when there's a buffer but no object.
>
> Jeff Allen
|
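The concern in the quoted text about handing out the actual storage can be illustrated with java.nio alone. The sketch below assumes a hypothetical holder class (StorageHolder is not a Jython class) that keeps its ByteBuffer private and gives each client a duplicate(): the bytes are shared, but position, limit and mark are independent per view.

```java
import java.nio.ByteBuffer;

// Hypothetical holder, not Jython's BaseBuffer: it hands out views,
// never the storage object itself.
final class StorageHolder {
    private final ByteBuffer storage;

    StorageHolder(byte[] bytes) {
        this.storage = ByteBuffer.wrap(bytes);
    }

    /** Each call returns a fresh view: shared bytes, independent position/limit/mark. */
    ByteBuffer view() {
        return storage.duplicate();
    }

    /** Exposed only so the demo can show the internal state is untouched. */
    int internalPosition() {
        return storage.position();
    }
}

class DuplicateDemo {
    public static void main(String[] args) {
        StorageHolder holder = new StorageHolder(new byte[] {10, 20, 30, 40});
        ByteBuffer a = holder.view();
        ByteBuffer b = holder.view();

        a.get();
        a.get();                 // relative gets advance only a's position
        b.put(0, (byte) 99);     // absolute put writes through to the shared bytes

        // a has moved; b and the holder's own buffer have not.
        System.out.println(a.position() + " " + b.position() + " " + holder.internalPosition());
        // The write made through b is visible through a.
        System.out.println(a.get(0));
    }
}
```

Had view() returned the storage field itself, the two relative get() calls would have moved the position seen by every other client and by the holder.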
From: Stefan R. <Ste...@gm...> - 2016-04-28 10:06:31
|
> this is a breaking change to the API

I think this can be achieved without a breaking API change (detailed comments below).
(However, if you prefer a slight break to achieve a cleaner API, I won't complain.)

The possibility that array access to the storage cannot be provided is already part of the API.
If the flag AS_ARRAY is not set, the current API already doesn't guarantee array access (via PyBuffer.Pointer).

Does a type change of the storage field in BaseBuffer count as a breaking API change, given that it is
protected? Third parties that extend BaseBuffer might be affected, which can be avoided by
option 1), i.e. adding a ByteBuffer view as a separate field, e.g. "storageBufferView".
We could start with option 1), declare the byte[] storage field as deprecated and remove it in
2.7.4 or so. This would provide a smooth transition to variant 2).

Replacing PyBuffer.Pointer by ByteBuffer would be a breaking change, but could be avoided too. In Java
fashion, PyBuffer.Pointer and the corresponding API/methods can be kept as @deprecated. (Or just kept -
I am actually +0 about replacing PyBuffer.Pointer with ByteBuffer.)

> Concerning the pointer to object member in CPython Py_Buffer, it seems to be a 3.x feature,
> which may be why I haven't replicated it.

Taking another look, it seems this feature was actually backported to Python 2. The Py_buffer declaration in
object.h of Python 2.7 is titled /* Py3k buffer interface */. In any case, for the sake of compatibility it would
be best to support it in Jython, given that it is (presumably) easy to add.

Am I missing some aspect?

-Stefan
|
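The array-access fallback discussed in this thread (use the backing array when there is one, otherwise go through the ByteBuffer methods) can be sketched with the standard java.nio calls hasArray(), array() and arrayOffset(). StorageAccess and byteAt are hypothetical names chosen for illustration and are not part of Jython's API.

```java
import java.nio.ByteBuffer;

// Hypothetical helper, not part of Jython: reads one byte from a
// ByteBuffer-typed storage, whether or not it exposes a backing array.
final class StorageAccess {
    private StorageAccess() {}

    static byte byteAt(ByteBuffer storage, int index) {
        if (storage.hasArray()) {
            // Fast path: index straight into the backing array. arrayOffset()
            // matters when the buffer is a slice of a larger array.
            return storage.array()[storage.arrayOffset() + index];
        }
        // Fallback (e.g. direct or read-only buffers): absolute get,
        // which does not move the buffer's position.
        return storage.get(index);
    }
}

class StorageAccessDemo {
    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.wrap(new byte[] {1, 2, 3, 4});
        ByteBuffer direct = ByteBuffer.allocateDirect(4);
        direct.put(2, (byte) 42);

        // A slice starting at element 1 has a non-zero arrayOffset().
        heap.position(1);
        ByteBuffer slice = heap.slice();

        System.out.println(StorageAccess.byteAt(slice, 0));   // element 1 of the array
        System.out.println(StorageAccess.byteAt(direct, 2));  // value written above
    }
}
```

Whether a direct buffer reports hasArray() as false depends on the JVM, which is why the check is made at run time rather than assumed from how the buffer was allocated.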
From: Jeff A. <ja...@fa...> - 2016-05-06 23:51:01
|
I'm following the deprecation route at the moment, but bearing in mind Jim's view that breaking change is acceptable by virtue of low adoption, this may only be a transient arrangement. I don't want to maintain two approaches to storage. I favour adding: ByteBuffer getByteBuffer(); // = getNIOByteBuffer ByteBuffer getByteBuffer(int index); ByteBuffer getByteBuffer(int... indices); which differ only in the position() of the returned buffer. Each returns a new ByteBuffer, so that clients may call the incremental get and put methods without interfering. An alternative is to have only the first, but expose the index calculation helpers so one can set the position easily in complex cases. Writing the test code first was quite helpful in this case. The current getNIOByteBuffer attempts to set the buffer limit according to the actual data extent in the view (which is not the whole underlying byte array when it's a slice). This seems unnecessary, and is only useful in the contiguous case. I figure you should always get the whole thing, then work out how many items to read and write from the navigation, not from ByteBuffer.remaining(). Ok, in CPython 2.7 the reference to the underlying object is present in the code, just missing from the documentation. I think we can accommodate it. Jeff Allen On 28/04/2016 11:06, Stefan Richthofer wrote: >> this is a breaking change to the API > I think this can be achieved without a breaking API-change (detailed comments below). > (However if you prefer a slight break to achieve a cleaner API I won't complain.) > > The possibility that an array-storage access cannot be provided is already contained in the API. > If the flag AS_ARRAY is not set, the current API already doesn't guarantee to offer array-access (via PyBuffer.Pointer). > > Does a type-change of storage field in BaseBuffer count as breaking API change, given that it is > protected? Third-parties that extend BaseBuffer might be affected, which can be avoided by > option 1), i.e. 
adding ByteBuffer view as a separate field, e.g. "storageBufferView". > We could start with option 1), declare the byte[]-storage field as deprecated and remove it in > 2.7.4 or so. This would provide a smooth transition to variant 2). > > Replacing PyBuffer.Pointer by ByteBuffer would be a breaking change, but could be avoided too. In Java > fashion PyBuffer.Pointer and corresponding API/methods can be kept as @deprecated. (Or just kept - > I am actually +0 about replacing PyBuffer.Pointer with ByteBuffer) > > >> Concerning the pointer to object member in CPython Py_Buffer, it seems to be a 3.x feature, > which may be why I haven't replicated it. > > Taking another look it seems like this feature was actually backported to Python 2. Py_buffer declaration in > object.h of Python 2.7 is titled /* Py3k buffer interface */. However, for sake of compatibility it would > be best to support it in Jython, given that it is (presumably) easy to add. > > > Do I miss some aspect? > > -Stefan > > > Gesendet: Mittwoch, 27. April 2016 um 23:57 Uhr > Von: "Jim Baker" <jim...@py...> > An: "Jeff Allen" <ja...@fa...> > Cc: "Stefan Richthofer" <Ste...@gm...>, "Jython Developers" <jyt...@li...> > Betreff: Re: [Jython-dev] Jython buffer protocol > > On Wed, Apr 27, 2016 at 4:36 PM, Jeff Allen <ja...@fa...> wrote: > I'm giving serious consideration to idea 2, that is, the storage implementation is j.n.ByteBuffer, always, and *may* wrap a byte[] object. I'd need to try this out to ensure there is no fatal flaw. > *Jim:* this is a breaking change to the API. Do we need to be more careful of possible users? I suspect we are only breaking our own work here: how about you? > > We should mention such a breaking change. Necessarily we have been very conservative on various aspects of our Java API - there is certainly usage out there. But that has been seen in 2.5 or earlier API definitions. I don't see a problem here - any users will be sophisticated and can readily adapt. 
> > We would be saying in this that the Jython PyBuffer is allowed to be less like the CPython one than I've been aiming for. This consistency may be less important than Stefan's use case. The CPython protocol promises efficient access to the storage of an object via a pointer, and we would be saying "only as efficient as a j.n.ByteBuffer" ... although it may turn out there's a backing array. j.n.ByteBuffer does not replace PyBuffer, because it cannot describe strided access or the get-release behaviour. > I think this leads to an API in which what I've tried to do with PyBuffer.Pointer we now do by handing out ByteBuffer slices. So Pointer goes away. In that case getBuf() and getNIOByteBuffer() are probably the same thing. I do not think it is safe to hand out the actual storage: it is almost unavoidable clients would manipulate the internal state (position, limit), surprising each other and the PyBuffer implementation if it relies on them, as I think it should. > Concerning the pointer to object member in CPython Py_Buffer, it seems to be a 3.x feature, which may be why I haven't replicated it. It seems easy to add. (I'd be rewriting all the constructors anyway.) In CPython it's null when there's a buffer but no object. > > Jeff Allen > > On 24/04/2016 15:36, Stefan Richthofer wrote: > Jeff, > > good to hear that you can help with this stuff and also that your answer implies you don't have concerns with the new feature itself. Thinking it through again, I think the following way would be cleanest to add this functionality: > > Add a ByteBuffer-type storage, either exclusively or in addition to byte[] storage. > > > > 1) Version with additional field java.nio.ByteBuffer bufferStorage: > > Case byte[]-backed PyBuffer: > (buffer storage must be view on storage, i.e. 
backed by it and must always point to first element) > > storage is byte[] > bufferStorage is ByteBuffer.wrap(storage) > > getNIOByteBuffer() can use bufferStorage and needn't call ByteBuffer.wrap every time again. > > > Case direct ByteBuffer (likely not having backing array): > > storage is null or if the JVM happens to be capable of providing direct ByteBuffer with byte[] backend: bufferStorage.array() > > bufferStorage is ByteBuffer.allocateDirect(capacity) > > Methods that used to access elements of storage directly are enriched by a fallback for case storage == null. The fallback would directly operate on bufferStorage. > > > > > 2) Version with exclusive Buffer-storage: > > storage type is java.nioByteBuffer instead of byte[] > > > Case byte[]-backed PyBuffer: > > storage is ByteBuffer.allocate(capacity) (i.e. non-Direct, so buffer will have backing array!) > > getNIOByteBuffer() can use storage and needn't call ByteBuffer.wrap. > > Methods that used to access elements of storage directly now do this on storage.array() rather than on storage itself (should be doable by a simple search/replace refactoring more or less). > > > Case direct ByteBuffer (likely not having backing array): > > bufferStorage is ByteBuffer.allocateDirect(capacity) > > Methods that used to access elements of storage directly are enriched by a fallback for case storage.hasArray() == false. The fallback would directly operate on storage's ByteBuffer methods. > > > I can do the work of writing the fallbacks or help with it up to your discretion. > > > Then another thing: I noticed CPython's PyBuffer-pendant contains a reference to the PyObject that exported it, so you can always find the origin of a given PyBuffer. I don't see how this would be feasible with Jython's current PyBuffer implementation. So from JyNI perspective I can store (as a mapping) the exporter in case it is known for some reason, e.g. because PyBuffer was converted from a native CPython-like variant. 
> However there could be situations where the buffer comes from Jython and the origin would be unknown. In that case I would (currently) just provide a NULL-value or PyNone for this field and hope to get away with it for the important extensions. Maybe we could attach a PyBuffer's origin in Jython too...? (e.g. as a JyAttribute only if some global flag is set, which JyNI would then set on load). > > Best > > Stefan > > > > Sent: Saturday, 23 April 2016 at 20:14 > From: "Jeff Allen" <ja...@fa...> > To: "Stefan Richthofer" <Ste...@gm...> > Cc: jim...@py... > Subject: Re: [Jython-dev] Jython buffer protocol > > Hi Stefan. > > Refreshing my memory about how these classes work, I can see that I took > at face value the CPython view that the purpose of the buffer interface > is to give clients access to the underlying array of bytes, so > abstraction of the storage always gave way to what I thought would be > efficient. (Abstraction of the unit to be something other than byte is > sketched but clarity and a use case eluded me.) > > I always feel I've failed if I have to cast. My instinct is for option a. > > But I think you would not create a "Direct" parallel to BaseBuffer, > since it contains a lot of helper methods independent of the storage > implementation. Rather, factor it into two layers, the first being > either BaseBuffer or AbstractBuffer (depending on what causes least > pain) and the next layer being two base classes, one the revised > BaseBuffer containing: > protected byte[] storage; > and the other containing: > protected ByteBuffer storage; > And in each you migrate case whatever it seems natural should come along > with these declarations. > > I've been meaning to get back to Jython: I could do this groundwork if > that would not be confusing. 
> > Jeff > > Jeff Allen > > On 22/04/2016 21:50, Stefan Richthofer wrote: > Hello Jeff, > > I'm warming up this old thread, because I am about to start actual work on JyNI's support > for buffer-protocol / the PyBuffer builtin type. > I'd like to point you to my recent pull request https://github.com/jythontools/jython/pull/39. > It's a preliminary step for adding support for direct java.nio.ByteBuffers. After establishing this flag > I am going to add some actual support for it. I see basically two ways to go for this > > a) Create a parallel class hierarchy to BaseBuffer et al, backed by direct ByteBuffers. E.g. > call everything with "Direct": DirectBaseBuffer, DirectSimpleBuffer etc. > Then let BufferProtocol implementers check for the flag and use Direct counterpart of the > usually used Buffer-Class accordingly. > > or > > b) Modify existing BaseBuffer such that storage is Object rather than byte[]. Then according to > flags it will be byte[] or ByteBuffer. This variant will involve more explicit type casting than > a), but would involve fewer new classes however. > > What is your opinion about this? > > Best > > Stefan > |
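The two storage cases in the quoted proposal (array-backed versus direct, with a fallback when storage.hasArray() is false) can be sketched roughly as below. This is only an illustration using plain java.nio calls; the class StorageAccessSketch and its method byteAt are hypothetical names, not part of Jython.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper (not Jython API): read one byte from either kind of storage. */
final class StorageAccessSketch {

    /** Read the byte at index, using the backing array when one is accessible. */
    static byte byteAt(ByteBuffer storage, int index) {
        if (storage.hasArray()) {
            // Heap buffer: index the byte[] directly, allowing for the array offset.
            return storage.array()[storage.arrayOffset() + index];
        }
        // No accessible array (typical for direct buffers): absolute get on the buffer.
        return storage.get(index);
    }

    public static void main(String[] args) {
        byte[] data = {10, 20, 30, 40};

        ByteBuffer heap = ByteBuffer.wrap(data);           // view backed by data
        ByteBuffer direct = ByteBuffer.allocateDirect(4);  // off-heap storage
        direct.put(data);  // bulk relative put; absolute gets ignore the position

        System.out.println("heap:   " + byteAt(heap, 2));
        System.out.println("direct: " + byteAt(direct, 2));
    }
}
```

Both calls read the same value; only the access path differs, which is the point of the fallback.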
From: Jeff A. <ja...@fa...> - 2016-05-11 06:56:40
|
This is working out reasonably well. It results in widespread change, but mostly downwards in complexity. There is less reason to give special cases a fast path when the underlying storage is indirect. Hopefully bulk sequential operations on ByteBuffer implementations are well-optimised.

getByteBuffer(int index), which is actually still getNIOByteBuffer(int index), does not seem to do as much for me as the Pointer equivalent. I think requiring a PyBuffer to offer you its index calculation is the way to go. Something like:

    assert pybuf.getNdim() == 2;
    assert pybuf.getShape()[0] == x.length;

    // Extract column c
    ByteBuffer bb = pybuf.getNIOByteBuffer(PyBUF.FULL);
    for (int r=0; r<x.length; r++)
        x[r] = bb.getFloat( pybuf.index(r,c) );

Jeff Allen

On 07/05/2016 00:50, Jeff Allen wrote:
> I'm following the deprecation route at the moment, but bearing in mind
> Jim's view that breaking change is acceptable by virtue of low adoption,
> this may only be a transient arrangement. I don't want to maintain two
> approaches to storage. I favour adding:
>
>     ByteBuffer getByteBuffer(); // = getNIOByteBuffer
>     ByteBuffer getByteBuffer(int index);
>     ByteBuffer getByteBuffer(int... indices);
>
> which differ only in the position() of the returned buffer. Each returns
> a new ByteBuffer, so that clients may call the incremental get and put
> methods without interfering. An alternative is to have only the first,
> but expose the index calculation helpers so one can set the position
> easily in complex cases.
>
> Writing the test code first was quite helpful in this case.
>
> The current getNIOByteBuffer attempts to set the buffer limit according
> to the actual data extent in the view (which is not the whole underlying
> byte array when it's a slice). This seems unnecessary, and is only
> useful in the contiguous case. I figure you should always get the whole
> thing, then work out how many items to read and write from the
> navigation, not from ByteBuffer.remaining().
|
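The idea in the message above, that each getByteBuffer call returns a new ByteBuffer differing only in position(), so that clients do not disturb each other or the exporter, might look roughly like this. It is a sketch under assumptions: PositionedViewSketch, its fields, and its index method are made-up names for illustration and do not reproduce Jython's PyBuffer.

```java
import java.nio.ByteBuffer;

/** Hypothetical strided view over shared storage; names are illustrative only. */
final class PositionedViewSketch {
    private final ByteBuffer storage; // the real storage is never handed out
    private final int index0;         // byte index of item (0, 0, ...)
    private final int[] strides;      // byte step for each dimension

    PositionedViewSketch(ByteBuffer storage, int index0, int... strides) {
        this.storage = storage;
        this.index0 = index0;
        this.strides = strides.clone();
    }

    /** Index polynomial: index0 + sum over k of indices[k] * strides[k]. */
    int index(int... indices) {
        int p = index0;
        for (int k = 0; k < indices.length; k++) {
            p += indices[k] * strides[k];
        }
        return p;
    }

    /** A fresh buffer on the same bytes, positioned at the addressed item. */
    ByteBuffer getByteBuffer(int... indices) {
        ByteBuffer bb = storage.duplicate(); // shared content, independent position and limit
        bb.position(index(indices));
        return bb;
    }

    public static void main(String[] args) {
        ByteBuffer storage = ByteBuffer.allocate(12);
        for (int i = 0; i < 12; i++) {
            storage.put(i, (byte) i);
        }
        // 3 rows x 4 columns of single bytes, row-major: strides are 4 and 1.
        PositionedViewSketch view = new PositionedViewSketch(storage, 0, 4, 1);
        ByteBuffer a = view.getByteBuffer(1, 2);
        ByteBuffer b = view.getByteBuffer(2, 0);
        a.get(); // a relative get advances a only
        System.out.println(a.position() + " " + b.position() + " " + storage.position());
    }
}
```

Because duplicate() gives every caller its own position and limit, relative get and put calls on one returned buffer leave the others, and the exporter's own buffer, where they were.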
From: Stefan R. <Ste...@gm...> - 2016-05-11 13:39:00
|
Hey Jeff,

> This is working out reasonably well.

Sounds like good news! Would you put a draft e.g. on github once it is somehow at a sane state?

> ByteBuffer getByteBuffer(int... indices);

I wonder what this is supposed to do; afaik ByteBuffer supports no multi-index logic. (Correct me if I'm wrong).

> // Extract column c
> ByteBuffer bb = pybuf.getNIOByteBuffer(PyBUF.FULL);
> for (int r=0; r<x.length; r++)
>     x[r] = bb.getFloat( pybuf.index(r,c) );

This looks slow, because method calls are slow (compared to array-access) and it requires at least two calls per index. Maybe JIT applies some magic here, but I would not count on it. However I guess it's presented fairly out of context, so I may have got the wrong impression. (Which is why I'm looking forward to a complete draft as mentioned above).

> I'm following the deprecation route at the moment, but bearing in mind
> Jim's view that breaking change is acceptable by virtue of low adoption,

I wonder if there is any evidence how low adoption currently is at all. Are there any publicly known projects using this?

Best

Stefan

> Sent: Wednesday, 11 May 2016 at 08:55
> From: "Jeff Allen" <ja...@fa...>
> To: jyt...@li...
> Subject: Re: [Jython-dev] Jython buffer protocol
>
> This is working out reasonably well. It results widespread change, but
> mostly downwards in complexity. There is less reason to give special
> cases a fast path when the underlying storage is indirect. Hopefully
> bulk sequential operations on ByteBuffer implementations are well-optimised.
>
> getByteBuffer(int index), which is actually still getNIOByteBuffer(int
> index), does not seem to do as much for me as the Pointer equivalent. I
> think requiring a PyBuffer to offer you its index calculation is the way
> to go.
|
From: Jeff A. <ja...@fa...> - 2016-05-11 18:15:26
|
On 11/05/2016 14:38, Stefan Richthofer wrote:
> Sounds like good news! Would you put a draft e.g. on github once it is
> somehow at a sane state?

How was it Jim dignified our process ... ah yes, commit-then-review. :) I'll share the elements somehow to check my thinking.

>> ByteBuffer getByteBuffer(int... indices);
> I wonder what this is supposed to do; afaik ByteBuffer supports no
> multi-index logic. (Correct me if I'm wrong).

No, but PyBuffer does. The above returns a ByteBuffer where the position has been set corresponding to the index polynomial.

>> // Extract column c
>> ByteBuffer bb = pybuf.getNIOByteBuffer(PyBUF.FULL);
>> for (int r=0; r<x.length; r++)
>>     x[r] = bb.getFloat( pybuf.index(r,c) );
> This looks slow, because method calls are slow (compared to array-access)
> and it requires at least two calls per index.

I prefer it to x[r] = pybuf.getByteBuffer(r, c).getFloat() which has object construction too. If you want speed you have to do the index calculation by striding from pybuf.index(0,c), but then you will still call ByteBuffer.getFloat(int) rather than an array access.

I think efficiency has ceased to be the main objective. Client code that uses array access (equivalently, Pointer) will break when it encounters your ByteBuffer-based objects, so we should make access via the ByteBuffer convenient.

>> I'm following the deprecation route at the moment, but bearing in mind
>> Jim's view that breaking change is acceptable by virtue of low adoption,
> I wonder if there is any evidence how low adoption currently is at all.
> Are there any publicly known projects using this?

I abbreviated Jim's argument here. Few and sophisticated enough to adapt was his view. I can't find any public projects using our PyBuffer. To suffer from the change, they would have to be projects that use Pointer, not just PyBuffer.

Jeff
|
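The "striding" approach mentioned above (compute the index of item (0, c) once, then add the row stride each step, still using the absolute ByteBuffer.getFloat(int)) could be sketched as follows. It assumes a contiguous row-major matrix of 4-byte floats; ColumnStrideSketch and column are hypothetical names, and the PyBuffer index helper is replaced here by explicit arithmetic.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

/** Hypothetical illustration: extract a column by striding through a ByteBuffer. */
final class ColumnStrideSketch {

    /** Copy column c of a rows x cols row-major float matrix stored in bb. */
    static float[] column(ByteBuffer bb, int rows, int cols, int c) {
        final int itemsize = Float.BYTES;       // 4 bytes per float
        final int rowStride = cols * itemsize;  // bytes between successive rows
        float[] x = new float[rows];
        int p = c * itemsize;                   // byte index of item (0, c)
        for (int r = 0; r < rows; r++, p += rowStride) {
            x[r] = bb.getFloat(p);              // absolute get: position is not changed
        }
        return x;
    }

    public static void main(String[] args) {
        int rows = 3, cols = 2;
        ByteBuffer bb = ByteBuffer.allocateDirect(rows * cols * Float.BYTES);
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < cols; c++) {
                bb.putFloat((r * cols + c) * Float.BYTES, 10 * r + c);
            }
        }
        System.out.println(Arrays.toString(column(bb, rows, cols, 1))); // [1.0, 11.0, 21.0]
    }
}
```

This keeps one method call per element and no object construction inside the loop, at the cost of the client knowing the strides.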
From: Jeff A. <ja...@fa...> - 2016-06-04 16:50:49
|
After extensive refactoring, I'm close now to a satisfactory solution based on parallel hierarchies: one for byte[] and one for ByteBuffer, each with its own JUnit test. Tests pass on both, including when I allocate direct buffers.

I had a shot at simply replacing the prior implementation of the buffers with one based on ByteBuffer, wrapping an array where that's the real storage. It nearly worked, but involved breaking too many things at once to keep track of, and it was just easier to build a new one next door.

I think having two implementations has helped me get the base material (BaseBuffer and the test support) into better shape. It was less clear what should be in the base material, and what belongs to a particular implementation choice, when there was only one. This is the bit I'm polishing now. I'm also enjoying how JUnit4 parameterisation lets me debug just one implementation type at a time.

I should have something to show this week.

Jeff Allen
|
From: Jeff A. <ja...@fa...> - 2016-06-11 09:37:35
|
Stefan:

A sane version of the nio buffer work now exists for your delight at:
https://bitbucket.org/tournesol/jython-nio

I've chosen configuration options at Bitbucket to avoid anyone forking it and to try to avoid unnecessary merges finding their way into the main Jython repo. However, I wouldn't have committed it even locally if it weren't pretty good. If these tricks prevent you getting the code, let me know and I'll open it up a bit more.

test_memoryview fails because my new code does not correctly handle overlapped copy to self. A fix will follow. The strided overlapped case is quite tricky, and I think it must have been broken all along. (I wonder if CPython gets it right.)

Jeff

Jeff Allen
|
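Note on the overlapped copy to self mentioned above: a minimal sketch in plain Java of why a strided copy within one array can go wrong, and of the simplest safe alternative, which gathers the source items into a temporary array before writing any of them. This is an illustration under assumed names (OverlapCopy, naive, viaTemp), not the fix that went into jython-nio.

```java
// Hypothetical illustration only: not the jython-nio implementation.
class OverlapCopy {

    /** Element-by-element copy; reads and writes interleave, so overlap can corrupt data. */
    static void naive(byte[] a, int src, int srcStride, int dst, int dstStride, int n) {
        for (int i = 0; i < n; i++) {
            a[dst + i * dstStride] = a[src + i * srcStride];
        }
    }

    /** Overlap-safe copy: read every source item first, then write them all. */
    static void viaTemp(byte[] a, int src, int srcStride, int dst, int dstStride, int n) {
        byte[] t = new byte[n];
        for (int i = 0; i < n; i++) {
            t[i] = a[src + i * srcStride];   // gather
        }
        for (int i = 0; i < n; i++) {
            a[dst + i * dstStride] = t[i];   // scatter
        }
    }
}
```

With contiguous data, choosing the copy direction (as memmove does) is enough. With differing strides, for example reversing a slice in place using a destination stride of -1, neither direction is safe for the naive loop, which is one reason the strided case is tricky; the temporary costs an extra allocation but is always correct.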
From: Stefan R. <Ste...@gm...> - 2016-06-13 00:04:28
|
Hey Jeff,

thanks a lot for this work. I will take a closer look as soon as I find time.
However, so far - quickly scrolling through some source-files - it looks pretty good.

Best

Stefan
|
From: Jeff A. <ja...@fa...> - 2016-07-15 07:42:52
|
Stefan:

I recently pushed more changes to my bitbucket fork, including addition of a getObj() to PyBuffer in response to your need to navigate to the exporting object.

The existing PyBuffer interface provides copyTo/From byte arrays. With support for non-heap NIO storage it seems natural (and not too hard) to add copyTo/From ByteBuffer. At present these are in the NIO implementation, but not made interface items. Do you think this would be bloat, nice-to-have, or really useful for what you were hoping to do?

Jeff

Jeff Allen
|
From: Stefan R. <Ste...@gm...> - 2016-07-17 14:27:04
|
Hello Jeff,

sorry for the delay. I was (and still am) busy with adding NumPy support, and it turned out that NumPy is okay with PyMemoryView_FromObject returning null for now (I suppose it has a fallback for that). It actually does call that method, which is why I thought the buffer protocol (which PyMemoryView_FromObject is based on) would be an urgent need for NumPy support. Of course I still want to add the buffer protocol to JyNI, but I won't find time to look at this before NumPy support has moved on some more. So I have not yet taken a detailed look at your work. However, my main concern would be that no index-iterating (NIO-bridge) method performs method calls within a loop; such methods should instead be implemented using bulk-access methods. If this is already the case, I would most likely have no further concerns.

Thanks for adding getObj(); this is useful in any case.

> Do you think this would be bloat, nice-to-have, or really useful for what you were hoping to do?

This sounds like it is mainly relevant for Java integration and not so much for JyNI. Spontaneously I'd give it a "nice to have".

To give you a definite idea of my time management: I will resume work on the BufferProtocol front after
a) Jython 2.7.1 has been released
b) JyNI 2.7-alpha.4 has been released.

Best,

Stefan
|
From: Jeff A. <ja...@fa...> - 2016-07-19 20:01:01
|
Thanks Stefan: nothing obviously crazy about the concept then.

I'll take others' advice (Jim?) on whether this kind of change is too much for 2.7.1, or feels safe.

Concerning loops with calls in them, there's always an implementation like that near the base of the hierarchy so that the non-contiguous case is catered for, then the option (which I like to take up) of using a bulk method in the contiguous sub-class. If we suddenly wanted to support CopyTo/From ByteBuffer in the API, implementing it efficiently could follow along.

Main thing is you have your buffer protocol interface onto non-heap storage to try when you can.

Jeff

Jeff Allen
|
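Note on the base-class loop versus bulk method pattern described above: a minimal sketch in plain Java, assuming a one-dimensional view of bytes over ByteBuffer storage. The general class copies with one absolute get per element so that any stride works, and the contiguous subclass overrides the same method with a single bulk get. The names BulkCopy, StridedView and ContiguousView are hypothetical and do not correspond to classes in Jython.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only: these are not Jython's buffer classes.
class BulkCopy {

    /** General case near the base of the hierarchy: works for any stride. */
    static class StridedView {
        final ByteBuffer storage;
        final int index0, stride, length;

        StridedView(ByteBuffer storage, int index0, int stride, int length) {
            this.storage = storage;
            this.index0 = index0;
            this.stride = stride;
            this.length = length;
        }

        /** One absolute get per element: a method call inside the loop. */
        void copyTo(byte[] dest, int destPos) {
            int p = index0;
            for (int i = 0; i < length; i++, p += stride) {
                dest[destPos + i] = storage.get(p);
            }
        }
    }

    /** Contiguous special case (stride 1): overrides copyTo with a bulk transfer. */
    static class ContiguousView extends StridedView {
        ContiguousView(ByteBuffer storage, int index0, int length) {
            super(storage, index0, 1, length);
        }

        @Override
        void copyTo(byte[] dest, int destPos) {
            ByteBuffer src = storage.duplicate(); // shares content, has its own position
            src.position(index0);
            src.get(dest, destPos, length);       // single bulk get
        }
    }

    /** Direct (non-heap) storage holding the bytes 0, 1, ..., n-1. */
    static ByteBuffer demoStorage(int n) {
        ByteBuffer bb = ByteBuffer.allocateDirect(n);
        for (int i = 0; i < n; i++) {
            bb.put(i, (byte) i);
        }
        return bb;
    }
}
```

Working on a duplicate() keeps the position of the shared storage untouched, which matters if several views export the same buffer.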
From: Jeff A. <ja...@fa...> - 2016-08-24 08:02:31
|
Jim & all:

I feel this has been sitting off to one side long enough. Do we feel safe that I can merge the underlying work?
https://bitbucket.org/tournesol/jython-nio

I'll write a short paragraph for NEWS as the last change. On a trivial matter of technique, that para goes above the "Jython 2.7.1rc" heading, ready for an rc2 heading above that when we get there, right?

Jeff

Jeff Allen
|