From: Jeff A. <ja...@fa...> - 2013-09-28 23:18:34
The alert will have noticed that Jython now supports the buffer type. buffer() is reasonably complete (not sure about comparisons), but the thing that lets it down is that other objects do not know how to accept the buffer interface as an argument. That needs correcting, and I'm starting with str (PyString).

I have worked through the API of str, and almost everywhere a str/bytes argument is allowed, it will accept a buffer in CPython 2.7.5. It doesn't accept a memoryview in the same places until Python 3k. In Jython, buffer() and memoryview() both implement the same buffer interface, so to accept one is to accept the other (unless I actively prevent it for consistency with CPython, which I think foolish).

I would like to check that these things are not bad ideas before I write too much:

1. I will change the signature of about a dozen of the exposed methods to take a PyObject argument where they currently take Java String. (They are package-visible, so I should find any call sites easily.)

2. I will re-use the implementation of affected methods, but have a helper function that derives the String from whatever the PyObject argument is. This helper function is also responsible for raising a TypeError. The CPython 2.7.5 message is "expected a character buffer object". I propose to use the 3.3-ish "expected str, bytearray or buffer compatible object".

3. The CPython 3.3 bytes type accepts memoryview in a few more places than 2.7.5 accepts buffer. I plan to accept either in all the places 3.3 accepts memoryview.

Any pitfalls?

Jeff
--
Jeff Allen
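For readers who want to see the shape of the helper proposed in point 2, here is a minimal sketch in Python 3 rather than in Jython's Java. It is not the Jython implementation: the names `as_bytes`, `find` and `PROPOSED_MESSAGE` are hypothetical, and `memoryview()` stands in for "anything that offers the buffer interface". It only illustrates the idea of one coercion step that also owns the TypeError.

```python
# Hypothetical sketch: none of these names come from Jython itself.
PROPOSED_MESSAGE = "expected str, bytearray or buffer compatible object"


def as_bytes(obj):
    """Return the bytes held by any object exposing the buffer interface."""
    try:
        # memoryview() succeeds for bytes, bytearray, memoryview and other
        # buffer-compatible objects, and raises TypeError for anything else.
        return memoryview(obj).tobytes()
    except TypeError:
        # The helper is the single place that reports a bad argument.
        raise TypeError(PROPOSED_MESSAGE) from None


def find(data, sub, start=None, end=None):
    """The existing search is reused; only the argument coercion is new."""
    return data.find(as_bytes(sub), start, end)


print(find(b"asdfdef", memoryview(b"def")))
print(find(b"asdfdef", bytearray(b"sd")))
```

Because every affected method would go through the same helper, the error message and the set of accepted types stay consistent across the API.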
From: <fwi...@gm...> - 2013-09-29 23:15:38
On Sat, Sep 28, 2013 at 4:18 PM, Jeff Allen <ja...@fa...> wrote:
> The alert will have noticed that Jython now supports the buffer type.
> buffer() is reasonably complete (not sure about comparisons) but the
> thing that lets it down is that other objects do not know how to accept
> the buffer interface as an argument. That needs correcting, and I'm
> starting with str (PyString).
>
> I have worked through the API of str, and almost everywhere a str/bytes
> argument is allowed, it will accept a buffer in CPython 2.7.5. It
> doesn't accept a memoryview in the same places, until Python 3k. In
> Jython buffer() and memoryview() both implement the same buffer
> interface, so to accept one is to accept the other (unless I actively
> prevent it for consistency with CPython, which I think foolish).
>
> I would like to check that these things are not bad ideas before I write
> too much:
>
> 1. I will change the signature of about a dozen of the exposed methods
> to take a PyObject argument where they currently take Java String. (They
> are package-visible, so I should find any call sites easily.)

Is it possible to overload the functions so that we still take java.lang.String? I think it would be too bad if we had to lose the ability to call the methods that take java.lang.String. For example, if we are calling from Java we might have a java.lang.String and it would be inconvenient to be forced to wrap it in a PyObject first. Thoughts?

> 2. I will re-use the implementation of affected methods, but have a
> helper function that derives the String from whatever the PyObject
> argument is. This helper function is also responsible for raising a
> TypeError. The CPython 2.7.5 message is "expected a character buffer
> object". I propose to use the 3.3-ish "expected str, bytearray or buffer
> compatible object".

Might it be possible to just keep the methods public?

> 3. The CPython 3.3 bytes type accepts memoryview in a few more places
> than 2.7.5 accepts buffer. I plan to accept either in all the places 3.3
> accepts memoryview.

That sounds good to me. Thanks for looking into this!

-Frank
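As a point of reference for item 3, the snippet below shows a few bytes methods in a current CPython 3 taking a memoryview where a bytes argument would normally go. It is only an illustration of the target behaviour, run against CPython 3 and not Jython, and the variable names are made up for the example; it does not claim to list every place 3.3 accepts a memoryview.

```python
haystack = b"asdfdef"
needle = memoryview(b"def")

# Each call below receives a memoryview in place of a bytes argument.
position = haystack.find(needle)
occurrences = haystack.count(memoryview(b"d"))
contained = needle in haystack
replaced = haystack.replace(needle, memoryview(b"xyz"))

print(position, occurrences, contained, replaced)
```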
From: Jim B. <jb...@zy...> - 2013-09-30 03:26:07
I have tried out the buffer type in Twisted (now we have new problems to fix, but this has allowed progress) as well as in test_types. That's a fairly minimal test, but we do need to cover this case:

    if str(a + buffer('def')) != 'asdfdef':
        self.fail('concatenation of buffers yields wrong content')

So this is looking great. More below:

On Sun, Sep 29, 2013 at 5:15 PM, fwi...@gm... <fwi...@gm...> wrote:
> On Sat, Sep 28, 2013 at 4:18 PM, Jeff Allen <ja...@fa...> wrote:
> > The alert will have noticed that Jython now supports the buffer type.
> > buffer() is reasonably complete (not sure about comparisons) but the
> > thing that lets it down is that other objects do not know how to accept
> > the buffer interface as an argument. That needs correcting, and I'm
> > starting with str (PyString).
> >
> > I have worked through the API of str, and almost everywhere a str/bytes
> > argument is allowed, it will accept a buffer in CPython 2.7.5. It
> > doesn't accept a memoryview in the same places, until Python 3k. In
> > Jython buffer() and memoryview() both implement the same buffer
> > interface, so to accept one is to accept the other (unless I actively
> > prevent it for consistency with CPython, which I think foolish).

No need for such foolish consistency, for sure.

> > I would like to check that these things are not bad ideas before I write
> > too much:
> >
> > 1. I will change the signature of about a dozen of the exposed methods
> > to take a PyObject argument where they currently take Java String. (They
> > are package-visible, so I should find any call sites easily.)
> Is it possible to overload the functions so that we still take
> java.lang.String? I think it would be too bad if we had to lose the
> ability to call the methods that take java.lang.String. For example,
> if we are calling from Java we might have a java.lang.String and it
> would be inconvenient to be forced to wrap it in a PyObject first.
> Thoughts?

Agreed with this line of thought.

> > 2. I will re-use the implementation of affected methods, but have a
> > helper function that derives the String from whatever the PyObject
> > argument is. This helper function is also responsible for raising a
> > TypeError. The CPython 2.7.5 message is "expected a character buffer
> > object". I propose to use the 3.3-ish "expected str, bytearray or buffer
> > compatible object".

Makes sense. We try to keep error strings as close as possible to CPython's, but I don't see that as necessary here.

> Might it be possible to just keep the methods public?

Not certain what Frank is asking for here.

> > 3. The CPython 3.3 bytes type accepts memoryview in a few more places
> > than 2.7.5 accepts buffer. I plan to accept either in all the places 3.3
> > accepts memoryview.
> That sounds good to me.

Agreed on that.

- Jim
From: Jeff A. <ja...@fa...> - 2013-10-14 22:01:16
That case is covered by the code I have just pushed, so it is probably worth trying again.

In the same bundle of 4 changes, I have made buffer types acceptable as arguments to the strip() and split() methods. It was a somewhat arbitrary place to start: I'm just working down PyString in the order it's written. I've done more re-work than strictly necessary, but I wanted to understand (and so comment) the helper methods before calling them in changed ways, and once I did, I found work I could eliminate.

I added tests of this acceptability in string_tests.py, and in the process found I wanted memoryview to be a context manager. I have brought in tests from CPython 3.3 in test_memoryview.py to validate that. Let me know if I've reeled in too much Py3k here. It seemed to fit harmlessly.

More changes are due to PyString, but I'm quite busy this week.

Jeff

Jeff Allen

On 30/09/2013 04:25, Jim Baker wrote:
> I have tried out the buffer type in Twisted - now we have new problems
> to fix! but this has allowed progress - as well as in test_types.
> That's a fairly minimal test, but we do need to cover this case:
>
> if str(a + buffer('def')) != 'asdfdef':
>     self.fail('concatenation of buffers yields wrong content')
>
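To illustrate the two behaviours mentioned in that message, the sketch below uses CPython 3, where memoryview is already a context manager and strip() and split() accept buffer objects. It is not taken from string_tests.py or test_memoryview.py; the variable names are invented for the example, and Jython's results are assumed, not confirmed, to match.

```python
chars = bytearray(b" \t")

# Leaving the with-block releases the view, as view.release() would.
with memoryview(chars) as view:
    stripped = b"\t hello world \t".strip(view)

with memoryview(b",") as sep:
    parts = b"a,b,c".split(sep)

# A released memoryview refuses further use.
try:
    view.tobytes()
    released = False
except ValueError:
    released = True

# With the export gone, the underlying bytearray can be resized again.
chars.extend(b"\n")

print(stripped, parts, released, chars)
```

The context manager matters mostly for resizable objects such as bytearray, which cannot change size while a view on them is still alive.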
From: Jeff A. <ja...@fa...> - 2013-10-01 07:10:22
I was perhaps not clear enough in my first question. Below ...

On 30/09/2013 00:15, fwi...@gm... wrote:
> On Sat, Sep 28, 2013 at 4:18 PM, Jeff Allen <ja...@fa...> wrote:
>> ...
>> I would like to check that these things are not bad ideas before I write
>> too much:
>>
>> 1. I will change the signature of about a dozen of the exposed methods
>> to take a PyObject argument where they currently take Java String. (They
>> are package-visible, so I should find any call sites easily.)
> Is it possible to overload the functions so that we still take
> java.lang.String? I think it would be too bad if we had to lose the
> ability to call the methods that take java.lang.String. For example,
> if we are calling from Java we might have a java.lang.String and it
> would be inconvenient to be forced to wrap it in a PyObject first.
>

I mean that, for example, we have:

    public int find(String sub) {
        return str_find(sub, null, null);
    }

    public int find(String sub, PyObject start) {
        return str_find(sub, start, null);
    }

    public int find(String sub, PyObject start, PyObject end) {
        return str_find(sub, start, end);
    }

    @ExposedMethod(defaults = {"null", "null"}, doc = BuiltinDocs.str_find_doc)
    final int str_find(String sub, PyObject start, PyObject end) {
        ...
    }

In the last, I will change the signature to:

    final int str_find(PyObject sub, PyObject start, PyObject end)

The method itself then has to take care of the actual type instead of the calling mechanism doing the coercion. It may be a problem if anyone is calling it directly, but as it is package-visible, either that's within our codebase or they knew they were taking a risk.

I haven't considered fully the public API, which I think is what you are talking about. Overloading methods would work, but we'd have 6 methods where we have 3 already, just for find(), with the addition of:

    public int find(PyObject sub)
    public int find(PyObject sub, PyObject start)
    public int find(PyObject sub, PyObject start, PyObject end)

Or maybe:

    public int find(BufferProtocol sub)
    ...

It just seems a lot of methods, like I was missing The Better Way.

Jeff
From: <fwi...@gm...> - 2013-10-01 17:16:37
On Tue, Oct 1, 2013 at 12:10 AM, Jeff Allen <ja...@fa...> wrote:
> @ExposedMethod(defaults = {"null", "null"}, doc =
> BuiltinDocs.str_find_doc)
> final int str_find(String sub, PyObject start, PyObject end) {
>     ...
> }
>
> In the last, I will change the signature to:
> final int str_find(PyObject sub, PyObject start, PyObject end)
>
> The method itself then has to take care of the actual type instead of the
> calling mechanism doing the coercion. It may be a problem if anyone is
> calling it directly, but as it is package-visible, either that's within our
> codebase or they knew they were taking a risk.

Ah, then it was my confusion. Changing the package-private methods in this way is no problem: they are intended to be used by the Python API already, and coercion is already happening in general.

-Frank