From: Todd M. <jm...@st...> - 2004-06-30 21:47:43
On Wed, 2004-06-30 at 15:57, Tim Hochberg wrote:

> I spent some time seeing what I could do in the way of speeding up
> wxPoint_LIST_helper by tweaking the numarray code. My first suspect was
> _universalIndexing by way of _ndarray_item. However, due to some
> new-style machinations, _ndarray_item was never getting called; instead,
> _ndarray_subscript was being called. So, I added a special case to
> _ndarray_subscript. This sped things up by 50% or so (I don't recall
> exactly). The code for that is at the end of this message; it's not
> guaranteed to be 100% correct; it's all experimental.
>
> After futzing around some more I figured out a way to trick Python into
> using _ndarray_item. I added "type->tp_as_sequence->sq_item =
> _ndarray_item;" to _ndarray_new.

I'm puzzled why you had to do this. You're using Python 2.3.x, right?
There's conditionally compiled code which should be doing this statically.
(At least I thought so.) I've put a sketch of the static hookup I mean at
the end of this reply.

> I then optimized _ndarray_item (code at end). This halved the execution
> time of my arbitrary benchmark. This trick may have horrible, unforeseen
> consequences, so use at your own risk.

Right now the sq_item hack strikes me as somewhere between completely
unnecessary and too scary for me! Maybe if python-dev blessed it. The
_ndarray_item optimization itself looks good to me.

> Finally, I commented out the __del__ method in numarraycore. This
> resulted in an additional speedup of 64%, for a total speedup of 240%.
> Still not close to 10x, but a large improvement. This is obviously not
> viable for real use, but it's enough of a speedup that I'll try to see
> if there's any way to move the shadow stuff back to tp_dealloc.

FYI, the issue with tp_dealloc may have to do with which mode Python is
compiled in: --with-pydebug or not.

One approach which seems like it ought to work (I just thought of this!)
is to add an extra reference in C to the NumArray instance __dict__ (from
NumArray.__init__, stashed via a new attribute in the PyArrayObject
struct) and then DECREF it as the last part of tp_dealloc. There's a rough
sketch of that below as well.

> In summary:
>
> Version              Time    Rel Speedup   Abs Speedup
> Stock                0.398   ----          ----
> _ndarray_item mod    0.192   107%          107%
> del __del__          0.117   64%           240%
>
> There were a couple of other things I tried that resulted in additional
> small speedups, but the tactics I used were too horrible to reproduce
> here. The main one of interest is that all of the calls to
> NA_updateDataPtr seem to burn some time. However, I don't have any idea
> what one could do about that.

Francesc Alted had the same comment about NA_updateDataPtr a while ago. I
tried to optimize it then but didn't get anywhere. NA_updateDataPtr()
should be called at most once per extension function (more is unnecessary
but not harmful), but it needs to be called at least once as a consequence
of the way the buffer protocol doesn't hand out locked pointers. The
calling pattern I have in mind is also sketched below.

> That's all for now.
>
> -tim

Well, be picking out your beer.
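
Here's roughly what I'd expect the static sq_item hookup to look like. The
version check and the table name below are guesses for illustration only;
they're not lifted from the numarray sources:

static PyObject *_ndarray_item(PyArrayObject *self, int i);  /* as in your patch below */

#if PY_VERSION_HEX >= 0x02030000        /* stand-in for the real compile-time condition */
static PySequenceMethods _ndarray_as_sequence = {
        0,                              /* sq_length */
        0,                              /* sq_concat */
        0,                              /* sq_repeat */
        (intargfunc) _ndarray_item,     /* sq_item -- the slot you assigned at runtime */
        0,                              /* sq_slice */
        0,                              /* sq_ass_item */
        0,                              /* sq_ass_slice */
        0,                              /* sq_contains */
};
#endif

with the static PyTypeObject's tp_as_sequence member set to
&_ndarray_as_sequence, so nothing would need to be assigned in
_ndarray_new at runtime.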
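
And here's roughly the __dict__ trick I have in mind. The _held_dict field
and the _ndarray_hold_dict helper are names I made up just to illustrate;
the real struct member and the hookup from NumArray.__init__ would still
need to be worked out:

/* One new field in the PyArrayObject struct (illustrative name):
 *      PyObject *_held_dict;      extra reference to the instance __dict__
 */

/* Called once from NumArray.__init__ (via a small C hook) to grab and
   hold the instance dictionary. */
static int
_ndarray_hold_dict(PyArrayObject *self)
{
        PyObject *dict = PyObject_GetAttrString((PyObject *) self, "__dict__");
        if (!dict)
                return -1;
        self->_held_dict = dict;        /* new reference, kept until dealloc */
        return 0;
}

static void
_ndarray_dealloc(PyArrayObject *self)
{
        /* ... do the shadow write-back here; the __dict__ (and whatever the
           shadow code keeps in it) is still guaranteed to be alive ... */

        Py_XDECREF(self->_held_dict);   /* release the dict as the last step */
        self->_held_dict = NULL;
        self->ob_type->tp_free((PyObject *) self);
}

That way the Python-level __del__ could go away entirely and the shadow
write-back would happen in C.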
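
Finally, on NA_updateDataPtr, the calling pattern I mean is just "once on
entry, then use the pointer". A made-up example (it assumes the usual
->data member, contiguous Float64 data, and skips bounds checks):

static PyObject *
_example_sum_first_n(PyArrayObject *self, PyObject *args)
{
        int i, n;
        double total = 0.0;

        if (!PyArg_ParseTuple(args, "i", &n))
                return NULL;

        /* Refresh the data pointer exactly once per entry from Python;
           the buffer protocol doesn't hand out locked pointers, so this
           can't be skipped, but calling it again buys nothing. */
        if (!NA_updateDataPtr(self))
                return NULL;

        for (i = 0; i < n; i++)
                total += ((double *) self->data)[i];

        return PyFloat_FromDouble(total);
}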
Todd

> static PyObject*
> _ndarray_subscript(PyArrayObject* self, PyObject* key)
> {
>     PyObject *result;
> #ifdef TAH
>     if (PyInt_CheckExact(key)) {
>         long ikey = PyInt_AsLong(key);
>         long offset;
>         if (NA_getByteOffset(self, 1, &ikey, &offset) < 0)
>             return NULL;
>         if (!NA_updateDataPtr(self))
>             return NULL;
>         return _simpleIndexingCore(self, offset, 1, Py_None);
>     }
> #endif
> #if _PYTHON_CALLBACKS
>     result = PyObject_CallMethod(
>         (PyObject *) self, "_universalIndexing", "(OO)", key, Py_None);
> #else
>     result = _universalIndexing(self, key, Py_None);
> #endif
>     return result;
> }
>
> static PyObject *
> _ndarray_item(PyArrayObject *self, int i)
> {
> #ifdef TAH
>     long offset;
>     if (NA_getByteOffset(self, 1, &i, &offset) < 0)
>         return NULL;
>     if (!NA_updateDataPtr(self))
>         return NULL;
>     return _simpleIndexingCore(self, offset, 1, Py_None);
> #else
>     PyObject *result;
>     PyObject *key = PyInt_FromLong(i);
>     if (!key) return NULL;
>     result = _universalIndexing(self, key, Py_None);
>     Py_DECREF(key);
>     return result;
> #endif
> }