From: Mitchell N C. <mch...@ve...> - 2002-03-11 04:11:57
|
This note recently went out to python-list. Fyi, Mitchell ----[]---- Date: Sat, 9 Mar 2002 14:44:02 -0500 To: pyt...@py... Subject: An Optimization Anecdote about PyInline From: Mitchell N Charity <mch...@ve...> Guido's "Python Patterns - An Optimization Anecdote" (http://www.python.org/doc/essays/list2str.html) explores the performance of various ways of converting a list of integers into a string of characters. Such as the straightforward def f1(list): string = "" for item in list: string = string + chr(item) return string and the fastest version import array def f7(list): return array.array('B', list).tostring() # Use 'B' with Python2.2, 'b' with Python1.5 Guido writes [...] I had wanted to try one more approach: write the whole function in C. [...] Given the effort of writing and testing an extension (compared to whipping up those Python one-liners), as well as the dependency on a non-standard Python extension, I decided not to pursue this option... I wanted to revisit this, because with PyInline, the effort has diminished, and a non-standard extension is not required (just PyInline). And, as Guido points out, If you feel the need for speed, [...] - you can't beat a loop written in C. So, here then is a new version, f8(), which uses PyInline. from PyInline import C import __main__ C.Builder(code=""" PyObject* f8(PyObject *list) { PyObject* string = NULL; unsigned char* buffer = NULL; int size, index; #define RAISE(errtype,msg) { PyErr_SetString(errtype,msg); RE_RAISE; } #define RE_RAISE { Py_XDECREF(string); return NULL; } if(!PyList_Check(list)) RAISE(PyExc_TypeError,"a list is required"); size = PyList_Size(list); if(size < 0) RE_RAISE; string = PyString_FromStringAndSize(NULL, size); if(string == NULL) RE_RAISE; buffer = PyString_AsString(string); if(buffer == NULL) RE_RAISE; for(index = 0; index < size; index++) { PyObject *item = NULL; long number; item = PyList_GetItem(list,index); if(item == NULL) RE_RAISE; number = PyInt_AsLong(item); if(PyErr_Occurred() != NULL) RE_RAISE; if(number < 0 || number > 255) RAISE(PyExc_TypeError,"an integer was out of range"); buffer[index] = (unsigned char) number; } return string; } """,targetmodule=__main__).build() The test jig requires a slight change #for f in testfuncs: print f.func_name, f(testdata) for f in testfuncs: print f(testdata) because PyInline currently generates glue which does not understand f.func_name. So, what is the result? First, the downsides. f8() is a C extension, and so only works with C-based Pythons. PyInline invokes an external C compiler, and cruftily caches the result in the filesystem. And, as PyInline is alpha code, there are little misfeatures like not being able to use func_name. Oh yes, and being written in C, f8() is perhaps two orders of magnitude more complex than the Python versions. Though, this last is perhaps a worst case, as f8() is almost all interface code, with maybe just three lines of algorithm. In more typical use, one might see a one order of magnitude cost. And with objects which provide a nice C api, maybe even less. The upside is f8() runs 5 times faster than f7(), the best alternative. And uses half the memory. That's 50 times faster than the naive versions. And again, when python interface code is less dominant, the speedup can be dramatically larger. So, restoring the phrase I clipped out of the quote above, "If you feel the need for speed, go for built-in functions - you can't beat a loop written in C.". And if there isn't one already, you might consider creating one with PyInline. Mitchell Charity (I'm working on a web page http://www.vendian.org/mncharity/dir3/inline/ which explores using PyInline, and Perl's Inline, with objects which provide C api's.) ----[]---- |