From: Travis O. <oli...@ie...> - 2006-07-02 03:20:00
|
I've been playing a bit with ctypes and realized that, with a little help, it could be made much easier to interface with NumPy arrays. Thus, I added a ctypes attribute to the NumPy array. If ctypes is installed, this attribute returns a "conversion" object; otherwise an AttributeError is raised.

The ctypes-conversion object has attributes which return c_types aware objects so that the information can be passed directly to c-code (as an integer, the number of dimensions can already be passed using c-types).

The information available and its corresponding c_type is

data - c_void_p
shape, strides - c_int * nd or c_long * nd or c_longlong * nd, depending on platform

-Travis
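A minimal sketch of what the attribute looks like in use, assuming a current NumPy release. Note one difference from the message above: in current NumPy, `.ctypes.data` is a plain Python integer rather than a `c_void_p`. The array `x` is only an illustrative choice.

```python
import ctypes
import numpy as np

# Illustrative 2x3 array of doubles.
x = np.arange(6, dtype=np.float64).reshape(2, 3)

# Address of the first element (a plain int in current NumPy).
addr = x.ctypes.data

# shape and strides come back as ctypes integer arrays of length x.ndim.
shape = list(x.ctypes.shape)      # [2, 3]
strides = list(x.ctypes.strides)  # [24, 8]: bytes per step along each axis

print(hex(addr), shape, strides)
```

The number of dimensions needs no conversion: `x.ndim` is already an ordinary integer.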
From: Albert S. <fu...@gm...> - 2006-07-02 14:24:42
|
Hello all

Travis Oliphant wrote:
> I've been playing a bit with ctypes and realized that with a little
> help, it could be made much easier to interface with NumPy arrays.
> Thus, I added a ctypes attribute to the NumPy array. If ctypes is
> installed, this attribute returns a "conversion" object otherwise an
> AttributeError is raised.
>
> The ctypes-conversion object has attributes which return c_types aware
> objects so that the information can be passed directly to c-code (as an
> integer, the number of dimensions can already be passed using c-types).
>
> The information available and its corresponding c_type is
>
> data - c_void_p
> shape, strides - c_int * nd or c_long * nd or c_longlong * nd
> depending on platform

I did a few tests and this seems to work nicely:

In [133]: printf = ctypes.cdll.msvcrt.printf

In [134]: printf.argtypes = [ctypes.c_char_p, ctypes.c_void_p]

In [135]: x = N.array([1,2,3])

In [136]: printf('%p\n', x.ctypes.data)
01CC8AC0
Out[136]: 9

In [137]: hex(x.__array_interface__['data'][0])
Out[137]: '0x1cc8ac0'

It would be nice if we could get the _as_parameter_ magic to work as well. See this thread:

http://aspn.activestate.com/ASPN/Mail/Message/ctypes-users/3122558

If I understood Thomas correctly, in the presence of argtypes and an instance, say x, with _as_parameter_, the following is done to convert the instance to something that the function accepts as its nth argument:

func.argtypes[n].from_param(x._as_parameter_)

However, if I try passing x directly to printf, I get this:

In [147]: printf('%p\n', x)
...
ArgumentError: argument 2: exceptions.TypeError: wrong type

However, this much works:

In [148]: ctypes.c_void_p.from_param(x._as_parameter_)
Out[148]: <cparam 'P' (01cc8ac0)>

So I don't understand why the conversion isn't happening automatically.

Another quirk I noticed is that non-void pointers' from_param can't seem to be used with ints. For example:

In [167]: ctypes.POINTER(ctypes.c_double).from_param(x._as_parameter_)
...
TypeError: expected LP_c_double instance instead of int

But this works:

In [168]: ctypes.POINTER(ctypes.c_double).from_address(x._as_parameter_)
Out[168]: <ctypes.LP_c_double object at 0x01DCE800>

I don't think this is too much of an issue though -- you could wrap all your functions to take c_void_ps. If you happen to pass an int32 NumPy array to a function expecting a double*, you might run into problems though.

Maybe there should be a way to get a pointer to the NumPy array data as a POINTER(c_double) if it is known that the array's dtype is float64. Ditto for c_int/int32 and the others.

Regards,

Albert
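The two from_param behaviours described above can be reproduced with ctypes alone. This is a minimal sketch in which a small ctypes buffer (`buf`, an assumption for illustration) stands in for the array data. One caveat worth hedging: `from_address` builds a pointer object whose own storage lives at the given address, so it reinterprets the data bytes as a pointer value; `ctypes.cast` is the call that turns an integer address into a typed pointer to that address.

```python
import ctypes

# A small buffer of doubles stands in for array data; addr is its raw address.
buf = (ctypes.c_double * 3)(1.0, 2.0, 3.0)
addr = ctypes.addressof(buf)

# c_void_p.from_param accepts a plain integer address.
void_param = ctypes.c_void_p.from_param(addr)

# A typed pointer type refuses the same integer.
try:
    ctypes.POINTER(ctypes.c_double).from_param(addr)
    typed_accepts_int = True
except TypeError:
    typed_accepts_int = False

# cast() converts the integer address into a usable double pointer.
ptr = ctypes.cast(addr, ctypes.POINTER(ctypes.c_double))
values = [ptr[i] for i in range(3)]
```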
From: Travis O. <oli...@ie...> - 2006-07-03 05:50:59
|
Albert Strasheim wrote:
> I did a few tests and this seems to work nicely:
>

Hey Albert, I read the post you linked to on the ctypes mailing list. I hope I didn't step on any toes with what I did in NumPy. I was just working on a ctypes interface and realized that a lot of the cruft to convert to what ctypes was expecting could and should be handled in a default way. The conversion of the shapes and strides information to the "right-kind" of ctypes integer plus the inclusion of ctypes in Python 2.5 was enough to convince me to put some kind of hook into the array object. I decided to make the ctypes attribute return an object so that the object could grow additional attributes and/or methods in the future to make it easier to interface with ctypes.

I looked a bit at the source code and was disappointed to see that the _as_parameter_ approach is pretty limited. While there is talk of supporting a tuple return of _as_parameter_ in the source code comments, there is no evidence in the source itself of supporting it.

There is also the changed way of handling additional arguments when argtypes is set on the function, which uses the from_param method. Unfortunately, as Thomas responds to your post, the from_param method must be on one of the ctypes to work. You have to add support specifically for one of the c-data types.

I think the _as_parameter_ approach returning a tuple that could be interpreted as the right ctype was better because it let other objects play the ctypes game. Basically, what you need is a type-map just like swig uses. But, now that ctypes is in Python, it will be slower to change. That's a bit unfortunate.

But, ultimately, it works fine now. I don't know what is really gained by applying an argtypes to a function call anyway --- some kind of "type-checking". Is that supposed to be safer?

For NumPy extension modules, type checking is only a small part of the memory-violation danger. Incorrect array bounds and/or striding is far more common, not to mention unaligned memory areas and/or unwriteable ones (like a read-only memory-mapped file).

Thus, you're going to have to write a small piece of "error-checking" code in Python anyway that calls out to the C-library with the right arguments. So, basically you write an extension module that calls c-code just as you did before, but now the entire "extension" module can all be in Python because the call to an arbitrary C-library is made using ctypes.

For arrays, you will typically need to pass one or more of the data, the dimension information, the stride information, and the number of dimensions. The data-type will be known because function calls usually handle only a specific data-type.

Thus, I started with a ctypes object that produces this needed data in the format that ctypes needs, so it can be very easy to use an array with the ctypes module.

Frankly, I'm quite impressed with the ease of accessing C-code available using c-types. It quite rivals f2py in enjoyment using it.

One thing I like about c-types over Pyrex, for example, is that it lets you separate the C-code from the Python code instead of "mixing it all together".

I wouldn't be surprised if c-types became the dominant way to interface C/C++ and possibly even Fortran code (but it has tougher competition in f2py) once it grows up a little with additional ease-of-use.

-Travis
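The "error-checking code in Python that calls out to the C-library" pattern might look like the sketch below. It assumes a current NumPy; `zero_fill` is a hypothetical wrapper name, and `ctypes.memset` stands in for a function from a real C library so the example runs without compiling anything.

```python
import ctypes
import numpy as np

def zero_fill(arr):
    """Hypothetical wrapper: validate in Python, then hand raw memory to C."""
    if not isinstance(arr, np.ndarray):
        raise TypeError("expected a NumPy array")
    if arr.dtype != np.float64:
        raise TypeError("expected float64 data")
    if not arr.flags.c_contiguous:
        raise ValueError("expected a C-contiguous array")
    if not arr.flags.writeable:
        raise ValueError("array is read-only")
    # ctypes.memset stands in for a call into a real C library here.
    ctypes.memset(arr.ctypes.data, 0, arr.nbytes)
    return arr

a = np.ones(4)
zero_fill(a)
```

The checks cover the dangers listed above that argtypes cannot see: data type, contiguity, and writeability. A wrapper for a stride-aware C function would pass `arr.ctypes.strides` instead of rejecting discontiguous input.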
From: Albert S. <fu...@gm...> - 2006-07-03 07:48:52
|
Hello all,

Travis Oliphant wrote:
> Hey Albert, I read the post you linked to on the ctypes mailing list.
> I hope I didn't step on any toes with what I did in NumPy. I was just

Certainly not. This is great stuff!

> working on a ctypes interface and realized that a lot of the cruft to
> convert to what ctypes was expecting could and should be handled in a
> default way. The conversion of the shapes and strides information to
> the "right-kind" of ctypes integer plus the inclusion of ctypes in
> Python 2.5 was enough to convince me to put some kind of hook into the
> array object. I decided to make the ctypes attribute return an object
> so that the object could grow in the future additional attributes and/or
> methods to make it easier to interface with ctypes.

Ah! So arr.ctypes.* is a collection of things that one typically needs to pass to C functions to get them to do their work, i.e. a pointer to data and some description of the data buffer (shape, strides, etc.). Very nice.

> <snip>
> Basically, what you need is a type-map just like swig uses. But, now
> that ctypes is in Python, it will be slower to change. That's a bit
> unfortunate.

If we find the ctypes in Python 2.5 to be missing some features, maybe Thomas Heller could release "ctypes2" to tide us over until Python 2.6. But I think ctypes as it will appear in Python 2.5 is already excellent.

> But, ultimately, it works fine now. I don't know what is really gained
> by applying an argtypes to a function call anyway --- some kind of
> "type-checking". Is that supposed to be safer?

Yes, type-checking mostly. Some interesting things might happen when you're passing structs by value. But hopefully it just works.

> For NumPy extension modules, type checking is only a small part of the
> memory-violation danger. Incorrect array bounds and/or striding is far
> more common, not to mention unaligned memory areas and/or unwriteable
> ones (like a read-only memory-mapped file).

Agreed.

> Thus, you're going to have to write a small "error-checking" code in
> Python anyway that calls out to the C-library with the right
> arguments. So, basically you write an extension module that calls
> c-code just as you did before, but now the entire "extension" module can
> all be in Python because the call to an arbitrary C-library is made
> using ctypes.

Exactly. And once you have your DLL/shared library, there is no need to compile anything again.

Another useful benefit on Windows is that you can build your extension in debug mode without having to have a debug build of Python. This is very useful.

> <snip>
> Frankly, I'm quite impressed with the ease of accessing C-code available
> using c-types. It quite rivals f2py in enjoyment using it.

Indeed. Viva ctypes!

> <snip>

Regards,

Albert
From: Travis O. <oli...@ie...> - 2006-07-03 06:14:09
|
Albert Strasheim wrote:
> I did a few tests and this seems to work nicely:
>
> In [133]: printf = ctypes.cdll.msvcrt.printf
>
> In [134]: printf.argtypes = [ctypes.c_char_p, ctypes.c_void_p]
>
> In [135]: x = N.array([1,2,3])
>
> In [136]: printf('%p\n', x.ctypes.data)
> 01CC8AC0
> Out[136]: 9
>
> In [137]: hex(x.__array_interface__['data'][0])
> Out[137]: '0x1cc8ac0'
>
> It would be nice if we could get the _as_parameter_ magic to work as well. See
> this thread:
>
> http://aspn.activestate.com/ASPN/Mail/Message/ctypes-users/3122558
>
> If I understood Thomas correctly, in the presence of argtypes and an
> instance, say x, with _as_parameter_, the following is done to convert the
> instance to something that the function accepts as its nth argument:
>
> func.argtypes[n].from_param(x._as_parameter_)
>

Unfortunately, from the source code this is not true. It would be an improvement, but the source code shows that the from_param of each type does something special and only works with particular kinds of data-types --- basic Python types or ctypes types. I did not see evidence that the _as_parameter_ method was called within any of the from_param methods of _ctypes.c.

> However, if I try passing x directly to printf, I get this:
>
> In [147]: printf('%p\n', x)
> ...
> ArgumentError: argument 2: exceptions.TypeError: wrong type
>
> However, this much works:
>
> In [148]: ctypes.c_void_p.from_param(x._as_parameter_)
> Out[148]: <cparam 'P' (01cc8ac0)>
>
> So I don't understand why the conversion isn't happening automatically.
>

Despite any advertisement, the code is just not there in ctypes to do it when argtypes are present. Dealing with non-ctypes data is apparently not handled when argtypes are present. Get rid of the argtypes setting and it will work (because then the _as_parameter_ method is called).

> Another quirk I noticed is that non-void pointers' from_param can't seem to
> be used with ints. For example:
>

Yeah, from the code it looks like each from_param method has its own implementation that expects its own set of "acceptable" things. There does not seem to be any way for an object to inform it appropriately.

> I don't think this is too much of an issue though -- you could wrap all your
> functions to take c_void_ps. If you happen to pass an int32 NumPy array to a
> function expecting a double*, you might run into problems though.
>

Yeah, but you were going to run into trouble anyway. I don't really see a lot of "value-added" in the current type-checking c-types provides and would just ignore it at this point. Build a Python function that calls out to the c-function.

> Maybe there should be a way to get a pointer to the NumPy array data as a
> POINTER(c_double) if it is known that the array's dtype is float64. Ditto
> for c_int/int32 and the others.
>

I could see value in

arr.ctypes.data_as()
arr.ctypes.strides_as()
arr.ctypes.shape_as()

methods which allow returning the data as different kinds of c-types things instead of the defaults --- Perhaps we just make data, strides, and shapes methods with an optional argument.

-Travis
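Methods with these three names exist in current NumPy releases. The sketch below assumes such a release and uses `c_longlong` purely as an example of requesting a non-default integer type; the array contents are illustrative.

```python
import ctypes
import numpy as np

arr = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Typed pointer to the data instead of the default untyped address.
ptr = arr.ctypes.data_as(ctypes.POINTER(ctypes.c_double))

# Shape and strides as arrays of an explicitly chosen ctypes integer type.
shape = arr.ctypes.shape_as(ctypes.c_longlong)
strides = arr.ctypes.strides_as(ctypes.c_longlong)

# The pointer indexes the underlying buffer in memory order.
first_row = [ptr[i] for i in range(3)]
```

A pointer obtained this way satisfies a function whose argtypes lists `POINTER(c_double)`, which addresses the int32-versus-double* concern only if the wrapper also checks `arr.dtype` first.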
From: Albert S. <fu...@gm...> - 2006-07-03 08:11:19
|
Hello all

Travis Oliphant wrote:
> <snip>
> Unfortunately, from the source code this is not true. It would be an
> improvement, but the source code shows that the from_param of each type
> does something special and only works with particular kinds of
> data-types --- basic Python types or ctypes types. I did not see
> evidence that the _as_parameter_ method was called within any of the
> from_param methods of _ctypes.c

To summarise, I think we've come to the conclusion that one should avoid argtypes when mixing NumPy with ctypes? (at least for now)

The extensions to .ctypes you propose below should make it possible to use NumPy arrays with argtypes set. "Raw" C functions will probably be wrapped by a Python function 99.9% of the time for error checking, etc. This hides the need to call the .ctypes stuff from the user.

> <snip>
> > Maybe there should be a way to get a pointer to the NumPy array data as a
> > POINTER(c_double) if it is known that the array's dtype is float64. Ditto
> > for c_int/int32 and the others.
> >
>
> I could see value in
>
> arr.ctypes.data_as()
> arr.ctypes.strides_as()
> arr.ctypes.shape_as()
>
> methods which allow returning the data as different kinds of c-types
> things instead of the defaults --- Perhaps we just make data, strides,
> and shapes methods with an optional argument.

Agreed. If you really like argtypes, arr.ctypes.data_as() would be perfect for doing the necessary work to make sure ctypes accepts the array.

arr.ctypes.data_as(c_type) could be implemented as

ctypes.cast(x.ctypes.data, ctypes.POINTER(c_type))

c_void_p, c_char_p and c_wchar_p are special cases that aren't going to work here, so maybe it should be

ctypes.cast(x.ctypes.data, c_type)

in which case one would mostly call it as arr.ctypes.data_as(POINTER(c_type)).

Regards,

Albert
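The second form can be tried with ctypes alone. In this sketch `data_as` is a free-standing hypothetical helper (not the NumPy method) and a ctypes buffer supplies the address; it shows that passing the full target type handles both a typed pointer and the `c_void_p` special case.

```python
import ctypes

def data_as(address, c_type):
    """Hypothetical helper: cast a raw integer address to the requested pointer type."""
    return ctypes.cast(address, c_type)

# A ctypes buffer supplies an address to experiment with.
buf = (ctypes.c_double * 3)(1.5, 2.5, 3.5)
address = ctypes.addressof(buf)

# Typed pointer: the caller spells out POINTER(c_double).
typed = data_as(address, ctypes.POINTER(ctypes.c_double))

# c_void_p works with the same helper because no POINTER() wrapping is forced on it.
untyped = data_as(address, ctypes.c_void_p)
```

With the first form, asking for `c_void_p` would have produced `POINTER(c_void_p)`, a pointer to a pointer, which is not what a `void*` parameter expects.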
From: Albert S. <fu...@gm...> - 2006-07-03 12:59:11
|
Hello all

Travis Oliphant wrote:
> The ctypes-conversion object has attributes which return c_types aware
> objects so that the information can be passed directly to c-code (as an
> integer, the number of dimensions can already be passed using c-types).
>
> The information available and its corresponding c_type is
>
> data - c_void_p
> shape, strides - c_int * nd or c_long * nd or c_longlong * nd
> depending on platform

Stefan and I did some more experiments and it seems like .ctypes.strides isn't doing the right thing for subarrays.

For example:

In [52]: x = N.rand(3,4)
In [57]: [x.ctypes.strides[i] for i in range(x.ndim)]
Out[57]: [32, 8]

This looks fine. But for this subarray:

In [56]: [x[1:3,1:4].ctypes.strides[i] for i in range(x.ndim)]
Out[56]: [32, 8]

In this case, I think one wants strides[0] (the row stride) to return 40.

.ctypes.data already seems to do the right thing:

In [60]: x.ctypes.data
Out[60]: c_void_p(31685288)

In [61]: x[1:3,1:4].ctypes.data
Out[61]: c_void_p(31685328)

In [62]: 31685328-31685288
Out[62]: 40

What would be a good way of dealing with discontiguous arrays? It seems like one might want to disable their .ctypes attribute.

Regards,

Albert
From: Travis O. <oli...@ie...> - 2006-07-03 19:38:24
|
Albert Strasheim wrote:
> Stefan and I did some more experiments and it seems like .ctypes.strides
> isn't doing the right thing for subarrays.
>
> For example:
>
> In [52]: x = N.rand(3,4)
> In [57]: [x.ctypes.strides[i] for i in range(x.ndim)]
> Out[57]: [32, 8]
>
> This looks fine. But for this subarray:
>
> In [56]: [x[1:3,1:4].ctypes.strides[i] for i in range(x.ndim)]
> Out[56]: [32, 8]
>
> In this case, I think one wants strides[0] (the row stride) to return 40.
>

Why do you think that?

All sliced arrays keep the same strides information as their "parents". This is the essence of a "view". The striding is exactly the same as before (the data hasn't moved anywhere); only the starting point and the bounds have changed.

> .ctypes.data already seems to do the right thing:
>
> In [60]: x.ctypes.data
> Out[60]: c_void_p(31685288)
>
> In [61]: x[1:3,1:4].ctypes.data
> Out[61]: c_void_p(31685328)
>
> In [62]: 31685328-31685288
> Out[62]: 40
>
> What would be a good way of dealing with discontiguous arrays? It seems like
> one might want to disable their .ctypes attribute.
>
>

No, not at all. Discontiguous arrays are easily handled simply by using the strides information to step through the array in each dimension instead of "assuming" contiguousness.

Perhaps there is some confusion about what the strides actually represent.

It's quite easy to write C-code that takes stride information as well, which will then work with discontiguous arrays. The benefit of this approach is that you avoid copying data when you don't really have to. There should be very little performance penalty in most algorithms as well, since the strides calculation is not much more than adding 1 to the current pointer.

-Travis
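The point about views can be checked with plain address arithmetic. This sketch uses a ctypes buffer in place of a NumPy array (an assumption made so it runs with the standard library only) and mirrors the 3x4 example: the sub-block starts 40 bytes in, yet is walked with the parent's strides of 32 and 8 bytes. The `element` helper is hypothetical.

```python
import ctypes

ITEM = ctypes.sizeof(ctypes.c_double)        # 8 bytes per element
parent = (ctypes.c_double * 12)(*range(12))  # a 3x4 row-major block holding 0.0 .. 11.0
strides = (4 * ITEM, ITEM)                   # (32, 8): bytes per row step, per column step

def element(base, strides, i, j):
    """Read element (i, j) by stepping from base using byte strides."""
    address = base + i * strides[0] + j * strides[1]
    return ctypes.c_double.from_address(address).value

base = ctypes.addressof(parent)

# The view [1:3, 1:4] starts one row and one column in: 32 + 8 = 40 bytes.
view_base = base + 1 * strides[0] + 1 * strides[1]

# Same strides as the parent; only the start and the bounds (2 rows, 3 columns) differ.
view = [[element(view_base, strides, i, j) for j in range(3)] for i in range(2)]
```

The 40 in the session above is therefore the offset of the view's first element, not a row stride; stepping one row down still costs 32 bytes.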
From: Albert S. <fu...@gm...> - 2006-07-04 07:57:07
|
Hello all

> > In this case, I think one wants strides[0] (the row stride) to return 40.
> >
>
> Why do you think that?
>
> All sliced arrays keep the same strides information as their
> "parents". This is the essence of a "view". The striding is exactly
> the same as before (the data hasn't moved anywhere), only the starting
> point and the bounds have changed.

Sorry, I was suffering some temporary strides brain damage. :-)

Regards,

Albert
From: Thomas H. <th...@py...> - 2006-07-04 09:56:54
Attachments:
shape.py
from_param.patch
|
Travis Oliphant wrote:
> I've been playing a bit with ctypes and realized that with a little
> help, it could be made much easier to interface with NumPy arrays.
> Thus, I added a ctypes attribute to the NumPy array. If ctypes is
> installed, this attribute returns a "conversion" object otherwise an
> AttributeError is raised.
>
> The ctypes-conversion object has attributes which return c_types aware
> objects so that the information can be passed directly to c-code (as an
> integer, the number of dimensions can already be passed using c-types).
>
> The information available and its corresponding c_type is
>
> data - c_void_p
> shape, strides - c_int * nd or c_long * nd or c_longlong * nd
> depending on platform

I've also played a little, and I think one important limitation in ctypes is that items in the argtypes list have to be ctypes types.

If that limitation is removed (see the attached trivial patch), one can write a class that implements 'from_param' and accepts ctypes arrays as well as numpy arrays as arguments in function calls (maybe the _as_parameter_ stuff needs cleanup as well).

The attached shape.py script implements this class, and has two examples. The 'from_param' method checks the correct shape and itemtype of the arrays that are passed as parameters.

Thomas
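The attached shape.py is not reproduced in the archive, so the following is only a sketch of the idea and not Thomas's script. It relies on current ctypes, whose documentation allows any object with a from_param method in argtypes. `DoubleVector3` is a hypothetical checker that accepts ctypes arrays only (NumPy handling is left out), and the CPython API function `PyBytes_FromStringAndSize` stands in for a C library function so the example runs without a compiled library.

```python
import ctypes

class DoubleVector3:
    """Hypothetical argtypes entry: accepts only a ctypes array of 3 doubles."""

    @classmethod
    def from_param(cls, obj):
        # Check item type and shape before the call reaches C.
        if (isinstance(obj, ctypes.Array)
                and obj._type_ is ctypes.c_double
                and len(obj) == 3):
            return obj  # ctypes passes an array as a pointer to its first element
        raise TypeError("expected a ctypes array of 3 doubles")

# PyBytes_FromStringAndSize(const char *, Py_ssize_t) copies raw memory into bytes.
# Indexing pythonapi returns a fresh function object, so these settings stay local.
copy_bytes = ctypes.pythonapi["PyBytes_FromStringAndSize"]
copy_bytes.restype = ctypes.py_object
copy_bytes.argtypes = [DoubleVector3, ctypes.c_ssize_t]

good = (ctypes.c_double * 3)(1.0, 2.0, 3.0)
raw = copy_bytes(good, ctypes.sizeof(good))

# An array with the wrong item type is rejected before any C code runs.
try:
    copy_bytes((ctypes.c_int * 3)(1, 2, 3), 12)
    rejected = False
except ctypes.ArgumentError:
    rejected = True
```

A fuller version would also accept objects exposing `__array_interface__`, checking their shape and typestr in the same place.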
From: Thomas H. <th...@py...> - 2006-07-04 10:05:07
|
Thomas Heller wrote:
> I've also played a little, and I think one important limitation in ctypes
> is that items in the argtypes list have to be ctypes types.

This was misleading: I mean that this limitation should probably be removed, because it prevents a lot of things one could do.

Thomas
From: Albert S. <fu...@gm...> - 2006-07-04 10:11:49
|
Hey Thomas

Thomas Heller wrote:
> Thomas Heller wrote:
> > I've also played a little, and I think one important limitation in ctypes
> > is that items in the argtypes list have to be ctypes types.
>
> This was misleading: I mean that this limitation should probably be
> removed, because it prevents a lot of things one could do.

What's your thinking on getting these changes made to ctypes and on ctypes' future development in general?

Presumably you can't change it too much with the Python 2.5 release coming up, but it would be a shame if we had to wait until Python 2.6 to get the changes you suggested (and other goodies, like the array interface).

Regards,

Albert
From: Thomas H. <th...@py...> - 2006-07-04 11:50:11
|
Albert Strasheim wrote:
> Hey Thomas
>
> Thomas Heller wrote:
>> Thomas Heller wrote:
>> > I've also played a little, and I think one important limitation in ctypes
>> > is that items in the argtypes list have to be ctypes types.
>>
>> This was misleading: I mean that this limitation should probably be
>> removed, because it prevents a lot of things one could do.
>
> What's your thinking on getting these changes made to ctypes and on ctypes'
> future development in general?
>
> Presumably you can't change it too much with the Python 2.5 release coming
> up, but it would be a shame if we had to wait until Python 2.6 to get the
> changes you suggested (and other goodies, like the array interface).

I have asked on python-dev; let's wait for the answer. I hope that at least the limitation that I mentioned can be removed in Python 2.5.

The goal of my post was to show that (without this restriction) a lot can already be done in Python. Of course, it would be better if this could be implemented in C and integrated in ctypes.

For the numpy/ctypes integration I'm not absolutely sure what would be needed most:

Is there a need to convert between ctypes and numpy arrays? If numpy arrays can be passed to ctypes foreign functions, maybe there is no need at all for the conversion. We could probably even live with helper code like the one I posted, outside of ctypes...

Thomas
From: Albert S. <fu...@gm...> - 2006-07-04 12:58:12
|
Hello all

On Tue, 04 Jul 2006, Thomas Heller wrote:
> Albert Strasheim wrote:
> > Hey Thomas
> >
> > Thomas Heller wrote:
> >> Thomas Heller wrote:
> >> > I've also played a little, and I think one important limitation in ctypes
> >> > is that items in the argtypes list have to be ctypes types.
> >>
> >> This was misleading: I mean that this limitation should probably be
> >> removed, because it prevents a lot of things one could do.
> >
> > What's your thinking on getting these changes made to ctypes and on ctypes'
> > future development in general?
> >
> > Presumably you can't change it too much with the Python 2.5 release coming
> > up, but it would be a shame if we had to wait until Python 2.6 to get the
> > changes you suggested (and other goodies, like the array interface).
>
> I have asked on python-dev, let's wait for the answer.
> I hope that at least the limitation that I mentioned can be removed in Python 2.5.

Sounds great.

> The goal of my post was to show that (without this restriction) a lot can
> already be done in Python, of course it would be better if this could be
> implemented in C and integrated in ctypes.
>
> For the numpy/ctypes integration I'm not absolutely sure what would be needed most:
>
> Is there a need to convert between ctypes and numpy arrays? If numpy arrays can
> be passed to ctypes foreign functions maybe there is no need at all for the conversion.
> We could probably even live with helper code like that I posted outside of ctypes...

I think there are basically two ways for a C library to work with regards to memory allocation:

1. let the user allocate the array/struct/whatever and pass a pointer to the library to manipulate
2. let the library allocate the array/struct/whatever, manipulate it and return the pointer to the user

I think the first case is pretty much covered. Where in the past you would create the array or struct on the stack or allocate it on the heap with malloc, you now create a ctypes Structure or a NumPy array and pass that to the C function.

In the second case, one would want to wrap a NumPy array around the ctype so that you can manipulate the data returned by the library. I don't know if this second scenario is very common -- hopefully not. If not, then having ctypes implement the array interface isn't too critical, since you wouldn't typically need to make a NumPy array from existing data.

What do you think?

Regards,

Albert
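For the second case, current NumPy releases provide `numpy.ctypeslib.as_array`, which wraps existing memory without copying. The sketch below assumes such a release; a ctypes array (`library_block`) stands in for memory that a C library allocated and returned as a `double*`.

```python
import ctypes
import numpy as np

n = 4
# Stand-in for a block that a C library allocated and returned as double*.
library_block = (ctypes.c_double * n)(0.0, 1.0, 2.0, 3.0)
returned_ptr = ctypes.cast(library_block, ctypes.POINTER(ctypes.c_double))

# Wrap the memory without copying; a bare pointer carries no length,
# so the shape has to be supplied by the caller.
view = np.ctypeslib.as_array(returned_ptr, shape=(n,))

# Writes through the NumPy view land in the original block.
view *= 10.0
```

NumPy does not own memory wrapped this way, so the block must stay alive for as long as the view is in use, and freeing it remains the job of whoever allocated it.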