From: David C. <da...@ar...> - 2006-11-07 02:24:32
|
Hi,

I am trying to find a nice way to communicate between matlab and python. I am aware of pymat, which does that, but the code is deprecated, and I think basing the code on ctypes would lead to much more robust code.

http://claymore.engineer.gvsu.edu/%7Esteriana/Software/pymat.html

I have a really simple prototype which can send data to matlab and get it back, but I was wondering if it would be possible to use a scheme similar to ctypes instead of having to convert the data by hand. Let me present the way communication with matlab is done:

- Open a matlab session, which on unix launches a matlab process and sets up pipes between the calling process and the matlab process for communication.
- To send data from the calling process to matlab, you first have to create an mxArray, which is the basic matlab handle for a matlab array, and populate it. Using mxArray is very awkward: you cannot create an mxArray from existing data, you have to copy data into it, etc. (This is one of the reasons I started looking at python in the first place: interfacing matlab with foreign code is really not a nice experience, and the API is not rich enough to do many interesting things; I am actually amazed at how poor this part of matlab is, for a product which is now 20 years old.)

I was wondering if there was a way to extend a numpy array so that I could send numpy arrays directly to matlab C functions expecting an mxArray, with ctypes doing the hard work. For example, right now, to send data to matlab, I do:

    session = MatlabEngine()
    # data is a numpy array, 'dataname' a string with its name inside the matlab interpreter
    session.put(data, 'dataname')

The put function has to do a lot of work:

- first, getting metadata from data (dimensions, real or complex, etc.)
- then creating an mxArray with the same metadata
- then populating the mxArray by copying the data from the numpy array.

And of course, it has to take care that all mxArrays are destroyed, otherwise there is a memory leak...

That's why I was thinking about something a bit smarter: creating an mxarray class which implements the numpy interface. I could create an mxarray from a numpy array easily, send an mxarray directly to a C function with ctypes, etc. Is this doable? Would this take care of correct destruction of mxArrays? I really don't know much about the internals of numpy arrays, so I don't really know how to start.

cheers,

David
|
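On the question of correct destruction, one possible approach is to tie the foreign handle to a Python wrapper object with `weakref.finalize`. The following is only a minimal sketch: `MxHandle`, `handle` and `destroy` are hypothetical names, and `destroy` is a plain Python stand-in for what would be a ctypes wrapper around mxDestroyArray in a real binding.

```python
import weakref


class MxHandle:
    """Ties the lifetime of a foreign handle to a Python object.

    `handle` and `destroy` are hypothetical stand-ins: with a real
    binding, `destroy` would wrap the C-side destructor.
    """

    def __init__(self, handle, destroy):
        self.handle = handle
        # weakref.finalize calls destroy(handle) at most once: either
        # when close() is called, or when this object is collected.
        self._finalizer = weakref.finalize(self, destroy, handle)

    def close(self):
        """Release the handle now; calling close() again is a no-op."""
        self._finalizer()

    @property
    def alive(self):
        """True until the handle has been released."""
        return self._finalizer.alive


# Stand-in for the C-side destructor: it records what was released.
destroyed = []

h = MxHandle(handle=1, destroy=destroyed.append)
h.close()
h.close()          # second call does nothing
print(destroyed)   # [1]
```

The destructor callback must not hold a reference to the wrapper itself, otherwise the wrapper is never collected; passing the raw handle as an argument, as above, avoids that.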
From: Josh M. <jos...@gm...> - 2006-11-07 04:59:41
|
Hi David,

Did you have a look at mlabwrap? It's quite hard to find on the net, which is a shame, since it is a much more up to date version, enhancing pymat with the things that you are trying to do. It allows passing arrays and getting arrays back.

http://mlabwrap.sourceforge.net/

However, I do agree with you that it would be good to have a bridge based on ctypes. I can't really help since I long ago moved all of my code from Matlab to Numpy and Matplotlib.

Cheers,

Josh M
|
From: David C. <da...@ar...> - 2006-11-08 03:08:11
|
Josh Marshall wrote:
> Hi David,
>
> Did you have a look at mlabwrap? It's quite hard to find on the net,
> which is a shame, since it is a much more up to date version,
> enhancing pymat with the things that you are trying to do. It allows
> passing arrays and getting arrays back.
>
> http://mlabwrap.sourceforge.net/

I didn't know about it, thanks. Unfortunately, it is not really what I am trying to do: mlabwrap is just a python interface a bit higher level than pymat, with many fancy tricks, but it still makes copies. What I would like is to avoid the copying completely, by using proxy classes around numpy data so that I can pass numpy arrays "automatically" to the matlab C API, and a proxy class around matlab data so that it looks like a numpy array.

I don't care that much about the actual API from the python point of view, because I intend to use this mainly to compare matlab and numpy implementations, not as a way to use matlab inside python regularly. And once the copy problem is solved, adding syntactic sugar in python is easy anyway, I think (it should be easy to do something similar to mlabwrap at that point).

cheers,

David
|
From: Andrew S. <str...@as...> - 2006-11-07 04:56:55
|
David Cournapeau wrote:
> - To send data from the calling process to matlab, you first have to
> create a mxArray, which is the basic matlab handler of a matlab array,
> and populating it. Using mxArray is very awkward: you cannot create
> mxArray from existing data, you have to copy data to them, etc...

My understanding, never having done it, but from reading the docs, is that you can create a "hybrid array" where you manage the memory. Thus, you can create an mxArray from existing data. However, the docs basically say that this is too hard for most mortals (and they may well be right -- too painful for me, anyway)!
|
From: David C. <da...@ar...> - 2006-11-07 05:00:46
|
Andrew Straw wrote:
> My understanding, never having done it, but from reading the docs, is
> that you can create a "hybrid array" where you manage the memory. Thus,
> you can create an mxArray from existing data. However, the docs
> basically say that this is too hard for most mortals (and they may well
> be right -- too painful for me, anyway)!

Would you mind telling me where you found that information? Right now, I am wasting a lot of cycles on memory copies in both directions, and it is sometimes slow enough to be annoying.

cheers,

David
|
From: Andrew S. <str...@as...> - 2006-11-07 05:43:15
|
David Cournapeau wrote:
> Would you mind telling me where you found that information ? Because
> right now, I am wasting a lot of cycles because of memory copy in both
> directions, and it is sometimes slow enough so that it is annoying,

I found it reading through the in-program help (the C-API section, whatever it's called) on a Matlab installation at my university. I guess this was Matlab 2006A. A quick Google search turns this up:

http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_external/index.html?/access/helpdesk/help/techdoc/matlab_external/f25255.html

They give the following example, which seems to create a Matlab array "pArray" with data owned by the C variable "data":

    mxArray *pArray = mxCreateDoubleMatrix(0, 0, mxREAL);
    double data[10];

    mxSetPr(pArray, data);
    mxSetM(pArray, 1);
    mxSetN(pArray, 10);
|
From: David C. <da...@ar...> - 2006-11-07 05:53:58
|
Andrew Straw wrote:
> I found it reading through the in-program help (the C-API section,
> whatever it's called) on a Matlab installation at my university. I guess
> this was Matlab 2006A. A quick Google search turns this up:
>
> http://www.mathworks.com/access/helpdesk/help/techdoc/matlab_external/index.html?/access/helpdesk/help/techdoc/matlab_external/f25255.html
>
> They give the following example, which seems to create a Matlab array
> "pArray" with data owned by the C variable "data":
>
>     mxArray *pArray = mxCreateDoubleMatrix(0, 0, mxREAL);
>     double data[10];
>
>     mxSetPr(pArray, data);
>     mxSetM(pArray, 1);
>     mxSetN(pArray, 10);

Thank you very much. I think this added documentation is pretty recent; I have never seen it before, and I did a lot of mex programming at some point... This whole mxarray nonsense reminds me why I gave up on matlab :)

cheers,

David
|
From: Matthew B. <mat...@gm...> - 2006-11-07 18:57:26
|
Hi,

> Thank you very much, I think this added documentation is pretty recent;
> I have never seen it before, and I did a lot of mex programming at some
> point... This whole mxarray nonsense reminds me why I gave up on matlab :),

I would be very happy to help with this. It would be great if we could get a standard, well-maintained library of some sort towards scipy - we (http://neuroimaging.scipy.org/) have a great deal of matlab integration to do.

Best,

Matthew
|
From: David C. <da...@ar...> - 2006-11-09 02:32:26
|
Matthew Brett wrote:
> I would be very happy to help with this. It would be great if we
> could get a standard well-maintained library of some sort towards
> scipy - we (http://neuroimaging.scipy.org/) have a great deal of
> matlab integration to do.

I am a bit busy and behind on my PhD schedule; I will try to release something remotely usable by the end of the week, so that other people can start hacking on it too.

cheers,

David
|
From: David C. <da...@ar...> - 2006-11-08 14:44:27
|
Andrew Straw wrote:
> My understanding, never having done it, but from reading the docs, is
> that you can create a "hybrid array" where you manage the memory. Thus,
> you can create an mxArray from existing data. However, the docs
> basically say that this is too hard for most mortals (and they may well
> be right -- too painful for me, anyway)!

OK, I have looked at it. It is not hard, it is just totally brain damaged: there is no way to destroy an mxArray without destroying the data it is holding, even after a call to mxSetPr. So the data referenced by the pointer given to mxSetPr is always freed by mxDestroyArray; I don't see any way to use this to avoid the copy... They could at least have provided one function which frees the data buffer and another which destroys the rest; as it is, it is totally useless, unless you don't mind memory leaks.

David
|
From: Andrew S. <str...@as...> - 2006-11-08 16:51:52
|
David Cournapeau wrote:
> Ok, I have looked at it. It is not hard, it is just totally brain
> damaged: there is no way to destroy a mxArray without destroying the
> data it is holding, even after a call with mxSetPr. So the data
> referenced by the pointer given to mxSetPr is always destroyed by
> mxDestroyArray; I don't see any way to use this to avoid copy... They
> could at least have given a function which frees the data buffer and one
> which destroys the other stuff; as it is, it is totally useless, unless
> you don't mind memory leaks.

It does sound brain damaged, I agree. But here's a suggestion: can you keep a pool of unused mxArrays rather than calling mxDestroyArray? I guess without the payload, they're just a few bytes and shouldn't take up that much space.
|
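The pool idea suggested above could be sketched roughly as follows. This is an illustration only: `HandlePool` and `create` are hypothetical names, the handles are plain integers, and whether a real mxArray can safely be recycled this way is not established in this thread.

```python
import itertools


class HandlePool:
    """Recycles handles instead of destroying them.

    `create` is a hypothetical factory; with a real binding it might
    wrap something like mxCreateDoubleMatrix(0, 0, mxREAL).
    """

    def __init__(self, create):
        self._create = create
        self._free = []

    def acquire(self):
        # Reuse an idle handle when one exists, otherwise make a new one.
        if self._free:
            return self._free.pop()
        return self._create()

    def release(self, handle):
        # Keep the handle around for later reuse instead of destroying it.
        self._free.append(handle)


# Integer handles stand in for pointers returned by the C API.
counter = itertools.count(1)
pool = HandlePool(create=lambda: next(counter))

a = pool.acquire()   # pool empty, new handle
pool.release(a)
b = pool.acquire()   # recycled handle
c = pool.acquire()   # pool empty again, new handle
print(a, b, c)       # 1 1 2
```

The pool only ever grows to the largest number of handles in use at the same time, which matches the guess that empty handles cost just a few bytes each.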
From: Pauli V. <pau...@ik...> - 2006-11-07 19:00:49
|
Hi all,

On Tue, 2006-11-07 at 11:23 +0900, David Cournapeau wrote:
> I am trying to find a nice way to communicate between matlab and
> python. I am aware of pymat, which does that, but the code is
> deprecated, and I think basing the code on ctypes would lead to much
> more robust code.
>
> http://claymore.engineer.gvsu.edu/%7Esteriana/Software/pymat.html
>
> I have a really simple prototype which can send and get back data from
> matlab, but I was wondering if it would be possible to use a scheme
> similar to ctypes instead of having to convert it by hand.

A while ago I wrote a mex extension to embed the Python interpreter inside Matlab:

http://www.iki.fi/pav/pythoncall

I guess it's something like an inverse of pymat :)

But I guess this is not really what you are looking for, since at present it just does a memory copy when passing arrays between Matlab and Python. Though, shared arrays might just be possible to implement if memory management is done carefully.

BR,

Pauli Virtanen
|
From: David C. <da...@ar...> - 2006-11-08 02:44:29
|
Pauli Virtanen wrote:
> A while ago I wrote a mex extension to embed the Python interpreter
> inside Matlab:
>
> http://www.iki.fi/pav/pythoncall
>
> I guess it's something like an inverse of pymat :)

Yes, but in the end, I think they enable similar things. Thanks for the link!

> But I guess this is not really what you are looking for, since at
> present it just does a memory copy when passing arrays between Matlab
> and Python. Though, shared arrays might be just possible to implement if
> memory management is done carefully.

In my case, it is much worse:

1. First, you have numpy data that you have to copy to an mxArray, the structure representing arrays in the matlab C API.
2. Then, when you "send" data to the matlab engine, this is done automatically through a pipe by the matlab engine API (maybe a pipe does not imply copying; I don't actually know much about pipes from a programming point of view).
3. The arrays you get back from matlab are in matlab mxArray structures: right now, I copy their data to new numpy arrays.

At first, I just developed a prototype without thinking too much, and the result was much slower than I expected: sending a numpy array of 2e5x10 doubles takes around 100 ms on my quite powerful machine (around 14 cycles per item for the best case). I suspect it is because I copy memory in a non-contiguous manner (matlab arrays use F storage internally for real arrays, but complex arrays are really two separate arrays, which I think differs from the Fortran convention, making the copy really expensive for complex arrays). To see if I was doing something wrong, I compared with numpy.require(ar, requirements = 'F_CONTIGUOUS'), which is much slower still.

There is not much I can do about 2, it looks like there is a way to avoid copying for 1, and my question was more specific to 3 (but maybe reusable in 1, if I am smart enough). Basically:

* How to create an object which has the same interface as numpy arrays, but owns the data from a foreign structure, whose data is available when building the object? (The idea was to create a class which implements the array interface from python, a kind of proxy class, which owns the data from the mxArray; "owns" here is from a memory management point of view.)

David
|
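One way to approach the proxy class asked about above is the `__array_interface__` protocol, which lets an object describe a memory buffer it keeps alive. The sketch below is an assumption-laden illustration, not a working matlab binding: `ForeignArray` is a hypothetical name, a ctypes array stands in for the data of an mxArray, and only real float64 data in column-major (F) order is handled.

```python
import ctypes
import sys


class ForeignArray:
    """Describes a foreign float64 buffer via the array interface.

    `owner` is whatever object keeps the memory alive; here a ctypes
    array stands in for the data pointer of an mxArray.
    """

    def __init__(self, owner, shape):
        self._owner = owner          # reference keeps the buffer alive
        self._shape = tuple(shape)

    @property
    def __array_interface__(self):
        # Column-major strides: the first index moves fastest.
        strides = []
        step = ctypes.sizeof(ctypes.c_double)
        for dim in self._shape:
            strides.append(step)
            step *= dim
        order = "<" if sys.byteorder == "little" else ">"
        return {
            "version": 3,
            "shape": self._shape,
            "typestr": order + "f8",  # native-endian float64
            # (address, read-only flag)
            "data": (ctypes.addressof(self._owner), False),
            "strides": tuple(strides),
        }


buf = (ctypes.c_double * 6)(1, 2, 3, 4, 5, 6)
proxy = ForeignArray(buf, shape=(2, 3))
iface = proxy.__array_interface__
print(iface["shape"], iface["strides"])   # (2, 3) (8, 16)
```

With numpy installed, `numpy.asarray(proxy)` should return a 2x3 array that views the same memory rather than copying it, and that array keeps `proxy` (and therefore the buffer owner) alive for as long as it exists. Complex data would need separate handling, since the thread notes that matlab stores the real and imaginary parts as two arrays.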