From: Travis O. <oli...@ie...> - 2002-03-06 04:35:18
|
Recently there has been discussion on the list about the awkwardness of matrix syntax when using Numeric Python.

Matrix expressions can be awkward to express in Numeric, which is a negative mark on an otherwise excellent computing environment.

Currently, part of the problem can be solved by working with Matrix objects explicitly:

a = Matrix.Matrix("[1 2 3; 4 5 6]")   # Notice the strings.

However, most operations return arrays, which have to be recast to matrices using, at best, a one-character name with parentheses:

M = Matrix.Matrix

M(sin(a)) * M(cos(a)).T

The suggestion was made to add ".M" as an attribute of arrays which returns a matrix. Thus, the code above can be written:

sin(a).M * cos(a).M.T

While some aesthetic simplicity is obtained, the big advantage is in consistency. Somebody else may decide that P = Matrix.Matrix is a better choice. But if we establish that .M always returns a matrix for arrays < 2d, then we gain consistency.

I've made this change and am ready to commit it to the Numeric tree unless there are strong objections. I know some people do not like the proliferation of attributes, but in this case the notational convenience it affords to otherwise overly burdened syntax, and the consistency it allows Numeric in dealing with matrix equations, may be worth it.

What do you think?

-Travis Oliphant |
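The proposed ".M" attribute can be sketched in a few lines of pure Python. Numeric's real implementation lives in C, so the classes and nested-list representation below are toy stand-ins, not the library's actual API; only the behavior of ".M" and ".T" follows the proposal.

```python
# Toy sketch of the proposed .M attribute: Array multiplies elementwise,
# Matrix (returned by .M) multiplies as in linear algebra.

class Array:
    def __init__(self, data):
        self.data = data                  # list of rows

    def __mul__(self, other):             # elementwise, as in Numeric
        return Array([[x * y for x, y in zip(r, s)]
                      for r, s in zip(self.data, other.data)])

    @property
    def M(self):                          # the proposed attribute
        return Matrix(self.data)

class Matrix(Array):
    def __mul__(self, other):             # matrix multiplication
        cols = list(zip(*other.data))
        return Matrix([[sum(x * y for x, y in zip(r, c)) for c in cols]
                       for r in self.data])

    @property
    def T(self):                          # transpose
        return Matrix([list(c) for c in zip(*self.data)])

a = Array([[1, 2], [3, 4]])
print((a * a).data)        # elementwise: [[1, 4], [9, 16]]
print((a.M * a.M).data)    # matrix product: [[7, 10], [15, 22]]
```

The point of the sketch is the notational one made above: `sin(a).M * cos(a).M.T` works because every `.M` access returns a matrix-behaving object without an explicit constructor call.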
From: Pearu P. <pe...@ce...> - 2002-03-06 07:45:10
|
Hi!

On Tue, 5 Mar 2002, Travis Oliphant wrote:

> The suggestion was made to add ".M" as an attribute of arrays which returns a
> matrix. Thus, the code above can be written:
>
> sin(a).M * cos(a).M.T
>
> While some aesthetic simplicity is obtained, the big advantage is in
> consistency.
> [...]
> What do you think?

Would it be possible to use one's own Matrix class instead of what is in Matrix.py? I gather there would need to be some setter function in Numeric for that:

Numeric.set_matrix_factory(MyMatrixClass)

with the requirement that MyMatrixClass be a subclass of Matrix.Matrix.

I think this would be a very important feature, as users could then define their own matrix operations, for example using their own BLAS routines to speed up operations with matrices (yes, I am thinking of a SciPy-specific Matrix class).

Thanks,
Pearu |
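No `set_matrix_factory` function exists in Numeric; the name is Pearu's proposal. A minimal sketch of the mechanism he describes, with all names hypothetical, might look like this:

```python
# Sketch of the proposed factory hook: .M would construct whatever class
# was registered, as long as it subclasses the base Matrix.

_matrix_factory = None

class Matrix:
    def __init__(self, data):
        self.data = data

def set_matrix_factory(cls):
    """Register the class that .M should construct (proposed API)."""
    global _matrix_factory
    if not issubclass(cls, Matrix):      # the requirement stated above
        raise TypeError("factory must be a subclass of Matrix")
    _matrix_factory = cls

def as_matrix(data):
    """What an array's .M attribute would call internally."""
    return (_matrix_factory or Matrix)(data)

class MyMatrixClass(Matrix):             # e.g. a BLAS-backed subclass
    pass

set_matrix_factory(MyMatrixClass)
print(type(as_matrix([[1, 2]])).__name__)   # MyMatrixClass
```

The subclass requirement keeps third-party code that expects a Matrix.Matrix working while letting SciPy substitute a faster implementation.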
From: Konrad H. <hi...@cn...> - 2002-03-06 08:49:53
|
Travis Oliphant <oli...@ie...> writes:

> I've made this change and am ready to commit the change to the Numeric tree,
> unless there are strong objections. I know some people do not like the
> proliferation of attributes, but in this case the notational convenience it

At the risk of sounding unconstructively negative, I think this is a misuse of attributes. For someone used to reading standard Python code, where attributes are, well, attributes, code using this notation is just weird. Personally, I find consistent notation more important than short notation.

The Pythonesque solution to this problem, in my opinion, is separate matrix and array objects (which can and should of course share implementation code) plus explicit constructors to convert between the two.

I am a bit worried that kludges such as fake attributes set bad precedents for the future. One of the main reasons why I like Python is its clean syntax and its simple object model. This kind of notation messes up both of them.

Konrad.
--
Konrad Hinsen                            | E-Mail: hi...@cn...
Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24
Rue Charles Sadron                       | Fax: +33-2.38.63.15.17
45071 Orleans Cedex 2                    | Deutsch/Esperanto/English/
France                                   | Nederlands/Francais |
From: Perry G. <pe...@st...> - 2002-03-06 18:48:22
|
Travis Oliphant writes:

> .M always returns a matrix for arrays < 2d, then we gain consistency.
>
> I've made this change and am ready to commit the change to the Numeric tree,
> unless there are strong objections.
> [...]
> What do you think?
>
> -Travis Oliphant

I'd have to agree with Konrad and Paul on this one. While it makes simple matrix expressions clearer, it opens a whole can of worms that were discussed (and never resolved) a couple of years ago. Suppose I do this:

x = a.M * libfunc(b.M, c.M)

where libfunc is a 3rd-party module function written in Python under the assumption that operators are elementwise. It may silently do a matrix multiplication (depending on the shapes) instead of the intended elementwise multiplication. Yet the usage above looks just as legitimate as

x = a.M * b.M

In other words, it raises the issue of having incompatible modules, some written with Numeric objects in mind, others with Matrix objects in mind. Conceivably there will be modules useful for both kinds of objects. Do we need to support two kinds? How do we deal with this?

This is still a problem even if we don't allow the .M attribute but still have widespread use of an array object with different behavior for operations. Unlike masked arrays, whose basic behavior is unchanged for "good" data, here the behavior for identical data is completely different.

I wish I had a good answer for this. I don't remember all of the past suggestions, but it looks like one of the following solutions is needed:

1) Campaign for new operators in Python (there have been various proposals to this effect). This is probably best from the Numeric point of view (maybe not from Python's in general, though).

2) Allow different array classes with different behavior, but come up with conventions and utilities for library developers to produce versions of arrays compatible with the convention assumed by the module (and convert back to the input type for output values). This doesn't prevent all opportunities for confusion and errors, however. It also puts a stronger burden on library developers.

3) Do nothing and deal with the resulting mess. Perhaps the two camps have little need for each other's tools and it won't be much of a problem. Do option 2 retroactively if it is a problem.

Other suggestions?

Perry |
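Perry's libfunc pitfall can be made concrete with toy classes (stand-ins for Numeric arrays and Matrix objects, not the real library): the same source line changes meaning depending on which type the caller passes in, with no error raised.

```python
# The silent-semantics-change hazard: a library routine written for
# elementwise "*" does matrix multiplication when handed matrix objects.

class Array:
    def __init__(self, d):
        self.d = d

    def __mul__(self, o):                 # elementwise "*"
        return type(self)([[x * y for x, y in zip(r, s)]
                           for r, s in zip(self.d, o.d)])

class Matrix(Array):
    def __mul__(self, o):                 # "*" is matrix multiply here
        cols = list(zip(*o.d))
        return Matrix([[sum(x * y for x, y in zip(r, c)) for c in cols]
                       for r in self.d])

def libfunc(b, c):
    return b * c      # the author assumed elementwise semantics

b = [[1, 2], [3, 4]]
print(libfunc(Array(b), Array(b)).d)      # intended: [[1, 4], [9, 16]]
print(libfunc(Matrix(b), Matrix(b)).d)    # silent change: [[7, 10], [15, 22]]
```

Both calls succeed, and for many shapes both results are well-formed, which is exactly why the error is hard to catch.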
From: Todd A. P. Ph.D. <tp...@ac...> - 2002-03-06 19:20:43
|
I often (perhaps inappropriately) fall into the "silent user" category. However, many of those in this conversation have put significant effort into Python development, and the least I can do is offer a comment from the standpoint of someone who uses Python and Numeric extensively. Perhaps I am stepping into the middle of a conversation here; I hope I have read all the relevant material.

People may "like" Matlab syntax because it requires less typing or because it pleases them aesthetically. I personally feel that explicit function-based operators (like transpose()) are very clear and unambiguous. While I understand the desire to have the code and the "math" look similar, I think that, in general, this leads to the same kind of difficulty one has with notation in mathematics: notation that works well in some fields is extremely cumbersome in others. I don't expect code to look like an equation. I find orderly, predictable behavior that doesn't send me to the source code too often to figure out what is happening very helpful.

Treating 1d or 2d arrays as matrices is admittedly *very* useful in some applications but cumbersome in others. This problem is reminiscent of the "clash" between the PIL and Numeric modules, or between the C-language row-major matrix storage format and the (in my opinion) better thought out FORTRAN column-major matrix storage format. These differences place limitations on the potential for synergistic profits in the project.

It is my personal experience/opinion that "convenience" methods are best added in a specific application that is not intended to be released generally.

-Todd

* Perry Greenfield (pe...@st...) wrote:
> I'd have to agree with Konrad and Paul on this one. While it makes simple
> matrix expressions clearer, it opens a whole can of worms that were
> discussed (and never resolved) a couple years ago.
> [...]
> Other suggestions?
>
> Perry |
From: Konrad H. <hi...@cn...> - 2002-03-06 19:24:48
|
"Perry Greenfield" <pe...@st...> writes:

> discussed (and never resolved) a couple years ago. Suppose I do this:
>
> x = a.M * libfunc(b.M, c.M)
>
> where libfunc is a 3rd party module written in Python that was written
> assuming that operators were elementwise operators. It may silently

Then you are calling a routine with the wrong arguments; that can happen in Python all the time.

From my point of view, arrays and matrices are two entirely different things. A function written for matrix objects cannot be expected to work with array objects, and vice versa. Matrix operations should return matrix objects, and array operations should return array objects. What arrays and matrices have in common is not semantics, but implementation. That is something that implementors should profit from, but users shouldn't even need to know about.

The discussion about matrices has focused on matrix multiplication as the main difference between the two objects. I suppose this was motivated by comparisons to Matlab and similar environments, which do not have the notion of data types and thus cannot properly distinguish between matrices and arrays. I don't see why we should follow this limited approach. A matrix object should not only do matrix multiplication properly, but also provide methods such as diagonalization, application of functions as matrix functions, etc. That would be much more than syntactic sugar; it would be a real implementation of the mathematical concept "matrix". Seen from this point of view, it is not at all clear why an array should have an attribute that is an "equivalent" matrix, as no such equivalence exists in general (only for 2D arrays).

> Conceivably there will be modules useful for both kinds of objects.

I don't think so. The only analogous operations between arrays and matrices are addition, subtraction, negation, and multiplication by a scalar, and those would use the same syntax anyway.

Konrad. |
From: Paul F D. <pa...@pf...> - 2002-03-06 19:48:56
|
I believe the correct solution is a major upgrade to Matrix.py along the lines of what is done in MA; that is, to craft an object that uses Numeric for its implementation but which defines all its own operators in a manner that is semantically sensible for the type of object it is.

Such an upgrade could subsequently be improved by using different underlying software for various operations, or by even more sophisticated changes such as using a transposed attribute to lazily evaluate transposes in a cleaner way than Numeric does it. Also, an upgrade to Numarray would then be virtually painless.

If you have never looked at MA, please examine the source file Packages/MA/Lib/MA.py before commenting. This file is fairly complex, and the required changes to Matrix.py would be considerably simpler; but you can verify that it is fairly straightforward to do. On my project we have done something similar to create a "climate data variable" object.

Such a design includes an "exit" function to allow the instance to cheaply view itself as the underlying Numeric array. (In MA, this is "filled", which makes a Numeric array by replacing missing values, but which, if there are no missing values, returns the underlying Numeric array.)

I'm willing to do this for the community, but it would have a side effect: if anyone has been doing "from Matrix import *", they would suddenly get a lot more names imported that could conflict with any imported from Numeric. |
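The MA-style design Paul describes, a wrapper class that owns its operator semantics but exposes the underlying data cheaply, can be sketched as follows. Plain lists stand in for Numeric arrays, and the method name `raw` is a hypothetical stand-in for the role MA's `filled` plays as the "exit" function.

```python
# Sketch of an MA-style Matrix wrapper: operators are redefined to be
# semantically sensible for matrices, and raw() is the cheap exit back
# to the underlying "array".

class Matrix:
    def __init__(self, data):
        self._data = data                # the underlying "Numeric array"

    def raw(self):
        """Cheap exit: view the instance as the underlying array."""
        return self._data

    def __add__(self, other):            # same semantics as for arrays
        return Matrix([[x + y for x, y in zip(r, s)]
                       for r, s in zip(self._data, other._data)])

    def __mul__(self, other):            # but "*" means matrix product
        cols = list(zip(*other._data))
        return Matrix([[sum(x * y for x, y in zip(r, c)) for c in cols]
                       for r in self._data])

m = Matrix([[1, 0], [0, 2]])
print((m * m).raw())    # [[1, 0], [0, 4]]
print((m + m).raw())    # [[2, 0], [0, 4]]
```

Because the wrapper delegates storage to the underlying array, swapping in different underlying software later (as Paul suggests) only touches the method bodies, not the interface.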
From: Pearu P. <pe...@ce...> - 2002-03-06 20:29:12
|
On Wed, 6 Mar 2002, Paul F Dubois wrote:

> If you have never looked at MA, please examine source file
> Packages/MA/Lib/MA.py before commenting. This file is fairly complex and

You mean those who have never used MA should take a day off to read 2000 lines of code in order to understand the implications of using MA and give a comment? ;-)

Indeed, I have never used MA, but at first look it does not seem too promising regarding performance: for example, there seems to be a lot of Python code involved in applying a simple multiplication of arrays.

Could someone more familiar with MA comment on the performance issues, especially keeping number crunchers in mind?

Pearu |
From: Konrad H. <hi...@cn...> - 2002-03-07 11:22:03
|
> I believe the correct solution is a major upgrade to Matrix.py along the
> lines of what is done in MA; that is, to craft an object that uses
> Numeric for its implementation but which defines all its own operators
> in a manner that is semantically sensible for the type of object it is.

That is exactly my idea as well. However, from a quick glance at MA, it seems that this solution could suffer from performance problems when done in Python. What are the real-life experiences with MA in that respect?

I suppose the new type inheritance mechanisms in Python 2.2 could help to make this more efficient, but I haven't used them for anything yet.

Konrad. |
From: Pearu P. <pe...@ce...> - 2002-03-06 19:59:23
|
On Wed, 6 Mar 2002, Perry Greenfield wrote:

> Other suggestions?

Here is one suggestion, based on the observation that all we need is an easy way to tell that the following operation should be applied as a matrix operation.

The suggestion is to provide an attribute or a member function that returns an array (note: not a Matrix instance) that has a bit, call it asmatrix, set true, but _only_ temporarily. The bit is cleared on every operation. Before applying an operation, the corresponding method (currently there seem to be only four relevant methods: __mul__, __pow__, and their r-versions) checks whether either of the operands has the asmatrix bit true; if so, it performs the corresponding matrix operation, otherwise the default elementwise operation. Before returning, it clears the asmatrix bit.

For the sake of an example, let .m be a Numeric array attribute that, when accessed, sets asmatrix=1 and returns the array. Examples:

a * b            - elementwise multiplication
a.m * b, a * b.m - matrix multiplications; the resulting array, as well as a and b, have asmatrix=0
a.m ** -1        - matrix inverse
sin(a)           - elementwise sin
sin(a.m)         - matrix sin

To summarize the main ideas:

* An array has an asmatrix bit that most of the time is false.
* There is a way to set the asmatrix bit true, either by .m or .M attributes or by .m(), .M(), ... methods that return the same array.
* __mul__, __pow__, etc. check whether either operand has asmatrix true; if so, they perform the corresponding matrix operation, otherwise the corresponding elementwise operation.
* All operations clear the asmatrix bit.

So, what do you think?

Pearu |
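The asmatrix-bit proposal above can be rendered as a small toy class. The names (`asmatrix`, `.m`) are Pearu's; everything else, including the nested-list representation, is illustrative only.

```python
# Toy rendering of the "asmatrix bit": set by .m, consulted by __mul__,
# and cleared by every operation.

class Array:
    def __init__(self, data):
        self.data = data
        self.asmatrix = False

    @property
    def m(self):
        self.asmatrix = True             # set only until the next operation
        return self

    def __mul__(self, other):
        use_matrix = self.asmatrix or other.asmatrix
        self.asmatrix = other.asmatrix = False   # bit is always cleared
        if use_matrix:                   # matrix multiplication
            cols = list(zip(*other.data))
            return Array([[sum(x * y for x, y in zip(r, c)) for c in cols]
                          for r in self.data])
        return Array([[x * y for x, y in zip(r, s)]   # elementwise
                      for r, s in zip(self.data, other.data)])

a = Array([[1, 2], [3, 4]])
print((a * a).data)      # elementwise: [[1, 4], [9, 16]]
print((a.m * a).data)    # matrix product: [[7, 10], [15, 22]]
print((a * a).data)      # bit was cleared: elementwise again
```

Note that because the bit lives on the array itself and is mutated by `.m`, the sketch shares a hazard with the proposal: evaluating `a.m` without immediately operating on it leaves the bit set until the next operation.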
From: Travis O. <oli...@ee...> - 2002-03-06 20:55:38
|
> Here is one suggestion that is based on the observation that all we need
> is an easy way to tell that the following operation should be applied
> as a matrix operation. So, the suggestion is to provide an attribute or a
> member function that returns an array (note, not a Matrix instance) that
> has a bit, called it asmatrix, set true but _only_ temporally.
> [...]
> * all operations clean asmatrix bit.

Frankly, I like this kind of proposal. I disagree with Konrad about the separation between arrays and matrices. From my discussions with other people, it sounds like this is actually a point of disagreement for many in the broader community. To me, matrices are just arrays of rank <= 2 which should be interpreted with their specific algebra.

Again, I wouldn't mind it, but I suspect the more aesthetically critical on the list will dislike it, because it blurs the (currently clumsy) distinction between arrays and Matrices that I'm beginning to see people actually like.

-Travis |
From: Paul F D. <pa...@pf...> - 2002-03-06 21:51:56
|
Travis wrote:

> To me, matrices are just arrays of rank <=2 which should be interpreted
> with their specific algebra.

If a class is roughly data plus behaviors, a matrix is not simply an array of rank <= 2. You can express the concept of a matrix most cleanly as a separate class.

Adding an argumentless member function .M to "convert" from one class to the other, without making the other class explicit, is a bit weird. But if the other class "Matrix" is explicit, you needn't give it a privileged status with respect to Numeric.array by having a member function in Numeric.array that amounts to a Matrix constructor. The only real motivation for that seems to me to be the feeling that M(x) is somehow less clear than x.M. Note that, except for a tricky property behavior, you really ought to have to write the latter as x.M().

As I said, I think we can beef up Matrix to make the linear algebra freaks happy, even to the point of making things like transpose(A)*(B) optimized operations. |
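Paul's "tricky property behavior" remark is about the difference between an ordinary method and a descriptor. Without descriptor support, `.M` is just a bound method and the caller must write `x.M()`; the property mechanism (available on new-style classes since Python 2.2) is what makes bare `x.M` work. A minimal illustration, with all class names hypothetical:

```python
# Method vs. property: the same spelling ".M" behaves differently.

class WithMethod:
    def M(self):                 # plain method: caller must write x.M()
        return "Matrix"

class WithProperty:
    @property
    def M(self):                 # descriptor: bare x.M triggers the call
        return "Matrix"

print(WithMethod().M())   # Matrix
print(WithProperty().M)   # Matrix
```

This is why the attribute spelling was costly before the new get/set mechanism: it previously required overriding __getattr__ on every attribute access.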
From: Andrew P. L. <bs...@al...> - 2002-03-06 20:39:04
|
Well, I believe that it solves the wrong problem. What I really want are Matrix objects that stay Matrix objects even through their associated functions, and Array objects that stay array objects. Why add any characters or casts at all when the objects can stay their original type? Please correct me if I'm missing something here.

-a

On Tue, 5 Mar 2002, Travis Oliphant wrote:

> Recently there has been discussion on the list about the awkwardness of
> matrix syntax when using Numeric Python.
> [...]
> What do you think?
>
> -Travis Oliphant |
From: eric <er...@en...> - 2002-03-07 06:04:10
|
Boy, did this one get a rise! Nice to hear so many voices.

I also feel we need a more compact notation for linear algebra, and I would like to get it without explicitly casting arrays to Matrix.Matrix objects. This attribute approach will work, but I wonder if trying the "adding an operator to Python" approach one more time would be worthwhile. At Python10 developer's day, Guido explicitly mentioned the linear algebra operator in a short comment, saying something to the effect that, if the numeric community could agree on an appropriate operator, he would strongly consider the addition. He also mentioned, at a coffee break, the strangeness of the 2 PEPs proposed on the topic. I noticed the status of both PEPs is "deferred."

http://python.sourceforge.net/peps/pep-0211.html

This one proposes the @ operator for outer products.

http://python.sourceforge.net/peps/pep-0225.html

This one proposes decorating the current binary ops with some symbols to indicate that they have different behavior than the standard binary ops. This is similar to Matlab's use of * for matrix multiplication and .* for elementwise multiplication, or to R's use of * for elementwise multiplication and %*% for "object-wise" multiplication. It proposes prepending ~ to operators to change their behavior, so that ~* would become matrix multiply. The PEP is a little more general, but this gives the flavor.

My hunch is that some form of the second (perhaps drastically reduced) would meet with more success. The suggested ~* or even the %*% operator are both palatable. Such details can be decided later. The question is whether there is sufficient interest to try to push the operator idea through. It would take much longer than choosing something we can do ourselves (like .M), but the operator solution seems more desirable to me.

eric

----- Original Message -----
From: "Travis Oliphant" <oli...@ie...>
To: <num...@li...>
Sent: Tuesday, March 05, 2002 11:44 PM
Subject: [Numpy-discussion] adding a .M attribute to the array.

> Recently there has been discussion on the list about the awkwardness of
> matrix syntax when using Numeric Python.
> [...]
> What do you think?
>
> -Travis Oliphant |
From: Huaiyu Z. <hua...@ya...> - 2002-03-18 08:39:56
|
I'm a little late to this discussion, but it gives me a chance to read all the existing comments. I'd like to offer some background to one possible solution.

On Thu, 7 Mar 2002, eric wrote:

> http://python.sourceforge.net/peps/pep-0225.html
>
> This one proposes decorating the current binary ops with some
> symbols to indicate that they have different behavior than
> the standard binary ops.
> [...]
> The question is whether there is sufficient interest to try and push the
> operator idea through? It would take much longer than choosing something
> we can do ourselves (like .M), but the operator solution seems more
> desirable to me.

I'm one of the coauthors of this PEP. I'm very glad to see additional interest in this proposal. It is not just a proposal; there was actually a patch, made by Gregory Lielens, implementing the ~op operators for Python 2.0. It redefines ~ so that it can be combined with + - * / ** to form new operators. It is quite ingenious in that all the original bitwise operations on ~ alone are still valid. The new operators can be assigned any semantics with hooks like __tmul__ and __rtmul__. The idea is that a matrix class would define __mul__ so that * is matrix multiplication, and define __tmul__ so that ~* is the elementwise operation. There is a test implementation on the MatPy homepage (matpy.sourceforge.net).

So what was holding it back? Well, last time around when this was discussed, it appeared that most of the heavyweights in the Numeric community favored either keeping the status quo or using the ~* symbol for arrays. We hoped to use the MatPy package as a test case to show that it is possible to have two entirely different kinds of objects, where the meanings of * and ~* are switched. However, for various reasons I was not able to act upon it for months, and Python evolved into 2.1 and 2.2. I never had much time to update the patch, and felt the attempt was futile because 1) Python was evolving quite fast, and 2) I did not hear much about this issue since then. I often feel guilty about the lapse.

Now might be a good time to revive this proposal, as the idea of having matrices and arrays with independent semantics but possibly related implementations appears to be gaining some additional acceptance. Some ancillary issues that hindered the implementation at that time have also been solved. For example, using .I for inverse, .T for transpose, etc. was costly because of the need to override __getattr__ and __coerce__, making a matrix class less attractive in practice. These can now be implemented efficiently using the new set/get mechanism.

I'd like to hear any suggestions on how to proceed. My own favorite would be to have separate array and matrix classes with easy but explicit conversions between them. Without conversions, arrays and matrices would be completely independent semantically. In other words, I'm mostly in favor of Konrad Hinsen's position, with the addition of using ~ operators for elementwise operations on matrix-like classes. The PEP itself also discusses ideas for extending the meaning of ~ to other parts of Python, for elementwise operations on aggregate types, but my impression of people's impressions is that it has a better chance without that part.

Huaiyu |
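The `~*` spelling requires the patched grammar Huaiyu describes and cannot be parsed by a stock interpreter, but the hook protocol itself is ordinary Python. The sketch below shows only the hooks, calling `__tmul__` directly where a patched interpreter would compile `a ~* b`; the toy Matrix class is illustrative, not MatPy's actual code.

```python
# PEP 225-style hooks: "*" is matrix multiplication, "~*" (here spelled
# as an explicit __tmul__ call) would be the elementwise operation.

class Matrix:
    def __init__(self, data):
        self.data = data

    def __mul__(self, other):        # "*": matrix multiplication
        cols = list(zip(*other.data))
        return Matrix([[sum(x * y for x, y in zip(r, c)) for c in cols]
                       for r in self.data])

    def __tmul__(self, other):       # "~*" would dispatch here
        return Matrix([[x * y for x, y in zip(r, s)]
                       for r, s in zip(self.data, other.data)])

a = Matrix([[1, 2], [3, 4]])
print((a * a).data)          # [[7, 10], [15, 22]]
print(a.__tmul__(a).data)    # what "a ~* a" would give: [[1, 4], [9, 16]]
```

An array class would define the same two hooks with the meanings swapped, which is exactly the two-kinds-of-objects experiment MatPy was meant to demonstrate.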
From: <a.s...@gm...> - 2002-03-18 14:55:00
|
[Sorry about the crossposting, but it also seemed relevant to both scipy and numpy...] Huaiyu Zhu <hua...@ya...> writes: [...] > I'd like to hear any suggestions on how to proceed. My own favorite would > be to have separate array and matrix classes with easy but explicit > conversions between them. Without conversions, arrays and matrices would > be completely independent semantically. In other words, I'm mostly in > favor of Konrad Hinsen's position, with the addition of using ~ operators > for elementwise operations for matrix-like classes. The PEP itself also > discussed ideas of extending the meaning of ~ to other parts of Python for > elementwise operations on aggregate types, but my impressions of people's > impressions is that it has a better chance without that part. > Well, from my impression of the previous discussions, the situation (both for numpy and scipy) seems to boil down to me as follows: Either `array` currently is too much of a matrix, or too little: Linear algebra functionality is currently exclusively provided by `array` and libraries that operate on and return `array`s, but the computational and notational efficiency leaves to be desired (compared to e.g. Matlab) in some areas, importantly matrix multiplications (which are up to 40 times slower) and really awkward to write (and much more importantly, decipher afterwards). So I think what one should really do is discuss the advantages and disadvantages of the two possible ways out of this situation, namely providing: 1) a new (efficient) `matrix` class/type (and appropriate libraries that operate on it) [The Matrix class that comes with Numeric is more some syntactic sugar wrapper -- AFAIK it's not use as a return type or argument in any of the functions that only make sense for arrays that are matrices]. 2) the additional functionality that is needed for linear algebra in `array` and the libraries that operate on it. 
(see [1] below for what I feel is currently missing and could be done either in way 1) or 2)) I think it might be helpful to investigate these "macro"-issues before one gets bogged down in discussions about operators (I admit these are not entirely unrelated -- given that one of the reasons for the creation of a Matrix type would be that '*' is already taken in 'array's and there is no way to add a new operator without modifying the python core -- just for the record and ignoring my own advice, _iff_ there is a chance of getting '~*' into the language, I'd rather have '*' do the same for both matrices and arrays). My impression is that the best path also very much depends on what the feature aspirations and divisions of labor of numpy/numarray and scipy are going to be. For example, scipy is really aimed at scientific users, who need performance and are willing to buy it with inconvenience (like the necessity to install other libraries on one's machine, most prominently atlas and blas). The `array` type and the functions in `Numeric`, on the other hand, potentially target a much wider community -- the efficient storage and indexing facilities (rich comparisons, strides, the take, choose etc. functions) make it highly useful for code that is not necessarily numeric (as an example, I'm currently using it for feature selection algorithms, without doing any numerical computations on the arrays). So maybe (a subset of) numpy should make it into the python core (or an as yet `non-existent sumo-distribution`) [BTW, I also wonder whether the python-core array module could be superseded/merged with numpy's `array`? One potential show stopper seems to be that it is e.g. `pop`able]. In such a scenario, where numpy remains relatively general (and might even aim at incorporation into the core), it would be a no-no to bloat it with too much code aimed at improving efficiency (calling blas when possible, sparse storage etc.). 
On the other hand, people who want to do serious numerical work will need this -- and the scipy community already requires atlas etc. and targets a more specialized audience. Under this consideration it might be an attractive solution to incorporate good matrix functionality (and possibly other improvements for hard-core number crunchers) in scipy only (or at least limit the efficient _implementation_ of matrices to scipy, providing only a pure python class or so in numpy). I'm not suggesting, BTW, to necessarily put all of [1] into a single class -- it seems sensible to have a couple of subclasses (for masked, sparse representations etc.) of `matrix` (maybe the parent-class should even be a relatively naïve Numpy implementation, with the good stuff as subclasses in scipy...). In any event, creating a new matrix class/type would also mean that matrix functionality in libraries should use and return this class (existing libraries should presumably largely still operate on arrays for backwards-compatibility (or both -- after a typecheck), and some matrix operations are so useful that it makes sense to provide array versions for them (e.g. dot) -- but on the whole it makes little sense to have a computationally and space efficient matrix type if one has to cast it around all the time). A `matrix` class is more specialized than an `array`, and since the operations one will often do on it are consequently more limited, I think it should provide the most important functionality as methods (rather than as external functions; see [2] for a list of suggestions). Approach 2), on the other hand, would have the advantage that the current interface would stay pretty much the same, and as long as 2D arrays can just be regarded as matrices, there is no absolutely compelling reason not to stuff everything into array (at least the scipy-version thereof). 
Another important question to ask before deciding what to change, and how, is obviously how many people in the scipy/numpy community do lots of linear algebra (and how many defectors from matlab etc. one could hope to win if one spiced things up a bit for them...), but I would suppose there must be quite a few (but I'm certainly biased ;). Unfortunately, I've really got to do some work again now, but before I return to number-crunching I'd like to say that I'd be happy to help with the implementation of a matrix class/type in python (I guess a .py-prototype would be helpful to start with, but ultimately a (subclassable) C(++)-type will be called for, at least in scipy). --alex Footnotes: [1] The required improvements for serious linear algebra seem to be: - optional use of (atlas) blas routines for real and complex matrix-matrix `dot`s if atlas is available on the build machine (see http://www.scipy.org/Members/aschmolck for a patch -- it produces speedups of more than a factor of 40 for big matrices; I'd be willing to provide an equivalent patch for the scipy distribution if there is interest) - making sure that no unnecessary copies are created (e.g. when transposing a matrix to use it in `dot` -- AFAIK although the transpose itself only creates a new view, using it for dot results in a copy (but I might be wrong here)) - allowing more space-efficient storage forms for special cases (e.g. sparse matrices, upper triangular etc.). IO libraries that can save and load such representations are also needed (methods and static methods might be a good choice to keep things transparent to the user). - providing a convenient and above all legible notation for common matrix operations (better than `dot(transpose(A),B)` etc. -- possibilities include A * B.T or A ~* B.T or A * B ** T (by overloading __rpow__ as suggested in a previous post)) - (in the case of a new `matrix` class): indexing functionality (e.g. `where`, `choose` etc. should be available without having to cast, e.g. 
for the common case that I want to set everything under a certain threshold to 0., I don't want to have to cast my sparse matrix to an array etc.) [2] What should a matrix class contain? - a dot operator (certainly eventually, but if there is a good chance to get ~* into python, maybe '*' should remain unimplemented till this can be decided) - most or all of what scipy's linalg module does - possibly IO, (reading as a static method) - indexing (the like of take, choose etc. (some should maybe be functions or static methods)) -- Alexander Schmolck Postgraduate Research Student Department of Computer Science University of Exeter A.S...@gm... http://www.dcs.ex.ac.uk/people/aschmolc/ |
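The A * B ** T spelling suggested in footnote [1] can be tried out in a few lines of plain Python. Mat and the T token below are hypothetical stand-ins for illustration, not Numeric API:

```python
# Hypothetical sketch of the "A * B ** T" notation from footnote [1]:
# T is a sentinel whose __rpow__ is hijacked, so that B ** T evaluates
# to the transpose of B.  Since ** binds tighter than *, the expression
# A * B ** T reads as A * (B ** T), i.e. A times B-transpose.
class Mat:
    def __init__(self, rows):
        self.rows = [list(r) for r in rows]

    def transpose(self):
        return Mat(zip(*self.rows))

    def __mul__(self, other):   # matrix product
        cols = list(zip(*other.rows))
        return Mat([[sum(x * y for x, y in zip(r, c)) for c in cols]
                    for r in self.rows])

class _TransposeToken:
    def __rpow__(self, base):   # Mat defines no __pow__, so B ** T lands here
        return base.transpose()

T = _TransposeToken()

A = Mat([[1, 2], [3, 4]])
B = Mat([[5, 6], [7, 8]])
C = A * B ** T                  # A times the transpose of B
```

The trick relies on Python's reflected-operand protocol: because Mat does not implement __pow__, the interpreter falls back to the right operand's __rpow__.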
From: Konrad H. <hi...@cn...> - 2002-03-18 15:59:27
|
a.s...@gm... (A.Schmolck) writes: > Linear algebra functionality is currently exclusively provided by `array` and > libraries that operate on and return `array`s, but the computational and > notational efficiency leaves something to be desired (compared to e.g. Matlab) in some > areas, importantly matrix multiplications (which are up to 40 times slower) > and really awkward to write (and much more importantly, decipher afterwards). Computational and notational efficiency are rather well separated, fortunately. Both the current dot function and a hypothetical matrix multiply operator could be implemented in straightforward C code or using a high-performance library such as Atlas. In fact, this should even be an installation choice in my opinion, as installing Atlas isn't trivial on all machines (e.g. with some gcc versions), and I consider it important for fundamental libraries that they work everywhere easily, even if not optimally. > My impression is that the best path also very much depends on what the > feature aspirations and divisions of labor of numpy/numarray and scipy are > going to be. For example, scipy is really aimed at scientific users, who > need performance and are willing to buy it with inconvenience (like the I see the main difference in distribution philosophy. NumPy is an add-on package to Python, which is in turn used by other add-on packages in a modular way. SciPy is rather a monolithic super-distribution for scientific users. Personally I strongly favour the modular package approach, and in fact I haven't installed SciPy on my system for that reason, although I would be interested in some of its components. > algorithms, without doing any numerical computations on the arrays). So maybe > (a subset of) numpy should make it into the python core (or an as yet This has been discussed already, and it might well happen one day, but not with the current NumPy implementation. Numarray looks like a much better candidate, but isn't ready yet. 
> In such a scenario, where numpy remains relatively general (and > might even aim at incorporation into the core), it would be a no-no > to bloat it with too much code aimed at improving efficiency > (calling blas when possible, sparse storage etc.). On the other hand The same approach as for XML could be used: a slim-line version in the standard distribution that could be replaced by a high-efficiency extended version for those who care. > attractive solution do incorporate good matrix functionality (and > possibly other improvements for hard core number crunchers) in scipy > only (or at least limit the efficient _implementation_ of matrices > to scipy, providing at only a pure python class or so in numpy). I'm I'd love to have efficient matrices without having to install the whole SciPy package! Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hi...@cn... Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- |
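The "installation choice" Konrad describes can be as simple as an import-time fallback. The sketch below is illustrative; _dotblas stands for a hypothetical compiled extension that only gets built when atlas is found:

```python
# Sketch of the "installation choice" idea: prefer an optimized dot if a
# compiled extension (the hypothetical _dotblas here) was built, and fall
# back to a portable implementation that works everywhere otherwise.
def _plain_dot(a, b):
    """Portable reference matrix product on nested lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]

try:
    from _dotblas import dot    # fast path, present only if atlas was found
except ImportError:
    dot = _plain_dot            # slim-line fallback
```

Callers see a single dot name either way, which is exactly the XML-style "slim-line version replaceable by a high-efficiency one" arrangement.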
From: Pearu P. <pe...@ce...> - 2002-03-18 20:32:09
|
<blink>Off topic warning</blink> On 18 Mar 2002, Konrad Hinsen wrote: > I see the main difference in distribution philosophy. NumPy is an > add-on package to Python, which is in turn used by other add-on > packages in a modular way. SciPy is rather a monolithic > super-distribution for scientific users. > > Personally I strongly favour the modular package approach, and in fact > I haven't installed SciPy on my system for that reason, although I > would be interested in some of its components. Me too. In what I have contributed to SciPy, I have tried to follow this modularity approach. Modularity is also an important property from the development point of view: it minimizes possible interference with other unrelated modules and their bugs. What I am trying to say here is that SciPy can (and should?, +1 from me) provide its components separately, though currently only a few of its components seem to be available in that way without some changes. Pearu |
From: <a.s...@gm...> - 2002-03-18 22:32:14
|
Konrad Hinsen <hi...@cn...> writes: > Computational and notational efficiency are rather well separated, > fortunately. Both the current dot function and a hypothetical matrix Yes, the only thing they have in common is that both are currently unsatisfactory (for matrix operations) in numpy, at least for my needs. Although I've solved my most pressing performance problems by patching Numeric [1], I'm obviously interested in a more official solution (i.e. one that is maintained by others :) [...] [order changed by me] > a.s...@gm... (A.Schmolck) writes: > > My impression is that the best path also very much depends on the what the > > feature aspirations and divisions of labor of numpy/numarray and scipy are ^^^^^^^ Darn, I made a confusing mistake -- this should read _future_. > > going to be. For example, scipy is really aimed at scientific users, which > > need performance, and are willing to buy it with inconvenience (like the > I see the main difference in distribution philosophy. NumPy is an > add-on package to Python, which is in turn used by other add-on > packages in a modular way. SciPy is rather a monolithic > super-distribution for scientific users. > > Personally I strongly favour the modular package approach, and in fact > I haven't installed SciPy on my system for that reason, although I > would be interested in some of its components. [...] > The same approach as for XML could be used: a slim-line version in the > standard distribution that could be replaced by a high-efficiency > extended version for those who care. [...] I personally agree with all your above points -- if you have a look at our "dotblas"-patch mentioned earlier (see [1]), you will find that it aims to provide exactly that -- have dot run anywhere without a hassle, but run (much) faster if the user is willing to install atlas. 
My main concern was that the argument should shift away a bit from syntactic and implementation details to what audiences and what needs numpy/numarray and scipy are supposed to address and, in this light, how to best strike the balance between convenience for users and maintainers, speed and bloat, generality and efficiency etc. As an example, adding the dotblas patch [1] to Numeric is, I think, more convenient for the users (granting a few assumptions (like that it actually works :) for the sake of the argument) -- it gives users who have atlas better performance, and those who don't won't (or at least shouldn't) notice. It is however inconvenient for the maintainers. Whether one should bother including it in this or some other way depends, apart from the obvious question of whether there is a better way to achieve what it does for both groups (like creating a dedicated Matrix class), also on what numpy is really supposed to achieve. I'm not entirely clear on that. For example I don't know how many numpy users deeply care about their matrix multiplications for big (1000x1000) matrices being 40 times faster. The monolithic approach is not entirely without its charms (remember python's "batteries included" jingle?). Apart from convenience factors it also has the not inconsiderable advantage that people use _one_ standard module for a certain thing -- rather than 20 different solutions. This certainly helps to improve code quality. Not least because someone goes through the trouble of deciding what merits inclusion in the "Big Thing", possibly urging changes but at least almost certainly taking more time for evaluation than an individual programmer who just wants to get a certain job done. It also makes life easier for module writers -- they can rely on certain stuff being around (and don't have to reinvent the wheel, another potential improvement to code quality). 
As such it makes life easier for maintainers, as does the scipy commandment that you have to install atlas/lapack, full-stop (and if it doesn't run on your machine -- well, at least it works fast for some people, and that might well be better than working slow for everyone in this context). So, I think what's good really depends on what you're aiming at; that's why I'd like to know what users and developers think about these matters. My points regarding scipy and numpy/numarray were just one attempt at interpreting what these respective libraries try to/should/could attempt to be or become. Now, not being a developer for either of them (I've only submitted a few minor patches to scipy), I'm not in a particularly good position to venture such interpretations, but I hoped that it would provoke other and more knowledgeable people to share their opinions and insights on this matter (as indeed you did). > I'd love to have efficient matrices without having to install the > whole SciPy package! Welcome to the linear algebra lobby group ;) yep, that would be nice, but my impression was that the scipy folks are currently more concerned about performance issues than the numpy/numarray folks, and I could live with either package providing what I want. Ideally, I'd like to see a slim core numarray, without any frills (and more streamlined to behave like standard python containers (e.g. indexing and type/casts behavior)) for the python core, something more enabled and efficient for numerics (including matrices!) as a separate package (like the XML example you quote). And then maybe a bigger pre-bundled collection of (ideally rather modular) numerical libraries for really hard-core scientific users (maybe in the spirit of xemacs-packages and sumo-tar-balls -- no bloat if you don't need it, plenty of features in an instant if you do). 
Anyway, is there at least general agreement that there should be some new and wonderful matrix class (plus supporting libraries) somewhere (rather than souping up array)? alex Footnotes: [1] patch for faster dot product in Numeric http://www.scipy.org/Members/aschmolck -- Alexander Schmolck Postgraduate Research Student Department of Computer Science University of Exeter A.S...@gm... http://www.dcs.ex.ac.uk/people/aschmolc/ |
From: Konrad H. <hi...@cn...> - 2002-03-19 11:08:03
|
a.s...@gm... (A.Schmolck) writes: > > > feature aspirations and divisions of labor of numpy/numarray and scipy are > ^^^^^^^ > Darn, I made a confusing mistake -- this should read _future_. Or perhaps __future__ ;-) > I personally agree with all your above points -- if you have a look at our > "dotblas"-patch mentioned earlier (see [1]), you will find that it aims to do And I didn't even know about this... > It is however inconvenient for the maintainers. Whether one should bother > including it in this or some other way depends, apart from the obvious question of There could be two teams, one maintaining a standard portable implementation, and another one taking care of optimization add-ons. From the user's point of view, what matters most is a single entry-point for finding everything that is available. > The monolithic approach is not entirely without its charms (remember > python's "batteries included" jingle?). Apart from convenience Sure, but... That's the standard library. Everybody has it, in identical form, and its consistency and portability is taken care of by the Python development team. There can be only *one* standard library that works like this. I see no problem either with providing a larger integrated distribution for specific user communities. But such distribution and packaging strategies should be distinct from development projects. If I can get a certain package only as part of a huge distribution that I can't or don't want to install, then that package is effectively lost for me. Worse, if one package comes with its personalized version of another package (SciPy with NumPy), then I end up having to worry about internal conflicts within my installation. On the other hand, package interdependencies are a big problem in the Open Source community at large, and I have personally been bitten more than once by an incompatible change in NumPy that broke my modules. But I don't see any other solution than better communication between development teams. 
Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hi...@cn... Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- |
From: Prabhu R. <pr...@ae...> - 2002-03-20 18:22:01
|
hi, I'm sorry I haven't been following the discussion too closely and this post might be completely unrelated. >>>>> "AS" == A Schmolck <a.s...@gm...> writes: AS> Ideally, I'd like to see a slim core numarray, without any AS> frills (and more streamlined to behave like standard python AS> containers (e.g. indexing and type/casts behavior)) for the AS> python core, something more enabled and efficient for numerics AS> (including matrices!) as a separate package (like the XML AS> example you quote). And then maybe a bigger pre-bundled AS> collection of (ideally rather modular) numerical libraries for AS> really hard-core scientific users (maybe in the spirit of AS> xemacs-packages and sumo-tar-balls -- no bloat if you don't AS> need it, plenty of features in an instant if you do). AS> Anyway, is there at least general agreement that there should AS> be some new and wonderful matrix class (plus supporting AS> libraries) somewhere (rather than souping up array)? Ideally, I'd like something that also has a reasonably easy-to-use interface from C/C++. The idea is that it should be easy (and natural) for someone to use the same library from C/C++ when performance is desired. This would be really nice and very useful. prabhu |
From: Konrad H. <hi...@cn...> - 2002-03-07 09:11:36
|
"eric" <er...@en...> writes: > Matrix.Matrix objects. This attribute approach will work, but I > wonder if trying the "adding an operator to Python" approach one > more time would be worth while. At Python10 developer's day, Guido If it were only one operator, perhaps, although I might even give up on Python completely if starts to use Perlish notations like ~@!. But if you really want to have a short-hand syntax for the common matrix operations, you'd need multiplication, division (shorthand for multiplying by inverse), power, transpose and hermitian transpose. If you want to go the "operator way", the goal should rather be something like APL, with composite operators. Matrix multiplication would then be a special case of a reduction operator that uses multiplication and addition (in APL this is written as "+.x"). Note that I am *not* suggesting this, my opinion is still that matrices and arrays should be semantically different types. Konrad. -- ------------------------------------------------------------------------------- Konrad Hinsen | E-Mail: hi...@cn... Centre de Biophysique Moleculaire (CNRS) | Tel.: +33-2.38.25.56.24 Rue Charles Sadron | Fax: +33-2.38.63.15.17 45071 Orleans Cedex 2 | Deutsch/Esperanto/English/ France | Nederlands/Francais ------------------------------------------------------------------------------- |
From: John J. L. <jj...@po...> - 2002-03-08 21:41:18
|
On Thu, 7 Mar 2002, Konrad Hinsen wrote: > "eric" <er...@en...> writes: > > > Matrix.Matrix objects. This attribute approach will work, but I > > wonder if trying the "adding an operator to Python" approach one > > more time would be worth while. At Python10 developer's day, Guido [...] > If you want to go the "operator way", the goal should rather be > something like APL, with composite operators. Matrix multiplication [...] How about general operator - function equivalence, as explained here by Alex Martelli? The change is large in one sense, but it is conceptually very simple: http://groups.google.com/groups?q=operator+Martelli+Haskell+group:comp.lang.python&hl=en&selm=8t4dl301a4%40news2.newsguy.com&rnum=1 > 2 div 3 > or > div(2,3) > or > 2 `div 3 > [Haskell-ishly syntax-sugar note: Haskell lets you > use any 2-operand function as an infix operator by > just enclosing its name in ``; in Py3K, I think a > single leading ` would suffice -- far nicer than the > silly current use of ` for the rare need of repr -- > and we might also, with pleasing symmetry, let any > operator be used as a normal function a la > `+(a,b) > i.e., the ` marker could lexically switch functions > to operators and operators to functions, without > needing to 'import operator' and recall what the > operator-name for a given operator IS...!-). The > priority and associativity of these infinitely > many "new operators" could be fixed ones...]. Since GvR seems to have given up the idea of 'Py3K' in favour of gradual changes, perhaps this is a real possibility? Travis' r = a.M * b.M would then be written as M = Numeric.matrixmultiply r = a `M b (Konrad also complains about Perl's nasty syntax. This is frequently complained about, but do you really think the syntax is the problem -- surely it's Perl's horribly complicated semantics that is the real issue? The syntax is just inconvenient, in comparison at least. Sorry, a bit OT...) John |
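The backtick syntax Alex Martelli sketches never made it into Python, but the effect can be approximated today by overloading an existing operator on a small wrapper object. Infix and matrixmultiply below are a well-known trick shown for illustration, not anything provided by Numeric:

```python
# Approximating Haskell-style `f` infix syntax in current Python:
# a |M| b parses as (a | M) | b, so a wrapper whose __ror__ captures the
# left operand turns any two-argument function into a pseudo-infix op.
class Infix:
    def __init__(self, func):
        self.func = func
        self.left = None

    def __ror__(self, left):            # a | M: remember the left operand
        bound = Infix(self.func)
        bound.left = left
        return bound

    def __or__(self, right):            # (a | M) | b: apply the function
        return self.func(self.left, right)

def matrixmultiply(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a]

M = Infix(matrixmultiply)
r = [[1, 2], [3, 4]] |M| [[5, 6], [7, 8]]   # reads almost like a `M b
```

This works because lists don't define __or__, so Python falls back to the wrapper's __ror__; the fixed precedence of | plays the role of the "fixed priority and associativity" Martelli mentions.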
From: Konrad H. <hi...@cn...> - 2002-03-09 22:37:17
|
"John J. Lee" <jj...@po...> writes: > (Konrad also complains about Perl's nasty syntax. This is frequently > complained about, but do you really think the syntax is the problem -- > surely it's Perl's horribly complicated semantics that is the real issue? > The syntax is just inconvenient, in comparison at least. Sorry, a bit > OT...) It's both, of course. I don't really wish to decide which is worse, especially not because I'd have to read more Perl code to reach such a decision ;-) But syntax is an issue for readability. There are some symbols that are generally used as operators in computer languages, and I think Python uses all of them already. Moreover, the general semantics are quite uniform as well: * stands for multiplication, for example, although the details of what multiplication means can vary. Symbols like @ are not operators everywhere, and where they are there is no uniform meaning attached to them, so they create confusion. As a test, take a Python program and replace all * by @. It does look weird. Konrad. |