From: Perry G. <pe...@st...> - 2004-01-21 21:28:10
|
Joe Harrington writes: > > This is a necessarily long post about the path to an open-source > replacement for IDL and Matlab. While I have tried to be fair to > those who have contributed much more than I have, I have also tried to > be direct about what I see as some fairly fundamental problems in the > way we're going about this. I've given it some section titles so you > can navigate, but I hope that you will read the whole thing before > posting a reply. I fear that this will offend some people, but please > know that I value all your efforts, and offense is not my intent. > No offense taken. [...] > THE PROBLEM > > We are not following the open-source development model. Rather, we > pay lip service to it. Open source's development mantra is "release > early, release often". This means release to the public, for use, a > package that has core capability and reasonably-defined interfaces. > Release it in a way that as many people as possible will get it, > install it, use it for real work, and contribute to it. Make the main > focus of the core development team the evaluation and inclusion of > contributions from others. Develop a common vision for the program, > and use that vision to make decisions and keep efforts focused. > Include contributing developers in decision making, but do make > decisions and move on from them. > > Instead, there are no packages for general distribution. The basic > interfaces are unstable, and not even being publicly debated to decide > among them (save for the past 3 days). The core developers seem to > spend most of their time developing, mostly out of view of the > potential user base. I am asked probably twice a week by different > fellow astronomers when an open-source replacement for IDL will be > available. They are mostly unaware that this effort even exists. > However, this indicates that there are at least hundreds of potential > contributors of application code in astronomy alone, as I don't nearly > know everyone. 
The current efforts look rather more like the GNU > project than Linux. I'm sorry if that hurts, but it is true. > I'd both agree with this and disagree. Agree in the sense that many agree these are desirable traits of an open source project. Disagree in the sense that many don't meet all of these traits, and yet may be useful to some degree. Even Python is not released often, nor is it generally packaged by the core group. You will find packaging by special interest groups that may or may not be up to date for various platforms. There is a whole spectrum of other, useful open source projects that don't satisfy these requirements. I don't mean that in a defensive way; it's certainly fair to ask what is going wrong in the Python numeric world, but doing the above alone doesn't necessarily guarantee that you will be successful in attracting feedback and contributions; there are other factors as well that influence how a project develops. We have had experience with the packaging issue for PyRAF, and it isn't quite so simple: the packaged-binary approach didn't always make life simpler for the user (arguably, we have found the source distribution approach more trouble-free than our original release). Having one's own version of Python packaged as a binary raises issues with LD_LIBRARY_PATH that there are just no good solutions to. > I know that Perry's group at STScI and the fine folks at Enthought > will say they have to work on what they are being paid to work on. > Both groups should consider the long term cost, in dollars, of > spending those development dollars 100% on coding, rather than 50% on > coding and 50% on outreach and intake. Linus himself has written only > a small fraction of the Linux kernel, and almost none of the > applications, yet in much less than 7 years Linux became a viable > operating system, something much bigger than what we are attempting > here. He couldn't have done that himself, for any amount of money. > We all know this. 
> I'd say we have tried our best to solicit input (and accept contributed code as well). You have to remember that how easily contributions come depends on what the critical mass is for usefulness. For something like numarray or Numeric, that critical mass is quite large. Few are interested in contributing when it can do very little and an older package exists that can do more. By the time it has comparable functionality, it is already quite large. A lot of projects like that start with a small group before more join in. There are others where the critical mass is low and many join in when functionality is still relatively low. > THE PATH > > Here is what I suggest: > > 1. We should identify the remaining open interface questions. Not, > "why is numeric faster than numarray", but "what should the syntax > of creating an array be, and of doing different basic operations". > If numeric and numarray are in agreement on these issues, then we > can move on, and debate performance and features later. > Well, there are, and continue to be, those that can't come to an agreement on even the interface. These issues have been raised many times in the past. Often consensus was hard to achieve. We tended to lean towards backward compatibility unless the change seemed really necessary. For type coercion and error handling, we thought it was. But I don't think we have tried to shield the decision making process from the community. I do think the difficulty in achieving a sense of consensus is a problem. Perhaps we are going about the process in the wrong way; I'd welcome suggestions as to how to improve that. > 2. We should identify what we need out of the core plotting > capability. Again, not "chaco vs. pyxis", but the list of > requirements (as an astronomer, I very much like Perry's list). > > 3. 
We should collect or implement a very minimal version of the > featureset, and document it well enough that others like us can do > simple but real tasks to try it out, without reading source code. > That documentation should include lists of things that still need > to be done. > > 4. We should release a stand-alone version of the whole thing in the > formats most likely to be installed by users on the four most > popular OSs: Linux, Windows, Mac, and Solaris. For Linux, this > means .rpm and .deb files for Fedora Core 1 and Debian 3.0r2. > Tarballs and CVS checkouts are right out. We have seen that nobody > in the real world installs them. To be most portable and robust, > it would make sense to include the Python interpreter, named such > that it does not stomp on versions of Python in the released > operating systems. Static linking likewise solves a host of > problems and greatly reduces the number of package variants we will > have to maintain. > Static linking also introduces other problems. And we have gone this route in the past so we have some knowledge of what it entails. > 5. We should advertize and advocate the result at conferences and > elsewhere, being sure to label it what it is: a first-cut effort > designed to do a few things well and serve as a platform for > building on. We should also solicit and encourage people either to > work on the included TODO lists or to contribute applications. One > item on the TODO list should be code converters from IDL and Matlab > to Python, and compatibility libraries. > > 6. We should then all continue to participate in the discussions and > development efforts that appeal to us. We should keep in mind that > evaluating and incorporating code that comes in is in the long run > much more efficient than writing the universe ourselves. > > 7. We should cut and package new releases frequently, at least once > every six months. 
It is better to delay a wanted feature by one > release than to hold a release for a wanted feature. The mountain > is climbed in small steps. > > The open source model is successful because it follows closely > something that has worked for a long time: the scientific method, with > its community contributions, peer review, open discussion, and > progress mainly in small steps. Once basic capability is out there, > we can twiddle with how to improve things behind the scenes. > In general, I can't disagree much with most of these. I'm happy for others to smack us when we are going away from this sort of process. Please do; it would be the only way we (and others) would learn how to really do it. But we have released fairly frequently, if not with rpms. We do provide pretty good support as well. We have incorporated most of the code sent to us, and considered and implemented many feature requests or performance issues. But the numarray core is not something one would casually change without spending some time understanding how it works; I suspect that is the biggest inhibitor to changes to the core. We are happy to work with others on it if they have the time to do so. If anyone feels we have discouraged people contributing, please let me know (privately if you wish). > IS SCIPY THE WAY? > > The recipe above sounds a lot like SciPy. SciPy began as a way to > integrate the necessary add-ons to numeric for real work. It was > supposed to test, document, and distribute everything together. I am > aware that there are people who use it, but the numbers are small and > they seem to be tightly connected to Enthought for support and > application development. Enthought's focus seems to be on servicing > its paying customers rather than on moving SciPy development along, > and I fear they are building an installed customer base on interfaces > that were not intended to be stable. > I don't feel this is fair to Enthought. 
It is not my impression that they have made any money off of the scipy distribution directly (Chaco is a different issue). As far as I can tell, the only benefit they've generally gotten from it is from the visibility of sponsoring it, and perhaps from their own use of a few of the tools they have included as part of it. I doubt that their own clients have driven its development in any significant way. I'd guess they have sunk far more money into scipy than gotten out of it. I don't want others to get the impression that it is the other way around. In fact, on a number of occasions I have heard users complain about the documentation and the standard response is "please help us improve it" with very little in response. They have gone the extra mile in soliciting contributions and help maintaining it. Perhaps it is part of my open source blind spot, but I have trouble seeing what else they could be doing to encourage others to contribute to scipy (besides paying them; which they have done as well!). The only thing I can think of is that because they are doing it, others feel that they don't need to. Perhaps there is a similar issue with numarray. I don't know. > So, I will raise the question: is SciPy the way? Rather than forking > the plotting and numerical efforts from what SciPy is doing, should we > not be creating a new effort to do what SciPy has so far not > delivered? These are not rhetorical or leading questions. I don't > know enough about the motivations, intentions, and resources of the > folks at Enthought (and elsewhere) to know the answer. I do think > that such a fork will occur unless SciPy's approach changes > substantially. The way to decide is for us all to discuss the > question openly on these lists, and for those willing to participate > and contribute effort to declare so openly. I think all that is > needed, either to help SciPy or replace it, is some leadership in the > direction outlined above. 
I would be interested in hearing, perhaps > from the folks at Enthought, alternative points of view. Why are > there no packages for popular OSs for SciPy 0.2? Why are releases so > infrequent? If the folks running the show at scipy.org disagree with > many others on these lists, then perhaps those others would like to > roll their own. Or, perhaps stable/testing/unstable releases of the > whole package are in order. > I think the answer is simple. Supporting distributions of the software they have pulled into scipy is a hell of a lot of work; work that nobody is paying them for. It gives me the shivers to think of our taking on all they have for scipy. > HOW TO CONTRIBUTE? > > Judging by the number of PhDs in sigs, there are a lot of researchers > on this list. I'm one, and I know that our time for doing core > development or providing the aforementioned leadership is very > limited, if not zero. Later we will be in a much better position to > contribute application software. However, there is a way we can > contribute to the core effort even if we are not paid, and that is to > put budget items in grant and project proposals to support the work of > others. Those others could be either our own employees or > subcontractors at places like Enthought or STScI. A handful of > contributors would be all we'd need to support someone to produce OS > packages and tutorial documentation (the stuff core developers find > boring) for two releases a year. > By all means, if there is a groundswell of support for development, please let us know. Perry Greenfield |
From: Konrad H. <hi...@cn...> - 2004-01-21 21:26:13
|
On 21.01.2004, at 19:44, Joe Harrington wrote: > This is a necessarily long post about the path to an open-source > replacement for IDL and Matlab. While I have tried to be fair to You raise many good points here. Some comments: > those who have contributed much more than I have, I have also tried to > be direct about what I see as some fairly fundamental problems in the > way we're going about this. I've given it some section titles so you I'd say the fundamental problem is that "we" don't exist as a coherent group. There are a few developer groups (e.g. at STScI and Enthought) who write code primarily for their own need and then make it available. The rest of us are what one could call "power users": very interested in the code, knowledgeable about its use, but not contributing to its development other than through testing and feedback. > THE PROBLEM > > We are not following the open-source development model. Rather, we True. But is it perhaps because that model is not so well adapted to our situation? If you look at Linux (the OpenSource reference), it started out very differently. It was a fun project, done by hobby programmers who shared an idea of fun (kernel hacking). Linux was not goal-oriented in the beginning. No deadlines, no usability criteria, but lots of technical challenges. Our situation is very different. We are scientists and engineers who want code to get our projects done. We have clear goals, and very limited means, plus we are mostly someone's employees and thus not free to do as we would like. On the other hand, our project doesn't provide the challenges that attract the kind of people who made Linux big. You don't get into the news by working on NumPy, you don't work against Microsoft, etc. Computational science and engineering just isn't the same as kernel hacking. I develop two scientific Python libraries myself, more specialized and thus with a smaller market share, but the situation is otherwise similar. 
And I work much like the Numarray people do: I write the code that I need, and I invest minimal effort in distribution and marketing. To get the same code developed in the Linux fashion, there would have to be many more developers. But they just don't exist. I know of three people worldwide whose competence in both Python/C and in the application domain is good enough that they could work on the code base. This is not enough to build a networked development community. The potential NumPy community is certainly much bigger, but I am not sure it is big enough. Working on NumPy/Numarray requires the combination of not-so-frequent competences, plus availability. I am not saying it can't be done, but it sure isn't obvious that it can be. > Release it in a way that as many people as possible will get it, > install it, use it for real work, and contribute to it. Make the main > focus of the core development team the evaluation and inclusion of > contributions from others. Develop a common vision for the program, This requires yet different competences, and thus different people. It takes people who are good at reading others' code and communicating with them about it. Some people are good programmers, some are good scientists, some are good communicators. How many are all of that - *and* available? > I know that Perry's group at STScI and the fine folks at Enthought > will say they have to work on what they are being paid to work on. > Both groups should consider the long term cost, in dollars, of > spending those development dollars 100% on coding, rather than 50% on > coding and 50% on outreach and intake. Linus himself has written only You are probably right. But does your employer think long-term? Mine doesn't. > applications, yet in much less than 7 years Linux became a viable > operating system, something much bigger than what we are attempting Exactly. We could be too small to follow the Linux way. > 1. We should identify the remaining open interface questions. 
Not, > "why is numeric faster than numarray", but "what should the syntax > of creating an array be, and of doing different basic operations". Yes, a very good point. Focus on the goal, not on the legacy code. However, a technical detail that should not be forgotten here: NumPy and Numarray have a C API as well, which is critical for many add-ons and applications. A C API is more closely tied to the implementation than a Python API. It might thus be difficult to settle on an API and then work on efficient implementations. > 2. We should identify what we need out of the core plotting > capability. Again, not "chaco vs. pyxis", but the list of > requirements (as an astronomer, I very much like Perry's list). 100% agreement. For plotting, defining the interface should be easier (no C stuff). Konrad. |
From: Todd M. <jm...@st...> - 2004-01-21 21:20:50
|
> > Why are numarrays so slow to create? > > There are several portable ways to create numarrays (array(), arange(), zeros(), ones()) and I'm not really sure which one to address, so I poked around some. I discovered that numarray-0.8 has a problem with array() which causes very poor performance (~30x slower than Numeric) for arrays created from a sequence. The problem is with a private Python function, _all_arrays(), that scans the sequence to see if it consists only of arrays; _all_arrays() works badly for the ordinary case of a sequence of numbers. This is fixed now in CVS. Beyond this flaw in array(), it's a mixed bag, with numarray tending to do well with large arrays and certain use cases, and Numeric doing well with small arrays and other use cases. Todd > I'll leave it to Todd to give the details of that. -- Todd Miller Space Telescope Science Institute 3700 San Martin Drive Baltimore MD, 21030 (410) 338 - 4576 |
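The small-array versus large-array comparison Todd describes could be reproduced with a timing harness along the following lines. This is only a sketch under stated assumptions: `creation_time` and `compare` are hypothetical helper names, not part of Numeric or numarray, and the built-in `list` and `tuple` constructors stand in for the real array constructors so that the script runs without either package installed.

```python
import timeit


def creation_time(factory, n, repeats=5, number=200):
    """Best-of-`repeats` seconds per call of factory(seq), seq being n floats."""
    seq = [float(i) for i in range(n)]
    timer = timeit.Timer(lambda: factory(seq))
    # min() over repeats reduces noise from other processes on the machine.
    return min(timer.repeat(repeat=repeats, number=number)) / number


def compare(factories, sizes=(10, 1000, 100000)):
    """Return {name: {n: seconds_per_call}} for every named factory."""
    return {
        name: {n: creation_time(factory, n) for n in sizes}
        for name, factory in factories.items()
    }


if __name__ == "__main__":
    # Placeholders only: swap in the array constructors to be compared,
    # e.g. {"Numeric": Numeric.array, "numarray": numarray.array}.
    results = compare({"list": list, "tuple": tuple})
    for name, per_size in results.items():
        for n, secs in per_size.items():
            print(f"{name:8s} n={n:<7d} {secs * 1e6:10.2f} us/call")
```

Passing the constructor in as a callable keeps the harness independent of any one package, so the same sizes and repeat counts apply to every candidate.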
From: Magnus L. H. <ma...@he...> - 2004-01-21 20:52:57
|
Perry Greenfield <pe...@st...>: > > Jon Peirce writes: > > > > I agree with the sentiment that chaco is a very heavy and confusing > > package for the average scientist (but maybe great for the full-time > > programmer) but I'm really concerned about the idea that we need > > *another* solution started from scratch. There are already so many > > including scipy.gplt, scipy.plt, dislin, biggles, pychart, piddle, > > pgplot, pyx (new)... > > > We had looked at all of these and each had fallen short in some major > way (though I thought piddle had much promise and perhaps could be > built on; however it was intended as a back end only.) Wohoo! Piddle lives ;) I think I'd be interested in resuming some of my earlier work on Piddle if it is ever used for something useful -- such as a proper plotting tool. (I was actually just thinking about wrapping PyX in the Piddle interface to make TeX typesetting available in Piddle.) [snip about matplotlib] Hm. Maybe a Piddle back-end could be written for it (which would instantly give it lots of extra back-ends)...? Two birds with one stone and all that... - M -- Magnus Lie Hetland "The mind is not a vessel to be filled, http://hetland.org but a fire to be lighted." [Plutarch] |
From: Perry G. <pe...@st...> - 2004-01-21 20:06:25
|
Jon Peirce writes: > > I agree with the sentiment that chaco is a very heavy and confusing > package for the average scientist (but maybe great for the full-time > programmer) but I'm really concerned about the idea that we need > *another* solution started from scratch. There are already so many > including scipy.gplt, scipy.plt, dislin, biggles, pychart, piddle, > pgplot, pyx (new)... > We had looked at all of these and each had fallen short in some major way (though I thought piddle had much promise and perhaps could be built on; however it was intended as a back end only.) > In particular MatPlotLib looks promising - check out its examples: > http://matplotlib.sourceforge.net/screenshots.html > *Many* plotting types already, simple syntax, a few different backends. > And already has something of a following. > This we had not seen. A superficial look indicates that it is worth investigating further as a basis for a plotting package. I didn't see any major problem with it that contradicted our requirements, but obviously we will have to look at it in more depth to see if that is the case. It doesn't have to be perfect of course. And it is much more expensive to start from scratch (though we weren't doing that entirely since a number of components from the chaco effort would have been reused). But this is worth seriously considering. Perry Greenfield > So is it really not possible for STScI to push its resources into aiding > the development of something that's already begun? Would be great if we > could develop a single package really well rather than everyone making > their own. > |
From: Jon P. <jw...@ps...> - 2004-01-21 19:09:42
|
> We have started on this over the past month, and hope to have some simple functionality available within a month (though when we make it public may take a bit longer). It will be open source and we hope significantly simpler than chaco. It will not focus on speed (well, we want fairly fast display times for plots of a reasonable number of points, but we don't need video refresh rates). If your interest in plotting matches ours, then this may be for you. We will welcome contributions and comments once we get it off the ground. (We are calling it pyxis by the way). > I agree with the sentiment that chaco is a very heavy and confusing package for the average scientist (but maybe great for the full-time programmer) but I'm really concerned about the idea that we need *another* solution started from scratch. There are already so many including scipy.gplt, scipy.plt, dislin, biggles, pychart, piddle, pgplot, pyx (new)... In particular MatPlotLib looks promising - check out its examples: http://matplotlib.sourceforge.net/screenshots.html *Many* plotting types already, simple syntax, a few different backends. And already has something of a following. So is it really not possible for STScI to push its resources into aiding the development of something that's already begun? Would be great if we could develop a single package really well rather than everyone making their own. -- Jon Peirce Nottingham University +44 (0)115 8467176 (tel) +44 (0)115 9515324 (fax) http://www.psychology.nottingham.ac.uk/staff/jwp/ |
From: Joe H. <jh...@oo...> - 2004-01-21 18:44:49
|
This is a necessarily long post about the path to an open-source replacement for IDL and Matlab. While I have tried to be fair to those who have contributed much more than I have, I have also tried to be direct about what I see as some fairly fundamental problems in the way we're going about this. I've given it some section titles so you can navigate, but I hope that you will read the whole thing before posting a reply. I fear that this will offend some people, but please know that I value all your efforts, and offense is not my intent. THE PAST VS. NOW While there is significant and dedicated effort going into numeric/numarray/scipy, it's becoming clear that we are not progressing quickly toward a replacement for IDL and Matlab. I have great respect for all those contributing to the code base, but I think the present discussion indicates some deep problems. If we don't identify those problems (easy) and solve them (harder, but not impossible), we will continue not to have the solution so many people want. To be convinced that we are doing something wrong at a fundamental level, consider that Python was the clear choice for a replacement in 1996, when Paul Barrett and I ran a BoF at ADASS VI on interactive data analysis environments. That was over 7 years ago. When people asked at that conference, "what does Python need to replace IDL or Matlab", the answer was clearly "stable interfaces to basic numerics and plotting; then we can build it from there following the open-source model". Work on both these problems was already well underway then. Now, both the numerical and plotting development efforts have branched. There is still no stable base upon which to build. There aren't even packages for popular OSs that people can install and play with. The problem is not that we don't know how to do numerics or graphics; if anything, we know these things too well. 
In 1996, if anyone had told us that in 2004 there would be no ready-to-go replacement system because of a factor of 4 in small array creation overhead (on computers that ran 100x as fast as those then available) or the lack of interactive editing of plots at video speeds, the response would not have been pretty. How would you have felt? THE PROBLEM We are not following the open-source development model. Rather, we pay lip service to it. Open source's development mantra is "release early, release often". This means release to the public, for use, a package that has core capability and reasonably-defined interfaces. Release it in a way that as many people as possible will get it, install it, use it for real work, and contribute to it. Make the main focus of the core development team the evaluation and inclusion of contributions from others. Develop a common vision for the program, and use that vision to make decisions and keep efforts focused. Include contributing developers in decision making, but do make decisions and move on from them. Instead, there are no packages for general distribution. The basic interfaces are unstable, and not even being publicly debated to decide among them (save for the past 3 days). The core developers seem to spend most of their time developing, mostly out of view of the potential user base. I am asked probably twice a week by different fellow astronomers when an open-source replacement for IDL will be available. They are mostly unaware that this effort even exists. However, this indicates that there are at least hundreds of potential contributors of application code in astronomy alone, as I don't nearly know everyone. The current efforts look rather more like the GNU project than Linux. I'm sorry if that hurts, but it is true. I know that Perry's group at STScI and the fine folks at Enthought will say they have to work on what they are being paid to work on. 
Both groups should consider the long-term cost, in dollars, of spending those development dollars 100% on coding, rather than 50% on coding and 50% on outreach and intake. Linus himself has written only a small fraction of the Linux kernel, and almost none of the applications, yet in much less than 7 years Linux became a viable operating system, something much bigger than what we are attempting here. He couldn't have done that himself, for any amount of money. We all know this.

THE PATH

Here is what I suggest:

1. We should identify the remaining open interface questions. Not, "why is numeric faster than numarray", but "what should the syntax of creating an array be, and of doing different basic operations". If numeric and numarray are in agreement on these issues, then we can move on, and debate performance and features later.

2. We should identify what we need out of the core plotting capability. Again, not "chaco vs. pyxis", but the list of requirements (as an astronomer, I very much like Perry's list).

3. We should collect or implement a very minimal version of the feature set, and document it well enough that others like us can do simple but real tasks to try it out, without reading source code. That documentation should include lists of things that still need to be done.

4. We should release a stand-alone version of the whole thing in the formats most likely to be installed by users on the four most popular OSs: Linux, Windows, Mac, and Solaris. For Linux, this means .rpm and .deb files for Fedora Core 1 and Debian 3.0r2. Tarballs and CVS checkouts are right out. We have seen that nobody in the real world installs them. To be most portable and robust, it would make sense to include the Python interpreter, named such that it does not stomp on versions of Python in the released operating systems. Static linking likewise solves a host of problems and greatly reduces the number of package variants we will have to maintain.

5. We should advertise and advocate the result at conferences and elsewhere, being sure to label it what it is: a first-cut effort designed to do a few things well and serve as a platform for building on. We should also solicit and encourage people either to work on the included TODO lists or to contribute applications. One item on the TODO list should be code converters from IDL and Matlab to Python, and compatibility libraries.

6. We should then all continue to participate in the discussions and development efforts that appeal to us. We should keep in mind that evaluating and incorporating code that comes in is in the long run much more efficient than writing the universe ourselves.

7. We should cut and package new releases frequently, at least once every six months. It is better to delay a wanted feature by one release than to hold a release for a wanted feature.

The mountain is climbed in small steps. The open source model is successful because it follows closely something that has worked for a long time: the scientific method, with its community contributions, peer review, open discussion, and progress mainly in small steps. Once basic capability is out there, we can twiddle with how to improve things behind the scenes.

IS SCIPY THE WAY?

The recipe above sounds a lot like SciPy. SciPy began as a way to integrate the necessary add-ons to numeric for real work. It was supposed to test, document, and distribute everything together. I am aware that there are people who use it, but the numbers are small and they seem to be tightly connected to Enthought for support and application development. Enthought's focus seems to be on servicing its paying customers rather than on moving SciPy development along, and I fear they are building an installed customer base on interfaces that were not intended to be stable.

So, I will raise the question: is SciPy the way? Rather than forking the plotting and numerical efforts from what SciPy is doing, should we not be creating a new effort to do what SciPy has so far not delivered? These are not rhetorical or leading questions. I don't know enough about the motivations, intentions, and resources of the folks at Enthought (and elsewhere) to know the answer. I do think that such a fork will occur unless SciPy's approach changes substantially. The way to decide is for us all to discuss the question openly on these lists, and for those willing to participate and contribute effort to declare so openly. I think all that is needed, either to help SciPy or replace it, is some leadership in the direction outlined above.

I would be interested in hearing, perhaps from the folks at Enthought, alternative points of view. Why are there no packages for popular OSs for SciPy 0.2? Why are releases so infrequent? If the folks running the show at scipy.org disagree with many others on these lists, then perhaps those others would like to roll their own. Or, perhaps stable/testing/unstable releases of the whole package are in order.

HOW TO CONTRIBUTE?

Judging by the number of PhDs in sigs, there are a lot of researchers on this list. I'm one, and I know that our time for doing core development or providing the aforementioned leadership is very limited, if not zero. Later we will be in a much better position to contribute application software. However, there is a way we can contribute to the core effort even if we are not paid, and that is to put budget items in grant and project proposals to support the work of others. Those others could be either our own employees or subcontractors at places like Enthought or STScI. A handful of contributors would be all we'd need to support someone to produce OS packages and tutorial documentation (the stuff core developers find boring) for two releases a year.

--jh-- |
From: Kuzminski, S. R <SKu...@fa...> - 2004-01-21 13:05:38
|
I'm working on a commercial product that produces publication quality plots from data contained in Numeric arrays. I also concluded that Chaco was a bit more involved than I needed. My question is what requirements are not met by the other available plotting packages such as..

http://matplotlib.sourceforge.net/

These don't have every bell and whistle ( esp. when it comes to the interactive 'properties' dialog ) but as you point out there is a dark side to those features.

There are a number of quite capable plotting packages for Python, diversity is good up to a point, but this space ( Plotting packages ) seems ripe for a shakeout.

Stefan

-----Original Message-----
From: num...@li... [mailto:num...@li...] On Behalf Of Perry Greenfield
Sent: Tuesday, January 20, 2004 7:02 PM
To: Chris Barker
Cc: num...@li...
Subject: Re: [Numpy-discussion] Status of Numeric (and plotting in particular)

_______________________________________________
Numpy-discussion mailing list
Num...@li...
https://lists.sourceforge.net/lists/listinfo/numpy-discussion |
From: Perry G. <pe...@st...> - 2004-01-21 03:03:21
|
On Tuesday, January 20, 2004, at 06:08 PM, Chris Barker wrote:

>> Perry Greenfield writes:
>
>> > It has been our intention to port scipy to use numarray soon. This
>> > work has been delayed somewhat since our current focus is on
>> > plotting.
>
> That is good news. What plotting package are you working on? Last I
> heard Chaco had turned into Enthought's (and STScI) in-house Windows
> only package. (Not because they want it that way, but because they
> don't have funding to make it work on other platforms, and support the
> broader community).
>
> I don't see anything new on the SciPy page after August '03.
>
> Frankly, weak plotting is a bigger deal to me than array performance.
>

Yes, I agree completely (and that is why we are giving plotting higher priority than scipy integration).

I really was hoping to raise this issue later, but I might as well address it since the Numeric/numarray issue has raised it indirectly.

Chaco had been the focus of our plotting efforts for more than a year. The effort started with our funding Enthought to begin the work. We had a number of requirements for a plotting package that weren't met by any existing package, and it didn't appear that any would be easily modified to our needs. The requirements we had (off the top of my head) included:

1) easy portability to graphics devices *and* different windowing systems.
2) it had to run on all major platforms including Solaris, Linux, Macs, and Windows.
3) the graphics had to be embeddable within gui widgets.
4) it had to allow cursor interactions, at least to the point of being able to read cursor positions from python.
5) it had to be open source and preferably not gpl (though the latter was probably not a show stopper for us)
6) It also had to be customizable to the point of being able to produce very high quality hardcopy plots suitable for publication.
7) object oriented plotting framework capable of sensible composition.
8) command line interface akin to that available in matlab or IDL to make producing quick interactive plots very, very easy.

Developing something that satisfies these is not at all trivial. In the process Enthought has expended much energy developing chaco, kiva and traits (and lately they are working on yet more extensions); easily much more of the effort has come from sources other than STScI. Kiva is the back end that presents a uniform api for different graphics devices. Traits handles many of the user interface issues for plot parameters, and the relationships of these parameters between plot components. Chaco is the higher level plotting software that provides the traditional plotting capabilities for 2-d data. Much has been invested in chaco.

It is with some regret that we (STScI) have concluded that chaco is not suitable for our needs and that we need to take a different approach (or at least give it a try). I'll take some space to explain why.

The short answer is that in the end we think it was too ambitious. We still aim to achieve the goals I listed above. The problem, we think, is that chaco was also tasked to try to achieve extra goals with regard to interactive capabilities that were, in the end, not really important to STScI and its community, but were important to Enthought (and presumably its clients, and the scipy community). More specifically, a lot of thought and work went into making many aspects of the plots interactively modifiable. That is, by clicking on various aspects of plots, one could bring up editors for the attributes of that plot element, such as color, line style, font, size, etc. Many other interactive aspects have been enhanced as well. Much recent work by Enthought is going into extending the capabilities even further by adding gui kinds of features (e.g., widgets of all sorts).

Unfortunately these capabilities have come at a price, namely complexity. We have found it difficult to track the ongoing changes to chaco and to become proficient enough to contribute significantly by adding capabilities we have needed. Perhaps that argues that we aren't competent to do so. To a certain degree, that probably is true. There is no doubt that Enthought has some very talented software engineers working on chaco and related products. On the other hand, our goal is to have this software be accessible by scientists in general, and particularly astronomers. Chaco is complex enough that we think that is a serious problem. Customizing its behavior requires a very large investment of time understanding how it works, far beyond what most astronomers are willing to tackle (at least that's my impression).

Much of this complexity (and many of its ongoing changes) is to support the interactive capabilities, and to make it responsive enough that plots can update themselves quickly enough not to lead to annoying lags. But frankly, we just want something to render plots on the screen and on hardcopy. Outside of being able to obtain cursor coordinates, we find many of the interactive capabilities secondary in importance. When most astronomers want to tune a plot (either for publication quality, or for batch processing), they usually want to be able to reproduce the adjustments for new data, for which the interactive attribute editing capability is of little use. Generally they would like to script the more customized plots so that they can be easily modified and reused.

So it seems that it is too difficult to accomplish all these aims within one package. We would like to develop a different plotting package (using many of the ideas from chaco, and some code) based on kiva and the traits package. We have started on this over the past month, and hope to have some simple functionality available within a month (though when we make it public may take a bit longer). It will be open source and we hope significantly simpler than chaco. It will not focus on speed (well, we want fairly fast display times for plots of a reasonable number of points, but we don't need video refresh rates). If your interest in plotting matches ours, then this may be for you. We will welcome contributions and comments once we get it off the ground. (We are calling it pyxis by the way).

Enthought is continuing to work on chaco and at some point that will be mature, and will be capable of some sophisticated things. That may be more appropriate for some than what we are working on.

Perry Greenfield |
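The message above describes Kiva as a back end that presents a uniform api for different graphics devices, and says astronomers prefer scripted plots that can be rerun on new data. The sketch below illustrates that general idea only. It is not Kiva's, Chaco's, or pyxis's actual API: every class and function name here (`GraphicsBackend`, `SVGBackend`, `RecordingBackend`, `plot_polyline`) is hypothetical and invented for illustration.

```python
from abc import ABC, abstractmethod


class GraphicsBackend(ABC):
    """Hypothetical device-independent drawing interface (not Kiva's real API)."""

    @abstractmethod
    def line(self, x0, y0, x1, y1):
        """Draw one straight segment in device coordinates."""

    @abstractmethod
    def finish(self):
        """Return whatever the device produces (a document, a command list, ...)."""


class SVGBackend(GraphicsBackend):
    """Hardcopy-style device: collects segments and emits an SVG string."""

    def __init__(self, width, height):
        self.width, self.height = width, height
        self.parts = []

    def line(self, x0, y0, x1, y1):
        self.parts.append(
            f'<line x1="{x0}" y1="{y0}" x2="{x1}" y2="{y1}" stroke="black"/>'
        )

    def finish(self):
        body = "".join(self.parts)
        return (
            f'<svg xmlns="http://www.w3.org/2000/svg" '
            f'width="{self.width}" height="{self.height}">{body}</svg>'
        )


class RecordingBackend(GraphicsBackend):
    """Keeps drawing commands as tuples, e.g. for tests or later replay."""

    def __init__(self):
        self.commands = []

    def line(self, x0, y0, x1, y1):
        self.commands.append(("line", x0, y0, x1, y1))

    def finish(self):
        return list(self.commands)


def plot_polyline(backend, xs, ys):
    """Scripted plot: the same call works unchanged with any backend."""
    points = list(zip(xs, ys))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        backend.line(x0, y0, x1, y1)
    return backend.finish()


if __name__ == "__main__":
    print(plot_polyline(SVGBackend(100, 100), [0, 50, 100], [0, 80, 20]))
```

Because the plotting function only talks to the abstract interface, rerunning the same script on new data, or sending it to a different device, does not require touching the plot code. That is roughly the property requirements 1 and 6 seem to ask for.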
From: Andrew P. L. Jr. <bs...@al...> - 2004-01-21 02:52:17
|
On Mon, 19 Jan 2004, Travis Oliphant wrote: > ... Ultimately I think it will be a wise thing to have two > implementations of arrays: one that is fast and lightweight optimized > for many relatively small arrays, and another that is optimized for > large-scale arrays. I am *extremely* interested in the use case of the small arrays in SciPy. Which algorithms and modules are dominated by the small array speed? -a |
From: Perry G. <pe...@st...> - 2004-01-21 01:45:22
|
On Tuesday, January 20, 2004, at 05:54 PM, Ray Schumacher wrote: > With a cross-over at ~2000 elements, can we safely say that working > with video, FITS cubes or other similar imagery would be fastest with > numarray for summing or dividing 2D arrays? (~920K elements) > As long as you treat the array as a whole, I'd say that usually numarray would be better suited. That doesn't mean you won't find some instances where it is slower for certain operations. (When you do, let us know). Perry Greenfield |
From: Perry G. <pe...@st...> - 2004-01-21 01:42:52
|
On Tuesday, January 20, 2004, at 05:53 PM, Marcel Oliver wrote:

> That this discussion is happening NOW really surprises me. I have
> been following this list for a couple of years now, with the intention
> of eventually using numerical Python as the main teaching toolbox for
> numerical analysis, and possibly for the migration of small research
> codes as well.
>
> The possibility of doing numerics in Python has always intrigued me.
> Right now I am primarily using Matlab. It's very powerful, but not
> free and the language is horrible; Octave is trying to play catch up
> but has mostly lost steam. So a good scientific Python environment
> (of any sort) would be a really cool thing to have.
>
> However, two things have always held me back (apart from coding small
> examples on a few occasions):
>
> 1. Numerical Python has been in a limbo for too long (I had even
>    assumed a few times that both Numeric and Numarray were dead for
>    all practical purposes). If there are two incompatible versions for
>

I don't know why you assumed that. Both have been updated more than once in the past two years.

>    years and no clear indication where the whole thing is going, I am
>    very hesitant to invest any time into writing substantial code, or
>    recommend it for classroom use.
>

That's your right of course. You have to remember that neither we (STScI) nor Enthought (who has funded virtually all the scipy work) are getting paid to do the work we are doing for the general community. In our case, we do much of it for our own purposes, and it would certainly be to our advantage if numarray were adopted by the general community, so we invest resources in it. If you don't feel it is ready for your purposes, don't use numarray (or Numeric). We have only so many resources and while we wish we could do everything immediately, we can't. We are committed to making Python a good scientific environment, but we don't promise that it has everything that everyone would need now (and it certainly doesn't).

> 2. Plotting is a major issue. There are a couple of semi-functional
>    packages, but neither a comprehensive solution nor a clear
>    direction for the plotting architecture.
>

I agree completely. A later (tonight) message will discuss the current situation at more length.

Perry Greenfield |
From: Perry G. <pe...@st...> - 2004-01-21 01:31:48
|
On Tuesday, January 20, 2004, at 05:18 PM, Colin J. Williams wrote:

> Travis Oliphant wrote:
>
>>
>> Numarray is making great progress and is quite usable for many
>> purposes. An idea that was championed by some is that the Numeric
>> code base would stay static and be replaced entirely by Numarray.
>
> It was my impression that this idea had been generally accepted. It
> was not just one of the proposals under discussion.
>

I don't think there was ever any formal vote. I think Paul Dubois had accepted the idea; others had a more "wait and see" attitude. Realistically, I think one can safely say that, as one might expect, those that already were using Numeric probably were happy with its capabilities and that, given normal motivations, there would be significant inertia on the part of well established users (those with a lot of code already) to switch over. But since it wasn't quite as usable for our needs, we decided that we needed a new version. We had to develop it to support our needs and would have done it regardless. We hoped that it would be suitable for all uses, and we've tried to involve all in the process as much as possible. As you might expect, we've devoted most of our attention to meeting our needs, but we have also expended significant energy trying to meet the needs of the more general community (and we will continue to try to do so within our resources).

I don't know if it is reasonable to expect that a certain outcome has been blessed by all, nor did most of the existing Numeric users ask us to do this. But many did recognize (as Paul Dubois alluded to) that there was a need to recode the array stuff. Maybe someone could have done a better job of it, but no one else has yet (it is a fair amount of work after all). We do intend to support all the important packages that Numeric does; it may take some time to get there. I suppose our goal is to eventually attract all new users. We can't expect, nor should we, that existing Numeric users will switch at our desire or whim.

> I wonder how many others out there had assumed that, in spite of
> current speed problems, numarray was the way for the future, and had
> based their development endeavours on numarray. I did.
>
> To this relative outsider, there seem to have been three groups
> involved in efforts to provide Python with numerical array
> capabilities, those connected with Numeric, SciPy and numarray. SciPy
> would appear to be the most recent addition to the list.
>

Actually, I think it would be more accurate to say that SciPy is an attempt to collect a large base of numeric code and integrate it into an array package (currently Numeric) rather than to develop a new array package. It was started before we started numarray and thus was centered around Numeric. They have found occasions to modify and extend Numeric behavior. In that sense, it long has been somewhat incompatible with Numeric. (Travis can correct me if I got that wrong.)

> Is there any way that some agreement between these groups can be
> achieved to restore the hope for a common development path?
>

I would certainly like to, and in any case, we want to adapt scipy to be compatible with numarray.

Perry Greenfield |
From: Perry G. <pe...@st...> - 2004-01-21 01:04:50
|
> David M. Cooke writes: > > 10000 times per size. I'm re-running it like you suggested, but the > difference is small (the new version is up on the above page). For > numarray for addition, it's now > 3.8771e-5 + 4.9832e-9 * N > Well, OK we'll have to look into that. That's different by a factor of 3 or so than what I expected. I'll see if I can find what that is due to. Perry |
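The timings traded in this exchange have the form overhead + per_element * N. A minimal sketch of how such numbers might be measured and fitted is given below, with one untimed warm-up call so that cache effects on the first call are excluded. This is an assumption about method, not David Cooke's actual benchmark script: the helper names `time_per_call` and `fit_line` are made up here, and a plain Python list stands in for a Numeric or numarray array.

```python
import timeit


def time_per_call(func, repeats=1000):
    """Average seconds per call, measured after one untimed warm-up call."""
    func()  # warm-up: discard the first call, which may be slowed by cold caches
    return timeit.timeit(func, number=repeats) / repeats


def fit_line(sizes, times):
    """Ordinary least-squares fit of times = overhead + per_element * size."""
    n = len(sizes)
    mean_x = sum(sizes) / n
    mean_y = sum(times) / n
    sxx = sum((x - mean_x) ** 2 for x in sizes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sizes, times))
    per_element = sxy / sxx
    overhead = mean_y - per_element * mean_x
    return overhead, per_element


if __name__ == "__main__":
    sizes = [10, 100, 1000, 10000, 100000]
    times = []
    for n in sizes:
        data = [float(i) for i in range(n)]
        # Stand-in workload: element-wise a + a on a plain Python list.
        times.append(time_per_call(lambda: [x + x for x in data], repeats=50))
    overhead, per_element = fit_line(sizes, times)
    print(f"overhead ~ {overhead:.3e} s, per element ~ {per_element:.3e} s")
```

The intercept of the fit corresponds to the fixed per-call cost and the slope to the per-element cost, which are the two numbers being compared for Numeric and numarray in this thread.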
From: Ray S. <ra...@bl...> - 2004-01-21 00:02:38
|
Hi Marcel, At 12:07 AM 1/21/2004 +0100, you wrote: ><snip> >Are you saying you have found that you have reinvented the wheel? >That's exactly what I suspect happening a lot... I'm sure a lot of people have written little plot utilities because of the size of Chaco and similar packages, or difficulty integrating with their favorite GUI or module. Ray |
From: Chris B. <Chr...@no...> - 2004-01-20 23:09:31
|
> Perry Greenfield writes:

> > It has been our intention to port scipy to use numarray soon. This
> > work has been delayed somewhat since our current focus is on
> > plotting.

That is good news. What plotting package are you working on? Last I heard Chaco had turned into Enthought's (and STScI) in-house Windows only package. (Not because they want it that way, but because they don't have funding to make it work on other platforms, and support the broader community).

I don't see anything new on the SciPy page after August '03.

Frankly, weak plotting is a bigger deal to me than array performance.

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception

Chr...@no... |
From: David M. C. <co...@ph...> - 2004-01-20 23:04:11
|
On Tuesday 20 January 2004 17:54, Ray Schumacher wrote: > With a cross-over at ~2000 elements, can we safely say that working with > video, FITS cubes or other similar imagery would be fastest with numarray > for summing or dividing 2D arrays? (~920K elements) My benchmark was for 1-D arrays, but checking 2-D shows the crossover is in the same region. I'd say for these types of applications you really want to use numarray. -- |>|\/|< /--------------------------------------------------------------------------\ |David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/ |co...@ph... |
From: Ray S. <ra...@bl...> - 2004-01-20 22:55:55
|
With a cross-over at ~2000 elements, can we safely say that working with video, FITS cubes or other similar imagery would be fastest with numarray for summing or dividing 2D arrays? (~920K elements) Ray http://rjs.org/astro |
From: Marcel O. <m.o...@iu...> - 2004-01-20 22:51:38
|
Perry Greenfield writes:

> Peter J. Verveer writes:
>
> > I was under the impression that Numarray was intended to be a
> > replacement for Numeric, also as a building block for larger
> > packages such as SciPy. Was Numarray not intended to be an
> > "improved Numeric" in the first place? I chose to develop for
> > Numarray rather than Numeric because of its improvements, under
> > the assumption that eventually my code would also become
> > available to the users of such packages as SciPy. (I wrote the
> > nd_image extension that is now distributed with Numarray. I also
> > contributed some improvements to RandomArray extension that are
> > not in the Numeric version.)
> >
> It has been our intention to port scipy to use numarray soon. This
> work has been delayed somewhat since our current focus is on
> plotting. We do still intend to see that scipy works with numarray.

That this discussion is happening NOW really surprises me. I have been following this list for a couple of years now, with the intention of eventually using numerical Python as the main teaching toolbox for numerical analysis, and possibly for the migration of small research codes as well.

The possibility of doing numerics in Python has always intrigued me. Right now I am primarily using Matlab. It's very powerful, but not free and the language is horrible; Octave is trying to play catch up but has mostly lost steam. So a good scientific Python environment (of any sort) would be a really cool thing to have.

However, two things have always held me back (apart from coding small examples on a few occasions):

1. Numerical Python has been in a limbo for too long (I had even assumed a few times that both Numeric and Numarray were dead for all practical purposes). If there are two incompatible versions for years and no clear indication where the whole thing is going, I am very hesitant to invest any time into writing substantial code, or recommend it for classroom use.

2. Plotting is a major issue. There are a couple of semi-functional packages, but neither a comprehensive solution nor a clear direction for the plotting architecture.

In short, I see a lot of potential, unused mainly because the numerical Python community seems to lack clear direction and leadership. This is a real showstopper for someone who is primarily interested in building on top.

I am still hopeful that something will come of all this - any progress will be very much appreciated.

Best regards,
Marcel

---------------------------------------------------------------------
Marcel Oliver                          Phone: +49-421-200-3212
School of Engineering and Science      Fax:   +49-421-200-3103
International University Bremen        m.o...@iu...
Campus Ring 1                          ol...@me...
28759 Bremen, Germany                  http://math.iu-bremen.de/oliver
--------------------------------------------------------------------- |
From: Colin J. W. <cj...@sy...> - 2004-01-20 22:19:14
|
Travis Oliphant wrote:

>
> Numarray is making great progress and is quite usable for many
> purposes. An idea that was championed by some is that the Numeric
> code base would stay static and be replaced entirely by Numarray.

It was my impression that this idea had been generally accepted. It was not just one of the proposals under discussion.

I wonder how many others out there had assumed that, in spite of current speed problems, numarray was the way for the future, and had based their development endeavours on numarray. I did.

To this relative outsider, there seem to have been three groups involved in efforts to provide Python with numerical array capabilities, those connected with Numeric, SciPy and numarray. SciPy would appear to be the most recent addition to the list.

Is there any way that some agreement between these groups can be achieved to restore the hope for a common development path?

This message from Travis Oliphant seems to envisage two paths. Is this the better way to go?

>
> However, Numeric is currently used in a large installed base. In
> particular SciPy uses Numeric as its core array. While no doubt
> numarray arrays will be supported in the future, the speed of the less
> bulky Numeric arrays and the typical case that we encounter in SciPy
> of many, small arrays will make it difficult for people to abandon
> Numeric entirely with its comparatively light-weight arrays.
>
> In the development of SciPy we have encountered issues in Numeric that
> we feel need to be fixed. As this has become an important path to
> success of several projects (both commercial and open) it is
> absolutely necessary that these issues be addressed.
>
>
> The purpose of this email is to assess the attitude of the community
> regarding how these changes to Numeric should be accomplished.
> These are the two options we can see:
> * freeze old Numeric 23.x and make all changes to Numeric 24.x still
>   keeping Numeric separate from SciPy
> * freeze old Numeric 23.x and subsume Numeric into SciPy essentially
>   creating a new SciPy arrayobject that is fast and lightweight.
>   Anybody wanting this new array object would get it by installing
>   scipy_base. Numeric would never change in the future but the array in
>   scipy_base would.
>
> It is not an option to wait for numarray to get fast enough as these
> issues need to be addressed now. Ultimately I think it will be a wise
> thing to have two implementations of arrays: one that is fast and
> lightweight optimized for many relatively small arrays, and another
> that is optimized for large-scale arrays. Eventually, the use of
> these two underlying implementations should be automatic and invisible
> to the user.

Is this "automatic and invisible" practicable, except for trivial examples?

>
> A few of the particular changes we need to make to the Numeric
> arrayobject are:
>
> 1) change the coercion model to reflect Numarray's choice and
>    eliminate the savespace crutch.
> 2) Add indexing capability to Numeric arrays (similar to Numarray's)
> 3) Improve the interaction between Numeric arrays and scalars.
> 4) Optimization:
>
> Again, these changes are going to be made to some form of the Numeric
> arrays. What I am really interested in knowing is the attitude of the
> community towards keeping Numeric around. If most of the community
> wants to see Numeric go away then we will be forced to bring the
> Numeric array under the SciPy code-base and own it there.
>
> Your feedback is welcome and appreciated.
> Sincerely,
>
> Travis Oliphant and other SciPy developers
>

I hope that some cooperative approach can be devised.

Colin W. |
From: David M. C. <co...@ph...> - 2004-01-20 21:04:29
|
On Tue, Jan 20, 2004 at 03:38:34PM -0500, Perry Greenfield wrote:
> David M. Cooke writes:
>
> > Just what I was doing :-)
> >
> > Check out http://arbutus.mcmaster.ca/dmc/numpy/ for a graph comparing
> > the two.
> >
> > Basically, I get on my machine (a 1.3 GHz Athlon running Linux), for an
> > array of size N (of Float), the time to do a+a is
> >
> > Numeric: 3.7940e-6 + 2.2556e-8 * N seconds
> > numarray: 3.7062e-5 + 5.8497e-9 * N
> >
> > For sin(a),
> > Numeric: 1.7824e-6 + 1.1341e-7 * N
> > numarray: 2.8994e-5 + 9.8985e-8 * N
...
> How many times do you do the operation for each size? Because of
> caching, the first result may be much slower than the rest.
> If you didn't, could you try computing it by discarding the first
> numarray time (or start timing after doing the first iteration)?

10000 times per size. I'm re-running it like you suggested, but the difference is small (the new version is up on the above page). For numarray for addition, it's now

3.8771e-5 + 4.9832e-9 * N

--
|>|\/|<
/--------------------------------------------------------------------------\
|David M. Cooke http://arbutus.physics.mcmaster.ca/dmc/
|co...@ph... |
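The measurement procedure discussed in this exchange (many repetitions per array size, with the first iteration left out of the timing) could be sketched roughly as follows. This is only an illustration and not the script used for the numbers above: the helper name `time_per_call` and its parameters are hypothetical, and it uses only the standard library so that it runs without Numeric or numarray installed.

```python
import time


def time_per_call(op, repeats=10000, warmup=1):
    """Return mean wall-clock seconds per call of op(), excluding warm-up calls.

    `op` is any zero-argument callable; `repeats` and `warmup` are
    illustrative parameter names, not taken from the thread.
    """
    for _ in range(warmup):
        op()  # discarded: the first call may pay one-time setup or cache-fill costs
    start = time.perf_counter()
    for _ in range(repeats):
        op()
    return (time.perf_counter() - start) / repeats
```

For example, `time_per_call(lambda: [x + x for x in data], repeats=1000)` would time an element-wise addition over a plain list `data`; repeating this for several sizes N gives the points to which a line of the form overhead + slope * N can be fitted.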
From: Perry G. <pe...@st...> - 2004-01-20 20:41:40
|
Peter J. Verveer writes:
> I was under the impression that Numarray was intended to be a
> replacement for
> Numeric, also as a building block for larger packages such as SciPy. Was
> Numarray not intended to be an "improved Numeric" in the first
> place? I chose
> to develop for Numarray rather than Numeric because of its
> improvements, under
> the assumption that eventually my code would also become available to the
> users of such packages as SciPy. (I wrote the nd_image extension
> that is now
> distributed with Numarray. I also contributed some improvements
> to RandomArray
> extension that are not in the Numeric version.)
>
It has been our intention to port scipy to use numarray soon. This work has been delayed somewhat since our current focus is on plotting. We do still intend to see that scipy works with numarray.

Perry |
From: Perry G. <pe...@st...> - 2004-01-20 20:38:38
|
David M. Cooke writes:
> Just what I was doing :-)
>
> Check out http://arbutus.mcmaster.ca/dmc/numpy/ for a graph comparing
> the two.
>
> Basically, I get on my machine (a 1.3 GHz Athlon running Linux), for an
> array of size N (of Float), the time to do a+a is
>
> Numeric: 3.7940e-6 + 2.2556e-8 * N seconds
> numarray: 3.7062e-5 + 5.8497e-9 * N
>
> For sin(a),
> Numeric: 1.7824e-6 + 1.1341e-7 * N
> numarray: 2.8994e-5 + 9.8985e-8 * N
>
> So the slowness of numarray vs. Numeric for small arrays is because of
> an overhead of 3.7e-5 s for numarray, as opposed to 3.8e-6 s for
> Numeric. Otherwise, numarray is 4 times faster for large arrays
> for addition (and multiplication, which I've also checked).
>
> The crossover is at arrays of about 2000 elements.
>
> If this overhead could be reduced by a factor of 3 or 4, I'd be much
> happier with using numarray for small arrays. But for now, it's not
> good enough.
>
How many times do you do the operation for each size? Because of caching, the first result may be much slower than the rest. If you didn't, could you try computing it by discarding the first numarray time (or start timing after doing the first iteration)?

Thanks, Perry |
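The crossover size quoted in this message follows from the two fitted lines for a+a: setting overhead + slope * N equal for both packages and solving for N. A minimal sketch of that arithmetic, using the coefficients from the message (the function and constant names are made up for illustration):

```python
def crossover(a_fixed, a_per_elem, b_fixed, b_per_elem):
    """Array size N at which a_fixed + a_per_elem*N equals b_fixed + b_per_elem*N."""
    return (b_fixed - a_fixed) / (a_per_elem - b_per_elem)


# (fixed overhead in seconds, seconds per element) for a+a, as quoted above
NUMERIC_ADD = (3.7940e-6, 2.2556e-8)
NUMARRAY_ADD = (3.7062e-5, 5.8497e-9)

n_star = crossover(*NUMERIC_ADD, *NUMARRAY_ADD)
large_n_ratio = NUMERIC_ADD[1] / NUMARRAY_ADD[1]   # per-element cost ratio for large N
overhead_ratio = NUMARRAY_ADD[0] / NUMERIC_ADD[0]  # fixed-cost ratio for tiny arrays

print(f"crossover at N = {n_star:.0f}")
print(f"Numeric/numarray per-element cost: {large_n_ratio:.2f}")
print(f"numarray/Numeric fixed overhead:   {overhead_ratio:.2f}")
```

With these coefficients the crossover comes out just under 2000 elements and the per-element ratio just under 4, consistent with the figures stated in the message; the fixed overhead differs by roughly a factor of 10.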
From: <ve...@em...> - 2004-01-20 20:37:07
|
Just my 2 cents on the issue of replacing Numeric with Numarray:

I was under the impression that Numarray was intended to be a replacement for Numeric, also as a building block for larger packages such as SciPy. Was Numarray not intended to be an "improved Numeric" in the first place? I chose to develop for Numarray rather than Numeric because of its improvements, under the assumption that eventually my code would also become available to the users of such packages as SciPy. (I wrote the nd_image extension that is now distributed with Numarray. I also contributed some improvements to the RandomArray extension that are not in the Numeric version.)

I believe that it would be a bad situation if the numerical Python community were split between two different array packages. (I think Paul Dubois expressed a similar sentiment on comp.lang.python.) Supporting code for two incompatible packages would be a pain (I am personally not willing to do that). Not being able to use modules designed for one package in the other would be disappointing for many people, I think...

If I understood correctly, the only issue with Numarray seems to be that the speed for handling small arrays is too low. So would it not be more efficient to focus on that problem rather than throwing away all the excellent work that has been done already on Numarray?

Best regards, Peter

--
Dr. Peter J. Verveer
Cell Biology and Cell Biophysics Programme
European Molecular Biology Laboratory
Meyerhofstrasse 1
D-69117 Heidelberg
Germany |
From: Perry G. <pe...@st...> - 2004-01-20 20:32:47
|
> -----Original Message-----
> From: num...@li...
> [mailto:num...@li...]On Behalf Of Edward
> C. Jones
> Sent: Tuesday, January 20, 2004 2:42 PM
> To: num...@li...
> Subject: [Numpy-discussion] How fast are small arrays currently?
>
>
> Has anyone recently benchmarked the speed of numarray vs. Numeric?
>
We presented some benchmarks at scipy 2003. It depends on many factors and on what functions or operations are being performed, so it is hard to generalize (one reason I ask for specific cases that need improvement). But to take ufuncs as examples, the relative speeds for 1-element arrays (about as small as they get) are:

                                  v0.4   v0.5
Int32 + Int32                       65    3.7
Int32 + Int32 discontiguous        104    7.3
Int32 + Float64                     95    4.9
add.reduce(Int32) NxN swapaxes     111    3.6
add.reduce(Int32, -1) NxN           98    3.2

What is shown is (time for numarray operation)/(time for Numeric), for v0.4 and v0.5. Note that with v0.5, numarray is typically 3 to 4 times slower for small arrays, with a couple of cases somewhat worse (factors of 4.9 and 7.3). v0.4 is substantially slower (orders of magnitude).

Note that the speedup is obtained through caching certain information. The first time you perform a certain operation (say an Int32/Int16 add), it will be slow. When repeated, it will be closer to the benchmark shown. If you are only going to do one operation on a small array, speed presumably doesn't matter much. It is usually an issue only when you plan to iterate over many small arrays.

Other functions may be much worse (or better). If people let us know which things are too slow, we can put them on our to-do list. Is a factor of 3 or 4 times slower a killer? What about a factor of 2?

> Why are numarrays so slow to create?
>
I'll leave it to Todd to give the details of that. |