From: Joey W. <dou...@gm...> - 2009-12-16 22:45:44
|
Does anyone know the status of development for Matplotlib persistent figure saving? I would like to be able to save the figures from matplotlib in an editable form, without flattening down to an image file. The closest thing to this right now is the SVG output, but a native mpl format would be better. I need to be able to save the figure, so that later it can be loaded, edited, and re-saved. I know that this topic has been somewhat discussed in the past, but I believe it is desperately needed, so I thought I would bring it back up.

Let me say why I think this feature is so essential. Anyone who is in research or academia knows that figures often need to be edited when a publication comes back from peer review. It's already happened to me many times, and I've learned that I absolutely have to save my figures for later editing to save myself a lot of time. Some people have argued that a script that generates the plots/figures should be saved, and that if you need to edit the figure, just re-run the script. The problem with this argument is that scientific plots often take hours, days, or even weeks of computation to generate. For example, generating a bit-error-rate curve in communications takes days. Therefore, always re-running from a script is just not practical.

Now, I understand that resources are limited, so I would be willing to raise some money to get this feature added to Matplotlib. It's desperately needed by myself and many others in the community. I would really like to completely replace Matlab with Python, Scipy, and Matplotlib. MPL is an excellent tool, and it could be even more useful/professional with the addition of a figure save feature.

Any thoughts?

-Joey Wilson
span.ece.utah.edu/joey-wilson
|
From: Anne A. <per...@gm...> - 2009-12-17 17:42:25
|
2009/12/16 Joey Wilson <dou...@gm...>:
> Does anyone know the status of development for Matplotlib persistent figure
> saving? I would like to be able to save the figures from matplotlib in an
> editable form, without flattening down to an image file. The closest thing
> to this right now is the SVG output, but a native mpl format would be
> better. I need to be able to save the figure, so that later it can be
> loaded, edited, and re-saved. I know that this topic has been somewhat
> discussed in the past, but I believe it is desperately needed, so I thought
> I would bring it back up.
> Let me say why I think this feature is so essential. Anyone who is in
> research or academia knows that figures often need to be edited when a
> publication comes back from peer review. It's already happened to me many
> times, and I've learned that I absolutely have to save my figures for later
> editing to save myself a lot of time. Some people have argued that a script
> that generates the plots/figures should be saved, and that if you need to
> edit the figure, just re-run the script. The problem with this argument is
> that scientific plots often take hours, days, or even weeks of computation
> to generate. For example, generating a bit-error-rate curve in
> communications takes days. Therefore, always re-running from a script is
> just not practical.
> Now, I understand that resources are limited, so I would be willing to raise
> some money to get this feature added to Matplotlib. It's desperately needed
> by myself and many others in the community. I would really like to
> completely replace Matlab with Python,Scipy, and Matplotlib. MPL is an
> excellent tool, and it could be even more useful/professional with the
> addition of a figure save feature.
> Any thoughts?

Leaving entirely aside any question of persistence, do you find matplotlib plots to be modifiable in the ways you want?
I find that for anything beyond minor changes of axes, I end up rerunning my plotting command anyway. For example, I suppose it's possible to change a line on an existing plot from red to black, but I just rerun the plotting command. What about adding or removing error bars? Changing the number of bins, range, or starting position of your histogram? Plotting the square root instead of the logarithm of the image values? Removing bogus data points (or adding back in points you previously removed)? It seems to me that all of these things require me to keep the original data around.

Since that's the case, I usually generate my plots in one of two ways: either I just write a script that runs the calculation and generates the plot, or I write one script to generate the data and save it to disk, and another to plot the data from disk. The first approach is sometimes mildly annoying when a script is just a bit slow, but not slow enough to warrant saving the data to disk. In those cases, if I must, I can run the script under ipython and modify the plot, then save out the modifications to a script.

Now, if you want a very-low-effort way to save your data to disk, I agree that would be valuable to have, but there already are, in ascending order of complexity and power, the native numpy data format, pyfits, and pytables/pyhdf.

Anne
|
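A minimal sketch of the two-step workflow described above (one step computes and saves, the other loads and plots), assuming the expensive results fit in a small JSON file. The function names, the file layout, and the placeholder curve are all assumptions made for illustration; the numpy, pyfits, or pytables formats mentioned above would fill the same role for larger data.

```python
import json
from pathlib import Path


def compute_ber_curve():
    """Stand-in for the slow computation.

    The values are a placeholder formula, not real simulation results.
    """
    snr_db = [0, 2, 4, 6, 8]
    ber = [10 ** (-0.5 * s - 1) for s in snr_db]
    return {"snr_db": snr_db, "ber": ber}


def save_results(results, path):
    """Write the computed data to disk once, so plotting never recomputes it."""
    Path(path).write_text(json.dumps(results))


def load_results(path):
    """Read back the data written by save_results."""
    return json.loads(Path(path).read_text())


def plot_results(results, out_path, color="black"):
    """Cheap step: rerun this as often as the figure needs to change."""
    # Imported lazily so the compute/save half works without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # non-interactive backend, writes files only
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.semilogy(results["snr_db"], results["ber"], color=color, marker="o")
    ax.set_xlabel("SNR (dB)")
    ax.set_ylabel("Bit error rate")
    fig.savefig(out_path)
    plt.close(fig)
```

In use, `save_results(compute_ber_curve(), "ber.json")` would run once, and a change requested in review (say, a red line instead of a black one) only needs `plot_results(load_results("ber.json"), "ber.pdf", color="red")`.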
From: Ryan M. <rm...@gm...> - 2009-12-17 17:44:41
|
On Wed, Dec 16, 2009 at 4:45 PM, Joey Wilson <dou...@gm...> wrote:
> Let me say why I think this feature is so essential. Anyone who is in
> research or academia knows that figures often need to be edited when a
> publication comes back from peer review. It's already happened to me many
> times, and I've learned that I absolutely have to save my figures for later
> editing to save myself a lot of time. Some people have argued that a script
> that generates the plots/figures should be saved, and that if you need to
> edit the figure, just re-run the script. The problem with this argument is
> that scientific plots often take hours, days, or even weeks of computation
> to generate. For example, generating a bit-error-rate curve in
> communications takes days. Therefore, always re-running from a script is
> just not practical.

Ignoring the issue of having saved matplotlib figures, I'd argue you should separate the parts of the code that do computation from those that do plotting into separate scripts. Is there anything keeping you from saving all of the results from the computation into (for instance) a NetCDF file? Then the plotting script can just read in the file and do the plotting. This is exactly how my workflow is set up. I'd be happy to address any concerns you see with doing things this way.

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
|
From: Christopher B. <Chr...@no...> - 2009-12-18 05:03:59
|
Joey Wilson wrote:
> I would like to be able to save the figures from
> matplotlib in an editable form, without flattening down to an image
> file.

> Now, I understand that resources are limited, so I would be willing to
> raise some money to get this feature added to Matplotlib.

I think to do this right, you'd need to completely re-design MPL to be based on a more declarative structure: i.e. you'd define what the objects were in a figure, and MPL would generate the figure from the declaration -- much like how a drawing is generated from SVG. Maybe it's not as big a re-factor as I think it is, but it seems that MPL is built to be used from a scripting interface instead: a series of commands that builds the figure.

Honestly, for your purposes, I don't know that there is much difference. I suppose what you are looking for is a way to get a script that you could edit and re-run, but have it generated from a figure automatically; the figure itself could have been generated by a different script (intermeshed with computational code), or an interactive session, or....

That would be pretty cool, but I think a bit of re-factoring of your process would make it pretty easy to edit and re-run your scripts anyway.

> I
> would really like to completely replace Matlab with Python,Scipy, and
> Matplotlib.

How does Matlab handle this? In my Matlab days, I wrote scripts that generated my figures, and when I needed to change them, I edited the scripts and re-ran them -- exactly the workflow we're suggesting for Python/MPL. But that was 10 years ago...

-Chris

--
Christopher Barker, Ph.D.
Oceanographer

NOAA/OR&R/HAZMAT         (206) 526-6959   voice
7600 Sand Point Way NE   (206) 526-6329   fax
Seattle, WA  98115       (206) 526-6317   main reception
|
From: Jouni K. S. <jk...@ik...> - 2009-12-19 19:08:01
|
Christopher Barker <Chr...@no...> writes:

> Joey Wilson wrote:
>> I would like to be able to save the figures from
>> matplotlib in an editable form, without flattening down to an image
>> file.
>
> I think to do this right, you'd need to completely re-design MPL to be
> based on a more declarative structure: [...] but it seems that MPL is
> built to be used from a scripting interface instead: a series of
> commands that builds the figure.

I don't think that is the real problem. How matplotlib works is that the plotting commands build up a data structure consisting of various objects pointing to each other, and when you call show or savefig, the data structure is traversed and the appropriate backend functions are called. In principle it should not be very difficult to serialize this data structure, but since extension objects are involved, some work might be needed to get it right.

What I think is the really difficult part is keeping the serialized format somehow usable across different versions of matplotlib. When anything changes in the various classes, the developers would need to decide how the change is reflected in the on-disk format, how files corresponding to the old class should be read in using the new class, etc. Perhaps something like Google's protocol buffers could be used to make this easier, but it would still be a burden on all subsequent development.

--
Jouni K. Seppänen
http://www.iki.fi/jks
|
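To make the versioning concern above concrete, here is a toy sketch of what "reading an old file with the new class" tends to involve. It is not matplotlib code and does not reflect any real matplotlib file format: the JSON layout, the `version` field, and the rename of a `colour` key to `color` are all hypothetical, chosen only to show that every change to the stored structure needs its own migration step.

```python
import json

# Hypothetical format version written by the "current" release.
CURRENT_VERSION = 2


def _migrate_v1_to_v2(doc):
    """Hypothetical change: v1 stored 'colour', v2 renamed the key to 'color'."""
    lines = [{"color": ln["colour"], "xy": ln["xy"]} for ln in doc["lines"]]
    return {"version": 2, "lines": lines}


# One entry per past format change; this table only ever grows.
MIGRATIONS = {1: _migrate_v1_to_v2}


def dump_figure(lines):
    """Serialize a list of line descriptions in the current format."""
    return json.dumps({"version": CURRENT_VERSION, "lines": lines})


def load_figure(text):
    """Load any supported version, upgrading step by step to the current one."""
    doc = json.loads(text)
    while doc["version"] < CURRENT_VERSION:
        doc = MIGRATIONS[doc["version"]](doc)
    if doc["version"] != CURRENT_VERSION:
        raise ValueError("file was written by a newer, unknown format version")
    return doc["lines"]
```

Even in this toy, the loader has to carry one migration function per historical change; a real figure format would need one for every attribute of every serialized class, which is the maintenance cost described above.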
From: Eric F. <ef...@ha...> - 2009-12-19 20:03:53
|
Jouni K. Seppänen wrote:
> Christopher Barker <Chr...@no...> writes:
>
>> Joey Wilson wrote:
>>> I would like to be able to save the figures from
>>> matplotlib in an editable form, without flattening down to an image
>>> file.
>> I think to do this right, you'd need to completely re-design MPL to be
>> based on a more declarative structure: [...] but it seems that MPL is
>> built to be used from a scripting interface instead: a series of
>> commands that builds the figure.
>
> I don't think that is the real problem. How matplotlib works is that the
> plotting commands build up a data structure consisting of various
> objects pointing to each other, and when you call show or savefig, the
> data structure is traversed and the appropriate backend functions are
> called. In principle it should not be very difficult to serialize this
> data structure, but since extension objects are involved, some work
> might be needed to get it right.
>
> What I think is the really difficult part is keeping the serialized
> format somehow usable across different versions of matplotlib. When
> anything changes in the various classes, the developers would need to
> decide how the change is reflected in the on-disk format, how files
> corresponding to the old class should be read in using the new class,
> etc. Perhaps something like Google's protocol buffers could be used to
> make this easier, but it would still be an burden on all subsequent
> development.
>

Exactly. I *strongly* oppose any move in this direction. It would be enabling bad workflow strategy on the part of users, providing no benefit that cannot be achieved better with a good workflow strategy, and adding complexity. We have enough of that already. We need to think about how to clean up mpl and make it easier to maintain and improve, not clutter it with ever more complexity.

Eric
|
From: Ryan M. <rm...@gm...> - 2009-12-20 02:46:49
|
On Sat, Dec 19, 2009 at 2:03 PM, Eric Firing <ef...@ha...> wrote:
> Exactly. I *strongly* oppose any move in this direction. It would be
> enabling bad workflow strategy on the part of users, providing no
> benefit that cannot be achieved better with a good workflow strategy,
> and adding complexity. We have enough of that already. We need to
> think about how to clean up mpl and make it easier to maintain and
> improve, not clutter it with ever more complexity.

+1

That pretty much sums up how I feel.

Ryan

--
Ryan May
Graduate Research Assistant
School of Meteorology
University of Oklahoma
|