From: Peter G. <pgr...@ge...> - 2004-02-11 19:17:09
Hello: We will be dealing with large data sets (> 100,000 points, in some instances as big as 500,000). They are to be plotted, and I would like to use matplotlib. I did a few preliminary tests, and it seems that plotting that many pairs is a little too much for the system to handle.

Currently we use gnuplot (as a backend to some other software) for this plotting. It seems to be "lightning-fast", but I suspect (I may be wrong!) that it reduces the data before plotting and only selects every nth point. I have to go through the code that calls it to be certain. I would imagine that it is not necessary to plot every one of 100,000 points to produce a page-size plot, but I'm not sure that simply grabbing every nth point is the best way to go about reducing the data.

So my question is to anyone else out there who is also dealing with these large (and very large) data sets: What do you do? Are there library routines you use to massage the data before plotting? Are there ways (i.e. flags to set) to optimize this in matplotlib? Any other software you use?

I should note that I use the GD backend and pipe the output to stdout for a CGI script to pick up.

Thanks.

--
Peter Groszkowski            Gemini Observatory
Tel: +1 808 974-2509         670 N. A'ohoku Place
Fax: +1 808 935-9235         Hilo, Hawai'i 96720, USA
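P.S. To make concrete the kind of reduction I mean, here is a rough sketch (plain Python with numpy, assuming x and y are numpy arrays; the function names are just my own, not anything from matplotlib or gnuplot) of plain every-nth-point striding next to a min/max-per-bin variant that keeps narrow spikes which plain striding can miss:

    import numpy as np

    def every_nth(x, y, max_points=2000):
        # The naive reduction: keep every nth point so that at most
        # about max_points remain.  Fast, but narrow spikes can vanish.
        step = max(1, len(x) // max_points)
        return x[::step], y[::step]

    def minmax_reduce(x, y, n_bins=1000):
        # Split the data into n_bins runs of consecutive points and keep
        # the minimum and maximum y from each run, so spikes survive.
        if len(x) <= 2 * n_bins:
            return x, y
        bin_width = len(x) // n_bins
        n = bin_width * n_bins                # drop the ragged tail
        yb = y[:n].reshape(n_bins, bin_width)
        offsets = np.arange(n_bins) * bin_width
        lo = offsets + yb.argmin(axis=1)      # index of each bin's minimum
        hi = offsets + yb.argmax(axis=1)      # index of each bin's maximum
        idx = np.sort(np.concatenate((lo, hi)))
        return x[idx], y[idx]

Something like xr, yr = minmax_reduce(x, y) would turn 500,000 points into about 2,000 before they ever reach the plot call. Is this roughly what people do, or is there a better-established approach?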