From: Michael D. <md...@st...> - 2013-10-14 17:20:41
Sorry to repeat myself, but please reduce this to a short, self-contained example that is absolutely minimal to demonstrate the problem. http://sscce.org/ should help better explain what I'm after. I don't want to find the needle in the haystack here -- there is code in your example that doesn't even run, for example.

That said, are you really after creating a legend entry for each of the dots? (See below). That just isn't going to work, and I'm not surprised it eats up excessive amounts of memory. I think you want (and can) reduce this to a single scatter call.

    _series = [_ax1.scatter(_x, _y, color=_c, s=objsize, label=_l, hatch='.') for _x, _y, _c, _l in izip(mydata_x, mydata_y, colors, legends)] # returns PathCollection object

Mike

On 10/12/2013 12:57 PM, Martin MOKREJŠ wrote:
> Hi,
> so here is a quick but working example. I added 2-3 (unused) functions
> as a bonus; you can easily call them from the main function using the same API
> (except the piechart). I hope this shows what I lack in matplotlib - a general API
> so that I could easily switch from scatter plot to piechart or barchart without altering
> the function arguments much. Messing with return objects Line2D, PathCollection, Rectangle
> is awkward and I would like to stay away from matplotlib's internals. ;) Some can be sliced,
> some not, you will see in the code.
>
> This eatmem.py will easily take all your memory. Drawing 300000 dots is not feasible
> with 16GB of RAM. While the example is for sure inefficient in many places, generating the data
> in python does not eat RAM. That happens afterwards.
>
> I would really like to hear whether matplotlib could be adjusted instead. ;) I already mentioned
> in this thread that it is awkward to pre-create colors before passing all data to a drawing
> function. I think we could all save a lot if matplotlib could dynamically fetch colors
> on the fly from a user-created generator, same for legend descriptions. I think my example
> code shows the inefficient approach here. If I had more time, I would randomize
> the sublist of each series a bit more so that the numbers in legends would be more variable,
> but that is a cosmetic issue.
> Probably due to my ignorance, you will see that figures with legends have different font
> sizes, and that the axes and the figure are rescaled. Of course I wanted the drawing to be the same via both
> approaches but failed badly. The files/figures with legends should just be accompanied by the
> legend "table" underneath, but the drawing itself should be the same. Maybe an issue with DPI settings,
> but not only that.
>
> I placed some comments in the code, please don't take them personally. ;) Of course
> I am glad for the existing work and am happy to contribute my crap. I am fine if you revamp
> this ugly code into the matplotlib testsuite, or provide a similar function (the API mentioned above)
> so that I could use your code directly. That would be great. I just tried to show multiple
> issues at once, which is notably why I included those unused functions. You will for sure find
> a way to use them.
>
> Regarding the "unnecessary" del() calls etc., I think I have to keep some, Ben, because
> the function is not always left soon enough. I could drop some, you are right, but for some
> I don't think so. Matplotlib cannot recycle the memory until I (upstream) delete the reference
> so ... go and test this lousy code. Now you have a testcase. ;) Same with the gc.collect() calls.
> Actually, the main loop with 10 iterations is there just to show why I always want to clear
> a figure when entering a function and while leaving it as well. It happened too many times that
> I drew over an old figure, and this was also posted a few times on this list by others. That is
> weird behavior in my opinion. We, users, are just forced to use functions that are too low-level.
>
> So, have fun eating your memory! :))
> Martin

--
http://www.droettboom.com
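
For illustration, here is a rough sketch of what "a single scatter call" could look like. It is not taken from eatmem.py: the data is random, and the names `series_id`, `color_names` and `palette`, the five-series split, and the marker size are all made up for the example. It passes one color per point to a single `scatter` call and builds the legend from one proxy handle per series rather than one entry per dot.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import to_rgba_array
from matplotlib.lines import Line2D

# Hypothetical data: 300000 points, each assigned to one of 5 series.
rng = np.random.default_rng(0)
n_points, n_series = 300_000, 5
x = rng.random(n_points)
y = rng.random(n_points)
series_id = rng.integers(0, n_series, size=n_points)

# One RGBA row per series; indexing with series_id gives one color per point.
color_names = ["red", "green", "blue", "orange", "purple"]
palette = to_rgba_array(color_names)

fig, ax = plt.subplots()

# A single call creates a single PathCollection holding all points.
collection = ax.scatter(x, y, c=palette[series_id], s=4)

# One legend entry per series (not per dot), using empty proxy artists.
handles = [
    Line2D([], [], marker="o", linestyle="", color=name,
           label="series %d (%d points)" % (i, np.sum(series_id == i)))
    for i, name in enumerate(color_names)
]
legend = ax.legend(handles=handles, loc="upper right")

# Render to an in-memory buffer; pass a file name instead to write a PNG.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=100)
plt.close(fig)
```

With this layout the axes hold one artist and the legend has five rows, however many points are drawn. Whether that is enough to fix the memory use in the original script is something only a minimal test case can show.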