From: Stef T. <st...@um...> - 2009-07-22 21:18:57
|
Oleg Broytmann wrote: > On Wed, Jul 22, 2009 at 03:03:52PM -0400, Stef Telford wrote: > >> surely, please find the output attached at the bottom. >> > > Thank you! > > No, Honestly, Thank you :) >> Ordered by: cumulative time >> >> ncalls tottime percall cumtime percall filename:lineno(function) >> 1 0.127 0.127 83.346 83.346 sresults.py:175(__iter__) >> 40001 0.308 0.000 79.876 0.002 dbconnection.py:649(next) >> > > Hmm. Does the program try to draw the entire table in one huge slurp? > > Actually, no. This page does 40k of bookings (a single object type) but the query isn't "Sequential" (it's not grabbing 0-40,000 is what I mean). I can do a page which has lots of different object types if this would help ? >> 40000 0.375 0.000 39.282 0.001 main.py:872(get) >> 40002 10.018 0.000 39.252 0.001 {method 'fetchone' of 'psycopg2._psycopg.cursor' objects} >> > > Half of the time the program spends drawing rows one by one. This > probably could be optimized by using fetchmany or fetchall. > > Noted. Let me try this later tonight when I have some spare cycles :) >> 1840006 16.887 0.000 29.234 0.000 decimal.py:515(__new__) >> > > Third of the time - Decimal.__new__. There is something strange here (in > the profiler output itself, I mean) - those Decimal calls are probably > from DecimalCol.to_python, but the profiler didn't add those calls: > > Urm. hrm. that's weird. I would have thought that they would have been added if they were called. I have read in more than a few places, that Decimal instantiation is slow when compared to float or gmpy :\ >> 1200000 1.856 0.000 2.769 0.000 col.py:1289(to_python) >> 3362951 2.729 0.000 2.729 0.000 main.py:1673(instanceName) >> 640000 0.999 0.000 1.471 0.000 col.py:657(to_python) >> > > My guess is that those Decimal.__new__ calls are inside DB API driver, > and DecimalCol.to_python gets Decimal and returns it unchanged. This means > that the lines > > >> 40002 10.018 0.000 39.252 0.001 {method 'fetchone' of 'psycopg2._psycopg.cursor' objects} >> 1840006 16.887 0.000 29.234 0.000 decimal.py:515(__new__) >> > > should be read as follows: fetchone draws a row and converts values to > Decimal, so 29.2 s are really a part of 39.2, and fetchone only waited for > DB for 10 seconds. > > 10seconds to fetch from the database is not bad (in my view). The 29s for decimal is definitely 'killer' >> 40000 0.214 0.000 23.323 0.001 main.py:912(_init) >> 40000 10.475 0.000 23.069 0.001 main.py:1140(_SO_selectInit) >> > > _SO_selectInit is a rather simple function - it populates the row with > values just fetched from the DB; the values are converted using calls to > to_python. Those to_python calls are fast (if we ca believe the profiler) - > see above. So where does _SO_selectInit spend its time?! > > I wish I knew. The object in question -does- have about 40 or 50 columns on it (don't ask.. lots of feature creep). I wonder if perhaps the number of columns plays into the _init time ? I take it that the class overview is cached, so that it only has to be parsed 'once'.. but what about if it's a class through a FK or M2M ? Is SQLObject able to find the pre-parsed class (if that makes sense) Sorry about all this, but I really am sorta hitting my head here. I mean, I -could- change Decimals to gmpy or something inside SO, but floats with their (0.1 + 0.1 + 0.1 - 0.3) fiasco is pretty much a non-starter for me I think. Hrm. As I said, probably not being much help here :) Regards Stef |