Thread: Re: [SQLObject] SQLObject Speed ?
SQLObject is a Python ORM.
From: Oleg B. <ph...@ph...> - 2009-07-22 18:03:55
On Wed, Jul 22, 2009 at 01:27:41PM -0400, Stef Telford wrote:
>>> Hello Oleg, Everyone,
>>> Sorry for the delay in replying.. "real life" sort of got in the
>>> way. Deepest apologies for that.
>>
>> No problem. I understand the issue and feel your pain, sure.
>>
> *chuckles* this is kind of the problem with OS projects. We use them
> almost 100% here at work but try to spend any -actual- time on improving
> them and.. well.. yes.

   Well, I have a lot of work, too; and another problem is that my
mother-in-law is very ill, so my wife and I have to sit beside her day
and night.

> actually.. I -did- remark them out. As I said, all the evals do (I
> -believe-) is to add the lambda into the python dictionary with a
> specific name for getattr to find

   Not for getattr - for property... but that's a minor technical detail.

> I have read in numerous places
> that python's eval is hysterically slow
[skip]
> yes. evals appear to be a 'bad' thing here :\

   Well, those evals are in the sqlmeta.addColumn and .addJoin methods,
so they run once for every column in the table, but that's all. After
the class has been created and populated - whatever you do with rows
(class instances) - those evals are not executed again.
   It could be that the functions the evals produce are slow - you've
commented out all those evals and functions together. That again
requires experimentation. And what is worse - those functions are hard
to "fix" without major architectural changes.

>> Perhaps those evals could be replaced with functools.partial
>>
> functools partial ? isn't this another name for decorators ?

   No, not at all - it is a purely functional approach to a completely
different problem.

>>> (ps. I -do- have a large PNG of the system during a single 'page
>>> impression'. It's about 7mb .. I can send it offlist if you're
>>> interested ?)
>>
>> A PNG of the system? What's that? I don't understand...
>>
> It's a PNG of the profiling of a single 'page'. I profiled it and then
> passed it through gprof2dot
> ( http://code.google.com/p/jrfonseca/wiki/Gprof2Dot ); let me know if
> you want it.. it -is- a VERY large PNG though :)

   Ah, I see - gprof. I have never used it, so I doubt I can understand
the results.

Oleg.
--
     Oleg Broytmann     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
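For readers unfamiliar with functools.partial: it binds arguments to an
existing function and returns a new callable - nothing to do with
decorators. A minimal sketch of how it could stand in for an eval-built
getter; SomeTable is a placeholder, and the _SO_getValue accessor name
is an assumption rather than a confirmed detail of SQLObject's code:

    from functools import partial

    def _get_column(instance, name):
        # replaces eval("lambda self: self._SO_getValue('%s')" % name);
        # _SO_getValue is assumed here as the underlying accessor's name
        return instance._SO_getValue(name)

    # a property's fget only needs to be a callable that takes the
    # instance, and a partial with the column name pre-bound is exactly
    # that - no eval, no string formatting:
    setattr(SomeTable, 'city', property(partial(_get_column, name='city')))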
From: Oleg B. <ph...@ph...> - 2009-07-22 18:46:24
On Wed, Jul 22, 2009 at 02:26:57PM -0400, Stef Telford wrote:
>>> yes. evals appear to be a 'bad' thing here :\
>>
>> Well, those evals are in sqlmeta.addColumn and .addJoin methods, so they
>> work once for every column in the table, but that's all. After the class
>> has been created and populated - whatever you do with rows (class
>> instances) - those evals are not executed.
>>
> Ah. hrm. *rubs chin* perhaps it's not the evals then. It seems that the
> instantiations get .. well .. 'slower' over time.

   Curiouser and curiouser. IWBN to find where the slowness is *near the
end* of the loop - i.e. when instantiation becomes really slow.

> I should stress that the amount of instantiations can reach into
> thousands easily for me. I know this may be a niche problem and for
> 10-30 objects the speed is good.

   Still, I'd like to resolve the issue to at least some extent.

> I have attached the python -m cProfile output, and limited the run to
> 'only' (?!) 40,000 objects. Hopefully this may make some more sense to
> you. Gprof is.. definitely.. urm.. quirky :)

> Ordered by: standard name
>
> ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>  40000    0.063    0.000    0.141    0.000 <string>:1(<lambda>)

   Can you sort by the ncalls and cumtime columns, cut the first 100-200
lines and post the two tables here?

>      1    0.000    0.000   85.863   85.863 <string>:1(<module>)

   This is AFAIU the entire script? :)

Oleg.
--
     Oleg Broytmann     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
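Producing the two sorted tables is easiest by saving the raw profile to
a file and sorting it with pstats; a sketch, with the script name as a
placeholder:

    import cProfile
    import pstats

    # profile the page script once and keep the raw stats
    cProfile.run("execfile('page.py')", 'page.prof')

    stats = pstats.Stats('page.prof')
    stats.sort_stats('calls').print_stats(200)       # table 1: by ncalls
    stats.sort_stats('cumulative').print_stats(200)  # table 2: by cumtime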
From: Stef T. <st...@um...> - 2009-07-22 19:04:09
Attachments:
cumulative.out
ncalls.out
Oleg Broytmann wrote:
> On Wed, Jul 22, 2009 at 02:26:57PM -0400, Stef Telford wrote:
>>>> yes. evals appear to be a 'bad' thing here :\
>>>
>>> Well, those evals are in sqlmeta.addColumn and .addJoin methods, so they
>>> work once for every column in the table, but that's all. After the class
>>> has been created and populated - whatever you do with rows (class
>>> instances) - those evals are not executed.
>>>
>> Ah. hrm. *rubs chin* perhaps it's not the evals then. It seems that the
>> instantiations get .. well .. 'slower' over time.
>
> Curiouser and curiouser. IWBN to find where the slowness is *near the
> end* of the loop - i.e. when instantiation becomes really slow.
>
It could be purely a 'feeling' .. I don't have any numbers to back it
up, and I am not entirely sure -how- to benchmark that. This machine
does have 8gb of ram in it, and a fairly beefy quad core. I have never
seen any process get near memory exhaustion; I could believe the calls
to 'malloc' could slow down but.. yes.

>> Ordered by: standard name
>>
>> ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>  40000    0.063    0.000    0.141    0.000 <string>:1(<lambda>)
>
> Can you sort by ncalls and cumtime columns, cut the first 100-200 lines
> and post two tables here?
>
Surely, please find the output attached at the bottom. It may also be
worth changing the Decimal call to use gmpy or .. something that's
faster than Decimal perhaps? I did notice a fair few cycles are spent in
Decimal.__new__ .. I don't think that's the -entire- story, but it
should help, I would think.

>>      1    0.000    0.000   85.863   85.863 <string>:1(<module>)
>
> This is AFAIU the entire script? :)
>
Yes indeed.. that's the way I read the cProfile output as well :D

Regards
Stef
From: Oleg B. <ph...@ph...> - 2009-07-22 21:49:27
On Wed, Jul 22, 2009 at 03:03:52PM -0400, Stef Telford wrote:
> surely, please find the output attached at the bottom.

   Thank you!

> Ordered by: cumulative time
>
>  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>       1    0.127    0.127   83.346   83.346 sresults.py:175(__iter__)
>   40001    0.308    0.000   79.876    0.002 dbconnection.py:649(next)

   Hmm. Does the program try to draw the entire table in one huge slurp?

>   40000    0.375    0.000   39.282    0.001 main.py:872(get)
>   40002   10.018    0.000   39.252    0.001 {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}

   Half of the time the program spends drawing rows one by one. This
could probably be optimized by using fetchmany or fetchall.

> 1840006   16.887    0.000   29.234    0.000 decimal.py:515(__new__)

   A third of the time - Decimal.__new__. There is something strange
here (in the profiler output itself, I mean) - those Decimal calls are
probably from DecimalCol.to_python, but the profiler didn't attribute
the calls to it:

> 1200000    1.856    0.000    2.769    0.000 col.py:1289(to_python)
> 3362951    2.729    0.000    2.729    0.000 main.py:1673(instanceName)
>  640000    0.999    0.000    1.471    0.000 col.py:657(to_python)

   My guess is that those Decimal.__new__ calls are inside the DB API
driver, and DecimalCol.to_python gets a Decimal and returns it
unchanged. This means that the lines

>   40002   10.018    0.000   39.252    0.001 {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}
> 1840006   16.887    0.000   29.234    0.000 decimal.py:515(__new__)

should be read as follows: fetchone draws a row and converts values to
Decimal, so 29.2 s are really a part of 39.2, and fetchone only waited
for the DB for 10 seconds.

>   40000    0.214    0.000   23.323    0.001 main.py:912(_init)
>   40000   10.475    0.000   23.069    0.001 main.py:1140(_SO_selectInit)

   _SO_selectInit is a rather simple function - it populates the row
with values just fetched from the DB; the values are converted using
calls to to_python. Those to_python calls are fast (if we can believe
the profiler) - see above. So where does _SO_selectInit spend its time?!

Oleg.
--
     Oleg Broytmann     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
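The fetchmany suggestion, in plain DB API terms - a sketch with
placeholder connection parameters, query and row handler:

    import psycopg2

    conn = psycopg2.connect('dbname=test')
    cur = conn.cursor()
    cur.execute('SELECT * FROM booking')

    while True:
        rows = cur.fetchmany(1000)   # one driver round per batch,
        if not rows:                 # not one per row
            break
        for row in rows:
            process(row)             # hypothetical per-row handler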
From: Stef T. <st...@um...> - 2009-07-22 21:18:57
Oleg Broytmann wrote:
> On Wed, Jul 22, 2009 at 03:03:52PM -0400, Stef Telford wrote:
>> surely, please find the output attached at the bottom.
>
> Thank you!
>
No, honestly, thank you :)

>> Ordered by: cumulative time
>>
>>  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>       1    0.127    0.127   83.346   83.346 sresults.py:175(__iter__)
>>   40001    0.308    0.000   79.876    0.002 dbconnection.py:649(next)
>
> Hmm. Does the program try to draw the entire table in one huge slurp?
>
Actually, no. This page does 40k of bookings (a single object type) but
the query isn't "sequential" (it's not grabbing 0-40,000 is what I
mean). I can do a page which has lots of different object types if this
would help?

>>   40000    0.375    0.000   39.282    0.001 main.py:872(get)
>>   40002   10.018    0.000   39.252    0.001 {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}
>
> Half of the time the program spends drawing rows one by one. This
> probably could be optimized by using fetchmany or fetchall.
>
Noted. Let me try this later tonight when I have some spare cycles :)

>> 1840006   16.887    0.000   29.234    0.000 decimal.py:515(__new__)
>
> A third of the time - Decimal.__new__. There is something strange here
> (in the profiler output itself, I mean) - those Decimal calls are
> probably from DecimalCol.to_python, but the profiler didn't attribute
> the calls to it:
>
Urm. hrm. that's weird. I would have thought that they would have been
added if they were called. I have read in more than a few places that
Decimal instantiation is slow when compared to float or gmpy :\

>> 1200000    1.856    0.000    2.769    0.000 col.py:1289(to_python)
>> 3362951    2.729    0.000    2.729    0.000 main.py:1673(instanceName)
>>  640000    0.999    0.000    1.471    0.000 col.py:657(to_python)
>
> My guess is that those Decimal.__new__ calls are inside the DB API
> driver, and DecimalCol.to_python gets a Decimal and returns it
> unchanged. This means that the lines
>
>>   40002   10.018    0.000   39.252    0.001 {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}
>> 1840006   16.887    0.000   29.234    0.000 decimal.py:515(__new__)
>
> should be read as follows: fetchone draws a row and converts values to
> Decimal, so 29.2 s are really a part of 39.2, and fetchone only waited
> for the DB for 10 seconds.
>
10 seconds to fetch from the database is not bad (in my view). The 29s
for Decimal is definitely 'killer'.

>>   40000    0.214    0.000   23.323    0.001 main.py:912(_init)
>>   40000   10.475    0.000   23.069    0.001 main.py:1140(_SO_selectInit)
>
> _SO_selectInit is a rather simple function - it populates the row with
> values just fetched from the DB; the values are converted using calls
> to to_python. Those to_python calls are fast (if we can believe the
> profiler) - see above. So where does _SO_selectInit spend its time?!
>
I wish I knew. The object in question -does- have about 40 or 50 columns
on it (don't ask.. lots of feature creep). I wonder if perhaps the
number of columns plays into the _init time? I take it that the class
overview is cached, so that it only has to be parsed 'once'.. but what
about if it's a class reached through a FK or M2M? Is SQLObject able to
find the pre-parsed class (if that makes sense)?
   Sorry about all this, but I really am sorta hitting my head here. I
mean, I -could- change Decimals to gmpy or something inside SO, but
floats with their (0.1 + 0.1 + 0.1 - 0.3) fiasco are pretty much a
non-starter for me I think. Hrm. As I said, probably not being much help
here :)

Regards
Stef
From: Stef T. <st...@um...> - 2009-07-30 02:46:40
Hello Oleg, Everyone,
   So.. please find attached the results from the same lowlevel program
as the last run. I changed psycopg2 to use the gmpy.mpq type as opposed
to the Decimal type. I also (of course) had to hack col.py to use
gmpy.mpq instead of Decimal.. After everything is said and done..

   Ordered by: internal time

    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
         1   23.905   23.905   23.905   23.905 {method 'fetchall' of 'psycopg2._psycopg.cursor' objects}
         1   16.336   16.336   16.336   16.336 {method 'execute' of 'psycopg2._psycopg.cursor' objects}
         1   11.495   11.495   52.079   52.079 t.lowlevel:3(<module>)
         1    0.317    0.317   52.396   52.396 {execfile}
         1    0.031    0.031    0.031    0.031 pgconnection.py:108(makeConnection)
    371/62    0.015    0.000    0.050    0.001 sre_parse.py:385(_parse)
         1    0.014    0.014   40.287   40.287 dbconnection.py:257(_runWithConnection)
      4824    0.012    0.000    0.015    0.000 sre_parse.py:188(__next)
    740/61    0.011    0.000    0.033    0.001 sre_compile.py:38(_compile)
       261    0.010    0.000    0.011    0.000 {eval}
       339    0.010    0.000    0.015    0.000 sre_compile.py:213(_optimize_charset)
  1054/380    0.008    0.000    0.009    0.000 sre_parse.py:146(getwidth)
  16310/15895 0.007    0.000    0.007    0.000 {len}
       668    0.006    0.000    0.009    0.000 {built-in method sub}
     12099    0.005    0.000    0.005    0.000 {method 'append' of 'list' objects}
      4074    0.005    0.000    0.018    0.000 sre_parse.py:207(get)
         1    0.005    0.005    0.033    0.033 converters.py:1(<module>)
        43    0.004    0.000    0.004    0.000 sre_compile.py:264(_mk_bitmap)
       130    0.004    0.000    0.018    0.000 pkg_resources.py:1645(find_on_path)

   Which would -definitely- seem to suggest Decimal is gawd awful slow.
The entire program finished in roughly 1/8th of the time. True, memory
consumption went from around 880mb for 220k objects up to 1.4gb for the
same 220k objects. However, look at that speed gain. It's insane.
   So.. urm.. yeah. I honestly don't know what to say.. anyone else
interested in an 8-fold speed improvement? ;)

Regards
Stef

Oleg Broytmann wrote:
> On Tue, Jul 28, 2009 at 09:37:56PM -0400, Stef Telford wrote:
>> Here is the output from the lowlevel connection select as you
>> suggested above (mostly) ordered by internal time. Note that it selects
>
> Thank you!
>
>> all 220k bookings, as opposed to last time when it 'only' selected 40k.
>> It seems that decimal.__new__ is killer .. I could be reading this wrong
>> (of course) but.. the tottime would seem to back that up I think.
>
> It's the point that I wanted to prove, really - to time fetchall and
> Decimal without SQLObject.
>
>> 133167599 function calls (133165268 primitive calls) in 392.615
>> CPU seconds
>>
>> Ordered by: internal time
>
>> ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>
>>       1   16.715   16.715  392.589  392.589 t.lowlevel:3(<module>)
>
> This is the entire script - 392 seconds.
>
>>       1   53.209   53.209  295.902  295.902 {method 'fetchall' of
>> 'psycopg2._psycopg.cursor' objects}
>
> 295 seconds in the most interesting part of the program.
>
>> 9883851   90.187    0.000  242.693    0.000 decimal.py:515(__new__)
>
> 243 seconds were spent in Decimal.__new__. I.e., Decimal.__new__ is
> called from the DB API driver; so we can trust the profiler when it
> showed us last time that the .to_python() calls were fast - they really
> were fast, because they didn't need to call Decimal.
>
>> 9883845   35.014    0.000   42.012    0.000 decimal.py:830(__str__)
>> 9883845   21.096    0.000   63.108    0.000 decimal.py:825(__repr__)
>
> This is printing.
>
>> 9884218   25.102    0.000   25.102    0.000 {built-in method match}
>> 39536056  20.103    0.000   20.103    0.000 {built-in method group}
>>       1   16.522   16.522   16.522   16.522 {method 'execute' of
>> 'psycopg2._psycopg.cursor' objects}
>> 9883846    6.461    0.000    6.461    0.000 {method 'lstrip' of 'str'
>> objects}
>> 9885976    4.971    0.000    4.971    0.000 {isinstance}
>
> And some internal stuff.
>
> Well, fetchall + print spans the entire program, so we can see
> sqlbuilder is fast. With this (and with more trust in the profiler) we
> can return to analyzing SQLObject timing.
>
> Oleg.
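The claim that Decimal construction dominates is easy to spot-check in
isolation by timing both constructors on the kind of numeric string the
driver hands back - a sketch, assuming (as Stef's psycopg2 change
implies) that gmpy's mpq accepts such strings:

    import timeit

    decimal_t = timeit.timeit("Decimal('123.45')",
                              setup='from decimal import Decimal',
                              number=1000000)
    mpq_t = timeit.timeit("mpq('123.45')",
                          setup='from gmpy import mpq',
                          number=1000000)
    print 'Decimal: %.2fs  mpq: %.2fs' % (decimal_t, mpq_t)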
From: Oleg B. <ph...@ph...> - 2009-07-30 06:42:26
On Wed, Jul 29, 2009 at 10:46:19PM -0400, Stef Telford wrote:
> Which would -definitely- seem to suggest Decimal is gawd awful slow.
> The entire program finished in roughly 1/8th of the time. True, memory
> consumption went from around 880mb for 220k objects up to 1.4gb for the
> same 220k objects. However, look at that speed gain. It's insane.

   Trading speed for memory. Pretty crowded market. ;)

Oleg.
--
     Oleg Broytmann     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
From: Simon C. <hod...@gm...> - 2009-07-30 15:06:28
On Thu, Jul 30, 2009 at 3:09 PM, Stef Telford <st...@um...> wrote:
> Let me re-ask that question; is there any point in submitting
> patches to use gmpy to you if to get things working requires a patched
> psycopg2 (which will probably never go 'mainstream') ?

I think your best bet is going to be to attempt to get a fast C
implementation of Decimal accepted into Python itself. There has already
been some work on this, which is outlined in the Python bug tracker [1],
and given that there was talk of accepting the patches for 3.1 there is
some hope that you could get them into 2.7 / 3.2.

[1] http://bugs.python.org/issue2486

Schiavo
Simon
From: Oleg B. <ph...@ph...> - 2009-07-29 07:44:03
On Tue, Jul 28, 2009 at 09:37:56PM -0400, Stef Telford wrote:
> Here is the output from the lowlevel connection select as you
> suggested above (mostly) ordered by internal time. Note that it selects

   Thank you!

> all 220k bookings, as opposed to last time when it 'only' selected 40k.
> It seems that decimal.__new__ is killer .. I could be reading this wrong
> (of course) but.. the tottime would seem to back that up I think.

   It's the point that I wanted to prove, really - to time fetchall and
Decimal without SQLObject.

> 133167599 function calls (133165268 primitive calls) in 392.615 CPU seconds
>
> Ordered by: internal time

> ncalls  tottime  percall  cumtime  percall filename:lineno(function)

>       1   16.715   16.715  392.589  392.589 t.lowlevel:3(<module>)

   This is the entire script - 392 seconds.

>       1   53.209   53.209  295.902  295.902 {method 'fetchall' of
> 'psycopg2._psycopg.cursor' objects}

   295 seconds in the most interesting part of the program.

> 9883851   90.187    0.000  242.693    0.000 decimal.py:515(__new__)

   243 seconds were spent in Decimal.__new__. I.e., Decimal.__new__ is
called from the DB API driver; so we can trust the profiler when it
showed us last time that the .to_python() calls were fast - they really
were fast, because they didn't need to call Decimal.

> 9883845   35.014    0.000   42.012    0.000 decimal.py:830(__str__)
> 9883845   21.096    0.000   63.108    0.000 decimal.py:825(__repr__)

   This is printing.

> 9884218   25.102    0.000   25.102    0.000 {built-in method match}
> 39536056  20.103    0.000   20.103    0.000 {built-in method group}
>       1   16.522   16.522   16.522   16.522 {method 'execute' of
> 'psycopg2._psycopg.cursor' objects}
> 9883846    6.461    0.000    6.461    0.000 {method 'lstrip' of 'str' objects}
> 9885976    4.971    0.000    4.971    0.000 {isinstance}

   And some internal stuff.

   Well, fetchall + print spans the entire program, so we can see
sqlbuilder is fast. With this (and with more trust in the profiler) we
can return to analyzing SQLObject timing.

Oleg.
--
     Oleg Broytmann     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
From: David T. <no...@op...> - 2009-07-22 20:43:35
On Wed, 2009-07-22 at 15:03 -0400, Stef Telford wrote:
> Oleg Broytmann wrote:
>> Curiouser and curiouser. IWBN to find where the slowness is *near the
>> end* of the loop - i.e. when instantiation becomes really slow.
>>
> It could be purely a 'feeling' .. I don't have any numbers to back it up
> and I am not entirely sure -how- to benchmark that. This machine does
> have 8gb of ram in it, and a fairly beefy quad core. I have never seen
> any process get near memory exhaustion, which I could believe the calls
> to 'malloc' could slow down but.. yes.

Python's GC takes time roughly proportional to the number of live
objects. The GC is called every 700 allocations (7000, 70000 for older
generations), so allocation also takes time proportional to the number
of live objects.
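The numbers David quotes are Python's generational GC thresholds, which
are inspectable and tunable at runtime; a sketch, with the bulk-loading
call as a placeholder:

    import gc

    print gc.get_threshold()   # defaults: (700, 10, 10) - gen0 every 700
                               # net allocations, gen1/gen2 10x/100x rarer

    # around a phase that creates many long-lived objects, pausing the
    # collector avoids repeated scans of an ever-growing live set:
    gc.disable()
    load_bookings()            # hypothetical 40k-row loading step
    gc.enable()
    gc.collect()               # one explicit pass afterwards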
From: David T. <no...@op...> - 2009-07-22 21:04:12
On Wed, 2009-07-22 at 16:16 -0400, David Turner wrote:
> Python's GC takes time roughly proportional to the number of live
> objects. The GC is called every 700 allocations (7000, 70000 for older
> generations), so allocation also takes time proportional to number of
> live objects.

http://www.gossamer-threads.com/lists/python/dev/658396?page=last <--
see this thread, for instance.
From: Stef T. <st...@um...> - 2009-07-22 21:10:43
David Turner wrote:
> Python's GC takes time roughly proportional to the number of live
> objects. The GC is called every 700 allocations (7000, 70000 for older
> generations), so allocation also takes time proportional to number of
> live objects.
>
> http://www.gossamer-threads.com/lists/python/dev/658396?page=last <--
> see this thread, for instance.
>
Hrm.. that would definitely explain and contribute to the slowdown after
a lot of objects on the page (>100k). I haven't seen any way around the
GC in python though, other than the folks at unladen-swallow trying to
re-implement it with a 'better' GC. Hrm, I wonder if IronPython could
run webware+sqlobject .. hrm.

Regards
Stef
From: Oleg B. <ph...@ph...> - 2009-07-22 21:56:14
On Wed, Jul 22, 2009 at 05:18:46PM -0400, Stef Telford wrote:
>>> Ordered by: cumulative time
>>>
>>>  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
>>>       1    0.127    0.127   83.346   83.346 sresults.py:175(__iter__)
>>>   40001    0.308    0.000   79.876    0.002 dbconnection.py:649(next)
>>
>> Hmm. Does the program try to draw the entire table in one huge slurp?
>>
> Actually, no. This page does 40k of bookings (a single object type) but
> the query isn't "Sequential" (it's not grabbing 0-40,000 is what I
> mean).

   Hmm. One call to SelectResults.__iter__ that spans the entire
program's lifetime. "One huge SELECT" is the only way I can interpret
this.

> I can do a page which has lots of different object types if this
> would help ?

   I doubt that would make a difference.

>>>   40000    0.375    0.000   39.282    0.001 main.py:872(get)
>>>   40002   10.018    0.000   39.252    0.001 {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}
>>
>> Half of the time the program spends drawing rows one by one. This
>> probably could be optimized by using fetchmany or fetchall.
>>
> Noted. Let me try this later tonight when I have some spare cycles :)

   It's not that easy. Changing fetchone to fetchmany requires an
additional loop over the results in Iteration.next - similar to what I
did in InheritableIteration.next.

>> My guess is that those Decimal.__new__ calls are inside the DB API
>> driver, and DecimalCol.to_python gets a Decimal and returns it
>> unchanged. This means that the lines
>>
>>>   40002   10.018    0.000   39.252    0.001 {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}
>>> 1840006   16.887    0.000   29.234    0.000 decimal.py:515(__new__)
>>
>> should be read as follows: fetchone draws a row and converts values to
>> Decimal, so 29.2 s are really a part of 39.2, and fetchone only waited
>> for the DB for 10 seconds.
>>
> 10 seconds to fetch from the database is not bad (in my view). The 29s
> for Decimal is definitely 'killer'.

   So we have 29 seconds in Decimal.__new__, 10 seconds in fetchone, and
23 seconds in _SO_selectInit. 63 seconds of 85...

>>>   40000    0.214    0.000   23.323    0.001 main.py:912(_init)
>>>   40000   10.475    0.000   23.069    0.001 main.py:1140(_SO_selectInit)
>>
>> _SO_selectInit is a rather simple function - it populates the row with
>> values just fetched from the DB; the values are converted using calls
>> to to_python. Those to_python calls are fast (if we can believe the
>> profiler) - see above. So where does _SO_selectInit spend its time?!

   Well, it is certainly not the evals. I also doubt it has anything to
do with garbage collection. Not in this function.

Oleg.
--
     Oleg Broytmann     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
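The extra loop Oleg mentions would look roughly like this - a sketch of
buffering fetchmany() behind a row iterator; the class name and
structure are assumptions, not SQLObject's actual Iteration code:

    import collections

    class BufferedIteration(object):
        batch_size = 1000

        def __init__(self, cursor):
            self.cursor = cursor
            self._buffer = collections.deque()

        def __iter__(self):
            return self

        def next(self):
            # refill the buffer with one batch when it runs dry
            if not self._buffer:
                rows = self.cursor.fetchmany(self.batch_size)
                if not rows:
                    raise StopIteration
                self._buffer.extend(rows)
            return self._buffer.popleft()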
From: Stef T. <st...@um...> - 2009-07-23 01:39:24
Oleg Broytmann wrote:
> On Wed, Jul 22, 2009 at 05:18:46PM -0400, Stef Telford wrote:
>> Actually, no. This page does 40k of bookings (a single object type) but
>> the query isn't "Sequential" (it's not grabbing 0-40,000 is what I
>> mean).
>
> Hmm. One call to SelectResults.__iter__ that spans the entire program's
> lifetime. "One huge SELECT" is the only way I can interpret this.
>
Hello Oleg,
   Sorry, I should have been more precise in my statement. The bookings
-are- returned from one select/query via cursor, but the bookings are
not contiguous (eg, booking 300, booking 310, booking 423, etc). This
said, I tried the select in psql and it finished in 19 msec, and then I
tried it from a psycopg2 cursor and it took 3 seconds.. so.. I have to
ask, why is it taking 23 sec?

> So we have 29 seconds in Decimal.__new__, 10 seconds in fetchone, and
> 23 seconds in _SO_selectInit. 63 seconds of 85...
>
Yup, and the next two are the notify method (which is similar in a way
to the publish/subscribe support that I have added into sqlobject) and
framestack (which is used in the notify method to stop circular
references). I could rip those out but.. :\

>>>   40000    0.214    0.000   23.323    0.001 main.py:912(_init)
>>>   40000   10.475    0.000   23.069    0.001 main.py:1140(_SO_selectInit)
>>
>> _SO_selectInit is a rather simple function - it populates the row with
>> values just fetched from the DB; the values are converted using calls
>> to to_python. Those to_python calls are fast (if we can believe the
>> profiler) - see above. So where does _SO_selectInit spend its time?!
>
> Well, it is certainly not the evals. I also doubt it has anything to do
> with garbage collection. Not in this function.
>
Hrm. could it be tied to the Decimal inits? I mean to say, if we are
seeing that those take a while to create, and each object has 10-14
Decimals inside it.. hrm

Regards
Stef
From: Stef T. <st...@um...> - 2009-07-30 13:09:50
Oleg Broytmann wrote:
> On Wed, Jul 29, 2009 at 10:46:19PM -0400, Stef Telford wrote:
>> Which would -definitely- seem to suggest Decimal is gawd awful slow.
>> The entire program finished in roughly 1/8th of the time. True, memory
>> consumption went from around 880mb for 220k objects up to 1.4gb for the
>> same 220k objects. However, look at that speed gain. It's insane.
>
> Trading speed for memory. Pretty crowded market. ;)
>
> Oleg.
>
Morning Oleg,
   Yeah.. I would think so. My hack currently consists of ripping out
the Decimal stuff mostly from col.py, as well as the two-line change
inside psycopg2 .. and that's where I think we may have a problem. The
psycopg2 guys are probably -not- interested in changing over from
Decimal to gmpy, even for speed. Which makes me wonder.. should I
maintain these patches for myself?
   Let me re-ask that question; is there any point in submitting patches
to use gmpy to you if getting things working requires a patched psycopg2
(which will probably never go 'mainstream')?
   The difference -is- like night and day though. The process that used
to take 26 hours to run now takes around 4 hours. ~Slight~ difference in
speed ;)

Regards
Stef

(ps. I am also pretty certain that more speed can be eked out of the
system, although I doubt with such gains ;)
From: Oleg B. <ph...@ph...> - 2009-07-30 15:06:58
On Thu, Jul 30, 2009 at 09:09:25AM -0400, Stef Telford wrote:
> My hack currently consists of ripping out
> the Decimal stuff mostly from col.py as well as the two-line change
> inside psycopg2 .. and that's where I think we may have a problem. The
> psycopg2 guys are probably -not- interested in changing over from
> Decimal to gmpy, even for speed. Which makes me wonder.. should I
> maintain these patches for myself ?
>
> Let me re-ask that question; is there any point in submitting patches
> to use gmpy to you if to get things working requires a patched psycopg2

   Can the patching be done on the fly (monkey-patching)? Even better -
is there an adapter that can be unregistered, then re-registered for
gmpy? This way you could have a configuration flag for SQLObject that
replaces Decimal with gmpy; SQLObject with this configuration flag would
re-register the adapter for you.

Oleg.
--
     Oleg Broytmann     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
From: Stef T. <st...@um...> - 2009-07-30 15:44:50
Oleg Broytmann wrote:
> On Thu, Jul 30, 2009 at 09:09:25AM -0400, Stef Telford wrote:
>> Let me re-ask that question; is there any point in submitting patches
>> to use gmpy to you if to get things working requires a patched psycopg2
>
> Can the patching be done on the fly (monkey-patching)? Even better - is
> there an adapter that can be unregistered, then re-registered for gmpy?
> This way you could have a configuration flag for SQLObject that replaces
> Decimal with gmpy; SQLObject with this configuration flag would
> re-register the adapter for you.
>
> Oleg.
>
Hello again Oleg,
   So, the specific changes in psycopg2 come down to;

cmpadm@test:~/psycopg2-2.0.11$ diff ./psycopg/psycopgmodule.c ../psycopg2-2.0.11-new/psycopg/psycopgmodule.c
595c595
<     decimal = PyImport_ImportModule("decimal");
---
>     decimal = PyImport_ImportModule("gmpy");
597c597
<     decimalType = PyObject_GetAttrString(decimal, "Decimal");
---
>     decimalType = PyObject_GetAttrString(decimal, "mpq");

   Kind of a 'no-brainer' there but, since psycopg is a compiled
extension, I don't see how we can monkey-patch the C lib :( I -can-
fling across my kludges off-list for col.py and converters.py if you
want?
   Lastly, I have also hacked my version of SQLObject to disable the
local cache, so that SO can use memcache (by subclassing the sqlobject
and then making custom get/setattr methods). It would be nice if there
was a way to toggle the cache on/off at the SO level (if there is, I
haven't found it, sadly).

Regards
Stef
From: Oleg B. <ph...@ph...> - 2009-07-30 17:43:52
On Thu, Jul 30, 2009 at 11:44:31AM -0400, Stef Telford wrote:
> <     decimal = PyImport_ImportModule("decimal");
> ---
> >     decimal = PyImport_ImportModule("gmpy");
>
> Kind of a 'no-brainer' there but, since psycopg is a compiled
> extension, I don't see how we can monkey-patch the C lib :(

   Well, I think that's too low-level. I am sure psycopg is better than
that. Python is dynamic, and I'm sure it's possible to find a dynamic
way around builtin module names. Let's look around.

   import psycopg2._psycopg; print dir(psycopg2._psycopg)... aha, types
and adapters. There is a DECIMAL type converter, and there is a
psycopg2._psycopg.string_types dict where the converter is registered
under the key 1700. What is 1700?

grep 1700 *
pgtypes.h:#define NUMERICOID 1700
typecast_builtins.c:static long int typecast_NUMBER_types[] = {20, 23, 21, 701, 700, 1700, 0};
typecast_builtins.c:static long int typecast_DECIMAL_types[] = {1700, 0};

   Seems 1700 is the builtin index for the DECIMAL type converter. I
don't know exactly how type converters are used; I am sure you can find
this out by reading psycopgmodule.c or by asking a psycopg mailing list.
Probably you need to replace the DECIMAL or 1700 type converter, or
both. Something like

psycopg2._psycopg.DECIMAL = my_decimal_converter

or

psycopg2._psycopg.string_types[1700] = my_decimal_converter

   Exact details will require a lot of homework, sure.

> Lastly, I have also hacked my version of SQLObject to disable the
> local cache, so that SO can use memcache (by subclassing the sqlobject
> and then making custom get/setattr methods). It would be nice if there
> was a way to toggle the cache on/off at the SO level (if there is, I
> haven't found it, sadly).

   There is no way currently. Why not add it? Extend your hack so it'd
be possible to configure caching using a flag in sqlmeta; let's call it
useCache = True/False (defaulting to True). Publish your patch.

Oleg.
--
     Oleg Broytmann     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
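psycopg2's public typecasting API points the same direction without
touching C: build a new caster for OID 1700 and register it. A sketch,
assuming gmpy's mpq accepts the numeric string the server sends:

    import psycopg2
    import psycopg2.extensions
    from gmpy import mpq

    def cast_mpq(value, cursor):
        # psycopg2 passes NUMERIC columns as strings (None for NULL)
        if value is None:
            return None
        return mpq(value)

    # 1700 is NUMERICOID, per the pgtypes.h grep above
    MPQ = psycopg2.extensions.new_type((1700,), 'MPQ', cast_mpq)
    psycopg2.extensions.register_type(MPQ)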
From: Oleg B. <ph...@ph...> - 2009-07-31 10:42:03
On Thu, Jul 30, 2009 at 09:10:31PM -0400, Stef Telford wrote:
> disable the writelocks.

   Why?! Locks protect critical data in multithreading programs; they
shouldn't be disabled.

Oleg.
--
     Oleg Broytmann     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
From: Oleg B. <ph...@ph...> - 2010-02-01 20:52:22
On Mon, Feb 01, 2010 at 01:40:51PM -0500, Stef Telford wrote:
> *) subclass the SQLObject into another class (which I called
> CompassObject, since our reservation system is called Compass ;) and
> overloaded _SO_setValue to do a delete inside memcache on change and

   To delete the instance from the cache to keep the cache coherent?

> overloaded 'get' such that it looks up memcache. I have to del/create

   I am going to do it in steps. The first step - make SQLObject
instances pickleable. A MemcachedCache would be next. But I am going to
develop a separate cache class, not hack into the .get machinery.

> _SO_writeLock, sqlmeta and _connection on each object when it's returned.

   _connection? Do you mean a per-instance connection? In general, the
connection is stored in the SQLObject table class and it doesn't need to
be pickled or restored. But how do you restore a per-instance
connection?

> Also, in a totally random thought, you will probably hit the 1mb
> memcache limit when storing objects.

   What's the limit?

Oleg.
--
     Oleg Broytman     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
From: Oleg B. <ph...@ph...> - 2010-02-05 14:10:59
On Mon, Feb 01, 2010 at 11:43:09PM +0300, Oleg Broytman wrote:
> I am going to do it in steps. The first step - to make SQLObject
> instances pickleable.

   Committed to the trunk in revision 4099.

Oleg.
--
     Oleg Broytman     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
From: Stef T. <st...@um...> - 2010-02-01 19:07:44
Morning Oleg, Everyone,
   I found that to get memcache working, I had to do two things;

*) get rid of the cache writelocks inside main.py (eg; remark out any
val._SO_writeLock.acquire()), mostly due to explosions whilst pickling

*) subclass the SQLObject into another class (which I called
CompassObject, since our reservation system is called Compass ;),
overload _SO_setValue to do a delete inside memcache on change, and
overload 'get' such that it looks up memcache. I have to del/create
_SO_writeLock, sqlmeta and _connection on each object when it's
returned. (See the sketch after this mail.)

   Note that if someone was calling a series of 'setValue' calls instead
of one large 'set', this would cause a lot of cache deletes and then
misses when it tries to set. Thankfully, we have already audited the
codebase for that (it was about a 3-4% "win" for us to change it over).
   Also, in a totally random thought, you will probably hit the 1mb
memcache limit when storing objects. Since we are storing objects
without blobs, it works. The only real workaround, if people start to
complain about memcache 'not storing their objects', is to go to either
Tokyo Tyrant or Redis. If you go Redis, you also end up with more speed
and master->slave replication for free ;)
   I am more than willing to help and/or bounce ideas off if you want.
The solution I have may not be 'perfect' but it's definitely performant
enough for our needs :D

Regards
Stef

On 02/01/2010 12:05 PM, Oleg Broytman wrote:
> Hello. I have returned to investigate the issue. Are you still
> interested?
>
> On Fri, Jul 31, 2009 at 01:10:34PM -0400, Stef Telford wrote:
>> Okay.. so.. assuming I can get a writeLock again after del'ing it..
>> we should be good there. If you can let me know exactly -what- in the
>> sqlmeta blows up, that may also help matters. The connection surely
>> explodes (and rightly so, I think).
>
> The problem with sqlmeta is that it's a class generated on the fly. It
> doesn't even have a module - and the pickle protocol prefers stable
> classes, with proper modules and names.
> But I think it's easy to deal with it anyway. We don't need to pickle
> the sqlmeta *class* - the class is in the corresponding SQLObject
> class. We probably don't even need to pickle the sqlmeta instance - we
> can just recreate it on unpickling, 'cause its only instance data is a
> reference to the SQLObject instance, and we know the instance - at the
> time of unpickling it's 'self', so we can reconstruct a sqlmeta
> instance using the known class and known instance data. Just for the
> (rare) case a user has put some data into the instance, I can save
> sqlmeta.__dict__ and restore it on unpickling.
> I am going to add the corresponding __getstate__ and __setstate__ to
> SQLObject and test them a bit.
>
> Oleg.
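A sketch of the memcache-backed lookup Stef describes: check memcached
first, fall back to the database, and invalidate on write. The class,
key scheme, and the exact SQLObject method signatures are assumptions,
not the actual Compass code:

    import memcache
    from sqlobject import SQLObject

    mc = memcache.Client(['127.0.0.1:11211'])

    class CompassObject(SQLObject):

        @classmethod
        def get(cls, id, connection=None):
            key = '%s:%d' % (cls.__name__, id)
            obj = mc.get(key)
            if obj is None:
                obj = super(CompassObject, cls).get(id, connection)
                mc.set(key, obj)   # relies on instances being pickleable
            return obj

        def _SO_setValue(self, name, value, from_python, to_python):
            # invalidate before writing so readers never see a stale copy
            mc.delete('%s:%d' % (self.__class__.__name__, self.id))
            super(CompassObject, self)._SO_setValue(
                name, value, from_python, to_python)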
From: Oleg B. <ph...@ph...> - 2010-02-01 17:06:01
Hello. I have returned to investigate the issue. Are you still
interested?

On Fri, Jul 31, 2009 at 01:10:34PM -0400, Stef Telford wrote:
> Okay.. so.. assuming I can get a writeLock again after del'ing it..
> we should be good there. If you can let me know exactly -what- in the
> sqlmeta blows up, that may also help matters. The connection surely
> explodes (and rightly so, I think).

   The problem with sqlmeta is that it's a class generated on the fly.
It doesn't even have a module - and the pickle protocol prefers stable
classes, with proper modules and names.
   But I think it's easy to deal with it anyway. We don't need to pickle
the sqlmeta *class* - the class is in the corresponding SQLObject class.
We probably don't even need to pickle the sqlmeta instance - we can just
recreate it on unpickling, 'cause its only instance data is a reference
to the SQLObject instance, and we know the instance - at the time of
unpickling it's 'self', so we can reconstruct a sqlmeta instance using
the known class and known instance data. Just for the (rare) case a user
has put some data into the instance, I can save sqlmeta.__dict__ and
restore it on unpickling.
   I am going to add the corresponding __getstate__ and __setstate__ to
SQLObject and test them a bit.

Oleg.
--
     Oleg Broytman     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
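The shape of the __getstate__/__setstate__ pair described above, as a
sketch only - the attribute names follow the thread, and the details are
assumptions rather than SQLObject's committed code:

    import threading

    class SQLObject(object):

        def __getstate__(self):
            d = self.__dict__.copy()
            del d['_SO_writeLock']     # thread locks can't be pickled
            del d['_connection']       # neither can DB connections
            # keep only the user data from the on-the-fly sqlmeta instance
            d['sqlmeta'] = self.sqlmeta.__dict__.copy()
            return d

        def __setstate__(self, d):
            meta_dict = d.pop('sqlmeta')
            self.__dict__.update(d)
            cls = self.__class__
            self._connection = cls._connection   # class-level connection
            self._SO_writeLock = threading.Lock()
            # recreate the sqlmeta instance, then restore saved user data
            self.sqlmeta = cls.sqlmeta(self)
            self.sqlmeta.__dict__.update(meta_dict)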
From: Oleg B. <ph...@ph...> - 2009-07-28 14:56:33
On Wed, Jul 22, 2009 at 09:32:18PM -0400, Stef Telford wrote:
> The bookings -are- returned from one select/query via cursor but the
> bookings are not contiguous (eg, booking 300, booking 310, booking 423,
> etc). This said, I tried the select in psql and it finished in 19 msec,
> and then I tried it from a psycopg2 cursor and it took 3 seconds.. so..
> I have to ask, why is it taking 23 sec?

   Can I ask you to do an experiment with a different program? What if
you use sqlbuilder.Select() - a lower-level interface? How long would it
take to draw all these rows?

>>>> _SO_selectInit is a rather simple function - it populates the row with
>>>> values just fetched from the DB; the values are converted using calls
>>>> to to_python. Those to_python calls are fast (if we can believe the
>>>> profiler) - see above. So where does _SO_selectInit spend its time?!
>
> hrm. could it be tied to the Decimal inits? I mean to say, if we are
> seeing that those take a while to create, and each object has 10-14
> Decimals inside it.. hrm

   But those __new__/__init__ must be called from .to_python(), and the
profiler has shown those .to_python() calls took very little time.

Oleg.
--
     Oleg Broytmann     http://phd.pp.ru/     ph...@ph...
           Programmers don't die, they just GOSUB without RETURN.
From: Stef T. <st...@um...> - 2009-07-28 20:04:44
Oleg Broytmann wrote:
> On Wed, Jul 22, 2009 at 09:32:18PM -0400, Stef Telford wrote:
>> The bookings -are- returned from one select/query via cursor but the
>> bookings are not contiguous (eg, booking 300, booking 310, booking 423,
>> etc). This said, I tried the select in psql and it finished in 19 msec,
>> and then I tried it from a psycopg2 cursor and it took 3 seconds.. so..
>> I have to ask, why is it taking 23 sec?
>
> Can I ask you to do an experiment with a different program? What if you
> use sqlbuilder.Select() - a lower-level interface? How long would it
> take to draw all these rows?
>
Hello again Oleg,
   Sorry, I have almost -0- experience with that low a level .. do you
have any nice canned example I could tailor to suit?

>> hrm. could it be tied to the Decimal inits? I mean to say, if we are
>> seeing that those take a while to create, and each object has 10-14
>> Decimals inside it.. hrm
>
> But those __new__/__init__ must be called from .to_python(), and the
> profiler has shown those .to_python() calls took very little time.
>
It's definitely strange; however, I don't know that the profiling could
really be 'wrong'. I am as stumped as you are, sadly :( Hopefully, if
you can fling me an example of how to do the 'lower level' select, then
we can get more profiling from that perhaps?

Regards
Stef
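For reference, a canned sqlbuilder.Select() example of the kind being
asked for, which fetches plain tuples and skips SQLObject row
construction entirely; the Booking table, its columns, and the
connection object are placeholders for the real schema:

    from sqlobject.sqlbuilder import Select

    select = Select([Booking.q.id, Booking.q.amount],
                    where=Booking.q.amount > 0)
    query = connection.sqlrepr(select)   # render the expression to SQL
    rows = connection.queryAll(query)    # list of plain tuples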