sqlobject-discuss Mailing List for SQLObject
SQLObject is a Python ORM.
From: Stef T. <st...@um...> - 2009-07-30 15:44:50

Oleg Broytmann wrote:
> On Thu, Jul 30, 2009 at 09:09:25AM -0400, Stef Telford wrote:
>> My hack currently consists of ripping out the Decimal stuff mostly from
>> col.py as well as the two-line change inside psycopg2 .. and that's
>> where I think we may have a problem. The psycopg2 guys are probably
>> -not- interested in changing over from Decimal to gmpy, even for speed.
>> Which makes me wonder.. should I maintain these patches for myself?
>>
>> Let me re-ask that question; is there any point in submitting patches
>> to use gmpy to you if getting things working requires a patched psycopg2
>
> Can the patching be done on the fly (monkey-patching)? Even better - is
> there an adapter that can be unregistered and reregistered to gmpy? This
> way you can have a configuration flag for SQLObject that replaces Decimal
> with gmpy; SQLObject with this configuration flag reregisters the adaptor
> for you.
>
> Oleg.

Hello again Oleg,

So, the specific changes in psycopg2 come down to:

    cmpadm@test:~/psycopg2-2.0.11$ diff ./psycopg/psycopgmodule.c ../psycopg2-2.0.11-new/psycopg/psycopgmodule.c
    595c595
    < decimal = PyImport_ImportModule("decimal");
    ---
    > decimal = PyImport_ImportModule("gmpy");
    597c597
    < decimalType = PyObject_GetAttrString(decimal, "Decimal");
    ---
    > decimalType = PyObject_GetAttrString(decimal, "mpq");

Kind of a 'no-brainer' there, but since psycopg2 is a compiled extension,
I don't see how we can monkey-patch the C lib :( I -can- fling across my
kludges off-list for col.py and converters.py if you want?

Lastly, I have also hacked my version of SQLObject to disable the local
cache, so that SO can use memcache (by subclassing SQLObject and then
making custom get/setattr methods). It would be nice if there was a way to
toggle the cache on/off at the SO level (if there is, I haven't found it,
sadly).

Regards
Stef
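The on-the-fly reregistration Oleg asks about may be possible without
touching the C source: psycopg2 exposes a public typecasting API
(extensions.new_type / register_type). A minimal sketch, assuming a
psycopg2 recent enough to ship extensions.DECIMAL, that gmpy is installed,
and that gmpy.mpq can parse the decimal strings the driver hands the
caster (if not, an intermediate conversion would be needed):

    import psycopg2.extensions
    from gmpy import mpq  # assumes the gmpy package is installed

    def cast_mpq(value, cursor):
        # psycopg2 passes NUMERIC values to the caster as strings
        # (or None for SQL NULL).
        if value is None:
            return None
        return mpq(value)  # may need help with '1.50'-style strings

    # Reuse the OIDs handled by the stock Decimal caster, then override it.
    MPQ = psycopg2.extensions.new_type(
        psycopg2.extensions.DECIMAL.values, "MPQ", cast_mpq)
    psycopg2.extensions.register_type(MPQ)

Registering the original DECIMAL caster again would switch back, which is
exactly the kind of configuration flag Oleg describes.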
From: Stef T. <st...@um...> - 2009-07-30 15:41:17

Simon Cross wrote:
> On Thu, Jul 30, 2009 at 3:09 PM, Stef Telford <st...@um...> wrote:
>> Let me re-ask that question; is there any point in submitting patches
>> to use gmpy to you if getting things working requires a patched psycopg2
>> (which will probably never go 'mainstream')?
>
> I think your best bet is going to be to attempt to get a fast C
> implementation of Decimal accepted into Python itself. There has already
> been some work on this which is outlined in the Python bug tracker [1],
> and given that there was talk of accepting the patches for 3.1, there is
> some hope that you could get them into 2.7 / 3.2.
>
> [1] http://bugs.python.org/issue2486
>
> Schiavo
> Simon

Hello Simon,

Well.. sadly.. how realistic is this? No offense meant to anyone, but gmpy
has been around since (at least) Python 2.4 (perhaps earlier? not sure
100%) and it still hasn't been accepted in. I do agree that fixing Decimal
at a fundamental level would be MUCH nicer, but the BDFL (Guido) seems not
to be moving on it. That bug was opened in 2008-03 .. over a year ago. I
think the quickest route (no pun intended) is to fix it in the adaptor if
possible. Just my 2c :)

Regards
Stef
From: Oleg B. <ph...@ph...> - 2009-07-30 15:06:58

On Thu, Jul 30, 2009 at 09:09:25AM -0400, Stef Telford wrote:
> My hack currently consists of ripping out the Decimal stuff mostly from
> col.py as well as the two-line change inside psycopg2 .. and that's
> where I think we may have a problem. The psycopg2 guys are probably
> -not- interested in changing over from Decimal to gmpy, even for speed.
> Which makes me wonder.. should I maintain these patches for myself?
>
> Let me re-ask that question; is there any point in submitting patches
> to use gmpy to you if getting things working requires a patched psycopg2

Can the patching be done on the fly (monkey-patching)? Even better - is
there an adapter that can be unregistered and reregistered to gmpy? This
way you can have a configuration flag for SQLObject that replaces Decimal
with gmpy; SQLObject with this configuration flag reregisters the adaptor
for you.

Oleg.
--
Oleg Broytmann  http://phd.pp.ru/  ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
From: Simon C. <hod...@gm...> - 2009-07-30 15:06:28

On Thu, Jul 30, 2009 at 3:09 PM, Stef Telford <st...@um...> wrote:
> Let me re-ask that question; is there any point in submitting patches
> to use gmpy to you if getting things working requires a patched psycopg2
> (which will probably never go 'mainstream')?

I think your best bet is going to be to attempt to get a fast C
implementation of Decimal accepted into Python itself. There has already
been some work on this which is outlined in the Python bug tracker [1],
and given that there was talk of accepting the patches for 3.1, there is
some hope that you could get them into 2.7 / 3.2.

[1] http://bugs.python.org/issue2486

Schiavo
Simon
From: Stef T. <st...@um...> - 2009-07-30 13:09:50

Oleg Broytmann wrote:
> On Wed, Jul 29, 2009 at 10:46:19PM -0400, Stef Telford wrote:
>> Which would -definitely- seem to suggest Decimal is gawd-awful slow.
>> The entire program finished in roughly 1/8th of the time. True, memory
>> consumption went from around 880mb for 220k objects up to 1.4gb for the
>> same 220k objects. However, look at that speed gain. It's insane.
>
> Trading speed for memory. Pretty crowded market. ;)
>
> Oleg.

Morning Oleg,

Yeah.. I would think so. My hack currently consists of ripping out the
Decimal stuff mostly from col.py as well as the two-line change inside
psycopg2 .. and that's where I think we may have a problem. The psycopg2
guys are probably -not- interested in changing over from Decimal to gmpy,
even for speed. Which makes me wonder.. should I maintain these patches
for myself?

Let me re-ask that question; is there any point in submitting patches to
use gmpy to you if getting things working requires a patched psycopg2
(which will probably never go 'mainstream')?

The difference -is- like night and day though. The process that used to
take 26 hours to run now takes around 4 hours. ~Slight~ difference in
speed ;)

Regards
Stef

(ps. I am also pretty certain that more speed can be eked out of the
system, although I doubt with such gains ;)
From: Oleg B. <ph...@ph...> - 2009-07-30 06:42:26

On Wed, Jul 29, 2009 at 10:46:19PM -0400, Stef Telford wrote:
> Which would -definitely- seem to suggest Decimal is gawd-awful slow.
> The entire program finished in roughly 1/8th of the time. True, memory
> consumption went from around 880mb for 220k objects up to 1.4gb for the
> same 220k objects. However, look at that speed gain. It's insane.

Trading speed for memory. Pretty crowded market. ;)

Oleg.
--
Oleg Broytmann  http://phd.pp.ru/  ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
From: Stef T. <st...@um...> - 2009-07-30 02:46:40

Hello Oleg, Everyone,

So.. please find attached the results from the same lowlevel program as
the last run. I changed psycopg2 to use the gmpy.mpq type as opposed to
the Decimal type. I also (of course) had to hack col.py to use gmpy.mpq
instead of Decimal. After everything is said and done:

    Ordered by: internal time

    ncalls     tottime  percall  cumtime  percall  filename:lineno(function)
    1           23.905   23.905   23.905   23.905  {method 'fetchall' of 'psycopg2._psycopg.cursor' objects}
    1           16.336   16.336   16.336   16.336  {method 'execute' of 'psycopg2._psycopg.cursor' objects}
    1           11.495   11.495   52.079   52.079  t.lowlevel:3(<module>)
    1            0.317    0.317   52.396   52.396  {execfile}
    1            0.031    0.031    0.031    0.031  pgconnection.py:108(makeConnection)
    371/62       0.015    0.000    0.050    0.001  sre_parse.py:385(_parse)
    1            0.014    0.014   40.287   40.287  dbconnection.py:257(_runWithConnection)
    4824         0.012    0.000    0.015    0.000  sre_parse.py:188(__next)
    740/61       0.011    0.000    0.033    0.001  sre_compile.py:38(_compile)
    261          0.010    0.000    0.011    0.000  {eval}
    339          0.010    0.000    0.015    0.000  sre_compile.py:213(_optimize_charset)
    1054/380     0.008    0.000    0.009    0.000  sre_parse.py:146(getwidth)
    16310/15895  0.007    0.000    0.007    0.000  {len}
    668          0.006    0.000    0.009    0.000  {built-in method sub}
    12099        0.005    0.000    0.005    0.000  {method 'append' of 'list' objects}
    4074         0.005    0.000    0.018    0.000  sre_parse.py:207(get)
    1            0.005    0.005    0.033    0.033  converters.py:1(<module>)
    43           0.004    0.000    0.004    0.000  sre_compile.py:264(_mk_bitmap)
    130          0.004    0.000    0.018    0.000  pkg_resources.py:1645(find_on_path)

Which would -definitely- seem to suggest Decimal is gawd-awful slow. The
entire program finished in roughly 1/8th of the time. True, memory
consumption went from around 880mb for 220k objects up to 1.4gb for the
same 220k objects. However, look at that speed gain. It's insane.

So.. urm.. yeah. I honestly don't know what to say.. anyone else
interested in an 8-fold speed improvement? ;)

Regards
Stef

Oleg Broytmann wrote:
> On Tue, Jul 28, 2009 at 09:37:56PM -0400, Stef Telford wrote:
>> Here is the output from the lowlevel connection select as you suggested
>> above (mostly), ordered by internal time. Note that it selects all 220k
>> bookings, as opposed to last time when it 'only' selected 40k. It seems
>> that decimal.__new__ is the killer .. I could be reading this wrong (of
>> course) but.. the tottime would seem to back that up, I think.
>
> It's the point that I wanted to prove, really - to time fetchall and
> Decimal without SQLObject.
>
>> 133167599 function calls (133165268 primitive calls) in 392.615 CPU seconds
>>
>> Ordered by: internal time
>> ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
>> 1        16.715   16.715  392.589  392.589  t.lowlevel:3(<module>)
>
> This is the entire script - 392 seconds.
>
>> 1        53.209   53.209  295.902  295.902  {method 'fetchall' of 'psycopg2._psycopg.cursor' objects}
>
> 295 seconds in the most interesting part of the program.
>
>> 9883851  90.187    0.000  242.693    0.000  decimal.py:515(__new__)
>
> 243 seconds were spent in Decimal.__new__. I.e., Decimal.__new__ is
> called from the DB API driver; so we can trust the profiler when it
> showed us last time that .to_python() calls were fast - they were really
> fast 'cause they didn't have a need to call Decimal.
>
>> 9883845  35.014    0.000   42.012    0.000  decimal.py:830(__str__)
>> 9883845  21.096    0.000   63.108    0.000  decimal.py:825(__repr__)
>
> This is printing.
>
>> 9884218   25.102   0.000   25.102    0.000  {built-in method match}
>> 39536056  20.103   0.000   20.103    0.000  {built-in method group}
>> 1         16.522  16.522   16.522   16.522  {method 'execute' of 'psycopg2._psycopg.cursor' objects}
>> 9883846    6.461   0.000    6.461    0.000  {method 'lstrip' of 'str' objects}
>> 9885976    4.971   0.000    4.971    0.000  {isinstance}
>
> And some internal stuff.
>
> Well, fetchall + print spans the entire program, so we can see sqlbuilder
> is fast. With this (and with more trust in the profiler) we can return to
> analyzing SQLObject timing.
>
> Oleg.
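For anyone wanting to reproduce this kind of listing: the {execfile} entry
suggests the benchmark script was run under the stdlib profiler. A minimal
Python 2 sketch (the stats filename 'lowlevel.prof' is invented; the
script name t.lowlevel is taken from the output above):

    import cProfile
    import pstats

    # Run the benchmark script under the profiler, saving raw stats.
    cProfile.run('execfile("t.lowlevel")', 'lowlevel.prof')

    # Print the top 20 entries sorted by internal time, matching the
    # "Ordered by: internal time" listings in this thread.
    pstats.Stats('lowlevel.prof').sort_stats('time').print_stats(20)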
From: Oleg B. <ph...@ph...> - 2009-07-29 07:44:03

On Tue, Jul 28, 2009 at 09:37:56PM -0400, Stef Telford wrote:
> Here is the output from the lowlevel connection select as you suggested
> above (mostly), ordered by internal time. Note that it selects

Thank you!

> all 220k bookings, as opposed to last time when it 'only' selected 40k.
> It seems that decimal.__new__ is the killer .. I could be reading this
> wrong (of course) but.. the tottime would seem to back that up, I think.

It's the point that I wanted to prove, really - to time fetchall and
Decimal without SQLObject.

> 133167599 function calls (133165268 primitive calls) in 392.615 CPU seconds
>
> Ordered by: internal time
>
> ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
> 1        16.715   16.715  392.589  392.589  t.lowlevel:3(<module>)

This is the entire script - 392 seconds.

> 1        53.209   53.209  295.902  295.902  {method 'fetchall' of 'psycopg2._psycopg.cursor' objects}

295 seconds in the most interesting part of the program.

> 9883851  90.187    0.000  242.693    0.000  decimal.py:515(__new__)

243 seconds were spent in Decimal.__new__. I.e., Decimal.__new__ is called
from the DB API driver; so we can trust the profiler when it showed us
last time that .to_python() calls were fast - they were really fast 'cause
they didn't have a need to call Decimal.

> 9883845  35.014    0.000   42.012    0.000  decimal.py:830(__str__)
> 9883845  21.096    0.000   63.108    0.000  decimal.py:825(__repr__)

This is printing.

> 9884218   25.102   0.000   25.102    0.000  {built-in method match}
> 39536056  20.103   0.000   20.103    0.000  {built-in method group}
> 1         16.522  16.522   16.522   16.522  {method 'execute' of 'psycopg2._psycopg.cursor' objects}
> 9883846    6.461   0.000    6.461    0.000  {method 'lstrip' of 'str' objects}
> 9885976    4.971   0.000    4.971    0.000  {isinstance}

And some internal stuff.

Well, fetchall + print spans the entire program, so we can see sqlbuilder
is fast. With this (and with more trust in the profiler) we can return to
analyzing SQLObject timing.

Oleg.
--
Oleg Broytmann  http://phd.pp.ru/  ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
From: Oleg B. <ph...@ph...> - 2009-07-29 07:21:59

On Wed, Jul 29, 2009 at 04:57:12PM +1200, Aaron Robinson wrote:
> First off, the point you raised about setting many values at once; yes,
> I was aware of this, but a lot of the things we seem to want to do are
> more like we have a dict with strings for keys and associated values,
> and we want to write them into the db in one go, where the strings are
> the column names - is there some way to do a myObject.set() and pass in
> a dictionary?

.set(**dict)

> is there any way to do transactions with SQLObjects?

    transaction = connection.transaction()

and use the transaction object everywhere instead of the connection:

    sqlobject.sqlhub = transaction

or

    class Table(SQLObject):
        _connection = transaction

or

    Table.select(..., connection=transaction, ...)

and so on. You don't need all of these at once - just choose the way you
want. Don't forget to do transaction.commit() at the end.

Oleg.
--
Oleg Broytmann  http://phd.pp.ru/  ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
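Putting Oleg's pieces together, a minimal end-to-end sketch (the Account
class, column, and SQLite URI are invented purely for illustration):

    from sqlobject import SQLObject, IntCol, connectionForURI, sqlhub

    connection = connectionForURI('sqlite:/:memory:')  # hypothetical URI
    sqlhub.processConnection = connection

    class Account(SQLObject):      # hypothetical table
        balance = IntCol(default=0)

    Account.createTable()

    trans = connection.transaction()
    try:
        acct = Account(balance=100, connection=trans)
        # .set() takes keyword arguments, so a plain dict of
        # column-name -> value can be applied with ** expansion:
        values = {'balance': 90}
        acct.set(**values)
        trans.commit()
    except Exception:
        trans.rollback()
        raise

This is the ** expansion Oleg's ".set(**dict)" answer refers to: the dict
keys must match the Python attribute names of the columns.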
From: Aaron R. <aar...@mo...> - 2009-07-29 04:57:25

Hi Oleg,

That's terrific - thanks very much for your response.

First off, the point you raised about setting many values at once; yes, I
was aware of this, but a lot of the things we seem to want to do are more
like we have a dict with strings for keys and associated values, and we
want to write them into the db in one go, where the strings are the column
names - is there some way to do a myObject.set() and pass in a dictionary?
(or some Python semantic that yields the same result? .. I'm not sure).

Okay, I completely understand the distinction between SQLObject and
SQLBuilder now, and it makes perfect sense - thanks!

Just by the way, is there any way to do transactions with SQLObjects?

Thanks again,
-Aaron.

On Fri, Jul 24, 2009 at 7:59 PM, Oleg Broytmann <ph...@ph...> wrote:
> [full reply quoted - see Oleg's message of 2009-07-24 07:59:53 below]
From: Stef T. <st...@um...> - 2009-07-29 01:44:45

Oleg Broytmann wrote:
> On Tue, Jul 28, 2009 at 04:04:28PM -0400, Stef Telford wrote:
>> Oleg Broytmann wrote:
>>> Can I ask you to do an experiment with a different program? What if
>>> you use sqlbuilder.Select() - a lower-level interface? How long would
>>> it take to draw all these rows?
>>
>> Sorry, I have almost -0- experience with that low a level .. do you
>> have any nice canned example I could tailor to suit?
>
> Very easy: sqlbuilder.Select(list_of_columns), convert the expression to
> a string (SQL query), execute the query and get back the results:
> [example trimmed - quoted in full in Oleg's message below]

Hello Oleg,

Here is the output from the lowlevel connection select as you suggested
above (mostly), ordered by internal time. Note that it selects all 220k
bookings, as opposed to last time when it 'only' selected 40k. It seems
that decimal.__new__ is the killer .. I could be reading this wrong (of
course) but.. the tottime would seem to back that up, I think.

    133167599 function calls (133165268 primitive calls) in 392.615 CPU seconds

    Ordered by: internal time

    ncalls             tottime  percall  cumtime  percall  filename:lineno(function)
    9883936             93.137    0.000   93.137    0.000  {built-in method __new__ of type object at 0x814fa00}
    9883851             90.187    0.000  242.693    0.000  decimal.py:515(__new__)
    1                   53.209   53.209  295.902  295.902  {method 'fetchall' of 'psycopg2._psycopg.cursor' objects}
    9883845             35.014    0.000   42.012    0.000  decimal.py:830(__str__)
    9884218             25.102    0.000   25.102    0.000  {built-in method match}
    9883845             21.096    0.000   63.108    0.000  decimal.py:825(__repr__)
    39536056            20.103    0.000   20.103    0.000  {built-in method group}
    1                   16.715   16.715  392.589  392.589  t.lowlevel:3(<module>)
    1                   16.522   16.522   16.522   16.522  {method 'execute' of 'psycopg2._psycopg.cursor' objects}
    24350308/24349893    9.742    0.000    9.742    0.000  {len}
    9883846              6.461    0.000    6.461    0.000  {method 'lstrip' of 'str' objects}
    9885976              4.971    0.000    4.971    0.000  {isinstance}
    1                    0.031    0.031    0.031    0.031  pgconnection.py:108(makeConnection)
    1                    0.026    0.026  392.615  392.615  {execfile}
    371/62               0.015    0.000    0.048    0.001  sre_parse.py:385(_parse)
    1                    0.014    0.014  312.470  312.470  dbconnection.py:257(_runWithConnection)
    4824                 0.012    0.000    0.015    0.000  sre_parse.py:188(__next)
    740/61               0.011    0.000    0.033    0.001  sre_compile.py:38(_compile)
    261                  0.010    0.000    0.011    0.000  {eval}
    339                  0.010    0.000    0.015    0.000  sre_compile.py:213(_optimize_charset)
    1054/380             0.008    0.000    0.009    0.000  sre_parse.py:146(getwidth)
    664                  0.006    0.000    0.009    0.000  {built-in method sub}
    12091                0.005    0.000    0.005    0.000  {method 'append' of 'list' objects}
    4074                 0.005    0.000    0.018    0.000  sre_parse.py:207(get)
    43                   0.004    0.000    0.004    0.000  sre_compile.py:264(_mk_bitmap)
    128                  0.004    0.000    0.017    0.000  pkg_resources.py:1645(find_on_path)
    1                    0.004    0.004    0.005    0.005  socket.py:43(<module>)
    1                    0.004    0.004    0.031    0.031  converters.py:1(<module>)
    1                    0.004    0.004    0.007    0.007  mimetypes.py:199(readfp)
    6273                 0.003    0.000    0.003    0.000  {method 'startswith' of 'str' objects}

Regards
Stef
From: Oleg B. <ph...@ph...> - 2009-07-28 20:37:45

On Tue, Jul 28, 2009 at 04:04:28PM -0400, Stef Telford wrote:
> Oleg Broytmann wrote:
>> Can I ask you to do an experiment with a different program? What if you
>> use sqlbuilder.Select() - a lower-level interface? How long would it
>> take to draw all these rows?
>
> Sorry, I have almost -0- experience with that low a level .. do you have
> any nice canned example I could tailor to suit?

Very easy: sqlbuilder.Select(list_of_columns), convert the expression to a
string (SQL query), execute the query and get back the results:

    connection = connectionForURI('...')

    class Test(SQLObject):
        _connection = connection
        a = IntCol()
        b = IntCol()

    Test.createTable()

    Test(a=1, b=2)
    Test(a=2, b=1)

    for row in connection.queryAll(connection.sqlrepr(
            sqlbuilder.Select([Test.q.a, Test.q.b]))):
        print row

In case you want to pass the list of columns as a list of strings - pass
the list of tables for the FROM clause:

    Select(['a', 'b'], staticTables=['test'])

Oleg.
--
Oleg Broytmann  http://phd.pp.ru/  ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
From: Stef T. <st...@um...> - 2009-07-28 20:04:44

Oleg Broytmann wrote:
> On Wed, Jul 22, 2009 at 09:32:18PM -0400, Stef Telford wrote:
>> The bookings -are- returned from one select/query via cursor, but the
>> bookings are not contiguous (e.g., booking 300, booking 310, booking
>> 423, etc). This said, I tried the select in psql and it completed in
>> 19 msec, and then I tried from a psycopg2 cursor and it took 3
>> seconds.. so.. I have to ask, why is it taking 23 sec?
>
> Can I ask you to do an experiment with a different program? What if you
> use sqlbuilder.Select() - a lower-level interface? How long would it
> take to draw all these rows?

Hello again Oleg,

Sorry, I have almost -0- experience with that low a level .. do you have
any nice canned example I could tailor to suit?

>>>>> _SO_selectInit is a rather simple function - it populates the row
>>>>> with values just fetched from the DB; the values are converted using
>>>>> calls to to_python. Those to_python calls are fast (if we can
>>>>> believe the profiler) - see above. So where does _SO_selectInit
>>>>> spend its time?!
>>
>> hrm. could it be tied to the Decimal inits? I mean to say, if we are
>> seeing that those take a while to create, and each object has 10-14
>> Decimals inside it.. hrm
>
> But those __new__/__init__ must be called from .to_python(), and the
> profiler has shown those .to_python() calls took very little time.

It's definitely strange; however, I don't know that the profiling could
really be 'wrong'. I am as stumped as you are, sadly :( Hopefully, if you
can fling me an example of how to do the 'lower level' select, then we can
get more profiling from that, perhaps?

Regards
Stef
From: Oleg B. <ph...@ph...> - 2009-07-28 14:56:33

On Wed, Jul 22, 2009 at 09:32:18PM -0400, Stef Telford wrote:
> The bookings -are- returned from one select/query via cursor, but the
> bookings are not contiguous (e.g., booking 300, booking 310, booking
> 423, etc). This said, I tried the select in psql and it completed in
> 19 msec, and then I tried from a psycopg2 cursor and it took 3 seconds..
> so.. I have to ask, why is it taking 23 sec?

Can I ask you to do an experiment with a different program? What if you
use sqlbuilder.Select() - a lower-level interface? How long would it take
to draw all these rows?

>>>> _SO_selectInit is a rather simple function - it populates the row
>>>> with values just fetched from the DB; the values are converted using
>>>> calls to to_python. Those to_python calls are fast (if we can believe
>>>> the profiler) - see above. So where does _SO_selectInit spend its
>>>> time?!
>
> hrm. could it be tied to the Decimal inits? I mean to say, if we are
> seeing that those take a while to create, and each object has 10-14
> Decimals inside it.. hrm

But those __new__/__init__ must be called from .to_python(), and the
profiler has shown those .to_python() calls took very little time.

Oleg.
--
Oleg Broytmann  http://phd.pp.ru/  ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
From: Oleg B. <ph...@ph...> - 2009-07-24 07:59:53

On Fri, Jul 24, 2009 at 12:31:02PM +1200, Aaron Robinson wrote:
> if I set the value of a blob column by setting a variable on my object,
> it gets written to the db just fine, but when I do it inside a
> transaction using sqlbuilder I have the error. Please see the code below.
>
> Test Code:
>
>     from sqlobject import *
>     from array import *
>
>     sqlhub.processConnection = connectionForURI(
>         'postgres://user:pass@localhost/mydb')
>
>     class MyObject(SQLObject):
>         myInt = IntCol(default=42)
>         myBlob = BLOBCol(default=None)
>
>     MyObject.createTable()
>
>     # This works:
>     myObject = MyObject()
>     myObject.myInt = 43
>     myObject.myBlob = array('H',range(256)).tostring()

I hope you know you can set many values at once:

    myObject.set(myInt=43, myBlob=array('H',range(256)).tostring())

(That's offtopic, just a reminder.)

>     # This doesn't work:
>     record = {'my_int':44, 'my_blob':array('H',range(256)).tostring()}
>     conn = sqlhub.getConnection()
>     tx = conn.transaction()
>     tx.query(conn.sqlrepr(sqlbuilder.Insert(MyObject, [record])))
>     tx.commit()
>
>     # psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0x80
>     # HINT: This error can also happen if the byte sequence does not
>     # match the encoding expected by the server, which is controlled by
>     # "client_encoding".

First, this has nothing to do with transactions. You can do conn.query()
here. Second, SQLBuilder... SQLObject is a high-level object that does a
lot of things internally, but SQLBuilder is a low-level interface that
knows nothing about classes, columns, column types, or conversion; the
first parameter should be a string (the table name), and the values must
be converted to the DB format; for a BLOB that means you have to use the
psycopg Binary wrapper. This code works for me in Postgres and SQLite:

    record = {'my_int':44,
        'my_blob':conn.createBinary(array('H',range(256)).tostring())}
    conn.query(conn.sqlrepr(sqlbuilder.Insert(MyObject.sqlmeta.table,
        [record])))

Oleg.
--
Oleg Broytmann  http://phd.pp.ru/  ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
From: Aaron R. <aar...@mo...> - 2009-07-24 00:31:15

Hi Oleg,

Thanks for your quick response. In writing a short test program I think I
may have come across the problem, although again I'm not familiar enough
with SQLObject to know where to go from here... It seems that if I set the
value of a blob column by setting a variable on my object, it gets written
to the db just fine, but when I do it inside a transaction using
sqlbuilder I have the error. Please see the code below.

Also, I'm at a complete loss as to what the server/client encodings are,
or even what version of SQLObject I'm using (it could be a couple of years
old)... Obviously (based on the test program) I'm not setting the client
encoding to anything other than the default... I've done a standard
install of Postgres 8.3 and just went with the defaults there too... and
I've had a look through the sqlobject code and can't find a version number
anywhere. Could you give me a hint about where to look if this is
important?

Test Code:

    from sqlobject import *
    from array import *

    sqlhub.processConnection = connectionForURI(
        'postgres://user:pass@localhost/mydb')

    class MyObject(SQLObject):
        myInt = IntCol(default=42)
        myBlob = BLOBCol(default=None)

    MyObject.createTable()

    # This works:
    myObject = MyObject()
    myObject.myInt = 43
    myObject.myBlob = array('H',range(256)).tostring()

    # This doesn't work:
    record = {'my_int':44, 'my_blob':array('H',range(256)).tostring()}
    conn = sqlhub.getConnection()
    tx = conn.transaction()
    tx.query(conn.sqlrepr(sqlbuilder.Insert(MyObject, [record])))
    tx.commit()

    # Traceback (most recent call last):
    #   File "E:\SSL\Biometix\Code\Performix\trunk\demoBinaryAddError.py", line 28, in ?
    #     tx.query(conn.sqlrepr(sqlbuilder.Insert(MyObject, [record])))
    #   File "E:\SSL\Biometix\Code\Performix\trunk\sqlobject\dbconnection.py", line 846, in query
    #     return self._dbConnection._query(self._connection, s)
    #   File "E:\SSL\Biometix\Code\Performix\trunk\sqlobject\dbconnection.py", line 339, in _query
    #     self._executeRetry(conn, conn.cursor(), s)
    #   File "E:\SSL\Biometix\Code\Performix\trunk\sqlobject\dbconnection.py", line 334, in _executeRetry
    #     return cursor.execute(query)
    # psycopg2.DataError: invalid byte sequence for encoding "UTF8": 0x80
    # HINT: This error can also happen if the byte sequence does not match
    # the encoding expected by the server, which is controlled by
    # "client_encoding".

On Thu, Jul 23, 2009 at 9:45 PM, Oleg Broytmann <ph...@ph...> wrote:
> [full reply quoted - see Oleg's message of 2009-07-23 09:45:15 below]
From: Oleg B. <ph...@ph...> - 2009-07-23 09:45:15

On Thu, Jul 23, 2009 at 04:54:11PM +1200, Aaron Robinson wrote:
> I'm wanting to write some binary data to the database - specifically the
> data is produced by something like:
>     array.array('H',[1,2,3]).tostring()
> and the database is Postgres 8.3 with a sqlobject.BLOBCol(default=None)
> column.
>
> Currently when I try this I receive this error:
>     DataError: invalid byte sequence for encoding "UTF8": 0x83
>     HINT: This error can also happen if the byte sequence does not match
>     the encoding expected by the server, which is controlled by
>     "client_encoding".

Ouch. Can you write a short test program that demonstrates the problem?
What version of SQLObject do you use? What are the encodings of the client
and server?

> I ran into a vaguely similar post, and the suggestion was to use
> UnicodeCol(), which I have tried and received the same result.

UnicodeCol is for text, not binary data.

> Basically the main focus here is on keeping the storage size in the DB
> as low as possible, as we want to store many groups of numbers (all
> below 65536), and there is no need (or desire) to store them in
> individual columns.

There is also PickleCol - a subtype of BLOBCol that can store any
pickleable data - the column (un)pickles the data as necessary. A problem
with that column (and with your array.tostring approach) is that you
cannot search or sort results by the values in these columns.

Oleg.
--
Oleg Broytmann  http://phd.pp.ru/  ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
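A minimal sketch of the PickleCol route Oleg mentions (the Reading class
and the SQLite URI are invented for illustration):

    import array
    from sqlobject import SQLObject, PickleCol, connectionForURI, sqlhub

    sqlhub.processConnection = connectionForURI('sqlite:/:memory:')  # hypothetical URI

    class Reading(SQLObject):  # hypothetical table
        # PickleCol pickles on write and unpickles on read, so the
        # attribute round-trips as a real array object, not a string.
        samples = PickleCol(default=None)

    Reading.createTable()
    r = Reading(samples=array.array('H', range(256)))
    assert r.samples[3] == 3

As Oleg notes, the trade-off is that the pickled bytes are opaque to the
database: no WHERE clauses or ORDER BY on the stored numbers.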
From: Aaron R. <aar...@mo...> - 2009-07-23 05:52:45

Hi All,

I'm new to SQLObject, but have picked through the documentation as much as
possible and done numerous searches and am still at a loss... I'm wanting
to write some binary data to the database - specifically the data is
produced by something like:

    array.array('H',[1,2,3]).tostring()

and the database is Postgres 8.3 with a sqlobject.BLOBCol(default=None)
column.

Currently when I try this I receive this error:

    DataError: invalid byte sequence for encoding "UTF8": 0x83
    HINT: This error can also happen if the byte sequence does not match
    the encoding expected by the server, which is controlled by
    "client_encoding".

I ran into a vaguely similar post, and the suggestion was to use
UnicodeCol(), which I have tried and received the same result.

Basically the main focus here is on keeping the storage size in the DB as
low as possible, as we want to store many groups of numbers (all below
65536), and there is no need (or desire) to store them in individual
columns.

Thanks in advance,
-Aaron.
From: Stef T. <st...@um...> - 2009-07-23 01:39:24

Oleg Broytmann wrote:
> On Wed, Jul 22, 2009 at 05:18:46PM -0400, Stef Telford wrote:
>>>> Ordered by: cumulative time
>>>>
>>>> ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
>>>> 1        0.127    0.127   83.346   83.346  sresults.py:175(__iter__)
>>>> 40001    0.308    0.000   79.876    0.002  dbconnection.py:649(next)
>>>
>>> Hmm. Does the program try to draw the entire table in one huge slurp?
>>
>> Actually, no. This page does 40k of bookings (a single object type) but
>> the query isn't "sequential" (it's not grabbing 0-40,000 is what I
>> mean).
>
> Hmm. One call to SelectResults.__iter__ that spans the entire program's
> lifetime. "One huge SELECT" is the only way I can interpret this.

Hello Oleg,

Sorry, I should have been more precise in my statement. The bookings -are-
returned from one select/query via cursor, but the bookings are not
contiguous (e.g., booking 300, booking 310, booking 423, etc). This said,
I tried the select in psql and it completed in 19 msec, and then I tried
from a psycopg2 cursor and it took 3 seconds.. so.. I have to ask, why is
it taking 23 sec?

> So we have 29 seconds in Decimal.__new__, 10 seconds in fetchone, and 23
> seconds in _SO_selectInit. 63 seconds of 85...

Yup, and the next two are the notify method (which is similar in a way to
the publish/subscribe support I have added into SQLObject) and framestack
(which is used in the notify method to stop circular references). I could
rip those out but.. :\

>>>> 40000    0.214    0.000   23.323    0.001  main.py:912(_init)
>>>> 40000   10.475    0.000   23.069    0.001  main.py:1140(_SO_selectInit)
>>>
>>> _SO_selectInit is a rather simple function - it populates the row with
>>> values just fetched from the DB; the values are converted using calls
>>> to to_python. Those to_python calls are fast (if we can believe the
>>> profiler) - see above. So where does _SO_selectInit spend its time?!
>
> Well, it is certainly not evals. I also doubt it has something to do
> with garbage collection. Not in this function.

hrm. could it be tied to the Decimal inits? I mean to say, if we are
seeing that those take a while to create, and each object has 10-14
Decimals inside it.. hrm

Regards
Stef
From: Oleg B. <ph...@ph...> - 2009-07-22 21:56:14

On Wed, Jul 22, 2009 at 05:18:46PM -0400, Stef Telford wrote:
>>> Ordered by: cumulative time
>>>
>>> ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
>>> 1        0.127    0.127   83.346   83.346  sresults.py:175(__iter__)
>>> 40001    0.308    0.000   79.876    0.002  dbconnection.py:649(next)
>>
>> Hmm. Does the program try to draw the entire table in one huge slurp?
>
> Actually, no. This page does 40k of bookings (a single object type) but
> the query isn't "sequential" (it's not grabbing 0-40,000 is what I mean).

Hmm. One call to SelectResults.__iter__ that spans the entire program's
lifetime. "One huge SELECT" is the only way I can interpret this.

> I can do a page which has lots of different object types if this would
> help?

I doubt that would make a difference.

>>> 40000    0.375    0.000   39.282    0.001  main.py:872(get)
>>> 40002   10.018    0.000   39.252    0.001  {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}
>>
>> Half of the time the program spends drawing rows one by one. This
>> probably could be optimized by using fetchmany or fetchall.
>
> Noted. Let me try this later tonight when I have some spare cycles :)

It's not that easy. Changing fetchone to fetchmany requires an additional
loop over the results in Iteration.next - similar to what I did in
InheritableIteration.next.

>> My guess is that those Decimal.__new__ calls are inside the DB API
>> driver, and DecimalCol.to_python gets a Decimal and returns it
>> unchanged. This means that the lines
>>
>>> 40002    10.018   0.000   39.252    0.001  {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}
>>> 1840006  16.887   0.000   29.234    0.000  decimal.py:515(__new__)
>>
>> should be read as follows: fetchone draws a row and converts values to
>> Decimal, so 29.2 s are really a part of 39.2, and fetchone only waited
>> for the DB for 10 seconds.
>
> 10 seconds to fetch from the database is not bad (in my view). The 29 s
> for Decimal is definitely 'killer'.

So we have 29 seconds in Decimal.__new__, 10 seconds in fetchone, and 23
seconds in _SO_selectInit. 63 seconds of 85...

>>> 40000    0.214    0.000   23.323    0.001  main.py:912(_init)
>>> 40000   10.475    0.000   23.069    0.001  main.py:1140(_SO_selectInit)
>>
>> _SO_selectInit is a rather simple function - it populates the row with
>> values just fetched from the DB; the values are converted using calls
>> to to_python. Those to_python calls are fast (if we can believe the
>> profiler) - see above. So where does _SO_selectInit spend its time?!

Well, it is certainly not evals. I also doubt it has something to do with
garbage collection. Not in this function.

Oleg.
--
Oleg Broytmann  http://phd.pp.ru/  ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
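For reference, the generic DB-API 2.0 batching pattern Oleg is describing
looks roughly like this (the batch size is a tuning knob, the table is
from Oleg's Test example elsewhere in the thread, and conn stands in for
an open connection):

    def iterate_in_batches(conn, batch_size=500):
        # Pull rows in chunks instead of one fetchone() call per row.
        cur = conn.cursor()
        cur.execute("SELECT a, b FROM test")
        while True:
            rows = cur.fetchmany(batch_size)
            if not rows:
                break
            for row in rows:
                print row  # stand-in for per-row work

As Oleg notes, wiring this into SQLObject's Iteration.next is the hard
part, since the iterator has to buffer the batch between calls.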
From: Oleg B. <ph...@ph...> - 2009-07-22 21:49:27

On Wed, Jul 22, 2009 at 03:03:52PM -0400, Stef Telford wrote:
> surely, please find the output attached at the bottom.

Thank you!

> Ordered by: cumulative time
>
> ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
> 1        0.127    0.127   83.346   83.346  sresults.py:175(__iter__)
> 40001    0.308    0.000   79.876    0.002  dbconnection.py:649(next)

Hmm. Does the program try to draw the entire table in one huge slurp?

> 40000    0.375    0.000   39.282    0.001  main.py:872(get)
> 40002   10.018    0.000   39.252    0.001  {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}

Half of the time the program spends drawing rows one by one. This probably
could be optimized by using fetchmany or fetchall.

> 1840006  16.887   0.000   29.234    0.000  decimal.py:515(__new__)

A third of the time - Decimal.__new__. There is something strange here (in
the profiler output itself, I mean) - those Decimal calls are probably
from DecimalCol.to_python, but the profiler didn't add those calls:

> 1200000  1.856    0.000    2.769    0.000  col.py:1289(to_python)
> 3362951  2.729    0.000    2.729    0.000  main.py:1673(instanceName)
> 640000   0.999    0.000    1.471    0.000  col.py:657(to_python)

My guess is that those Decimal.__new__ calls are inside the DB API driver,
and DecimalCol.to_python gets a Decimal and returns it unchanged. This
means that the lines

> 40002    10.018   0.000   39.252    0.001  {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}
> 1840006  16.887   0.000   29.234    0.000  decimal.py:515(__new__)

should be read as follows: fetchone draws a row and converts values to
Decimal, so 29.2 s are really a part of 39.2, and fetchone only waited for
the DB for 10 seconds.

> 40000    0.214    0.000   23.323    0.001  main.py:912(_init)
> 40000   10.475    0.000   23.069    0.001  main.py:1140(_SO_selectInit)

_SO_selectInit is a rather simple function - it populates the row with
values just fetched from the DB; the values are converted using calls to
to_python. Those to_python calls are fast (if we can believe the profiler)
- see above. So where does _SO_selectInit spend its time?!

Oleg.
--
Oleg Broytmann  http://phd.pp.ru/  ph...@ph...
Programmers don't die, they just GOSUB without RETURN.
From: Stef T. <st...@um...> - 2009-07-22 21:18:57

Oleg Broytmann wrote:
> On Wed, Jul 22, 2009 at 03:03:52PM -0400, Stef Telford wrote:
>> surely, please find the output attached at the bottom.
>
> Thank you!

No, honestly, thank you :)

>> Ordered by: cumulative time
>>
>> ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
>> 1        0.127    0.127   83.346   83.346  sresults.py:175(__iter__)
>> 40001    0.308    0.000   79.876    0.002  dbconnection.py:649(next)
>
> Hmm. Does the program try to draw the entire table in one huge slurp?

Actually, no. This page does 40k of bookings (a single object type) but
the query isn't "sequential" (it's not grabbing 0-40,000 is what I mean).
I can do a page which has lots of different object types if this would
help?

>> 40000    0.375    0.000   39.282    0.001  main.py:872(get)
>> 40002   10.018    0.000   39.252    0.001  {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}
>
> Half of the time the program spends drawing rows one by one. This
> probably could be optimized by using fetchmany or fetchall.

Noted. Let me try this later tonight when I have some spare cycles :)

>> 1840006  16.887   0.000   29.234    0.000  decimal.py:515(__new__)
>
> A third of the time - Decimal.__new__. There is something strange here
> (in the profiler output itself, I mean) - those Decimal calls are
> probably from DecimalCol.to_python, but the profiler didn't add those
> calls:

Urm. hrm. that's weird. I would have thought that they would have been
added if they were called. I have read in more than a few places that
Decimal instantiation is slow when compared to float or gmpy :\

>> 1200000  1.856    0.000    2.769    0.000  col.py:1289(to_python)
>> 3362951  2.729    0.000    2.729    0.000  main.py:1673(instanceName)
>> 640000   0.999    0.000    1.471    0.000  col.py:657(to_python)
>
> My guess is that those Decimal.__new__ calls are inside the DB API
> driver, and DecimalCol.to_python gets a Decimal and returns it
> unchanged. This means that the lines
>
>> 40002    10.018   0.000   39.252    0.001  {method 'fetchone' of 'psycopg2._psycopg.cursor' objects}
>> 1840006  16.887   0.000   29.234    0.000  decimal.py:515(__new__)
>
> should be read as follows: fetchone draws a row and converts values to
> Decimal, so 29.2 s are really a part of 39.2, and fetchone only waited
> for the DB for 10 seconds.

10 seconds to fetch from the database is not bad (in my view). The 29 s
for Decimal is definitely 'killer'.

>> 40000    0.214    0.000   23.323    0.001  main.py:912(_init)
>> 40000   10.475    0.000   23.069    0.001  main.py:1140(_SO_selectInit)
>
> _SO_selectInit is a rather simple function - it populates the row with
> values just fetched from the DB; the values are converted using calls
> to to_python. Those to_python calls are fast (if we can believe the
> profiler) - see above. So where does _SO_selectInit spend its time?!

I wish I knew. The object in question -does- have about 40 or 50 columns
on it (don't ask.. lots of feature creep). I wonder if perhaps the number
of columns plays into the _init time? I take it that the class overview is
cached, so that it only has to be parsed 'once'.. but what about a class
reached through a FK or M2M? Is SQLObject able to find the pre-parsed
class (if that makes sense)?

Sorry about all this, but I really am sorta hitting my head here. I mean,
I -could- change Decimals to gmpy or something inside SO, but floats with
their (0.1 + 0.1 + 0.1 - 0.3) fiasco are pretty much a non-starter for me,
I think. Hrm. As I said, probably not being much help here :)

Regards
Stef
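For context, the float 'fiasco' Stef alludes to is easy to reproduce at
any Python prompt; binary floats accumulate representation error where
Decimal stays exact (the printed digits vary slightly by interpreter
version):

    >>> 0.1 + 0.1 + 0.1 - 0.3
    5.5511151231257827e-17
    >>> from decimal import Decimal
    >>> Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3')
    Decimal('0.0')

This is why simply swapping DecimalCol for FloatCol is a non-starter for
money columns, and why exact rational types like gmpy.mpq are attractive.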
From: Stef T. <st...@um...> - 2009-07-22 21:10:43

David Turner wrote:
> On Wed, 2009-07-22 at 16:16 -0400, David Turner wrote:
>> [earlier exchange trimmed - quoted in full in the messages below]
>>
>> Python's GC takes time roughly proportional to the number of live
>> objects. The GC is called every 700 allocations (7000, 70000 for older
>> generations), so allocation also takes time proportional to the number
>> of live objects.
>
> http://www.gossamer-threads.com/lists/python/dev/658396?page=last <-
> see this thread, for instance.

Hrm.. that would definitely explain and contribute to the slowdown after a
lot of objects on the page (>100k). I haven't seen any way around the GC
in Python, though, other than the folks at unladen-swallow trying to
re-implement it with a 'better' GC. Hrm, I wonder if IronPython could run
Webware + SQLObject .. hrm.

Regards
Stef
From: David T. <no...@op...> - 2009-07-22 21:04:12

On Wed, 2009-07-22 at 16:16 -0400, David Turner wrote:
> [earlier exchange trimmed - quoted in full in the message below]
>
> Python's GC takes time roughly proportional to the number of live
> objects. The GC is called every 700 allocations (7000, 70000 for older
> generations), so allocation also takes time proportional to the number
> of live objects.

http://www.gossamer-threads.com/lists/python/dev/658396?page=last <- see
this thread, for instance.
From: David T. <no...@op...> - 2009-07-22 20:43:35

On Wed, 2009-07-22 at 15:03 -0400, Stef Telford wrote:
> Oleg Broytmann wrote:
>> On Wed, Jul 22, 2009 at 02:26:57PM -0400, Stef Telford wrote:
>>>> yes. evals appear to be a 'bad' thing here :\
>>>
>>> Well, those evals are in the sqlmeta.addColumn and .addJoin methods,
>>> so they run once for every column in the table, but that's all. After
>>> the class has been created and populated - whatever you do with rows
>>> (class instances) - those evals are not executed.
>>
>> Ah. hrm. *rubs chin* perhaps it's not the evals then. It seems that the
>> instantiations get .. well .. 'slower' over time.
>
>> Curiouser and curiouser. IWBN to find where the slowness is *near the
>> end* of the loop - i.e. when instantiation becomes really slow.
>
> It could be purely a 'feeling' .. I don't have any numbers to back it
> up, and I am not entirely sure -how- to benchmark that. This machine
> does have 8gb of ram in it, and a fairly beefy quad core. I have never
> seen any process get near memory exhaustion, which I could believe the
> calls to 'malloc' could slow down but.. yes.

Python's GC takes time roughly proportional to the number of live objects.
The GC is called every 700 allocations (7000, 70000 for older
generations), so allocation also takes time proportional to the number of
live objects.
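If collector overhead is indeed the culprit, the standard mitigation is to
tune or pause the cyclic collector around bulk object creation. A sketch
using the stdlib gc module (bulk_load is a stand-in for whatever builds
the 100k+ objects):

    import gc

    # Raise the generation-0 threshold so collection runs less often
    # (the defaults are roughly 700, 10, 10).
    gc.set_threshold(100000, 10, 10)

    # Or pause the cyclic collector entirely around a bulk load;
    # reference counting still frees all non-cyclic garbage.
    gc.disable()
    try:
        bulk_load()  # hypothetical function creating many objects
    finally:
        gc.enable()

This only skips cycle detection, so code that creates reference cycles
will hold memory until the collector is re-enabled.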