[SQL-CVS] [ sqlobject-Support Requests-3366899 ] The speed of SQLObject is slower on the time
SQLObject is a Python ORM.
Brought to you by:
ianbicking,
phd
From: SourceForge.net <no...@so...> - 2011-07-20 13:33:18
|
Support Requests item #3366899, was opened at 2011-07-14 11:40 Message generated for change (Comment added) made by phd You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=540673&aid=3366899&group_id=74338 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: None >Status: Closed Priority: 5 Private: No Submitted By: Rosa Robles (marobles) Assigned to: Oleg Broytman (phd) Summary: The speed of SQLObject is slower on the time Initial Comment: I had a test of a SQLObject and I had found a strange behaviour of SQLObject. On the time, the respond time is lesser. Here, I put a example of test code : from sqlobject.sqlbuilder import * connection = connectionForURI('postgresql://test:test@localhost/test_sql_object') sqlhub.processConnection = connection class Person(SQLObject): firstName = StringCol() middleInitial = StringCol(length=1, default=None) lastName = StringCol() iteration = 0 while 1: start = datetime.now() person_list = [] for key in range(1000): person = Person.selectBy(firstName='name%s%s' % (iteration,key))[0] person_list.append(person) for person in person_list: person.expire() del person del person_list sqlhub.processConnection.cache.clear() end = datetime.now() print "TIME ",iteration,':' , end - start iteration = iteration + 1 In the result I can check that in each iteration the time is bigger. The first iteration has time of 1.37s, then the time increases and the 19 iteration has time of 5.83 s! TIME 0 : 0:00:01.376392 TIME 1 : 0:00:01.663250 TIME 2 : 0:00:01.978639 TIME 3 : 0:00:02.319939 TIME 4 : 0:00:02.618213 TIME 5 : 0:00:02.723846 TIME 6 : 0:00:02.584007 TIME 7 : 0:00:02.868877 TIME 8 : 0:00:03.051937 TIME 9 : 0:00:03.389757 TIME 10 : 0:00:03.503036 TIME 11 : 0:00:03.563154 TIME 12 : 0:00:04.191539 TIME 13 : 0:00:04.274464 TIME 14 : 0:00:04.813206 TIME 15 : 0:00:04.677002 TIME 16 : 0:00:04.995708 TIME 17 : 0:00:05.126811 TIME 18 : 0:00:05.571279 TIME 19 : 0:00:05.832840 Why is happend this? Thanks in advance for your attention. Cheers Rosa ---------------------------------------------------------------------- >Comment By: Oleg Broytman (phd) Date: 2011-07-20 17:33 Message: MySQL is different, AFAIU, because it has only client-side cursors instead of server-side; i.e. MySQLdb fetches all results even if SQLObject asks to fetch one row. ---------------------------------------------------------------------- Comment By: Rosa Robles (marobles) Date: 2011-07-20 10:32 Message: I didn't know profiler tool, thanks very much, Oleg!! I'm agree with you source of problem, in postgres it spends almost all time in method 'execute' of 'psycopg2._psycopg.cursor' objects however in mysql it spends almost all time in method 'query' of '_mysql.connection' objects. The problem is in the use of database, a lots of operations to database spends a lots time. Thanks very much, your help has been very useful to me. ---------------------------------------------------------------------- Comment By: Oleg Broytman (phd) Date: 2011-07-19 21:09 Message: Well, I ran it myself: python -m cProfile -s cumulative test_sql_object_by_object.py. The results is 5089432 function calls (4987267 primitive calls) in 82.498 CPU seconds Ordered by: cumulative time ncalls tottime percall cumtime percall filename:lineno(function) 1 0.001 0.001 82.498 82.498 {execfile} 1 0.001 0.001 82.497 82.497 test_sql_object_by_object.py:3(<module>) 1 0.324 0.324 82.115 82.115 test_sql_object_by_object.py:50(select_examples) 20000 0.295 0.000 77.778 0.004 sresults.py:126(__getitem__) 20000 0.442 0.000 75.925 0.004 sresults.py:175(__iter__) 20000 0.107 0.000 69.318 0.003 sresults.py:181(lazyIter) 20000 0.150 0.000 69.076 0.003 dbconnection.py:470(iterSelect) 20000 0.202 0.000 68.665 0.003 dbconnection.py:705(__init__) 20001 0.085 0.000 60.550 0.003 sqliteconnection.py:194(_executeRetry) 20001 60.465 0.003 60.465 0.003 {method 'execute' of 'pysqlite2.dbapi2.Cursor' objects} 20000 0.156 0.000 7.846 0.000 dbconnection.py:485(queryForSelect) 40000 0.332 0.000 5.945 0.000 dbconnection.py:719(next) 40000 0.170 0.000 5.601 0.000 dbconnection.py:670(sqlrepr) 140000/40000 0.584 0.000 5.430 0.000 converters.py:198(sqlrepr) 20000 1.305 0.000 4.802 0.000 sqlbuilder.py:613(__sqlrepr__) 20000 0.399 0.000 4.536 0.000 main.py:878(get) 20000 0.205 0.000 3.503 0.000 main.py:1373(selectBy) 20000 0.182 0.000 2.920 0.000 main.py:915(_init) 20000 0.655 0.000 2.735 0.000 sresults.py:42(queryForSelect) 140000 0.679 0.000 2.704 0.000 sqlbuilder.py:215(tablesUsedSet) 40000 0.931 0.000 2.620 0.000 sresults.py:9(__init__) 20000 0.440 0.000 2.373 0.000 main.py:1154(_SO_selectInit) 220568 0.660 0.000 1.967 0.000 {getattr} 80000 0.776 0.000 1.950 0.000 col.py:503(to_python) 140000 1.078 0.000 1.793 0.000 sqlbuilder.py:194(tablesUsedSet) 20000 0.465 0.000 1.699 0.000 dbconnection.py:641(_SO_columnClause) 20000 0.183 0.000 1.492 0.000 sresults.py:89(clone) 80000 0.446 0.000 1.431 0.000 sqlbuilder.py:391(__getattr__) 100024 0.597 0.000 1.173 0.000 dbconnection.py:896(__get__) 564108 1.073 0.000 1.073 0.000 {isinstance} The entire program took 80 seconds, of which 60 seconds were spend in method 'execute' of 'pysqlite2.dbapi2.Cursor' (I used sqlite). Not an SQLObject fault. You can run a profiler yourself sorting by different column. I'd like to hear your interpretation. ---------------------------------------------------------------------- Comment By: Oleg Broytman (phd) Date: 2011-07-19 19:21 Message: To stop cache culling I added connection.cache = CacheSet(cullFrequency=10**6) to your test program. It didn't help. So it seems cache culling is not the culprit. There is a need to do a deeper investigation. Can you run a python profiler over your test(s)? ---------------------------------------------------------------------- Comment By: Rosa Robles (marobles) Date: 2011-07-19 12:09 Message: if we review the results, after read a lots of records selectBy takes more times, in the first iteration is almost zero but in 20 iteration is bigger. If you do the example, and you takes the times without to call to expireAll() or clear(), you can check that the time c for iteration in range(20): start = datetime.now() for key in range(1000): person = Person.selectBy(firstName='name%s%s' % (iteration,key))[0] end = datetime.now() print "ITERATION ",iteration, "TIME", end-start Person.sqlmeta.expireAll() sqlhub.getConnection().expireAll() connection.cache.clear() ---------------------------------------------------------------------- Comment By: Oleg Broytman (phd) Date: 2011-07-18 21:50 Message: time_end-time_start are consistently almost zero, so selectBy doesn't take time, right? ---------------------------------------------------------------------- Comment By: Rosa Robles (marobles) Date: 2011-07-18 10:36 Message: The program takes time only from the selectby operation for key in range(1000): time_start = datetime.now() person = Person.selectBy(firstName='name%s%s' %(iteration,key))[0] time_end = datetime.now() times.append(time_end-time_start) I attach the example to you can read better If we review the results, the selectBy method takes times random in iteration, but also each iteration's time grows up. I think the problem is in the execution of cull process when delete expiredCache values, what do you think? In my real program do not use select in mass but it can acumulate a lots operations of select in the time, then we have problems of memory and we must restart the application. May be this situation is the source of our problem with the memory. ---------------------------------------------------------------------- Comment By: Oleg Broytman (phd) Date: 2011-07-15 21:14 Message: Min/max/avg times are almost zero, so it seems the only place the program spends time is calling .expireAll() and cache.clear(), right? ---------------------------------------------------------------------- Comment By: Rosa Robles (marobles) Date: 2011-07-15 10:52 Message: Thanks Oleg. I'm using SQLObject 1.1.1, Python2.6, Postgres 8.4 Ok, I change the code to takes more times. for iteration in range(20): start = datetime.now() acum = timedelta(microseconds=0) times = [] for key in range(1000): time_start = datetime.now() person = Person.selectBy(firstName='name%s%s' % (iteration,key))[0] time_end = datetime.now() times.append(time_end-time_start) acum = acum + (time_end-time_start) end = datetime.now() print "ITERATION ",iteration,"time",end-start print " First Key ->", times[0] print " Last Key ->",times[999] print " Key ->",times.index(max(times)), " Max Time ->", max(times) print " Key ->",times.index(min(times)), " Min Time ->", min(times) print " avg ->",acum/999 Person.sqlmeta.expireAll() for k in connection.cache.caches.keys(): connection.cache.caches[k].expireAll() sqlhub.getConnection().expireAll() connection.cache.clear() gc.collect() And the result is: ITERATION 0 time 0:00:01.302290 First Key -> 0:00:00.001555 Last Key -> 0:00:00.001445 Key -> 11 Max Time -> 0:00:00.003216 Key -> 36 Min Time -> 0:00:00.001115 avg -> 0:00:00.001287 ITERATION 1 time 0:00:01.621580 First Key -> 0:00:00.001879 Last Key -> 0:00:00.001734 Key -> 924 Max Time -> 0:00:00.009572 Key -> 22 Min Time -> 0:00:00.001076 avg -> 0:00:00.001606 ITERATION 2 time 0:00:01.851719 First Key -> 0:00:00.002328 Last Key -> 0:00:00.002064 Key -> 970 Max Time -> 0:00:00.002544 Key -> 22 Min Time -> 0:00:00.001306 avg -> 0:00:00.001836 ITERATION 3 time 0:00:02.040802 First Key -> 0:00:00.002435 Last Key -> 0:00:00.002376 Key -> 557 Max Time -> 0:00:00.009674 Key -> 25 Min Time -> 0:00:00.001538 avg -> 0:00:00.002026 ITERATION 4 time 0:00:02.427911 First Key -> 0:00:00.002906 Last Key -> 0:00:00.002065 Key -> 774 Max Time -> 0:00:00.003263 Key -> 7 Min Time -> 0:00:00.001763 avg -> 0:00:00.002413 ITERATION 5 time 0:00:02.768819 First Key -> 0:00:00.003051 Last Key -> 0:00:00.002319 Key -> 920 Max Time -> 0:00:00.007694 Key -> 11 Min Time -> 0:00:00.002007 avg -> 0:00:00.002753 ITERATION 6 time 0:00:02.537283 First Key -> 0:00:00.002676 Last Key -> 0:00:00.002549 Key -> 2 Max Time -> 0:00:00.003512 Key -> 132 Min Time -> 0:00:00.002318 avg -> 0:00:00.002523 ITERATION 7 time 0:00:02.723405 First Key -> 0:00:00.005272 Last Key -> 0:00:00.002788 Key -> 79 Max Time -> 0:00:00.007543 Key -> 12 Min Time -> 0:00:00.002457 avg -> 0:00:00.002709 ITERATION 8 time 0:00:02.947676 First Key -> 0:00:00.003161 Last Key -> 0:00:00.003028 Key -> 1 Max Time -> 0:00:00.006068 Key -> 31 Min Time -> 0:00:00.002683 avg -> 0:00:00.002933 ITERATION 9 time 0:00:03.781268 First Key -> 0:00:00.004271 Last Key -> 0:00:00.003416 Key -> 930 Max Time -> 0:00:00.011791 Key -> 5 Min Time -> 0:00:00.003029 avg -> 0:00:00.003765 ITERATION 10 time 0:00:03.842353 First Key -> 0:00:00.006544 Last Key -> 0:00:00.003394 Key -> 0 Max Time -> 0:00:00.006544 Key -> 4 Min Time -> 0:00:00.003171 avg -> 0:00:00.003826 ITERATION 11 time 0:00:03.361368 First Key -> 0:00:00.001509 Last Key -> 0:00:00.003626 Key -> 602 Max Time -> 0:00:00.009093 Key -> 9 Min Time -> 0:00:00.001089 avg -> 0:00:00.003346 ITERATION 12 time 0:00:03.581802 First Key -> 0:00:00.001511 Last Key -> 0:00:00.003912 Key -> 474 Max Time -> 0:00:00.009027 Key -> 1 Min Time -> 0:00:00.001117 avg -> 0:00:00.003566 ITERATION 13 time 0:00:03.797965 First Key -> 0:00:00.001833 Last Key -> 0:00:00.004118 Key -> 100 Max Time -> 0:00:00.005115 Key -> 9 Min Time -> 0:00:00.001126 avg -> 0:00:00.003782 ITERATION 14 time 0:00:04.147961 First Key -> 0:00:00.001851 Last Key -> 0:00:00.004362 Key -> 782 Max Time -> 0:00:00.013857 Key -> 16 Min Time -> 0:00:00.001186 avg -> 0:00:00.004132 ITERATION 15 time 0:00:04.251830 First Key -> 0:00:00.001538 Last Key -> 0:00:00.004611 Key -> 518 Max Time -> 0:00:00.009511 Key -> 9 Min Time -> 0:00:00.001114 avg -> 0:00:00.004236 ITERATION 16 time 0:00:04.487687 First Key -> 0:00:00.001541 Last Key -> 0:00:00.004994 Key -> 858 Max Time -> 0:00:00.009368 Key -> 9 Min Time -> 0:00:00.001120 avg -> 0:00:00.004472 ITERATION 17 time 0:00:04.703588 First Key -> 0:00:00.003440 Last Key -> 0:00:00.005104 Key -> 529 Max Time -> 0:00:00.008179 Key -> 8 Min Time -> 0:00:00.001105 avg -> 0:00:00.004688 ITERATION 18 time 0:00:05.300920 First Key -> 0:00:00.001851 Last Key -> 0:00:00.006987 Key -> 854 Max Time -> 0:00:00.007299 Key -> 9 Min Time -> 0:00:00.001181 avg -> 0:00:00.005285 ITERATION 19 time 0:00:06.168681 First Key -> 0:00:00.001880 Last Key -> 0:00:00.005577 Key -> 911 Max Time -> 0:00:00.009503 Key -> 9 Min Time -> 0:00:00.001119 avg -> 0:00:00.006152 ---------------------------------------------------------------------- Comment By: Oleg Broytman (phd) Date: 2011-07-14 20:56 Message: No idea what is going on. Let's investigate a little further. What version of SQLObject are you using? Can you add more prints inside the loop to see what part of the program takes time? ---------------------------------------------------------------------- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=540673&aid=3366899&group_id=74338 |