Support Requests item #3366899, was opened at 2011-07-14 11:40
Message generated for change (Comment added) made by phd
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=540673&aid=3366899&group_id=74338
Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: None
>Status: Closed
Priority: 5
Private: No
Submitted By: Rosa Robles (marobles)
Assigned to: Oleg Broytman (phd)
Summary: The speed of SQLObject is slower on the time
Initial Comment:
I had a test of a SQLObject and I had found a strange behaviour of SQLObject. On the time, the respond time is lesser.
Here, I put a example of test code :
from sqlobject.sqlbuilder import *
connection = connectionForURI('postgresql://test:test@localhost/test_sql_object')
sqlhub.processConnection = connection
class Person(SQLObject):
firstName = StringCol()
middleInitial = StringCol(length=1, default=None)
lastName = StringCol()
iteration = 0
while 1:
start = datetime.now()
person_list = []
for key in range(1000):
person = Person.selectBy(firstName='name%s%s' % (iteration,key))[0]
person_list.append(person)
for person in person_list:
person.expire()
del person
del person_list
sqlhub.processConnection.cache.clear()
end = datetime.now()
print "TIME ",iteration,':' , end - start
iteration = iteration + 1
In the result I can check that in each iteration the time is bigger. The first iteration has time of 1.37s, then the time increases and the 19 iteration has time of 5.83 s!
TIME 0 : 0:00:01.376392
TIME 1 : 0:00:01.663250
TIME 2 : 0:00:01.978639
TIME 3 : 0:00:02.319939
TIME 4 : 0:00:02.618213
TIME 5 : 0:00:02.723846
TIME 6 : 0:00:02.584007
TIME 7 : 0:00:02.868877
TIME 8 : 0:00:03.051937
TIME 9 : 0:00:03.389757
TIME 10 : 0:00:03.503036
TIME 11 : 0:00:03.563154
TIME 12 : 0:00:04.191539
TIME 13 : 0:00:04.274464
TIME 14 : 0:00:04.813206
TIME 15 : 0:00:04.677002
TIME 16 : 0:00:04.995708
TIME 17 : 0:00:05.126811
TIME 18 : 0:00:05.571279
TIME 19 : 0:00:05.832840
Why is happend this?
Thanks in advance for your attention.
Cheers Rosa
----------------------------------------------------------------------
>Comment By: Oleg Broytman (phd)
Date: 2011-07-20 17:33
Message:
MySQL is different, AFAIU, because it has only client-side cursors instead
of server-side; i.e. MySQLdb fetches all results even if SQLObject asks to
fetch one row.
----------------------------------------------------------------------
Comment By: Rosa Robles (marobles)
Date: 2011-07-20 10:32
Message:
I didn't know profiler tool, thanks very much, Oleg!!
I'm agree with you source of problem, in postgres it spends almost all
time in
method 'execute' of 'psycopg2._psycopg.cursor' objects
however in mysql it spends almost all time in method 'query' of
'_mysql.connection' objects.
The problem is in the use of database, a lots of operations to database
spends a lots time.
Thanks very much, your help has been very useful to me.
----------------------------------------------------------------------
Comment By: Oleg Broytman (phd)
Date: 2011-07-19 21:09
Message:
Well, I ran it myself: python -m cProfile -s cumulative
test_sql_object_by_object.py. The results is
5089432 function calls (4987267 primitive calls) in 82.498 CPU
seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.001 0.001 82.498 82.498 {execfile}
1 0.001 0.001 82.497 82.497
test_sql_object_by_object.py:3(<module>)
1 0.324 0.324 82.115 82.115
test_sql_object_by_object.py:50(select_examples)
20000 0.295 0.000 77.778 0.004
sresults.py:126(__getitem__)
20000 0.442 0.000 75.925 0.004 sresults.py:175(__iter__)
20000 0.107 0.000 69.318 0.003 sresults.py:181(lazyIter)
20000 0.150 0.000 69.076 0.003
dbconnection.py:470(iterSelect)
20000 0.202 0.000 68.665 0.003
dbconnection.py:705(__init__)
20001 0.085 0.000 60.550 0.003
sqliteconnection.py:194(_executeRetry)
20001 60.465 0.003 60.465 0.003 {method 'execute' of
'pysqlite2.dbapi2.Cursor' objects}
20000 0.156 0.000 7.846 0.000
dbconnection.py:485(queryForSelect)
40000 0.332 0.000 5.945 0.000 dbconnection.py:719(next)
40000 0.170 0.000 5.601 0.000
dbconnection.py:670(sqlrepr)
140000/40000 0.584 0.000 5.430 0.000
converters.py:198(sqlrepr)
20000 1.305 0.000 4.802 0.000
sqlbuilder.py:613(__sqlrepr__)
20000 0.399 0.000 4.536 0.000 main.py:878(get)
20000 0.205 0.000 3.503 0.000 main.py:1373(selectBy)
20000 0.182 0.000 2.920 0.000 main.py:915(_init)
20000 0.655 0.000 2.735 0.000
sresults.py:42(queryForSelect)
140000 0.679 0.000 2.704 0.000
sqlbuilder.py:215(tablesUsedSet)
40000 0.931 0.000 2.620 0.000 sresults.py:9(__init__)
20000 0.440 0.000 2.373 0.000
main.py:1154(_SO_selectInit)
220568 0.660 0.000 1.967 0.000 {getattr}
80000 0.776 0.000 1.950 0.000 col.py:503(to_python)
140000 1.078 0.000 1.793 0.000
sqlbuilder.py:194(tablesUsedSet)
20000 0.465 0.000 1.699 0.000
dbconnection.py:641(_SO_columnClause)
20000 0.183 0.000 1.492 0.000 sresults.py:89(clone)
80000 0.446 0.000 1.431 0.000
sqlbuilder.py:391(__getattr__)
100024 0.597 0.000 1.173 0.000
dbconnection.py:896(__get__)
564108 1.073 0.000 1.073 0.000 {isinstance}
The entire program took 80 seconds, of which 60 seconds were spend in
method 'execute' of 'pysqlite2.dbapi2.Cursor' (I used sqlite). Not an
SQLObject fault. You can run a profiler yourself sorting by different
column. I'd like to hear your interpretation.
----------------------------------------------------------------------
Comment By: Oleg Broytman (phd)
Date: 2011-07-19 19:21
Message:
To stop cache culling I added
connection.cache = CacheSet(cullFrequency=10**6)
to your test program. It didn't help. So it seems cache culling is not the
culprit. There is a need to do a deeper investigation. Can you run a python
profiler over your test(s)?
----------------------------------------------------------------------
Comment By: Rosa Robles (marobles)
Date: 2011-07-19 12:09
Message:
if we review the results, after read a lots of records selectBy takes more
times, in the first iteration is almost zero but in 20 iteration is bigger.
If you do the example, and you takes the times without to call to
expireAll() or clear(), you can check that the time c
for iteration in range(20):
start = datetime.now()
for key in range(1000):
person = Person.selectBy(firstName='name%s%s' %
(iteration,key))[0]
end = datetime.now()
print "ITERATION ",iteration, "TIME", end-start
Person.sqlmeta.expireAll()
sqlhub.getConnection().expireAll()
connection.cache.clear()
----------------------------------------------------------------------
Comment By: Oleg Broytman (phd)
Date: 2011-07-18 21:50
Message:
time_end-time_start are consistently almost zero, so selectBy doesn't take
time, right?
----------------------------------------------------------------------
Comment By: Rosa Robles (marobles)
Date: 2011-07-18 10:36
Message:
The program takes time only from the selectby operation
for key in range(1000):
time_start = datetime.now()
person = Person.selectBy(firstName='name%s%s' %(iteration,key))[0]
time_end = datetime.now()
times.append(time_end-time_start)
I attach the example to you can read better
If we review the results, the selectBy method takes times random in
iteration, but also each iteration's time grows up.
I think the problem is in the execution of cull process when delete
expiredCache values, what do you think?
In my real program do not use select in mass but it can acumulate a lots
operations of select in the time,
then we have problems of memory and we must restart the application.
May be this situation is the source of our problem with the memory.
----------------------------------------------------------------------
Comment By: Oleg Broytman (phd)
Date: 2011-07-15 21:14
Message:
Min/max/avg times are almost zero, so it seems the only place the program
spends time is calling .expireAll() and cache.clear(), right?
----------------------------------------------------------------------
Comment By: Rosa Robles (marobles)
Date: 2011-07-15 10:52
Message:
Thanks Oleg.
I'm using SQLObject 1.1.1, Python2.6, Postgres 8.4
Ok, I change the code to takes more times.
for iteration in range(20):
start = datetime.now()
acum = timedelta(microseconds=0)
times = []
for key in range(1000):
time_start = datetime.now()
person = Person.selectBy(firstName='name%s%s' %
(iteration,key))[0]
time_end = datetime.now()
times.append(time_end-time_start)
acum = acum + (time_end-time_start)
end = datetime.now()
print "ITERATION ",iteration,"time",end-start
print " First Key ->", times[0]
print " Last Key ->",times[999]
print " Key ->",times.index(max(times)), " Max Time
->", max(times)
print " Key ->",times.index(min(times)), " Min Time
->", min(times)
print " avg ->",acum/999
Person.sqlmeta.expireAll()
for k in connection.cache.caches.keys():
connection.cache.caches[k].expireAll()
sqlhub.getConnection().expireAll()
connection.cache.clear()
gc.collect()
And the result is:
ITERATION 0 time 0:00:01.302290
First Key -> 0:00:00.001555
Last Key -> 0:00:00.001445
Key -> 11 Max Time -> 0:00:00.003216
Key -> 36 Min Time -> 0:00:00.001115
avg -> 0:00:00.001287
ITERATION 1 time 0:00:01.621580
First Key -> 0:00:00.001879
Last Key -> 0:00:00.001734
Key -> 924 Max Time -> 0:00:00.009572
Key -> 22 Min Time -> 0:00:00.001076
avg -> 0:00:00.001606
ITERATION 2 time 0:00:01.851719
First Key -> 0:00:00.002328
Last Key -> 0:00:00.002064
Key -> 970 Max Time -> 0:00:00.002544
Key -> 22 Min Time -> 0:00:00.001306
avg -> 0:00:00.001836
ITERATION 3 time 0:00:02.040802
First Key -> 0:00:00.002435
Last Key -> 0:00:00.002376
Key -> 557 Max Time -> 0:00:00.009674
Key -> 25 Min Time -> 0:00:00.001538
avg -> 0:00:00.002026
ITERATION 4 time 0:00:02.427911
First Key -> 0:00:00.002906
Last Key -> 0:00:00.002065
Key -> 774 Max Time -> 0:00:00.003263
Key -> 7 Min Time -> 0:00:00.001763
avg -> 0:00:00.002413
ITERATION 5 time 0:00:02.768819
First Key -> 0:00:00.003051
Last Key -> 0:00:00.002319
Key -> 920 Max Time -> 0:00:00.007694
Key -> 11 Min Time -> 0:00:00.002007
avg -> 0:00:00.002753
ITERATION 6 time 0:00:02.537283
First Key -> 0:00:00.002676
Last Key -> 0:00:00.002549
Key -> 2 Max Time -> 0:00:00.003512
Key -> 132 Min Time -> 0:00:00.002318
avg -> 0:00:00.002523
ITERATION 7 time 0:00:02.723405
First Key -> 0:00:00.005272
Last Key -> 0:00:00.002788
Key -> 79 Max Time -> 0:00:00.007543
Key -> 12 Min Time -> 0:00:00.002457
avg -> 0:00:00.002709
ITERATION 8 time 0:00:02.947676
First Key -> 0:00:00.003161
Last Key -> 0:00:00.003028
Key -> 1 Max Time -> 0:00:00.006068
Key -> 31 Min Time -> 0:00:00.002683
avg -> 0:00:00.002933
ITERATION 9 time 0:00:03.781268
First Key -> 0:00:00.004271
Last Key -> 0:00:00.003416
Key -> 930 Max Time -> 0:00:00.011791
Key -> 5 Min Time -> 0:00:00.003029
avg -> 0:00:00.003765
ITERATION 10 time 0:00:03.842353
First Key -> 0:00:00.006544
Last Key -> 0:00:00.003394
Key -> 0 Max Time -> 0:00:00.006544
Key -> 4 Min Time -> 0:00:00.003171
avg -> 0:00:00.003826
ITERATION 11 time 0:00:03.361368
First Key -> 0:00:00.001509
Last Key -> 0:00:00.003626
Key -> 602 Max Time -> 0:00:00.009093
Key -> 9 Min Time -> 0:00:00.001089
avg -> 0:00:00.003346
ITERATION 12 time 0:00:03.581802
First Key -> 0:00:00.001511
Last Key -> 0:00:00.003912
Key -> 474 Max Time -> 0:00:00.009027
Key -> 1 Min Time -> 0:00:00.001117
avg -> 0:00:00.003566
ITERATION 13 time 0:00:03.797965
First Key -> 0:00:00.001833
Last Key -> 0:00:00.004118
Key -> 100 Max Time -> 0:00:00.005115
Key -> 9 Min Time -> 0:00:00.001126
avg -> 0:00:00.003782
ITERATION 14 time 0:00:04.147961
First Key -> 0:00:00.001851
Last Key -> 0:00:00.004362
Key -> 782 Max Time -> 0:00:00.013857
Key -> 16 Min Time -> 0:00:00.001186
avg -> 0:00:00.004132
ITERATION 15 time 0:00:04.251830
First Key -> 0:00:00.001538
Last Key -> 0:00:00.004611
Key -> 518 Max Time -> 0:00:00.009511
Key -> 9 Min Time -> 0:00:00.001114
avg -> 0:00:00.004236
ITERATION 16 time 0:00:04.487687
First Key -> 0:00:00.001541
Last Key -> 0:00:00.004994
Key -> 858 Max Time -> 0:00:00.009368
Key -> 9 Min Time -> 0:00:00.001120
avg -> 0:00:00.004472
ITERATION 17 time 0:00:04.703588
First Key -> 0:00:00.003440
Last Key -> 0:00:00.005104
Key -> 529 Max Time -> 0:00:00.008179
Key -> 8 Min Time -> 0:00:00.001105
avg -> 0:00:00.004688
ITERATION 18 time 0:00:05.300920
First Key -> 0:00:00.001851
Last Key -> 0:00:00.006987
Key -> 854 Max Time -> 0:00:00.007299
Key -> 9 Min Time -> 0:00:00.001181
avg -> 0:00:00.005285
ITERATION 19 time 0:00:06.168681
First Key -> 0:00:00.001880
Last Key -> 0:00:00.005577
Key -> 911 Max Time -> 0:00:00.009503
Key -> 9 Min Time -> 0:00:00.001119
avg -> 0:00:00.006152
----------------------------------------------------------------------
Comment By: Oleg Broytman (phd)
Date: 2011-07-14 20:56
Message:
No idea what is going on. Let's investigate a little further. What version
of SQLObject are you using? Can you add more prints inside the loop to see
what part of the program takes time?
----------------------------------------------------------------------
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=540673&aid=3366899&group_id=74338
|