#6 The speed of SQLObject is slower on the time

closed
Oleg Broytman
None
5
2011-07-20
2011-07-14
Rosa Robles
No

I had a test of a SQLObject and I had found a strange behaviour of SQLObject. On the time, the respond time is lesser.

Here, I put a example of test code :

from sqlobject.sqlbuilder import *
connection = connectionForURI('postgresql://test:test@localhost/test_sql_object')
sqlhub.processConnection = connection

class Person(SQLObject):
firstName = StringCol()
middleInitial = StringCol(length=1, default=None)
lastName = StringCol()

iteration = 0
while 1:
start = datetime.now()
person_list = []
for key in range(1000):
person = Person.selectBy(firstName='name%s%s' % (iteration,key))[0]
person_list.append(person)

for person in person_list:
person.expire()
del person
del person_list
sqlhub.processConnection.cache.clear()

end = datetime.now()

print "TIME ",iteration,':' , end - start
iteration = iteration + 1

In the result I can check that in each iteration the time is bigger. The first iteration has time of 1.37s, then the time increases and the 19 iteration has time of 5.83 s!

TIME 0 : 0:00:01.376392
TIME 1 : 0:00:01.663250
TIME 2 : 0:00:01.978639
TIME 3 : 0:00:02.319939
TIME 4 : 0:00:02.618213
TIME 5 : 0:00:02.723846
TIME 6 : 0:00:02.584007
TIME 7 : 0:00:02.868877
TIME 8 : 0:00:03.051937
TIME 9 : 0:00:03.389757
TIME 10 : 0:00:03.503036
TIME 11 : 0:00:03.563154
TIME 12 : 0:00:04.191539
TIME 13 : 0:00:04.274464
TIME 14 : 0:00:04.813206
TIME 15 : 0:00:04.677002
TIME 16 : 0:00:04.995708
TIME 17 : 0:00:05.126811
TIME 18 : 0:00:05.571279
TIME 19 : 0:00:05.832840

Why is happend this?
Thanks in advance for your attention.
Cheers Rosa

Discussion

1 2 > >> (Page 1 of 2)
  • Oleg Broytman
    Oleg Broytman
    2011-07-14

    No idea what is going on. Let's investigate a little further. What version of SQLObject are you using? Can you add more prints inside the loop to see what part of the program takes time?

     
  • Oleg Broytman
    Oleg Broytman
    2011-07-14

    • assigned_to: nobody --> phd
     
  • Rosa Robles
    Rosa Robles
    2011-07-15

    Thanks Oleg.
    I'm using SQLObject 1.1.1, Python2.6, Postgres 8.4

    Ok, I change the code to takes more times.

    for iteration in range(20):
    start = datetime.now()
    acum = timedelta(microseconds=0)
    times = []

    for key in range(1000):
    time_start = datetime.now()
    person = Person.selectBy(firstName='name%s%s' % (iteration,key))[0]
    time_end = datetime.now()
    times.append(time_end-time_start)
    acum = acum + (time_end-time_start)

    end = datetime.now()

    print "ITERATION ",iteration,"time",end-start
    print " First Key ->", times[0]
    print " Last Key ->",times[999]
    print " Key ->",times.index(max(times)), " Max Time ->", max(times)
    print " Key ->",times.index(min(times)), " Min Time ->", min(times)
    print " avg ->",acum/999

    Person.sqlmeta.expireAll()
    for k in connection.cache.caches.keys():
    connection.cache.caches[k].expireAll()
    sqlhub.getConnection().expireAll()
    connection.cache.clear()
    gc.collect()

    And the result is:

    ITERATION 0 time 0:00:01.302290
    First Key -> 0:00:00.001555
    Last Key -> 0:00:00.001445
    Key -> 11 Max Time -> 0:00:00.003216
    Key -> 36 Min Time -> 0:00:00.001115
    avg -> 0:00:00.001287
    ITERATION 1 time 0:00:01.621580
    First Key -> 0:00:00.001879
    Last Key -> 0:00:00.001734
    Key -> 924 Max Time -> 0:00:00.009572
    Key -> 22 Min Time -> 0:00:00.001076
    avg -> 0:00:00.001606
    ITERATION 2 time 0:00:01.851719
    First Key -> 0:00:00.002328
    Last Key -> 0:00:00.002064
    Key -> 970 Max Time -> 0:00:00.002544
    Key -> 22 Min Time -> 0:00:00.001306
    avg -> 0:00:00.001836
    ITERATION 3 time 0:00:02.040802
    First Key -> 0:00:00.002435
    Last Key -> 0:00:00.002376
    Key -> 557 Max Time -> 0:00:00.009674
    Key -> 25 Min Time -> 0:00:00.001538
    avg -> 0:00:00.002026
    ITERATION 4 time 0:00:02.427911
    First Key -> 0:00:00.002906
    Last Key -> 0:00:00.002065
    Key -> 774 Max Time -> 0:00:00.003263
    Key -> 7 Min Time -> 0:00:00.001763
    avg -> 0:00:00.002413
    ITERATION 5 time 0:00:02.768819
    First Key -> 0:00:00.003051
    Last Key -> 0:00:00.002319
    Key -> 920 Max Time -> 0:00:00.007694
    Key -> 11 Min Time -> 0:00:00.002007
    avg -> 0:00:00.002753
    ITERATION 6 time 0:00:02.537283
    First Key -> 0:00:00.002676
    Last Key -> 0:00:00.002549
    Key -> 2 Max Time -> 0:00:00.003512
    Key -> 132 Min Time -> 0:00:00.002318
    avg -> 0:00:00.002523
    ITERATION 7 time 0:00:02.723405
    First Key -> 0:00:00.005272
    Last Key -> 0:00:00.002788
    Key -> 79 Max Time -> 0:00:00.007543
    Key -> 12 Min Time -> 0:00:00.002457
    avg -> 0:00:00.002709
    ITERATION 8 time 0:00:02.947676
    First Key -> 0:00:00.003161
    Last Key -> 0:00:00.003028
    Key -> 1 Max Time -> 0:00:00.006068
    Key -> 31 Min Time -> 0:00:00.002683
    avg -> 0:00:00.002933
    ITERATION 9 time 0:00:03.781268
    First Key -> 0:00:00.004271
    Last Key -> 0:00:00.003416
    Key -> 930 Max Time -> 0:00:00.011791
    Key -> 5 Min Time -> 0:00:00.003029
    avg -> 0:00:00.003765
    ITERATION 10 time 0:00:03.842353
    First Key -> 0:00:00.006544
    Last Key -> 0:00:00.003394
    Key -> 0 Max Time -> 0:00:00.006544
    Key -> 4 Min Time -> 0:00:00.003171
    avg -> 0:00:00.003826
    ITERATION 11 time 0:00:03.361368
    First Key -> 0:00:00.001509
    Last Key -> 0:00:00.003626
    Key -> 602 Max Time -> 0:00:00.009093
    Key -> 9 Min Time -> 0:00:00.001089
    avg -> 0:00:00.003346
    ITERATION 12 time 0:00:03.581802
    First Key -> 0:00:00.001511
    Last Key -> 0:00:00.003912
    Key -> 474 Max Time -> 0:00:00.009027
    Key -> 1 Min Time -> 0:00:00.001117
    avg -> 0:00:00.003566
    ITERATION 13 time 0:00:03.797965
    First Key -> 0:00:00.001833
    Last Key -> 0:00:00.004118
    Key -> 100 Max Time -> 0:00:00.005115
    Key -> 9 Min Time -> 0:00:00.001126
    avg -> 0:00:00.003782
    ITERATION 14 time 0:00:04.147961
    First Key -> 0:00:00.001851
    Last Key -> 0:00:00.004362
    Key -> 782 Max Time -> 0:00:00.013857
    Key -> 16 Min Time -> 0:00:00.001186
    avg -> 0:00:00.004132
    ITERATION 15 time 0:00:04.251830
    First Key -> 0:00:00.001538
    Last Key -> 0:00:00.004611
    Key -> 518 Max Time -> 0:00:00.009511
    Key -> 9 Min Time -> 0:00:00.001114
    avg -> 0:00:00.004236
    ITERATION 16 time 0:00:04.487687
    First Key -> 0:00:00.001541
    Last Key -> 0:00:00.004994
    Key -> 858 Max Time -> 0:00:00.009368
    Key -> 9 Min Time -> 0:00:00.001120
    avg -> 0:00:00.004472
    ITERATION 17 time 0:00:04.703588
    First Key -> 0:00:00.003440
    Last Key -> 0:00:00.005104
    Key -> 529 Max Time -> 0:00:00.008179
    Key -> 8 Min Time -> 0:00:00.001105
    avg -> 0:00:00.004688
    ITERATION 18 time 0:00:05.300920
    First Key -> 0:00:00.001851
    Last Key -> 0:00:00.006987
    Key -> 854 Max Time -> 0:00:00.007299
    Key -> 9 Min Time -> 0:00:00.001181
    avg -> 0:00:00.005285
    ITERATION 19 time 0:00:06.168681
    First Key -> 0:00:00.001880
    Last Key -> 0:00:00.005577
    Key -> 911 Max Time -> 0:00:00.009503
    Key -> 9 Min Time -> 0:00:00.001119
    avg -> 0:00:00.006152

     
  • Oleg Broytman
    Oleg Broytman
    2011-07-15

    Min/max/avg times are almost zero, so it seems the only place the program spends time is calling .expireAll() and cache.clear(), right?

     
  • Rosa Robles
    Rosa Robles
    2011-07-18

    The program takes time only from the selectby operation
    for key in range(1000):
    time_start = datetime.now()
    person = Person.selectBy(firstName='name%s%s' %(iteration,key))[0]
    time_end = datetime.now()
    times.append(time_end-time_start)

    I attach the example to you can read better

    If we review the results, the selectBy method takes times random in iteration, but also each iteration's time grows up.
    I think the problem is in the execution of cull process when delete expiredCache values, what do you think?

    In my real program do not use select in mass but it can acumulate a lots operations of select in the time,
    then we have problems of memory and we must restart the application.
    May be this situation is the source of our problem with the memory.

     
  • Oleg Broytman
    Oleg Broytman
    2011-07-18

    time_end-time_start are consistently almost zero, so selectBy doesn't take time, right?

     
  • Rosa Robles
    Rosa Robles
    2011-07-19

    if we review the results, after read a lots of records selectBy takes more times, in the first iteration is almost zero but in 20 iteration is bigger.

    If you do the example, and you takes the times without to call to expireAll() or clear(), you can check that the time c

    for iteration in range(20):
    start = datetime.now()
    for key in range(1000):
    person = Person.selectBy(firstName='name%s%s' % (iteration,key))[0]
    end = datetime.now()

    print "ITERATION ",iteration, "TIME", end-start

    Person.sqlmeta.expireAll()
    sqlhub.getConnection().expireAll()
    connection.cache.clear()

     
  • Oleg Broytman
    Oleg Broytman
    2011-07-19

    To stop cache culling I added

    connection.cache = CacheSet(cullFrequency=10**6)

    to your test program. It didn't help. So it seems cache culling is not the culprit. There is a need to do a deeper investigation. Can you run a python profiler over your test(s)?

     
  • Oleg Broytman
    Oleg Broytman
    2011-07-19

    Well, I ran it myself: python -m cProfile -s cumulative test_sql_object_by_object.py. The results is

    5089432 function calls (4987267 primitive calls) in 82.498 CPU seconds

    Ordered by: cumulative time

    ncalls tottime percall cumtime percall filename:lineno(function)
    1 0.001 0.001 82.498 82.498 {execfile}
    1 0.001 0.001 82.497 82.497 test_sql_object_by_object.py:3(<module>)
    1 0.324 0.324 82.115 82.115 test_sql_object_by_object.py:50(select_examples)
    20000 0.295 0.000 77.778 0.004 sresults.py:126(__getitem__)
    20000 0.442 0.000 75.925 0.004 sresults.py:175(__iter__)
    20000 0.107 0.000 69.318 0.003 sresults.py:181(lazyIter)
    20000 0.150 0.000 69.076 0.003 dbconnection.py:470(iterSelect)
    20000 0.202 0.000 68.665 0.003 dbconnection.py:705(__init__)
    20001 0.085 0.000 60.550 0.003 sqliteconnection.py:194(_executeRetry)
    20001 60.465 0.003 60.465 0.003 {method 'execute' of 'pysqlite2.dbapi2.Cursor' objects}
    20000 0.156 0.000 7.846 0.000 dbconnection.py:485(queryForSelect)
    40000 0.332 0.000 5.945 0.000 dbconnection.py:719(next)
    40000 0.170 0.000 5.601 0.000 dbconnection.py:670(sqlrepr)
    140000/40000 0.584 0.000 5.430 0.000 converters.py:198(sqlrepr)
    20000 1.305 0.000 4.802 0.000 sqlbuilder.py:613(__sqlrepr__)
    20000 0.399 0.000 4.536 0.000 main.py:878(get)
    20000 0.205 0.000 3.503 0.000 main.py:1373(selectBy)
    20000 0.182 0.000 2.920 0.000 main.py:915(_init)
    20000 0.655 0.000 2.735 0.000 sresults.py:42(queryForSelect)
    140000 0.679 0.000 2.704 0.000 sqlbuilder.py:215(tablesUsedSet)
    40000 0.931 0.000 2.620 0.000 sresults.py:9(__init__)
    20000 0.440 0.000 2.373 0.000 main.py:1154(_SO_selectInit)
    220568 0.660 0.000 1.967 0.000 {getattr}
    80000 0.776 0.000 1.950 0.000 col.py:503(to_python)
    140000 1.078 0.000 1.793 0.000 sqlbuilder.py:194(tablesUsedSet)
    20000 0.465 0.000 1.699 0.000 dbconnection.py:641(_SO_columnClause)
    20000 0.183 0.000 1.492 0.000 sresults.py:89(clone)
    80000 0.446 0.000 1.431 0.000 sqlbuilder.py:391(__getattr__)
    100024 0.597 0.000 1.173 0.000 dbconnection.py:896(__get__)
    564108 1.073 0.000 1.073 0.000 {isinstance}

    The entire program took 80 seconds, of which 60 seconds were spend in method 'execute' of 'pysqlite2.dbapi2.Cursor' (I used sqlite). Not an SQLObject fault. You can run a profiler yourself sorting by different column. I'd like to hear your interpretation.

     
1 2 > >> (Page 1 of 2)