Re: [Modeling-users] Fetching raw rows
From: Sebastien B. <sbi...@us...> - 2003-07-17 18:36:52
Mario Ruggier <ma...@ru...> wrote:
> Given the same db, what is the performance difference between
> the following fetches (for logically equivalent queries)?
> - 1st time "classic" fetch (in empty EC)
> - 2nd time "classic" fetch (objects in resultset already known to EC)
> - 1st time raw fetch (in empty EC)
> - 2nd time raw fetch (objects in resultset already known to EC)
> - 1st time dbapi2.0 execute query (direct via python adaptor)
> - 2nd time dbapi2.0 execute query (direct via python adaptor)
Here are the figures. Test database: 5000 objects with 3 attributes: a
PK, a FK (to-one relation to an other object of a different type), and a
text field.
[std] 1st fetch : 7.20251297951
[std] 2nd fetch : 1.03094005585
[raw row] 1st fetch (ec empty): 0.618131995201
[raw row] 2nd fetch (objects already loaded): 0.61448597908
[raw row] 3rd fetch (objects already loaded): 2.24008309841
[psycopg] 1st fetch: 0.038547039032
[psycopg] 2nd fetch: 0.0789960622787
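For reference, timings of this kind can be collected with a small harness along the following lines. This is only a sketch: it uses an in-memory sqlite3 database as a self-contained stand-in, whereas the figures above were measured against a real database through psycopg, and the table/column names are made up to mirror the test schema (a PK, an FK, a text field).

```python
import sqlite3
import time

def timed(label, fn):
    """Run fn() once, print the elapsed wall-clock time, return the result."""
    start = time.time()
    result = fn()
    print("%s: %s" % (label, time.time() - start))
    return result

# Throwaway database of 5000 rows mirroring the test database described
# above: a PK, a to-one FK to an object of a different type, a text field.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE writer (id INTEGER PRIMARY KEY)")
db.execute("CREATE TABLE book (id INTEGER PRIMARY KEY,"
           " fk_writer INTEGER REFERENCES writer(id), title TEXT)")
db.execute("INSERT INTO writer VALUES (1)")
db.executemany("INSERT INTO book VALUES (?, 1, ?)",
               [(i, 1, "title %d" % i)[::2] and (i, "title %d" % i)
                for i in range(5000)])

def raw_fetch():
    # The direct DB-API 2.0 path: no object creation, no EditingContext.
    return db.execute("SELECT id, fk_writer, title FROM book").fetchall()

rows1 = timed("[dbapi] 1st fetch", raw_fetch)
rows2 = timed("[dbapi] 2nd fetch", raw_fetch)
```

Running each fetch more than once, as above, is what exposes caching effects such as the 1st-vs-2nd-fetch differences in the figures.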
Comments:
1. No surprise: fetching full objects takes much more time than a
simple fetch with psycopg, and the raw psycopg fetch is the fastest.
Maybe it's time for me to study the fetching process in detail, to
see where it can be improved. This could be done after 0.9-pre-10,
i.e. after finishing the documentation for PyModels first.
2. You probably already noticed that fetching raw rows is
significantly slower when the objects are loaded. The reason is
that already-loaded objects are checked for modification because,
as I explained in my previous post, we have to return the
modifications, not the fetched data, when an object has been
modified.
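That reconciliation step can be sketched roughly as follows. This is an illustration only, not the framework's actual code: the registry shape (PK mapped to a saved snapshot and the current in-memory values) and all names here are assumptions.

```python
def reconcile(fetched_rows, loaded_objects):
    """Return raw rows, substituting in-memory values for database values
    whenever the corresponding object has been modified in the
    EditingContext.

    fetched_rows   -- dict: PK -> row tuple as read from the database
    loaded_objects -- dict: PK -> (snapshot, current) value tuples for
                      objects already registered (hypothetical layout)
    """
    result = {}
    for pk, row in fetched_rows.items():
        if pk in loaded_objects:
            snapshot, current = loaded_objects[pk]
            if current != snapshot:        # object was modified in memory:
                result[pk] = current       # return the modifications,
                continue                   # not the fetched data
        result[pk] = row
    return result
```

The per-row comparison against each loaded object's snapshot is exactly the extra work that makes the raw fetch slower once objects are loaded; skipping it when nothing is modified is what the quicker implementation below aims for.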
I'm currently studying this, I have an implementation that does not
consume more time when no objects are modified:
[raw row] 1st fetch (ec empty): 0.595005989075
[raw row] 2nd fetch : 0.585139036179
[raw row] 3rd fetch (objects already loaded): 0.607128024101
However, we still cannot avoid the additional cost when some
objects are modified, and the more modified objects there are, the
slower the fetch will be (the first figure, 2.24, would be the
upper limit here, when all objects are modified).
Second, I do not want to commit this quicker implementation now,
because a problem remains: if the database has changed in the
meantime, you can get raw rows whose values differ from those of
_unmodified_ objects in the EditingContext. I'm not sure whether
this is a significant problem, or, put differently, whether we
should pay extra CPU time to ensure that this does not happen. But
I feel a little uneasy about making an exception to the general rule.
> It would be interesting to keep an eye on these values, for a particular
> setup, thus when changes to the system are made, unexpected performance
> side effects may still be observed. Maybe such a script can be added to
> the tests?
Sorry, I did not read you right: the idea of observing these figures to
detect the performance impact of changes is indeed a very good idea.
That will be done, for sure.
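Such a test could take a shape like the following sketch: time one fetch and fail if it exceeds a budget derived from a baseline run. The `fetch_raw_rows` placeholder and the budget value are illustrative assumptions; a real test would call the framework's fetch against the 5000-object test database.

```python
import time

def fetch_raw_rows():
    # Placeholder for the real fetch under test; it just builds 5000
    # tuples so this sketch is self-contained and runnable.
    return [(i, 1, "title %d" % i) for i in range(5000)]

def check_fetch_performance(budget_seconds=2.0):
    """Time one fetch and fail loudly if it exceeds the budget.

    The budget should come from a baseline measurement on the same
    setup; an unexpected regression then shows up as a test failure.
    """
    start = time.time()
    rows = fetch_raw_rows()
    elapsed = time.time() - start
    assert elapsed < budget_seconds, \
        "fetch took %.2fs (budget %.2fs)" % (elapsed, budget_seconds)
    return rows, elapsed

rows, elapsed = check_fetch_performance()
```

Wall-clock budgets are machine-dependent, so such a test is best kept per-setup, as suggested above, rather than as a hard pass/fail in the general suite.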
Cheers,
-- Sébastien.