Re: [Modeling-users] Lazy Initialization Part 2

        Hi Yannick,

Replying quite quickly here again.

1. Do these times refer to the 30s spent on the initial algorithm? Does
   this mean half of the time is already saved by prefetching? (I'm not
   suggesting this is sufficient, I'm just curious.)

2. I'm surprised the benefit of MDL_PERMANENT_DB_CONNECTION is so
   low. That's a very minor point however.

3. How can it be that qualifierWithQualifierFormat() is called *so many
   times*?? I do not understand that either. However:

> A lot of time is spent parsing the qualifier and building the SQL
> query.  Could spark be an unexpected bottle neck ?  The "IN" query
> does not seems to scale really well for large value lists.

  _Too much_ time is spent parsing qualifiers, indeed.  And yes, for
  heavily loaded applications that depend on qualifier strings, spark
  is a known bottleneck --it's all in python, hence its slowness. A
  chapter on performance tuning is still to be written, and this is
  definitely material for it (that's one of the reasons why I'm asking
  for the figures).

  I guess it's time for you to forget about qualifiers as strings and to
  learn how to build your own Qualifier instances by hand. It's not that
  complicated, and you will then avoid all the time lost in parsing the
  qualifier. Of course, only change the qualifiers that are used heavily
  when fetching to-many relationships.

    Example:

       ec.fetch('Writer', 'books.id in %s'%pks)

    can be rewritten using:

       from Modeling import Qualifier
       q=Qualifier.KeyValueQualifier('books.id',
                                     Qualifier.QualifierOperatorIn,
                                     pks)
       ec.fetch('Writer', q)

  Please post if you need help in building your own qualifiers (this
  also needs to be documented).

> Is there any hope or should I forget about the nice abstraction provided by
> Modeling and craft my own SQL ?

  *I* really think that there is a solution --at least one that works on
  paper, and that I need to test; believe me, I wouldn't say this if I
  didn't have the strong feeling that it can be improved. But as I
  already said, I won't have time until Tuesday (maybe on Monday, but I
  can't promise for now) to actually test it, so we'll only know how
  good it is then; you'll understand that I can't be absolutely positive
  before that. It will involve two massive queries and no additional
  fetch afterwards, so we can expect a big improvement.

  I hope you can live with these results until then, and we'll see how
  good this solution is then.

    Thanks a lot for the figures and the profile log. I'd be interested
    in the same after you change your qualifiers as suggested above, so
    we can compare.
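
    In case it helps to produce comparable figures, here is a minimal
    sketch of a profiling helper built on the standard cProfile and
    pstats modules. The names profile_call, fetch_with_string_qualifier
    and fetch_with_qualifier_instance are made up for illustration, and
    the two fetch functions are only stand-ins: their bodies would need
    to be replaced by the real ec.fetch() calls against your database.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile.

    Returns (result, report), where report is the text of the 15 most
    expensive entries sorted by cumulative time.
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats("cumulative").print_stats(15)
    return result, out.getvalue()


# Hypothetical stand-ins: replace the bodies with the real fetches,
# e.g. the string-qualifier fetch in the first one and the fetch using
# a hand-built Qualifier instance in the second one.
def fetch_with_string_qualifier(pks):
    return sorted(pks)


def fetch_with_qualifier_instance(pks):
    return sorted(pks)


if __name__ == "__main__":
    pks = list(range(1000))
    for variant in (fetch_with_string_qualifier,
                    fetch_with_qualifier_instance):
        _, report = profile_call(variant, pks)
        print(variant.__name__)
        print(report)
```

    Running both variants on the same list of primary keys and posting
    the two reports side by side should make it clear how much of the
    time was going into parsing.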

        Thanks for your patience, regards,

-- Sébastien.