Re: [Modeling-users] Lazy Initialization Part 2

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

        Hi,

  Here are some hints; sorry, but I can't spend a lot of time on this
  for now.  BTW I will be quite busy until tuesday, probably mostly
  offline.

> I have a particular record set that hold 700 or so rows.  I use
> dynamic i18n from another table that we will call i18nTable for
> simplicity.  There is only a few fields in i18nTable : title and
> description.=20=20
>=20
> The relation look like this :=20
[snipped: toMany rel. 'i18ns']
> So here I am trying to fetch my whole record set.  The initial fetch
> is really fast but when I try to getI18ns() on each fetched object it
> result in a SQL SELECT.  This mean that do 701 SELECTs where I
> intended to do only one...  My whole fetch take 30 seconds to complete
> but the initial fetch (the master records with lots of columns) take
> barely a second.  It would be really nice if I could group the fetch so
> that it resulted in a single SQL select (ok, maybe 2 : one for master
> records, one for all the i18ns).

  I understand this, but this is not that simple. See below.

> I understand why you really want lazy initialization.  It's would be
> even slower to fetch all the database when you fetch an object with
> many relations.  But having it optional, something like a
> recursiveFecth() would be really nice.

  I will think about it; but as you already noted it, this can result in
  unwanted cascaded fetchs ultimately loading the whole db.

> So my question is : Is it possible to do so ?

Before adressing your particular pb., let's see of this can be done with
a to-one relationship. In fact, a simple combination of globalIDs and
the operator 'in' makes the trick:

[in Modeling/tests/testPackages]
    >>> from Modeling.EditingContext import EditingContext
    >>> import AuthorBooks
    >>> ec=3DEditingContext()
    >>> objs=3Dec.fetch('Book') # fetch all books
    >>> gids=3D[o.globalID() for o in objs]
    >>> pks=3D[gid.keyValues()['id'] for gid in gids]
    >>> objs[0].getAuthor().isFault()
    1
    >>> ec.fetch('Writer', 'books.id in %s'%pks) # all fetches at once
    >>> objs[0].getAuthor().isFault() # no round-trip to the db
    0

Now consider the to-many relationship:

>>> ec=3DEditingContext()
>>> objs=3Dec.fetch('Writer')
>>> gids=3D[o.globalID() for o in objs]
>>> pks=3D[gid.keyValues()['id'] for gid in gids]
>>> objs[0].getBooks().isFault()
>>> 1
>>> ec.fetch('Book', 'author.id in %s'%pks) # all fetches at once
>>> objs[0].getAuthor().isFault() # still a fault!
>>> 1
>>> len(objs[0]) # trigger a select

What is the pb here? The framework has no way to know that all necessary
objects for the given relation have been already loaded, so it needs to
fetch, at least to get the ids. So in this case there's no way to avoid
the additional round-trip to the db. However, these additional fetches
avoid the initialization phase for these objects (it's already done), so
you can expect these fetches to be significantly faster.

  As a conclusion, I'll say that:

    - either fetch all the i18ns objects, then pre-fetch the inverse
      to-one pointing to your record objects,

    - or stay as-is, and live w/ the additional fetches (where the
      related i18ns are initialized by a single fetch, but where you'll
      still get a fetch for each faulted array).

  If you could try both, I'd be very interested in knowing how long each
  one takes with your data. Thanks in advance!

  Oh, and BTW: in case you've not already done it, you'd probably want
  to set MDL_PERMANENT_DB_CONNECTION to 1, so that you do not pay the
  additional overhead of closing and reopening the db connection for
  each fetch.

> BTW The trick with FixedPoint works really well, tanks for the hints !

  Glad to hear this, this will be then documented for the next release.

-- S=E9bastien.