Re: [Modeling-users] API cleanup proposal: EditingContext
Status: Abandoned
From: Sebastien B. <sbi...@us...> - 2003-06-12 14:35:32
Hi all,

Some more comments about the API change proposal; I concentrate here on
EditingContext, and will post another message for CustomObject and related
APIs. Note: I also received two private answers.

* Again: methods will be aliased, not replaced.

* EC.insert() and EC.delete() are okay for everyone.

* autoInsertion()/setAutoInsertion(): the former is a getter, the latter
  a setter.

Mario> OK. What about commit() for saveChanges()?

* About commit(): that was initially in my proposal, and I removed it.
  Remember that I'm working on integrating the ZODB transaction and
  observation mechanisms into Modeling? Well, commit() clashes with
  py2.1 ZODB's transaction protocol (and has a very different semantic
  there than saveChanges(): it is equivalent to objectWillChange()).
  Hence I'd prefer to leave it out of the proposal for the time being.

* EC.fetch(): the initial proposal is accepted by everyone, so we'll keep
  it as a basis.

Mario> Another issue is that even with all the keyword options offered, one
Mario> still has to "dip into" small pieces of raw SQL very quickly, as shown
Mario> by your examples.

Here I strongly disagree with you. It /seems/ that you dip into small
pieces of raw SQL, but that is not the case at all. Qualifiers and
FetchSpecification properties offer a generic command language; consider
this:

  ec.fetch('Book', 'author.pygmalion.lastName ilike "r*"')

This is far from the generated SQL:

  SELECT t0.id, t0.title, t0.FK_WRITER_ID, t0.PRICE
  FROM BOOK t0
  INNER JOIN ( WRITER t1
               INNER JOIN WRITER t2
               ON t1.FK_WRITER_ID=t2.ID )
  ON t0.FK_WRITER_ID=t1.ID
  WHERE UPPER(t2.LAST_NAME) LIKE UPPER('r%');

In the framework we generally make the strong assumption that raw SQL is
kept out of the Python code. Even if some of the keywords (such as
'like'/'ilike' for qualifiers, or 'asc'/'desc' for ordering) are the same
as the corresponding SQL keywords, they are indeed decorrelated.
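To make the "decorrelated" point concrete, here is a minimal, purely
illustrative sketch (the function name qualifier_to_sql and its logic are
hypothetical, not the framework's actual SQL generator) of how a qualifier
operator such as 'ilike' can map to a dialect-specific SQL fragment rather
than being raw SQL itself:

```python
# Hypothetical sketch: mapping qualifier operators to SQL fragments.
# The operator names mimic the qualifier mini-language; the SQL on the
# right is what a (fictional) generator could emit -- e.g. 'ilike'
# becomes a case-insensitive comparison built from UPPER(), exactly as
# in the generated statement above.

def qualifier_to_sql(attribute, operator, value):
    """Translate one qualifier term into a SQL condition (illustrative only)."""
    # '*' is the qualifier wildcard; SQL LIKE uses '%'
    sql_value = value.replace('*', '%')
    if operator == 'ilike':
        return "UPPER(%s) LIKE UPPER('%s')" % (attribute, sql_value)
    if operator == 'like':
        return "%s LIKE '%s'" % (attribute, sql_value)
    raise ValueError('unsupported operator: %r' % operator)

print(qualifier_to_sql('t2.LAST_NAME', 'ilike', 'r*'))
# -> UPPER(t2.LAST_NAME) LIKE UPPER('r%')
```

The point of the indirection: client code only ever writes the left-hand
qualifier language, and the generator is free to change the emitted SQL
(e.g. switch from UPPER() to a native ILIKE) without touching callers.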
See again the example above: the 'ilike' keyword is not used; rather, we
compare UPPER()s (this might change in the future, now that 'ilike' is a
SQL keyword accepted by a lot of db-servers ;)

Mario> This may be an acceptable compromise, but then client code should be
Mario> allowed to pass any SQL it wants...

That's another problem, with which I completely agree: we should have a
means to execute complete raw SQL statements and get the (raw) result
back. Moreover, we should be able to transform the returned raw rows into
real objects if necessary (and if possible).

> Another detail is that the object loads all (direct) properties, every
> time -- there may be no workaround for this, given the way the
> framework is built. (also related to your 'resultset' proposal)

Last, we should also be able to tell the framework to return the raw rows
instead of fully initialized objects (raw rows which can later be
converted to real objects). There is a real need for that: sometimes you
want to present a (very) long list of objects in a summary page/widget,
where you do not need the full objects, not even every attribute, but only
a subset to present to the user. Then the user selects one or more of
these rows, and that's where you transform the raw rows into real objects.

Note that the framework architecture will have no problem supporting
this. Some of the needed APIs are already present but not implemented
(such as DatabaseContext.faultForRawRow()). I thought this was on the todo
list but it's not -- I'll add it, since I've been thinking about these
points for quite a long time.
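The summary-page pattern described above can be sketched as follows. This
is a toy stand-in, not the framework's code: fetch_raw and
fault_for_raw_row are hypothetical placeholders for the proposed
fetch(..., rawRows=true) and the unimplemented
DatabaseContext.faultForRawRow():

```python
# Sketch of the raw-rows pattern: fetch cheap dicts for the long summary
# list, then promote only the row(s) the user selects into full objects.
# All names here are hypothetical placeholders for the proposed APIs.

def fetch_raw(rows):
    """Pretend fetch(..., rawRows=True): return plain dicts, not objects."""
    return rows  # the real framework would query the database here

class Book:
    """A minimal stand-in for a fully initialized framework object."""
    def __init__(self, row):
        self.title = row['title']
        self.price = row['price']

def fault_for_raw_row(row):
    """Pretend DatabaseContext.faultForRawRow(): raw row -> real object."""
    return Book(row)

# A long summary list needs only a couple of columns...
raw = fetch_raw([{'title': 'Pygmalion', 'price': 10},
                 {'title': 'Candide', 'price': 12}])
# ...and only the row the user selects becomes a real object.
selected = fault_for_raw_row(raw[0])
print(selected.title)  # -> Pygmalion
```

The design win is that the expensive object initialization is deferred to
the few rows the user actually drills into.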
Impact on the API: when this is implemented, I suggest we add the
following parameters to fetch():

  rawRows -- (default: false) return raw rows instead of objects
  sql     -- execute the SQL statement (and ignore all other parameters
             except entityName and rawRows -- both optional in that case)

BTW I also suggest that the unsupported features in the fetch API are
removed until they are implemented (such as limit/page/offset/etc.).

> I feel that some things (even if no one has requested them yet ;) are
> missing... for example, should one ever need to take advantage of SQL
> "group by" and "having". Access to these SQL clauses may be added
> later, without breaking this API, which is OK. Also, such manipulations
> may be done in the middle code, but that would be very inefficient. [...]
>
> The real problem with this is that to request the result of an SQL
> function, of which count() is an example, additional API functionality
> is needed. But what happens if I want other functions, such as SUM or
> MAX or AVERAGE over a few columns? Each of these functions may take
> complex parameters. Again, the functionality may be replicated in the
> middle code, but this would not only be a waste of development time,
> but also be very inefficient.
>
> I propose either a generalization of "select" (which may be too
> complicated in its implications), or the addition of a func keyword
> option, e.g.
>
>   func='count'
>   func=('funcName', param1, param2, ...)
>   func=( ('funcName', param1, param2, ...), ('funcname2') )
>
>   func=('sum', 'age')
>
> The question is then: how is this information returned?
> I suggest as a tuple... [snipped]

That's a very interesting idea, but it would take too much effort to
develop it in the short term. Let me explain: if we do this just as it
sounds, then we will have *pieces* of raw SQL in the middle of generic
command patterns -- I don't like that.
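For reference, the three shapes Mario proposes for the func keyword could
be normalized into a SELECT column list along these lines. This is only a
sketch of the idea under discussion (no such translator exists in the
framework, and func_to_select_list is a made-up name):

```python
# Illustrative sketch of the proposed 'func' keyword: accept a bare name
# ('count'), a single tuple ('sum', 'age'), or a tuple of tuples, and
# normalize them into SQL aggregate expressions. Hypothetical code.

def func_to_select_list(func):
    """Turn a func spec into a SELECT column list (illustrative only)."""
    if isinstance(func, str):                # func='count'
        func = ((func,),)
    elif func and isinstance(func[0], str):  # func=('sum', 'age')
        func = (func,)
    parts = []
    for spec in func:
        name, params = spec[0], spec[1:]
        parts.append('%s(%s)' % (name.upper(), ', '.join(params) or '*'))
    return ', '.join(parts)

print(func_to_select_list('count'))                       # -> COUNT(*)
print(func_to_select_list(('sum', 'age')))                # -> SUM(age)
print(func_to_select_list((('max', 'age'), ('count',))))  # -> MAX(age), COUNT(*)
```

Even with such a normalizer, the hard part raised below remains: binding
the attribute names to the right table aliases in generated joins.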
Do not misunderstand me: it should be possible to abstract this in a
certain way, but I really wonder whether it's worth the effort. And there
are more problems; consider this:

- either you simply want to sum()/avg()/max()/etc. over a table and its
  attributes, and then I guess it's probably sufficient to offer the
  possibility to fetch(sql='select max(age) from author'), as stated
  above;

- or you want to use this along with the automatic generation of complex
  queries (e.g. with multiple joins): okay, but then you must be able to
  say which attributes you want, and it's definitely not sufficient to
  tell which table each belongs to: relationships can be reflexive (such
  as the 'pygmalion' relationship in the author/book model), and in such
  cases there are two possibilities for Author.age: table alias t1 or t2
  (referring to the SQL statement above). This also means that this
  expression needs to be bound to the automatic generation of SQL
  statements. I can't think of a straightforward way to do this for now.

Again, I'm not saying this is impossible: I'm just playing with the
interesting idea and explaining the difficulties I can foresee, wondering
whether such an advanced functionality would be worth the effort. That's
an open question, and for the time being I suggest we do not take it into
account _as far as the API change proposal is concerned_. This could be
discussed in a separate thread, and it would help a lot if we had some
real-life examples showing where this could be very handy.

--> Same for 'group by' and 'having' statements, by the way, since I'm
not really familiar with them either.

Mario> I would also add "by indicating clearly the small subset of methods
Mario> intended for use by client code, and the stability level".

Right. I'm still looking for a way to include this in the docstrings so
that the generation of the API docs can pick it up (FYI we use epydoc).

--
Sébastien.