Re: [Modeling-users] (LONG) Other OR-bridges?
From: Mario R. <ma...@ru...> - 2003-01-31 00:55:21
Hello,

> - model design, model validation, generation of databases and schemas, generation of python classes are available in the ZModeler.

Yes, and they are all very nice things to have...

> - Updating a database schema after changes made on a model is not available --that feature would be in the tool, not in the core. Reverse-engineering an existing database is not available, it's been on the todo list for a certain time.

Absolutely.

> In a previous mail you were speaking of the prominence of the zmodeler and suggesting a different organization of the documentation. This should be done, for sure. I've begun writing a tutorial, hence the problem of making the code/sql generation available apart from zope was identified, and I should be able to release the scripts soon.

Ah, very good to have the scripts. Also the tutorial. If you like I can review the tutorial, as well as the other docs...

> - Documentation for the xml format: in a way, there is some, even if it's not that obvious... In fact, section 2.2 in the User's Guide describes each element that can be found in an xmlmodel (sample xml models are in testPackages.AuthorBooks and StoreEmployees).

Yes, it is there, but it reads more like a brief overview of the main elements and attributes, leaving one wondering what the complete picture is. Having the XML spec available would take the pressure off the general explanation, allowing it to highlight only what is typically most pertinent and not get lost in detail. The XML spec could be its XSD (although I'd prefer another, more expressive, human-readable definition syntax); in any case, it should only be up to half a page long. Also, I do not think that validation of the XML (per se) would be particularly useful, as it would not imply much about how valid the represented model is anyway.

...

> I can't see precisely how to design a csv that maps nicely to the xml, but I'm open to suggestions! Same, anyone feeling like coding some tools will be welcome ;) and in that case, I'll start a dev-branch and share the unreleased-coz'-unfinished code I already have.

CSV may or may not be appropriate, given that you do have some nesting in your format. However, after I familiarize myself better with the schema, I could offer something more concrete. On this issue, another possibly interesting way to handle this "description language" is using a construct or mini-language in Python itself...

...

>> This could be an optimization later -- instead of sorting on the store it might be faster to do a db query to get only the sort info (get only the id?) and then iterate on that sort order using the data in the store (or getting it when missing).
>
> That's the idea. The only thing that needs to be done is to generate SQL code from a SortOrdering object.

OK.

>> ... all requests for view-only data are serviced by a common read-only object store, while all modify-data requests are serviced by a (short-lived) user store that could be independent from the read-only store, or could be a nested store to it.
>
> A nested store to a read-only object store? That's a very, very good idea, I didn't think about that --must be still too impregnated with the way the EOF handles that. This would make it easy to update mostly-read objects while keeping the memory footprint low. I just dropped a note in the TODO list about that, thanks!

Ah, yes.
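Just so we are picturing the same thing, here is a minimal, self-contained sketch of the nested-store idea. The class and method names are purely hypothetical -- not the framework's actual API -- and it only illustrates the intent: a shared read-only cache, with a short-lived per-request editing store layered on top.

  # Hypothetical sketch only -- NOT the Modeling framework's API.
  class ReadOnlyStore:
      """Shared, long-lived store: caches rows fetched for display."""
      def __init__(self):
          self._cache = {}  # (entity, id) -> dict of attribute values

      def fetch(self, entity, obj_id, fetch_row):
          key = (entity, obj_id)
          if key not in self._cache:
              self._cache[key] = fetch_row(entity, obj_id)  # hit the DB only once
          return dict(self._cache[key])                     # hand out a copy

  class NestedEditingStore:
      """Short-lived, per-request store: reads through its parent, keeps its
      own pending changes, and refreshes the parent when changes are saved."""
      def __init__(self, parent):
          self._parent = parent
          self._changes = {}  # (entity, id) -> dict of modified attributes

      def fetch(self, entity, obj_id, fetch_row):
          row = self._parent.fetch(entity, obj_id, fetch_row)
          row.update(self._changes.get((entity, obj_id), {}))
          return row

      def set(self, entity, obj_id, **attrs):
          self._changes.setdefault((entity, obj_id), {}).update(attrs)

      def save(self, write_row):
          for (entity, obj_id), attrs in self._changes.items():
              write_row(entity, obj_id, attrs)  # commit to the DB
              # keep the shared cache in sync with what was just written
              self._parent._cache.setdefault((entity, obj_id), {}).update(attrs)
          self._changes.clear()

The per-request store is discarded after save(), so only the shared cache persists -- which is how I understand "update mostly-read objects while keeping the memory footprint low".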
Your TODO list is very comprehensive -- you will be busy with this for the next few years? :-) Hmm, I'd say focus on the docs and web site, and announcements, to attract a few more users -- who will certainly help with the TODO list (hopefully not only making it longer :-)

>> Again, cost of fetching is (much) less important here. In addition, normally data is modified in small units, typically 1 object at a time. But when browsing, it is common to request fetches of 1000s of objects...
>
> That's exactly why there should be the possibility to fetch and get raw rows, not objects.

There isn't? But one can always do a raw select...

>>> However, even in the still-uncommitted changes, the framework lacks an important feature which is likely to be the next thing I'll work on: the ability to broadcast the changes committed on one object store to the others (in fact, the notifications are already made, but they are not properly handled yet).
>>
>> Yes, this would be necessary. Not sure how the broadcast model will work (may not always know who the end clients are -- data on a client web page will never know about these changes even if you broadcast). But, possibly a feasible simple strategy could be that each store will keep a copy of the original object as it received it.
>
> There is a central and shared object which stores the snapshots when rows are fetched; you'll find that in class Database. FYI the responsibility of broadcasting messages is assigned to the NotificationCenter distributed in the NotificationFramework, apart from the Modeling.

OK. Have not yet dived into the code, but will do.

>> When committing a modification it compares the original object(s) being modified to those in the db; if still identical, it locks the objects and commits the changes, otherwise it returns an error "object has changed since". This behaviour could be application specific, i.e. in some cases you want to go ahead and modify anyway, and in others you want to do something else. Thus, this could be a store option, with the default being the error.
>
> What you describe here seems to be optimistic locking: when updated, the corresponding database row is compared against the snapshots we got at the time the object was fetched, and if they do not match, this is an error (recoverable, in case you want to update anyway, take whatever action depending on the differences and the object and object's type, etc.).
>
> What I was talking about is something different. Suppose S1 and S2 are two different object stores, resp. holding objects O1 and o1 that correspond to the very same row in the database (in other words, they are clones living each in its own world). Now suppose that o1 is updated and saved back to the database: you may want to make sure that O1 gets the changes as well (of course, you may also request that changes that would have been made on it should be re-applied after they are broadcasted).
>
>> In combination with this, other stores may still be notified of changes, and it is up to them (or their configuration) whether to reload the modified objects, or to ignore the broadcasts.
>
> (ok, I should read more carefully what follows before answering ;)

OK, so we are meaning the same thing here.
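To pin down the compare-before-commit behaviour I described above, here is a rough illustration in plain Python DB-API (psycopg-style placeholders), with hypothetical table and column names. It only shows the policy, not how the framework does or should implement it -- I understand the snapshots already live in class Database.

  # Hypothetical sketch only -- illustrates the "object has changed since" check.
  class StaleObjectError(Exception):
      """The row changed in the database since our snapshot was taken."""

  def update_author_name(cursor, author_id, snapshot, new_name, force=False):
      # 'snapshot' holds the values read when the object was originally fetched
      cursor.execute("SELECT name FROM author WHERE id = %s FOR UPDATE", (author_id,))
      row = cursor.fetchone()
      if row is None or (not force and row[0] != snapshot['name']):
          raise StaleObjectError("author %s has changed since it was fetched" % author_id)
      cursor.execute("UPDATE author SET name = %s WHERE id = %s", (new_name, author_id))
      # the caller commits the transaction, or catches the error, inspects the
      # differences and retries with force=True (the "modify anyway" case)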
>> Otherwise, on each fetch the framework could check if any changes have taken place (may be expensive) for the concerned objects, and reload them automatically. This has the advantage of covering those cases when changes to data in the db are done by some way beyond the framework, and thus the framework can never be sure to know about it.
>
> Yes, this would be expensive, and this is already available to a certain extent. You can ``re-fault''/invalidate your objects so that they are automatically re-fetched next time they are accessed. I understand this is not *exactly* the way you think of it, but that's kind of an implementation and optimization detail, except if you expect objects to be refreshed while not overriding the changes that were already made --this is not possible yet.

This may be enough, then. I do not know if this is already there, but I guess it would also be useful to be able to simply request a forced fresh fetch -- if any of the objects are already in the store, they are reloaded.

>>> Last, and although this is not directly connected to that topic, I'd like to make another point more precise: the framework does not yet implement any locking, either optimistic or pessimistic, on DB rows. This means that if other processes modify a row in the DB, an ObjectStore will not notice the changes and will possibly override them if the corresponding object has been modified in the meantime.
>>
>> Yes, but locking objects should be very short lived -- only the time to commit any changes, as mentioned above. This also makes it much less important, and only potentially a problem when two processes try to modify the same object at the same time.
>
> Postgresql already locks rows during updates (I can't remember what MySQL does), but anyway I do not plan to support this kind of short-lived locks --in my view they should be managed by the database itself.

Agreed.

> What I meant was ``locking policy'' (http://minnow.cc.gatech.edu/squeak/2634). Pessimistic locking can result in long-standing locks, from the moment an object is *about* to be updated to the moment it is made persistent --it might be an application requirement that an object cannot be edited by more than one person at a time.

In general I am wary of this kind of behaviour as (a) it probably increases the possibility of clashes and "deadlocks", as mentioned in the linked article, (b) it is heavier on the server, and (c) it is more difficult to program for, and introduces an additional set of possible problems, e.g. what if the application, after acquiring a lock on some objects, runs into problems and never gives it up? Bad programming? Maybe, but the program should not have to worry about this.

Also, on requiring an object to be modified by only one person at a time -- this can always be handled with the mechanism described above, namely that before committing a change, the "original" object is compared to the db object, and if they differ the error is raised (and may be ignored). Besides, when would such a case actually require this? As an aside, even systems like CVS do not restrict a checked-out item to being modified by only one person.

> Well, I hope the overview of what the framework does /not/ support yet does not completely hide what it already does!

Hey, my interest in Modeling is based on what it does, and how it does it... But, as I mentioned above, it would help if all that it already does were given more prominence...

> Regards,
>
> -- Sébastien.

Regards,
mario