modeling-users Mailing List for Object-Relational Bridge for python (Page 21)
From: Mario R. <ma...@ru...> - 2003-09-27 09:43:12
On Jeudi, sep 25, 2003, at 20:43 Europe/Zurich, Sebastien Bigaret wrote:

> Hi Mario,
>
> Mario Ruggier <ma...@ru...> wrote:
>> Hi, little issue with usedForLocking, which is mentioned throughout the PyModels Attributes section, each page of which points back to 2.3.3 for further explanation, but there is never a word about what this is for (as far as I can see).
>>
>> What was this thing for?
>
> Funny that you're asking this now that we discuss optimistic locking ;)

Ha, I had not yet caught up with reading all the messages... ;)

> This flag indicates which attributes optimistic locking will check when it is about to save changes: if the value for attribute 'lastname' has changed and 'usedForLocking' is set for that attribute, then under the optimistic locking policy saveChanges() will raise; if the flag is unset, the attribute will be silently overridden (for example, you probably won't mark an object's timestamp as used for locking).

OK, thanks. But this paragraph should also be copied into section "2.3.3 Attribute" of the User's Guide...

mario
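[Editor's note for archive readers: here is a minimal, self-contained sketch of the check that the 'usedForLocking' flag drives at save time. It is illustrative only -- the class, function and attribute names are made up, and this is not the framework's actual implementation.]

    # Conceptual sketch of optimistic locking driven by a usedForLocking flag.
    # Names are hypothetical; the real framework does this inside saveChanges().
    class OptimisticLockingFailure(Exception):
        pass

    def check_locking_attributes(snapshot, current_row, locking_attrs):
        """Compare the values remembered when the object was fetched (snapshot)
        against the row currently stored in the database, but only for the
        attributes flagged usedForLocking; any mismatch aborts the save."""
        for attr in locking_attrs:
            if snapshot[attr] != current_row[attr]:
                raise OptimisticLockingFailure(
                    "attribute %r was changed by someone else" % attr)

    # 'lastname' is used for locking, 'timestamp' deliberately is not:
    snapshot    = {'lastname': 'Doe', 'timestamp': 100}
    current_row = {'lastname': 'Doe', 'timestamp': 250}   # timestamp changed elsewhere
    check_locking_attributes(snapshot, current_row, ['lastname'])   # passes silently

    try:
        current_row['lastname'] = 'Smith'
        check_locking_attributes(snapshot, current_row, ['lastname'])
    except OptimisticLockingFailure as exc:
        print(exc)   # attribute 'lastname' was changed by someone else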
From: Sebastien B. <sbi...@us...> - 2003-09-26 22:40:48
"Ernesto Revilla" <er...@si...> wrote:
> I agree with Fede that this should be one of the priorities, cause it has so many positive effects:
> we can let modeling run in a 'heavy client', although this is not an approach that I like, because I want the business logic on the server, but....
> we can do load balancing with more servers, we can use different editing contexts for each session, as already done with Zope, and detect easily any update collisions. Each application should handle collisions in its own way.

Absolutely, and I'll definitely post a plan for it this week-end.

> I don't agree with Heinz on the fetch problem, because:
> * the probability we hack on the same data over and over again is very high. This is a matter of natural work.
> * it may be normal to fetch the objects in several steps, although it is more probable to get the objects through object traversal.
> * as Sébastien said, you can use ec.dispose()
> * we definitely need some refresh mechanism. I don't know if the strategy to only reread data if write collisions are detected is better than doing a refresh on some fetch. Just a comment: I think ZEO (Zope Enterprise Objects) uses notifications to invalidate objects, yes, the server sends messages to the client to invalidate data. This does not consume much bandwidth as there are not many objects that change. (But this may not be the same context in RDBMS.)

I think we now all agree that both points of view should be possible.

Now for the inter-process communication/notification, that's yet another feature that should be studied on its own --if you have more details about ZEO, I'm interested.

-- Sébastien.
From: Sebastien B. <sbi...@us...> - 2003-09-26 22:34:13
Federico Heinz <fh...@vi...> wrote:
[...]
> > Okay, I'll try hard to make this happen this week-end, then. BTW, I know there should be documentation for the framework's architecture. Hopefully this will be done one day, but the todo list is sooo long...
>
> Sounds familiar... :-) We'll have to communicate stuff internally, so we're likely to produce *some* form of representation of the architecture. If we do, we'll definitely share it.

That would be great. Do not hesitate to share it early, this is something we can work on together if you wish.

> > You can think about these notifications as a means to provide a specific and application-specific behaviour for minimizing failures under optimistic locking strategies, at least when in a single address-space.
>
> It sounds to me that what you'd be getting is more like an early announcement that a conflict is coming, but the conflict-resolution logic will still have to be there.

Right, the conflict-resolution still needs to be there, but that's an announcement that a "conflict" has already occurred: an object is now out-of-sync because another EC saved its changes (inside a single address-space, we rely on the same db-cache). On the other hand, we'll also need another conflict resolution (possibly the same) when another process has modified the data --that's when optimistic locking fails.

> > Moreover, these notifications are really needed if for any reason you ''choose'' the no-locking policy (which is the only supported policy by now, and the reason why the User's Guide details the problem when using one EC per session).
>
> This is, of course, true. But if we're talking priorities, I think optimistic locking would be first, because you'll pretty much need its infrastructure to resolve the conflict you've just been informed of.

My ideas are not really clear by now, so I'll wait a little before stating anything here --I suspect both are a little more than just complementary, but can't find a clear explanation now.

> > Does all this make sense wrt your own claims & requirements?
>
> It does. And I'm not making claims or requiring things... I'm just exploring possibilities.
>
> By the way, I *like* this whole exchange! Makes me feel good about working with you.

Kewl ;) And I'm sure good things are to happen --and esp. your experience with vertical mapping can be quite a driving force for implementing it.

My turn to have some rest now.

Regards,

-- Sébastien.
From: Sebastien B. <sbi...@us...> - 2003-09-26 22:07:26
Hi,

Federico Heinz <fh...@vi...> wrote:
> On Thu, 2003-09-25 at 19:18, Sebastien Bigaret wrote:
> > Not at all! For postgresql e.g., we use dedicated sequences, one for each inheritance tree;
>
> Trying to recall why we didn't go for this approach, I vaguely remember that we discarded it in search of database independence... (the exploration of PostgreSQL's inheritance is just meant as a temporary hack, DBMS independence is important to us). But we might reconsider this.

If you want DBMS independence, I must be even more explicit about it then. The framework needs to be in control of the mechanism generating the primary keys --it needs to know the PK in advance so that relationships (FK) and other changes can be saved in a single roundtrip to the database.

For postgres and oracle we use sequences; with mysql and sqlite we use a dedicated table, with one column 'id' and one row containing the value for the next available pk. All are named after the root entities in the model: PK_SEQ_<rootEntityName>. When a database is migrated, this sequence/table can be rebuilt with max(pk)+1 over all tables of the root entity and its sub-entities.

(I should probably add a little note about this in the User's Guide; after all, the python code is commented, but not the db schema.)

> > Well, since the siblings get different ids, resolving FKs "simply" means deep-fetching the inheritance tree to find the related row.
>
> I checked the code... and it looks suspiciously simple. You're hiding something from me! :-) I guess there's some cool magic hidden in there, my guess is that objectsForSourceGlobalID has some clever tricks up its sleeve. It's late now, I'll look into it tomorrow. Right now, my sleep-deprived brain can't imagine any other ways of doing it than trying all tables in sequence until you find one that has the record you're looking for, or keeping a table that tracks which table got assigned which key, all of which doesn't sound more efficient than joins...
>
> But things may look clearer tomorrow. Sleep is a marvelous thing, when you can get some :-)

Seems like you were not dreaming yet ;) Yes, that's exactly the way this is done by now: if you have three entities, you'll get three fetches. For example, for SalesClerk and Executive inheriting from Employee, when a fault is cleared 3 fetches are done:

  SELECT t0.id, t0.last_name, t0.first_name, t0.fk_store_id
    FROM EMPLOYEE t0 WHERE (t0.id = 1);
  SELECT t0.id, t0.store_area, t0.last_name, t0.first_name, t0.fk_store_id
    FROM SALES_CLERK t0 WHERE (t0.id = 1);
  SELECT t0.id, t0.office_location, t0.last_name, t0.first_name, t0.fk_store_id
    FROM EXECUTIVE t0 WHERE (t0.id = 1);

We've discussed other details with Ernesto at: https://sf.net/mailarchive/forum.php?thread_id=2563498&forum_id=10674

> > Will this allow a second look at horizontal mapping ? ;)
>
> You seem to be advocating it very much... Any particular reason? We've been doing the vertical mapping thing with databases that are representative of our target, running on slow machines, and the DB never was the bottleneck.

I'm not really advocating it, it's just that I never played with it because I considered it a bit slower than horizontal: with 'nb' classes and subclasses,

- a fetch needs 'nb' fetches+join --horizontal: nb fetches, single table: 1 fetch,
- an update can need 2 queries for vertical, vs 1 for horizontal and single table.

And that's basically why I never tried it --but my opinion has no real weight here, remember I'm really not good at sql: that was one of the motivations for writing the code :) You're perfectly right when you say that the db will not be the bottleneck; the time taken by the framework to build and manage the object graph will be greater anyhow.

BTW if you have precise SQL examples on how you optimize the fetching and updating of objects in a vertical mapping, that could come in handy!

-- Sébastien.
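[Editor's note: the dedicated-table PK mechanism described above is easy to emulate. The following sketch uses plain sqlite3 and a table named after the PK_SEQ_<rootEntityName> convention from the message; the code itself is hypothetical and is not the framework's implementation.]

    # Illustrative sketch of dedicated-table PK generation for databases
    # without sequences. Table name follows the PK_SEQ_<rootEntityName> convention.
    import sqlite3

    conn = sqlite3.connect(':memory:', isolation_level=None)   # autocommit; we manage txns
    conn.execute("CREATE TABLE PK_SEQ_EMPLOYEE (id INTEGER)")
    conn.execute("INSERT INTO PK_SEQ_EMPLOYEE (id) VALUES (1)")  # next available pk

    def next_primary_key(conn):
        """Atomically reserve the next primary key for the Employee tree."""
        cur = conn.cursor()
        cur.execute("BEGIN IMMEDIATE")   # write-lock so two writers cannot get the same pk
        cur.execute("SELECT id FROM PK_SEQ_EMPLOYEE")
        pk = cur.fetchone()[0]
        cur.execute("UPDATE PK_SEQ_EMPLOYEE SET id=?", (pk + 1,))
        cur.execute("COMMIT")
        return pk

    print(next_primary_key(conn))   # 1
    print(next_primary_key(conn))   # 2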
From: Federico H. <fh...@vi...> - 2003-09-26 15:34:12
On Thu, 2003-09-25 at 17:54, Ernesto Revilla wrote:
> For serial, the trick is that the two tables use the same sequence generator, so the problem does not appear.

I stand corrected. As a matter of fact, it seems PostgreSQL implements inheritance in a way that is completely backwards (and different from what I thought was happening). Basically, it seems to be doing horizontal mapping, with selects on the superclass being mapped to a series of selects on every (grand)*child. One would think that inheritance could be implemented in a much smarter way from the inside of the DBMS...

> All this may not be that important if only modeling is used to manipulate data as it may take care of it.

That's right. Thanks for the clarification!

Fede
From: Ernesto R. <er...@si...> - 2003-09-26 10:28:33
I agree with Fede that this should be one of the priorities, cause it has so many positive effects:

we can let modeling run in a 'heavy client', although this is not an approach that I like, because I want the business logic on the server, but....

we can do load balancing with more servers, we can use different editing contexts for each session, as already done with Zope, and detect easily any update collisions. Each application should handle collisions in its own way.

I don't agree with Heinz on the fetch problem, because:

* the probability we hack on the same data over and over again is very high. This is a matter of natural work.
* it may be normal to fetch the objects in several steps, although it is more probable to get the objects through object traversal.
* as Sébastien said, you can use ec.dispose()
* we definitely need some refresh mechanism. I don't know if the strategy to only reread data if write collisions are detected is better than doing a refresh on some fetch. Just a comment: I think ZEO (Zope Enterprise Objects) uses notifications to invalidate objects, yes, the server sends messages to the client to invalidate data. This does not consume much bandwidth as there are not many objects that change. (But this may not be the same context in RDBMS.)

Erny
From: Federico H. <fh...@vi...> - 2003-09-26 00:27:27
On Thu, 2003-09-25 at 19:18, Sebastien Bigaret wrote:
> Not at all! For postgresql e.g., we use dedicated sequences, one for each inheritance tree;

Trying to recall why we didn't go for this approach, I vaguely remember that we discarded it in search of database independence... (the exploration of PostgreSQL's inheritance is just meant as a temporary hack, DBMS independence is important to us). But we might reconsider this.

> Well, since the siblings get different ids, resolving FKs "simply" means deep-fetching the inheritance tree to find the related row.

I checked the code... and it looks suspiciously simple. You're hiding something from me! :-) I guess there's some cool magic hidden in there, my guess is that objectsForSourceGlobalID has some clever tricks up its sleeve. It's late now, I'll look into it tomorrow. Right now, my sleep-deprived brain can't imagine any other ways of doing it than trying all tables in sequence until you find one that has the record you're looking for, or keeping a table that tracks which table got assigned which key, all of which doesn't sound more efficient than joins...

But things may look clearer tomorrow. Sleep is a marvelous thing, when you can get some :-)

> Will this allow a second look at horizontal mapping ? ;)

You seem to be advocating it very much... Any particular reason? We've been doing the vertical mapping thing with databases that are representative of our target, running on slow machines, and the DB never was the bottleneck.

Fede
From: Federico H. <fh...@vi...> - 2003-09-25 23:59:35
On Thu, 2003-09-25 at 19:08, Sebastien Bigaret wrote:
> If you want this, you can simply ec.dispose() after you've done with the changes, and the objects will be invalidated and the corresponding cached rows will be removed (I assume here that you only have one EC at a time). This way, you'll get the exact behaviour you're asking for.

OK, sold! :-)

> Okay, I'll try hard to make this happen this week-end, then. BTW, I know there should be documentation for the framework's architecture. Hopefully this will be done one day, but the todo list is sooo long...

Sounds familiar... :-) We'll have to communicate stuff internally, so we're likely to produce *some* form of representation of the architecture. If we do, we'll definitely share it.

> You can think about these notifications as a means to provide a specific and application-specific behaviour for minimizing failures under optimistic locking strategies, at least when in a single address-space.

It sounds to me that what you'd be getting is more like an early announcement that a conflict is coming, but the conflict-resolution logic will still have to be there.

> Moreover, these notifications are really needed if for any reason you ''choose'' the no-locking policy (which is the only supported policy by now, and the reason why the User's Guide details the problem when using one EC per session).

This is, of course, true. But if we're talking priorities, I think optimistic locking would be first, because you'll pretty much need its infrastructure to resolve the conflict you've just been informed of.

> Does all this make sense wrt your own claims & requirements?

It does. And I'm not making claims or requiring things... I'm just exploring possibilities.

By the way, I *like* this whole exchange! Makes me feel good about working with you.

Fede
From: Sebastien B. <sbi...@us...> - 2003-09-25 22:19:10
Federico Heinz <fh...@vi...> wrote:
> On Thu, 2003-09-25 at 16:36, Sebastien Bigaret wrote:
> > The XML-models or the PyModels are Entity-Relationship models, they do represent how classes map to databases. They are more database-centered than UML-centered, that's why, in such situations, the E-R diagram will differ from the class diagram.
>
> Right. What I meant is that the XML constructs a flat view of the tables making up the object.

Indeed.

> > BTW: there is a reason why models require properties to be copied: a string attribute in a parent can have a width of 40, while a child can ask a width of 50 for the same field.
>
> I got that from the docs... not sure about how good this may be for the app. Of course, when each class maps to a different table this is probably needed, particularly if it's a legacy database...

That was the exact reason why this was made possible!

[my silly comment snipped]

> I sent an example earlier. I understand that what happens is that all Vertebrate fields live in the Vertebrates table, and the Mammals table contains just the non-inherited fields. However, a select on Mammals will also return inherited fields as if they were there.
> This is my understanding of what is going on... I haven't looked at the innards of PostgreSQL to find out.

Dunno either how postgresql achieves this. BTW and just in case, this may be the place to recall that the framework automatically and transparently handles fetching objects on an inheritance tree, whatever the underlying db schema (this is the 'isDeep' parameter of fetch(), see http://modeling.sf.net/UserGuide/ec-fetch-inheritance.html).

> > About unique IDs: the framework takes care of assigning a unique ID (per inheritance tree) to any object, regardless of where they are stored. So, if you create two objects v1(Vertebrate) and m1(Mammal), with Mammal inheriting from Vertebrate, it is guaranteed that v1.ID!=m1.ID, always --even in horizontal mapping.
>
> So we don't use the automatic unique id generator in the DBMS... Doesn't this break horribly in a multiple-address-space scenario?

Not at all! For postgresql e.g., we use dedicated sequences, one for each inheritance tree; when generating the db-schema, you'll see this:

  [...]
  CREATE SEQUENCE PK_SEQ_BOOK START 1
  CREATE SEQUENCE PK_SEQ_WRITER START 1
  [...]

All sequences named PK_SEQ_... are used by the pk generation mechanism. MySQL does not support sequences so we use a different mechanism (a dedicated table containing only one row, plus specific SQL statements to safely increment the next PK value). In any case, it is a strong requirement for the framework's pk-generation process to be safe in multi-threaded (same address space) or in multi-process (different address spaces) environments.

[...]

> > Be also aware that, while pg accepts multiple inheritance, the framework does not support more than one parent for a given entity.
>
> Multiple inheritance is an abomination :-)

I'm glad you said it!

> > BTW, if you can disclose, I'll be really interested in hearing the reasons why you want vertical mapping --esp. against the additional overhead of each fetch (where tables should be JOINed), and each insert/update (where two INSERT/UPDATES are needed).
>
> Well, there were two important problems without vertical mapping:
> * unique ids for sibling classes is a bitch of a problem when you have multiple address spaces

As stated above, the framework does take care of this.

> * we often have relations that are "to instances of any subclass of foo", which are pretty easy to implement in vertical mapping, but were near to impossible in the framework we're trying to break from.
> As a matter of fact, I look forward to finding out how you solved the second problem in Modeling...

Well, since the siblings get different ids, resolving FKs "simply" means deep-fetching the inheritance tree to find the related row. So, when an object has a relationship pointing to an entity, it can actually refer to this entity or any of its sub-entities --with nothing more than a classic foreign key. And of course, this is why the framework takes great care of assigning a different id to the siblings.

If you're interested in looking at the code, the methods responsible for this are:

- for to-one relationships:
  AccessFaultHandler.completeInitializationOfObject() [in module FaultHandler]

- for to-many relationships:
  DatabaseContext.objectsForSourceGlobalID() [in module DatabaseContext]
  triggered by:
  AccessArrayFaultHandler.completeInitializationOfObject() [in module FaultHandler]

And last, willRead() in DatabaseObject (superclass for CustomObject) is the one responsible for asking the fault handler to complete the initialization of the object.

Will this allow a second look at horizontal mapping ? ;)

-- Sébastien.
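[Editor's note: the deep-fetch idea described above -- the FK may point at the root entity or at any of its sub-entities, so each candidate entity is queried in turn -- can be sketched in a few lines. This is an illustrative outline with made-up names, not the FaultHandler/DatabaseContext code referenced in the message.]

    # Illustrative sketch (hypothetical names) of resolving a foreign key
    # against an inheritance tree by trying each entity in turn.
    def resolve_fk(fetch_one, entities, pk):
        """fetch_one(entity_name, pk) -> object or None.
        'entities' lists the root entity and its sub-entities, e.g.
        ['Employee', 'SalesClerk', 'Executive']; because the framework
        guarantees pk uniqueness across the whole tree, at most one of
        the fetches can succeed."""
        for entity in entities:
            obj = fetch_one(entity, pk)
            if obj is not None:
                return obj
        raise LookupError("pk %r not found in inheritance tree %r" % (pk, entities))

    # Toy usage with an in-memory "database":
    rows = {('SalesClerk', 1): {'id': 1, 'store_area': 'north'}}
    fetch_one = lambda entity, pk: rows.get((entity, pk))
    print(resolve_fk(fetch_one, ['Employee', 'SalesClerk', 'Executive'], 1))
    # -> {'id': 1, 'store_area': 'north'}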
From: Sebastien B. <sbi...@us...> - 2003-09-25 22:08:56
Federico Heinz <fh...@vi...> wrote:
> The point I'm trying to make is that applications usually let the user perform transactions on a certain part of the database, and there's little point to keeping the data from previous transactions around.

Let me rephrase what your idea is, so that we're sure we're speaking of the same thing: you consider that fetching is the preliminary phase, then you modify objects (without fetching explicitly anymore), then you save changes and finally discard the object(s) you fetched.

If you want this, you can simply ec.dispose() after you're done with the changes, and the objects will be invalidated and the corresponding cached rows will be removed (I assume here that you only have one EC at a time). This way, you'll get the exact behaviour you're asking for.

> For example, imagine a simple app that allows you to delete, insert or modify the Author's info. The user of such an app will repeatedly look up an Author, change an attribute or two, commit the change, and start over again. When the user looks up the second Author, it doesn't make a lot of sense to keep the first Author's object around, does it? Yes, it is possible that the application will let the user navigate the second Author's books, and if the first Author has co-authored a book with the second, it might also let the user navigate back to the first Author... but I think it's affordable (and in some cases even desirable) that this navigation results in the first Author being fetched again from the database, instead of just fishing a likely stale copy from the cache.

Agreed, this is possible --but you must also understand that other people might think differently, esp. wrt performance issues. Here we just have a couple of objects, so this does not make much difference, but if you need to work on a bigger set of objects that's another story.

Speaking of perf., here are some figures I've just produced on my installation:

- fetching 5000 simple objects (one attribute, a to-one and a to-many relationship, plus an ID) -> 4.77s
- fetching the same 5000 objects while they are already fetched in the EC: 798 ms

That's a gain factor of ~6x --just because in the 2nd case, objects were not re-created and re-populated with their values.

> On Thu, 2003-09-25 at 15:40, Sebastien Bigaret wrote:
> > In fact, there are two levels of caching.
> > 1. Within EC:
> >    - you fetch an object obj1, and modify it,
> >    - then you submit another query, which returns obj1 as well: there, you don't want to override the modification you've made but not saved.
>
> Hmmm... Well, this is the thing. I mean, what are you doing submitting another query while you still have uncommitted changes? I understand this is exactly the right behavior if you are traversing a series of relationships that brings you back to the original table/entity, but my understanding is that when the program says ec.fetch(...), it is actually stating that it's done with the results of the last fetch, and wants to start anew. So if you try to do a fetch while changes are pending, the EC should either commit the changes (don't like it) or raise an exception (much better)

Okay, let's go for a little illustration! I want to be able to fetch while some changes are uncommitted, because I may be in the middle of a modification process, and now, for example, I need to fetch some other objects to add them to my initial object's relationships.

And I want that both the initial changes, and the ones I make after the subsequent fetch(es), are all done atomically wrt the db. For example, I may want to fetch an author, get its books, remove all books from the author's list, delete the author, fetch another author based on any user specification (no traversal here, but a real fetch), assign the former author's books to the latter's list of books, then, and only then, save the changes.

> > 2. The database's rows cache, held by Database, to which the framework refers for various tasks, such as: building fetched objects, computing the changes that need to be forwarded to the database, etc.
>
> I'm arguing that for most applications this cache ought to be flushed with each user-level fetch. I understand that, for single-user applications, longer-term caching can be a significant performance boost (although it may, as noted in the documentation, lead to memory footprint bloat if the application is not restarted regularly). In any other environment, I feel that the risk of the cache becoming stale seriously outweighs performance concerns.

I think we both could argue endlessly ;) but I do understand your point. All that I say is that both mechanisms should be supported. And they are, actually.

[...]

> > 2. -> otherwise, the database cache is searched for the row, and if found, that one will be used instead.
> > (2.) can be annoying, and it's the situations where this is annoying that 'refresh' will address (and in addition to the default mechanism, it will allow you to do whatever you want through a specific delegate if the object actually changed, just like with optimistic locking)
>
> I agree that optimistic locking could make the long-term caching of rows workable. It will also, however, make conflicts more likely, because rows that have been longer in the database cache have a larger probability of becoming stale.
>
> > In fact, clearing the cache cannot be the default, just because you probably won't rely on the framework to modify the data behind your back. Suppose, for example, that when fetching, a previously fetched object has been deleted in the meantime (by another application): what should the framework do? Should it take the responsibility to delete the object in the EC that fetched the data?
>
> If the database doesn't keep a long-term cache, the call to fetch() will not return the deleted row. If the deletion takes place after the fetch(), of course, we'll have to resort to the whole optimistic locking thing. Come to think of it, I think most of my argument rests upon the idea that it's desirable to minimize the likelihood of optimistic locking conflict occurrence, which sounds intuitively right to me, but I don't have any hard data to back it up.

That sounds like a reasonable goal, but some others will argue that this is not their priority number one. In other words, that's an application-design decision, and I do not want to make this decision within the framework but, again, I think we'd better offer the developer the choice by giving him the tools. Agreed however, some of these tools are still missing.

> > Not now, but I can make a plan for it, say, this week-end if you wish.
>
> Well, that would be great! I'm trying (and still failing :-) ) to figure out which module does what in the framework, so an expert opinion on what would need to be done to get optimistic locking and vertical mapping working would be a wonderful thing.

Okay, I'll try hard to make this happen this week-end, then. BTW, I know there should be documentation for the framework's architecture. Hopefully this will be done one day, but the todo list is sooo long...

> > > I must admit I'm kinda skeptic about the notification idea...
> > Agreed, just because the modifications could have been made by any bash/perl/... script which won't post any notifications ;) Back on the notifications, at least they could solve the case where the framework runs in a single address space (this is the case in Zope, for example, or in any threaded application) and an EC saves changes that you'd like to see appear in other ECs.
>
> I'm saying that I don't see how notification could solve conflicts even in a single address space. If both ec1 and ec2 have pending modifications for obj1, and ec1 commits the change and notifies ec2... what will ec2 do with its changes?

You can, for example:

- either ignore the changes, and rely on optimistic locking (this will probably be the default),
- decide to examine the objects, apply the saved changes, then re-apply the uncommitted changes. Example: you modify a person's phone number, at this point you get a notification saying that the tel. number and the middle name have changed: you apply those changes, then re-apply the uncommitted changes for the phone number. [This can maybe be done automatically, although there are some subtle points when it comes to relationships --this needs to be investigated]
- ask the user,
- ... add your application requirements here ;)

You can think about these notifications as a means to provide a specific and application-specific behaviour for minimizing failures under optimistic locking strategies, at least when in a single address-space. Moreover, these notifications are really needed if for any reason you ''choose'' the no-locking policy (which is the only supported policy by now, and the reason why the User's Guide details the problem when using one EC per session).

Does all this make sense wrt your own claims & requirements?

-- Sébastien.
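[Editor's note: to make the author/books illustration above concrete, here is a rough sketch of that sequence against an EditingContext. The fetch/deleteObject/saveChanges calls are the ones discussed in this thread, but the entity name, the qualifier syntax and the relationship accessors (getBooks/addToBooks/removeFromBooks) are assumptions for illustration, not necessarily the API generated by the framework.]

    # Hedged sketch of the scenario described above; several names are
    # assumptions (see note). Everything below is committed in a single
    # database transaction by saveChanges().
    from Modeling.EditingContext import EditingContext

    ec = EditingContext()

    old_author = ec.fetch('Writer', qualifier="lastName == 'Cleese'")[0]
    books = list(old_author.getBooks())       # hypothetical to-many accessor
    for book in books:
        old_author.removeFromBooks(book)      # hypothetical accessor
    ec.deleteObject(old_author)

    # A real fetch is issued while the deletion above is still uncommitted:
    new_author = ec.fetch('Writer', qualifier="lastName == 'Chapman'")[0]
    for book in books:
        new_author.addToBooks(book)           # hypothetical accessor

    ec.saveChanges()                          # everything above is saved atomically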
From: Federico H. <fh...@vi...> - 2003-09-25 21:22:29
On Thu, 2003-09-25 at 16:36, Sebastien Bigaret wrote:
> The XML-models or the PyModels are Entity-Relationship models, they do represent how classes map to databases. They are more database-centered than UML-centered, that's why, in such situations, the E-R diagram will differ from the class diagram.

Right. What I meant is that the XML constructs a flat view of the tables making up the object.

> BTW: there is a reason why models require properties to be copied: a string attribute in a parent can have a width of 40, while a child can ask a width of 50 for the same field.

I got that from the docs... not sure about how good this may be for the app. Of course, when each class maps to a different table this is probably needed, particularly if it's a legacy database...

> You really mean "*non-inherited* fields in Vertebrate are stored in their own table", don't you? (my turn to ask to make sure that we're talking about the same thing ;)

Ermmm... in my example, Vertebrate is the superclass, so all of its fields are non-inherited. I sent an example earlier. I understand that what happens is that all Vertebrate fields live in the Vertebrates table, and the Mammals table contains just the non-inherited fields. However, a select on Mammals will also return inherited fields as if they were there. This is my understanding of what is going on... I haven't looked at the innards of PostgreSQL to find out.

> About unique IDs: the framework takes care of assigning a unique ID (per inheritance tree) to any object, regardless of where they are stored. So, if you create two objects v1(Vertebrate) and m1(Mammal), with Mammal inheriting from Vertebrate, it is guaranteed that v1.ID!=m1.ID, always --even in horizontal mapping.

So we don't use the automatic unique id generator in the DBMS... Doesn't this break horribly in a multiple-address-space scenario?

> you won't be able to fetch data from a specific entity alone, because the framework will not issue the necessary 'select ... from only...' statement, nor will it fetch the 'tableoid' system field --so the 'isDeep' flag of fetch() will have no effect.

This could be a problem... or not! Indeed, this is something we'll have to look into.

> Be also aware that, while pg accepts multiple inheritance, the framework does not support more than one parent for a given entity.

Multiple inheritance is an abomination :-)

> BTW, if you can disclose, I'll be really interested in hearing the reasons why you want vertical mapping --esp. against the additional overhead of each fetch (where tables should be JOINed), and each insert/update (where two INSERT/UPDATES are needed).

Well, there were two important problems without vertical mapping:

* unique ids for sibling classes is a bitch of a problem when you have multiple address spaces
* we often have relations that are "to instances of any subclass of foo", which are pretty easy to implement in vertical mapping, but were near to impossible in the framework we're trying to break from.

As a matter of fact, I look forward to finding out how you solved the second problem in Modeling...

Fede
From: Federico H. <fh...@vi...> - 2003-09-25 21:22:16
On Thu, 2003-09-25 at 15:50, Ernesto Revilla wrote:
> Take into account that in PostgreSQL inherited tables have separate primary key sets

I understand this is not true, and this transcript seems to disprove it (unless I failed to understand your meaning of "separate primary key sets"):

  test=# create table foo (id serial primary key, foo text);
  NOTICE:  CREATE TABLE will create implicit sequence 'foo_id_seq' for SERIAL column 'foo.id'
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'foo_pkey' for table 'foo'
  CREATE
  test=# create table foobar (bar text) inherits (foo);
  CREATE
  test=# insert into foo (foo) values ('hello');
  INSERT 981730 1
  test=# insert into foobar (foo, bar) values ('1,2', '3');
  INSERT 981731 1
  test=# select * from foo;
   id |  foo
  ----+-------
    1 | hello
    2 | 1,2
  (2 rows)
  test=# select * from foobar;
   id | foo | bar
  ----+-----+-----
    2 | 1,2 | 3
  (1 row)

> and that, AFAIK, referential integrity rules in base tables are not checked.

It's actually a bit worse than that: they are checked... but they don't take inheritance into account. This means that if you declare a relationship to foo, this won't let the foreign key point to a foobar record:

  test=# create table bar (id serial primary key, bar int references foo);
  NOTICE:  CREATE TABLE will create implicit sequence 'bar_id_seq' for SERIAL column 'bar.id'
  NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index 'bar_pkey' for table 'bar'
  NOTICE:  CREATE TABLE will create implicit trigger(s) for FOREIGN KEY check(s)
  CREATE
  test=# insert into bar (bar) values (1);
  INSERT 981752 1
  test=# insert into bar (bar) values (2);
  ERROR:  <unnamed> referential integrity violation - key referenced from bar not found in foo

Which is ugly, and just may make the whole thing unusable for vertical mapping, but I'm still looking.

Fede
From: Federico H. <fh...@vi...> - 2003-09-25 21:22:08
I think I have not explained myself well enough, and you're answering just what I said, not what I meant :-)

The point I'm trying to make is that applications usually let the user perform transactions on a certain part of the database, and there's little point to keeping the data from previous transactions around.

For example, imagine a simple app that allows you to delete, insert or modify the Author's info. The user of such an app will repeatedly look up an Author, change an attribute or two, commit the change, and start over again. When the user looks up the second Author, it doesn't make a lot of sense to keep the first Author's object around, does it? Yes, it is possible that the application will let the user navigate the second Author's books, and if the first Author has co-authored a book with the second, it might also let the user navigate back to the first Author... but I think it's affordable (and in some cases even desirable) that this navigation results in the first Author being fetched again from the database, instead of just fishing a likely stale copy from the cache.

On Thu, 2003-09-25 at 15:40, Sebastien Bigaret wrote:
> In fact, there are two levels of caching.
> 1. Within EC:
>    - you fetch an object obj1, and modify it,
>    - then you submit another query, which returns obj1 as well: there, you don't want to override the modification you've made but not saved.

Hmmm... Well, this is the thing. I mean, what are you doing submitting another query while you still have uncommitted changes? I understand this is exactly the right behavior if you are traversing a series of relationships that brings you back to the original table/entity, but my understanding is that when the program says ec.fetch(...), it is actually stating that it's done with the results of the last fetch, and wants to start anew. So if you try to do a fetch while changes are pending, the EC should either commit the changes (don't like it) or raise an exception (much better).

> 2. The database's rows cache, held by Database, to which the framework refers for various tasks, such as: building fetched objects, computing the changes that need to be forwarded to the database, etc.

I'm arguing that for most applications this cache ought to be flushed with each user-level fetch. I understand that, for single-user applications, longer-term caching can be a significant performance boost (although it may, as noted in the documentation, lead to memory footprint bloat if the application is not restarted regularly). In any other environment, I feel that the risk of the cache becoming stale seriously outweighs performance concerns.

> When referring to fetch(), both mechanisms can be triggered:
> 1. -> if the object already exists in the EC, possibly modified, you'll get that one.

I've argued above that fetch() should fail if there are pending changes.

> 2. -> otherwise, the database cache is searched for the row, and if found, that one will be used instead.
> (2.) can be annoying, and it's the situations where this is annoying that 'refresh' will address (and in addition to the default mechanism, it will allow you to do whatever you want through a specific delegate if the object actually changed, just like with optimistic locking)

I agree that optimistic locking could make the long-term caching of rows workable. It will also, however, make conflicts more likely, because rows that have been longer in the database cache have a larger probability of becoming stale.

> In fact, clearing the cache cannot be the default, just because you probably won't rely on the framework to modify the data behind your back. Suppose, for example, that when fetching, a previously fetched object has been deleted in the meantime (by another application): what should the framework do? Should it take the responsibility to delete the object in the EC that fetched the data?

If the database doesn't keep a long-term cache, the call to fetch() will not return the deleted row. If the deletion takes place after the fetch(), of course, we'll have to resort to the whole optimistic locking thing. Come to think of it, I think most of my argument rests upon the idea that it's desirable to minimize the likelihood of optimistic locking conflict occurrence, which sounds intuitively right to me, but I don't have any hard data to back it up.

> Not now, but I can make a plan for it, say, this week-end if you wish.

Well, that would be great! I'm trying (and still failing :-) ) to figure out which module does what in the framework, so an expert opinion on what would need to be done to get optimistic locking and vertical mapping working would be a wonderful thing.

> > I must admit I'm kinda skeptic about the notification idea...
> Agreed, just because the modifications could have been made by any bash/perl/... script which won't post any notifications ;) Back on the notifications, at least they could solve the case where the framework runs in a single address space (this is the case in Zope, for example, or in any threaded application) and an EC saves changes that you'd like to see appear in other ECs.

I'm saying that I don't see how notification could solve conflicts even in a single address space. If both ec1 and ec2 have pending modifications for obj1, and ec1 commits the change and notifies ec2... what will ec2 do with its changes?

Fede
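[Editor's note: for readers wondering what the "flush between user-level transactions" pattern looks like with the tools that already exist, here is a minimal sketch built around ec.dispose() as discussed earlier in the thread; the entity name, qualifier syntax and setter are assumptions for illustration.]

    # Minimal sketch: one EditingContext per user-level transaction, disposed
    # at the end so no rows stay cached between transactions. Accessor and
    # qualifier details are hypothetical.
    from Modeling.EditingContext import EditingContext

    def rename_author(last_name, new_last_name):
        ec = EditingContext()
        try:
            author = ec.fetch('Writer', qualifier="lastName == '%s'" % last_name)[0]
            author.setLastName(new_last_name)    # hypothetical setter
            ec.saveChanges()
        finally:
            ec.dispose()   # invalidate the EC's objects and drop the cached rows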
From: Sebastien B. <sbi...@us...> - 2003-09-25 19:45:25
Hi Ernesto,

Sorry for the delay in the answer.

"Ernesto Revilla" <er...@si...> wrote:
> I've installed modeling 0.9-pre15, with sqlite (2.8.6)
>
> when I create the schema with mdl_generate_DB_schema.py -c then I get output with no semicolons (;) separating the SQL commands, and this is not understood by sqlite.
>
> I modified line 99 of the script (mdl_generate_DB_schema.py) to include a semicolon:
>   str+='%s;\n'%sqlExpr.statement()
>
> now I can do a:
>   mdl_generate_DB_schema.py -c -C mymodel.py >schema.sql
>   sqlite mymodel.sqlite < schema.sql
>
> Directly creating the schema (not specifying the -c option) worked correctly.

That's a reasonable request, I'll probably change the behaviour of the script in the coming release when '-c' is specified --and add an option for disabling it. Could you add a feature request to the project's RFE?

-- Sébastien.
From: Sebastien B. <sbi...@us...> - 2003-09-25 19:37:20
Federico Heinz <fh...@vi...> wrote:
> On Thu, 2003-09-25 at 10:37, Sebastien Bigaret wrote:
> > I can't remember if it's been discussed here, however this is the way I think this could be done:
> > - models: add support for flattened attributes and relationships. Flattened properties are exactly this: take the value for a given field from another entity (hence potentially from another table). Flattened properties will be declared with a simple path, such as: toEmployee.toStore.name
> >   (BTW, backward-compatibility should, and will, be ensured)
>
> OK, so what we're talking about here is that the XML is a description of the *classes*, and how they map to the database, not a description of the database itself (i.e. "entities" declared in the XML are "class" entities, not "table" entities). No problem with that, I'm just trying to make sure we're talking about the same things.

The XML-models or the PyModels are Entity-Relationship models, they do represent how classes map to databases. They are more database-centered than UML-centered, that's why, in such situations, the E-R diagram will differ from the class diagram.

> In this way of handling things, the XML will never reflect inheritance, and all classes that are part of an inheritance hierarchy will have to specify the flattening of each attribute.

The XML will not directly reflect the class diagram, at least not in the same way as UML models, however:

- the inheritance tree is modeled by the 'parent' field in Entity,
- each subclass will have its own entity and table, and these sub-entities will have flattened attributes and relationships pointing to the superclass'.

Here you're right, the inherited attributes and relationships should be flattened (in the xml) from the parent to the children --but this is also the case for horizontal mapping (where each class gets its own table): although in this case there are no flattened properties, the real attributes and relationships are also copied.

> This makes writing the XML by hand a major pain in the neck, though it probably doesn't matter (at least to us, we're planning on generating this XML from our own, richer XML database schema description).

XML-models have no default values, in fact they are a comprehensive description of the models, so in general designing them by hand is a pain --that's why the ZModeler was written, and also why PyModels are a great win :)

Note that in PyModels, properties do not need to be copied from parent to children, this is done automatically.

BTW: there is a reason why models require properties to be copied: a string attribute in a parent can have a width of 40, while a child can ask a width of 50 for the same field. Since model introspection is something the framework does a lot, we do not want to compute inherited attributes if they are not overridden: that's why they are copied. And since xml-models are comprehensive descriptions of models, you also find copies of inherited attributes in entities. PyModels take care of this when loaded, although they also allow you to override the definitions of inherited attributes, if needed. The best of the two worlds ;)

> > - make the framework correctly handle those flattened properties. As far as I can see, this would only impact the low-level bricks of the framework (esp. SQL generation for fetching and saving).
>
> I agree, this should be pretty straightforward, and could be implemented without breaking anything... people who don't use it wouldn't even notice it's there.

Sure.

> > Could you possibly elaborate on that? I never played with postgresql's inheritance support, and it's probably something that could be added in the User's Guide until vertical mapping is supported. If you find some time to illustrate the mechanisms, I'll be interested in learning how they behave wrt. the framework.
>
> Well, I haven't done much work with it personally either, but one of my pals here has. The story is that you can specify inheritance in a PostgreSQL database scheme. When you tell PostgreSQL that table Mammal inherits from Vertebrate, a "SELECT * FROM Mammal" will include all fields defined in Vertebrate as well. The interesting part is that the Vertebrate fields are actually stored in their own table...

You really mean "*non-inherited* fields in Vertebrate are stored in their own table", don't you? (my turn to ask to make sure that we're talking about the same thing ;)

> think of it as vertical mapping implemented in the DBMS. So unique identifiers are unique across all Vertebrates, regardless of whether they are Mammals or not.
> I have not tried this yet, but it looks to me that this could be used to put our application (which uses vertical mapping) in such a state that it can be used with the current framework.

About unique IDs: the framework takes care of assigning a unique ID (per inheritance tree) to any object, regardless of where they are stored. So, if you create two objects v1(Vertebrate) and m1(Mammal), with Mammal inheriting from Vertebrate, it is guaranteed that v1.ID!=m1.ID, always --even in horizontal mapping.

After having quickly looked at the postgresql doc. for inheritance, I can see the following immediate pb. with such an approach: you won't be able to fetch data from a specific entity alone, because the framework will not issue the necessary 'select ... from only...' statement, nor will it fetch the 'tableoid' system field --so the 'isDeep' flag of fetch() will have no effect. This could be a problem... or not! Be also aware that, while pg accepts multiple inheritance, the framework does not support more than one parent for a given entity.

I'm not aware of the issues that Ernesto has raised in his recent post, but for sure this needs to be checked as well. Maybe related, at http://www.postgresql.org/docs/7.3/interactive/ we read:

<< A limitation of the inheritance feature is that indexes (including unique constraints) and foreign key constraints only apply to single tables, not to their inheritance children. Thus, in the above example, specifying that another table's column REFERENCES cities(name) would allow the other table to contain city names but not capital names. This deficiency will probably be fixed in some future release. >>

BTW, if you can disclose, I'll be really interested in hearing the reasons why you want vertical mapping --esp. against the additional overhead of each fetch (where tables should be JOINed), and each insert/update (where two INSERT/UPDATES are needed).

Regards,

-- Sébastien.
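[Editor's note: as an illustration of the overhead Sébastien mentions (a JOINed fetch, and two INSERTs per object), here is what vertical mapping boils down to at the SQL level for the thread's Vertebrate/Mammal example. Plain sqlite3; the schema and column names are invented for the example and are not generated by the framework.]

    # What vertical mapping costs at the SQL level (hypothetical schema).
    import sqlite3

    db = sqlite3.connect(':memory:')
    db.executescript("""
        CREATE TABLE vertebrate (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE mammal     (id INTEGER PRIMARY KEY REFERENCES vertebrate(id),
                                 fur_color TEXT);
    """)

    # Inserting one Mammal needs two INSERTs (one row per table) ...
    db.execute("INSERT INTO vertebrate (id, name) VALUES (1, 'dog')")
    db.execute("INSERT INTO mammal (id, fur_color) VALUES (1, 'brown')")

    # ... and fetching it back needs a JOIN across the inheritance tree.
    row = db.execute("""
        SELECT v.id, v.name, m.fur_color
          FROM vertebrate v JOIN mammal m ON m.id = v.id
         WHERE v.id = 1""").fetchone()
    print(row)   # (1, 'dog', 'brown')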
From: Ernesto R. <er...@si...> - 2003-09-25 18:54:30
----- Original Message -----
From: "Sebastien Bigaret" <sbi...@us...>
To: "Federico Heinz" <fh...@vi...>
Cc: "Modeling Mailing List" <mod...@li...>
Sent: Thursday, September 25, 2003 3:37 PM
Subject: [Modeling-users] Re: Implementing inheritance through vertical mapping

.....
>> [1] depending on priorities. We have done similar work in other contexts, but for now we could choose to use PostgreSQL's inheritance mechanism to implement the database, until we can spare the time to create the vertical mapping layer.
>
> Could you possibly elaborate on that? I never played with postgresql's inheritance support, and it's probably something that could be added in the User's Guide until vertical mapping is supported. If you find some time to illustrate the mechanisms, I'll be interested in learning how they behave wrt. the framework.

Take into account that in PostgreSQL inherited tables have separate primary key sets and that, AFAIK, referential integrity rules in base tables are not checked. So the only inheritance mechanism implemented is that of queries. All other questions are still open and although possible solutions have been discussed, nothing has been done for years. One of the most important problems seems to be that an index covers only one table, so all DML queries (INSERT, DELETE, UPDATE) have to check the indexes for the complete inheritance tree, which, I think, is not implemented.

Erny
From: Sebastien B. <sbi...@us...> - 2003-09-25 18:44:38
Hi Mario,

Mario Ruggier <ma...@ru...> wrote:
> Hi, little issue with usedForLocking, which is mentioned throughout the PyModels Attributes section, each page of which points back to 2.3.3 for further explanation, but there is never a word about what this is for (as far as I can see).
>
> What was this thing for?

Funny that you're asking this now that we discuss optimistic locking ;)

This flag indicates which attributes optimistic locking will check when it is about to save changes: if the value for attribute 'lastname' has changed and 'usedForLocking' is set for that attribute, then under the optimistic locking policy saveChanges() will raise; if the flag is unset, the attribute will be silently overridden (for example, you probably won't mark an object's timestamp as used for locking).

Cheers,

-- Sébastien.
From: Sebastien B. <sbi...@us...> - 2003-09-25 18:40:49
Federico Heinz <fh...@vi...> wrote:
> On Thu, 2003-09-25 at 10:18, Sebastien Bigaret wrote:
> > To be precise, an object's row is cached as long as one instance for this row is registered in one EC.
>
> OK... So the question WRT cache life seems to be "when does a row get deregistered from an EC?". It would seem reasonable to think that every time the EC gets a user-level fetch request (as opposed to a fetch request due to accessing a fault object), it clears its cache, since the application now is obviously interested in another set of objects instead of the ones already in memory.

In fact, there are two levels of caching.

1. Within EC:
   - you fetch an object obj1, and modify it,
   - then you submit another query, which returns obj1 as well: there, you don't want to override the modification you've made but not saved.

2. The database's rows cache, held by Database, to which the framework refers for various tasks, such as: building fetched objects, computing the changes that need to be forwarded to the database, etc.

> This could create problems if the application kept references to objects between fetches, but I'd argue that doing this is a Bad Thing to boot. Should the need arise for this kind of functionality (kind of hard to imagine, but life is weird), a method cumulativeFetch() could be added to the EditingContext, which fetches without clearing the cache first.

When referring to fetch(), both mechanisms can be triggered:

1. -> if the object already exists in the EC, possibly modified, you'll get that one.
2. -> otherwise, the database cache is searched for the row, and if found, that one will be used instead.

(1.) is probably something you do not want to change, (2.) can be annoying, and it's the situations where this is annoying that 'refresh' will address (and in addition to the default mechanism, it will allow you to do whatever you want through a specific delegate if the object actually changed, just like with optimistic locking).

In fact, clearing the cache cannot be the default, just because you probably won't rely on the framework to modify the data behind your back. Suppose, for example, that when fetching, a previously fetched object has been deleted in the meantime (by another application): what should the framework do? Should it take the responsibility to delete the object in the EC that fetched the data? Or suppose you modified the relationships in the EC, and that when fetching these relationships have changed: discarding the data could lead to inconsistencies in the graph of objects, since most of the time, relationships have an inverse (and constitute bi-directional UML associations). When you say that it is a bad thing to hold references to objects between fetches, you forget that the objects themselves actually hold references to the others they are in relation with.

Now if you want to be sure to get fresh data (until 'refresh' for fetch is available), you can make sure that the row is deregistered by calling ec.dispose() on (each of) your ECs. Be aware that this method invalidates any object the EC holds and that it discards any updates/deletes/etc. that are not saved yet. This also has a significant impact on performance, since every object will need to be refetched and rebuilt.

If it's not clear enough, feel free to ask for more ;) Maybe I'm not thinking/answering the right way, so if you have a specific example in mind, that could help.

> > I see the point. This is the current status. Say you have two instances (so, two address spaces) with two ECs, ec1 and ec2. Both query and update an object obj1.
>
> Your description matches what I figured, and I like the optimistic locking idea. Is the implementation of optimistic locking scheduled any time soon? Ideas on how much effort it would entail to implement?

Not now, but I can make a plan for it, say, this week-end if you wish.

> > (In fact, as the documentation says and as you noted, we currently also have this problem between two different ECs in the same address space --but this will be solved by delivering notifications to the appropriate objects)
>
> I must admit I'm kinda skeptic about the notification idea... Assume ec1 and ec2 above are in the same address space now. When ec1 commits changes to object x, it can notify ec2 of this... but what is ec2 going to do with this information? If ec2 has uncommitted changes to x, it has to resort to the same kind of logic that we'd use in the optimistic lock case. In the end, thus, the only thing we gain is that we skip a fetch. Not that this is not important performance-wise, the point I want to make is that notification alone does not solve the problem, we also need the optimistic lock resolution for it to work.

Agreed, just because the modifications could have been made by any bash/perl/... script which won't post any notifications ;) Back on the notifications, at least they could solve the case where the framework runs in a single address space (this is the case in Zope, for example, or in any threaded application) and an EC saves changes that you'd like to see appear in other ECs.

> > Now if you want two different address spaces to be notified of changes made by the other before any attempt to save the changes in the db, we would need a more general notification mechanism which should be able to broadcast changes through the network, but even then, I suspect this is a hard problem to solve.
>
> This would be a nightmare, gazillions of things could go wrong, and they would certainly do so in the worst possible sequence. We don't want to pursue this.

I really like the way you put it ;) and totally agree. In fact, this also applies to particular situations where the data can be changed by any means outside the framework. Such situations require specific and specialized use-cases and actions, so it makes sense, I guess, to leave it open (but we still need to provide the tools for handling them, such as refresh and optimistic locking).

> > Another, cleaner solution (and maybe the only one that can be guaranteed to be 100% safe) could be to explicitly lock the appropriate row before any attempt to modify an object, and to release the lock only after changes have been made --this is the so-called pessimistic locking strategy.
>
> We could implement this as a method of persistent objects, thus x.lock() would perform a locking read on the row until the transaction's done or rolled back. Of course, this means that the programmer will have to take care of which objects to lock, but such is the fate of the pessimistic locking programmer :-)

Yes, that could be done; this is in fact the very basis for automatic pessimistic locking: lock the object as soon as it is modified (binding lock() to willChange()), release the lock when it is saved and/or refaulted.

-- Sébastien.
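[Editor's note: a rough sketch of the "bind lock() to willChange()" idea mentioned above. It is written against a bare DB-API cursor with SELECT ... FOR UPDATE and invented attribute names; it is not the framework's code, and willChange() is only simulated here.]

    # Conceptual sketch of automatic pessimistic locking: the first time an
    # object is about to change, lock its row and keep the lock until commit.
    # Names are hypothetical; paramstyle assumed to be 'format' (e.g. psycopg).
    def lock_row(cursor, table, pk):
        # Blocks other transactions from updating this row until commit/rollback.
        cursor.execute("SELECT 1 FROM %s WHERE id = %%s FOR UPDATE" % table, (pk,))

    class PessimisticLockingMixin:
        _locked = False

        def will_change(self, cursor):
            """What a willChange() hook could do under a pessimistic policy."""
            if not self._locked:
                lock_row(cursor, self.table_name, self.pk)   # hypothetical attributes
                self._locked = True

        def did_save(self):
            self._locked = False   # committing the transaction released the row lock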
From: Sebastien B. <sbi...@us...> - 2003-09-25 17:39:53
|
Federico Heinz <fh...@vi...> wrote: > Thanks for the prompt answer! Indeed, this answers my question, but of > course it prompts another :-) > On Thu, 2003-09-25 at 09:46, Sebastien Bigaret wrote: > > 2. It then opens a transaction in the db, and sends the necessary SQL > > statements (INSERT/UPDATE/DELETE) in this transaction. If any error > > occurs when these SQL statements are executed, the whole > > transaction is rolled back. > > 3. It commits the transaction. Here again, if an error occurs at the > > DB-level, the transaction is rolled back. > I assume that it also raises an exception at this point. Which exception > is this, and what information does it contain? (a pointer to the > relevant code is enough if you don't feel like writing a long answer). Three different exceptions can be raised: - ValidationException, which you already know, - GeneralAdaptorException (raised by AdaptorChannel), - and <embarrassed> hmm, well,... RuntimeError... </embarrassed> The relevant code is in ObjectStoreCoordinator's method saveChangesInEditingContext(). I *know* the exceptions should be better handled --I can still remember when I put the RuntimeError in: I was finalizing the whole code underneath saveChanges() and didn't want to spend time thinking about exceptions... Then it remained as is. I'm also quite sure, although I did not check, that GeneralAdaptorException is swallowed and mutated into a RuntimeError. The good news is that I never encountered such errors in any of the projects that are still alive. However, if in your environment you have, for example, a network that may cause db-connections to be randomly dropped, or specialized triggers/validation mechanisms that are only coded on the SQL side, then you'll probably need something better than a RuntimeError. This is on the TODO list, obviously. And I'm open to any suggestions for defining the exception hierarchy. -- Sébastien. |
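To illustrate the behaviour described above, here is a hedged sketch of catching the exceptions saveChanges() may raise; the import path for ValidationException is assumed and may differ, and, as noted, db-level failures currently surface as a bare RuntimeError.

    from Modeling.EditingContext import EditingContext
    from Modeling.Validation import ValidationException  # assumed import path

    ec = EditingContext()
    # ... insert, update or delete objects registered in 'ec' ...
    try:
        ec.saveChanges()
    except ValidationException:
        # business-rule validation failed before any SQL was sent:
        # fix the offending objects and retry
        pass
    except RuntimeError:
        # db-level error: the transaction has been rolled back; note that
        # GeneralAdaptorException may currently be swallowed into RuntimeError
        pass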
From: Federico H. <fh...@vi...> - 2003-09-25 17:18:00
|
On Thu, 2003-09-25 at 10:37, Sebastien Bigaret wrote: > I can't remember if it's been discussed here, however this is the way > I think this could be done: > - models: add support for flattened attributes and > relationships. Flattened properties are exactly this: take the value > for a given field from another entity (hence potentially from another > table). Flattened properties will be declared with a simple path, > such as: toEmployee.toStore.name > (BTW, backward-compatibility should, and will, be ensured) OK, so what we're talking about here is that the XML is a description of the *classes*, and how they map to the database, not a description of the database itself (i.e. "entities" declared in the XML are "class" entities, not "table" entities). No problem with that, I'm just trying to make sure we're talking about the same thing. In this way of handling things, the XML will never reflect inheritance, and all classes that are part of an inheritance hierarchy will have to specify the flattening of each attribute. This makes writing the XML by hand a major pain in the neck, though it probably doesn't matter (at least to us, we're planning on generating this XML from our own, richer XML database schema description). > - make the framework correctly handle those flattened properties. As > far as I can see, this would only impact the low-level bricks of the > framework (esp. SQL generation for fetching and saving). I agree, this should be pretty straightforward, and could be implemented without breaking anything... people who don't use it wouldn't even notice it's there. > Could you possibly elaborate on that? I never played with postgresql's > inheritance support, and it's probably something that could be added in > the User's Guide until vertical mapping is supported. If you find some > time to illustrate the mechanisms, I'll be interested in learning how > they behave wrt. the framework. Well, I haven't done much work with it personally either, but one of my pals here has. The story is that you can specify inheritance in a PostgreSQL database schema. When you tell PostgreSQL that table Mammal inherits from Vertebrate, a "SELECT * FROM Mammal" will include all fields defined in Vertebrate as well. The interesting part is that the Vertebrate fields are actually stored in their own table... think of it as vertical mapping implemented in the DBMS. So unique identifiers are unique across all Vertebrates, regardless of whether they are Mammals or not. I have not tried this yet, but it looks to me like this could be used to put our application (which uses vertical mapping) in such a state that it can be used with the current framework. Fede |
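For readers who have not used PostgreSQL's table inheritance, here is a small sketch of the mechanism Federico describes, driven through a Python DB-API driver; psycopg2, the database name and the table definitions are only examples.

    import psycopg2  # any DB-API driver would do; psycopg2 is only an example

    conn = psycopg2.connect("dbname=zoo")
    cur = conn.cursor()

    # The parent table declares the common columns once...
    cur.execute("""CREATE TABLE vertebrate (
                       id   serial PRIMARY KEY,
                       name text NOT NULL)""")
    # ...and the child table only declares what it adds.
    cur.execute("""CREATE TABLE mammal (
                       fur_color text
                   ) INHERITS (vertebrate)""")
    conn.commit()

    # SELECT * FROM mammal returns id, name and fur_color;
    # SELECT * FROM vertebrate also returns rows inserted into mammal,
    # restricted to the columns vertebrate itself declares.
    cur.execute("SELECT id, name, fur_color FROM mammal")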
From: Mario R. <ma...@ru...> - 2003-09-25 17:09:46
|
Hi, a little issue with usedForLocking, which is mentioned throughout the PyModels Attributes section, each page of which points back to 2.3.3 for further explanation, but there is never a word about what this is for (as far as I can see). What was this thing for? Cheers, mario |
From: Federico H. <fh...@vi...> - 2003-09-25 16:41:19
|
On Thu, 2003-09-25 at 10:18, Sebastien Bigaret wrote: > To be precise, an object's row is cached as long as one instance for > this row is registered in one EC. OK... So the question WRT cache life seems to be "when does a row get deregistered from an EC?". It would seem reasonable to think that every time the EC gets a user-level fetch request (as opposed to a fetch request due to accessing a fault object), it clears its cache, since the application now is obviously interested in another set of objects instead of the ones already in memory. This could create problems if the application kept references to objects between fetches, but I'd argue that doing this is a Bad Thing to boot. Should the need arise for this kind of functionality (kind of hard to imagine, but life is weird), a method cumulativeFetch() could be added to the EditingContext, which fetches without clearing the cache first. > I see the point. This is the current status. Say you have two instances > (so, two address spaces) with two ECs, ec1 and ec2. Both query and update > an object obj1. Your description matches what I figured, and I like the optimistic locking idea. Is the implementation of optimistic locking scheduled any time soon? Ideas on how much effort it would entail to implement? > (In fact, as the documentation says and as you noted, we currently also > have this problem between two different ECs in the same address space > --but this will be solved by delivering notifications to the > appropriate objects) I must admit I'm kinda skeptical about the notification idea... Assume ec1 and ec2 above are in the same address space now. When ec1 commits changes to object x, it can notify ec2 of this... but what is ec2 going to do with this information? If ec2 has uncommitted changes to x, it has to resort to the same kind of logic that we'd use in the optimistic lock case. In the end, thus, the only thing we gain is that we skip a fetch. Not that this is not important performance-wise, the point I want to make is that notification alone does not solve the problem, we also need the optimistic lock resolution for it to work. > Now if you want two > different address spaces to be notified of changes made by the other > before any attempt to save the changes in the db, we would need a more > general notification mechanism which should be able to broadcast changes > through the network, but even then, I suspect this is a hard problem to solve. This would be a nightmare, gazillions of things could go wrong, and they would certainly do so in the worst possible sequence. We don't want to pursue this. > Another cleaner solution (and maybe the only one that can be > guaranteed to be 100% safe) could be to explicitly lock the appropriate > row before any attempt to modify an object, and to release the lock only > after changes have been made --this is the so-called pessimistic locking > strategy. We could implement this as a method of persistent objects, thus x.lock() would perform a locking read on the row until the transaction's done or rolled back. Of course, this means that the programmer will have to take care of which objects to lock, but such is the fate of the pessimistic locking programmer :-) Fede |
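Purely to visualise the x.lock() proposal above: the snippet below pretends the method already exists, which it does not; the 'Account' entity and its accessors are invented for the example.

    from Modeling.EditingContext import EditingContext

    ec = EditingContext()
    account = ec.fetch('Account')[0]   # made-up entity
    account.lock()                     # proposed API: take a locking read on the row
    account.setBalance(account.getBalance() - 100)
    ec.saveChanges()                   # committing would release the lock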
From: Federico H. <fh...@vi...> - 2003-09-25 14:59:14
|
Sebastien, Thanks for the prompt answer! Indeed, this answers my question, but of course it prompts another :-) On Thu, 2003-09-25 at 09:46, Sebastien Bigaret wrote: > 2. It then opens a transaction in the db, and sends the necessary SQL > statements (INSERT/UPDATE/DELETE) in this transaction. If any error > occurs when these SQL statements are executed, the whole > transaction is rolled back. > 3. It commits the transaction. Here again, if an error occurs at the > DB-level, the transaction is rolled back. I assume that it also raises an exception at this point. Which exception is this, and what information does it contain? (a pointer to the relevant code is enough if you don't feel like writing a long answer). Fede |
From: Sebastien B. <sbi...@us...> - 2003-09-25 13:38:06
|
Federico Heinz <fh...@vi...> wrote: > Modeling does not implement vertical mapping yet, a feature we might[1] > be willing to contribute. However, from the looks of it, the XML schema > definition does not provide the needed information for it: the way it's > structured, it doesn't contain any provisions for telling the system > that a given class takes its data from a series of tables and not just > from one. Agreed. > This can, of course, be corrected, and instantiating an object from > several tables at once was one of those things EOF did beautifully > (maybe still does). I did not check, but I can't see a reason why it would have been removed, even now that it has been completely rewritten in Java :) > But the XML schema definition must be changed, > hopefully in a backwards-compatible way. Are there any plans about how > these changes could be implemented? We could suggest ways for doing it, > but if the issue has been discussed, we'd rather add our salt to the > discussion than make complete fools of ourselves :-) I can't remember if it's been discussed here; however, this is the way I think it could be done: - models: add support for flattened attributes and relationships. Flattened properties are exactly this: take the value for a given field from another entity (hence potentially from another table). Flattened properties will be declared with a simple path, such as: toEmployee.toStore.name (BTW, backward-compatibility should, and will, be ensured) - make the framework correctly handle those flattened properties. As far as I can see, this would only impact the low-level bricks of the framework (esp. SQL generation for fetching and saving). If you have different ideas, techniques, suggestions, etc., please tell; my point of view is, for sure, greatly influenced by my remembrance of past EOF practices ;) and history has already shown that my eyes and brain sometimes need to be forced open (Mario can tell, he who fought my initial doubts for PyModels!) > [1] depending on priorities. We have done similar work in other > contexts, but for now we could choose to use PostgreSQL's inheritance > mechanism to implement the database, until we can spare the time to > create the vertical mapping layer. Could you possibly elaborate on that? I never played with PostgreSQL's inheritance support, and it's probably something that could be added to the User's Guide until vertical mapping is supported. If you find some time to illustrate the mechanisms, I'll be interested in learning how they behave wrt. the framework. -- Sébastien. |
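To make the flattened-property idea more concrete, here is a sketch of the intended effect of a path such as toEmployee.toStore.name; the 'Invoice' entity, its accessors and the flattened 'storeName' attribute are all hypothetical, since the feature does not exist yet.

    from Modeling.EditingContext import EditingContext

    ec = EditingContext()
    invoice = ec.fetch('Invoice')[0]   # hypothetical entity

    # Today the value is reached by traversing the relationships, each step
    # possibly triggering an extra fetch through a fault:
    store_name = invoice.getToEmployee().getToStore().getName()

    # With a flattened attribute declared as toEmployee.toStore.name, the same
    # value would come back with the original fetch, via a SQL join:
    store_name = invoice.getStoreName()  # hypothetical flattened attribute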
From: Sebastien B. <sbi...@us...> - 2003-09-25 13:20:11
|
Hi, Federico Heinz <fh...@vi...> wrote: > How does ZODB compare exactly to Modeling? I've been trying to > understand ZODB from its documentation, but I can't quite figure out > whether ZODB and Modeling are substitutes for one another, or whether > they solve different problems and can be used together. At times, the > ZODB docs seem to say that it's an O-O database engine in itself, at > other times that it's an O/R mapper... I can't quite figure it out. > Any experienced Zopers that can tell? It's been a long time since I last seriously played with ZODB internals, but I'll try to comment to the best of my knowledge. ZODB *is* an OODB, not an RDBMS, that's for sure. It currently offers: - an object database (called Data.fs within Zope), - a base class, Persistence.Persistent, which is very good at observing changes. It is very quick, since it's coded in C; the Modeling equivalents are CustomObject's method willChange() and the ObserverCenter. - a transactional scheme (see also the links at the end of the mail) AFAIK there are currently two major problems with ZODB, depending on your needs: - complex relationships are not easily expressed (see e.g. http://www.zope.org/Members/upfront/ZODBRelationships/FrontPage and the related thread http://mail.zope.org/pipermail/zodb-dev/2003-May/thread.html#4998) - queries on objects are not easily done --you need an external product, such as the ZCatalog in Zope, to achieve this. Needless to say, if you need to build complex queries involving relationships, it can quickly become a nightmare. For a quick and complete overview of the ZODB, you can e.g. refer to Jim Fulton's introduction paper at: http://www.python.org/workshops/2000-01/proceedings/papers/fulton/zodb3.html The introduction to the ZODB/ZEO Dev. Guide by A.M. Kuchling is also of great interest: http://www.zope.org/Wikis/ZODB/FrontPage/guide/index.html Other related resources can be found at: http://www.zope.org/Wikis/ZODB/FrontPage BTW, I know that the Zope3 dev-team is trying to address the problem, although I've no idea of what, how, when, etc. You might want to search the archives of zope3-dev; you can also check the Ape product (http://hathaway.freezope.org/Software/Ape) from Shane Hathaway, for additional info. These are quite general comments, but I hope they are informative nonetheless... -- Sébastien. |
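For readers unfamiliar with ZODB, a minimal usage sketch may make the contrast clearer: persistence works by reachability from a root object, and there is no query language. Import spellings and the transaction API have varied across ZODB/Zope releases (Persistence.Persistent vs persistent.Persistent, get_transaction() vs the transaction module), so the exact names below are indicative only.

    from ZODB import FileStorage, DB
    from persistent import Persistent   # spelled Persistence.Persistent inside Zope 2
    import transaction                   # older releases used the get_transaction() builtin

    class Note(Persistent):
        def __init__(self, text):
            self.text = text

    db = DB(FileStorage.FileStorage('Data.fs'))
    conn = db.open()
    root = conn.root()

    root['note-1'] = Note("remember the milk")
    transaction.commit()                 # attribute changes are tracked by Persistent

    # There is no query language: finding, say, all notes mentioning 'milk'
    # means traversing the object graph yourself, or indexing the objects
    # with something like ZCatalog in Zope.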