From: Eric E. <ev...@pr...> - 2002-06-03 21:43:29
|
Gavin- > >The decision to cascade saves and updates is really based on efficiency, > >and only the application can know when to take advantage of persisting > >select portions of the graph - thus I would not want to make the decision > >static by placing it in the mapping file. > >In my opinion, the current design gives maximum flexibility here. There >is a tension between two conflicting goals: > > * dont want to update() objects unecessarily or delete() objects >incorrectly > * dont want to force the application to have to spell out every single >save/update/delete in code > >I think what we have is a good balance. Hibernate never does an update or >delete unless you explicitly request it *somewhere*. However, by specifying >it in the mapping file, you can save having to request it *every time*. I think the meaning got lost in my verbiage here. What I meant to suggest was that cascading be available via overloaded methods such as: save/update(Object o, boolean cascade); //I wouldn't include cascade delete This saves the application from having to call these operations for each object in the graph and allows the application to decide when it can optimize by acting on only a portion of the graph. IMHO, the key problem with placing the cascade all option in the mapping file is that once this option is set, the application has no choice but to allow the cascade to happen. If the application knows that only one object out of a graph of a hundred changed, shouldn't it be able to optimize? The other problem I see with the cascade all option is that cascade delete is only viable in some cases, whereas cascade save and update are viable in almost all cases. For instances, an Organization-Organization relation (where one entity is not strictly dependant on the other) should cascade create and save, but not delete. As a result, cascade all can't be used in these instances and the application will *still* end up spelling out all of the calls. This could be addressed by splitting up the cascade option into save/update/delete as you had suggested, but why make it so complicated? > >Second, after taking the other cascading operations out of the mapping > >file, I would use the tag 'dependant' to indicate a relationship in which > >deleting the parent results in the child being deleted. Also, since the > >child lives and dies with the parent, I would allow de-referencing by the > >parent to implicitly delete the child if the call to parent.update() is > >allowing cascades. > >The first part of this (cascading deletes without cascading save + update) >can be addressed by allowing cascade="delete" in the mapping file. This was >something I was prepared to add anyway. Your cascade="delete" tag is the same as my "dependant" tag, so I'll just call it cascade delete. What I am really suggesting here is that cascade delete be the *only* cascade type tag - all other cascades to be determined by the overloaded method calls. This is because delete is the only operation where semantic knowledge of the Parent-Child dependencies determines how the operation is executed. If Pet is dependant on Person such that Pet should be deleted with Person, Pet must be deleted with Person every time- to not do so would be an error for the model. This type of relationship can be put in the mapping file and enforced by having delete act as the mapping file dictates. But if Person is modified and Pet is not, not updating Pet is an optimization that the application can choose, not an error in the same way that failing to delete would be. >The second part - deleting objects dereferenced by the parent - is in >general a Hard Problem. You would probably be able to make it work for >children one level deep fairly easily - but anything beyond that would have >very deep implications for the complexity + efficiency of Hibernate. I >already >implemented something like this for collections. The only way I could >really make it work with any expectation of efficiency was with the >following restrictions: > > * collections may be referenced by exactly one parent object/collection > * collections may only be retrieved by retrieving their parent > >Even then, the code for collections got real hairy, as anyone who has >looked at RelationalDatabaseSession can testify. Basically I regularly >start itching to rip out support for subcollections/toplevel-collections >purely for the purpose of simplifying understandability of that code. Yeah, I knew this was a tall order :) Maybe I'll work on it someday. . . > > Finally, my most questionable idea - add a PersistentLifcycle method: > > boolean impliciteDelete(Session, ParentEntity) > > to allow dependant entities to determine if they should be deleted. > > This would give dependant entities a chance to examine their own state > > for flags(Company issue or non-Company issue) and possibly check for > > other parents who might be holding a reference to them. Checking for > > parent references could get messy. . . I don't have a good answer for > > this. > >If this would be useful, It would be trivial to do. I would add a new >interface: > >interface CascadeVetoer { > public boolean vetoCascade(Object parent, Operation action); >} > >or something...... > >But would it really be _that_ useful? > > >Third, merge the functionality of insert(), save(), and update() into one > >new method (persist?). Assuming that these operations cascade, it becomes > >a chore to keep track of which portions of your graph are new (save) or > >existing (update). For a cascading persist operation, persist() could >walk > >the tree and assume that entities with assigned IDs are existing and those > >without IDs are new. > >This would be very nice. I think we can actually just add this behaviour >straight into update(), with rules like: > >1. if reached object is associated with session, ignore it (currently >throws exception) >2. if reached object has no id property, throw exception (current >behaviour) >3. if reached object has a null id, save it (currently throws exception) >4. if reached object has non-null id, update it (current behaviour) > >The most problematic is (3). How do we detect null values for primitive >ids? I guess we just consider that an unsupported case.... > >One further complication: > >foo -> bar -> baz > >foo, baz need updating. bar needs saving. Under the semantics I just >proposed, session.update(foo) would cascade to session.save(bar). I suspect >that what you would really want is for the next cascade to be >session.update(baz), not session.save(baz) so we would need to remember >that we are cascading an update, not just a save. > >This way, we would support two distinct programming styles > > * find/update/delete (with versioned data) > * find/load/save/delete (with versioned or unversioned data) > >Oh, I just realised something I forgot about before. At present insert() >cascades to save(). Is this best? Should insert() cascade to insert()? >Don't know what is the best semantics here.... Personally, I like the idea of this being a separate method (persist), since save/update/insert all lose their meaning when they cascade to other operations. From your example where foo and baz need updating and bar needs saving. foo -> bar -> baz By calling update(foo), I implicitly call update(bar). Instead of an error, the method is redirected to save(bar). Yet if I call save(bar) directly I would get an error. Why? Instead, I would push save/update/insert off center stage - people can use them if they know exactly what operation they want to perform - and instead use persist() as the normal case, which doesn't imply any one operation. Save/update/insert would then cascade to only their own operations, allowing optimization since they don't have to figure out what operation to perform. And, of course, persist would be overloaded to allow cascading! :) Insert is a tough one, since it is difficult to tell if a referenced object that has an assigned id was meant to be inserted or updated. The only way to tell is an SQL error or EXISTS check. You could stipulate that persist() only does saves and updates, no inserts. Or leave it as an option in the persist method as to whether or not persist is supposed to figure out if an object needs to be inserted or updated. Hopefully, people would use the option only if they really needed to. . . Best wishes, Eric Everman |