From: Michael S. <msa...@pc...> - 2005-10-27 14:38:15
|
Ok-- I'm not hearing any major opposition to the easy fix-- dropping all primary key constraints for version tables-- so I'm going to push this into the trunk. If you're really opposed to this, and you've been waiting to object, now's the time! --Mike On Oct 24, 2005, at 9:29 AM, Angel Pizarro wrote: > OK, let's collect our knowledge: > > DB throughput is high enough to cause conflicts in version tables. > Version tables are rarely used. > Indexes/constraints of any type reduce transaction throughput. > > Given these three items, I think that steve's suggestion of using a > sequence to order inserted rows, and debbies request for dropping > any PKs in the version tables are a good way to keep track of data > provenance and not make version tables a limitting factor on > database population. > > -angel > > Michael Saffitz wrote: > > >> What does the VERSION_TRANSACTION_ID column track? Perhaps we >> could take >> advantage of this for a composite PK, although I suppose that it >> may not >> work if there are multiple edits to a given row within a >> transaction (which >> is not well defined here). >> >> --Mike >> >> >> >> On 10/24/05 7:32 AM, "pi...@pc..." >> <pi...@pc...> wrote: >> >> >> >>> The date format was modified (precision increased) and the >>> problem persisted). >>> >>> The version tables are useful for a few reasons: >>> 1) Rows can be restored that were updated or deleted in error. >>> 2) There is a record of who, when, and how a row was updated or >>> deleted. >>> >>> Restoration can be accomplished using the alg_inv_id and as long >>> as the >>> user_id >>> and version modification and original modification date are >>> available, the >>> modifications can be tracked. >>> >>> I'd like to point out that the version tables are rarely used. >>> >>> Quoting Steve Fischer <sfi...@pc...>: >>> >>> >>> >>>> the "race" condition we have is that a single process is rapidly >>>> updating a row in time that is less than the resolution of our >>>> current date stamp. we won't be able to distinguish them by >>>> row_alg_invocation_id as they are done by the same process. it it >>>> likely but not assured that an increase in resolution will solve >>>> the >>>> problem. i propose that we: >>>> 1. increase the precision >>>> 2. drop the constraint >>>> 3. don't really care too much >>>> >>>> if we want to care, then, i suppose, we would have to add another >>>> column to the version tables that was an increasing sequence. >>>> >>>> steve >>>> >>>> >>>> Chris Sarnowski wrote: >>>> >>>> >>>> >>>>> I was going to suggest (2) below, but will nitpick and point out >>>>> that it would make the problem 1000 times less likely to >>>>> occur, but >>>>> that it's still possible to have collisions, especially if, >>>>> say, 2 >>>>> processes are simultaneously working on overlapping data. >>>>> With (1), the problem is that with any collision, there is the >>>>> possibility that you won't be able to determine in what order >>>>> changes were made to data (which could, for example, break an >>>>> automated "rollback" system). Given that collisions are probably >>>>> already rare, and that algorithm_invocation_id helps distinguish >>>>> simultaneous processes, this just may not be an important >>>>> problem. >>>>> So (1) is still the easiest workaround, and I'd guess "good >>>>> enough". >>>>> -Chris >>>>> >>>>> >>>>> On Oct 21, 2005, at 3:44 PM, Angel Pizarro wrote: >>>>> >>>>> >>>>> >>>>>> Debbie's point is that uniqueness in version tables doesn't >>>>>> really >>>>>> matter and it is getting in the way of fast inserts. The only >>>>>> ways >>>>>> to fix this are: >>>>>> >>>>>> 1) drop the constraints >>>>>> 2) increase the precision of row_modification_date to >>>>>> milliseconds. >>>>>> >>>>>> -angel >>>>>> >>>>>> msa...@pc... wrote: >>>>>> >>>>>> >>>>>> >>>>>> >>>>>>> The problem with that is that the ID alone isn't necessarily >>>>>>> unique-- a row that >>>>>>> has been edited multiple times will end up in the version table >>>>>>> multiple times, >>>>>>> hence the addition of the modification_date to (attempt to) >>>>>>> ensure >>>>>>> uniqueness. >>>>>>> >>>>>>> --Mike >>>>>>> >>>>>>> Quoting Steve Fischer <sfi...@pc...>: >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> yes, i guess so. >>>>>>>> >>>>>>>> steve >>>>>>>> >>>>>>>> Deborah Pinney wrote: >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>> You mean drop the uniqueness constraint for the original PK >>>>>>>>> +modification_date? >>>>>>>>> Steve Fischer wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>>> how about just dropping the uniqueness constraint of the >>>>>>>>>> modification date? >>>>>>>>>> >>>>>>>>>> steve >>>>>>>>>> >>>>>>>>>> Chris Sarnowski wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> I can't speak for GUS needs in this particular case, but >>>>>>>>>>> I've >>>>>>>>>>> had the same issue in other systems. Deborah is correct in >>>>>>>>>>> that these fields are not right for PK. The question is how >>>>>>>>>>> important it is to preserve exact order of the changes >>>>>>>>>>> to the >>>>>>>>>>> data. If it's not critical you could just drop the PK. >>>>>>>>>>> If it >>>>>>>>>>> is (or even if it's not but you believe in the rule "every >>>>>>>>>>> table must have a PK"), then you should use a surrogate key >>>>>>>>>>> from a sequence. In practice, at least, the sequence values >>>>>>>>>>> increase monotonically so changes to a record that occur >>>>>>>>>>> multiple times with the same modification date can still be >>>>>>>>>>> ordered correctly. >>>>>>>>>>> I prefer the latter, in general, but I suppose adding a new >>>>>>>>>>> surrogate key field to all version tables in GUS would be a >>>>>>>>>>> pretty big deal. >>>>>>>>>>> >>>>>>>>>>> Just my thoughts. >>>>>>>>>>> -Chris Sarnowski >>>>>>>>>>> >>>>>>>>>>> On Oct 21, 2005, at 11:07 AM, Deborah Pinney wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> I suggest that we drop the uniqueness constraint for PKs of >>>>>>>>>>>> version tables. PKs for these tables consist of the >>>>>>>>>>>> original >>>>>>>>>>>> PK and the modification date. Rapid computation can result >>>>>>>>>>>> in uniqueness copnstraint violations that are not remedied >>>>>>>>>>>> by date format changes. There doesn't seem to be a purpose >>>>>>>>>>>> in having these constraints for any of the version tables >>>>>>>>>>>> and dropping the constraints would not result in the >>>>>>>>>>>> loss of >>>>>>>>>>>> information or change query results. I don't know if there >>>>>>>>>>>> is implicit indexing on these keys but I wonder if this >>>>>>>>>>>> could be handled using explicit indexing. >>>>>>>>>>>> >>>>>>>>>>>> >>>>> >>>>> >>>>> ------------------------------------------------------- >>>>> This SF.Net email is sponsored by: >>>>> Power Architecture Resource Center: Free content, downloads, >>>>> discussions, >>>>> and more. http://solutions.newsforge.com/ibmarch.tmpl >>>>> _______________________________________________ >>>>> Gusdev-gusdev mailing list >>>>> Gus...@li... >>>>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev >>>>> >>>>> >>>> >>>> ------------------------------------------------------- >>>> This SF.Net email is sponsored by: >>>> Power Architecture Resource Center: Free content, downloads, >>>> discussions, >>>> and more. http://solutions.newsforge.com/ibmarch.tmpl >>>> _______________________________________________ >>>> Gusdev-gusdev mailing list >>>> Gus...@li... >>>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev >>>> >>>> >>>> >>> >>> >>> >>> ------------------------------------------------------- >>> This SF.Net email is sponsored by the JBoss Inc. >>> Get Certified Today * Register for a JBoss Training Course >>> Free Certification Exam for All Training Attendees Through End of >>> 2005 >>> Visit http://www.jboss.com/services/certification for more >>> information >>> _______________________________________________ >>> Gusdev-gusdev mailing list >>> Gus...@li... >>> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev >>> >>> >> >> >> >> >> ------------------------------------------------------- >> This SF.Net email is sponsored by the JBoss Inc. >> Get Certified Today * Register for a JBoss Training Course >> Free Certification Exam for All Training Attendees Through End of >> 2005 >> Visit http://www.jboss.com/services/certification for more >> information >> _______________________________________________ >> Gusdev-gusdev mailing list >> Gus...@li... >> https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev >> >> > > > > ------------------------------------------------------- > This SF.Net email is sponsored by the JBoss Inc. > Get Certified Today * Register for a JBoss Training Course > Free Certification Exam for All Training Attendees Through End of 2005 > Visit http://www.jboss.com/services/certification for more information > _______________________________________________ > Gusdev-gusdev mailing list > Gus...@li... > https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev > |