From: 'Kevin D. ' <ke...@tr...> - 2006-02-16 00:02:33
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <HTML><HEAD> <STYLE type=text/css> P, UL, OL, DL, DIR, MENU, PRE { margin: 0 auto;}</STYLE> <META content="MSHTML 6.00.2900.2802" name=GENERATOR></HEAD> <BODY leftMargin=1 topMargin=1 rightMargin=1><FONT face=Tahoma> <DIV><FONT face=Arial size=2>Bryan-</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2></FONT></DIV> <DIV><FONT face=Arial size=2>I think that due to locality reasons that you will not want to store versions of a given row in a single data structure. Using individual records and a data structure with pointers to those records should be much better for performance. (this is kind of the 'anti' version of your logical containers design :-) ).</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>I'll have to give some thoughts to the time/txid stamping of rows... maybe there is an alternative that will have better performance.</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>- K</FONT></DIV> <DIV><FONT face=Arial size=2> </FONT> <TABLE> <TBODY> <TR> <TD width=1 bgColor=blue><FONT face=Arial size=2></FONT></TD> <TD><FONT face=Arial size=2><FONT color=green>> Kevin,<BR><BR>[1] provides some perspective on MVCC. It examines how to achieve<BR>MVCC without rollback. The protocol as described by Reed (the PhD<BR>thesis) would force an abort (tx rollback). The conflict which leads<BR>to rollback is when transactions writing versions with smaller<BR>timestamps have not yet completed. Reed's protocol only permitted the<BR>reader to continue in cases which would lead to an abort. [1] proposes<BR>an approach which does not have this problem.<BR><BR>The main drawback of [1] for jdbm is that it requires pre-declaration<BR>of the write set (all objects that will be written or created). jdbm<BR>does not have sufficient structure to make such pre-declarations viable<BR>outside of the context of a specific application or application<BR>framework. (This is an issue with locking protocols as well).<BR><BR>The broad issue with MVCC is that rollback and transaction restart is<BR>very expensive when compared to blocking. Locking protocols appear to<BR>support higher concurrency by blocking transactions except when a dead<BR>lock is detected, in which case one or more transactions must be aborted<BR>and restarted.<BR><BR>Another issue fixed by the proposal in [1] is that read operations normally<BR>require an update on the database (to write the timestamp).<BR><BR>Based on [1] it is clear that historical versions must be persisted. I<BR>can imagine some ways to do this, e.g., by changing the physical row to<BR>include the #of versions, a timestamp[] for those versions, an offset[]<BR>for those versions, and then the serialized versions themselves. Only<BR>the version required for a given operation would be de-serialized and<BR>update could occur without having to de-serialize all versions. (It<BR>seems that MVCC poses challenges for clustering since you need to trade<BR>off the locality of the versions against locality for access of distinct<BR>data items whose read access patterns are clustered by your app.)<BR><BR>I think that we need to look around some more at MVCC. The analytic<BR>performance articles cover in good depth the costs associated with tx<BR>restart (which is another thing that will not be fun in jdbm since we<BR>do not have declarative query plans). Perhaps we should talk to the<BR>postgres people and see what their experience has been and how it might<BR>apply to an object database?<BR><BR>Based on my reading so far, MVCC would be nice ... but it requires pre-<BR>declaration of write sets for high concurrency and the only way to do<BR>that efficiently is to use a hierarchical declaration scheme for those<BR>write sets (e.g., I will write on this btree or table). Outside of the<BR>btree and htable packages, jdbm does not have higher level data structures<BR>so that means that applications will have to declare the data structures<BR>which they are using and declare their write sets in terms of those data<BR>structures. The exact same problem exists for efficient locking schemes<BR>-- you need to declare locks in terms of a hierarchy which is application<BR>specific. If you have a application framework then you can hope to hide<BR>some of this, but some of it will always bleed up to the application level<BR>as far as I can tell.<BR><BR>Both MVCC and locking (and the associated detection of cycles in the<BR>WAITS_ON graph) are perhaps comparable in terms of complexity as far<BR>as I can tell.<BR><BR>The B-link stuff is just a win since it allows concurrent traversal<BR>of an index which is being modified. Likewise replacing the RecordFile<BR>and TransactionManager with DBCache is a win. Introducing concurrency<BR>control appears to require application level semantics for high<BR>concurrency.<BR><BR>-bryan<BR><BR>[1] Gael N. Buckley, Abraham Silberschatz: Obtaining Progressive<BR> Protocols for a Simple Multiversion Database Model. VLDB 1983:<BR> 74-80 <A href="http://www.vldb.org/conf/1983/P074.PDF"><FONT color=#0000ff>http://www.vldb.org/conf/1983/P074.PDF</FONT></A><BR><BR><BR>-----Original Message-----<BR>From: <A href="mailto:jdb...@li..."><FONT color=#0000ff>jdb...@li...</FONT></A><BR>To: JDBM Developer listserv<BR>Sent: 2/15/2006 8:54 AM<BR>Subject: re[4]: [Jdbm-developer] MC-DataSafe (in Persistent Store Inte<BR>rface).<BR><BR>Bryan-<BR><BR>hmmmm - From my reading, it would be a pretty significant variation...<BR>I think the operating characteristics of MVCC are completely different<BR>than those of traditional timestamp CC (timestamp implies that tx will<BR>always be comitted in a specified order, and blocks tx that attempt to<BR>commit in a different order - MVCC is completely different).<BR><BR>Kind of like saying that a yugo and a ferrari are both cars...<BR><BR>- K<BR> <BR> > MVCC is considered a timestamping variation. I agree that<BR>this paper<BR>does not directly analyze MVCC, but there are some cites in there which<BR>do.<BR><BR>-bryan <BR><BR>-----Original Message-----<BR>From: <A href="mailto:jdb...@li..."><FONT color=#0000ff><mailto:jdb...@li...></FONT></A><BR><A href="mailto:jdb...@li..."><FONT color=#0000ff>jdb...@li...</FONT></A><BR>To: JDBM Developer listserv<BR>Sent: 2/14/2006 8:16 PM<BR>Subject: re[2]: [Jdbm-developer] MC-DataSafe (in Persistent Store<BR>Interface).<BR><BR>Bryan-<BR><BR>From reading [1], I don't think that timestamping and MVCC have much in<BR>common... timestamping *blocks* reads, MVCC allows reads to continue.<BR>I don't think that conclusions drawn about time stamping will have any<BR>bearing on MVCC...<BR><BR>I'm a bit puzzled why they are refering to "multiversion timestamp<BR>concurrency" in the second paper - unless they are just using a<BR>timestamp for the txid as opposed to a running counter (not sure why<BR>you'd do that as it seems like it would open you up to id conflicts). I<BR>don't have an ACM membership (may have to get one), so I can't read the<BR>paper itself (no time right now anyway)...<BR><BR>- K<BR><BR><BR> > Kevin,<BR><BR>I have done some further reading. This article [1] appears to be the<BR>locus of the critique -- and it is a critique for centralized databases,<BR>so it is relevant. Among other things it provides a critique of basic<BR>timestamp ordering, much of which may apply to MVCC. The article winds<BR>up favoring specific combinations of concurrency control and recovery<BR>mechanisms. I have not read it closely enough yet to interpret it in<BR>terms of jdbm or DBCache, and I have not tracked down the MVCC<BR>references<BR>either. See [2] also.<BR><BR>-bryan<BR><BR>[1] R AGRAWAL, DJ DEWITT. Integrated Concurrency Control and Recovery<BR> Mechanisms: Design and Performance Evaluation (1985) <BR><BR><A href="http://www.cs.ucsb.edu/~ravenben/papers/db/recovery/p529-agrawal.pdf"><FONT color=#0000ff><http://www.cs.ucsb.edu/~ravenben/papers/db/recovery/p529-agrawal.pdf></FONT></A><BR><A href="http://www.cs.ucsb.edu/~ravenben/papers/db/recovery/p529-agrawal.pdf"><FONT color=#0000ff><http://www.cs.ucsb.edu/~ravenben/papers/db/recovery/p529-agrawal.pdf></FONT></A><BR><A href="http://www.cs.ucsb.edu/~ravenben/papers/db/recovery/p529-agrawal.pdf"><FONT color=#0000ff><http://www.cs.ucsb.edu/~ravenben/papers/db/recovery/p529-agrawal.pdf></FONT></A><BR><A href="http://www.cs.ucsb.edu/~ravenben/papers/db/recovery/p529-agrawal.pdf"><FONT color=#0000ff>http://www.cs.ucsb.edu/~ravenben/papers/db/recovery/p529-agrawal.pdf</FONT></A><BR><BR>[2] Sun, R. and Thomas, G. 1987. Performance results on multiversion<BR> timestamp concurrency control with predeclared writesets. In<BR> Proceedings of the Sixth ACM SIGACT-SIGMOD-SIGART Symposium on<BR> Principles of Database Systems (San Diego, California, United<BR> States, March 23 - 25, 1987). M. Y. Vardi, Ed. PODS '87. ACM<BR> Press, New York, NY, 177-184. DOI=<BR> <A href="http://doi.acm.org/10.1145/28659.28678"><FONT color=#0000ff><http://doi.acm.org/10.1145/28659.28678></FONT></A><BR><A href="http://doi.acm.org/10.1145/28659.28678"><FONT color=#0000ff><http://doi.acm.org/10.1145/28659.28678></FONT></A><BR><A href="http://doi.acm.org/10.1145/28659.28678"><FONT color=#0000ff><http://doi.acm.org/10.1145/28659.28678></FONT></A><BR><A href="http://doi.acm.org/10.1145/28659.28678"><FONT color=#0000ff>http://doi.acm.org/10.1145/28659.28678</FONT></A><BR><BR><BR>-----Original Message-----<BR>From: <A href="mailto:jdb...@li..."><FONT color=#0000ff><mailto:jdb...@li...></FONT></A><BR><A href="mailto:jdb...@li..."><FONT color=#0000ff><mailto:jdb...@li...></FONT></A><BR><A href="mailto:jdb...@li..."><FONT color=#0000ff><mailto:jdb...@li...></FONT></A><BR><A href="mailto:jdb...@li..."><FONT color=#0000ff>jdb...@li...</FONT></A><BR>To: JDBM Developer listserv<BR>Sent: 2/13/2006 9:00 PM<BR>Subject: re: [Jdbm-developer] MC-DataSafe (in Persistent Store<BR>Interface).<BR><BR>Bryan-<BR><BR>My initial pass at this is that the criticism does apply primarily in<BR>the context of a distributed store. I suspect (without reading in great<BR>detail) that there are layers between the concurrency control layer (at<BR>the top of their stack) and the low level atomic commit layer that<BR>inhibit recovery (most likely, these layers are related to the<BR>distributed nature of the store).<BR><BR>That said, I haven't read this in great detail, so I could be off. I<BR>would need to have a good example of what types of opportunities are<BR>lost before I could really say whether this applies to our discussion so<BR>far or not - were there any other details in the paper that hinted at<BR>what those may have been?<BR><BR>Thanks!<BR><BR>- K<BR><BR><BR> > All,<BR><BR>I would appreciate some perspectives into [1] in the section in which<BR>MC-DataSafe is discussed. It appears on page 24 (as numbered on the<BR>document) that the notion of handling concurrency control at the object<BR>layer and simplifying it at the DBCache layer is advocated. However in<BR>the middle of the next page, it appears that this is critiqued.<BR><BR>> By placing concurrency control at the top of the Flask layered<BR>> architecture, opportunities for exploiting the synergy between<BR>> recovery and concurrency control [Agrawal and DeWitt 1985; Alonso et<BR>> al. 1994; Feeley et al. 1994] are missed, and to the extent that the<BR>> storage layer replicates pages, a duplication of effort is introduced<BR>> on account of the coherency mechanisms necessary within the storage<BR>> layer.<BR><BR>It may be that this critique was only meant to apply in the context of<BR>a distributed store, but I would like to understand this issue more as<BR>we are discussing a similar layering.<BR><BR>I will chase some of the references and see if that sheds any more<BR>light.<BR><BR>-bryan<BR><BR>[1] <A href="http://citeseer.ist.psu.edu/187029.html"><FONT color=#0000ff><http://citeseer.ist.psu.edu/187029.html></FONT></A><BR><A href="http://citeseer.ist.psu.edu/187029.html"><FONT color=#0000ff><http://citeseer.ist.psu.edu/187029.html></FONT></A><BR><A href="http://citeseer.ist.psu.edu/187029.html"><FONT color=#0000ff><http://citeseer.ist.psu.edu/187029.html></FONT></A><BR><A href="http://citeseer.ist.psu.edu/187029.html"><FONT color=#0000ff><http://citeseer.ist.psu.edu/187029.html></FONT></A><BR><A href="http://citeseer.ist.psu.edu/187029.html"><FONT color=#0000ff><http://citeseer.ist.psu.edu/187029.html></FONT></A><BR><A href="http://citeseer.ist.psu.edu/187029.html"><FONT color=#0000ff><http://citeseer.ist.psu.edu/187029.html></FONT></A><BR><A href="http://citeseer.ist.psu.edu/187029.html"><FONT color=#0000ff><http://citeseer.ist.psu.edu/187029.html></FONT></A><BR><A href="http://citeseer.ist.psu.edu/187029.html"><FONT color=#0000ff>http://citeseer.ist.psu.edu/187029.html</FONT></A><BR><BR><BR>-------------------------------------------------------<BR>This SF.net email is sponsored by: Splunk Inc. Do you grep through log<BR>files<BR>for problems? Stop! Download the new AJAX search engine that makes<BR>searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!<BR><BR><BR><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164"><FONT color=#0000ff><http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164</FONT></A><BR>><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164"><FONT color=#0000ff><http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164</FONT></A><BR>><BR><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164"><FONT color=#0000ff><http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164</FONT></A><BR>><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164"><FONT color=#0000ff><http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164</FONT></A><BR>2><BR><BR><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164"><FONT color=#0000ff><http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164</FONT></A><BR>><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164"><FONT color=#0000ff><http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164</FONT></A><BR>2><BR><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164"><FONT color=#0000ff><http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164</FONT></A><BR>2><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642"><FONT color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642</FONT></A><BR>_______________________________________________<BR>Jdbm-developer mailing list<BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff><mailto:Jdb...@li...></FONT></A><BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff><mailto:Jdb...@li...></FONT></A><BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff><mailto:Jdb...@li...></FONT></A><BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff><mailto:Jdb...@li...></FONT></A><BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff><mailto:Jdb...@li...></FONT></A><BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff><mailto:Jdb...@li...></FONT></A><BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff><mailto:Jdb...@li...></FONT></A><BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff>Jdb...@li...</FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR><<BR> <BR>------------------------------------------------------- This SF.net<BR>email is sponsored by: Splunk Inc. Do you grep through log files for<BR>problems? Stop! Download the new AJAX search engine that makes searching<BR>your log files as easy as surfing the web. DOWNLOAD SPLUNK!<BR><BR><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164"><FONT color=#0000ff><http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164</FONT></A><BR>><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164"><FONT color=#0000ff><http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164</FONT></A><BR>2><BR><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164"><FONT color=#0000ff><http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164</FONT></A><BR>2><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642"><FONT color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642</FONT></A><BR>_______________________________________________ Jdbm-developer mailing<BR>list <A href="mailto:Jdb...@li..."><FONT color=#0000ff><mailto:Jdb...@li...></FONT></A><BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff><mailto:Jdb...@li...></FONT></A><BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff><mailto:Jdb...@li...></FONT></A><BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff>Jdb...@li...</FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR><<BR> <BR>------------------------------------------------------- This SF.net<BR>email is sponsored by: Splunk Inc. Do you grep through log files for<BR>problems? Stop! Download the new AJAX search engine that makes searching<BR>your log files as easy as surfing the web. DOWNLOAD SPLUNK!<BR><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164"><FONT color=#0000ff><http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=12164</FONT></A><BR>2><BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642"><FONT color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642</FONT></A><BR>_______________________________________________ Jdbm-developer mailing<BR>list <A href="mailto:Jdb...@li..."><FONT color=#0000ff><mailto:Jdb...@li...></FONT></A><BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff>Jdb...@li...</FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff><https://lists.sourceforge.net/lists/listinfo/jdbm-developer></FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR><<BR> <BR>------------------------------------------------------- This SF.net<BR>email is sponsored by: Splunk Inc. Do you grep through log files for<BR>problems? Stop! Download the new AJAX search engine that makes searching<BR>your log files as easy as surfing the web. DOWNLOAD SPLUNK!<BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642"><FONT color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642</FONT></A><BR>_______________________________________________ Jdbm-developer mailing<BR>list <A href="mailto:Jdb...@li..."><FONT color=#0000ff>Jdb...@li...</FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR><<BR></FONT></FONT></TD></TR></TBODY></TABLE></DIV></FONT></BODY></HTML> |