From: Thompson, B. B. <BRY...@sa...> - 2006-02-17 13:12:43
Alex,

Ok. I am still getting my feet in this space and I need to think more about how the different kinds of inconsistencies could be permitted in a hybrid 2PL + MVCC strategy. 2PL seems straightforward -- and maybe MVCC is as well.

I am thinking of the implementation in terms of 2PL hierarchical locking [1] support for rw synchronization plus MVCC support for ww synchronization (per [2], section 5.3). What I think I need to do is work through the interaction of those synchronization mechanisms in more depth, maybe do some modeling to support that, and generate some ideas for how this might interface with jdbm and DBCache. If you have a design in mind I am all ears.

Regardless it seems that we could incrementally improve jdbm right now by:

1. integrating DBCache for VLR Tx support (but without concurrency).
2. integrating the b-link support into the b+tree to support index traversal under concurrent modification.

-bryan

PS: Gray laid out those isolation levels in [1], together with hierarchical locking.

[1] J.N. Gray, R.A. Lorie, G.R. Putzolu, I.L. Traiger. Granularity of Locks and Degrees of Consistency in a Shared Data Base. http://www.seas.upenn.edu/~zives/cis650/papers/granularity-locks.pdf

[2] Bernstein, P. A. and Goodman, N. 1981. Concurrency Control in Distributed Database Systems. ACM Comput. Surv. 13, 2 (Jun. 1981), 185-221. DOI= http://doi.acm.org/10.1145/356842.356846 http://www-static.cc.gatech.edu/classes/AY2003/cs8803i_fall/ConcurrencyControl.pdf

-----Original Message-----
From: Alex Boisvert
To: Thompson, Bryan B.
Cc: 'jdb...@li... '; ''Kevin Day ' '; ''JDBM Developer listserv ' '
Sent: 2/16/2006 7:18 PM
Subject: Re: [Jdbm-developer] MC-DataSafe

I think it would look similar to what industrial databases are doing today:

-read uncommitted
-read committed
-repeatable read
-serializable

This is more about relaxing MVCC than locks. You can stack locking on top of these isolation levels for custom levels of conflict allowance or prevention.

alex

Thompson, Bryan B. wrote:

>Alex,
>
>If you are going to relax conflict analysis, what is that going
>to look like under the "proposed" scheme? Release of some locks
>before all locks have been acquired?
>
>-bryan
>
>-----Original Message-----
>From: jdb...@li...
>To: 'Kevin Day '
>Cc: 'JDBM Developer listserv '
>Sent: 2/16/2006 6:45 PM
>Subject: Re: [Jdbm-developer] MC-DataSafe
>
>Serializability is more about how strict you are when doing conflict
>analysis.
>
>To have total ordering (aka serializability) you must enforce that the
>read and write set of a transaction are completely disjoint from the
>write set of any transaction committed after the transaction started.
>
>alex
>
>'Kevin Day ' wrote:
>
>>Alex-
>>
>>Maybe you can help - can you explain to me how MVCC could have
>>anything to do with serializability of transactions? Am I completely
>>off-base in saying that adding read locking to MVCC would defeat the
>>purpose?
>>
>>I'm obviously having a massive disconnect.
>>
>>Thanks,
>>
>>- K
>>
>> > Kevin,
>>
>>MVCC does provide serializability. The tradeoffs among correct
>>concurrency control mechanisms involve performance not correctness.
>>The problems with performance for MVCC are that, by itself, it tends
>>to produce more transaction restarts and lower overall transaction
>>throughput when compared to 2PL. Postgres is some sort of hybrid
>>of 2PL and MVCC. I tried their chat group yesterday to get some
>>insight into exactly what kind of a hybrid, but no one could say "Oh,
>>we implemented XYZ".
>>
>>I do not think that MVCC (or any hybrid of MVCC) is a silver bullet
>>for performance. Neither postgres nor Oracle is a clear performance
>>winner -- in fact I have personal experience with one of these
>>platforms which places it an order of magnitude slower than jdbm for
>>our application (a semantic web store) using a solution developed by
>>the database people themselves.
>>
>>I recommend a closer reading of this paper.
>>I have found later articles
>>which use analytic techniques to examine the performance of a variety of
>>concurrency control mechanisms, but none which benchmark real databases
>>explicitly in terms of this issue. Perhaps we could look at some of the
>>RDBMS benchmark literature for some insight there?
>>
>>Enjoy your vacation,
>>
>>-bryan
>>
>>-----Original Message-----
>>From: jdb...@li...
>><mailto:jdb...@li...>
>>To: JDBM Developer listserv
>>Sent: 2/16/2006 4:48 PM
>>Subject: re[2]: [Jdbm-developer] MC-DataSafe
>>
>>Bryan-
>>
>>Can you please provide a description of how MVCC (with non-serialized
>>processing of transactions) would cause either of the two lost data/bad
>>data examples described in the paper?
>>
>>The way I read your email, it sounds like you are implying that MVCC
>>can't provide transaction isolation or atomic updates, when it
>>absolutely does. MVCC just operates optimistically in these situations.
>>Under MVCC in the first example, an abort would occur. In the second
>>example under MVCC, the report created in T2 would be correct, and T1
>>would commit without problems. In either case, nothing gets left on the
>>floor...
>>
>>The massive advantage of MVCC comes in the second example: If the
>>report required a significant amount of time to complete (it was reading
>>a huge number of rows), then serializing the transactions as described
>>in the paper will cause massive slowdowns for the poor sucker who is
>>trying to execute T1 (and kill concurrency). In MVCC, the update would
>>happen immediately and return, even if T2 takes a long time to complete.
>>This absolutely maximizes concurrency, and is perfectly safe.
>>
>>I'm going to need some serious convincing with concrete examples before
>>I'm willing to favor serializable transactions over MVCC's approach. My
>>concern at this stage is that much of the literature on these topics
>>either doesn't account for the radical departure from the norm that MVCC
>>is, or it is incorrect in its description of how MVCC actually operates
>>in practice.
>>
>>I actually find it humorous that the concurrency control paper we are
>>talking about here says that "the problem for nondistributed DBMSs is
>>well understood... and one approach called two-phase locking has been
>>accepted as a standard solution." That may have been true in 1981 when
>>the paper was published, but the good folks at PostgreSQL and others
>>have pretty much blown that out of the water.
>>
>>This means that any broad claims made in these older papers have to be
>>taken with a considerable grain of salt, and evaluated in the context of
>>the new development of MVCC. Bernstein and Goodman are certainly going
>>to say that 2 phase locking is the only way to go - they weren't even
>>aware that a better approach was possible.
>>
>>Anyway - I'm going to be gone until the 28th - hopefully you and Alex
>>can continue the discussion with vigour while I'm gone!
>>
>>Cheers,
>>
>>- K
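
The conflict rule Alex states in this thread (a transaction's read and write sets must be disjoint from the write set of every transaction committed after it started) can be sketched as a commit-time validation step. This is only an illustrative sketch, not jdbm or DBCache code: the class ConflictValidator, its methods begin and tryCommit, and the use of Long record ids and a commit sequence counter are all hypothetical names and assumptions introduced here.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; none of these names come from jdbm or DBCache. */
class ConflictValidator {

    /** Write set of a committed transaction, tagged with its commit sequence number. */
    private static final class Committed {
        final long commitSeq;
        final Set<Long> writeSet;

        Committed(long commitSeq, Set<Long> writeSet) {
            this.commitSeq = commitSeq;
            this.writeSet = writeSet;
        }
    }

    private final List<Committed> log = new ArrayList<>();
    private long lastCommitSeq = 0;

    /** A starting transaction records the sequence number of the latest commit. */
    synchronized long begin() {
        return lastCommitSeq;
    }

    /**
     * Commits only if readSet and writeSet are both disjoint from the write set
     * of every transaction that committed after startSeq. Returns false when a
     * conflict is found, in which case the caller would abort or restart.
     */
    synchronized boolean tryCommit(long startSeq, Set<Long> readSet, Set<Long> writeSet) {
        for (Committed c : log) {
            boolean committedAfterStart = c.commitSeq > startSeq;
            if (committedAfterStart
                    && (!Collections.disjoint(c.writeSet, readSet)
                        || !Collections.disjoint(c.writeSet, writeSet))) {
                return false;
            }
        }
        // No conflict: record a private copy of the write set for later validations.
        log.add(new Committed(++lastCommitSeq, new HashSet<>(writeSet)));
        return true;
    }
}
```

Dropping the read-set check in tryCommit would leave only write-write conflict detection, which is one way to picture the relaxed isolation levels listed earlier in the thread. The sketch never prunes its log; a real implementation would discard entries older than the oldest active transaction.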