From: Kevin D. <ke...@tr...> - 2006-02-02 22:50:19
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <HTML><HEAD> <STYLE type=text/css> P, UL, OL, DL, DIR, MENU, PRE { margin: 0 auto;}</STYLE> <META content="MSHTML 6.00.2900.2802" name=GENERATOR></HEAD> <BODY leftMargin=1 topMargin=1 rightMargin=1><FONT face=Tahoma> <DIV><FONT face=Arial size=2>Bryan-</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>I think what Alex is saying is that in order to provide ACID, any object retrieved from jdbm by a transaction must be unique to that transaction - i.e. obj(tx1) != obj(tx2), even though recid(tx1)=recid(tx2). The first tx to do an update on it's version of the object (whether it commits or not!) gains an exclusive lock on the recid and all further attempts to update that recid by other transactions will cause the calling transaction to abort.</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>In a complicated data structure like a BTree, this means that you can get an unexpected abort if two tx try to update the same set of BPages at the same time. The positive side of this, though, is that you can have a high degree of concurrency (many tx can read the same BPage without having to synchronize on the BTree structure (basically, the entire BTree becomes transaction local). If the BTree is allowed to interact more closely with the record manager, then there may be a way to design the BTree to allow simultaneous *updates* without causing transaction aborts, but at the cost of blocking threads if necessary to preserve the BTree's integrity. Doing this requires being able to obtain a unique lock object for a given record id (in other words, you can't synchronize on the BTree and BPage itself because those objects are created with transaction scope - so you have to have expose an alternative mechanism for locking).</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>The simplest approach here is to allow the developer to choose whether they want to synchronize the entire BTree during a transaction (i.e. lose concurrency, but guarantee that you won't have an unexpected abort), or not synchronize the BTree at all (i.e. gain concurrency, at the risk of unexpected transaction abort). That said, I am certain that some algorithms could be found that would give the best of both worlds - i.e. only synchronizing when absolutely necessary.</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>The most important thing in my book is that we need to have per-transaction copies of objects. That is going to make the state machine (or equivalent) of the transaction manager a tad interesting... If we have a txA that retrieves a copy of obj1 (let's call it obj1A), then a subsequent txB that retrieves obj1 must get it's *own* copy, obj1B, etc... If txA does an update on obj1A, and commits, then any subsequent update on obj1B by txB must continue to fail until txB has been closed.</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>So, we have to have one object per transaction. </FONT> <FONT face=Arial size=2>At the logical record level, we have to maintain one or two versions of the serialized form of obj1. If obj1 has been updated and flushed from the object cache, then we need to have the serialized form both before and after the update. Otherwise, we just need to hold on to the 'before' bytes. The 'before' bytes are used to construct unique objects for any transaction that fetches a given record. The 'after' bytes are what get written to the safe (if the safe is operating at the logical record level).</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>If we are working at the page level in the cache instead of the logical record level, then we need to keep one or two copies of the page(s) containing any record(s) fetched by transactions...</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>- K</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2></FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2> </FONT> <TABLE> <TBODY> <TR> <TD width=1 bgColor=blue><FONT face=Arial size=2></FONT></TD> <TD><FONT face=Arial size=2><FONT color=red>> Alex,<BR><BR>That is, you wish to allow updates of the btree from concurrent transactions<BR>to be<BR>"merged" in the sense that the btree ordering after the transactions<BR>reflects the<BR>updates from both? That feature is certainly important to improving<BR>concurrency<BR>since indices are otherwise likely to become sources of contention between<BR>writers.<BR><BR>-bryan<BR><BR>-----Original Message-----<BR>From: <A href="mailto:jdb...@li..."><FONT color=#0000ff>jdb...@li...</FONT></A><BR><A href="mailto:jdb...@li..."><FONT color=#0000ff>[mailto:jdb...@li...]</FONT></A> On Behalf Of Alex<BR>Boisvert<BR>Sent: Thursday, February 02, 2006 3:51 PM<BR>To: Kevin Day<BR>Cc: JDBM Developer listserv<BR>Subject: Re: [Jdbm-developer] persistent record management -- avoiding<BR>contention.<BR><BR><BR><BR>I think different transaction must get different copies of objects otherwise<BR>it would place an undue burden on application developers.<BR><BR>Data structures (e.g. BTree) are a different kind of beast because we need<BR>to ensure that commutative operations are allowed between<BR>transactions. Therefore, they must participate more closely with the<BR>transaction's concurrency control mechanism to achieve concurrent updates<BR>whenever possible.<BR><BR>alex<BR><BR>Kevin Day wrote:<BR><BR>> Finally, I really think that we need to all (Alex - time for you to <BR>> join in the discussion!) have at minimum an initial discussion of the <BR>> implications of simultaneous transactions to ACID behavior... There <BR>> are some very big issues here that I think we need to think through <BR>> before we do much anything... The biggest of these issues, as I see <BR>> it, is whether different transacations get their own copies of objects <BR>> or not - if they get their own copy, then it precludes the use of <BR>> higher level syncrhonization for managing complicated data structures <BR>> (e.g. it would be impossible to synchronize on a BTree to ensure that <BR>> the data structure remains valid during re-balancing, etc...). If <BR>> transactions share a copy of the same instance of a given object, then <BR>> Isolation is pretty much trashed.<BR><BR><BR><BR><BR>-------------------------------------------------------<BR>This SF.net email is sponsored by: Splunk Inc. Do you grep through log files<BR>for problems? Stop! Download the new AJAX search engine that makes<BR>searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!<BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642"><FONT color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642</FONT></A><BR>_______________________________________________<BR>Jdbm-developer mailing list <A href="mailto:Jdb...@li..."><FONT color=#0000ff>Jdb...@li...</FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR><BR>-------------------------------------------------------<BR>This SF.net email is sponsored by: Splunk Inc. Do you grep through log files<BR>for problems? Stop! Download the new AJAX search engine that makes<BR>searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!<BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642"><FONT color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642</FONT></A><BR>_______________________________________________<BR>Jdbm-developer mailing list<BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff>Jdb...@li...</FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR><<BR></FONT></FONT></TD></TR></TBODY></TABLE></DIV></FONT></BODY></HTML> |