From: Kevin D. <ke...@tr...> - 2006-04-03 04:50:59
|
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"> <HTML><HEAD> <STYLE type=text/css> P, UL, OL, DL, DIR, MENU, PRE { margin: 0 auto;}</STYLE> <META content="MSHTML 6.00.2900.2802" name=GENERATOR></HEAD> <BODY leftMargin=1 topMargin=1 rightMargin=1><FONT face=Tahoma> <DIV><FONT face=Arial size=2>Elias-</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>There are a number of things about the current BTree implementation that need some over-hauling - Adding the ability to browse on keys instead of key/value pairs is definitely doable (in fact, in jdbm2, we are in discussion about including the concept of an index BTree that deals *ony* with keys plus a recid lookup).</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>I think the best strategy for reading keys only during a browse would be to have an option in the BPage itself to defer value deserialization. I don't think that we want to get back into stream processing on the serializers (there are very good reasons for working with known sized arrays in the serializers) - we are going to be having a lively (I'm sure) discussion about whether we should change the serlizer interface in jdbm2 to work with ByteBuffer objects instead of byte[].</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>Here is a rough strategy of how deferred deserialization would work: During the BPage deserialize call, the byte arrays for the values would be retrieved and stored. If default serialization is being used, this is about all you can do - the first request for any value in the page requires deserialization of the entire byte array (this is one of the many problems with using default serialization). If custom serialization is being used, then you could store the byte array of each value individually, and only deserialize when needed.</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>Another way to make this work is to ditch the idea of storing values in the BTree entirely (this is what I do, and I suspect it's what a lot of folks wind up doing eventually). Instead, make your key so the last part of the key is the recid of the object that is actually stored elsewhere in the record manager. Then set the value to a 0 sized byte array. This is much closer to an index style btree...</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>With all that said, are you seeing a performance hit from deserializing values in the BTree?</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>Some other things that need addressing in the next iteration of BTree:</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>1. Proper browser handling of structure changes (currently, if you are in the middle of a browse, and the tree changes structure, the browser fails).</FONT></DIV> <DIV><FONT face=Arial size=2>2. Range deletions</FONT></DIV> <DIV><FONT face=Arial size=2>3. I'd like to see an iterator interface</FONT></DIV> <DIV><FONT face=Arial size=2>4. Concurrency - the current implementation is a big problem for concurrent access. We are looking at b-link trees for index trees to address this, but handling concurrency in a generalized (non-index) btree in the main record store is something that will require some work (there is a large amount of work that Bryan is doing now on hierarchal locking sub-systems that may be of use here)</FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2>- K</FONT></DIV> <DIV><FONT face=Arial size=2></FONT></DIV> <DIV><FONT face=Arial size=2></FONT> </DIV> <DIV><FONT face=Arial size=2> </FONT> <TABLE> <TBODY> <TR> <TD width=1 bgColor=blue><FONT face=Arial size=2></FONT></TD> <TD><FONT face=Arial size=2><FONT color=red>By the way, for browsing BTree nodes... It would be nice if there was a<BR>browse option just for keys which didn't load the values. I'm thinking<BR>you could create a custom deserializer that did not read in the BTree<BR>values, just the keys.<BR><BR>To do this, just from browsing the code, it seems like you'd have to<BR>make some internal changes to return InputStream rather than byte[],<BR>e.g. add a PhysicalRowIdManager "fetchStream" or something. Would this<BR>be something I could contribute? This would certainly be a memory<BR>improvement.<BR><BR><BR><BR><BR>-------------------------------------------------------<BR>This SF.Net email is sponsored by xPML, a groundbreaking scripting language<BR>that extends applications into web and mobile media. Attend the live webcast<BR>and join the prime developer group breaking into this new coding territory!<BR><A href="http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642"><FONT color=#0000ff>http://sel.as-us.falkag.net/sel?cmd=lnk&kid=110944&bid=241720&dat=121642</FONT></A><BR>_______________________________________________<BR>Jdbm-developer mailing list<BR><A href="mailto:Jdb...@li..."><FONT color=#0000ff>Jdb...@li...</FONT></A><BR><A href="https://lists.sourceforge.net/lists/listinfo/jdbm-developer"><FONT color=#0000ff>https://lists.sourceforge.net/lists/listinfo/jdbm-developer</FONT></A><BR><BR><<BR></FONT></FONT></TD></TR></TBODY></TABLE></DIV></FONT></BODY></HTML> |