From: <tho...@us...> - 2011-05-15 18:46:24
Revision: 4503
          http://bigdata.svn.sourceforge.net/bigdata/?rev=4503&view=rev
Author:   thompsonbry
Date:     2011-05-15 18:46:17 +0000 (Sun, 15 May 2011)

Log Message:
-----------
A bit more work on the HTree. The test case that I have been developing turns
out to be a bit of a complex edge case because it involves the split of a full
bucket having only duplicate keys. I am going to rework the test case to be
somewhat simpler in order to work through the core of the logic for
introducing new directory pages with 2 and then with 3 levels, cache
invalidation of global depth, etc. I can then work on the edge cases revolving
around hash buckets which can not be split because they contain duplicate
keys.

Modified Paths:
--------------
    branches/QUADS_QUERY_BRANCH/bigdata/src/architecture/htree.xls
    branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/htree/HTree.java
    branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/htree/TestHTree.java
    branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/htree/TestHTreeUtil.java

Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/architecture/htree.xls
===================================================================
(Binary files differ)

Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/htree/HTree.java
===================================================================
--- branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/htree/HTree.java	2011-05-15 17:46:39 UTC (rev 4502)
+++ branches/QUADS_QUERY_BRANCH/bigdata/src/java/com/bigdata/htree/HTree.java	2011-05-15 18:46:17 UTC (rev 4503)
@@ -685,7 +685,7 @@
      *            The parent {@link DirectoryPage}.
      * @param buddyOffset
      *            The buddyOffset within the <i>parent</i>. This identifies
-     *            which buddy hash table in the parent must be its pointers
+     *            which buddy hash table in the parent must have its pointers
      *            updated such that it points to both the original child and new
      *            child.
      * @param oldDepth
@@ -766,7 +766,7 @@
 
         }
 
-    }
+    } // updatePointersInParent
 
     /**
      * Redistribute the buddy buckets.
@@ -957,6 +957,138 @@
 
         throw new UnsupportedOperationException();
 
     }
+
+    /**
+     * Validate pointers in buddy hash table in the parent against the global
+     * depth as self-reported by the child. By definition, the global depth of
+     * the child is based on the global depth of the parent and the #of pointers
+     * to the child in the parent. The {@link AbstractPage#globalDepth} value is
+     * the cached result of that computation. It can be cross checked by
+     * counting the actual number of pointers in the parent and comparing that
+     * with this formula:
+     * 
+     * <pre>
+     * npointers := 1 << (parent.globalDepth - child.globalDepth)
+     * </pre>
+     * 
+     * This method validates that the cached value of global depth on the child
+     * is consistent with the #of pointers in the specified buddy hash table of
+     * the parent. However, this validation depends on the global depth of the
+     * parent itself being correct.
+     * <p>
+     * Note: Other buddy hash tables in the parent can point into the same
+     * child page. However, due to the buddy offset computation they will not
+     * point into the same buddy on the same child.
+     * 
+     * @param parent
+     *            The parent {@link DirectoryPage}.
+     * @param buddyOffset
+     *            The buddyOffset within the <i>parent</i>. This identifies
+     *            which buddy hash table in the parent should have references to
+     *            the child.
+     * @param child
+     *            The child {@link AbstractPage}.
+     */
+    private void validatePointersInParent(final DirectoryPage parent,
+            final int buddyOffset, final AbstractPage child) {
+
+        // #of address slots in the parent buddy hash table.
+        final int slotsPerBuddy = (1 << parent.globalDepth);
+
+        // #of pointers expected in the parent buddy hash table.
+        final int npointers = 1 << (parent.globalDepth - child.globalDepth);
+
+        // The first slot in the buddy hash table in the parent.
+        final int firstSlot = buddyOffset;
+
+        // The last slot in the buddy hash table in the parent.
+        final int lastSlot = buddyOffset + slotsPerBuddy;
+
+        if (parent.isDirty() && parent.getIdentity() != IRawStore.NULL)
+            throw new RuntimeException(
+                    "Parent address should be NULL since parent is dirty");
+
+        if (child.isDirty() && child.getIdentity() != IRawStore.NULL)
+            throw new RuntimeException(
+                    "Child address should be NULL since child is dirty");
+
+        /*
+         * Count pointers to the child page. There should be [npointers] of
+         * them and they should be contiguous. Since the child is materialized
+         * (we have its reference) we can test pointers. If the child is dirty,
+         * then all slots in the parent having pointers to the child should be
+         * associated with NULL addresses for the child. If the child is clean,
+         * then all slots in the parent having references for the child should
+         * have the same non-zero address for the child. Also, any slot in the
+         * parent having the address of a persistent child should have the same
+         * Reference object, which should be child.self.
+         */
+        int firstPointer = -1;
+        int nfound = 0;
+        boolean discontiguous = false;
+        for (int i = firstSlot; i < lastSlot; i++) {
+            final boolean slotRefIsChild = parent.childRefs[i] == child.self;
+            final boolean slotAddrIsChild = parent.getChildAddr(i) == child
+                    .getIdentity();
+            if (slotAddrIsChild && !slotRefIsChild)
+                throw new RuntimeException(
+                        "Child reference in parent should be child since addr in parent is child addr");
+            final boolean slotIsChild = slotRefIsChild || slotAddrIsChild;
+            if (slotIsChild) {
+                if (child.isDirty()) {
+                    // A dirty child must have a NULL address in the parent.
+                    if (parent.data.getChildAddr(i) != IRawStore.NULL)
+                        throw new RuntimeException(
+                                "Child address in parent should be NULL since child is dirty");
+                } else {
+                    // A clean child must have a non-NULL address in the parent.
+                    if (parent.data.getChildAddr(i) == IRawStore.NULL)
+                        throw new RuntimeException(
+                                "Child address in parent must not be NULL since child is clean");
+                }
+                if (firstPointer == -1)
+                    firstPointer = i;
+                nfound++;
+            } else {
+                if (firstPointer != -1 && nfound != npointers) {
+                    discontiguous = true;
+                }
+            }
+        }
+        if (firstPointer == -1)
+            throw new RuntimeException("No pointers to child");
+        if (nfound != npointers) {
+            /*
+             * This indicates a problem in maintaining the pointers (References
+             * and/or addresses) in the parent for the child, a problem where
+             * child.self is not a unique Reference for the child, and/or a
+             * problem with the cached [globalDepth] for the child.
+             */
+            throw new RuntimeException("Expected " + npointers
+                    + " pointers to child, but found=" + nfound);
+        }
+        if (discontiguous)
+            throw new RuntimeException(
+                    "Pointers to child are discontiguous in parent's buddy hash table.");
+
+    } // validatePointersInParent
+
+//    /**
+//     * TODO Scan the entire parent for addresses and References to each child.
+//     * Any time we see the address of a child, then the Reference must be
+//     * [child.self].
+//     * Any time we see a reference which is not child.self, then the
+//     * address for that slot must not be the same as the address of the
+//     * child.
+//     * <p>
+//     * Note: Since the child is materialized, any Reference in the parent to
+//     * that child should be [child.self]. Due to the buddy page system, when we
+//     * load a child we need to scan the parent across all buddy hash tables and
+//     * set the Reference on any slot having the address of that child. Likewise,
+//     * when the child becomes dirty, we need to clear the address of the child
+//     * in any slot of any buddy hash table in the parent.
+//     */
+//    private void validateChildren(final DirectoryPage parent) {
+//
+//    }
 
     /**
      * Persistence capable abstract base class for HTree pages.
@@ -1741,7 +1873,7 @@
              * slots on the page. We would rely on keys.capacity() in this case
              * rather than #slots. In fact, we could just reenter the method
              * above after doubling as long as we rely on keys.capacity() in the
-             * case where nbuddies==1.
+             * case where nbuddies==1. [Unit test for this case.]
              */
            throw new UnsupportedOperationException();
         }
@@ -1802,8 +1934,6 @@
 
         // Set to false iff an inconsistency is detected.
         boolean ok = true;
 
-        final int globalDepth = getGlobalDepth();
-
         if (parent == null || parent.get() == null) {
             out.println(indent(height) + "ERROR: parent not set");
             ok = false;
@@ -1813,7 +1943,18 @@
             out.println(indent(height)
                     + "ERROR: localDepth exceeds globalDepth of parent");
             ok = false;
         }
-
+
+        /*
+         * FIXME Count the #of pointers in each buddy hash table of the
+         * parent to each buddy bucket in this bucket page and verify that
+         * the globalDepth on the child is consistent with the pointers in
+         * the parent.
+         *
+         * FIXME The same check must be performed for the directory page to
+         * cross validate the parent child linking pattern with the
+         * transient cached globalDepth fields.
+         */
+
         if (debug || ! ok ) {
 
             out.println(indent(height) + toString());
@@ -2107,7 +2248,7 @@
 
     }
 
-    /*
+    /**
      * TODO We should dump each bucket page once. This could be done either
      * by dumping each buddy bucket on the page separately or by skipping
      * through the directory page until we get to the next bucket page and
@@ -2120,6 +2261,8 @@
      * there can not be "gaps" between observations of a reference to a
     * given bucket - once you see another bucket reference a previously
     * observed reference can not then appear.
+     *
+     * @see HTree#validatePointersInParent(DirectoryPage, int, AbstractPage)
     */
    @Override
    protected boolean dump(Level level, PrintStream out,
@@ -2163,14 +2306,6 @@
            }
        }
 
-//        // verify keys are monotonically increasing.
-//        try {
-//            assertKeysMonotonic();
-//        } catch (AssertionError ex) {
-//            out.println(indent(height) + "  ERROR: " + ex);
-//            ok = false;
-//        }
-
        if (debug) {
            out.println(indent(height) + toString());
        }
@@ -2213,19 +2348,6 @@
                    out.println(indent(height) + "  ERROR child[" + i
                            + "] has wrong parent.");
                    ok = false;
-//                    // some extra stuff used to track down a bug.
-//                    if (!ok) {
-//                        if (level == Level.DEBUG) {
-//                            // dump the child also and exit.
-//                            System.err.println("child");
-//                            child.dump(Level.DEBUG, System.err);
-//                            throw new AssertionError();
-//                        } else {
-//                            // recursive call to get debug level dump.
-//                            System.err.println("this");
-//                            this.dump(Level.DEBUG, System.err);
-//                        }
-//                    }
                }
 
                if (child.isDirty()) {
@@ -2250,15 +2372,6 @@
                                + " since the child is dirty");
                        ok = false;
                    }
-                    // if (!dirtyChildren.contains(child)) {
-                    // out
-                    // .println(indent(height + 1)
-                    // + " ERROR child at index="
-                    // + i
-                    // + " is dirty, but not on the dirty list: child="
-                    // + child);
-                    // ok = false;
-                    // }
                } else {
                    /*
                     * Clean child (ie, persistent). The parent of a clean
@@ -2270,15 +2383,6 @@
                                + ", but child is not dirty");
                        ok = false;
                    }
-                    // if (dirtyChildren.contains(child)) {
-                    // out
-                    // .println(indent(height)
-                    // + " ERROR child at index="
-                    // + i
-                    // + " is not dirty, but is on the dirty list: child="
-                    // + child);
-                    // ok = false;
-                    // }
                }
            }
@@ -2354,126 +2458,12 @@
 
                }
 
-                // if (child.isDirty() && !dirtyChildren.contains(child)) {
-                //
-                // out
-                // .println(indent(height + 1)
-                // + "ERROR dirty child not in node's dirty list at index="
-                // + i);
-                //
-                // ok = false;
-                //
-                // }
-                //
-                // if (!child.isDirty() && dirtyChildren.contains(child)) {
-                //
-                // out
-                // .println(indent(height + 1)
-                // +
-                // "ERROR clear child found in node's dirty list at index="
-                // + i);
-                //
-                // ok = false;
-                //
-                // }
-
                if (child.isDirty()) {
 
                    dirty.add(child);
 
                }
 
-//                if (i == 0) {
-//
-//                    if (nkeys == 0) {
-//
-//                        /*
-//                         * Note: a node with zero keys is valid. It MUST
-//                         * have a single child. Such nodes arise when
-//                         * splitting a node in a btree of order m := 3 when
-//                         * the splitIndex is computed as m/2-1 = 0. This is
-//                         * perfectly normal.
-//                         */
-//
-//                    } else {
-//                        /*
-//                         * Note: All keys on the first child MUST be LT the
-//                         * first key on this node.
-//                         */
-//                        final byte[] k0 = getKeys().get(0);
-//                        final byte[] ck0 = child.getKeys().get(0);
-//                        if (BytesUtil.compareBytes(ck0, k0) >= 0) {
-//                        // if( child.compare(0,keys,0) >= 0 ) {
-//
-//                            out
-//                                    .println(indent(height + 1)
-//                                            + "ERROR first key on first child must be LT "
-//                                            + keyAsString(k0)
-//                                            + ", but found "
-//                                            + keyAsString(ck0));
-//
-//                            ok = false;
-//
-//                        }
-//
-//                        if (child.getKeyCount() >= 1) {
-//
-//                            final byte[] ckn = child.getKeys().get(
-//                                    child.getKeyCount() - 1);
-//                            if (BytesUtil.compareBytes(ckn, k0) >= 0) {
-//                            // if (child.compare(child.nkeys-1, keys, 0)
-//                            // >= 0) {
-//
-//                                out
-//                                        .println(indent(height + 1)
-//                                                + "ERROR last key on first child must be LT "
-//                                                + keyAsString(k0)
-//                                                + ", but found "
-//                                                + keyAsString(ckn));
-//
-//                                ok = false;
-//
-//                            }
-//
-//                        }
-//
-//                    }
-//
-//                } else if (i < nkeys) {
-//
-//                    // Note: The delete rule does not preserve this
-//                    // characteristic since we do not
-//                    // update the separatorKey for a leaf when removing its
-//                    // left most key.
-//                    //
-//                    // if (child.isLeaf() && keys[i - 1] != child.keys[0]) {
-//                    //
-//                    // /*
-//                    // * While each key in a node always is the first key of
-//                    // * some leaf, we are only testing the direct children
-//                    // * here. Therefore if the children are not leaves then
-//                    // * we can not cross check their first key with the
-//                    // keys
-//                    // * on this node.
-//                    // */
-//                    // out.println(indent(height + 1)
-//                    // + "ERROR first key on child leaf must be "
-//                    // + keys[i - 1] + ", not " + child.keys[0]
-//                    // + " at index=" + i);
-//                    //
-//                    // ok = false;
-//                    //
-//                    // }
-//
-//                } else {
-//
-//                    /*
-//                     * While there is a child for the last index of a node,
-//                     * there is no key for that index.
-//                     */
-//
-//                }
-
                if (!child.dump(level, out, height + 1, true, materialize)) {
 
@@ -2485,16 +2475,6 @@
 
            }
 
-            // if (dirty.size() != dirtyChildren.size()) {
-            //
-            // out.println(indent(height + 1) + "ERROR found " + dirty.size()
-            // + " dirty children, but " + dirtyChildren.size()
-            // + " in node's dirty list");
-            //
-            // ok = false;
-            //
-            // }
-
        }
 
        return ok;

Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/htree/TestHTree.java
===================================================================
--- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/htree/TestHTree.java	2011-05-15 17:46:39 UTC (rev 4502)
+++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/htree/TestHTree.java	2011-05-15 18:46:17 UTC (rev 4503)
@@ -339,65 +339,132 @@
                new byte[][] {}, htree.lookupAll(new byte[] { 0x02 }));
 
        /*
-         * 4. Insert 0x20. This key is directed into the same buddy bucket
+         * FIXME We MUST NOT decrease the localDepth of (a) since that would
+         * place duplicate keys into different buckets (lost retrieval).
+         * Since (a) can not have its localDepth decreased, we need to have
+         * localDepth(d=2) with one pointer to (a) to get localDepth(a:=2).
+         * That makes this a very complicated example. It would be a lot
+         * easier if we started with distinct keys such that the keys in (a)
+         * could be redistributed. This case should be reserved for later
+         * once the structural mutations are better understood.
+         *
+         * 5. Insert 0x20. This key is directed into the same buddy bucket
         * as the 0x01 keys that we have been inserting. That buddy bucket
         * is already full and, further, it is the only buddy bucket on the
         * page. Since we are not inserting a duplicate key we can split the
-         * buddy bucket. However, since global depth == local depth (i.e.,
-         * only one buddy bucket on the page), this split will introduce a
-         * new directory page. The new directory page basically codes for
-         * the additional prefix bits required to differentiate the two
-         * distinct keys such that they are directed into the appropriate
-         * buckets.
+         * buddy bucket (rather than doubling the size of the bucket).
+         * However, since global depth == local depth (i.e., only one buddy
+         * bucket on the page), this split will introduce a new directory
+         * page. The new directory page basically codes for the additional
+         * prefix bits required to differentiate the two distinct keys such
+         * that they are directed into the appropriate buckets.
         *
-         * Note: The #of new directory levels which have to be introduced
+         * The new directory page (d) is inserted one level below the root
+         * on the path leading to (a). The new directory must have a local
+         * depth of ONE (1), since it will add a one bit distinction. Since
+         * we know the global depth of the root and the local depth of the
+         * new directory page, we solve for npointers := 1 << (globalDepth -
+         * localDepth). This tells us that we will have 2 pointers in the
+         * root to the new directory page.
+         *
+         * The precondition state of root is {a,c,b,b}. Since we know that
+         * we need npointers:=2 pointers to (d), this means that we will
+         * copy the {a,c} references into (d) and replace those references
+         * in the root with references to (d). The root is now {d,d,b,b}. d
+         * is now {a,a;c,c}. Since d has a local depth of 1 and address bits
+         * of 2, it is comprised of two buddy hash tables {a,a} and {c,c}.
+         *
+         * Linking in (d) has also changed the local depths of (a) and (c).
+         * Since they each now have npointers:=2, their localDepth has
+         * been reduced from TWO (2) to ONE (1) (and their transient cached
+         * depth values must be either invalidated or recomputed). Note that
+         * the local depth of (d) and its children (a,c) are ONE after this
+         * operation, so if we force another split in (a) that will force a
+         * split in (d).
+         *
+         * Having introduced a new directory page into the hash tree, we now
+         * retry the insert. Once again, the insert is directed into (a).
+         * Since (a) is still full it is split. (d) is now the parent of
+         * (a). Once again, we have globalDepth(d=1)==localDepth(a=1) so we
+         * need to split (d). However, since localDepth(d=1) is less than
+         * globalDepth(root=2) we can split the buddy hash tables in (d).
+         * This will require us to allocate a new page (f) which will be the
+         * right sibling of (d). There are TWO (2) buddy hash tables in (d).
+         * They are now redistributed between (d) and (f). We also have to
+         * update the pointers to (d) in the parent such that 1/2 of them
+         * point to the new right sibling of (d). Since we have changed the
+         * #of pointers to (d) (from 2 to 1), the local depth of (d) (and of
+         * f) is now TWO (2). Since globalDepth(d=2) is greater than
+         * localDepth(a=1) we can now split (a) into (a,e), redistribute the
+         * tuples in the sole buddy page (a) between (a,e) and update the
+         * pointers in (d) to (a,a,e,e).
+         *
+         * Having split (d) into (d,f), we now retry the insert. This time
+         * the insert is directed into (e). There is room in (e) (it is
+         * empty) and the tuple is inserted without further mutation to the
+         * structure of the hash tree.
+         *
+         * TODO At this point we should also prune the buckets [b] and [c]
+         * since they are empty and replace them with null references.
+         *
+         * TODO The #of new directory levels which have to be introduced
         * here is a function of the #of prefix bits which have to be
         * consumed before a distinction can be made between the existing
         * key (0x01) and the new key (0x20). With an address space of 2
         * bits, each directory level examines the next 2-bits of the key.
         * The key (x20) was chosen since the distinction can be made by
         * adding only one directory level (the keys differ in the first 4
-         * bits).
+         * bits). [Do an alternative example which requires recursive splits
+         * in order to verify that we reenter the logic correctly each time.
+         * E.g., by inserting 0x02 rather than 0x20.]
         *
-         * TODO At this point we should also prune the buckets [b] and [c]
-         * since they are empty and replace them with null references.
+         * TODO Do an example in which we explore an insert which introduces
+         * a new directory level in a 3-level tree. This should be driven
+         * by a single insert so we can examine in depth how the new
+         * directory is introduced and verify whether it is introduced below
+         * the root or above the bucket. [I believe that it is introduced
+         * immediately below the root. Note that a balanced tree, e.g., a
+         * B-Tree, introduces the new level above the root. However, the
+         * HTree is intended to be unbalanced in order to optimize storage
+         * and access times to the parts of the index which correspond to
+         * unequal distributions in the hash codes.]
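+         *
+         * (Sanity check, using only the npointers formula quoted above:
+         * npointers := 1 << (globalDepth - localDepth). In the precondition
+         * root {a,c,b,b} with globalDepth(root)=2, (a) and (c) each have
+         * one pointer, so localDepth(a)=localDepth(c)=2, while (b) has two
+         * pointers, so localDepth(b)=1. Once the root becomes {d,d,b,b},
+         * (d) has two pointers, so localDepth(d)=2-1=1, which agrees with
+         * the local depth assigned to the new directory page above.)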
         */
-        htree.insert(new byte[] { 0x20 }, new byte[] { 0x20 });
-        assertEquals("nnodes", 2, htree.getNodeCount());
-        assertEquals("nleaves", 4, htree.getLeafCount());
-        assertEquals("nentries", 5, htree.getEntryCount());
-        htree.dump(Level.ALL, System.err, true/* materialize */);
-        assertTrue(root == htree.getRoot());
-        assertEquals(4, root.childRefs.length);
-        final DirectoryPage d = (DirectoryPage)root.childRefs[0].get();
-        assertTrue(d == (DirectoryPage) root.childRefs[0].get());
-        assertTrue(d == (DirectoryPage) root.childRefs[1].get());
-        assertTrue(b == (BucketPage) root.childRefs[2].get());
-        assertTrue(b == (BucketPage) root.childRefs[3].get());
-        assertEquals(4, d.childRefs.length);
-        final BucketPage e = (BucketPage)d.childRefs[1].get();
-        assertTrue(a == (BucketPage) d.childRefs[0].get());
-        assertTrue(e == (BucketPage) d.childRefs[1].get());
-        assertTrue(c == (BucketPage) d.childRefs[2].get());
-        assertTrue(c == (BucketPage) d.childRefs[3].get());
-        assertEquals(2, root.getGlobalDepth());
-        assertEquals(2, d.getGlobalDepth());
-        assertEquals(2, a.getGlobalDepth());// unchanged
-        assertEquals(2, e.getGlobalDepth());// same as [a].
-        assertEquals(1, b.getGlobalDepth());// unchanged.
-        assertEquals(2, c.getGlobalDepth());// unchanged.
-        assertTrue(htree.contains(new byte[] { 0x01 }));
-        assertFalse(htree.contains(new byte[] { 0x02 }));
-        assertEquals(new byte[] { 0x01 }, htree
-                .lookupFirst(new byte[] { 0x01 }));
-        assertNull(htree.lookupFirst(new byte[] { 0x02 }));
-        AbstractBTreeTestCase.assertSameIterator(
-                //
-                new byte[][] { new byte[] { 0x01 }, new byte[] { 0x01 },
-                        new byte[] { 0x01 }, new byte[] { 0x01 } }, htree
-                        .lookupAll(new byte[] { 0x01 }));
-        AbstractBTreeTestCase.assertSameIterator(//
-                new byte[][] {}, htree.lookupAll(new byte[] { 0x02 }));
+//        htree.insert(new byte[] { 0x20 }, new byte[] { 0x20 });
+//        assertEquals("nnodes", 2, htree.getNodeCount());
+//        assertEquals("nleaves", 4, htree.getLeafCount());
+//        assertEquals("nentries", 5, htree.getEntryCount());
+//        htree.dump(Level.ALL, System.err, true/* materialize */);
+//        assertTrue(root == htree.getRoot());
+//        assertEquals(4, root.childRefs.length);
+//        final DirectoryPage d = (DirectoryPage)root.childRefs[0].get();
+//        assertTrue(d == (DirectoryPage) root.childRefs[0].get());
+//        assertTrue(d == (DirectoryPage) root.childRefs[1].get());
+//        assertTrue(b == (BucketPage) root.childRefs[2].get());
+//        assertTrue(b == (BucketPage) root.childRefs[3].get());
+//        assertEquals(4, d.childRefs.length);
+//        final BucketPage e = (BucketPage)d.childRefs[1].get();
+//        assertTrue(a == (BucketPage) d.childRefs[0].get());
+//        assertTrue(e == (BucketPage) d.childRefs[1].get());
+//        assertTrue(c == (BucketPage) d.childRefs[2].get());
+//        assertTrue(c == (BucketPage) d.childRefs[3].get());
+//        assertEquals(2, root.getGlobalDepth());
+//        assertEquals(2, d.getGlobalDepth());
+//        assertEquals(2, a.getGlobalDepth());// unchanged
+//        assertEquals(2, e.getGlobalDepth());// same as [a].
+//        assertEquals(1, b.getGlobalDepth());// unchanged.
+//        assertEquals(2, c.getGlobalDepth());// unchanged.
+//        assertTrue(htree.contains(new byte[] { 0x01 }));
+//        assertFalse(htree.contains(new byte[] { 0x02 }));
+//        assertEquals(new byte[] { 0x01 }, htree
+//                .lookupFirst(new byte[] { 0x01 }));
+//        assertNull(htree.lookupFirst(new byte[] { 0x02 }));
+//        AbstractBTreeTestCase.assertSameIterator(
+//                //
+//                new byte[][] { new byte[] { 0x01 }, new byte[] { 0x01 },
+//                        new byte[] { 0x01 }, new byte[] { 0x01 } }, htree
+//                        .lookupAll(new byte[] { 0x01 }));
+//        AbstractBTreeTestCase.assertSameIterator(//
+//                new byte[][] {}, htree.lookupAll(new byte[] { 0x02 }));
 
        // TODO REMOVE (or test suite for remove).

Modified: branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/htree/TestHTreeUtil.java
===================================================================
--- branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/htree/TestHTreeUtil.java	2011-05-15 17:46:39 UTC (rev 4502)
+++ branches/QUADS_QUERY_BRANCH/bigdata/src/test/com/bigdata/htree/TestHTreeUtil.java	2011-05-15 18:46:17 UTC (rev 4503)
@@ -98,6 +98,12 @@
 * range is [0:addressBits]. This depends solely on the <i>globalDepth</i>
 * of a directory page and the #of pointers to child (<i>npointers</i>) in
 * that directory page.</dd>
+ * <dt>nbits</dt>
+ * <dd>This is <code>(globalDepth-localDepth)</code>. It is the #of bits of
+ * the hash code which will be used to compute the <i>buddyOffset</i>.</dd>
+ * <dt>hashBits</dt>
+ * <dd>The LSB <code>globalDepth-localDepth</code> bits of the hash code.
+ * This is used to compute the <i>buddyOffset</i>.</dd>
 * <dt>buddyOffset</dt>
 * <dd>The offset of the buddy hash table or buddy bucket within the child.</dd>
 * </dl>

This was sent by the SourceForge.net collaborative development platform, the world's largest Open Source development site.
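P.S. The depth arithmetic that recurs throughout this change set (the
npointers formula in the new HTree javadoc and the nbits/hashBits entries
above) can be exercised in isolation. Below is a minimal standalone sketch
that assumes only that formula, npointers := 1 << (globalDepth - localDepth);
the class and method names are illustrative and are not part of the bigdata
code base:

public class DepthArithmetic {

    /** #of pointers the parent must hold for a child at the given depths. */
    static int npointers(final int parentGlobalDepth, final int childDepth) {
        return 1 << (parentGlobalDepth - childDepth);
    }

    /** Inverse: recover the child depth from a (power of two) pointer count. */
    static int childDepth(final int parentGlobalDepth, final int npointers) {
        return parentGlobalDepth - Integer.numberOfTrailingZeros(npointers);
    }

    /** #of address slots in one buddy hash table on a page at the given depth. */
    static int slotsPerBuddy(final int globalDepth) {
        return 1 << globalDepth;
    }

    public static void main(final String[] args) {
        // Worked example from TestHTree: root {a,c,b,b}, globalDepth(root) = 2.
        assert npointers(2, 2) == 1;  // (a) and (c) are each pointed at once.
        assert npointers(2, 1) == 2;  // (b) -- and later (d) -- twice.
        assert childDepth(2, 2) == 1; // two pointers => localDepth(d) = 1.
        assert slotsPerBuddy(2) == 4; // four slots per buddy hash table.
        System.out.println("ok (run with java -ea to enable the asserts)");
    }
}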