Work at SourceForge, help us to make it a better place! We have an immediate need for a Support Technician in our San Francisco or Denver office.

Close

Unit tests in 1.0 (CVS) build fail

Kevin Day
2005-08-21
2013-06-03
  • Kevin Day
    Kevin Day
    2005-08-21

    FYI - when running jdbm.recman.TestStress, I'm getting the following junit test failures.  Can someone please confirm?  This is a serious problem if I'm right...

    testBasics(jdbm.recman.TestStress)
    junit.framework.AssertionFailedError: aborting test at slot 91: junit.framework.AssertionFailedError: fetch round=10447, slot=91, slot(132000,sz=256,b=-71)
        at junit.framework.Assert.fail(Assert.java:51)
        at jdbm.recman.TestStress.testBasics(TestStress.java:445)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at junit.framework.TestCase.runTest(TestCase.java:166)
        at junit.framework.TestCase.runBare(TestCase.java:140)
        at junit.framework.TestResult$1.protect(TestResult.java:106)
        at junit.framework.TestResult.runProtected(TestResult.java:124)
        at junit.framework.TestResult.run(TestResult.java:109)
        at junit.framework.TestCase.run(TestCase.java:131)
        at junit.framework.TestSuite.runTest(TestSuite.java:173)
        at junit.framework.TestSuite.run(TestSuite.java:168)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:478)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:344)
        at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:196)

     
    • Kevin Day
      Kevin Day
      2005-08-21

      More info:

      I checked out V0_13, and the unit tests definitely pass in that version...  That was a long time ago, though...

      When I made my changes to the freelogicalrowid, etc... classes, they were made against the CVS HEAD code a few weeks prior to the 1.0 release.  I absolutely ran the recman unit tests at that point, and they passed.  I'm thinking that something changed during the 1.0 release...

      Unfortunately, I don't have that configuration on my system anymore (I updated everything to the 1.0 version when it became available)...

      - K

       
    • Alex Boisvert
      Alex Boisvert
      2005-08-21

      I'm getting the same error on Linux (Fedora Core 4) with a fresh CVS checkout.

      Interesting, I just found out that jdbm.recman.TestStress isn't run when I launch jdbm.AllTests.   I need to fix that.

      alex

       
    • Kevin Day
      Kevin Day
      2005-08-21

      Do you still have a copy of the pre-1.0 release code laying around that we can do a diff against?  Something went south during that 1.0 release, and it's going to be a bugger to find going through each file one by one...

      Thanks!

      - K

       
      • Alex Boisvert
        Alex Boisvert
        2005-08-21

        You may want to compare against this CVS snapshot taken in Aug. 2003:

        http://jdbm.sourceforge.net/old/jdbm-current-2003-08-07.zip

        This is the only snapshot I have before 1.0.  If you don't find what you want in there, we would have to rummage through CVS.

        alex

         
      • Bryan Thompson
        Bryan Thompson
        2005-08-22

        I diffed my checked out version of jdbm against the 1.0 release.  There were two changes: VERSION and build.xml.  So, anything that was "changed" was at least several weeks back (when I did the checkout).

        -bryan

         
    • Kevin Day
      Kevin Day
      2005-08-21

      Well now I'm completely at a loss.  The code in the 2003-08-07 file fails the test as well.

      I can't understand how I was able to pass the unit tests with my patch, and see this fail...  Maybe I had the same problem you did with the tests not including that one.  I must have just run jdbm.recman.Test, thinking I was covered.  Now that I'm running all tests in the project, Eclipse is picking up the TestStress class.

      A couple of interesting items (though probably unrelated):

      The test itself was last modified on 3/20/2003.  At that time, it looks like whoever checked it in had a problem with CRLF tranlation (all of the sudden, the entire class has a ton of blank lines in it).  I'm half wondering if something went wrong with the test case at that point.

      I'm going to do some stepping through the test to see what might be going wrong (although my initial walk through definitely made this look like it was *not* a test problem).

      - K

       
    • Kevin Day
      Kevin Day
      2005-08-21

      More results...

      I'm going through the tests, and if I fetch the record for the bad slot (91) immediately after creating it, it works properly - so if the record is in the cache, it works properly...

      I'm going to try to disable object caching and see what happens (i.e. if we just use the page manager's cache do we still get the problem).

      - K

       
    • Kevin Day
      Kevin Day
      2005-08-21

      And more...

      If I completely disable the record manager cache (i.e. use BaseRecordManager instead of CacheRecordManager), the  test fails immediately after the initial insert (and on the very first inserted record).  The cache was hiding a problem.

      I still don't have root cause, but it appears that the record manager is returning an array *larger* than what is actually needed to store the full record.  Probably the reason that this hasn't caused tons of field failures is that most deserialization just ignores additional bytes in the record data passed in to deserialize.

      The physical row manager is literally returning an array that is longer than what was stored.  I'm still trying to figure out why.

      - K

       
    • Kevin Day
      Kevin Day
      2005-08-21

      Found the problem.  The test was calling recman.insert(obj) insted of recman.insert(obj, serializer).  This is a problem with the test, not with the record manager itself (thank goodness).

      I've now fixed the test and will try to figure out the whole SSH thing on my windows box and get the change uploaded to SF.

      I also took the opportunity to:

      Move the random number generation to the bottom of the loop so the first loop through it always an insert.

      Change the loop so it starts with 1 instead of 0 - that way the record manager isn't closed and reopened on the first loop.

      Cheers!

      - K

       
      • Alex Boisvert
        Alex Boisvert
        2005-08-22

        Cool, thanks for fixing this.

        I've just added TestStress to the jdbm.recman.Test testsuite and comitted to CVS.

        alex

         
    • Kevin Day
      Kevin Day
      2005-08-22

      No worries.

      I actually found a couple of other problems just now - I'll comit them in a second.

      Issues:

      1.  The last-minute changes I made to make the first loop be an insert didn't take into account that the test was written with 'continue' statements (I hate non-obvious gotos like this - makes it really tough to not mess things up when you are editing someone else's work).  Anyway, I've adjusted things again so the test runs to completion.

      2.  The test is still failing when the inserted record size is 0.

      I haven't figured out #2 yet...  If I can't get it, I'll comit the changes I have so you can try to figure out what's going on.  The slot that is messing up is # 551.

      OK - I've figured out that the insert immediately following the 0 size insert is blowing away the contents of the 0 size insert (i.e. it's blowing away the record header, and changing the length).

      The question is why is it deciding to overwrite the insertion point instead of locating a new one...

      I think I have the answer:  The physical row manager is using a 0 length to indicate a free block...

      So - is the test wrong?  i.e. are 0 length records not supported?  Or is jdbm wrong?  i.e. it shouldn't be using a 0 length for free?  I'd think that normally, you'd use -1 length to indicate a free row...

      Please let me know which is the case so I can start thinking about how to fix it.  I'm inclined to just say that 0 length records aren't supported at the recman level, but I don't know the full ramifications of that.

      Thanks,

      - K

       
    • Kevin Day
      Kevin Day
      2005-08-22

      Alex-

      Any reason why we shouldn't change the following in FreePhysicalRowId:

          /** Returns true if a slot is allocated */
          boolean isAllocated(int slot) {
        return get(slot).getSize() != 0;
          }

          /** Frees a slot */
          void free(int slot) {
        get(slot).setSize(0);
        setCount((short) (getCount() - 1));
          }

      to:

          /** Returns true if a slot is allocated */
          boolean isAllocated(int slot) {
        return get(slot).getSize() != -1; // -1 = block is free
          }

          /** Frees a slot */
          void free(int slot) {
        get(slot).setSize(-1); // -1 = block is free
        setCount((short) (getCount() - 1));
          }

      or should we just stick with saying that 0 length records aren't allowed in recman, and adjust the test?

      I can think of some real world examples where someone would want to be able to accurately store a 0 length byte array in the recman, so I'm hesitant to just say that it isn't allowed...

      - K

       
      • Alex Boisvert
        Alex Boisvert
        2005-08-22

        I'm fine with the change.  I think zero-length record is a valid use-case.

        alex

         
        • Kevin Day
          Kevin Day
          2005-08-22

          Alex-

          I spent about an hour trying to make jdbm use -1 as a "Free block" marker in the size field, without luck.  The problem is that jdbm relies on the default content of the block arrays being initialized to 0's.

          I changed the "EMPTY" default block, and made 7 or 8 adjustments to make things either check for -1 (Actually, I added a new Magic value for it, but whatever), and at the end of the day, I started getting file header corruption errors.

          It almost seems like jdbm is relying on the blocks to contain 0's - I would have expected that file and record headers would have been written in their entirety to the file, but that doesn't seem to be the case.

          If you want to recreate the corrupted header error, add a static initializer to the RecordFile class that initializes the cleanData array to -1:

              /** A block of clean data to wipe clean pages. */
              final static byte[] cleanData = new byte[BLOCK_SIZE];
              // static initializer for cleanData
              {
                 Arrays.fill(cleanData, (byte)-1);
              }

          Then run the unit tests.  (Don't even worry about changing the physical record manager, etc... - the tests won't even get to them).

          When I hit that last night, I threw my hands up and gave up.  The changes I submitted last night explicitly throw an IllegalArgumentException if an attempt to make a zero length record is made.

          If you can figure out why setting all default values to -1 causes a header corruption exception, then I'd be open to going back and trying this again...

          Cheers!

          - K

           
          • Alex Boisvert
            Alex Boisvert
            2005-08-22

            To be honest, I was expecting trouble when you asked about this.  The code makes assumptions about those 0's and I think JDBM needs to be smarter about page initialization. 

            I don't think it's a big deal to do the change but it's going to require some trial and error until all code sections making assumptions about 0's are identified.   Hopefully the unit tests will catch all of them.

            alex

             
            • Kevin Day
              Kevin Day
              2005-08-22

              I agree.  I got part way down that path and finally gave up in frustration.  If it's something that you want to get in to, I can send you the locations I've found so far.

              - K

               
              • Alex Boisvert
                Alex Boisvert
                2005-08-24

                I've added RecordManager.NULL_RECID to represent this special value.  Currently the value is set to 0 to avoid failures.  My intent is to find all the places where it should be used and eventually change the value to "-1".

                alex

                 
                • Kevin Day
                  Kevin Day
                  2005-08-24

                  Alex-

                  Here are the places I've found so far:

                  RecordFile#cleanData declaration, needs static initializer

                  PhysicalRowIdManager#allocNew(int, long) - available and current size are set explicitly to 0 - probably needs to be set to NULL_RECID

                  PhysicalRowIdManager#allocNew(int, long) - there is a loop: while ( hdr.getAvailableSize() != 0 && pos < RecordFile.BLOCK_SIZE ) { that is checking for available size != 0.

                  I'm actually a tad concerned about this one, as the loop symantecs may change if it is legal for getAvailableSize() to return 0 without exiting the loop.  The check for making sure pos < BLOCK_SIZE may need to be adjusted for the case where a 0 size block lands on a page boundary.  Make sure you come up with some unit tests for checking the possibilities through this loop - I think it could appear to work, then fail in some of the boundary conditions.

                  PhysicalRowIdManager#free(Location) - sets size to 0, probably needs to be NULL_RECID

                  That's what I found prior to giving up....

                  - K

                   
                  • Bryan Thompson
                    Bryan Thompson
                    2005-08-25

                    I have a question about recids as assigned by jdbm.  Since Java does not support unsigned longs, are always non-zero integers?  If true, then are we are only using 1/2 of the address space?

                    Perhaps there is some compression that could be done for recids if this is true?

                    -bryan

                     
                    • Kevin Day
                      Kevin Day
                      2005-08-25

                      Bryan-

                      That's true - I had this exact thought when I started playing with jdbm (in fact, I also had this thought when I was thinking about the fact that java allows only positive integers to be used to access arrays...)

                      After thinking about it, though, I realized that losing 1/2 the address space of a long just doesn't matter (you'll run out of disk space WAY before you hit this limit).  In fact, the long compression I implement in the multi-part-long serializer uses two bits of the 64 in the long to determine uncompressed length, so I'm only using 1/4 of the address space.

                      The lookup tables used for mapping logical to physical rows rely on a a fixed record length lookup table, so you wouldn't want to attempt any type of compression there.

                      Also, you may have noticed that the rowids returned by the record manager are not entirely sequential - recman uses a mask against the rowid to determine the page and offset of the start of a given row in the page manager - so you aren't using the entire address space on the low order bytes either.

                      Now, if you are storing recids in any of your data structures, and you are storing a lot of them, then there is a huge opportunity to compress that data - that is, in fact, why I created the multi-part long serializer with compression - I had BTrees with recid keys all over the place, and it was chewing up a huge amount of disk space.  Once I put that compression in place, my file sizes decreased dramatically.

                      I think it's probably time for me to commit that serializer - I'll see if I can tweak it to include a utility method for efficiently writing regular longs to an output stream - that way you can use the algorithm for any of your serialization, not just multi-part-long.

                      - K