On 8/13/2010 4:02 AM, Kevin Day wrote:
Interesting - like I said, I haven't done much with jdbm2, but I definitely see why the different files might be needed - that could have a big impact on performance (not positive of that, but it does seem likely).  Plus it drastically simplifies the paging logic.
 
The thought behind grabbing the db and lg files was that they may give us an idea of where the corruption originated from.  But on further thinking, it would probably be better to know where the test app was executing at the time of termination immediately before the failure was found.  The problem with this, of course, is that the failure may not be detected until several iterations after the cause (it depends on how thoroughly things get tested each iteration).  I would want to see a full iteration through the records in the map at the beginning of each test cycle.
 
All that said, it seems that trying to track this particular issue in jdbm1 may not rise to a high level of priority...  (esp if the problem doesn't occur in jdbm2).  I hate to have you do a ton of testing, only to be told that it's not going to get fixed because there's a new kid on the block...
 
 
As a random aside, would you be able to package up the db files into a single file on successful close, then extract them on open?  I know that may not work for your use-case, but thought I'd suggest it...

That might be a possibility, thanks for the idea.  I'm going to weigh my options (jdbm1 or jdbm2) and think about it for a bit.  Either I've got to make regular backups of jdbm1 to recover from corruption (probably on every startup or shutdown), or I've got to unpack/pack jdbm2 database files (on every startup/shutdown).  Both of those are rather similar.  I like that jdbm2 files are smaller overall; speed is not important for my case, since the use case is not data intensive.
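
A minimal sketch of the pack/unpack idea discussed above, assuming the database files all sit in one directory and share a common base name (as in x.dbr.0, x.idr.0, and so on).  The class DbArchiver and its methods are hypothetical and not part of jdbm or jdbm2; the sketch also assumes the database has already been closed before pack() is called and has not yet been opened when unpack() runs.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper (not part of jdbm or jdbm2). */
final class DbArchiver {

    /** Packs every regular file named baseName + ".*" in dir into archive, then deletes the originals. */
    static void pack(Path dir, String baseName, Path archive) throws IOException {
        String prefix = baseName + ".";
        List<Path> packed = new ArrayList<>();
        try (ZipOutputStream zip = new ZipOutputStream(Files.newOutputStream(archive));
             DirectoryStream<Path> files = Files.newDirectoryStream(
                     dir, p -> p.getFileName().toString().startsWith(prefix))) {
            for (Path file : files) {
                // Skip directories and the archive itself, in case its name also matches.
                if (!Files.isRegularFile(file) || Files.isSameFile(file, archive)) {
                    continue;
                }
                zip.putNextEntry(new ZipEntry(file.getFileName().toString()));
                Files.copy(file, zip);
                zip.closeEntry();
                packed.add(file);
            }
        }
        // Delete the originals only after the archive has been written and closed.
        for (Path file : packed) {
            Files.delete(file);
        }
    }

    /** Extracts every entry of archive into dir, overwriting files of the same name. */
    static void unpack(Path archive, Path dir) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Keep only the file-name part so an entry cannot escape dir.
                Path name = Paths.get(entry.getName()).getFileName();
                Files.copy(zip, dir.resolve(name), StandardCopyOption.REPLACE_EXISTING);
            }
        }
    }
}
```

On startup the application would call unpack() before opening the database, and on a clean shutdown it would close the database and then call pack().  If the application is killed, the unpacked files simply remain on disk, so the startup code would need to decide whether to prefer them over an older archive.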

Jim

 
- K
 
----------------------- Original Message -----------------------
  
From: Jim Newsham <jnewsham@referentia.com>
To: jdbm-general@lists.sourceforge.net
Cc: 
Date: Thu, 12 Aug 2010 12:59:16 -1000
Subject: Re: [Jdbm-general] Fwd: re:  jdbm2 mailing list; jdbm corruption
  

Hi Kevin,

Thanks for the reply.  I'll start by saying that I misspoke -- the issue was produced on Windows 7, not Windows XP.  See my comments inline.

The log file is actually cleared if things are shut down properly (only needed in the event of a crash) - just something to consider - theoretically, you could put the log file in a temp folder or something, but you have to make sure that if a crash occurs, the user immediately re-opens the app.

Yeah, I believe that on application shutdown, I should be able to shut down the jdbm instance and then delete the log.  That way, the user only sees a log file while the application is running, or if the application was killed or abruptly terminated.  This is satisfactory for my case.
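
A rough sketch of that shutdown order, assuming (as described above) that the log is only needed after a crash.  LogCleanup and closeThenDeleteLog are hypothetical names, and the close action is whatever the application uses to shut down its jdbm instance.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not part of jdbm). */
final class LogCleanup {

    /**
     * Runs the caller-supplied close action, then deletes the log file.
     * If closing throws, the log is left in place so recovery can still use it.
     * Returns true if a log file existed and was deleted.
     */
    static boolean closeThenDeleteLog(Runnable closeDb, Path logFile) throws IOException {
        closeDb.run();                         // must complete before the log is touched
        return Files.deleteIfExists(logFile);  // false if there was no log to delete
    }
}
```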

 
Feasibility for putting the log into the db file is not good at all - it would be a nightmare.
Ok.  Understandable, thanks for entertaining the question.

 
I haven't started playing with jdbm2 yet - if it's creating multiple files, I'd be interested if the jdbm2 lead has any comments on that.
As you know, jdbm creates x.db, x.lg.  By contrast, jdbm2 creates x.dbf.0, x.dbf.t, x.dbr.0, x.dbr.t, x.idf.0, x.idf.t, x.idr.0, x.idr.t.  The code in BaseRecordManager.reopen() hints at what these are used for:

        _physFileFree  = new RecordFile( _filename + DBF, FREE_BLOCK_SIZE );
        _physFile      = new RecordFile( _filename + DBR, DATA_BLOCK_SIZE );
        _logicFileFree = new RecordFile( _filename + IDF, FREE_BLOCK_SIZE );
        _logicFile     = new RecordFile( _filename + IDR, TRANS_BLOCK_SIZE );

 
 
For jdbm1, file corruption is something that we've seen in one very, very rare (and well defined) case.  The latest code in SVN has a fix for the issue (it was related to recoverable write failures - namely if the disk ran out of space at one point, then suddenly had enough space).  I'm pretty sure I committed a patch that fixes this; you can look at the SVN history notes to make sure.

For sure, the disk did not run out of space in my test case.  I have 86gb free.  The database file had grown to 90mb at the time the corruption occurred.

 
In your code, you are committing every single change (this shouldn't cause problems, I just wanted to point that out)...
 
But outside of that, your code does look solid. 

Yep, I wouldn't do that in a real world app. :)

 
 
Other folks may need to weigh in here (Alex??), but here are my thoughts about how to maybe go about troubleshooting this:
 
Is there any way for you to make backup copies of the db and lg files before you run each iteration, and halt the iterations as soon as you get the exception?  I'm not positive that this will help, but this would at least give the db and log files that are required to make the failure happen. 

Since I'm terminating the app from a different process, at some random time interval, the test case is indeterminate and I suspect that the issue will not be repeatable with any predictability.  I'm not sure a specific db or log file would predispose the problem to happening -- it might be related purely to the timing of killing the process.

For my test case I was exec'ing a new process once every second or so.  If I were to back up the database before each exec, that would severely slow down the test.  Recall that it took 12,500 iterations to produce the issue, and by that time the database had grown to 90mb.  Instead of taking 2 or 3 hours, it would take 2 or 3 days to repeat the same number of iterations.
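
For reference, the per-iteration backup suggested above could look roughly like the sketch below.  IterationBackup, the "iter-N" directory layout, and the parameter names are all assumptions made for illustration; only the .db and .lg extensions come from jdbm1 itself.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical test-harness helper (not part of jdbm). */
final class IterationBackup {

    /**
     * Copies baseName.db and baseName.lg (whichever exist) from dbDir into
     * backupRoot/iter-<iteration>/ and returns that backup directory.
     */
    static Path backup(Path dbDir, String baseName, Path backupRoot, int iteration) throws IOException {
        Path target = Files.createDirectories(backupRoot.resolve("iter-" + iteration));
        for (String ext : new String[] {".db", ".lg"}) {
            Path source = dbDir.resolve(baseName + ext);
            // The log may be absent after a clean shutdown, so a missing file is not an error.
            if (Files.exists(source)) {
                Files.copy(source, target.resolve(source.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
            }
        }
        return target;
    }
}
```

The cost is one full copy of the database per iteration, which is the slowdown described above.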

You should be able to rename the lg file and actually bring up the db file by itself and not get errors.  If that is indeed the case, then there is something about the log file that is resulting in the corruption, and we may be able to track that down.

I removed the log file (after backing up both files), but the same exception continues to occur when I run CrashTest2.

 
Would you be willing to run the same test on the jdbm2 codebase?  It would be interesting to know if the problem exists in both places (if it does, it would give at least a hint of where the problem is coming from).
 

Sure, I'll give that a try.

Thanks,
Jim

