Hi Kevin,

Thanks for the reply.  I'll start by saying that I misspoke -- the issue was produced on Windows 7, not Windows XP.  See my comments inline.

The log file is actually cleared if things are shut down properly (it is only needed in the event of a crash).  Just something to consider: theoretically, you could put the log file in a temp folder or something, but then you have to make sure that if a crash occurs, the user immediately re-opens the app.

Yeah, I believe that on application shutdown, I should be able to shut down the jdbm instance and then delete the log.  That way, the user only sees a log file while the application is running, or if the application was killed or abruptly terminated.  This is satisfactory for my case.
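
A minimal sketch of that shutdown sequence is below.  It uses java.io.Closeable as a stand-in for the jdbm record manager so the example has no jdbm dependency; the class and method names (LogCleanup, closeAndDeleteLog, deleteLogOnShutdown) are hypothetical, and the only thing taken from jdbm is the x.lg naming mentioned later in this message.

```java
import java.io.Closeable;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Sketch: close the store first, then remove its log file. */
class LogCleanup {

    /**
     * Closes the store, then deletes baseName + ".lg" if it exists.
     * Returns true if a log file was actually removed.
     */
    static boolean closeAndDeleteLog(Closeable store, String baseName) throws IOException {
        // Assumption: after a clean close nothing in the log is still needed.
        store.close();
        Path log = Paths.get(baseName + ".lg");
        return Files.deleteIfExists(log);
    }

    /** Registers the cleanup for normal JVM exit; a hard kill never runs it. */
    static void deleteLogOnShutdown(Closeable store, String baseName) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
                closeAndDeleteLog(store, baseName);
            } catch (IOException e) {
                // If the delete fails, the log simply stays on disk.
                System.err.println("Could not remove log: " + e.getMessage());
            }
        }));
    }
}
```

Because shutdown hooks do not run when the process is killed, the log file would still be left behind in exactly the crash case where it is needed.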

 
Feasibility for putting the log into the db file is not good at all - it would be a nightmare.
Ok.  Understandable, thanks for entertaining the question.

 
I haven't started playing with jdbm2 yet - if it's creating multiple files, I'd be interested if the jdbm2 lead has any comments on that.
As you know, jdbm creates x.db, x.lg.  By contrast, jdbm2 creates x.dbf.0, x.dbf.t, x.dbr.0, x.dbr.t, x.idf.0, x.idf.t, x.idr.0, x.idr.t.  The code in BaseRecordManager.reopen() hints at what these are used for:

        _physFileFree = new RecordFile(_filename + DBF, FREE_BLOCK_SIZE);
        _physFile = new RecordFile(_filename + DBR, DATA_BLOCK_SIZE);
        _logicFileFree = new RecordFile(_filename + IDF, FREE_BLOCK_SIZE);
        _logicFile = new RecordFile(_filename + IDR, TRANS_BLOCK_SIZE);

 
 
For jdbm1, file corruption is something that we've seen in one very, very rare (and well defined) case.  The latest code in SVN has a fix for the issue (it was related to recoverable write failures - namely if the disk ran out of space at one point, then suddenly had enough space).  I'm pretty sure I committed a patch that fixes this; you can look at the SVN history notes to make sure.

For sure, the disk did not run out of space in my test case.  I have 86 GB free.  The database file had grown to 90 MB at the time the corruption occurred.

 
In your code, you are committing every single change (this shouldn't cause problems, I just wanted to point that out)...
 
But outside of that, your code does look solid. 

Yep, I wouldn't do that in a real world app. :)

 
 
Other folks may need to weigh in here (Alex??), but here are my thoughts about how to maybe go about troubleshooting this:
 
Is there any way for you to make backup copies of the db and lg files before you run each iteration, and halt the iterations as soon as you get the exception?  I'm not positive that this will help, but this would at least give the db and log files that are required to make the failure happen. 
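
One way the suggested per-iteration backup could look is sketched below.  It keeps only the most recent copy, overwriting the previous one, so the backup folder does not grow with the iteration count.  The class name IterationBackup, the backup() method, and the backup directory are hypothetical; only the .db and .lg extensions come from this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

/** Sketch: copy x.db and x.lg aside before each test iteration. */
class IterationBackup {

    /**
     * Copies baseName + ".db" and baseName + ".lg" into backupDir,
     * replacing any earlier copy. Returns the number of files copied.
     */
    static int backup(String baseName, Path backupDir) throws IOException {
        Files.createDirectories(backupDir);
        int copied = 0;
        for (String ext : new String[] {".db", ".lg"}) {
            Path src = Paths.get(baseName + ext);
            // The log may be absent after a clean shutdown, so skip missing files.
            if (Files.exists(src)) {
                Files.copy(src, backupDir.resolve(src.getFileName()),
                        StandardCopyOption.REPLACE_EXISTING);
                copied++;
            }
        }
        return copied;
    }
}
```

The copy has to happen while no process has the database open, otherwise the backup itself may capture a half-written state.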

Since I'm terminating the app from a different process, at some random time interval, the test case is indeterminate and I suspect that the issue will not be repeatable with any predictability.  I'm not sure a specific db or log file would predispose the problem to happening -- it might be related purely to the timing of killing the process.

For my test case I was exec'ing a new process once every second or so.  If I were to back up the database before each exec, that would severely slow down the test.  Recall that it took 12,500 iterations to produce the issue, and by that time the database had grown to 90 MB.  Instead of taking 2 or 3 hours, it would take 2 or 3 days to repeat the same number of iterations.
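
For anyone who wants to try something similar, the kill-at-a-random-moment loop can be sketched roughly as follows.  This is not the actual test driver; the class name KillHarness, its methods, and the delay bound are hypothetical, and the command to launch is left to the caller.

```java
import java.io.IOException;
import java.util.List;
import java.util.Random;
import java.util.concurrent.TimeUnit;

/** Sketch: start a child process and kill it after a random delay. */
class KillHarness {

    /** Picks a delay in [0, maxDelayMillis), or 0 if the bound is not positive. */
    static long pickDelay(Random rng, long maxDelayMillis) {
        if (maxDelayMillis <= 0) {
            return 0;
        }
        return (long) (rng.nextDouble() * maxDelayMillis);
    }

    /**
     * Runs the command, waits for a random delay, and forcibly kills the
     * child if it is still running. Returns true if it had to be killed.
     */
    static boolean runAndKill(List<String> command, long maxDelayMillis, Random rng)
            throws IOException, InterruptedException {
        Process child = new ProcessBuilder(command).inheritIO().start();
        long delay = pickDelay(rng, maxDelayMillis);
        if (child.waitFor(delay, TimeUnit.MILLISECONDS)) {
            return false; // the child exited on its own before the delay ran out
        }
        child.destroyForcibly(); // abrupt termination, no shutdown hooks in the child
        child.waitFor();         // do not start the next iteration until it is gone
        return true;
    }
}
```

Calling runAndKill in a loop with the command that launches the test program, and stopping at the first iteration where the program reports an exception on startup, gives the same kind of indeterminate timing described above.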

You should be able to rename the lg file and actually bring up the db file by itself and not get errors.  If that is indeed the case, then there is something about the log file that is resulting in the corruption, and we may be able to track that down.

I removed the log file (after backing up both files), but the same exception continues to occur when I run CrashTest2.

 
Would you be willing to run the same test on the jdbm2 codebase?  It would be interesting to know if the problem exists in both places (if it does, it would give at least a hint of where the problem is coming from).
 

Sure, I'll give that a try.

Thanks,
Jim