From: Gary L. <gar...@en...> - 2011-03-24 18:56:35
Hi Wolfgang,

Thanks for the info and taking the time to respond. I'll try to track this down.

Am I correct that these calls are signaling the wrapper?

    if (mode == NodeProcessor.MODE_REPAIR)
        pool.signalSystemStatus(BrokerPool.SIGNAL_STARTUP);

gary

-----Original Message-----
From: Wolfgang Meier [mailto:wol...@ex...]
Sent: Thursday, March 24, 2011 2:27 PM
To: Gary Larsen
Cc: exi...@li...
Subject: Re: [Exist-open] recovery settings

> More info on this. I ran the kill test on a smaller database and the same
> behavior of element.dbx and values.dxv rebuilding was seen and the database
> was restored and functioning properly.

The recovery log only covers the core db files (dom.dbx, collections.dbx and symbols.dbx). The other indexes have to be rebuilt afterwards. For large databases, this can take a longer time. We should really think about how to improve this. Basically it would be sufficient to only recreate the indexes for those documents which were part of an uncommitted transaction.

> My guess is the problem with the
> large database was this exception in the wrapper log:
>
> INFO | jvm 2 | 2011/03/23 15:11:38 | Redo
> [================================================= ] (98 %)
>
> ERROR | wrapper | 2011/03/23 15:46:50 | Startup failed: Timed out waiting
> for signal from JVM.
> ERROR | wrapper | 2011/03/23 15:46:50 | JVM did not exit on request,
> terminated

Yes, this is probably the root issue. It's a problem I encountered in 1.4.0 and thought I had it fixed in 1.4.x. The wrapper kills the JVM because it thinks it became unresponsive, but eXist is still rebuilding the indexes. Interrupting the recovery process can kill the db completely. In 1.4.x, the database does inform the wrapper that it's still alive. For some reason, this doesn't seem to work in your case. I'll need to reproduce and fix this.

Wolfgang
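
For readers following the thread, here is a minimal sketch of the keep-alive idea discussed above: a long-running index rebuild periodically notifies a listener so that a watchdog does not conclude the JVM has hung. This is not eXist's actual code. The class `RecoveryKeepAlive`, the method `rebuildIndexes`, and the `signalEvery` parameter are hypothetical names chosen for illustration, and the sketch signals by document count rather than by elapsed time to keep it deterministic. In a real deployment under the Java Service Wrapper, the callback would presumably forward to something like `WrapperManager.signalStarting(int)`, which is left out here so the example runs without that library.

```java
import java.util.function.IntConsumer;

/** Hypothetical illustration only; not taken from the eXist code base. */
public class RecoveryKeepAlive {

    /**
     * Walks through totalDocs documents and invokes keepAlive after every
     * signalEvery documents, plus once after the last document.
     * Returns the number of keep-alive signals that were sent.
     */
    public static int rebuildIndexes(int totalDocs, int signalEvery, IntConsumer keepAlive) {
        if (signalEvery <= 0) {
            throw new IllegalArgumentException("signalEvery must be positive");
        }
        int signals = 0;
        for (int doc = 1; doc <= totalDocs; doc++) {
            // The actual reindexing of one document would happen here (omitted).
            if (doc % signalEvery == 0 || doc == totalDocs) {
                // Tell the watchdog that startup is still making progress.
                keepAlive.accept(doc);
                signals++;
            }
        }
        return signals;
    }

    public static void main(String[] args) {
        int sent = rebuildIndexes(10, 3,
                done -> System.out.println("still alive, documents reindexed: " + done));
        System.out.println("signals sent: " + sent);
    }
}
```

The point of the design is that the signal is emitted from inside the rebuild loop itself. If signals are only sent during log replay and not during the index rebuild that follows, a wrapper with a startup timeout would still see silence and kill the JVM, which matches the symptom in the wrapper log above.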