From: Jeremy J C. <jj...@sy...> - 2017-04-11 19:23:24
2147483616 = 7FFFFFE0

> On Apr 11, 2017, at 11:50 AM, Jeremy J Carroll <jj...@sy...> wrote:
>
> Hi
>
> We have an issue with one of our larger instances. The journal is 1.5T big.
> The blazegraph process has been running since January and has now stopped accepting updates.
> The errors we see are:
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Could not commit index: name=.spo.CSPO
>
> Caused by ….
>
> Caused by: java.lang.AssertionError: Record exists for offset in cache: offset=2147483616
>     at com.bigdata.io.writecache.WriteCache.write(WriteCache.java:977)
>
> This also occurs for the other indexes .spo.CSPO, .spo.OCSP, .spo.PCSO, .spo.POCS, .spo.SOPC and .spo.SPOC
>
> The full stack trace is below.
>
> Every occurrence of the error has the same offset, 2147483616, for the record in cache.
>
> The system is still running 2.0.1, and we have not tried restarting it yet. (BLZG-2086 is not too positive about that). I have checked that:
>
> com.bigdata.service.AbstractTransactionService.minReleaseAge=1
>
> We see in the logs three physical address errors (one in January, one in March, and one a week ago). The January one was preceded, by a few days, by a query time-out during an update.
>
> The journal file is on an AWS EBS volume, and has been in use for a couple of years. It was copied from one volume to another in November, when we changed our encryption strategy.
>
> We have taken a copy of the journal to try various strategies for recovery. Does anyone have suggestions?
>
> Are there maintenance procedures that may help clean the journal up, e.g. DumpJournal dumpPages?
>
> Thanks
>
> Jeremy Carroll
> Syapse, Inc.
>
>
> Full stack trace:
>
> Apr 09,2017 05:00:24 PDT - ERROR: 7457721498 com.bigdata.rdf.sail.webapp.BigdataRDFContext.queryService4 com.bigdata.journal.Name2Addr.handleCommit(Name2Addr.java:787): l.name: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Could not commit index: name=.spo.CSPO
> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Could not commit index: name=.spo.CSPO
>     at java.util.concurrent.FutureTask.report(FutureTask.java:122)
>     at java.util.concurrent.FutureTask.get(FutureTask.java:192)
>     at com.bigdata.journal.Name2Addr.handleCommit(Name2Addr.java:749)
>     at com.bigdata.journal.AbstractJournal.notifyCommitters(AbstractJournal.java:2716)
>     at com.bigdata.journal.AbstractJournal.access$1700(AbstractJournal.java:255)
>     at com.bigdata.journal.AbstractJournal$CommitState.notifyCommitters(AbstractJournal.java:3422)
>     at com.bigdata.journal.AbstractJournal$CommitState.access$2600(AbstractJournal.java:3298)
>     at com.bigdata.journal.AbstractJournal.commitNow(AbstractJournal.java:4092)
>     at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:3129)
>     at com.bigdata.rdf.store.LocalTripleStore.commit(LocalTripleStore.java:98)
>     at com.bigdata.rdf.sail.BigdataSail$BigdataSailConnection.commit2(BigdataSail.java:3695)
>     at com.bigdata.rdf.sail.BigdataSailRepositoryConnection.commit2(BigdataSailRepositoryConnection.java:330)
>     at com.bigdata.rdf.sparql.ast.eval.AST2BOpUpdate.convertCommit(AST2BOpUpdate.java:375)
>     at com.bigdata.rdf.sparql.ast.eval.AST2BOpUpdate.convertUpdate(AST2BOpUpdate.java:321)
>     at com.bigdata.rdf.sparql.ast.eval.ASTEvalHelper.executeUpdate(ASTEvalHelper.java:1072)
>     at com.bigdata.rdf.sail.BigdataSailUpdate.execute2(BigdataSailUpdate.java:152)
>     at com.bigdata.rdf.sail.webapp.BigdataRDFContext$UpdateTask.doQuery(BigdataRDFContext.java:1966)
>     at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.innerCall(BigdataRDFContext.java:1568)
>     at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:1533)
>     at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:705)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.RuntimeException: Could not commit index: name=.spo.CSPO
>     at com.bigdata.journal.Name2Addr$CommitIndexTask.call(Name2Addr.java:578)
>     at com.bigdata.journal.Name2Addr$CommitIndexTask.call(Name2Addr.java:513)
>     ... 4 more
> Caused by: java.lang.AssertionError: Record exists for offset in cache: offset=2147483616
>     at com.bigdata.io.writecache.WriteCache.write(WriteCache.java:977)
>     at com.bigdata.io.writecache.WriteCacheService.write_timed(WriteCacheService.java:2487)
>     at com.bigdata.io.writecache.WriteCacheService.write(WriteCacheService.java:2421)
>     at com.bigdata.rwstore.RWStore.alloc(RWStore.java:3020)
>     at com.bigdata.rwstore.PSOutputStream.save(PSOutputStream.java:359)
>     at com.bigdata.rwstore.RWStore.alloc(RWStore.java:2991)
>     at com.bigdata.journal.RWStrategy.write(RWStrategy.java:239)
>     at com.bigdata.journal.RWStrategy.write(RWStrategy.java:199)
>     at com.bigdata.journal.AbstractJournal.write(AbstractJournal.java:4313)
>     at com.bigdata.btree.AbstractBTree.writeNodeOrLeaf(AbstractBTree.java:3948)
>     at com.bigdata.btree.AbstractBTree.writeNodeRecursive(AbstractBTree.java:3720)
>     at com.bigdata.btree.BTree.flush(BTree.java:756)
>     at com.bigdata.btree.BTree._writeCheckpoint2(BTree.java:961)
>     at com.bigdata.btree.BTree.writeCheckpoint2(BTree.java:922)
>     at com.bigdata.btree.BTree.handleCommit(BTree.java:1323)
>     at com.bigdata.journal.Name2Addr$CommitIndexTask.call(Name2Addr.java:570)
>     ... 5 more
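
For anyone who wants to check the hex conversion of the offset in the assertion error, here is a minimal sketch of the arithmetic in Java. The class and method names are made up for illustration and are not part of Blazegraph. It only shows that 2147483616 is 0x7FFFFFE0, which sits 31 below Integer.MAX_VALUE; whether that proximity to the signed 32-bit limit is related to the failure is not established by this check.

```java
// Illustrative only: verifies the arithmetic on the offset quoted in the
// assertion error. Nothing here touches Blazegraph or the journal.
final class OffsetCheck {
    // Offset reported in "Record exists for offset in cache: offset=2147483616".
    static final long REPORTED_OFFSET = 2147483616L;

    // Upper-case hexadecimal form of a non-negative value.
    static String toHex(long value) {
        return Long.toHexString(value).toUpperCase();
    }

    // Distance between the value and the largest signed 32-bit integer.
    static long distanceFromIntMax(long value) {
        return Integer.MAX_VALUE - value;
    }

    public static void main(String[] args) {
        System.out.println(REPORTED_OFFSET + " = 0x" + toHex(REPORTED_OFFSET));
        System.out.println("Integer.MAX_VALUE - offset = "
                + distanceFromIntMax(REPORTED_OFFSET));
    }
}
```

Running it prints `2147483616 = 0x7FFFFFE0` and `Integer.MAX_VALUE - offset = 31`.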