From: Jeremy J C. <jj...@sy...> - 2017-04-11 19:21:53
Hi
We have an issue with one of our larger instances. The journal is 1.5 TB.
The blazegraph process has been running since January and has now stopped accepting updates.
The errors we see are:
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Could not commit index: name=.spo.CSPO
Caused by ….
Caused by: java.lang.AssertionError: Record exists for offset in cache: offset=2147483616
at com.bigdata.io.writecache.WriteCache.write(WriteCache.java:977)
The same error occurs for each of the indexes .spo.CSPO, .spo.OCSP, .spo.PCSO, .spo.POCS, .spo.SOPC and .spo.SPOC.
The full stack trace is below.
Every occurrence of the error reports the same offset, 2147483616, for the record in cache.
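As a sanity check on that offset, here is a minimal Java sketch that extracts the offsets from log lines and reports how far each one sits below 2^31. The class and method names (OffsetCheck, distinctOffsets, distanceBelowTwoToThe31) are hypothetical and not part of Blazegraph. The regular expression assumes the assertion message is formatted exactly as in the excerpt above. Whether the proximity to the signed 32-bit limit matters here is an assumption, not a diagnosis.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class OffsetCheck {
    // Assumes the message format seen in our logs; adjust if yours differs.
    private static final Pattern OFFSET =
            Pattern.compile("Record exists for offset in cache: offset=(\\d+)");

    /** Collects the distinct offsets reported in the given log lines. */
    static Set<Long> distinctOffsets(List<String> logLines) {
        Set<Long> offsets = new TreeSet<>();
        for (String line : logLines) {
            Matcher m = OFFSET.matcher(line);
            if (m.find()) {
                offsets.add(Long.parseLong(m.group(1)));
            }
        }
        return offsets;
    }

    /** Distance of an offset below 2^31, the first value a signed int cannot hold. */
    static long distanceBelowTwoToThe31(long offset) {
        return (1L << 31) - offset;
    }

    public static void main(String[] args) {
        // Two lines copied from the stack trace; in practice, read the real log file.
        List<String> sample = Arrays.asList(
            "Caused by: java.lang.AssertionError: Record exists for offset in cache: offset=2147483616",
            "at com.bigdata.io.writecache.WriteCache.write(WriteCache.java:977)");
        for (long offset : distinctOffsets(sample)) {
            System.out.println(offset + " is " + distanceBelowTwoToThe31(offset) + " below 2^31");
        }
    }
}
```

For the offset in our logs this prints "2147483616 is 32 below 2^31", i.e. the value is just under Integer.MAX_VALUE (2147483647).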
The system is still running 2.0.1, and we have not tried restarting it yet. (BLZG-2086 is not too positive about that). I have checked that:
com.bigdata.service.AbstractTransactionService.minReleaseAge=1
We see three physical address errors in the logs (one in January, one in March, and one a week ago). The January one was preceded, by a few days, by a query time-out during an update.
The journal file is on an AWS EBS volume, and has been in use for a couple of years. It was copied from one volume to another in November, when we changed our encryption strategy.
We have taken a copy of the journal to try various recovery strategies. Does anyone have suggestions?
Are there maintenance procedures that may help clean up the journal, e.g. DumpJournal dumpPages?
Thanks
Jeremy Carroll
Syapse, Inc.
Full stack trace:
Apr 09,2017 05:00:24 PDT - ERROR: 7457721498 com.bigdata.rdf.sail.webapp.BigdataRDFContext.queryService4 com.bigdata.journal.Name2Addr.handleCommit(Name2Addr.java:787): l.name: java.util.concurrent.ExecutionException: java.lang.RuntimeException: Could not commit index: name=.spo.CSPO
java.util.concurrent.ExecutionException: java.lang.RuntimeException: Could not commit index: name=.spo.CSPO
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.bigdata.journal.Name2Addr.handleCommit(Name2Addr.java:749)
at com.bigdata.journal.AbstractJournal.notifyCommitters(AbstractJournal.java:2716)
at com.bigdata.journal.AbstractJournal.access$1700(AbstractJournal.java:255)
at com.bigdata.journal.AbstractJournal$CommitState.notifyCommitters(AbstractJournal.java:3422)
at com.bigdata.journal.AbstractJournal$CommitState.access$2600(AbstractJournal.java:3298)
at com.bigdata.journal.AbstractJournal.commitNow(AbstractJournal.java:4092)
at com.bigdata.journal.AbstractJournal.commit(AbstractJournal.java:3129)
at com.bigdata.rdf.store.LocalTripleStore.commit(LocalTripleStore.java:98)
at com.bigdata.rdf.sail.BigdataSail$BigdataSailConnection.commit2(BigdataSail.java:3695)
at com.bigdata.rdf.sail.BigdataSailRepositoryConnection.commit2(BigdataSailRepositoryConnection.java:330)
at com.bigdata.rdf.sparql.ast.eval.AST2BOpUpdate.convertCommit(AST2BOpUpdate.java:375)
at com.bigdata.rdf.sparql.ast.eval.AST2BOpUpdate.convertUpdate(AST2BOpUpdate.java:321)
at com.bigdata.rdf.sparql.ast.eval.ASTEvalHelper.executeUpdate(ASTEvalHelper.java:1072)
at com.bigdata.rdf.sail.BigdataSailUpdate.execute2(BigdataSailUpdate.java:152)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$UpdateTask.doQuery(BigdataRDFContext.java:1966)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.innerCall(BigdataRDFContext.java:1568)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:1533)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:705)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: Could not commit index: name=.spo.CSPO
at com.bigdata.journal.Name2Addr$CommitIndexTask.call(Name2Addr.java:578)
at com.bigdata.journal.Name2Addr$CommitIndexTask.call(Name2Addr.java:513)
... 4 more
Caused by: java.lang.AssertionError: Record exists for offset in cache: offset=2147483616
at com.bigdata.io.writecache.WriteCache.write(WriteCache.java:977)
at com.bigdata.io.writecache.WriteCacheService.write_timed(WriteCacheService.java:2487)
at com.bigdata.io.writecache.WriteCacheService.write(WriteCacheService.java:2421)
at com.bigdata.rwstore.RWStore.alloc(RWStore.java:3020)
at com.bigdata.rwstore.PSOutputStream.save(PSOutputStream.java:359)
at com.bigdata.rwstore.RWStore.alloc(RWStore.java:2991)
at com.bigdata.journal.RWStrategy.write(RWStrategy.java:239)
at com.bigdata.journal.RWStrategy.write(RWStrategy.java:199)
at com.bigdata.journal.AbstractJournal.write(AbstractJournal.java:4313)
at com.bigdata.btree.AbstractBTree.writeNodeOrLeaf(AbstractBTree.java:3948)
at com.bigdata.btree.AbstractBTree.writeNodeRecursive(AbstractBTree.java:3720)
at com.bigdata.btree.BTree.flush(BTree.java:756)
at com.bigdata.btree.BTree._writeCheckpoint2(BTree.java:961)
at com.bigdata.btree.BTree.writeCheckpoint2(BTree.java:922)
at com.bigdata.btree.BTree.handleCommit(BTree.java:1323)
at com.bigdata.journal.Name2Addr$CommitIndexTask.call(Name2Addr.java:570)
... 5 more