From: Bryan T. <br...@sy...> - 2015-04-23 13:33:27
I am at the following with group commit + small slots but without the full text index.

totalElapsed=4502693ms, elapsed=4502570ms, parsed=43820000, tps=9732, done=false

-rw-r--r-- 1 root root 6.0G Apr 23 09:31 bigdata.jnl

There is clearly a lot of recycling going on. I am going to wait for it to finish to look into this further.

magic=e6b4c275
version=1
extent=209715200(200M), userExtent=209714512(199M), bytesAvailable=209714512(199M), nextOffset=0
rootBlock{ rootBlock=0, challisField=4, version=3, nextOffset=253403152405, localTime=1429791054868 [Thursday, April 23, 2015 8:10:54 AM EDT], firstCommitTime=1429789657461 [Thursday, April 23, 2015 7:47:37 AM EDT], lastCommitTime=1429791054859 [Thursday, April 23, 2015 8:10:54 AM EDT], commitCounter=4, commitRecordAddr={off=NATIVE:-106500,len=422}, commitRecordIndexAddr={off=NATIVE:-81940,len=220}, blockSequence=1, quorumToken=-1, metaBitsAddr=206535917615, metaStartAddr=3200, storeType=RW, uuid=8d9bce3f-db56-4a87-b3fd-c1a433e1d3d8, offsetBits=42, checksum=-1504696410, createTime=1429789657046 [Thursday, April 23, 2015 7:47:37 AM EDT], closeTime=0}
rootBlock{ rootBlock=1, challisField=3, version=3, nextOffset=231928315910, localTime=1429791050520 [Thursday, April 23, 2015 8:10:50 AM EDT], firstCommitTime=1429789657461 [Thursday, April 23, 2015 7:47:37 AM EDT], lastCommitTime=1429791050513 [Thursday, April 23, 2015 8:10:50 AM EDT], commitCounter=3, commitRecordAddr={off=NATIVE:-40968,len=422}, commitRecordIndexAddr={off=NATIVE:-81925,len=220}, blockSequence=1, quorumToken=-1, metaBitsAddr=206221344815, metaStartAddr=3200, storeType=RW, uuid=8d9bce3f-db56-4a87-b3fd-c1a433e1d3d8, offsetBits=42, checksum=-2109528144, createTime=1429789657046 [Thursday, April 23, 2015 7:47:37 AM EDT], closeTime=0}
The current root block is #0

-------------------------
RWStore Allocator Summary
-------------------------
AllocatorSize  AllocatorCount  SlotsAllocated  %SlotsAllocated  SlotsRecycled  SlotChurn  SlotsInUse  %SlotsInUse  MeanAllocation  SlotsReserved  %SlotsUnused  BytesReserved  BytesAppData  %SlotWaste  %AppData  %StoreFile  %TotalWaste  %FileWaste
64             3390            25653334        51.17            8436924        1.49       17216410    89.79        27              24299520       29.15         1555169280     566105149     63.60       14.84     28.50       60.20        18.13
128            178             1349254         2.69             106699         1.09       1242555     6.48         87              1270272        2.18          162594816      107803126     33.70       2.83      2.98        3.33         1.00
192            19              229968          0.46             105038         1.84       124930      0.65         153             134144         6.87          25755648       18951542      26.42       0.50      0.47        0.41         0.12
320            5               229984          0.46             203087         8.55       26897       0.14         253             35840          24.95         11468800       7145051       37.70       0.19      0.21        0.26         0.08
512            2               296128          0.59             289066         41.93      7062        0.04         415             7424           4.88          3801088        3949092       -3.89       0.10      0.07        -0.01        0.00
768            2               369042          0.74             365413         101.69     3629        0.02         639             7424           51.12         5701632        3754158       34.16       0.10      0.10        0.12         0.04
1024           2               348064          0.69             345243         123.38     2821        0.01         895             7424           62.00         7602176        3907272       48.60       0.10      0.14        0.22         0.07
2048           4               1307596         2.61             1280762        48.73      26834       0.14         1525            28672          6.41          58720256       41087168      30.03       1.08      1.08        1.07         0.32
3072           2               1175674         2.34             1162053        86.31      13621       0.07         2558            14336          4.99          44040192       42018252      4.59        1.10      0.81        0.12         0.04
4096           26              1758846         3.51             1581120        9.90       177726      0.93         3572            186368         4.64          763363328      621452046     18.59       16.30     13.99       8.64         2.60
8192           48              17418250        34.74            17086197       52.46      332053      1.73         7274            344064         3.49          2818572288     2397567451    14.94       62.87     51.65       25.62        7.72

-------------------------
BLOBS
-------------------------
Bucket(K)  Allocations  Allocated    Deletes  Deleted      Current  Data        Mean   Churn
16         7529975      87846235428  7432952  86784750383  97023    1061485045  11666  77.61
32         890621       17213523980  885724   17117153133  4897     96370847    19327  181.87
64         15272        560980091    15190    557968835    82       3011256     36732  186.24
128        0            0            0        0            0        0           0      0.00
256        0            0            0        0            0        0           0      0.00
512        0            0            0        0            0        0           0      0.00
1024       0            0            0        0            0        0           0      0.00
2048       0            0            0        0            0        0           0      0.00
4096       0            0            0        0            0        0           0      0.00
8192       0            0            0        0            0        0           0      0.00
16384      0            0            0        0            0        0           0      0.00
32768      0            0            0        0            0        0           0      0.00
65536      0            0            0        0            0        0           0      0.00
2097151    0            0            0        0            0        0           0      0.00

----
Bryan Thompson
Chief Scientist & Founder
SYSTAP, LLC
4501 Tower Road
Greensboro, NC 27410
br...@sy...
http://blazegraph.com
http://blog.bigdata.com <http://bigdata.com>
http://mapgraph.io

Blazegraph™ <http://www.blazegraph.com/> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics.

CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments.

On Thu, Apr 23, 2015 at 9:30 AM, Andreas Kahl <ka...@bs...> wrote:

> Now I am 25 minutes into the new load with groupCommit enabled and
> com.bigdata.rwstore.RWStore.smallSlotType=1024 commented out.
> Currently 24,870,000 triples are parsed and the journal is at 3.6GB. It
> looks like disabling smallSlotOptimization also resolves the problem
> (otherwise I would have more than twice the space used at that time).
>
> So, I would conclude, it's the combination of groupCommit and
> smallSlotOptimization.
>
> All tests were run on Blazegraph 1.5.1 from Git revision f4c63e5.
>
> Best Regards
> Andreas
>
> >>> "Andreas Kahl" <ka...@bs...> 23.04.2015 14:54 >>>
> Bryan,
>
> in the meantime, I could successfully load the file into an 18GB journal
> after disabling groupCommit (I simply commented out the line in
> RWStore.properties).
> I can try again with groupCommit enabled, but smallSlotOptimization
> disabled.
>
> Best Regards
> Andreas
>
> >>> Bryan Thompson <br...@sy...> 23.04.2015 13:24 >>>
> Andreas,
>
> I was not able to replicate your result. Unfortunately I navigated away
> from the browser page in which I had submitted the request, so it loaded
> all the data but failed to commit. However, the resulting file is only
> 16GB.
>
> I will redo this run and verify that the journal after the commit has this
> same size on the disk.
>
> I was only assuming that this was related to group commit because of your
> original message. Perhaps I misinterpreted your message. This is simply
> about 1.5.1 (with group commit) vs 1.4.0.
>
> Perhaps the issue is related to the small slot optimization? Maybe in
> combination with group commit?
>
> *> com.bigdata.rwstore.RWStore.smallSlotType=1024*
>
> I could not replicate your properties exactly because you are using a
> non-standard vocabulary class. Therefore I simply deleted the default
> namespace (in quads mode) and recreated it with the defaults in triples
> mode. The small slot optimization and other parameters were not enabled in
> my run.
>
> Perhaps you could try to replicate my experience and I will enable the
> small slots optimization?
>
> Thanks,
> Bryan
>
> On Thu, Apr 23, 2015 at 1:51 AM, Andreas Kahl <ka...@bs...> wrote:
>
> > Bryan & Martyn,
> >
> > Thank you very much for investigating the issue. I assume from the ticket
> > that the error will vanish if I disable groupCommit. I will do so for the
> > time being.
> >
> > Although there is already extensive information in Bryan's ticket, please
> > find attached my logs and DumpJournal outputs:
> > - dumpJournal.html contains a dump from the 67GB journal after Blazegraph
> >   ran into "No space left on device"
> > - dumpJournalWithTraceEnabled.html is the same dump for a running query
> >   when the journal was at about 14GB
> > - queryStatus.html is just the status page showing my query
> > - catalina.out.gz contains the trace outputs from starting Tomcat until I
> >   killed the curl running the SPARQL Update by Ctrl-C
> > - loadGnd.log.gz is Blazegraph's output when loading the data
> >
> > Best Regards
> > Andreas
> >
> > >>> Bryan Thompson <br...@sy...> 22.04.15 20:56 >>>
> > See http://trac.bigdata.com/ticket/1206. This is still in the
> > investigation stage.
> >
> > Thanks,
> > Bryan
> >
> > On Wed, Apr 22, 2015 at 5:37 AM, Andreas Kahl <ka...@bs...> wrote:
> >
> > > Hello everyone,
> > >
> > > I recently updated to the current revision (f4c63e5) of Blazegraph from
> > > Git and tried to load a dataset into the updated webapp. With Bigdata 1.4.0
> > > this resulted in a journal of ~18GB. Now the process was cancelled because
> > > the disk was full - the journal was beyond 50GB for the same file with the
> > > same settings.
> > > The only exception was that I activated GroupCommit.
> > >
> > > The dataset can be downloaded here:
> > > http://datendienst.dnb.de/cgi-bin/mabit.pl?cmd=fetch&userID=opendata&pass=opendata&mabheft=GND.rdf.gz
> > > Please find the settings used to load the file below.
> > >
> > > Do I have a misconfiguration, or is there a bug eating all the disk space?
> > >
> > > Best regards
> > > Andreas
> > >
> > > Namespace-Properties:
> > > curl -H "Accept: text/plain" http://localhost:8080/bigdata/namespace/gnd/properties
> > > #Wed Apr 22 11:35:31 CEST 2015
> > > com.bigdata.namespace.kb.spo.com.bigdata.btree.BTree.branchingFactor=700
> > > com.bigdata.relation.container=gnd
> > > com.bigdata.rwstore.RWStore.smallSlotType=1024
> > > com.bigdata.journal.AbstractJournal.bufferMode=DiskRW
> > > com.bigdata.journal.AbstractJournal.file=/var/lib/bigdata/bigdata.jnl
> > > com.bigdata.rdf.store.AbstractTripleStore.vocabularyClass=de.bsb_muenchen.bigdata.vocab.B3KatVocabulary
> > > com.bigdata.journal.AbstractJournal.initialExtent=209715200
> > > com.bigdata.rdf.store.AbstractTripleStore.textIndex=true
> > > com.bigdata.btree.BTree.branchingFactor=700
> > > com.bigdata.rdf.store.AbstractTripleStore.axiomsClass=com.bigdata.rdf.axioms.NoAxioms
> > > com.bigdata.rdf.sail.isolatableIndices=false
> > > com.bigdata.service.AbstractTransactionService.minReleaseAge=1
> > > com.bigdata.rdf.sail.bufferCapacity=2000
> > > com.bigdata.rdf.sail.truthMaintenance=false
> > > com.bigdata.rdf.sail.namespace=gnd
> > > com.bigdata.relation.class=com.bigdata.rdf.store.LocalTripleStore
> > > com.bigdata.rdf.store.AbstractTripleStore.quads=false
> > > com.bigdata.journal.AbstractJournal.writeCacheBufferCount=500
> > > com.bigdata.search.FullTextIndex.fieldsEnabled=false
> > > com.bigdata.relation.namespace=gndity=10000
> > > com.bigdata.rdf.sail.BigdataSail.bufferCapacity=2000
> > > com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers=false
> > >
> > > _______________________________________________
> > > Bigdata-developers mailing list
> > > Big...@li...
> > > https://lists.sourceforge.net/lists/listinfo/bigdata-developers
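
For anyone reading the RWStore Allocator Summary at the top of this thread, the derived columns can be cross-checked from the raw counters. The sketch below is in Python and is not taken from the RWStore source: the formulas are assumptions inferred from the posted numbers (SlotsInUse as allocated minus recycled, BytesReserved as SlotsReserved times AllocatorSize, and so on), and the names `AllocatorRow`, `derive` and `ROWS` are hypothetical. Only three rows of the dump are copied in.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocatorRow:
    """Raw counters for one allocator size, copied from the posted summary."""
    size: int             # AllocatorSize (bytes per slot)
    slots_allocated: int  # SlotsAllocated
    slots_recycled: int   # SlotsRecycled
    slots_reserved: int   # SlotsReserved
    bytes_app_data: int   # BytesAppData


def derive(row: AllocatorRow) -> dict:
    """Recompute derived columns.

    Assumption: these formulas are inferred from the numbers in the dump,
    not read from the RWStore implementation.
    """
    in_use = row.slots_allocated - row.slots_recycled
    bytes_reserved = row.slots_reserved * row.size
    return {
        "SlotsInUse": in_use,
        "BytesReserved": bytes_reserved,
        "SlotChurn": round(row.slots_allocated / in_use, 2),
        "%SlotsUnused": round(
            100 * (row.slots_reserved - in_use) / row.slots_reserved, 2
        ),
        "%SlotWaste": round(
            100 * (bytes_reserved - row.bytes_app_data) / bytes_reserved, 2
        ),
    }


# Three rows from the summary: the 64-byte, 512-byte and 8192-byte allocators.
ROWS = [
    AllocatorRow(64, 25653334, 8436924, 24299520, 566105149),
    AllocatorRow(512, 296128, 289066, 7424, 3949092),
    AllocatorRow(8192, 17418250, 17086197, 344064, 2397567451),
]

if __name__ == "__main__":
    for row in ROWS:
        print(row.size, derive(row))
```

On these three rows the recomputed values agree with the dump to two decimals. Under the inferred formula, the 64-byte allocators reserve 1555169280 bytes for 566105149 bytes of application data, which is the 63.60 shown under %SlotWaste, and the negative %SlotWaste on the 512-byte row simply reflects BytesAppData being larger than BytesReserved in the dump.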