This list is closed, nobody may subscribe to it.
Messages per month:

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| 2010 |     | 19  | 8   | 25  | 16  | 77  | 131 | 76  | 30  | 7   | 3   |     |
| 2011 |     |     |     |     | 2   | 2   | 16  | 3   | 1   |     | 7   | 7   |
| 2012 | 10  | 1   | 8   | 6   | 1   | 3   | 1   |     | 1   |     | 8   | 2   |
| 2013 | 5   | 12  | 2   | 1   | 1   | 1   | 22  | 50  | 31  | 64  | 83  | 28  |
| 2014 | 31  | 18  | 27  | 39  | 45  | 15  | 6   | 27  | 6   | 67  | 70  | 1   |
| 2015 | 3   | 18  | 22  | 121 | 42  | 17  | 8   | 11  | 26  | 15  | 66  | 38  |
| 2016 | 14  | 59  | 28  | 44  | 21  | 12  | 9   | 11  | 4   | 2   | 1   |     |
| 2017 | 20  | 7   | 4   | 18  | 7   | 3   | 13  | 2   | 4   | 9   | 2   | 5   |
| 2018 |     |     |     | 2   |     |     |     |     |     |     |     |     |
| 2019 |     |     | 1   |     |     |     |     |     |     |     |     |     |
From: Jeremy J C. <jj...@sy...> - 2016-02-29 22:26:21
bigdata-rdf/src/java/com/bigdata/rdf/sparql/ast/optimizers/ASTDistinctTermScanOptimizer.java line 381 (in master) reads:

    final long newCard = (long) (1.0 / arity);

The comment above it says:

    newCard = oldCard * 1.0 / arity(context, sp)

I suspect the comment is correct and the code is wrong. (I have no idea what code this is: I had a case of an obviously incorrect cardinality estimate and was looking for an int/long bug, and found this instead.)

Jeremy
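For anyone following the arithmetic, the problem is easy to reproduce outside of Blazegraph. The sketch below is not the project's source; the class name, the hard-coded values, and the plain int `arity` parameter are illustrative stand-ins for the optimizer's `oldCard` and `arity(context, sp)`. It only demonstrates why the committed expression collapses the estimate while the formula in the comment scales the previous cardinality:

```java
/**
 * Standalone sketch (not Blazegraph source) of the suspected bug in
 * ASTDistinctTermScanOptimizer line 381: casting (1.0 / arity) to long
 * truncates to 0 for any arity greater than 1 and ignores the previous
 * estimate entirely, whereas the formula in the comment scales oldCard.
 */
public class CardinalityEstimateSketch {

    /** Mirrors the committed line: the previous estimate is never used. */
    static long committedEstimate(long oldCard, int arity) {
        return (long) (1.0 / arity); // 0 whenever arity > 1
    }

    /** Mirrors the comment: newCard = oldCard * 1.0 / arity. */
    static long commentedEstimate(long oldCard, int arity) {
        return (long) (oldCard * 1.0 / arity);
    }

    public static void main(String[] args) {
        final long oldCard = 1_000_000L; // hypothetical incoming estimate
        final int arity = 3;             // hypothetical arity(context, sp)
        System.out.println(committedEstimate(oldCard, arity)); // prints 0
        System.out.println(commentedEstimate(oldCard, arity)); // prints 333333
    }
}
```

An estimate that is always 0 (or 1 when the arity is 1) would be consistent with the obviously incorrect cardinality Jeremy describes, since downstream planning would then treat the distinct-term scan as nearly free.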
From: Brad B. <be...@bl...> - 2016-02-28 00:13:55
Joakim,

Is this against the 2.0.0 version? If so, you may try 2.0.1, which is now available on Maven Central.

Thanks, --Brad
From: Joakim S. <joa...@bl...> - 2016-02-28 00:06:46
I am trying to run a query combining two endpoints:

PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?wikiId ?wikidataItem
WHERE
{
  <http://dbpedia.org/resource/Elvis_Presley> owl:sameAs ?wikidataItem .
  FILTER regex(str(?wikidataItem), 'http://wikidata.org/entity/', 'i')

  SERVICE <http://nn.nn.nnn.1:9999/bigdata/sparql> {
    ?s ?p ?wikidataItem }

} limit 100

But this returns an empty result, even though ?wikidataItem contains the value <http://wikidata.org/entity/Q303>. Manually replacing ?wikidataItem with wd:Q303 makes the query work:

PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?wikiId ?wikidataItem
WHERE
{
  <http://dbpedia.org/resource/Elvis_Presley> owl:sameAs ?wikidataItem .
  FILTER regex(str(?wikidataItem), 'http://wikidata.org/entity/', 'i')

  SERVICE <http://nn.nn.nnn.1:9999/bigdata/sparql> {
    ?s ?p wd:Q303 }

} limit 100

When I build a string from the variable I get another error:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?wikiId ?wikidataItem
WHERE
{
  <http://dbpedia.org/resource/Elvis_Presley> owl:sameAs ?wikidataItem .
  FILTER regex(str(?wikidataItem), 'http://wikidata.org/entity/', 'i')

  BIND(REPLACE(str(?wikidataItem), '^.*(#|/)', "") AS ?localname)
  BIND(CONCAT("wd:", str(?localname)) AS ?wikiId)

  SERVICE <http://nn.nn.nnn.1:9999/bigdata/sparql> {
    ?s ?p ?wikiId }

} limit 100

prefix bds: SELECT ?s ?p ?wikiId WHERE { ?s ?p ?wikiId }
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: Unknown extension: Vocab(-9)
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:281)
at com.bigdata.rdf.sail.webapp.QueryServlet.doSparqlQuery(QueryServlet.java:636)
at com.bigdata.rdf.sail.webapp.QueryServlet.doPost(QueryServlet.java:263)
at com.bigdata.rdf.sail.webapp.RESTServlet.doPost(RESTServlet.java:269)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:808)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
at org.eclipse.jetty.server.Server.handle(Server.java:497)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.util.concurrent.ExecutionException: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: Unknown extension: Vocab(-9)
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:834)
at com.bigdata.rdf.sail.webapp.QueryServlet$SparqlQueryTask.call(QueryServlet.java:653)
at com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:68)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
... 1 more
Caused by: org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: Unknown extension: Vocab(-9)
at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:188)
at info.aduna.iteration.IterationWrapper.hasNext(IterationWrapper.java:68)
at org.openrdf.query.QueryResults.report(QueryResults.java:155)
at org.openrdf.repository.sail.SailTupleQuery.evaluate(SailTupleQuery.java:76)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$TupleQueryTask.doQuery(BigdataRDFContext.java:1711)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.innerCall(BigdataRDFContext.java:1568)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:1533)
at com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:705)
... 4 more
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.lang.RuntimeException: Unknown extension: Vocab(-9)
at com.bigdata.relation.accesspath.BlockingBuffer$BlockingIterator.checkFuture(BlockingBuffer.java:1523)
at com.bigdata.relation.accesspath.BlockingBuffer$BlockingIterator._hasNext(BlockingBuffer.java:1710)
at com.bigdata.relation.accesspath.BlockingBuffer$BlockingIterator.hasNext(BlockingBuffer.java:1563)
at com.bigdata.striterator.AbstractChunkedResolverator._hasNext(AbstractChunkedResolverator.java:365)
at com.bigdata.striterator.AbstractChunkedResolverator.hasNext(AbstractChunkedResolverator.java:341)
at com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:134)
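A note for readers who hit the same wall: CONCAT returns a plain string literal, so ?wikiId above is the string "wd:Q303" rather than an IRI, and prefixed-name syntax inside a string is never expanded; the SERVICE pattern therefore has nothing it can join on. The usual SPARQL 1.1 way to rebuild a resource from parts of another IRI is IRI(CONCAT(...)). The sketch below has not been run against these particular endpoints, the ?wdEntity and ?localname names are made up for the example, and it assumes the remote Blazegraph store uses the http://www.wikidata.org/entity/ form that the wd: prefix expands to:

```sparql
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX owl: <http://www.w3.org/2002/07/owl#>

SELECT ?wikidataItem ?wdEntity ?s ?p
WHERE {
  <http://dbpedia.org/resource/Elvis_Presley> owl:sameAs ?wikidataItem .
  FILTER regex(str(?wikidataItem), 'http://wikidata.org/entity/', 'i')

  # Take the local name (e.g. "Q303") from the sameAs target and rebuild a
  # real IRI in the namespace the remote store is assumed to use; IRI() is
  # what turns the concatenated string back into a resource.
  BIND(REPLACE(str(?wikidataItem), '^.*(#|/)', '') AS ?localname)
  BIND(IRI(CONCAT(str(wd:), ?localname)) AS ?wdEntity)

  SERVICE <http://nn.nn.nnn.1:9999/bigdata/sparql> {
    ?s ?p ?wdEntity
  }
}
LIMIT 100
```

Whether the bound ?wdEntity is actually shipped to the remote endpoint (rather than the pattern being evaluated unbound, as the rewritten "SELECT ?s ?p ?wikiId" in the error output suggests happened here) depends on the engine's evaluation order, so it may also help to move the outer part into a subquery or to pass the bindings into the SERVICE block explicitly.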
From: Bryan T. <br...@sy...> - 2016-02-23 10:07:09
The email alias sa...@sy... should work. Brad Bebee will be the primary point of contact.

We will take a look at the web site and see if we can replicate the problem you are encountering there.

Thanks,
Bryan
From: <fol...@bi...> - 2016-02-23 09:35:36
Greetings,

Would anybody tell me how to contact Blazegraph? The "contact us" form on the official website doesn't work. We are trying to ask some questions before ordering its enterprise license.

Thanks in advance!

Nai Yan.
From: Joakim S. <joa...@bl...> - 2016-02-22 21:41:01
Bryan,

Thanks for the reminder. I changed the loggers to:

log4j.logger.com.bigdata=WARN
log4j.logger.com.bigdata.btree=WARN
log4j.rootCategory=INFO, devDest, fileDev

and restarted the indexing. How long should I expect indexing DBpedia core to take on a machine with 8 CPUs and 61 GiB of memory?
From: Bryan T. <br...@sy...> - 2016-02-22 20:20:42
Do not have log @ INFO for blazegraph. It will kill performance. Put it at WARN.

There is a bug in the DataLoaderServlet. If you have to abort a load, make sure that you terminate the blazegraph process, since that servlet does not correctly unwind a partial commit.

Bryan
From: Joakim S. <joa...@bl...> - 2016-02-22 20:16:38
Brad,

That's right, in my log I get a steady stream of this:

(2016-02-22 20:11:11,639) INFO : StatementBuffer.java:1773: term: http://pl.dbpedia.org/resource/Melbourne_Zoo, iv: null
(2016-02-22 20:11:11,640) INFO : StatementBuffer.java:1773: term: http://pt.dbpedia.org/resource/Zoológico_de_Melbourne, iv: null
(2016-02-22 20:11:11,640) INFO : StatementBuffer.java:1773: term: http://ru.dbpedia.org/resource/Мельбурнский_зоопарк, iv: null
(2016-02-22 20:11:11,640) INFO : StatementBuffer.java:1773: term: http://uk.dbpedia.org/resource/Мельбурнський_зоопарк, iv: null
(2016-02-22 20:11:11,640) INFO : StatementBuffer.java:1773: term: http://vi.dbpedia.org/resource/Sở_thú_Melbourne, iv: null
(2016-02-22 20:11:11,640) INFO : StatementBuffer.java:1773: term: http://dbpedia.org/resource/Nova_Air, iv: null
(2016-02-22 20:11:11,640) INFO : StatementBuffer.java:1773: term: http://wikidata.org/entity/Q578032, iv: null
(2016-02-22 20:11:11,640) INFO : StatementBuffer.java:1773: term: http://wikidata.dbpedia.org/resource/Q578032, iv: null
(2016-02-22 20:11:11,640) INFO : StatementBuffer.java:1773: term: http://es.dbpedia.org/resource/Nova_Air, iv: null
(2016-02-22 20:11:11,641) INFO : StatementBuffer.java:1773: term: http://pl.dbpedia.org/resource/Nova_Air, iv: null
(2016-02-22 20:11:11,641) INFO : StatementBuffer.java:1773: term: http://dbpedia.org/resource/Milton_Work, iv: null
(2016-02-22 20:11:11,641) INFO : StatementBuffer.java:1773: term: http://wikidata.org/entity/Q578085, iv: null
(2016-02-22 20:11:11,641) INFO : StatementBuffer.java:1773: term: http://wikidata.dbpedia.org/resource/Q578085, iv: null
(2016-02-22 20:11:11,641) INFO : StatementBuffer.java:1773: term: http://fr.dbpedia.org/resource/Milton_Work, iv: null
(2016-02-22 20:11:11,641) INFO : StatementBuffer.java:1773: term: http://pl.dbpedia.org/resource/Milton_Work, iv: null
(2016-02-22 20:11:11,641) INFO : StatementBuffer.java:1773: term: http://dbpedia.org/resource/Lisa_Nandy, iv: null
(2016-02-22 20:11:11,641) INFO : StatementBuffer.java:1773: term: http://wikidata.org/entity/Q578037, iv: null
(2016-02-22 20:11:11,642) INFO : StatementBuffer.java:1773: term: http://wikidata.dbpedia.org/resource/Q578037, iv: null

Is "iv: null" bad?

I am loading 53 ttl files, 150 GB in total.

/Joakim

On Feb 22, 2016, at 12:06 PM, Brad Bebee <be...@bl...> wrote:

Joakim,

You should see log output as the statements are loaded. How much data are you loading at once?

Thanks, --Brad

On Mon, Feb 22, 2016 at 2:59 PM, Joakim Soderberg <joa...@bl...> wrote:

Thanks for the advice. Now it has been indexing for several days and I have no idea what it's doing.

On Feb 22, 2016, at 9:04 AM, Jeremy J Carroll <jj...@sy...> wrote:

Try looking on the status tab of the blazegraph UI in the browser. In the detail view of your particular task, there might be a counter showing how many triples have been updated.

(I am unsure as to which tasks support this under which versions …)

Jeremy

On Feb 17, 2016, at 12:26 PM, Brad Bebee <be...@bl...> wrote:

Joakim,

With the DataLoader, the commit is after all of the data is loaded. Once the load is complete, all of the statements will be visible.

Thanks, --Brad

On Wed, Feb 17, 2016 at 3:21 PM, Joakim Soderberg <joa...@bl...> wrote:

I am calling:

curl -X POST --data-binary @dataloader.xml --header 'Content-Type:application/xml' http:/__.__.__:9999/blazegraph/dataloader

I can see the size of the JNL-file is increasing, but when I query the number of statements in the dashboard the data doesn't show up.

select (count(*) as ?num) { ?s ?p ?o }

Do I need to flush the StatementBuffer to the backing store after the curl?

This is my config file:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <!-- RDF Format (Default is rdf/xml) -->
  <entry key="format">N-Triples</entry>
  <!-- Base URI (Optional) -->
  <entry key="baseURI"></entry>
  <!-- Default Graph URI (Optional - Required for quads mode namespace) -->
  <entry key="defaultGraph"></entry>
  <!-- Suppress all stdout messages (Optional) -->
  <entry key="quiet">false</entry>
  <!-- Show additional messages detailing the load performance. (Optional) -->
  <entry key="verbose">3</entry>
  <!-- Compute the RDF(S)+ closure. (Optional) -->
  <entry key="closure">false</entry>
  <!-- Files will be renamed to either .good or .fail as they are processed.
       The files will remain in the same directory. -->
  <entry key="durableQueues">true</entry>
  <!-- The namespace of the KB instance. Defaults to kb. -->
  <entry key="namespace">kb</entry>
  <!-- The configuration file for the database instance. It must be readable by the web application. -->
  <entry key="propertyFile">RWStore.properties</entry>
  <!-- Zero or more files or directories containing the data to be loaded.
       This should be a comma delimited list. The files must be readable by the web application. -->
  <entry key="fileOrDirs">/mydata/dbpedia2015/core/</entry>
</properties>

On Feb 16, 2016, at 8:35 AM, Joakim Soderberg <joa...@bl...> wrote:

I knew there is a DataLoader class, but I wasn't aware it was available as a service in the NanoSparql server. I will try it immediately.

Thanks
Joakim

On Feb 16, 2016, at 8:09 AM, Jeremy J Carroll <jj...@sy...> wrote:

> See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load

That looks very interesting. I read:

"Parsing, insert, and removal on the database are now decoupled from the index writes"

One behavior we have is that we have small inserts concurrent with other activity (typically but not exclusively read activity). Does the enhanced configurability in 2.0 give us options that may allow us to improve performance of these writes?

E.g. this week we have many (millions? at least hundreds of thousands) of such small writes (10 - 100 quads) and we also are trying to delete 25 million quads using about 100 delete/insert requests (which I take to be not impacted by this change). I am currently suggesting we should do one or the other at any one time, and not try to mix: but frankly I am guessing, and guessing conservatively. We have to maintain always-on read performance at the same time. Total store size is approximately 3 billion.

[Unfortunately this machine is still a 1.5.3 machine, but for future reference I am trying to have a better sense of how to organize such activity]

Jeremy

On Feb 16, 2016, at 7:55 AM, Bryan Thompson <br...@sy...> wrote:

2.0 includes support for bulk data load with a number of interesting features, including durable queue patterns, folders, etc. See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load

On Tue, Feb 16, 2016 at 10:40 AM, Jeremy J Carroll <jj...@sy...> wrote:

> On Feb 15, 2016, at 10:42 PM, Joakim Soderberg <joa...@bl...> wrote:
>
> Has anyone succeeded to load a folder of .nt files? I can load one by one:
>
> LOAD <file:///mydata/dbpedia2015/core/amsterdammuseum_links.nt> INTO GRAPH <http://dbpedia2015>
>
> But it doesn't like a folder name:
>
> LOAD <file:///mydata/dbpedia2015/core/> INTO GRAPH <http://dbpedia2015>

That is correct. If you look at the spec for LOAD, https://www.w3.org/TR/sparql11-update/#load, it takes an IRI as where you are loading from, and the concept of a folder is simply not applicable. A few schemes such as file: and ftp: may have such a notion, but the operation you are looking for is local to your machine on the client and you should probably implement it yourself.

In particular, do you want each file loaded into a different graph or the same graph? Probably best for you to make up your own mind.

I have had success loading trig files into multiple graphs, using a simple POST to the endpoint.

Jeremy
>>>>>> >>>>>> >>>>>> Jeremy >>>>>> >>>>>> >>>>>> ------------------------------------------------------------------------------ >>>>>> Site24x7 APM Insight: Get Deep Visibility into Application Performance >>>>>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >>>>>> Monitor end-to-end web transactions and take corrective actions now >>>>>> Troubleshoot faster and improve end-user experience. Signup Now! >>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 <http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140> >>>>>> _______________________________________________ >>>>>> Bigdata-developers mailing list >>>>>> Big...@li... <mailto:Big...@li...> >>>>>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers <https://lists.sourceforge.net/lists/listinfo/bigdata-developers> >>>>>> >>>>>> >>>>> >>>> >>> >>> >>> ------------------------------------------------------------------------------ >>> Site24x7 APM Insight: Get Deep Visibility into Application Performance >>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >>> Monitor end-to-end web transactions and take corrective actions now >>> Troubleshoot faster and improve end-user experience. Signup Now! >>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 <http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140> >>> _______________________________________________ >>> Bigdata-developers mailing list >>> Big...@li... <mailto:Big...@li...> >>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers <https://lists.sourceforge.net/lists/listinfo/bigdata-developers> >>> >>> >>> >>> >>> -- >>> _______________ >>> Brad Bebee >>> CEO >>> Blazegraph >>> e: be...@bl... <mailto:be...@bl...> >>> m: 202.642.7961 <tel:202.642.7961> >>> w: www.blazegraph.com <http://www.blazegraph.com/> >>> >>> Blazegraph products help to solve the Graph Cache Thrash to achieve large scale processing for graph and predictive analytics. Blazegraph is the creator of the industry’s first GPU-accelerated high-performance database for large graphs, has been named as one of the “10 Companies and Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. >>> >>> Blazegraph Database <https://www.blazegraph.com/> is our ultra-high performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. Blazegraph GPU <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive new technologies that use GPUs to enable extreme scaling that is thousands of times faster and 40 times more affordable than CPU-based solutions. >>> >>> CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. >>> >>> ------------------------------------------------------------------------------ >>> Site24x7 APM Insight: Get Deep Visibility into Application Performance >>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >>> Monitor end-to-end web transactions and take corrective actions now >>> Troubleshoot faster and improve end-user experience. Signup Now! 
>>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140_______________________________________________ <http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140_______________________________________________> >>> Bigdata-developers mailing list >>> Big...@li... <mailto:Big...@li...> >>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers <https://lists.sourceforge.net/lists/listinfo/bigdata-developers> >> > > > > > -- > _______________ > Brad Bebee > CEO > Blazegraph > e: be...@bl... <mailto:be...@bl...> > m: 202.642.7961 <tel:202.642.7961> > w: www.blazegraph.com <http://www.blazegraph.com/> > > Blazegraph products help to solve the Graph Cache Thrash to achieve large scale processing for graph and predictive analytics. Blazegraph is the creator of the industry’s first GPU-accelerated high-performance database for large graphs, has been named as one of the “10 Companies and Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. > > Blazegraph Database <https://www.blazegraph.com/> is our ultra-high performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. Blazegraph GPU <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive new technologies that use GPUs to enable extreme scaling that is thousands of times faster and 40 times more affordable than CPU-based solutions. > > CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. > |
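Since the DataLoader only commits once the whole run completes, a statement count will stay flat until the very end; the status page is the better progress indicator while a load is running. A minimal command-line sketch, assuming the server is local on the same port 9999 and /blazegraph context used above, with the kb namespace from the configuration:

# Status page: shows running tasks and queries on the server
curl -s 'http://localhost:9999/blazegraph/status'

# Committed statements (these only become visible after the DataLoader's final commit)
curl -s 'http://localhost:9999/blazegraph/namespace/kb/sparql' \
     -H 'Accept: application/sparql-results+json' \
     --data-urlencode 'query=SELECT (COUNT(*) AS ?num) WHERE { ?s ?p ?o }'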
From: Brad B. <be...@bl...> - 2016-02-22 20:06:13
|
Joakim, You should see log output as the statements are loaded. How much data are you loading at once? Thanks, --Brad On Mon, Feb 22, 2016 at 2:59 PM, Joakim Soderberg < joa...@bl...> wrote: > Thanks for the advice. Now it has been indexing for several days and I > have no idea what it’s doing. > > On Feb 22, 2016, at 9:04 AM, Jeremy J Carroll <jj...@sy...> wrote: > > Try looking on the status tab of the blazegraph UI in the browser. In the > detail view of your particular task, there might be a counter showing how > many triples have been updated. > > (I am unsure as to which tasks support this under which versions …) > > Jeremy > > > > On Feb 17, 2016, at 12:26 PM, Brad Bebee <be...@bl...> wrote: > > Joakim, > > With the DataLoader, the commit is after all of the data is loaded. Once > the load is complete, all of the statements will be visible. > > Thanks, --Brad > > On Wed, Feb 17, 2016 at 3:21 PM, Joakim Soderberg < > joa...@bl...> wrote: > >> I am calling: >> >> curl -X POST --data-binary @dataloader.xml --header >> 'Content-Type:application/xml' http:/__.__.__:9999/blazegraph/dataloader >> >> I can see the size of the JNL-file is increasing, but when I query number >> of statements in the dashboard the data doesn’t show up. >> >> select (count(*) as ?num) { ?s ?p ?o } >> >> Do I need to Flush the StatementBuffer to the backing store after the >> curl? >> >> This is my config file: >> >> <?xml version="1.0" encoding="UTF-8" standalone="no"?> >> <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd"> >> <properties> >> <!-- RDF Format (Default is rdf/xml) --> >> <entry key="format">N-Triples</entry> >> <!-- Base URI (Optional) --> >> <entry key="baseURI"></entry> >> <!-- Default Graph URI (Optional - >> Required for quads mode namespace) --> >> <entry key="defaultGraph"></entry> >> <!-- Suppress all stdout >> messages (Optional) --> >> <entry >> key="quiet">false</entry> >> <!-- Show >> additional messages detailing the load performance. (Optional) --> >> <entry >> key="verbose">3</entry> >> <!-- >> Compute the RDF(S)+ closure. (Optional) --> >> <entry key="closure">false</entry> >> <!-- Files will be renamed to either .good or .fail as >> they are processed. >> The files will remain in the same directory. --> >> <entry key="durableQueues">true</entry> >> <!-- The namespace of the KB instance. >> Defaults to kb. --> >> <entry key="namespace">kb</entry> >> <!-- The configuration file for the >> database instance. It must be readable by the web application. --> >> <entry key="propertyFile">RWStore.properties</entry> >> <!-- Zero or more files or directories containing the >> data to be loaded. >> This should be a comma delimited list. The files must >> be readable by the web application. --> >> <entry key="fileOrDirs">/mydata/dbpedia2015/core/</entry> >> </properties> >> >> >> >> On Feb 16, 2016, at 8:35 AM, Joakim Soderberg < >> joa...@bl...> wrote: >> >> I knew there is a DataLoader class, but I wasn’t aware it was available >> as a service in NanoSparql server. I will try it immediately >> >> >> Thanks >> Joakim >> >> On Feb 16, 2016, at 8:09 AM, Jeremy J Carroll <jj...@sy...> wrote: >> >> See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load >> >> >> >> That looks very interesting: >> >> I read: >> >> "Parsing, insert, and removal on the database are now decoupled from the >> index writes” >> >> One behavior we have is that we have small inserts concurrent with other >> activity (typically but not exclusively read activity). 
Does the >> enhanced configurability in 2.0 give us options that may allow us to >> improve performance of these writes. >> >> E.g. this week we have many (millions? at least hundreds of thousands) of >> such small writes (10 - 100 quads) and we also are trying to delete 25 >> million quads using about 100 delete/insert requests (that I take to be not >> impacted by this change). I am currently suggesting we should do one or the >> other at any one time, and not try to mix: but frankly I am guessing, and >> guessing conservatively. We have to maintain an always-on read >> performance at the same time. Total store size approx 3billion. >> >> [Unfortunately this machine is still a 1.5.3 machine, but for future >> reference I am trying to have better sense of how to organize such activity] >> >> Jeremy >> >> >> >> >> >> On Feb 16, 2016, at 7:55 AM, Bryan Thompson <br...@sy...> wrote: >> >> 2.0 includes support for bulk data load with a number of interesting >> features, including durable queue patterns, folders, etc. See >> https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load >> >> ---- >> Bryan Thompson >> Chief Scientist & Founder >> Blazegraph >> e: br...@bl... >> w: http://blazegraph.com >> >> Blazegraph products help to solve the Graph Cache Thrash to achieve large >> scale processing for graph and predictive analytics. Blazegraph is the >> creator of the industry’s first GPU-accelerated high-performance database >> for large graphs, has been named as one of the “10 Companies and >> Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. >> >> >> Blazegraph Database <https://www.blazegraph.com/> is our ultra-high >> performance graph database that supports both RDF/SPARQL and >> Tinkerpop/Blueprints APIs. Blazegraph GPU >> <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS >> <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive >> new technologies that use GPUs to enable extreme scaling that is thousands >> of times faster and 40 times more affordable than CPU-based solutions. >> >> CONFIDENTIALITY NOTICE: This email and its contents and attachments are >> for the sole use of the intended recipient(s) and are confidential or >> proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, >> disclosure, dissemination or copying of this email or its contents or >> attachments is prohibited. If you have received this communication in >> error, please notify the sender by reply email and permanently delete all >> copies of the email and its contents and attachments. >> >> On Tue, Feb 16, 2016 at 10:40 AM, Jeremy J Carroll <jj...@sy...> >> wrote: >> >>> >>> >>> On Feb 15, 2016, at 10:42 PM, Joakim Soderberg < >>> joa...@bl...> wrote: >>> >>> Has anyone succeeded to load a folder of .nt files? I can load one by >>> one: >>> >>> LOAD <file:///mydata/dbpedia2015/core/amsterdammuseum_links.nt> INTO >>> GRAPH <http://dbpedia2015> >>> >>> But it doesn’t like a folder name >>> LOAD <file:///mydata/dbpedia2015/core/> INTO GRAPH <http://dbpedia2015> >>> >>> >>> >>> That is correct. If you look at the spec for LOAD: >>> https://www.w3.org/TR/sparql11-update/#load >>> then it takes an IRI as where you are loading from, and the concept of >>> folder is simply not applicable. >>> A few schemes such as file: and ftp: may have such a notion, but the >>> operation you are looking for is local to your machine on the client and >>> you should probably implement it yourself. 
>>> >>> In particular, do you want each file loaded into a different graph or >>> the same graph: probably best for you to make up your own mind. >>> >>> I have had success loading trig files into multiple graphs, using a >>> simple POST to the endpoint. >>> >>> >>> Jeremy >>> >>> >>> >>> ------------------------------------------------------------------------------ >>> Site24x7 APM Insight: Get Deep Visibility into Application Performance >>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >>> Monitor end-to-end web transactions and take corrective actions now >>> Troubleshoot faster and improve end-user experience. Signup Now! >>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 >>> _______________________________________________ >>> Bigdata-developers mailing list >>> Big...@li... >>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers >>> >>> >> >> >> >> >> >> ------------------------------------------------------------------------------ >> Site24x7 APM Insight: Get Deep Visibility into Application Performance >> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >> Monitor end-to-end web transactions and take corrective actions now >> Troubleshoot faster and improve end-user experience. Signup Now! >> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 >> _______________________________________________ >> Bigdata-developers mailing list >> Big...@li... >> https://lists.sourceforge.net/lists/listinfo/bigdata-developers >> >> > > > -- > _______________ > Brad Bebee > CEO > Blazegraph > e: be...@bl... > m: 202.642.7961 > w: www.blazegraph.com > > Blazegraph products help to solve the Graph Cache Thrash to achieve large > scale processing for graph and predictive analytics. Blazegraph is the > creator of the industry’s first GPU-accelerated high-performance database > for large graphs, has been named as one of the “10 Companies and > Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. > > > Blazegraph Database <https://www.blazegraph.com/> is our ultra-high > performance graph database that supports both RDF/SPARQL and > Tinkerpop/Blueprints APIs. Blazegraph GPU > <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS > <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive new > technologies that use GPUs to enable extreme scaling that is thousands of > times faster and 40 times more affordable than CPU-based solutions. > > CONFIDENTIALITY NOTICE: This email and its contents and attachments are > for the sole use of the intended recipient(s) and are confidential or > proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, > disclosure, dissemination or copying of this email or its contents or > attachments is prohibited. If you have received this communication in > error, please notify the sender by reply email and permanently delete all > copies of the email and its contents and attachments. > > ------------------------------------------------------------------------------ > Site24x7 APM Insight: Get Deep Visibility into Application Performance > APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month > Monitor end-to-end web transactions and take corrective actions now > Troubleshoot faster and improve end-user experience. Signup Now! > > http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140_______________________________________________ > Bigdata-developers mailing list > Big...@li... 
> https://lists.sourceforge.net/lists/listinfo/bigdata-developers > > > > -- _______________ Brad Bebee CEO Blazegraph e: be...@bl... m: 202.642.7961 w: www.blazegraph.com Blazegraph products help to solve the Graph Cache Thrash to achieve large scale processing for graph and predictive analytics. Blazegraph is the creator of the industry’s first GPU-accelerated high-performance database for large graphs, has been named as one of the “10 Companies and Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. Blazegraph Database <https://www.blazegraph.com/> is our ultra-high performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. Blazegraph GPU <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive new technologies that use GPUs to enable extreme scaling that is thousands of times faster and 40 times more affordable than CPU-based solutions. CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. |
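With durableQueues enabled, as in the dataloader.xml quoted above, each input file is renamed to .good or .fail once it has been processed, so the file system itself gives a rough progress indicator for a long-running load. A small sketch against the configured fileOrDirs directory (adjust the extensions to match the inputs):

find /mydata/dbpedia2015/core -name '*.good' | wc -l    # files loaded successfully
find /mydata/dbpedia2015/core -name '*.fail' | wc -l    # files that failed to parse or load
find /mydata/dbpedia2015/core \( -name '*.nt' -o -name '*.ttl' \) | wc -l    # files still pending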
From: Joakim S. <joa...@bl...> - 2016-02-22 19:59:17
|
Thanks for the advice. Now it has been indexing for several days and I have no idea what it’s doing. > On Feb 22, 2016, at 9:04 AM, Jeremy J Carroll <jj...@sy...> wrote: > > Try looking on the status tab of the blazegraph UI in the browser. In the detail view of your particular task, there might be a counter showing how many triples have been updated. > > (I am unsure as to which tasks support this under which versions …) > > Jeremy > > > >> On Feb 17, 2016, at 12:26 PM, Brad Bebee <be...@bl... <mailto:be...@bl...>> wrote: >> >> Joakim, >> >> With the DataLoader, the commit is after all of the data is loaded. Once the load is complete, all of the statements will be visible. >> >> Thanks, --Brad >> >> On Wed, Feb 17, 2016 at 3:21 PM, Joakim Soderberg <joa...@bl... <mailto:joa...@bl...>> wrote: >> I am calling: >> >> curl -X POST --data-binary @dataloader.xml --header 'Content-Type:application/xml' http:/__.__.__:9999/blazegraph/dataloader >> >> I can see the size of the JNL-file is increasing, but when I query number of statements in the dashboard the data doesn’t show up. >> >> select (count(*) as ?num) { ?s ?p ?o } >> >> Do I need to Flush the StatementBuffer to the backing store after the curl? >> >> This is my config file: >> >> <?xml version="1.0" encoding="UTF-8" standalone="no"?> >> <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd <http://java.sun.com/dtd/properties.dtd>"> >> <properties> >> <!-- RDF Format (Default is rdf/xml) --> >> <entry key="format">N-Triples</entry> >> <!-- Base URI (Optional) --> >> <entry key="baseURI"></entry> >> <!-- Default Graph URI (Optional - Required for quads mode namespace) --> >> <entry key="defaultGraph"></entry> >> <!-- Suppress all stdout messages (Optional) --> >> <entry key="quiet">false</entry> >> <!-- Show additional messages detailing the load performance. (Optional) --> >> <entry key="verbose">3</entry> >> <!-- Compute the RDF(S)+ closure. (Optional) --> >> <entry key="closure">false</entry> >> <!-- Files will be renamed to either .good or .fail as they are processed. >> The files will remain in the same directory. --> >> <entry key="durableQueues">true</entry> >> <!-- The namespace of the KB instance. Defaults to kb. --> >> <entry key="namespace">kb</entry> >> <!-- The configuration file for the database instance. It must be readable by the web application. --> >> <entry key="propertyFile">RWStore.properties</entry> >> <!-- Zero or more files or directories containing the data to be loaded. >> This should be a comma delimited list. The files must be readable by the web application. --> >> <entry key="fileOrDirs">/mydata/dbpedia2015/core/</entry> >> </properties> >> >> >> >> >>> On Feb 16, 2016, at 8:35 AM, Joakim Soderberg <joa...@bl... <mailto:joa...@bl...>> wrote: >>> >>> I knew there is a DataLoader class, but I wasn’t aware it was available as a service in NanoSparql server. I will try it immediately >>> >>> >>> Thanks >>> Joakim >>> >>>> On Feb 16, 2016, at 8:09 AM, Jeremy J Carroll <jj...@sy... <mailto:jj...@sy...>> wrote: >>>> >>>>> See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load <https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load> >>>> >>>> >>>> That looks very interesting: >>>> >>>> I read: >>>> >>>> "Parsing, insert, and removal on the database are now decoupled from the index writes” >>>> >>>> One behavior we have is that we have small inserts concurrent with other activity (typically but not exclusively read activity). 
Does the enhanced configurability in 2.0 give us options that may allow us to improve performance of these writes. >>>> >>>> E.g. this week we have many (millions? at least hundreds of thousands) of such small writes (10 - 100 quads) and we also are trying to delete 25 million quads using about 100 delete/insert requests (that I take to be not impacted by this change). I am currently suggesting we should do one or the other at any one time, and not try to mix: but frankly I am guessing, and guessing conservatively. We have to maintain an always-on read performance at the same time. Total store size approx 3billion. >>>> >>>> [Unfortunately this machine is still a 1.5.3 machine, but for future reference I am trying to have better sense of how to organize such activity] >>>> >>>> Jeremy >>>> >>>> >>>> >>>> >>>> >>>>> On Feb 16, 2016, at 7:55 AM, Bryan Thompson <br...@sy... <mailto:br...@sy...>> wrote: >>>>> >>>>> 2.0 includes support for bulk data load with a number of interesting features, including durable queue patterns, folders, etc. See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load <https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load> >>>>> >>>>> ---- >>>>> Bryan Thompson >>>>> Chief Scientist & Founder >>>>> Blazegraph >>>>> e: br...@bl... <mailto:br...@bl...> >>>>> w: http://blazegraph.com <http://blazegraph.com/> >>>>> >>>>> Blazegraph products help to solve the Graph Cache Thrash to achieve large scale processing for graph and predictive analytics. Blazegraph is the creator of the industry’s first GPU-accelerated high-performance database for large graphs, has been named as one of the “10 Companies and Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. >>>>> >>>>> Blazegraph Database <https://www.blazegraph.com/> is our ultra-high performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. Blazegraph GPU <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive new technologies that use GPUs to enable extreme scaling that is thousands of times faster and 40 times more affordable than CPU-based solutions. >>>>> >>>>> CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. >>>>> >>>>> >>>>> On Tue, Feb 16, 2016 at 10:40 AM, Jeremy J Carroll <jj...@sy... <mailto:jj...@sy...>> wrote: >>>>> >>>>> >>>>>> On Feb 15, 2016, at 10:42 PM, Joakim Soderberg <joa...@bl... <mailto:joa...@bl...>> wrote: >>>>>> >>>>>> Has anyone succeeded to load a folder of .nt files? I can load one by one: >>>>>> >>>>>> LOAD <file:///mydata/dbpedia2015/core/amsterdammuseum_links.nt <>> INTO GRAPH <http://dbpedia2015 <http://dbpedia2015/>> >>>>>> >>>>>> But it doesn’t like a folder name >>>>>> LOAD <file:///mydata/dbpedia2015/core/ <>> INTO GRAPH <http://dbpedia2015 <http://dbpedia2015/>> >>>>> >>>>> >>>>> That is correct. 
If you look at the spec for LOAD: >>>>> https://www.w3.org/TR/sparql11-update/#load <https://www.w3.org/TR/sparql11-update/#load> >>>>> then it takes an IRI as where you are loading from, and the concept of folder is simply not applicable. >>>>> A few schemes such as file: and ftp: may have such a notion, but the operation you are looking for is local to your machine on the client and you should probably implement it yourself. >>>>> >>>>> In particular, do you want each file loaded into a different graph or the same graph: probably best for you to make up your own mind. >>>>> >>>>> I have had success loading trig files into multiple graphs, using a simple POST to the endpoint. >>>>> >>>>> >>>>> Jeremy >>>>> >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> Site24x7 APM Insight: Get Deep Visibility into Application Performance >>>>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >>>>> Monitor end-to-end web transactions and take corrective actions now >>>>> Troubleshoot faster and improve end-user experience. Signup Now! >>>>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 <http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140> >>>>> _______________________________________________ >>>>> Bigdata-developers mailing list >>>>> Big...@li... <mailto:Big...@li...> >>>>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers <https://lists.sourceforge.net/lists/listinfo/bigdata-developers> >>>>> >>>>> >>>> >>> >> >> >> ------------------------------------------------------------------------------ >> Site24x7 APM Insight: Get Deep Visibility into Application Performance >> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >> Monitor end-to-end web transactions and take corrective actions now >> Troubleshoot faster and improve end-user experience. Signup Now! >> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 <http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140> >> _______________________________________________ >> Bigdata-developers mailing list >> Big...@li... <mailto:Big...@li...> >> https://lists.sourceforge.net/lists/listinfo/bigdata-developers <https://lists.sourceforge.net/lists/listinfo/bigdata-developers> >> >> >> >> >> -- >> _______________ >> Brad Bebee >> CEO >> Blazegraph >> e: be...@bl... <mailto:be...@bl...> >> m: 202.642.7961 <tel:202.642.7961> >> w: www.blazegraph.com <http://www.blazegraph.com/> >> >> Blazegraph products help to solve the Graph Cache Thrash to achieve large scale processing for graph and predictive analytics. Blazegraph is the creator of the industry’s first GPU-accelerated high-performance database for large graphs, has been named as one of the “10 Companies and Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. >> >> Blazegraph Database <https://www.blazegraph.com/> is our ultra-high performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. Blazegraph GPU <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive new technologies that use GPUs to enable extreme scaling that is thousands of times faster and 40 times more affordable than CPU-based solutions. >> >> CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC DBA Blazegraph. 
Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. >> >> ------------------------------------------------------------------------------ >> Site24x7 APM Insight: Get Deep Visibility into Application Performance >> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >> Monitor end-to-end web transactions and take corrective actions now >> Troubleshoot faster and improve end-user experience. Signup Now! >> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140_______________________________________________ <http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140_______________________________________________> >> Bigdata-developers mailing list >> Big...@li... >> https://lists.sourceforge.net/lists/listinfo/bigdata-developers > |
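For the bulk delete described in the quoted thread (tens of millions of quads removed through a series of DELETE/INSERT requests), one way to keep each update transaction small is to bound every request with a subselect and repeat it until nothing matches. This is only a sketch, with a hypothetical marker predicate and an arbitrary batch size, posted to the SPARQL endpoint via the update parameter:

curl -s 'http://localhost:9999/blazegraph/namespace/kb/sparql' \
     --data-urlencode 'update=
       DELETE { GRAPH ?g { ?s ?p ?o } }
       WHERE {
         { SELECT ?g ?s ?p ?o
           WHERE { GRAPH ?g { ?s ?p ?o . ?s a <http://example.org/Obsolete> } }
           LIMIT 250000 }
       }'

Repeated in a loop until no quads match, this trades one very large transaction for many bounded ones.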
From: Jeremy J C. <jj...@sy...> - 2016-02-22 17:04:32
|
Try looking on the status tab of the blazegraph UI in the browser. In the detail view of your particular task, there might be a counter showing how many triples have been updated. (I am unsure as to which tasks support this under which versions …) Jeremy > On Feb 17, 2016, at 12:26 PM, Brad Bebee <be...@bl...> wrote: > > Joakim, > > With the DataLoader, the commit is after all of the data is loaded. Once the load is complete, all of the statements will be visible. > > Thanks, --Brad > > On Wed, Feb 17, 2016 at 3:21 PM, Joakim Soderberg <joa...@bl... <mailto:joa...@bl...>> wrote: > I am calling: > > curl -X POST --data-binary @dataloader.xml --header 'Content-Type:application/xml' http:/__.__.__:9999/blazegraph/dataloader > > I can see the size of the JNL-file is increasing, but when I query number of statements in the dashboard the data doesn’t show up. > > select (count(*) as ?num) { ?s ?p ?o } > > Do I need to Flush the StatementBuffer to the backing store after the curl? > > This is my config file: > > <?xml version="1.0" encoding="UTF-8" standalone="no"?> > <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd <http://java.sun.com/dtd/properties.dtd>"> > <properties> > <!-- RDF Format (Default is rdf/xml) --> > <entry key="format">N-Triples</entry> > <!-- Base URI (Optional) --> > <entry key="baseURI"></entry> > <!-- Default Graph URI (Optional - Required for quads mode namespace) --> > <entry key="defaultGraph"></entry> > <!-- Suppress all stdout messages (Optional) --> > <entry key="quiet">false</entry> > <!-- Show additional messages detailing the load performance. (Optional) --> > <entry key="verbose">3</entry> > <!-- Compute the RDF(S)+ closure. (Optional) --> > <entry key="closure">false</entry> > <!-- Files will be renamed to either .good or .fail as they are processed. > The files will remain in the same directory. --> > <entry key="durableQueues">true</entry> > <!-- The namespace of the KB instance. Defaults to kb. --> > <entry key="namespace">kb</entry> > <!-- The configuration file for the database instance. It must be readable by the web application. --> > <entry key="propertyFile">RWStore.properties</entry> > <!-- Zero or more files or directories containing the data to be loaded. > This should be a comma delimited list. The files must be readable by the web application. --> > <entry key="fileOrDirs">/mydata/dbpedia2015/core/</entry> > </properties> > > > > >> On Feb 16, 2016, at 8:35 AM, Joakim Soderberg <joa...@bl... <mailto:joa...@bl...>> wrote: >> >> I knew there is a DataLoader class, but I wasn’t aware it was available as a service in NanoSparql server. I will try it immediately >> >> >> Thanks >> Joakim >> >>> On Feb 16, 2016, at 8:09 AM, Jeremy J Carroll <jj...@sy... <mailto:jj...@sy...>> wrote: >>> >>>> See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load <https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load> >>> >>> >>> That looks very interesting: >>> >>> I read: >>> >>> "Parsing, insert, and removal on the database are now decoupled from the index writes” >>> >>> One behavior we have is that we have small inserts concurrent with other activity (typically but not exclusively read activity). Does the enhanced configurability in 2.0 give us options that may allow us to improve performance of these writes. >>> >>> E.g. this week we have many (millions? 
at least hundreds of thousands) of such small writes (10 - 100 quads) and we also are trying to delete 25 million quads using about 100 delete/insert requests (that I take to be not impacted by this change). I am currently suggesting we should do one or the other at any one time, and not try to mix: but frankly I am guessing, and guessing conservatively. We have to maintain an always-on read performance at the same time. Total store size approx 3billion. >>> >>> [Unfortunately this machine is still a 1.5.3 machine, but for future reference I am trying to have better sense of how to organize such activity] >>> >>> Jeremy >>> >>> >>> >>> >>> >>>> On Feb 16, 2016, at 7:55 AM, Bryan Thompson <br...@sy... <mailto:br...@sy...>> wrote: >>>> >>>> 2.0 includes support for bulk data load with a number of interesting features, including durable queue patterns, folders, etc. See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load <https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load> >>>> >>>> ---- >>>> Bryan Thompson >>>> Chief Scientist & Founder >>>> Blazegraph >>>> e: br...@bl... <mailto:br...@bl...> >>>> w: http://blazegraph.com <http://blazegraph.com/> >>>> >>>> Blazegraph products help to solve the Graph Cache Thrash to achieve large scale processing for graph and predictive analytics. Blazegraph is the creator of the industry’s first GPU-accelerated high-performance database for large graphs, has been named as one of the “10 Companies and Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. >>>> >>>> Blazegraph Database <https://www.blazegraph.com/> is our ultra-high performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. Blazegraph GPU <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive new technologies that use GPUs to enable extreme scaling that is thousands of times faster and 40 times more affordable than CPU-based solutions. >>>> >>>> CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. >>>> >>>> >>>> On Tue, Feb 16, 2016 at 10:40 AM, Jeremy J Carroll <jj...@sy... <mailto:jj...@sy...>> wrote: >>>> >>>> >>>>> On Feb 15, 2016, at 10:42 PM, Joakim Soderberg <joa...@bl... <mailto:joa...@bl...>> wrote: >>>>> >>>>> Has anyone succeeded to load a folder of .nt files? I can load one by one: >>>>> >>>>> LOAD <file:///mydata/dbpedia2015/core/amsterdammuseum_links.nt <>> INTO GRAPH <http://dbpedia2015 <http://dbpedia2015/>> >>>>> >>>>> But it doesn’t like a folder name >>>>> LOAD <file:///mydata/dbpedia2015/core/ <>> INTO GRAPH <http://dbpedia2015 <http://dbpedia2015/>> >>>> >>>> >>>> That is correct. If you look at the spec for LOAD: >>>> https://www.w3.org/TR/sparql11-update/#load <https://www.w3.org/TR/sparql11-update/#load> >>>> then it takes an IRI as where you are loading from, and the concept of folder is simply not applicable. 
>>>> A few schemes such as file: and ftp: may have such a notion, but the operation you are looking for is local to your machine on the client and you should probably implement it yourself. >>>> >>>> In particular, do you want each file loaded into a different graph or the same graph: probably best for you to make up your own mind. >>>> >>>> I have had success loading trig files into multiple graphs, using a simple POST to the endpoint. >>>> >>>> >>>> Jeremy >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> Site24x7 APM Insight: Get Deep Visibility into Application Performance >>>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >>>> Monitor end-to-end web transactions and take corrective actions now >>>> Troubleshoot faster and improve end-user experience. Signup Now! >>>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 <http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140> >>>> _______________________________________________ >>>> Bigdata-developers mailing list >>>> Big...@li... <mailto:Big...@li...> >>>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers <https://lists.sourceforge.net/lists/listinfo/bigdata-developers> >>>> >>>> >>> >> > > > ------------------------------------------------------------------------------ > Site24x7 APM Insight: Get Deep Visibility into Application Performance > APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month > Monitor end-to-end web transactions and take corrective actions now > Troubleshoot faster and improve end-user experience. Signup Now! > http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 <http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140> > _______________________________________________ > Bigdata-developers mailing list > Big...@li... <mailto:Big...@li...> > https://lists.sourceforge.net/lists/listinfo/bigdata-developers <https://lists.sourceforge.net/lists/listinfo/bigdata-developers> > > > > > -- > _______________ > Brad Bebee > CEO > Blazegraph > e: be...@bl... <mailto:be...@bl...> > m: 202.642.7961 <tel:202.642.7961> > w: www.blazegraph.com <http://www.blazegraph.com/> > > Blazegraph products help to solve the Graph Cache Thrash to achieve large scale processing for graph and predictive analytics. Blazegraph is the creator of the industry’s first GPU-accelerated high-performance database for large graphs, has been named as one of the “10 Companies and Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. > > Blazegraph Database <https://www.blazegraph.com/> is our ultra-high performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. Blazegraph GPU <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive new technologies that use GPUs to enable extreme scaling that is thousands of times faster and 40 times more affordable than CPU-based solutions. > > CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. 
> > ------------------------------------------------------------------------------ > Site24x7 APM Insight: Get Deep Visibility into Application Performance > APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month > Monitor end-to-end web transactions and take corrective actions now > Troubleshoot faster and improve end-user experience. Signup Now! > http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140_______________________________________________ > Bigdata-developers mailing list > Big...@li... > https://lists.sourceforge.net/lists/listinfo/bigdata-developers |
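Since LOAD takes a single IRI and iterating over a folder is left to the client, as discussed in the quoted thread, a small shell loop that POSTs each file to the endpoint is usually enough. A sketch, assuming N-Triples inputs (text/plain is the MIME type assumed here), the default endpoint, and the target graph from the earlier LOAD examples passed via the context-uri parameter:

ENDPOINT='http://localhost:9999/blazegraph/namespace/kb/sparql'
GRAPH='http://dbpedia2015'
for f in /mydata/dbpedia2015/core/*.nt; do
  echo "loading $f"
  curl -s -X POST -H 'Content-Type: text/plain' \
       --data-binary @"$f" \
       "$ENDPOINT?context-uri=$GRAPH" || echo "failed: $f"
done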
From: Brad B. <be...@bl...> - 2016-02-22 13:56:03
|
Andreas, Thank you. Upon further checking, it appears that the page no longer exists (it was working up until a few days ago). Would you mind creating a ticket at https://jira.blazegraph.com/ on this and we can look at migration options in an upcoming release? Thanks, --Brad On Mon, Feb 22, 2016 at 8:45 AM, Andreas Kahl <ka...@bs...> wrote: > Brad, Bryan, > > Thanks for your explanations. The URL > http://ir.dcs.gla.ac.uk/~bpiwowar/maven/ is unavailable in the meantime. > But with some googling I found > > http://pkgs.org/fedora-centos-rhel-opensuse-mandriva/jpackage-5.0/dsiutils-1.0.10-1.jpp5.noarch.rpm.html > , downloaded that, and got the .jar via > rpm2cpio dsiutils-1.0.10-1.jpp5.noarch.rpm | cpio -idmv > > Then, > mvn install:install-file -Dfile=./dsiutils-1.0.10.jar > -DgroupId=it.unimi.dsi -DartifactId=dsiutils -Dversion=1.0.10 > -Dpackaging=jar > installed the .jar in my local Maven Repo. Now Blazegraph compiles fine. > Just some Test runs into an error, but I can simply skip tests. > > Best Regards > Andreas > > P.S. The failed Test is in Blazegraph Utilities: > Failed tests: > TestCSVReader.test_read_test_csv:123->assertSameValues:272 Col=Salary > expected:<class java.lang.Double> but was:<class java.lang.String> > TestCSVReader.test_read_test_no_headers_csv:180->assertSameValues:272 > Col=5 expected:<class java.lang.Double> but was:<class java.lang.String> > > > > >>> Brad Bebee <be...@bl...> 22.02.16 13.51 Uhr >>> > Andreas, > > Thank you. Please see the repo (repo-for-dsiutils) URL in the dsi-utils > pom: https://github.com/blazegraph/database/blob/master/dsi-utils/pom.xml. > Occasionally, that repository is unavailable. Please let us know if you > continue to see the error. > > Thanks, --Brad > > On Mon, Feb 22, 2016 at 6:37 AM, Bryan Thompson <br...@sy...> wrote: > >> The more recent dsiutil so is, I believe, under a different license. This >> is why we do not roll forward and do not test for compatibility with newer >> releases. Instead, we maintain a fork. >> >> Brad, can you please comment on how to obtain the dsiutils dependency? >> >> Also, do you have any feedback on the web.xml configuration and >> environment variable overrides? >> >> Thanks, >> Bryan >> >> >> On Monday, February 22, 2016, Andreas Kahl <ka...@bs...> wrote: >> >>> Bryan, >>> >>> Thanks for your hint. And thanks for changing the project to Maven, too. >>> Now it was very easy to check out Blazegraph and open the project in >>> Netbeans. >>> I found that the parameters from web.xml have their defaults in >>> com.bigdata.rdf.sail.webapp.ConfigParams. So I would have simply to add a >>> custom argument --readOnly and then set >>> initParams.put(ConfigParams.READ_ONLY,true); >>> >>> Just one version in the pom.xml needed manual attention: Neither Maven >>> Central nor Apache Releases contain dsiutils 1.0.10, so I changed this to >>> 2.3.0 which is the most recent version. Probably it is advisable not to >>> blindly change that version. Do you have a recommended Maven repository to >>> add to my list? >>> >>> Best Regards >>> Andreas >>> >>> >>> >>> >>> >>> >>> Bryan Thompson <br...@sy...> 19.02.16 14.16 Uhr >>> >>> Andreas, >>> >>> Please see the options defined for: >>> >>> - StandaloneNanoSparqlServer (used by the bundled-jar). >>> - NanoSparqlServer (and it extends this class). >>> >>> I defer to Brad on the AJP port. >>> >>> Thanks, >>> Bryan >>> >>> >>> ---- >>> Bryan Thompson >>> Chief Scientist & Founder >>> Blazegraph >>> e: br...@bl... 
>>> w: http://blazegraph.com

-- _______________ Brad Bebee CEO Blazegraph e: be...@bl... m: 202.642.7961 w: www.blazegraph.com |
From: Brad B. <be...@bl...> - 2016-02-22 12:51:49
|
Andreas, Thank you. Please see the repo (repo-for-dsiutils) URL in the dsi-utils pom: https://github.com/blazegraph/database/blob/master/dsi-utils/pom.xml. Occasionally, that repository is unavailable. Please let us know if you continue to see the error. Thanks, --Brad

On Mon, Feb 22, 2016 at 6:37 AM, Bryan Thompson <br...@sy...> wrote: > The more recent dsiutils is, I believe, under a different license. This is why we do not roll forward and do not test for compatibility with newer releases. Instead, we maintain a fork. > > Brad, can you please comment on how to obtain the dsiutils dependency? > > Also, do you have any feedback on the web.xml configuration and environment variable overrides? > > Thanks, > Bryan

-- _______________ Brad Bebee CEO Blazegraph e: be...@bl...
m: 202.642.7961 w: www.blazegraph.com |
From: Bryan T. <br...@sy...> - 2016-02-22 11:38:04
|
The more recent dsiutils is, I believe, under a different license. This is why we do not roll forward and do not test for compatibility with newer releases. Instead, we maintain a fork. Brad, can you please comment on how to obtain the dsiutils dependency? Also, do you have any feedback on the web.xml configuration and environment variable overrides? Thanks, Bryan

On Monday, February 22, 2016, Andreas Kahl <ka...@bs...> wrote: > Bryan, > > Thanks for your hint. And thanks for changing the project to Maven, too. Now it was very easy to check out Blazegraph and open the project in Netbeans. > I found that the parameters from web.xml have their defaults in com.bigdata.rdf.sail.webapp.ConfigParams. So I would simply have to add a custom argument --readOnly and then set initParams.put(ConfigParams.READ_ONLY, true); > > Just one version in the pom.xml needed manual attention: neither Maven Central nor Apache Releases contain dsiutils 1.0.10, so I changed this to 2.3.0, which is the most recent version. It is probably advisable not to blindly change that version. Do you have a recommended Maven repository to add to my list? > > Best Regards > Andreas

-- ---- Bryan Thompson Chief Scientist & Founder Blazegraph e: br...@bl... w: http://blazegraph.com |
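To make the programmatic route discussed above concrete, here is a minimal sketch of starting the embedded server with web.xml-style overrides. It assumes the NanoSparqlServer.newInstance(port, propertyFile, initParams) overload and the ConfigParams.NAMESPACE and ConfigParams.QUERY_TIMEOUT constant names (only READ_ONLY is confirmed in this thread), so check the classes Bryan points to below before relying on the exact signatures:

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.eclipse.jetty.server.Server;

import com.bigdata.rdf.sail.webapp.ConfigParams;
import com.bigdata.rdf.sail.webapp.NanoSparqlServer;

public class ReadOnlyLauncher {

    public static void main(final String[] args) throws Exception {

        // Servlet init parameters that would otherwise live in web.xml.
        final Map<String, String> initParams = new LinkedHashMap<>();
        initParams.put(ConfigParams.NAMESPACE, "kb");         // assumed constant name
        initParams.put(ConfigParams.READ_ONLY, "true");       // per Andreas' note above
        initParams.put(ConfigParams.QUERY_TIMEOUT, "60000");  // assumed constant name, in ms

        // Start an embedded NanoSparqlServer on port 9999 against the given
        // property file (assumed newInstance overload; see the class itself).
        final Server server = NanoSparqlServer.newInstance(
                9999, "RWStore.properties", initParams);

        server.start();
        server.join();
    }
}
```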
From: Bryan T. <br...@sy...> - 2016-02-19 13:16:54
|
Andreas, Please see the options defined for: - StandaloneNanoSparqlServer (used by the bundled jar). - NanoSparqlServer (which StandaloneNanoSparqlServer extends). I defer to Brad on the AJP port. Thanks, Bryan ---- Bryan Thompson Chief Scientist & Founder Blazegraph e: br...@bl... w: http://blazegraph.com |
From: Andreas K. <ka...@bs...> - 2016-02-19 08:23:38
|
Hello everyone, our Blazegraph production environment runs on Tomcat. Currently I am looking into whether we could change that and use the standalone variant by calling blazegraph-bundled.jar directly, as described in the getting started section: java -server -Xmx4g -jar bigdata-bundled.jar We quite often edit settings in web.xml and restart Tomcats, especially readOnly and queryTimeout, so having to recompile a new bigdata-bundled.jar for each change would be a bit cumbersome. 1. Is there a way of specifying these parameters via command line options or via RWStore.properties? 2. Can we configure/compile standalone Blazegraph to provide an AJP port for Apache's mod_jk (instead of mod_proxy_http)? Thanks & Best Regards Andreas |
From: Stas M. <sma...@wi...> - 2016-02-18 21:04:18
|
Hi! > Probably ok. But we should document it as a public API. The backing > map is NOT thread-safe. If we want to expose this, then we might want > to have a narrower interface. Yes, would be nice. I'll add a task in Jira then. -- Stas Malyshev sma...@wi... |
From: Bryan T. <br...@sy...> - 2016-02-18 21:02:17
|
Probably ok. But we should document it as a public API. The backing map is NOT thread-safe. If we want to expose this, then we might want to have a narrower interface. Bryan ---- Bryan Thompson Chief Scientist & Founder Blazegraph e: br...@bl... w: http://blazegraph.com |
From: Stas M. <sma...@wi...> - 2016-02-18 21:00:09
|
Hi! > No. But I am open to doing this. Has to be independent of the db > connection. Perhaps an openrdf-style factory pattern? I just noticed that PrefixDeclProcessor.defaultDecls is public and mutable. Would it be OK to just add prefixes there in a context listener or some other initializer procedure? Or could that break something? It would be DB-independent since those are just strings at that point. -- Stas Malyshev sma...@wi... |
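A sketch of the initializer Stas describes, assuming defaultDecls stays a public, mutable Map<String, String> on com.bigdata.rdf.sail.sparql.PrefixDeclProcessor (the package and map type are assumptions, and this is not a documented API), and keeping in mind Bryan's caveat above that the backing map is not thread-safe, so it should be populated once at startup, before any query is parsed:

```java
import javax.servlet.ServletContextEvent;
import javax.servlet.ServletContextListener;

import com.bigdata.rdf.sail.sparql.PrefixDeclProcessor;

/**
 * Registers additional default SPARQL prefixes at webapp startup.
 * Register this class as a <listener> in web.xml; the class name is
 * made up for illustration.
 */
public class DefaultPrefixInitializer implements ServletContextListener {

    @Override
    public void contextInitialized(final ServletContextEvent sce) {
        // Populate once, before the first query is parsed; the map is not thread-safe.
        PrefixDeclProcessor.defaultDecls.put("wd", "http://www.wikidata.org/entity/");
        PrefixDeclProcessor.defaultDecls.put("wdt", "http://www.wikidata.org/prop/direct/");
        PrefixDeclProcessor.defaultDecls.put("wikibase", "http://wikiba.se/ontology#");
    }

    @Override
    public void contextDestroyed(final ServletContextEvent sce) {
        // Nothing to clean up.
    }
}
```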
From: Bryan T. <br...@sy...> - 2016-02-18 19:26:34
|
No. But I am open to doing this. It has to be independent of the db connection. Perhaps an openrdf-style factory pattern? |
From: Stas M. <sma...@wi...> - 2016-02-18 19:21:34
|
Hi! There are a number of prefixes available by default in Blazegraph, defined in the PrefixDeclProcessor class. Those are hardcoded. Is there a possibility to extend this list with custom prefixes? E.g., the Wikidata Query Service has about 15 prefixes and frequently uses at least 5-6 of them. It would be nice to allow users to use those without having to redefine them each time. But I don't see any extension points in PrefixDeclProcessor. Is there an existing way to do it? Thanks, -- Stas Malyshev sma...@wi... |
From: Brad B. <be...@bl...> - 2016-02-17 20:27:06
|
Joakim, With the DataLoader, the commit is after all of the data is loaded. Once the load is complete, all of the statements will be visible. Thanks, --Brad On Wed, Feb 17, 2016 at 3:21 PM, Joakim Soderberg < joa...@bl...> wrote: > I am calling: > > curl -X POST --data-binary @dataloader.xml --header > 'Content-Type:application/xml' http:/__.__.__:9999/blazegraph/dataloader > > I can see the size of the JNL-file is increasing, but when I query number > of statements in the dashboard the data doesn’t show up. > > select (count(*) as ?num) { ?s ?p ?o } > > Do I need to Flush the StatementBuffer to the backing store after the curl? > > This is my config file: > > <?xml version="1.0" encoding="UTF-8" standalone="no"?> > <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd"> > <properties> > <!-- RDF Format (Default is rdf/xml) --> > <entry key="format">N-Triples</entry> > <!-- Base URI (Optional) --> > <entry key="baseURI"></entry> > <!-- Default Graph URI (Optional - > Required for quads mode namespace) --> > <entry key="defaultGraph"></entry> > <!-- Suppress all stdout > messages (Optional) --> > <entry > key="quiet">false</entry> > <!-- Show > additional messages detailing the load performance. (Optional) --> > <entry > key="verbose">3</entry> > <!-- > Compute the RDF(S)+ closure. (Optional) --> > <entry key="closure">false</entry> > <!-- Files will be renamed to either .good or .fail as > they are processed. > The files will remain in the same directory. --> > <entry key="durableQueues">true</entry> > <!-- The namespace of the KB instance. > Defaults to kb. --> > <entry key="namespace">kb</entry> > <!-- The configuration file for the > database instance. It must be readable by the web application. --> > <entry key="propertyFile">RWStore.properties</entry> > <!-- Zero or more files or directories containing the > data to be loaded. > This should be a comma delimited list. The files must > be readable by the web application. --> > <entry key="fileOrDirs">/mydata/dbpedia2015/core/</entry> > </properties> > > > > On Feb 16, 2016, at 8:35 AM, Joakim Soderberg < > joa...@bl...> wrote: > > I knew there is a DataLoader class, but I wasn’t aware it was available as > a service in NanoSparql server. I will try it immediately > > > Thanks > Joakim > > On Feb 16, 2016, at 8:09 AM, Jeremy J Carroll <jj...@sy...> wrote: > > See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load > > > > That looks very interesting: > > I read: > > "Parsing, insert, and removal on the database are now decoupled from the > index writes” > > One behavior we have is that we have small inserts concurrent with other > activity (typically but not exclusively read activity). Does the > enhanced configurability in 2.0 give us options that may allow us to > improve performance of these writes. > > E.g. this week we have many (millions? at least hundreds of thousands) of > such small writes (10 - 100 quads) and we also are trying to delete 25 > million quads using about 100 delete/insert requests (that I take to be not > impacted by this change). I am currently suggesting we should do one or the > other at any one time, and not try to mix: but frankly I am guessing, and > guessing conservatively. We have to maintain an always-on read > performance at the same time. Total store size approx 3billion. 
> > [Unfortunately this machine is still a 1.5.3 machine, but for future > reference I am trying to have better sense of how to organize such activity] > > Jeremy > > > > > > On Feb 16, 2016, at 7:55 AM, Bryan Thompson <br...@sy...> wrote: > > 2.0 includes support for bulk data load with a number of interesting > features, including durable queue patterns, folders, etc. See > https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load > > ---- > Bryan Thompson > Chief Scientist & Founder > Blazegraph > e: br...@bl... > w: http://blazegraph.com > > Blazegraph products help to solve the Graph Cache Thrash to achieve large > scale processing for graph and predictive analytics. Blazegraph is the > creator of the industry’s first GPU-accelerated high-performance database > for large graphs, has been named as one of the “10 Companies and > Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. > > > Blazegraph Database <https://www.blazegraph.com/> is our ultra-high > performance graph database that supports both RDF/SPARQL and > Tinkerpop/Blueprints APIs. Blazegraph GPU > <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS > <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive new > technologies that use GPUs to enable extreme scaling that is thousands of > times faster and 40 times more affordable than CPU-based solutions. > > CONFIDENTIALITY NOTICE: This email and its contents and attachments are > for the sole use of the intended recipient(s) and are confidential or > proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, > disclosure, dissemination or copying of this email or its contents or > attachments is prohibited. If you have received this communication in > error, please notify the sender by reply email and permanently delete all > copies of the email and its contents and attachments. > > On Tue, Feb 16, 2016 at 10:40 AM, Jeremy J Carroll <jj...@sy...> wrote: > >> >> >> On Feb 15, 2016, at 10:42 PM, Joakim Soderberg < >> joa...@bl...> wrote: >> >> Has anyone succeeded to load a folder of .nt files? I can load one by one: >> >> LOAD <file:///mydata/dbpedia2015/core/amsterdammuseum_links.nt> INTO >> GRAPH <http://dbpedia2015> >> >> But it doesn’t like a folder name >> LOAD <file:///mydata/dbpedia2015/core/> INTO GRAPH <http://dbpedia2015> >> >> >> >> That is correct. If you look at the spec for LOAD: >> https://www.w3.org/TR/sparql11-update/#load >> then it takes an IRI as where you are loading from, and the concept of >> folder is simply not applicable. >> A few schemes such as file: and ftp: may have such a notion, but the >> operation you are looking for is local to your machine on the client and >> you should probably implement it yourself. >> >> In particular, do you want each file loaded into a different graph or the >> same graph: probably best for you to make up your own mind. >> >> I have had success loading trig files into multiple graphs, using a >> simple POST to the endpoint. >> >> >> Jeremy >> >> >> >> ------------------------------------------------------------------------------ >> Site24x7 APM Insight: Get Deep Visibility into Application Performance >> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >> Monitor end-to-end web transactions and take corrective actions now >> Troubleshoot faster and improve end-user experience. Signup Now! 
>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 >> _______________________________________________ >> Bigdata-developers mailing list >> Big...@li... >> https://lists.sourceforge.net/lists/listinfo/bigdata-developers >> >> > > > > > > ------------------------------------------------------------------------------ > Site24x7 APM Insight: Get Deep Visibility into Application Performance > APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month > Monitor end-to-end web transactions and take corrective actions now > Troubleshoot faster and improve end-user experience. Signup Now! > http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 > _______________________________________________ > Bigdata-developers mailing list > Big...@li... > https://lists.sourceforge.net/lists/listinfo/bigdata-developers > > -- _______________ Brad Bebee CEO Blazegraph e: be...@bl... m: 202.642.7961 w: www.blazegraph.com Blazegraph products help to solve the Graph Cache Thrash to achieve large scale processing for graph and predictive analytics. Blazegraph is the creator of the industry’s first GPU-accelerated high-performance database for large graphs, has been named as one of the “10 Companies and Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. Blazegraph Database <https://www.blazegraph.com/> is our ultra-high performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. Blazegraph GPU <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive new technologies that use GPUs to enable extreme scaling that is thousands of times faster and 40 times more affordable than CPU-based solutions. CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. |
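For anyone wanting to confirm visibility once the dataloader call returns, a small sketch using the stock openrdf SPARQLRepository client to run the same count query against the namespace endpoint (the endpoint URL and namespace below are placeholders for your deployment):

```java
import org.openrdf.query.BindingSet;
import org.openrdf.query.QueryLanguage;
import org.openrdf.query.TupleQueryResult;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.sparql.SPARQLRepository;

public class CountAfterLoad {

    public static void main(final String[] args) throws Exception {

        // Placeholder endpoint: host, port, and namespace depend on your deployment.
        final SPARQLRepository repo = new SPARQLRepository(
                "http://localhost:9999/blazegraph/namespace/kb/sparql");
        repo.initialize();

        final RepositoryConnection cxn = repo.getConnection();
        try {
            final TupleQueryResult result = cxn.prepareTupleQuery(
                    QueryLanguage.SPARQL,
                    "SELECT (COUNT(*) AS ?num) WHERE { ?s ?p ?o }").evaluate();
            try {
                final BindingSet bs = result.next();
                System.out.println("statements: " + bs.getValue("num").stringValue());
            } finally {
                result.close();
            }
        } finally {
            cxn.close();
        }
        repo.shutDown();
    }
}
```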
From: Joakim S. <joa...@bl...> - 2016-02-17 20:21:47
|
I am calling: curl -X POST --data-binary @dataloader.xml --header 'Content-Type:application/xml' http:/__.__.__:9999/blazegraph/dataloader I can see the size of the JNL-file is increasing, but when I query number of statements in the dashboard the data doesn’t show up. select (count(*) as ?num) { ?s ?p ?o } Do I need to Flush the StatementBuffer to the backing store after the curl? This is my config file: <?xml version="1.0" encoding="UTF-8" standalone="no"?> <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd"> <properties> <!-- RDF Format (Default is rdf/xml) --> <entry key="format">N-Triples</entry> <!-- Base URI (Optional) --> <entry key="baseURI"></entry> <!-- Default Graph URI (Optional - Required for quads mode namespace) --> <entry key="defaultGraph"></entry> <!-- Suppress all stdout messages (Optional) --> <entry key="quiet">false</entry> <!-- Show additional messages detailing the load performance. (Optional) --> <entry key="verbose">3</entry> <!-- Compute the RDF(S)+ closure. (Optional) --> <entry key="closure">false</entry> <!-- Files will be renamed to either .good or .fail as they are processed. The files will remain in the same directory. --> <entry key="durableQueues">true</entry> <!-- The namespace of the KB instance. Defaults to kb. --> <entry key="namespace">kb</entry> <!-- The configuration file for the database instance. It must be readable by the web application. --> <entry key="propertyFile">RWStore.properties</entry> <!-- Zero or more files or directories containing the data to be loaded. This should be a comma delimited list. The files must be readable by the web application. --> <entry key="fileOrDirs">/mydata/dbpedia2015/core/</entry> </properties> > On Feb 16, 2016, at 8:35 AM, Joakim Soderberg <joa...@bl...> wrote: > > I knew there is a DataLoader class, but I wasn’t aware it was available as a service in NanoSparql server. I will try it immediately > > > Thanks > Joakim > >> On Feb 16, 2016, at 8:09 AM, Jeremy J Carroll <jj...@sy... <mailto:jj...@sy...>> wrote: >> >>> See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load <https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load> >> >> >> That looks very interesting: >> >> I read: >> >> "Parsing, insert, and removal on the database are now decoupled from the index writes” >> >> One behavior we have is that we have small inserts concurrent with other activity (typically but not exclusively read activity). Does the enhanced configurability in 2.0 give us options that may allow us to improve performance of these writes. >> >> E.g. this week we have many (millions? at least hundreds of thousands) of such small writes (10 - 100 quads) and we also are trying to delete 25 million quads using about 100 delete/insert requests (that I take to be not impacted by this change). I am currently suggesting we should do one or the other at any one time, and not try to mix: but frankly I am guessing, and guessing conservatively. We have to maintain an always-on read performance at the same time. Total store size approx 3billion. >> >> [Unfortunately this machine is still a 1.5.3 machine, but for future reference I am trying to have better sense of how to organize such activity] >> >> Jeremy >> >> >> >> >> >>> On Feb 16, 2016, at 7:55 AM, Bryan Thompson <br...@sy... <mailto:br...@sy...>> wrote: >>> >>> 2.0 includes support for bulk data load with a number of interesting features, including durable queue patterns, folders, etc. 
See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load <https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load> >>> >>> ---- >>> Bryan Thompson >>> Chief Scientist & Founder >>> Blazegraph >>> e: br...@bl... <mailto:br...@bl...> >>> w: http://blazegraph.com <http://blazegraph.com/> >>> >>> Blazegraph products help to solve the Graph Cache Thrash to achieve large scale processing for graph and predictive analytics. Blazegraph is the creator of the industry’s first GPU-accelerated high-performance database for large graphs, has been named as one of the “10 Companies and Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>. >>> >>> Blazegraph Database <https://www.blazegraph.com/> is our ultra-high performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. Blazegraph GPU <https://www.blazegraph.com/product/gpu-accelerated/> andBlazegraph DAS <https://www.blazegraph.com/product/gpu-accelerated/>L are disruptive new technologies that use GPUs to enable extreme scaling that is thousands of times faster and 40 times more affordable than CPU-based solutions. >>> >>> CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. >>> >>> >>> On Tue, Feb 16, 2016 at 10:40 AM, Jeremy J Carroll <jj...@sy... <mailto:jj...@sy...>> wrote: >>> >>> >>>> On Feb 15, 2016, at 10:42 PM, Joakim Soderberg <joa...@bl... <mailto:joa...@bl...>> wrote: >>>> >>>> Has anyone succeeded to load a folder of .nt files? I can load one by one: >>>> >>>> LOAD <file:///mydata/dbpedia2015/core/amsterdammuseum_links.nt <>> INTO GRAPH <http://dbpedia2015 <http://dbpedia2015/>> >>>> >>>> But it doesn’t like a folder name >>>> LOAD <file:///mydata/dbpedia2015/core/ <>> INTO GRAPH <http://dbpedia2015 <http://dbpedia2015/>> >>> >>> >>> That is correct. If you look at the spec for LOAD: >>> https://www.w3.org/TR/sparql11-update/#load <https://www.w3.org/TR/sparql11-update/#load> >>> then it takes an IRI as where you are loading from, and the concept of folder is simply not applicable. >>> A few schemes such as file: and ftp: may have such a notion, but the operation you are looking for is local to your machine on the client and you should probably implement it yourself. >>> >>> In particular, do you want each file loaded into a different graph or the same graph: probably best for you to make up your own mind. >>> >>> I have had success loading trig files into multiple graphs, using a simple POST to the endpoint. >>> >>> >>> Jeremy >>> >>> >>> ------------------------------------------------------------------------------ >>> Site24x7 APM Insight: Get Deep Visibility into Application Performance >>> APM + Mobile APM + RUM: Monitor 3 App instances at just $35/Month >>> Monitor end-to-end web transactions and take corrective actions now >>> Troubleshoot faster and improve end-user experience. Signup Now! >>> http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140 <http://pubads.g.doubleclick.net/gampad/clk?id=272487151&iu=/4140> >>> _______________________________________________ >>> Bigdata-developers mailing list >>> Big...@li... 
<mailto:Big...@li...> >>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers <https://lists.sourceforge.net/lists/listinfo/bigdata-developers> >>> >>> >> > |
From: Bryan T. <br...@sy...> - 2016-02-17 16:22:28
|
Updates against different quads contexts in the same Blazegraph namespace update the same 6 backing indices, so these updates would be serialized. For improved throughput, you can batch together a number of small updates against different contexts. Bryan ---- Bryan Thompson Chief Scientist & Founder Blazegraph e: br...@bl... w: http://blazegraph.com |
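A sketch of the batching pattern Bryan suggests: group many small writes, each bound for its own named graph, into one transaction so the unisolated indices are written once per batch. It assumes a Sesame/openrdf 2.7+ RepositoryConnection (for example from an embedded BigdataSailRepository); the helper name is made up for illustration:

```java
import java.util.List;

import org.openrdf.model.Statement;
import org.openrdf.repository.RepositoryConnection;
import org.openrdf.repository.RepositoryException;

public final class BatchedContextWriter {

    private BatchedContextWriter() {}

    /**
     * Writes many small statement groups (each destined for its own named
     * graph context) in one transaction, so the backing indices are updated
     * once per batch instead of once per tiny update.
     */
    public static void writeBatch(final RepositoryConnection cxn,
            final List<Statement> batch) throws RepositoryException {
        cxn.begin();
        try {
            for (final Statement stmt : batch) {
                // With no explicit contexts, the statement's own context (named graph) is used.
                cxn.add(stmt);
            }
            cxn.commit();
        } catch (final RepositoryException ex) {
            cxn.rollback();
            throw ex;
        }
    }
}
```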
From: Jeremy J C. <jj...@sy...> - 2016-02-17 16:20:15
|
> On Feb 16, 2016, at 8:17 AM, Bryan Thompson <br...@sy...> wrote: > > > Updates against a single graph must be serialized using the unisolated connection. Perhaps reading too much into this: if in quads mode I have many small graphs, do updates against different graphs not interfere (at the Blazegraph level, rather than the disk I/O level where there is obvious contention)? Jeremy |