This list is closed; nobody may subscribe to it.
Messages per month:

| Year | Jan | Feb | Mar | Apr | May | Jun | Jul | Aug | Sep | Oct | Nov | Dec |
|------|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|-----|
| 2010 |     | 19  | 8   | 25  | 16  | 77  | 131 | 76  | 30  | 7   | 3   |     |
| 2011 |     |     |     |     | 2   | 2   | 16  | 3   | 1   |     | 7   | 7   |
| 2012 | 10  | 1   | 8   | 6   | 1   | 3   | 1   |     | 1   |     | 8   | 2   |
| 2013 | 5   | 12  | 2   | 1   | 1   | 1   | 22  | 50  | 31  | 64  | 83  | 28  |
| 2014 | 31  | 18  | 27  | 39  | 45  | 15  | 6   | 27  | 6   | 67  | 70  | 1   |
| 2015 | 3   | 18  | 22  | 121 | 42  | 17  | 8   | 11  | 26  | 15  | 66  | 38  |
| 2016 | 14  | 59  | 28  | 44  | 21  | 12  | 9   | 11  | 4   | 2   | 1   |     |
| 2017 | 20  | 7   | 4   | 18  | 7   | 3   | 13  | 2   | 4   | 9   | 2   | 5   |
| 2018 |     |     |     | 2   |     |     |     |     |     |     |     |     |
| 2019 |     |     | 1   |     |     |     |     |     |     |     |     |     |
From: Douglas F. <dr...@gm...> - 2016-05-26 17:45:58

Hello, I am sure this is a simple question, but I can't seem to launch a previously created Blazegraph instance with its namespaces in "read-only" mode. Is there a simple way to add a -D option or something to `java -server -Xmx4g -jar blazegraph.jar` so that the existing namespaces are read-only on the SPARQL endpoint, letting me expose the server to the net without fear of someone doing an update or delete on the triples?

Thanks,
Doug
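No answer appears in the archive. One possible approach — a sketch, not a confirmed recipe — is to flip the NanoSparqlServer's `readOnly` context parameter through an override web.xml, using the same `-Djetty.overrideWebXml` mechanism discussed later in this thread. The parameter name is an assumption based on the servlet's configuration options and should be checked against the NanoSparqlServer wiki page:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- override.xml: hypothetical sketch; assumes the NanoSparqlServer
     honors a "readOnly" context-param that rejects mutation requests. -->
<web-app xmlns="http://java.sun.com/xml/ns/javaee" version="3.1">
  <context-param>
    <param-name>readOnly</param-name>
    <param-value>true</param-value>
  </context-param>
</web-app>
```

launched with `java -server -Xmx4g -Djetty.overrideWebXml=override.xml -jar blazegraph.jar`.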
From: Bob D. <bo...@sn...> - 2016-05-08 14:01:41

Thanks Bryan, that worked! Watch my blog at snee.com/bobdc.blog for a writeup within a few weeks.

Bob

(Lately, when people Reply To email sent from bo...@sn..., it goes to my old work address of bdu...@to... and I never see it. I think I've fixed this, but if you do a Reply To, double-check the To field in your reply to make sure that it says bo...@sn....)
From: Bryan T. <br...@bl...> - 2016-05-07 17:10:18

Bob,

You need to create a namespace in which inference is enabled. You can do this using the workbench, or you can do this using the REST API. See https://wiki.blazegraph.com/wiki/index.php/InferenceAndTruthMaintenance. Especially look at https://wiki.blazegraph.com/wiki/index.php/InferenceAndTruthMaintenance#Triples_Modes (properties required) and https://wiki.blazegraph.com/wiki/index.php/Quick_Start#Create_Namespace (workbench). See https://wiki.blazegraph.com/wiki/index.php/REST_API#Triples_.2B_Inference_.2B_Truth_Maintenance for doing this with the REST API.

Thanks,
Bryan
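As a concrete illustration of the REST route Bryan mentions — a sketch assuming a stock server at localhost:9999 and a new namespace named `inf`; the property names are taken from the wiki pages linked above and should be verified there — one might POST a properties file to the multi-tenancy endpoint:

```properties
# inf.properties -- assumed property names; verify against the wiki
com.bigdata.rdf.sail.namespace=inf
com.bigdata.rdf.store.AbstractTripleStore.quads=false
com.bigdata.rdf.store.AbstractTripleStore.axiomsClass=com.bigdata.rdf.axioms.OwlAxioms
com.bigdata.rdf.sail.truthMaintenance=true
```

```sh
curl -X POST -H 'Content-Type: text/plain' \
     --data-binary @inf.properties \
     http://localhost:9999/blazegraph/namespace
```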
From: Bob D. <bo...@sn...> - 2016-05-07 16:31:52

I want to have SPARQL queries on http://192.168.0.79:9999/blazegraph return inferred triples, so that after loading the data at http://learningsparql.com/2ndeditionexamples/ex417.ttl I can query for dc:creator values and see the dm:composer and dm:photographer values returned. I was able to do a query on dm:photographer with no problem, but couldn't get at any inferred triples.

Based on what I saw at https://wiki.blazegraph.com/wiki/index.php/SPARQL_Update#Manage_truth_maintenance_in_SPARQL_UPDATE, on http://192.168.0.79:9999/blazegraph/#update I entered ENABLE ENTAILMENTS with a Type of "SPARQL Update" selected underneath. This gave me a bunch of errors starting with this:

```
java.util.concurrent.ExecutionException:
java.util.concurrent.ExecutionException:
org.openrdf.query.UpdateExecutionException:
java.lang.RuntimeException:
java.util.concurrent.ExecutionException:
java.lang.ArrayIndexOutOfBoundsException: 3
```

I had hoped that ENABLE (or CREATE) ENTAILMENTS as an update request would let me query for the dc:creator values. Is this possible? What am I doing wrong?

Thanks,

Bob
From: Bryan T. <br...@bl...> - 2016-05-06 17:49:18

Bob,

The inspiration for the "RDFS+" is due to Hendler. The specific out-of-the-box entailment options are linked from [1]; see the InferenceEngine.Options link there. You can express anything that is captured by conjunctive query with equality/non-equality constraints through additional customization of the rules.

Thanks,
Bryan

[1] https://wiki.blazegraph.com/wiki/index.php/InferenceAndTruthMaintenance#Configuring_Inference
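For a sense of what those InferenceEngine.Options switches look like, here is a rough properties sketch; the option names below are reproduced from memory as an illustration and must be checked against the wiki page linked in [1] before use:

```properties
# Assumed option names -- verify against InferenceEngine.Options.
com.bigdata.rdf.rules.InferenceEngine.forwardChainOwlInverseOf=true
com.bigdata.rdf.rules.InferenceEngine.forwardChainOwlTransitiveProperty=true
com.bigdata.rdf.rules.InferenceEngine.forwardChainOwlEquivalentClass=true
com.bigdata.rdf.rules.InferenceEngine.forwardChainOwlEquivalentProperty=true
com.bigdata.rdf.rules.InferenceEngine.forwardChainOwlSameAsClosure=false
```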
From: <bo...@sn...> - 2016-05-06 17:19:05

Is the RDFS+ mentioned at https://wiki.blazegraph.com/wiki/index.php/InferenceAndTruthMaintenance the same as the Allemang/Hendler RDFS+, or does Blazegraph support its own subset of OWL added on to RDFS? If the latter, are the supported properties listed somewhere?

Thanks,

Bob
From: Joakim S. <joa...@bl...> - 2016-05-03 22:10:41

I'm running Blazegraph / wikidata-query service-0.2.0-SNAPSHOT. When trying to load a ttl file from the Blazegraph dashboard, I get the following error:

```
ERROR: INSERT-WITH-BODY: ...bigdata/namespace/wdq/sparql, Content-Type=application/x-turtle, context-uri=[]
java.util.concurrent.ExecutionException: java.lang.RuntimeException:
java.util.concurrent.ExecutionException: java.lang.RuntimeException:
com.bigdata.rwstore.PhysicalAddressResolutionException: Address did not resolve to physical address: -137016524
	at java.util.concurrent.FutureTask.report(FutureTask.java:122)
	at java.util.concurrent.FutureTask.get(FutureTask.java:192)
	at com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:281)
	at com.bigdata.rdf.sail.webapp.InsertServlet.doPostWithBody(InsertServlet.java:203)
	at com.bigdata.rdf.sail.webapp.InsertServlet.doPost(InsertServlet.java:119)
	at com.bigdata.rdf.sail.webapp.RESTServlet.doPost(RESTServlet.java:308)
	at com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doPost(MultiTenancyServlet.java:170)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
	at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:808)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:587)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:577)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:223)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1127)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:515)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1061)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:215)
	at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:110)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97)
	at org.eclipse.jetty.server.Server.handle(Server.java:497)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:310)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:257)
	at org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:540)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:635)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:555)
	at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException:
com.bigdata.rwstore.PhysicalAddressResolutionException: Address did not resolve to physical address: -137016524
	at com.bigdata.rdf.rio.StatementBuffer.flush(StatementBuffer.java:927)
	at com.bigdata.rdf.sail.BigdataSail$BigdataSailConnection.flushStatementBuffers(BigdataSail.java:3904)
	at com.bigdata.rdf.sail.BigdataSail$BigdataSailConnection.commit2(BigdataSail.java:3687)
	at com.bigdata.rdf.sail.BigdataSailRepositoryConnection.commit2(BigdataSailRepositoryConnection.java:330)
	at com.bigdata.rdf.sail.BigdataSailRepositoryConnection.commit(BigdataSailRepositoryConnection.java:349)
	at com.bigdata.rdf.sail.webapp.InsertServlet$InsertWithBodyTask.call(InsertServlet.java:308)
	at com.bigdata.rdf.sail.webapp.InsertServlet$InsertWithBodyTask.call(InsertServlet.java:226)
	at com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:68)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	... 1 more
```
From: Bryan T. <br...@bl...> - 2016-04-25 15:25:40

Non-blocking hash joins are currently only used when a limit is specified. When using a query that has subqueries or sub-groups, blocking hash joins can cause the query to take more time to the first result but less time overall. If you want to enable the non-blocking hash joins, you can probably just specify a large LIMIT.

Thanks,
Bryan
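Applied to a query with a sub-group, Bryan's workaround amounts to appending a LIMIT well above the expected result size. A hypothetical query shape, purely for illustration:

```sparql
# The oversized LIMIT is only there to switch the engine to
# non-blocking hash joins, per Bryan's suggestion above.
SELECT ?s ?p ?o
WHERE {
  ?s ?p ?o .
  OPTIONAL { ?o ?q ?v . }
}
LIMIT 1000000000
```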
From: Brad B. <be...@bl...> - 2016-04-25 13:49:44

Artur,

Thank you. I've also copied the developers list as others may have experience with this as well. Presuming that you are not using an ORDER BY or DISTINCT in your query (these cause the intermediate results to be fully materialized before sending a response), the results are streamed to the output stream as they are matched using the REST API.

Thanks, --Brad

On Mon, Apr 25, 2016 at 9:34 AM, Artur Polit <Art...@lh...> wrote:

> Hello,
>
> I would like to ask one technical question about your product. We're currently querying a remote instance of Blazegraph using the RemoteRepository API, and we would like to get the first result as fast as possible. The question is whether the results (the matching triples of a triple query) are written to the HTTP output stream one by one, or whether they are flushed all at once. I was trying to understand that from the Blazegraph sources, but I've failed.
>
> Kind regards,
> Artur Polit
> Software Developer
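A quick way to observe this from the command line — a sketch assuming a default endpoint URL and namespace — is to disable curl's output buffering and watch whether result rows arrive incrementally:

```sh
# -N disables curl's buffering so streamed rows appear as the server produces them.
curl -N -H 'Accept: text/csv' \
     --data-urlencode 'query=SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 100000' \
     http://localhost:9999/blazegraph/namespace/kb/sparql
```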
From: Bryan T. <br...@bl...> - 2016-04-24 21:09:59

Perhaps you are not draining the query results? That would explain all the points you have mentioned, since the queries will continue to "run" until all solutions are drained or the query times out or is cancelled. The status tab will show you the running queries. I suspect that each submitted query continues until it times out because it is not being drained. If you are embedded, be sure to close the solutions iterator.

Bryan
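For the embedded case, "draining and closing" looks like the following openrdf (Sesame 2.x) sketch; the connection setup is elided and the method name is illustrative:

```java
import org.openrdf.query.BindingSet;
import org.openrdf.query.QueryLanguage;
import org.openrdf.query.TupleQueryResult;
import org.openrdf.repository.RepositoryConnection;

public final class DrainExample {
    // Drains every solution and always closes the iterator, so the
    // server-side query is released even if the caller stops early.
    static long drain(RepositoryConnection conn, String queryString) throws Exception {
        TupleQueryResult result =
                conn.prepareTupleQuery(QueryLanguage.SPARQL, queryString).evaluate();
        long n = 0;
        try {
            while (result.hasNext()) {
                BindingSet bindings = result.next();
                n++; // ... consume bindings here ...
            }
        } finally {
            result.close(); // the crucial step Bryan mentions
        }
        return n;
    }
}
```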
From: Daniel H. <da...@de...> - 2016-04-24 20:39:17

I found another issue with this. When I stop the server, it prints several InterruptedExceptions, one for each of the queries that hit a timeout:

```
ERROR: BigdataRDFServlet.java:214: cause=java.lang.InterruptedException, query=SPARQL-QUERY:
queryStr=PREFIX wikibase: <http://wikiba.se/ontology-beta#>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?s ?p ?q ?qo
WHERE {
  { ?s ?p ?c . ?c ?ps wd:Q16533 . ?p wikibase:propertyValue ?ps . }
  OPTIONAL { ?c ?q ?qo . ?q a wikibase:Property . }
} LIMIT 10000
java.lang.InterruptedException
```

I have set timeout=60 in the request. That is, parametrically the request is:

```
"#{@endpoint}?query=#{url_encode(query)}&timeout=60&analytic=true"
```

It seems that the timeout is not working, because tasks continue several minutes after the timeout. Then I also set a timeout in a file override.xml:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<web-app xmlns="http://java.sun.com/xml/ns/javaee"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/web-app_3_1.xsd"
         version="3.1">
  <display-name>Bigdata</display-name>
  <description>Bigdata</description>
  <context-param>
    <description></description>
    <param-name>queryTimeout</param-name>
    <param-value>2</param-value>
  </context-param>
</web-app>
```

Then I run the server with:

```
java -Xmx10g -Xms10g -XX:PermSize=10g -XX:MaxPermSize=10g -XX:+UseConcMarkSweepGC \
     -XX:ParallelCMSThreads=5 -Djetty.overrideWebXml=override.xml \
     -Dbigdata.propertyFile=server.properties -jar blazegraph.jar
```

With this, several queries were aborted by the server. That is, the response has status code 500 and the server prints this:

```
Go to http://172.17.69.182:9999/blazegraph/ to get started.
WARN : Haltable.java:466: com.bigdata.util.concurrent.Haltable@bd8c9ab : isFirstCause=true :
com.bigdata.bop.engine.QueryTimeoutException: Query deadline is expired.
com.bigdata.bop.engine.QueryTimeoutException: Query deadline is expired.
	at com.bigdata.bop.engine.RunState.checkDeadline(RunState.java:832)
	at com.bigdata.bop.engine.RunState.startOp(RunState.java:753)
	at com.bigdata.bop.engine.AbstractRunningQuery.startOp(AbstractRunningQuery.java:789)
	at com.bigdata.bop.engine.QueryEngine.startOp(QueryEngine.java:1348)
	at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTaskWrapper.run(ChunkedRunningQuery.java:883)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63)
	at com.bigdata.bop.engine.ChunkedRunningQuery$ChunkFutureTask.run(ChunkedRunningQuery.java:792)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
ERROR: BigdataRDFServlet.java:214: cause=java.util.concurrent.ExecutionException:
java.util.concurrent.ExecutionException: org.openrdf.query.QueryInterruptedException:
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.RuntimeException:
java.util.concurrent.ExecutionException: com.bigdata.bop.engine.QueryTimeoutException:
Query deadline is expired., query=SPARQL-QUERY:
queryStr=PREFIX wikibase: <http://wikiba.se/ontology-beta#>
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT ?s ?p ?q ?qo
WHERE {
  { ?s ?p ?c . ?c ?ps wd:Q17912672 . ?p wikibase:propertyValue ?ps . }
  OPTIONAL { ?c ?q ?qo . ?q a wikibase:Property . }
} LIMIT 10000
java.util.concurrent.ExecutionException: java.util.concurrent.ExecutionException:
org.openrdf.query.QueryInterruptedException: java.lang.RuntimeException:
java.util.concurrent.ExecutionException: java.lang.RuntimeException:
java.util.concurrent.ExecutionException: com.bigdata.bop.engine.QueryTimeoutException:
Query deadline is expired.
```

Thus, it seems that the timeout was working. However, after some queries I got a timeout in the client, and after this first timeout every query is answered with a timeout.

Cheers,
Daniel
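For reference, the request template above expressed as a standalone curl call — the endpoint URL and query are placeholders — with the server-side `timeout` (seconds) and `analytic` parameters used in this thread:

```sh
# -G sends the --data-urlencode pairs as GET query parameters.
curl -G 'http://localhost:9999/blazegraph/namespace/kb/sparql' \
     --data-urlencode 'query=SELECT * WHERE { ?s ?p ?o } LIMIT 10' \
     --data-urlencode 'timeout=60' \
     --data-urlencode 'analytic=true'
```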
From: Daniel H. <da...@de...> - 2016-04-24 17:38:12

Michael, I set swap off as you suggested. Then I started the process with:

```
java -Xmx6g -XX:+UseG1GC -Dbigdata.propertyFile=server.properties -jar blazegraph.jar
```

Then I checked jstat -gcutil. Before receiving queries:

```
  S0     S1     E      O      P     YGC   YGCT  FGC  FGCT   GCT
  0.00 100.00   5.74  26.17  92.52   24  0.477    0 0.000  0.477
```

While receiving queries, before the timeout:

```
  S0     S1     E      O      P     YGC   YGCT  FGC  FGCT   GCT
  0.00 100.00  46.00  76.97  99.26   13  0.269    0 0.000  0.269
```

After the first timeout:

```
  S0     S1     E      O      P     YGC   YGCT  FGC  FGCT   GCT
  0.00 100.00  22.67  85.96  96.57  248  2.289    0 0.000  2.289
```

I notice that the YGC, YGCT and GCT values increase a lot. In this experiment it occurs at query 106. I shuffle the queries every time, so it is not a problem with particular queries; it is something that occurs after some number of queries has been executed.

Cheers,
Daniel
From: Michael S. <ms...@me...> - 2016-04-24 16:57:25

Hi Daniel,

if GC is really the bottleneck, you should verify that:

1. Swapping is disabled: `sudo swapoff -a`
2. Swappiness is set to zero: http://askubuntu.com/questions/103915/how-do-i-configure-swappiness

If that doesn't help, you should try to understand in more detail what's going on. I'd recommend running vmstat and jstat during your run and correlating the counters to your configuration:

a) `vmstat -n 30 > vmstat.log`, for instance, logs a vmstat sample every 30 s (containing information about memory, disk reads, CPU utilisation and the like).

b) For jstat, you need to pass in the PID as an argument (see http://docs.oracle.com/javase/7/docs/technotes/tools/share/jstat.html); I'd recommend `jstat -gc` or `jstat -gcutil`.

Then try to correlate your values to the point in time when the timeout occurs.

Best,
Michael
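A minimal shell sketch of this monitoring setup; the pgrep pattern assumes the executable-jar launch used elsewhere in this thread, and `date -Is` assumes GNU date:

```sh
PID=$(pgrep -f blazegraph.jar)   # assumes the server was started via the executable jar
vmstat -n 30 > vmstat.log &      # one system-level sample every 30 s
while sleep 30; do               # timestamped jstat samples for correlation
  echo "$(date -Is) $(jstat -gcutil "$PID" | tail -n 1)" >> jstat.log
done
```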
From: Daniel H. <da...@de...> - 2016-04-24 16:47:13

First I ran 30 queries and got a timeout around query 20. With the option -XX:+UseG1GC I solved that issue. Then I increased the number of queries to 300, and now I get the first timeout around query 100, so the problem continues.

The issue is not with the queries. The queries are not complex; most of them finish in one second. Also, I shuffle them in each experiment. The behavior is always the same: there is an instant after which all queries get a timeout.

Now I'm reading the GC documentation and testing with different parameters.

Cheers,
Daniel
From: Bryan T. <br...@bl...> - 2016-04-24 16:14:51

Great. However, if this was GC overhead, then there is likely a remaining issue with the queries or the configuration. I would not expect normal queries to incur that kind of GC overhead.

Thanks,
Bryan
From: Daniel H. <da...@de...> - 2016-04-24 16:12:52

I solved the problem of timeouts. The issue was generated by the garbage collector. I used the option -XX:+UseG1GC that is documented here:

http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html#G1Options

Thanks to all!
Daniel
From: Michael S. <ms...@me...> - 2016-04-23 08:15:01

Hi Daniel,

>> where are these parsing errors coming from, they might indicate some other problems? And did you change log4j settings or the like, possibly causing the logger to write out too much information?
>
> I guess that the errors are produced because when the server gets a timeout, a log4j exception is written at the end of the response body. Thus, the response is not correctly parsed as JSON (that is the format that I'm requesting).

Bit confused now: as far as I can see from the log, the parsing errors show up prior to the timeouts? Or am I misinterpreting the result logs?

>> In case you're still stuck, would it be an option to share the data (or an anonymised version thereof) + scripts for running the benchmark (or a set of generated queries that allow to reproduce the problem)? If setup is straightforward, I could give it a quick try on my machine...
>
> Thanks a lot for your offer. I haven't solved this issue yet. I will try to generate data and queries that reproduce this issue.

Sure, just drop me a line once you have a setting that allows me to reproduce the problem.

Best,
Michael
From: Daniel H. <da...@de...> - 2016-04-22 22:10:22

Hi Michael,

> where are these parsing errors coming from, they might indicate some other problems? And did you change log4j settings or the like, possibly causing the logger to write out too much information?

I guess that the errors are produced because, when the server hits a timeout, a log4j exception is written at the end of the response body. Thus, the response is not correctly parsed as JSON (that is the format that I'm requesting).

> In case you're still stuck, would it be an option to share the data (or an anonymised version thereof) + scripts for running the benchmark (or a set of generated queries that allow to reproduce the problem)? If setup is straightforward, I could give it a quick try on my machine...

Thanks a lot for your offer. I haven't solved this issue yet. I will try to generate data and queries that reproduce it.

Thanks!
Daniel
From: Michael S. <ms...@me...> - 2016-04-22 16:34:28

Hi Daniel,

where are these parsing errors coming from? They might indicate some other problems. And did you change log4j settings or the like, possibly causing the logger to write out too much information?

In case you're still stuck, would it be an option to share the data (or an anonymised version thereof) plus scripts for running the benchmark (or a set of generated queries that allow reproducing the problem)? If setup is straightforward, I could give it a quick try on my machine...

Best,
Michael
From: Daniel H. <da...@de...> - 2016-04-21 15:59:42

Brad, as you said that the thread pool size does not affect queries if I am running them sequentially, I have not tried modifying it yet. Instead, I tried modifying the order of the queries. I post the results here:

https://gist.github.com/danielhz/6066f0136ede5f50b495acb9236b8e3c

The first column is the id of the query (now I'm running only 50). Queries are generated parametrically and are of the form

```sparql
SELECT * WHERE { P1 OPTIONAL P2 } LIMIT 10000
```

where P1 and P2 are two basic graph patterns generated parametrically. They are not complex basic graph patterns; in fact, each of them has no more than 3 triple patterns. The data is huge: it includes 583 million triples.

The columns in the results are, respectively, the id of the query, the time, and the result status (200 if success) or an error explanation. In the case of a timeout in the HTTP client (timeout status), the time is the elapsed time for the query in the client. In the case of a timeout in the server (parsing-error status), the body of the response is truncated by a Java exception message and is thus not a valid JSON document.

In both experiments there are always timeouts. Also, the query that produces the timeout is not necessarily complex. In fact, I have run again only query 48 (which produces the timeout when shuffling queries); it takes 0.9 seconds and gets only 9 solutions.

Note that between each experiment I restart Blazegraph; thus, I guess that the query was not cached in RAM. Maybe disk pages were cached by the system.

Thanks,
Daniel
From: Brad B. <be...@bl...> - 2016-04-21 12:38:47

Daniel,

Thank you. You can also use -Djetty.overrideWebXml to alter the properties with the executable jar; see [1]. Another option would be to use the tar.gz deployment, which has the configuration unpacked.

I hadn't noted the serial query execution. In this case, the thread pool size may not have an effect. You may want to review the queries to see why they may be causing heap pressure. For example, are there DISTINCT or ORDER BY clauses in the queries that are causing the issues?

[1] https://wiki.blazegraph.com/wiki/index.php/NanoSparqlServer#Customizing_the_web.xml
From: Daniel H. <da...@de...> - 2016-04-21 11:44:52

> One other option to try is varying the queryThreadPoolSize in the web.xml parameters.

I was using the bundled version of Blazegraph (i.e., I had downloaded only a JAR file). Now I have checked out the 2.1.0 version from the code. I guess that the web.xml file is the file

```
bigdata-war-html/src/main/webapp/WEB-INF/web.xml
```

The default queryThreadPoolSize is 16. What value do you recommend?

I am running queries sequentially, that is, I send each query after the previous one has finished or after the timeout. Maybe this way of running the experiment is relevant to the queryThreadPoolSize.

Thanks,
Daniel
From: Brad B. <be...@bl...> - 2016-04-20 19:39:01

Daniel,

One other option to try is varying the queryThreadPoolSize in the web.xml parameters.

Thanks, --Brad

```xml
<context-param>
  <description>The size of the thread pool used to service SPARQL queries
    -OR- ZERO (0) for an unbounded thread pool.</description>
  <param-name>queryThreadPoolSize</param-name>
  <param-value>16</param-value>
</context-param>
```
From: Daniel H. <da...@de...> - 2016-04-20 19:30:26

Hi,

I have run the experiments again, but now using different parameters in the request:

1. With no options.
2. With the option timeout=1.
3. With the options timeout=1 and analytic=true.

Simultaneously, I set a timeout in the HTTP client of 120 seconds. Then I ran the 500 queries. In all three cases the queries have small response times, but once one of them hits a timeout, all the following ones hit a timeout too. In case 1 the first timeout was produced at query 37, in case 2 at query 39, and in case 3 at query 41. It seems that the timeout was not produced because these queries were especially difficult, but because of a process that starts a while after the queries begin running. I guess that this process is the garbage collector.

What can I do to avoid this issue and to improve the resource usage of the machine?

Cheers,
Daniel
From: Daniel H. <da...@de...> - 2016-04-20 14:30:38

Hi,

I'm running an experiment with 583 million triples loaded into Blazegraph 2.1.0. I have 500 queries generated randomly with similar complexity. I started running these queries with a timeout of 120 seconds in the HTTP client. The first queries get small times, around 2 seconds, except some that get around 40 seconds. However, at query 37 I got a timeout, and then all the following queries got timeouts.

I guessed that this behavior was because the server was still working on the previous query when the following queries arrived. That is, the server does not notice that the client has timed out, so it keeps computing a result that nobody is waiting for.

I found the timeout option in the documentation. Thus, now I'm running queries with the option &timeout=1 to check whether this behavior was caused by running queries simultaneously. I started getting only results in less than 1 second, plus some truncated results with an internal timeout error message that cuts off the JSON output. However, after some queries I get the timeout in the client again, and then all the following queries get timeouts.

Now I guess that a GC job prevents the server from receiving more queries. I set the parameter -Xmx6g in the java command. The machine has 32GB of RAM and 6 cores. I did not put more memory in the heap, because the documentation does not recommend it. However, top says that the process has VIRT=6913m, RES=781m and CPU=104%. That is, the process is underusing the machine's resources.

I found that I can use the analytic option to use the system heap instead of the Java heap. The documentation says that this will solve the GC issue.

Do you think I am right in my suppositions about the GC?

Cheers,
Daniel
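One direct way to test the GC hypothesis raised here — using standard HotSpot GC-logging flags for Java 7/8, as in Oracle's documentation — is to log collections and check whether long pauses line up with the timeouts:

```sh
java -Xmx6g \
     -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
     -Xloggc:gc.log \
     -jar blazegraph.jar
```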