From: Bryan T. <br...@sy...> - 2016-04-11 10:36:53
This is because the journal cannot be opened from two separate processes at the same time. I can discuss this with Olaf.

Thanks,
Bryan

On Monday, April 11, 2016, Blaise de Carné <bde...@gm...> wrote:

> Hi there,
>
> It works when the Blazegraph server is not running. We can't get it to work
> when the NanoSparqlServer is running; we get this error:
>
> org.eclipse.jetty.servlet.ServletHolder$1: org.linkeddatafragments.exceptions.DataSourceCreationException: java.lang.RuntimeException: file=blazegraph.jnl
>
> Best,
> Blaise
>
> 2016-04-10 17:07 GMT+02:00 Bryan Thompson <br...@sy...>:
>
>> Blaise,
>>
>> Please confirm that you can simply reconfigure to access an existing
>> Journal file. This should work.
>>
>> Thanks,
>> Bryan
>>
>> ----
>> Bryan Thompson
>> Chief Scientist & Founder
>> Blazegraph
>> e: br...@bl...
>> w: http://blazegraph.com
>>
>> Blazegraph products help to solve the Graph Cache Thrash to achieve large
>> scale processing for graph and predictive analytics. Blazegraph is the
>> creator of the industry’s first GPU-accelerated high-performance database
>> for large graphs, has been named as one of the “10 Companies and
>> Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>.
>>
>> Blazegraph Database <https://www.blazegraph.com/> is our ultra-high
>> performance graph database that supports both RDF/SPARQL and
>> Tinkerpop/Blueprints APIs. Blazegraph GPU
>> <https://www.blazegraph.com/product/gpu-accelerated/> and Blazegraph DAS
>> <https://www.blazegraph.com/product/gpu-accelerated/> are disruptive
>> new technologies that use GPUs to enable extreme scaling that is thousands
>> of times faster and 40 times more affordable than CPU-based solutions.
>>
>> CONFIDENTIALITY NOTICE: This email and its contents and attachments are
>> for the sole use of the intended recipient(s) and are confidential or
>> proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use,
>> disclosure, dissemination or copying of this email or its contents or
>> attachments is prohibited. If you have received this communication in
>> error, please notify the sender by reply email and permanently delete all
>> copies of the email and its contents and attachments.
>>
>> On Sat, Apr 9, 2016 at 1:02 AM, Olaf Hartig <oh...@uw...> wrote:
>>
>>> Hi Blaise,
>>>
>>> I think you can do it. Although I have not tested this use case, I do
>>> not see why it would not be possible. Just point the config.json to the
>>> journal file.
>>>
>>> Best,
>>> Olaf
>>>
>>> On April 9, 2016 12:41:37 AM GMT+02:00, "Blaise de Carné" <bde...@gm...> wrote:
>>>>
>>>> Hi Olaf,
>>>>
>>>> Yes, we already took a look at your implementation. It looks good, but
>>>> we can't use it on a journal that is already used for the SPARQL endpoint,
>>>> am I wrong?
>>>>
>>>> Blaise
>>>>
>>>> On Fri, 8 Apr 2016 at 16:20, Olaf Hartig <oh...@uw...> wrote:
>>>>
>>>>> Dear Blaise,
>>>>>
>>>>> As Michael mentioned, I implemented a TPF interface directly on top of
>>>>> Blazegraph. This implementation directly uses the Blazegraph internals
>>>>> and thus avoids the overhead of forwarding every TPF request to the SPARQL
>>>>> endpoint interface (as would be done by using the standard TPF server
>>>>> implementation).
>>>>>
>>>>> Find the original source code here:
>>>>>
>>>>> https://github.com/hartig/BlazegraphBasedTPFServer
>>>>>
>>>>> ...and note that this TPF interface is included in the official 2.0
>>>>> release of Blazegraph:
>>>>>
>>>>> http://search.maven.org/#search|ga|1|a%3A%22BlazegraphBasedTPFServer%22
>>>>>
>>>>> Cheers,
>>>>> Olaf
>>>>>
>>>>> On Friday 08 April 2016 15:36:48 Michael Schmidt wrote:
>>>>> > In response to the request from the bigdata-commit (see below), please
>>>>> > let’s resume the discussion here:
>>>>> >
>>>>> > Determinism is not guaranteed unless parallelism is explicitly disabled;
>>>>> > this even holds for select queries. There are several potential sources for
>>>>> > non-determinism: in the general case, Blazegraph may choose to run multiple
>>>>> > parallel threads for a given operator (processing different chunks of data
>>>>> > in parallel), and in some cases operators also use multiple threads
>>>>> > internally.
>>>>> >
>>>>> > For the given query at hand, the single triple pattern access path will
>>>>> > yield results in order, but this order actually might be destroyed by other
>>>>> > operators on top. The projection operator, for instance, does not guarantee
>>>>> > order in the general case, as it might process data in different threads.
>>>>> > The way to achieve determinism would be to explicitly disable this
>>>>> > parallelism. In fact, this is what Blazegraph is doing when projecting for
>>>>> > queries that have an ORDER BY clause. Code-wise, a good starting point is
>>>>> > in AST2BOpUtility, starting at line 579:
>>>>> >
>>>>> > <snip>
>>>>> > if (projection != null) {
>>>>> >
>>>>> >     /**
>>>>> >      * The projection after the ORDER BY needs to preserve the ordering.
>>>>> >      * So does the chunked materialization operator. The code above
>>>>> >      * handles this for ORDER_BY + DISTINCT, but does not go far enough
>>>>> >      * to impose order preserving evaluation on the PROJECTION and
>>>>> >      * chunked materialization, both of which are downstream from the
>>>>> >      * ORDER_BY operator.
>>>>> >      *
>>>>> >      * @see #1044 (PROJECTION after ORDER BY does not preserve order)
>>>>> >      */
>>>>> >     final boolean preserveOrder = orderBy != null;
>>>>> >
>>>>> >     /*
>>>>> >      * Append operator to drop variables which are not projected by
>>>>> >      * the subquery.
>>>>> >      *
>>>>> >      * Note: We need to retain all variables which were visible in
>>>>> >      * the parent group plus anything which was projected out of the
>>>>> >      * subquery. Since there can be exogenous variables, the easiest way
>>>>> >      * to do this correctly is to drop variables from the subquery plan
>>>>> >      * which are not projected by the subquery. (This is not done at the
>>>>> >      * top-level query plan because it would cause exogenous variables
>>>>> >      * to be dropped.)
>>>>> >      */
>>>>> >
>>>>> >     {
>>>>> >         // The variables projected by the subquery.
>>>>> >         final IVariable<?>[] projectedVars = projection
>>>>> >                 .getProjectionVars();
>>>>> >
>>>>> >         final List<NV> anns = new LinkedList<NV>();
>>>>> >         anns.add(new NV(BOp.Annotations.BOP_ID, ctx.nextId()));
>>>>> >         anns.add(new NV(BOp.Annotations.EVALUATION_CONTEXT,
>>>>> >                 BOpEvaluationContext.CONTROLLER));
>>>>> >         anns.add(new NV(PipelineOp.Annotations.SHARED_STATE, true));// live stats
>>>>> >         anns.add(new NV(ProjectionOp.Annotations.SELECT, projectedVars));
>>>>> >         if (preserveOrder) {
>>>>> >             /**
>>>>> >              * @see #563 (ORDER BY + DISTINCT)
>>>>> >              * @see #1044 (PROJECTION after ORDER BY does not preserve
>>>>> >              *      order)
>>>>> >              */
>>>>> >             anns.add(new NV(PipelineOp.Annotations.MAX_PARALLEL, 1));
>>>>> >             anns.add(new NV(SliceOp.Annotations.REORDER_SOLUTIONS, false));
>>>>> >         }
>>>>> >         left = applyQueryHints(new ProjectionOp(leftOrEmpty(left),//
>>>>> >                 anns.toArray(new NV[anns.size()])//
>>>>> >                 ), queryBase, ctx);
>>>>> >     }
>>>>> > </snip>
>>>>> >
>>>>> > If the preserve order flag is true, parallelism for the operator is
>>>>> > explicitly disabled. Disabling parallelism for the projection node would
>>>>> > help for simple queries such as a single triple pattern, but in the general
>>>>> > case (for more complex queries) there will be other operators that might
>>>>> > cause non-deterministic behaviour.
>>>>> >
>>>>> > @Olaf Hartig (CC) implemented a Linked Data Fragment interface on top of
>>>>> > Blazegraph, adding him in CC.
>>>>> >
>>>>> > Best,
>>>>> > Michael
>>>>> >
>>>>> > > From: Blaise de Carné <bde...@gm...>
>>>>> > > Subject: [Bigdata-commit] Pagination consistency without ORDER BY
>>>>> > > Date: 8 April 2016 at 10:58:02 GMT+2
>>>>> > > To: "big...@li..." <big...@li...>
>>>>> > >
>>>>> > > Hi there,
>>>>> > >
>>>>> > > I would like to raise an issue that I find very annoying. I need
>>>>> > > to do more tests, but I would like to know your feelings about it.
>>>>> > >
>>>>> > > Look at this example:
>>>>> > >
>>>>> > > construct where {
>>>>> > >   ?s <http://geovocab.org/geometry#geometry> ?event
>>>>> > > } limit 5
>>>>> > >
>>>>> > > It takes about 100ms to execute on my 3B dataset.
>>>>> > >
>>>>> > > About 90% of the time, this gives me 5 results in the same order:
>>>>> > >
>>>>> > > <http://linkedgeodata.org/triplify/node1003406722> <http://geovocab.org/geometry#geometry> <http://linkedgeodata.org/geometry/node1003406722>
>>>>> > > <http://linkedgeodata.org/triplify/node1003749425> <http://geovocab.org/geometry#geometry> <http://linkedgeodata.org/geometry/node1003749425>
>>>>> > > <http://linkedgeodata.org/triplify/node1011261499> <http://geovocab.org/geometry#geometry> <http://linkedgeodata.org/geometry/node1011261499>
>>>>> > > <http://linkedgeodata.org/triplify/node1011261514> <http://geovocab.org/geometry#geometry> <http://linkedgeodata.org/geometry/node1011261514>
>>>>> > > <http://linkedgeodata.org/triplify/node1011286717> <http://geovocab.org/geometry#geometry> <http://linkedgeodata.org/geometry/node1011286717>
>>>>> > >
>>>>> > > But sometimes I get different results:
>>>>> > >
>>>>> > > <http://linkedgeodata.org/triplify/node1204787784> <http://geovocab.org/geometry#geometry> <http://linkedgeodata.org/geometry/node1204787784>
>>>>> > > <http://linkedgeodata.org/triplify/node1206798938> <http://geovocab.org/geometry#geometry> <http://linkedgeodata.org/geometry/node1206798938>
>>>>> > > <http://linkedgeodata.org/triplify/node12081506> <http://geovocab.org/geometry#geometry> <http://linkedgeodata.org/geometry/node12081506>
>>>>> > > <http://linkedgeodata.org/triplify/node1209197022> <http://geovocab.org/geometry#geometry> <http://linkedgeodata.org/geometry/node1209197022>
>>>>> > > <http://linkedgeodata.org/triplify/node1212230478> <http://geovocab.org/geometry#geometry> <http://linkedgeodata.org/geometry/node1212230478>
>>>>> > >
>>>>> > > Conclusion: order is not guaranteed without ORDER BY. If I use an ORDER BY,
>>>>> > > performance drops alarmingly.
>>>>> > >
>>>>> > > Now take this fabulous project: Linked Data Fragments
>>>>> > > (http://linkeddatafragments.org/), which provides a SparqlDatasource to
>>>>> > > handle data from a SPARQL endpoint. They use CONSTRUCT queries with LIMIT
>>>>> > > and OFFSET to paginate the results, as they say in the comments:
>>>>> > >
>>>>> > > // Even though the SPARQL spec indicates that
>>>>> > > // LIMIT and OFFSET might be meaningless without ORDER BY,
>>>>> > > // this doesn't seem a problem in practice.
>>>>> > > // Furthermore, sorting can be slow. Therefore, don't sort.
>>>>> > >
>>>>> > > But it is a problem in practice with Blazegraph, and I have experienced it:
>>>>> > > a Linked Data Fragments server configured over a Blazegraph SPARQL endpoint
>>>>> > > serves different pages 5-10% of the time.
>>>>> > >
>>>>> > > In our project we really need consistent pagination without ORDER
>>>>> > > BY. Do you think that is possible with Blazegraph?
>>>>> > >
>>>>> > > Best,
>>>>> > > Blaise
>>>>> > >
>>>>> > > PS: I don't see this behaviour with SELECT, but the cache could be
>>>>> > > responsible...
>>>>>
>>>>
>>> --
>>> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>>>
>>> _______________________________________________
>>> Bigdata-developers mailing list
>>> Big...@li...
>>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers
>>>
>>
--
----
Bryan Thompson
Chief Scientist & Founder
Blazegraph
e: br...@bl...
w: http://blazegraph.com
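
The effect Michael describes in the quoted message (several threads processing different chunks of data, versus an operator limited to MAX_PARALLEL = 1) can be reproduced outside Blazegraph with plain Java. What follows is only a sketch under stated assumptions, not Blazegraph code: the class ChunkOrderDemo and the methods makeChunks and run are hypothetical names, a fixed thread pool stands in for an operator's worker threads, and taking the first five elements of the output stands in for LIMIT 5.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; none of these names exist in Blazegraph.
public class ChunkOrderDemo {

    /** Splits 0..n-1 into consecutive chunks, mimicking an in-order access path. */
    public static List<List<Integer>> makeChunks(int n, int chunkSize) {
        List<List<Integer>> chunks = new ArrayList<>();
        for (int start = 0; start < n; start += chunkSize) {
            List<Integer> chunk = new ArrayList<>();
            for (int i = start; i < Math.min(n, start + chunkSize); i++) {
                chunk.add(i);
            }
            chunks.add(chunk);
        }
        return chunks;
    }

    /**
     * Sends every chunk through a trivial pass-through "operator" that runs on
     * up to maxParallel worker threads and appends each chunk to a shared sink.
     * With one worker the chunks arrive in submission order; with several
     * workers the arrival order of the chunks is not guaranteed.
     */
    public static List<Integer> run(List<List<Integer>> chunks, int maxParallel) {
        List<Integer> sink = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newFixedThreadPool(maxParallel);
        for (List<Integer> chunk : chunks) {
            // addAll on a synchronized list is atomic, so a chunk stays contiguous.
            pool.execute(() -> sink.addAll(chunk));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(sink);
    }

    public static void main(String[] args) {
        List<List<Integer>> chunks = makeChunks(20, 5);
        // "LIMIT 5" over an order-preserving pipeline: a stable first page.
        System.out.println("maxParallel=1: " + run(chunks, 1).subList(0, 5));
        // "LIMIT 5" over a parallel pipeline: the first page may vary between runs.
        System.out.println("maxParallel=4: " + run(chunks, 4).subList(0, 5));
    }
}
```

With a single worker the first page is always the first chunk, because the pool executes tasks in submission order. With four workers the same elements come out, but whichever chunk finishes first becomes the first page, which matches the occasional page change reported at the start of the thread.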