Re: [Bigdata-developers] Performance impact of Named Subqueries

Hello Bryan, Edgar,

there is already a query hint for pipelined (non-blocking) hash joins. For instance, when using

<snip>
SELECT * WHERE {
  ?s <http://p1> ?o1
  {
    SELECT * WHERE {
      ?s <http://p2> ?o2 .
      ?s <http://p3> ?o3 .
    }
  }
  hint:Prior hint:pipelinedHashJoin "true" .  
}
</snip>

the inner SELECT will be executed in a pipelined fashion. The hint also works for complex OPTIONALs, EXISTS, and VALUES clauses. Note that we do not use pipelining for named subqueries (i.e., %INCLUDE … patterns) for the time being.
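
As a rough, untested sketch of the complex OPTIONAL case: this assumes the hint attaches to the OPTIONAL group in the same way as to the subquery above, i.e. by placing hint:Prior directly after the group. The predicates are the same placeholders as before, and the hint: prefix is assumed to be predeclared as in the first example.

<snip>
SELECT * WHERE {
  ?s <http://p1> ?o1 .
  # "Complex" OPTIONAL: more than one triple pattern inside the group.
  OPTIONAL {
    ?s <http://p2> ?o2 .
    ?s <http://p3> ?o3 .
  }
  # Assumption: hint:Prior refers to the OPTIONAL group directly above.
  hint:Prior hint:pipelinedHashJoin "true" .
}
</snip>

Whether the optimizer actually chooses a pipelined join for a given query is best checked against the query plan rather than taken from this sketch.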

I’ve added documentation to the Wiki at https://wiki.blazegraph.com/wiki/index.php/QueryHints.

Best,
Michael

> On 17 Jun 2016, at 23:46, Bryan Thompson <br...@bl...> wrote:
> 
> Named subqueries are run first, in a bottom-up evaluation style. Normal subqueries are evaluated as-bound, using left-to-right evaluation, so the plans are quite different. There are now non-blocking subquery hash joins, which are used if a limit is specified, so they can be turned on by using a very large limit. We plan to have a query hint for that soon...
> 
> On Jun 17, 2016 5:40 PM, "Edgar Rodriguez-Diaz" <ed...@sy...> wrote:
> Hi,
> 
> If I understand the wiki docs correctly, named subqueries should help by reusing computed results. However, I’m seeing a very significant performance difference between a query that uses named subqueries and the same query with the named subqueries inlined. Both queries return the same results, but the inlined version performs at least an order of magnitude better. I’ve seen this in several queries now:
> 
>         SPARQLBenchmark
>         ---------------
>         
>                       name | result | rank | runs |  mean |     sd
>         -------------------|--------|------|------|-------|--------
>                     query1 |   PASS |    1 |    3 | 3.497 |  1.195
>                     query2 |   PASS |    2 |    3 | 16.03 |  1.448
>                     query3 |   PASS |    3 |    3 |  21.6 |  1.386
>               named query2 |   PASS |    4 |    3 | 386.9 |  3.411
>               named query3 |   PASS |    5 |    3 | 397.5 |   6.31
>               named query1 |   PASS |    6 |    3 | 827.3 | 0.7966
>               
>         The 18 runs above were executed in random, non-consecutive order. Mean times are in seconds.
> 
> So I guess the questions are:
> What is the known performance impact of named subqueries, and is what I’m experiencing known behavior?
> Is it that named subqueries are blocking, whereas the results can be streamed when the subqueries are inlined?
> 
> Cheers, 
> Edgar
> 
> 
> _______________________________________________
> Bigdata-developers mailing list
> Big...@li...
> https://lists.sourceforge.net/lists/listinfo/bigdata-developers
> 
