[Bigdata-developers] Performance impact of Named Subqueries

Hi,

If I understand the wiki docs correctly, named subqueries should help by
reusing results that have already been computed. However, I’m seeing a very
significant performance difference between a query that uses named
subqueries and the same query with those subqueries inlined. Both queries
return the same results, but the inlined version runs at least an order of
magnitude faster than the named-subquery version. I’ve seen this in several
queries now:

    SPARQLBenchmark
    ---------------

            name | result | rank | runs |  mean |     sd
    -------------|--------|------|------|-------|-------
          query1 |   PASS |    1 |    3 | 3.497 |  1.195
          query2 |   PASS |    2 |    3 | 16.03 |  1.448
          query3 |   PASS |    3 |    3 |  21.6 |  1.386
    named query2 |   PASS |    4 |    3 | 386.9 |  3.411
    named query3 |   PASS |    5 |    3 | 397.5 |   6.31
    named query1 |   PASS |    6 |    3 | 827.3 | 0.7966

    Each of the above 18 runs was executed in random, non-consecutive
    order. Mean times are in seconds.
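
To put a number on the gap, here is a small Python sketch that divides each named-subquery mean by the mean of its inlined counterpart. The mean times are copied from the table above; the dictionary and function names are only for illustration, and pairing "named queryN" with "queryN" assumes they are the two variants of the same query.

```python
# Mean times in seconds, copied from the benchmark table.
# Assumption: "named queryN" and "queryN" are the two variants of one query.
inlined = {"query1": 3.497, "query2": 16.03, "query3": 21.6}
named = {"query1": 827.3, "query2": 386.9, "query3": 397.5}


def slowdown(named_mean, inlined_mean):
    """Return how many times slower the named-subquery variant is."""
    return named_mean / inlined_mean


ratios = {q: slowdown(named[q], inlined[q]) for q in inlined}

for q in sorted(ratios):
    print(f"{q}: named variant is about {ratios[q]:.0f}x slower")
```

Under that pairing the factors come out at roughly 237x, 24x and 18x, so every named variant is more than ten times slower than its inlined form.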

So I guess the questions are:
What is the known performance impact of named subqueries, and is what I’m
experiencing a known behavior?
Is it that named subqueries are blocking, whereas inlined subqueries allow
the results to be streamed?
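
To make clear what I mean by blocking versus streaming, here is a toy Python sketch. It is not a model of how Bigdata actually evaluates queries; `subquery`, `blocking_plan` and `streaming_plan` are hypothetical names, and the "solutions" are just integers. It only shows how much work is done before the first result reaches the consumer in each case.

```python
from itertools import islice


def subquery(n, log):
    """Hypothetical subquery: yields n solutions and logs each one it computes."""
    for i in range(n):
        log.append(i)
        yield i


def blocking_plan(n, log):
    # Materialize the whole subquery result before the outer query reads any of it.
    materialized = list(subquery(n, log))
    for row in materialized:
        yield row


def streaming_plan(n, log):
    # Hand each solution to the outer query as soon as it is produced.
    yield from subquery(n, log)


blocking_log, streaming_log = [], []

# Ask each plan for its first result only.
first_blocking = list(islice(blocking_plan(1000, blocking_log), 1))
first_streaming = list(islice(streaming_plan(1000, streaming_log), 1))

# Number of solutions computed before the first result was delivered.
print(len(blocking_log), len(streaming_log))  # prints: 1000 1
```

If named subqueries behave like the blocking plan, nothing downstream can start until the whole named result set has been computed, which is the kind of effect I am wondering about.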

Cheers,
Edgar
