Ticket #501 (closed defect: fixed)

Opened 15 months ago

Last modified 15 months ago

SPARQL 1.1 BINDINGS are ignored

Reported by: thompsonbry Owned by: thompsonbry
Priority: major Milestone:
Component: Bigdata SAIL Version: BIGDATA_RELEASE_1_1_0
Keywords: Cc: mrpersonick

Description (last modified by thompsonbry)

The SPARQL parser recognizes and extracts the BINDINGS clause and attaches it to the AST, but the query plan generator does not yet use the attached bindings. The simplest way to integrate those BINDINGS is to pump them into a named solution set and then INCLUDE that solution set within the top-level WHERE clause.

However, bigdata accepts IBindingSet[]s pretty much everywhere: at the IRunningQuery, in the ASTOptimizers, etc., just not in ASTEvalHelper. The more general solution is therefore to accept multiple solutions into the query, do the static analysis of those solutions (the SolutionStats class), and then attach that analysis to the QueryRoot. The SolutionStats can then be leveraged in the ASTOptimizers.

The openrdf platform allows the caller to specify a single BindingSet as input to the query. That pre-existing API needs to be reconciled with a BindingSet[] flowing into the query through the BINDINGS clause. I think that the right way to reconcile these things is to treat the caller-given BindingSet as a constraint which must be applied to every solution in the BINDINGS clause. If this results in conflicting bindings for a given source solution from the BINDINGS clause, then there are no solutions to the query for that source solution.

Thus, another way to look at this is that the BINDINGS clause attached to the QueryRoot *replaces* the BindingSet or IBindingSet flowing into the query. They are basically different approaches to capturing the same information and just need to be reconciled.
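
The reconciliation proposed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the bigdata implementation: solutions are modeled as plain variable-name to value maps rather than IBindingSet or openrdf BindingSet objects, and the class name BindingsJoinSketch and its join method are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical sketch (not bigdata code): a solution is modeled as a
 * map from variable name to value.
 */
final class BindingsJoinSketch {

    /**
     * Applies the single caller-given solution as a constraint on every
     * solution from the BINDINGS clause. A source solution is kept only
     * if it agrees with the caller's bindings on every shared variable;
     * kept solutions are extended with the caller's remaining bindings.
     */
    static List<Map<String, String>> join(Map<String, String> callerBindings,
                                          List<Map<String, String>> bindingsClause) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> source : bindingsClause) {
            Map<String, String> merged = new LinkedHashMap<>(source);
            boolean compatible = true;
            for (Map.Entry<String, String> e : callerBindings.entrySet()) {
                // putIfAbsent returns the existing value when the variable is already bound.
                String existing = merged.putIfAbsent(e.getKey(), e.getValue());
                if (existing != null && !existing.equals(e.getValue())) {
                    compatible = false; // conflicting binding: drop this source solution
                    break;
                }
            }
            if (compatible) {
                out.add(merged);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> caller = Map.of("x", "ex:a");
        List<Map<String, String>> clause = List.of(
                Map.of("x", "ex:a", "y", "ex:b"),  // agrees on x: kept
                Map.of("x", "ex:c", "y", "ex:d"),  // conflicts on x: dropped
                Map.of("y", "ex:e"));              // x unbound: kept, x is added
        System.out.println(join(caller, clause));
    }
}
```

With one caller-given solution and N solutions in the BINDINGS clause, the result can never contain more than N solutions, and an empty caller solution leaves the BINDINGS clause unchanged.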

See https://sourceforge.net/apps/trac/bigdata/ticket/412 (StaticAnalysis#getDefinitelyBound() ignores exogenous variables)
See https://sourceforge.net/apps/trac/bigdata/ticket/449 (SPARQL 1.1 Federation)
See https://sourceforge.net/apps/trac/bigdata/ticket/267 (Support evaluation of 3rd party operators)

Change History

Changed 15 months ago by thompsonbry

  • status changed from new to accepted

Changed 15 months ago by thompsonbry

  • description modified

Changed 15 months ago by thompsonbry

  • status changed from accepted to closed
  • resolution set to fixed

The BINDINGS clause is now obeyed. More work on federated query support. The remaining issue is deferring SERVICE calls where the service reference is a variable.

The static analysis of exogenous variables issue (https://sourceforge.net/apps/trac/bigdata/ticket/412) remains open. I have not investigated optimizations there yet.

- Renamed some methods on the IServiceOptions interface and some
  implementations of that interface in order to reduce confusion
  among internal bigdata services, internal openrdf services, and
  remote SPARQL services.

- Modified the named solution set operators (JVMNamedSubqueryOp and
  HTreeNamedSubqueryOp) to vector all source solutions into the named
  subquery. These operators now verify that they are configured for
  "at-once" evaluation, thus ensuring that any BINDINGS clause is
  fully passed through into the named subquery by the operator.

- Modified ASTEvalHelper and AST2BOpUtility to process the BINDINGS
  clause. If the openrdf API specifies a non-empty BindingSet and a
  BINDINGS clause was also given, then we do a simple JOIN of those
  solutions. This is always a [1 x N] join and will have at most N
  solutions. Solutions which do not join are dropped. They represent
  a conflict between the bindings given through the openrdf API and
  the BINDINGS clause. The remaining solutions are vectored into the
  query. The various tests which rely on the BINDINGS clause now pass.

  Note: I have not yet revisited the AST optimizations for the
  exogenous variables.

- Added AT_ONCE query hint. This indicates that the corresponding
  operator will be marked as !pipelined. All source solutions for
  that operator will be buffered before the operator is evaluated and
  the operator will be evaluated exactly once. Added unit tests for
  this query hint for PipelineJoin and ServiceCallJoin.

- Added CHUNK_SIZE query hint. This is just a well known name for the
  BufferAnnotations.CHUNK_CAPACITY and duplicates the existing
  ChunkCapacityQueryHint, making it more convenient to override the
  vector size for an operator.

  Added unit tests to verify that CHUNK_SIZE is correctly applied to
  PipelineJoin and ServiceCallJoin.

- ASTServiceNodeOptimizer has been modified, at least temporarily, to
  NOT lift out SERVICE calls into a named subquery unless the SERVICE
  reference is a constant which is the bigdata internal search
  service. I want to think about more general purpose ways of
  handling this. E.g., by registering a service as "runOnce".
  However, it may be that the most general way to handle this is to
  specify the service as "at-once" (which is in fact the default for a
  Service).

See https://sourceforge.net/apps/trac/bigdata/ticket/449 (SPARQL 1.1 Federated Query)
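
To illustrate the difference between the AT_ONCE and CHUNK_SIZE hints described in the notes above, here is a minimal sketch of the two evaluation styles. It is an assumption-laden toy model, not the bigdata operator framework: the class VectoringSketch and its methods are hypothetical, an "operator" is just a callback over a list of solutions, and the chunk size stands in for the chunk capacity of the operator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch (not bigdata code) of pipelined versus at-once evaluation. */
final class VectoringSketch {

    /** Splits the source solutions into chunks of at most chunkSize elements. */
    static <T> List<List<T>> chunk(List<T> solutions, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < solutions.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, solutions.size());
            chunks.add(new ArrayList<>(solutions.subList(i, end)));
        }
        return chunks;
    }

    /** Pipelined: the operator is invoked once per chunk. Returns the invocation count. */
    static <T> int evaluatePipelined(List<T> solutions, int chunkSize, Consumer<List<T>> op) {
        int invocations = 0;
        for (List<T> c : chunk(solutions, chunkSize)) {
            op.accept(c);
            invocations++;
        }
        return invocations;
    }

    /** At-once: every chunk is buffered first, then the operator is invoked exactly once. */
    static <T> int evaluateAtOnce(List<T> solutions, int chunkSize, Consumer<List<T>> op) {
        List<T> buffer = new ArrayList<>();
        for (List<T> c : chunk(solutions, chunkSize)) {
            buffer.addAll(c);
        }
        op.accept(buffer);
        return 1;
    }

    public static void main(String[] args) {
        List<String> solutions = List.of("s1", "s2", "s3", "s4", "s5");
        evaluatePipelined(solutions, 2, c -> System.out.println("pipelined chunk: " + c));
        evaluateAtOnce(solutions, 2, c -> System.out.println("at-once input: " + c));
    }
}
```

In this model a larger chunk size means fewer, larger invocations of a pipelined operator, while an at-once operator sees every source solution in a single invocation regardless of the chunk size, which is why at-once evaluation guarantees that a full set of BINDINGS solutions reaches the operator together.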

Changed 15 months ago by thompsonbry

Committed revision r6080. (for the above comment)
