
Ticket #448 (closed enhancement: fixed)

Opened 17 months ago

Last modified 15 months ago

SPARQL 1.1 Update

Reported by: thompsonbry
Owned by: thompsonbry
Priority: major
Milestone: Query
Component: Bigdata RDF Database
Version: BIGDATA_RELEASE_1_1_0
Keywords:
Cc: mrpersonick

Description

Add support for SPARQL 1.1 Update operations.

Change History

Changed 16 months ago by thompsonbry

  • status changed from new to accepted

Changed 16 months ago by thompsonbry

Checkpoint. I have developed the AST model for most of the UPDATE operators and provided the parser integration, and unit tests of that integration, for those operators as well. The last operator that I need to handle is the DELETE/INSERT operator.

I introduced a new abstraction which unifies the ConstructNode with the concept of QuadsData.

Committed revision r6104.

Changed 15 months ago by thompsonbry

Checkpoint on SPARQL 1.1 UPDATE support. This commit provides the translation from the parser to the bigdata AST and full test suite coverage for those AST translation targets. It does not provide any evaluation semantics for UPDATE.

Committed revision r6108.

Changed 15 months ago by thompsonbry

Wrote the first bootstrap unit tests for SPARQL UPDATE. We now have operators which batch resolve RDF Values to IVs, then batch insert statements formed from those IVs onto the database, and finally commit the database. This is the kernel of an update plan and went pretty smoothly. Removal of statements should be a trivial variation on this plan. Likewise, LOAD should be straightforward.

Committed revision r6154.

Changed 15 months ago by thompsonbry

More work on SPARQL UPDATE support. Refactored to derive an operator to REMOVE statements from the same base class as the operator to ADD statements. Added support for reporting mutation counts on relations through the bop statistics and extended that support through the painting of the EXPLAIN page. Added the ability to identify vocabulary IVs via isVocabulary(). LexiconRelation#addTerms() now reports the #of terms added (or the #of terms which would have been added if invoked with readOnly=true).

Committed revision r6155.

Changed 15 months ago by thompsonbry

I have added a pipeline operator to parse RDF data. It feeds the data into the solutions flowing through the query engine with (s,p,o,c) bindings. There are a LOT of configuration options.

This only supports triples and quads right now. It does not support truth maintenance or SIDs. To support those things we need to refactor some of the truth maintenance and SIDs resolution logic so that it fits into a pipelined update plan.

The ParseOp feeds bindings which are compatible with the ChunkedResolutionOp (add/resolve IVs) and the InsertStatementsOp. The SPARQL "LOAD" command is thus modeled by:

ParseOp => ChunkedResolutionOp => InsertStatementsOp => CommitOp

However, that "plan" only works for triples and quads without inference. It does not work for SIDs (there is no step to do SIDs resolution) and it does not work for truth maintenance. We need to look at how our existing support classes for those things (StatementBuffer, TruthMaintenance, and DataLoader) can be refactored to provide that support for UPDATE plans.
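The LOAD plan described above can be pictured as a chain of stages, each consuming the output of the previous one. The following is a minimal Python sketch of that shape only, not bigdata's implementation. The function names, the dictionary used as a term-to-IV lexicon, the toy whitespace-separated input format, and the in-memory set used as the statement store are all hypothetical stand-ins for ParseOp, ChunkedResolutionOp, InsertStatementsOp, and CommitOp.

```python
from typing import Dict, Iterable, Iterator, Set, Tuple

Quad = Tuple[str, str, str, str]      # (s, p, o, c) as RDF term strings
IVQuad = Tuple[int, ...]              # the same quad as internal values (IVs)


def parse_op(lines: Iterable[str]) -> Iterator[Quad]:
    """Stand-in for ParseOp: emit (s, p, o, c) bindings from toy 4-column lines."""
    for line in lines:
        parts = line.split()
        if len(parts) == 4:
            yield (parts[0], parts[1], parts[2], parts[3])


def chunked_resolution_op(quads: Iterable[Quad], lexicon: Dict[str, int]) -> Iterator[IVQuad]:
    """Stand-in for ChunkedResolutionOp: add/resolve each term to an integer IV."""
    for quad in quads:
        # setdefault assigns the next free IV the first time a term is seen.
        yield tuple(lexicon.setdefault(term, len(lexicon)) for term in quad)


def insert_statements_op(quads: Iterable[IVQuad], pending: Set[IVQuad]) -> int:
    """Stand-in for InsertStatementsOp: buffer new statements, return the mutation count."""
    added = 0
    for quad in quads:
        if quad not in pending:
            pending.add(quad)
            added += 1
    return added


def load(lines: Iterable[str], lexicon: Dict[str, int], committed: Set[IVQuad]) -> int:
    """Run parse => resolve => insert => commit and return the number of statements added."""
    pending = set(committed)
    count = insert_statements_op(chunked_resolution_op(parse_op(lines), lexicon), pending)
    committed.update(pending)         # stand-in for CommitOp
    return count
```

As in the plan it illustrates, this sketch has no stage for SIDs resolution or truth maintenance.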

Committed revision r6157.

Changed 15 months ago by thompsonbry

I have integrated several of the basic operations such that they can now be executed directly from SPARQL. There is also some measure of test coverage for these operations.

- INSERT DATA
- DELETE DATA
- CLEAR/DROP
- CREATE

There remains significant confusion about the "nullGraph" (an openrdf concept) and the data set is not yet attached to the AST so anything which operates on a specific data set will fail.

Truth maintenance and SIDs mode support are both lacking, and both need test suites as well.

Committed revision r6160.

Changed 15 months ago by thompsonbry

Checkpoint adds coverage for the graph add/move/copy operations and insert/where and delete/where. The main problem at this point is getting the default and named graph context stuff right. I also have not done the most general case for delete+insert. I want to review the atomicity requirements for the WHERE clause for that operation.

Committed revision r6163.

Changed 15 months ago by thompsonbry

Checkpoint on SPARQL UPDATE. Everything except the more complex DELETE+INSERT and the DELETE WHERE shortcut is working now.

Committed revision r6164.

Changed 15 months ago by thompsonbry

Fixed one more SPARQL UPDATE test.

The remaining issues appear to be:

- Handling blank nodes in the template for the INSERT clause and the DELETE clause of a DELETE/INSERT request.

- Interpreting the WITH, USING, and USING NAMED keywords (these indicate the data set for DELETE/INSERT requests).

- The general case for DELETE/INSERT when both the INSERT and DELETE clause are specified. The UPDATE spec specifies that the WHERE clause is evaluated, the bindings are substituted into the DELETE clause, and then the bindings are substituted into the INSERT clause. For large solution sets we obviously will want to materialize the bindings on an HTree.

- The DELETE WHERE shortcut. We just need to generate the appropriate CONSTRUCT node from the WHERE clause.
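The evaluation order for the general DELETE/INSERT case in the list above can be sketched as follows. This is a toy Python model under stated assumptions, not bigdata code: triples are plain tuples, variables are strings beginning with "?", the WHERE clause is a single triple pattern, and the solutions are held in an in-memory list rather than materialized on an HTree. All function names are hypothetical.

```python
def is_var(term):
    """Toy convention: variables are strings that start with '?'."""
    return term.startswith("?")


def match(pattern, store):
    """Evaluate one triple pattern against the store, returning a list of solutions."""
    solutions = []
    for triple in store:
        binding = {}
        for p_term, t_term in zip(pattern, triple):
            if is_var(p_term):
                # A repeated variable must bind to the same value each time.
                if binding.setdefault(p_term, t_term) != t_term:
                    break
            elif p_term != t_term:
                break
        else:
            solutions.append(binding)
    return solutions


def instantiate(template, solution):
    """Substitute bindings into a template, skipping triples left with unbound variables."""
    out = set()
    for triple in template:
        ground = tuple(solution.get(t, t) if is_var(t) else t for t in triple)
        if not any(is_var(t) for t in ground):
            out.add(ground)
    return out


def delete_insert(store, delete_template, insert_template, where_pattern):
    """Evaluate WHERE once, then apply all deletes, then all inserts."""
    solutions = match(where_pattern, store)   # evaluated before any mutation
    to_delete, to_insert = set(), set()
    for solution in solutions:
        to_delete |= instantiate(delete_template, solution)
        to_insert |= instantiate(insert_template, solution)
    store.difference_update(to_delete)
    store.update(to_insert)
```

For example, calling `delete_insert(store, [("?s", "name", "Bill")], [("?s", "name", "William")], ("?s", "name", "Bill"))` renames every matching subject while leaving other statements untouched.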

Committed revision r6165.

Changed 15 months ago by thompsonbry

Fixed support for WITH for SPARQL UPDATE DELETE/INSERT. There still appears to be a problem related to the data set and also to the default graph as interpreted within a QuadPattern (the DELETE clause and INSERT clause of a DELETE/INSERT request).

Committed revision r6166.

Changed 15 months ago by thompsonbry

Also handling the general DELETE/INSERT case in r6166. The current outstanding issues are:

- Handling blank nodes in the template for the INSERT clause and the DELETE clause of a DELETE/INSERT request.

- Interpreting the USING and USING NAMED keywords (these indicate the data set for DELETE/INSERT requests). The data set is not getting interpreted properly for DELETE/INSERT. It is being attached and updated on the UpdateRoot for each Update operation. I wonder if it should be on the Update AST node instead.

- The DELETE WHERE shortcut. We just need to generate the appropriate CONSTRUCT node from the WHERE clause.

- There is a problem with full read/write tx support (versus the unisolated writer).
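The DELETE WHERE shortcut in the list above amounts to reusing the WHERE patterns as the delete template, so that `DELETE WHERE { P }` behaves like `DELETE { P } WHERE { P }`. A minimal Python sketch of that rewrite follows. The dictionary layout and the function name are hypothetical and do not reflect the bigdata AST classes. The template is a deep copy so that later rewrites of the WHERE clause do not silently change the template.

```python
import copy


def expand_delete_where(where_patterns):
    """Rewrite DELETE WHERE { P } as DELETE { P } WHERE { P } (toy representation)."""
    return {
        # Copy the patterns: the template must not share nodes with the WHERE clause.
        "delete": copy.deepcopy(where_patterns),
        "where": where_patterns,
    }
```

With `where = [["?s", "<p>", "?o"]]`, the result of `expand_delete_where(where)` has equal `delete` and `where` entries that are distinct objects, so an optimizer that mutates one does not affect the other.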

Changed 15 months ago by thompsonbry

We are now passing all of the openrdf unit tests for UPDATE and UPDATE can now be used from the bigdata/openrdf APIs. However, it is not yet integrated with the NanoSparqlServer.

- Built out the test suite for USING, USING NAMED, and WITH.

- Modified DatasetNode to allow unknown IVs for UPDATE. Unlike query, for update those IVs might be created by the operation and hence cannot be pruned.

- Moved the DatasetNode for UPDATE from the UpdateRoot to the DeleteInsertNode. The data set can only be specified for the general case of DELETE/INSERT. The data set is specific to the DELETE/INSERT operation to which it is attached (it is not inherited or combined with the data set for later operations in a sequence).

- Added documentation and argument checking to the core StatementPatternNode constructor.

- Fixed the "DELETE WHERE" shortcut.

- The code path for the INSERT and DELETE clause now treats blank nodes in the template as unbound variables.

- Fixed the bug when full transactions were being used (it was not going through the transaction for clearGraph).

Remaining issues include:

- Expose UPDATE at the HTTP SPARQL end point.

- includeInferred is not being passed through for UPDATE. See ASTEvalHelper and BigdataSailUpdate.

- We do not yet have a SIDS mode test suite for UPDATE.

Committed revision r6167.

Changed 15 months ago by thompsonbry

Exposed a formerly private method which was causing compile errors in CI.

Committed revision r6168.

Changed 15 months ago by thompsonbry

Integrated SPARQL UPDATE into the NanoSparqlServer. I have only added a few tests so far and those tests are only in QUADS mode, but it is running. The NSS SPARQL UPDATE tests are not enabled in CI yet because they are quads specific. I also added support for SPARQL UPDATE to the RemoteRepository class (the Java remote client).

The RemoteRepository was modified to use the given serviceURL as is rather than appending /sparql to the serviceURL. This change is necessary in case there is a URL rewriter in front of the SPARQL end point.

Reconciled changes with MikeP's recent commit of multi-part MIME support. This included changing the URL query parameter for POST of multi-part MIME to avoid a conflict with the SPARQL UPDATE HTTP binding.
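For orientation, the SPARQL 1.1 Protocol allows an update to be sent as an HTTP POST with a form-encoded `update` parameter. The Python sketch below builds such a request without sending it and uses the service URL exactly as given, mirroring the serviceURL behavior described above. It is an illustration only, not the RemoteRepository code, and the function name and the example URL are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_update_request(service_url: str, update: str) -> Request:
    """Build (but do not send) a form-encoded SPARQL UPDATE POST request."""
    body = urlencode({"update": update}).encode("utf-8")
    return Request(
        service_url,  # used as given; nothing such as "/sparql" is appended
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

A caller would pass whatever URL the deployment exposes, for example `build_update_request("http://localhost:8080/rewritten/endpoint", "CLEAR ALL")`, and hand the result to `urllib.request.urlopen` to execute it.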

Committed revision r6169.

Changed 15 months ago by thompsonbry

  • The includeInferred attribute was not being passed through for UPDATE. It is now.
  • Bug fix for CONSTRUCT WHERE shortcut. Added new test cases for this. Fix was to the BigdataExprBuilder. It now generates the CONSTRUCT template from the WHERE clause if none was given. This should also fix some of the negative parser tests which were failing. Also fixed a problem where a CONSTRUCT query could visit a Statement in the template even though the WHERE clause failed. The template ground triples are now output only if the WHERE clause succeeds (at least one solution, even if it is empty). Also, fixed a related bug in the SPARQL UPDATE "DELETE WHERE" shortcut pattern where it was failing to clone the StatementPatternNode.
  • Added getStatements() to the RemoteRepository. It is exercised by the SPARQL UPDATE test suite.
  • Built out the SPARQL UPDATE test suite for QUADS. We are still lacking a test suite for TRIPLES and SIDs. There is also no unit test yet for LOAD.
  • Added documentation for SPARQL 1.1 UPDATE support to the Wiki.
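The rule for ground triples in a CONSTRUCT template, noted in the list above, falls out naturally when the template is instantiated once per solution. The Python sketch below is a toy model under assumptions: triples are tuples, variables are strings beginning with "?", solutions are dictionaries, and the function name is hypothetical. With zero solutions nothing is emitted. With a single empty solution the ground triples are emitted.

```python
def construct(template, solutions):
    """Instantiate the template once per solution and collect the resulting triples."""
    out = set()
    for solution in solutions:
        for triple in template:
            # Constants are not keys of the solution, so they pass through unchanged.
            ground = tuple(solution.get(term, term) for term in triple)
            # Drop triples that still contain an unbound variable.
            if not any(term.startswith("?") for term in ground):
                out.add(ground)
    return out
```

Because the loop over `solutions` is outermost, a failed WHERE clause (an empty list of solutions) yields an empty result even when the template contains triples with no variables.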

@see http://sourceforge.net/apps/mediawiki/bigdata/index.php?title=NanoSparqlServer
@see http://sourceforge.net/apps/trac/bigdata/ticket/448 (SPARQL 1.1 UPDATE)
@see http://sourceforge.net/apps/trac/bigdata/ticket/520 (CONSTRUCT WHERE shortcut)

Committed revision r6170.

Changed 15 months ago by thompsonbry

- Set the default RDFFormat for LOAD to RDF/XML (fallback).

- Added unit tests for LOAD in quads mode.

- RemoteRepository now uses a shared connection manager by default and will use the caller's connection manager if you use the appropriate constructor.
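The RDF/XML fallback for LOAD mentioned in the list above can be pictured as a lookup with a default. The ticket does not say how the format is otherwise determined, so the Python sketch below assumes, purely for illustration, that it is guessed from the file extension of the source URI. The extension table and the function name are hypothetical.

```python
import os

# Hypothetical extension table; the real mapping is not given in this ticket.
EXTENSION_FORMATS = {
    ".rdf": "RDF/XML",
    ".nt": "N-Triples",
    ".ttl": "Turtle",
    ".trig": "TriG",
    ".nq": "N-Quads",
}


def format_for(uri, fallback="RDF/XML"):
    """Guess an RDF format from the URI's extension, else use the fallback."""
    # Ignore any query string or fragment before looking at the extension.
    path = uri.split("?", 1)[0].split("#", 1)[0]
    extension = os.path.splitext(path)[1].lower()
    return EXTENSION_FORMATS.get(extension, fallback)
```

Under these assumptions, `format_for("http://example.org/data.ttl")` selects Turtle, while a URI with no recognized extension falls back to RDF/XML.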

Committed revision r6171.

Changed 15 months ago by thompsonbry

  • status changed from accepted to closed
  • resolution set to fixed

Basic SPARQL UPDATE support is complete. An issue has been filed for some optimizations [1].

[1] http://sourceforge.net/apps/trac/bigdata/ticket/522 (SPARQL UPDATE should not materialize RDF Values)
