From: Bryan T. <br...@sy...> - 2015-04-24 16:35:32
Rick,

I would recommend that you model your problem a bit differently. I will give some suggestions as to how you might do this, but first let me explain how we handle such moves and storage reclamation.

- Blazegraph does a COPY + DELETE for SPARQL UPDATE "MOVE". You might be able to hack this. I will outline how below.
- Blazegraph recycles storage. This is documented in some depth on the wiki, but the basic concept is that allocation slots are recycled once they no longer have data that is visible from a retained commit point.

Let me suggest some ways in which you might achieve your goals without a performance penalty. As I see it, you are basically trying to change the state associated with a named graph as you move it along in some workflow.

The first two options would require you to manage metadata (in yet another graph) mapping workflow state URIs onto fixed URIs associated with a named graph. When you change the workflow state, you are just changing the mapping between the external URI and the fixed URI naming the graph internally. Either of these approaches would give you constant time "renames".

1. Use named graphs. But, per above, do the rename outside of the quads store. You can either use a special named graph to hold this mapping or you can have yet another graph in the database that has this mapping. We even have support for "virtual graphs" that might let you do this out of the box. See http://wiki.blazegraph.com/wiki/index.php/VirtualGraphs

2. Use multiple triple stores. SPARQL "quads" (named graphs) provide the ability to transparently query across the named graphs, either extracting their identifiers (using the GRAPH keyword for a named graph access path) or collapsing duplicate statements onto distinct statements (for a default graph access path). If each of these named graphs is really just being used as its own triple store, then you can have many different triple stores in a single Blazegraph instance.
Just put each one into its own namespace.

The 3rd approach is more in the spirit of hacking the rename.

3. Hack the rename. OK, you effectively want to change the name of the graph. Internally each statement in a graph has an IV (Internal Value) in the 4th position of the statement tuple that is the graph identifier. If you need to modify those IVs, then you are going to touch a lot of data. That is not a constant time operation. The alternative is to hack the dictionary. You would *replace* the entry in the TERM2ID dictionary (mapping the URI onto an IV) with a different entry mapping the new URI onto the same IV. You would also update the reverse lookup (in ID2TERM). The old URI will "disappear". The new URI will be mapped to the data associated with the old URI. This would be a constant time operation. However, it WILL NOT work if the new URI is already defined, since it would then orphan any data associated with the IV for the new URI. If your URIs are always new when you do this "rename" then you could use this mechanism. We could not make this a general purpose rename. We could perhaps do this rename if some clever code could prove that the new URI was not pre-existing. Either we or you could implement this as a special operator for your application.

Let me know if you want to set up a telecon to discuss any of this.

Thanks,
Bryan

----
Bryan Thompson
Chief Scientist & Founder
SYSTAP, LLC
4501 Tower Road
Greensboro, NC 27410
br...@sy...
http://blazegraph.com
http://blog.bigdata.com <http://bigdata.com>
http://mapgraph.io

Blazegraph™ <http://www.blazegraph.com/> is our ultra high-performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. MapGraph™ <http://www.systap.com/mapgraph> is our disruptive new technology to use GPUs to accelerate data-parallel graph analytics.

CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP.
Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments.

On Fri, Apr 24, 2015 at 7:51 AM, Rick Moynihan <ri...@sw...> wrote:
> Hi all,
>
> We've recently been evaluating quad-stores, and in particular are looking
> for better storage layers, and Blazegraph looks like a promising option.
>
> We have a linked data management system, which has several management
> workflows whereby:
>
> 1. large named graphs can be moved around (renamed via a SPARQL Update
> MOVE command).
>
> 2. large named graphs can be inserted, reviewed, deleted (repaired
> offline) and reinserted again before finally being approved.
>
> With this workflow there are two problems we have been finding with some
> of the other quad stores:
>
> The first is that renames are often implemented as a copy/delete; which
> results in a slow linear-time (or worse) operation. Ideally renaming
> graphs would be constant time.
>
> The second problem we have been encountering (which the first can
> compound) is that some stores don't free storage on deletions, and don't
> even have a mechanism for expunging deletions without taking the database
> offline.
>
> I'm curious as to what Blazegraph's behaviour is in these two
> circumstances, and whether or not the different journals have different
> behaviours.
>
> Many thanks,
>
> R.
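For illustration only, the dictionary swap described as option 3 in the reply above can be modelled with two in-memory maps. This is a minimal sketch, not Blazegraph code: the names TermDictionary, rename and graph_size, the example.org URIs, and the use of plain Python dicts and integer IVs are all assumptions made for the example. It only shows the bookkeeping: the statements keep the same IV in the 4th position, the forward and reverse entries are swapped in constant time, and the swap is refused when the new URI is already defined.

```python
class TermDictionary:
    """Toy stand-in for a forward (TERM2ID) and reverse (ID2TERM) lookup pair."""

    def __init__(self):
        self.term2id = {}   # URI -> IV
        self.id2term = {}   # IV -> URI
        self._next_iv = 1

    def add(self, uri):
        """Return the IV for uri, assigning a fresh one if it is unseen."""
        if uri not in self.term2id:
            iv = self._next_iv
            self._next_iv += 1
            self.term2id[uri] = iv
            self.id2term[iv] = uri
        return self.term2id[uri]

    def rename(self, old_uri, new_uri):
        """Point new_uri at the IV of old_uri without touching any statements."""
        if old_uri not in self.term2id:
            raise KeyError(f"unknown URI: {old_uri}")
        if new_uri in self.term2id:
            # Swapping here would orphan the data stored under new_uri's IV.
            raise ValueError(f"URI already defined: {new_uri}")
        iv = self.term2id.pop(old_uri)   # the old URI "disappears"
        self.term2id[new_uri] = iv       # forward entry for the new URI
        self.id2term[iv] = new_uri       # reverse lookup updated
        return iv


def graph_size(dictionary, quads, uri):
    """Count statements whose 4th position holds the IV mapped to uri."""
    iv = dictionary.term2id.get(uri)
    if iv is None:
        return 0
    return sum(1 for quad in quads if quad[3] == iv)


# Hypothetical data: two statements in a "draft" graph.
DRAFT = "http://example.org/graph/draft"
LIVE = "http://example.org/graph/live"

terms = TermDictionary()
draft_iv = terms.add(DRAFT)
quads = {("s1", "p1", "o1", draft_iv), ("s2", "p2", "o2", draft_iv)}

terms.rename(DRAFT, LIVE)   # two dict updates, independent of graph size
```

After the rename, graph_size(terms, quads, LIVE) counts both statements and graph_size(terms, quads, DRAFT) counts none, while the quads set itself is unchanged. Calling rename with a target URI that is already in the dictionary raises ValueError, which mirrors the caveat that the mechanism only works when the new URI is always new.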