From: Jeremy J C. <jj...@sy...> - 2015-08-28 16:05:07
One large graph in quads mode, sounds like we lose.

Jeremy

> On Aug 28, 2015, at 8:34 AM, Bryan Thompson <br...@sy...> wrote:
>
> It only impacts DELETE/INSERT + WHERE.
>
> DROP GRAPH for quads needs to write on the indices to remove all edges in the graph from all 6 quads indices. So it would be close to the cost of writing 1B quads.
>
> DROP ALL could be optimized. The minimum work to drop an index is to visit all nodes so the allocation slots of the children can be freed. BTree.removeAll() does implement that optimization.
>
> Are you trying to drop all quads or just one large graph?
>
> Bryan
>
> ----
> Bryan Thompson
> Chief Scientist & Founder
> SYSTAP, LLC
>
> On Fri, Aug 28, 2015 at 11:15 AM, Jeremy J Carroll <jj...@sy...> wrote:
> Sorry for the delayed response, I have been on paternity leave.
>
> I take it that this impacts usage via NSS, and the fix is likely in the next blazegraph release?
>
> We believe we have seen this performance issue.
>
> May this also impact e.g. DROP GRAPH?
> (We have been having surprisingly slow performance DROPping a billion triple graph, via NSS)
>
> Jeremy
>
>> On Aug 7, 2015, at 6:09 AM, Bryan Thompson <br...@sy...> wrote:
>>
>> We found a bug in the openrdf library that is having a very strong negative impact on SPARQL UPDATE performance for larger UPDATE sets. The root cause is MutableTupleQueryResult using LinkedList.get(index), which is a linear scan, for next(). So the iterator performance falls off linearly as the scan progresses.
>>
>> For the impatient, there is a very simple fix:
>>
>> 1. Clone the MutableTupleQueryResult class in openrdf into a new namespace in blazegraph.
>> 2. Replace LinkedList with ArrayList (one line change).
>> 3. Import that modified version of the class in our AST2BOpUpdate class (one line change).
>>
>> Michael is testing the performance impact of that fix now.
>>
>> See https://jira.blazegraph.com/browse/BLZG-1404
>>
>> Thanks,
>> Bryan
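
For readers who want to see the effect described in the quoted Aug 7 message, here is a minimal sketch of the access pattern, not the openrdf code itself. It assumes a cursor whose next() calls list.get(index), as the message describes. IndexedCursor, fill, scan and time are hypothetical names made up for this illustration, and the list size is arbitrary. On a LinkedList each get(index) has to walk nodes from the nearer end of the list, so a full scan does roughly quadratic work; on an ArrayList each get(index) is a constant-time array lookup.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

public class IndexedScanSketch {

    /** Hypothetical cursor mimicking the described pattern: next() calls list.get(index). */
    static final class IndexedCursor<T> {
        private final List<T> items;
        private int index = 0;

        IndexedCursor(List<T> items) {
            this.items = items;
        }

        boolean hasNext() {
            return index < items.size();
        }

        // Constant time on ArrayList; on LinkedList this walks the node chain to reach the index.
        T next() {
            return items.get(index++);
        }
    }

    /** Appends 0..n-1 to the given list and returns it. */
    static List<Integer> fill(List<Integer> target, int n) {
        for (int i = 0; i < n; i++) {
            target.add(i);
        }
        return target;
    }

    /** Drains an IndexedCursor over the list and returns the sum of its elements. */
    static long scan(List<Integer> items) {
        IndexedCursor<Integer> cursor = new IndexedCursor<>(items);
        long sum = 0;
        while (cursor.hasNext()) {
            sum += cursor.next();
        }
        return sum;
    }

    /** Times one full scan and prints the elapsed wall-clock time. */
    static void time(String label, List<Integer> items) {
        long start = System.nanoTime();
        long sum = scan(items);
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(label + ": sum=" + sum + ", " + millis + " ms");
    }

    public static void main(String[] args) {
        int n = 50_000; // arbitrary size, chosen only for illustration
        time("LinkedList", fill(new LinkedList<>(), n));
        time("ArrayList", fill(new ArrayList<>(), n));
    }
}
```

Both scans return the same sum; only the elapsed time should differ, and the gap should widen as n grows. Actual timings depend on the JVM and machine, so none are quoted here.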