From: Bryan T. <br...@sy...> - 2016-02-16 16:17:55
Jeremy,

The bulk data loader will not help with that scenario. It is designed for high-throughput loading. It can be used concurrently with queries, but it cannot really be mixed with concurrent small updates.

In general, mixing concurrent small updates and large updates does not work well. Updates against a single graph must be serialized using the unisolated connection, so at some point the small updates will block waiting for the large update.

Thanks,
Bryan

----
Bryan Thompson
Chief Scientist & Founder
Blazegraph
e: br...@bl...
w: http://blazegraph.com

Blazegraph products help to solve the Graph Cache Thrash to achieve large scale processing for graph and predictive analytics. Blazegraph is the creator of the industry’s first GPU-accelerated high-performance database for large graphs, and has been named as one of the “10 Companies and Technologies to Watch in 2016” <http://insideanalysis.com/2016/01/20535/>.

Blazegraph Database <https://www.blazegraph.com/> is our ultra-high performance graph database that supports both RDF/SPARQL and Tinkerpop/Blueprints APIs. Blazegraph GPU <https://www.blazegraph.com/product/gpu-accelerated/> and Blazegraph DAS <https://www.blazegraph.com/product/gpu-accelerated/> are disruptive new technologies that use GPUs to enable extreme scaling that is thousands of times faster and 40 times more affordable than CPU-based solutions.

CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP, LLC DBA Blazegraph. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments.
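
The blocking behavior described in the reply can be illustrated with a toy model. This is a minimal sketch, not Blazegraph code: the function name `run_serialized`, the update names, and the durations are all hypothetical, and a single-worker thread pool merely stands in for the one unisolated (writer) connection.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def run_serialized(updates):
    """Run (name, duration_seconds) updates through one writer.

    Returns the names in the order the updates completed.
    """
    completed = []

    def apply_update(name, duration):
        time.sleep(duration)  # stand-in for the real write (assumption)
        completed.append(name)

    # max_workers=1 models a single writer: queued updates run one at a
    # time, in submission order, so nothing overtakes a long update.
    with ThreadPoolExecutor(max_workers=1) as writer:
        for name, duration in updates:
            writer.submit(apply_update, name, duration)
    return completed


if __name__ == "__main__":
    # The small updates are submitted after the large one and must wait for it.
    print(run_serialized([("large-delete", 0.2), ("small-1", 0.01), ("small-2", 0.01)]))
```

In this model the small updates finish only after the large one, which is why scheduling large deletes and small inserts in separate windows is a reasonable conservative choice.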

On Tue, Feb 16, 2016 at 11:09 AM, Jeremy J Carroll <jj...@sy...> wrote:

> See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load
>
> That looks very interesting:
>
> I read:
>
> "Parsing, insert, and removal on the database are now decoupled from the
> index writes"
>
> One behavior we have is that we have small inserts concurrent with other
> activity (typically but not exclusively read activity). Does the
> enhanced configurability in 2.0 give us options that may allow us to
> improve performance of these writes.
>
> E.g. this week we have many (millions? at least hundreds of thousands) of
> such small writes (10 - 100 quads) and we also are trying to delete 25
> million quads using about 100 delete/insert requests (that I take to be not
> impacted by this change). I am currently suggesting we should do one or the
> other at any one time, and not try to mix: but frankly I am guessing, and
> guessing conservatively. We have to maintain an always-on read
> performance at the same time. Total store size approx 3 billion.
>
> [Unfortunately this machine is still a 1.5.3 machine, but for future
> reference I am trying to have better sense of how to organize such activity]
>
> Jeremy
>
> On Feb 16, 2016, at 7:55 AM, Bryan Thompson <br...@sy...> wrote:
>
> 2.0 includes support for bulk data load with a number of interesting
> features, including durable queue patterns, folders, etc. See
> https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load
>
> ----
> Bryan Thompson
> Chief Scientist & Founder
> Blazegraph
> e: br...@bl...
> w: http://blazegraph.com
>
> On Tue, Feb 16, 2016 at 10:40 AM, Jeremy J Carroll <jj...@sy...> wrote:
>
>> On Feb 15, 2016, at 10:42 PM, Joakim Soderberg <
>> joa...@bl...> wrote:
>>
>> Has anyone succeeded to load a folder of .nt files? I can load one by one:
>>
>> LOAD <file:///mydata/dbpedia2015/core/amsterdammuseum_links.nt> INTO
>> GRAPH <http://dbpedia2015>
>>
>> But it doesn’t like a folder name
>> LOAD <file:///mydata/dbpedia2015/core/> INTO GRAPH <http://dbpedia2015>
>>
>> That is correct. If you look at the spec for LOAD:
>> https://www.w3.org/TR/sparql11-update/#load
>> then it takes an IRI as where you are loading from, and the concept of
>> folder is simply not applicable.
>> A few schemes such as file: and ftp: may have such a notion, but the
>> operation you are looking for is local to your machine on the client and
>> you should probably implement it yourself.
>>
>> In particular, do you want each file loaded into a different graph or the
>> same graph: probably best for you to make up your own mind.
>>
>> I have had success loading trig files into multiple graphs, using a
>> simple POST to the endpoint.
>>
>> Jeremy
>>
>> _______________________________________________
>> Bigdata-developers mailing list
>> Big...@li...
>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers
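
Since LOAD takes a single IRI rather than a folder, one way to follow the "implement it yourself" suggestion in the thread is to generate one LOAD statement per file on the client. The following is a minimal sketch under stated assumptions: the helper name `load_statements` is hypothetical, all files go into the same graph, and the generated file: IRIs must be readable at that path by whichever process resolves them. Sending the statements to the SPARQL endpoint is left out because the endpoint URL is deployment-specific.

```python
from pathlib import Path


def load_statements(folder, graph_iri, pattern="*.nt"):
    """Build one SPARQL 1.1 Update LOAD statement per matching file.

    Files are taken in sorted order so the result is reproducible.
    """
    statements = []
    for path in sorted(Path(folder).glob(pattern)):
        # as_uri() needs an absolute path, hence resolve().
        file_iri = path.resolve().as_uri()
        statements.append(f"LOAD <{file_iri}> INTO GRAPH <{graph_iri}>")
    return statements


if __name__ == "__main__":
    # Folder and graph IRI taken from the question in the thread.
    for statement in load_statements("/mydata/dbpedia2015/core", "http://dbpedia2015"):
        print(statement)
```

To load each file into its own graph instead, the graph IRI could be derived from the file name inside the loop; that choice is left to the caller, as the reply suggests.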