From: Jeremy J C. <jj...@sy...> - 2016-02-22 17:04:32
Try looking on the status tab of the Blazegraph UI in the browser. In the detail view of your particular task, there may be a counter showing how many triples have been updated. (I am unsure which tasks support this under which versions.)

Jeremy

> On Feb 17, 2016, at 12:26 PM, Brad Bebee <be...@bl...> wrote:
>
> Joakim,
>
> With the DataLoader, the commit happens after all of the data is loaded. Once the load is complete, all of the statements will be visible.
>
> Thanks, --Brad
>
> On Wed, Feb 17, 2016 at 3:21 PM, Joakim Soderberg <joa...@bl...> wrote:
>
> I am calling:
>
> curl -X POST --data-binary @dataloader.xml --header 'Content-Type:application/xml' http://__.__.__:9999/blazegraph/dataloader
>
> I can see the size of the JNL file increasing, but when I query the number of statements in the dashboard the data doesn't show up:
>
> select (count(*) as ?num) { ?s ?p ?o }
>
> Do I need to flush the StatementBuffer to the backing store after the curl?
>
> This is my config file:
>
> <?xml version="1.0" encoding="UTF-8" standalone="no"?>
> <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
> <properties>
> <!-- RDF format (default is RDF/XML) -->
> <entry key="format">N-Triples</entry>
> <!-- Base URI (optional) -->
> <entry key="baseURI"></entry>
> <!-- Default graph URI (optional; required for a quads-mode namespace) -->
> <entry key="defaultGraph"></entry>
> <!-- Suppress all stdout messages (optional) -->
> <entry key="quiet">false</entry>
> <!-- Show additional messages detailing the load performance (optional) -->
> <entry key="verbose">3</entry>
> <!-- Compute the RDF(S)+ closure (optional) -->
> <entry key="closure">false</entry>
> <!-- Files will be renamed to either .good or .fail as they are processed.
>      The files will remain in the same directory. -->
> <entry key="durableQueues">true</entry>
> <!-- The namespace of the KB instance. Defaults to kb. -->
> <entry key="namespace">kb</entry>
> <!-- The configuration file for the database instance. It must be readable by the web application. -->
> <entry key="propertyFile">RWStore.properties</entry>
> <!-- Zero or more files or directories containing the data to be loaded.
>      This should be a comma-delimited list. The files must be readable by the web application. -->
> <entry key="fileOrDirs">/mydata/dbpedia2015/core/</entry>
> </properties>
>
>> On Feb 16, 2016, at 8:35 AM, Joakim Soderberg <joa...@bl...> wrote:
>>
>> I knew there was a DataLoader class, but I wasn't aware it was available as a service in the NanoSparqlServer. I will try it immediately.
>>
>> Thanks
>> Joakim
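As a quick command-line check of the point above (the DataLoader performs a single commit at the end of the load, so the count only changes once the load completes), the count query from this thread can be run against the SPARQL endpoint directly. A minimal sketch, assuming the default kb namespace, the standard /blazegraph/namespace/kb/sparql path, and a server on localhost:9999:

    # Hypothetical check: run the COUNT query via the SPARQL protocol (GET).
    # The number should stay flat while the DataLoader runs and jump once
    # its final commit goes through.
    curl -G \
         -H 'Accept: application/sparql-results+json' \
         --data-urlencode 'query=SELECT (COUNT(*) AS ?num) WHERE { ?s ?p ?o }' \
         'http://localhost:9999/blazegraph/namespace/kb/sparql'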
>>> On Feb 16, 2016, at 8:09 AM, Jeremy J Carroll <jj...@sy...> wrote:
>>>
>>>> See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load
>>>
>>> That looks very interesting.
>>>
>>> I read:
>>>
>>> "Parsing, insert, and removal on the database are now decoupled from the index writes"
>>>
>>> One behavior we have is small inserts concurrent with other activity (typically, but not exclusively, read activity). Does the enhanced configurability in 2.0 give us options that may allow us to improve the performance of these writes?
>>>
>>> E.g. this week we have many (millions? at least hundreds of thousands) of such small writes (10-100 quads), and we are also trying to delete 25 million quads using about 100 delete/insert requests (which I take to be unaffected by this change). I am currently suggesting we should do one or the other at any one time, and not try to mix them; but frankly I am guessing, and guessing conservatively. We have to maintain always-on read performance at the same time. Total store size is approx. 3 billion.
>>>
>>> [Unfortunately this machine is still a 1.5.3 machine, but for future reference I am trying to get a better sense of how to organize such activity.]
>>>
>>> Jeremy
>>>
>>>> On Feb 16, 2016, at 7:55 AM, Bryan Thompson <br...@sy...> wrote:
>>>>
>>>> 2.0 includes support for bulk data load with a number of interesting features, including durable queue patterns, folders, etc. See https://wiki.blazegraph.com/wiki/index.php/REST_API#Bulk_Data_Load
>>>>
>>>> ----
>>>> Bryan Thompson
>>>> Chief Scientist & Founder
>>>> Blazegraph
>>>> e: br...@bl...
>>>> w: http://blazegraph.com
>>>>
>>>> On Tue, Feb 16, 2016 at 10:40 AM, Jeremy J Carroll <jj...@sy...> wrote:
>>>>
>>>>> On Feb 15, 2016, at 10:42 PM, Joakim Soderberg <joa...@bl...> wrote:
>>>>>
>>>>> Has anyone succeeded in loading a folder of .nt files? I can load them one by one:
>>>>>
>>>>> LOAD <file:///mydata/dbpedia2015/core/amsterdammuseum_links.nt> INTO GRAPH <http://dbpedia2015>
>>>>>
>>>>> But it doesn't like a folder name:
>>>>>
>>>>> LOAD <file:///mydata/dbpedia2015/core/> INTO GRAPH <http://dbpedia2015>
>>>>
>>>> That is correct. If you look at the spec for LOAD, https://www.w3.org/TR/sparql11-update/#load, it takes an IRI for the source you are loading from, and the concept of a folder is simply not applicable.
>>>>
>>>> A few schemes such as file: and ftp: may have such a notion, but the operation you are looking for is local to the client machine, and you should probably implement it yourself.
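To make that client-side loop concrete, a minimal sketch, assuming a server on localhost:9999, the default kb namespace, and that the server accepts N-Triples posted as text/plain; the context-uri parameter used here to name the target graph is taken from the REST API's insert-with-body description and should be checked against your version:

    # Hypothetical sketch: POST each N-Triples file in the folder to the
    # NanoSparqlServer SPARQL endpoint, targeting the <http://dbpedia2015>
    # graph used earlier in the thread.
    for f in /mydata/dbpedia2015/core/*.nt; do
      curl -X POST \
           -H 'Content-Type: text/plain' \
           --data-binary "@$f" \
           'http://localhost:9999/blazegraph/namespace/kb/sparql?context-uri=http://dbpedia2015'
    done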
>>>> In particular, do you want each file loaded into a different graph or into the same graph? Probably best for you to make up your own mind.
>>>>
>>>> I have had success loading TriG files into multiple graphs using a simple POST to the endpoint.
>>>>
>>>> Jeremy
>
> --
> _______________
> Brad Bebee
> CEO
> Blazegraph
> e: be...@bl...
> m: 202.642.7961
> w: www.blazegraph.com
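For reference, Jeremy's "simple POST to the endpoint" for a TriG file might look like the following sketch; the Content-Type and endpoint path are assumptions, and data.trig is a placeholder filename, so adjust for your namespace and Blazegraph version:

    # Hypothetical sketch: load a TriG file with a single POST; the named
    # graphs come from the data itself rather than from a request parameter.
    curl -X POST \
         -H 'Content-Type: application/x-trig' \
         --data-binary @data.trig \
         'http://localhost:9999/blazegraph/namespace/kb/sparql'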