Hi, I would like to express my gratitude to everyone who has put so much effort into developing Blazegraph. Without you, we would not be able to use such an excellent triplestore!
I am currently developing an embedded script using bigdata-bundle.jar and Sesame to run a SPARQL UPDATE with a federated query. The idea is to use the federated query to pull data from other endpoints and insert the results into a Blazegraph instance.
Since I need to pull a large amount of data, and those endpoints run Virtuoso, which returns at most 10,000 results per query, I plan to use a subquery and increase the OFFSET in my script to retrieve all of the data.
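The number of rounds needed is just a ceiling division of the total result count by the 10,000-row page size. A minimal, self-contained sketch (the class and method names here are illustrative, not part of any API):

```java
public class OffsetRounds {
    // Number of pages needed to fetch `count` rows at `pageSize` rows per page.
    static int rounds(int count, int pageSize) {
        return (count + pageSize - 1) / pageSize; // ceiling division for positive ints
    }

    public static void main(String[] args) {
        System.out.println(rounds(25000, 10000)); // 3 pages: OFFSET 0, 10000, 20000
        System.out.println(rounds(20000, 10000)); // 2 pages: OFFSET 0, 10000
    }
}
```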
The script actually runs without errors; however, the data is not being inserted into the triplestore.
I have attached my code (I suspect something is wrong in my executeUpdate() method, but I am not sure).
I couldn't find any sample code for embedded SPARQL UPDATE with Blazegraph, which is a pity. I really hope Blazegraph will provide a detailed programming tutorial like the one AllegroGraph has here, so that many people can benefit from it.
Thank you very much for your help!
Ray
The following is my code:
import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileNotFoundException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.Properties;
import java.util.Scanner;

import org.openrdf.query.BindingSet;
import org.openrdf.query.QueryLanguage;
import org.openrdf.query.TupleQuery;
import org.openrdf.query.TupleQueryResult;
import org.openrdf.repository.Repository;
import org.openrdf.repository.RepositoryConnection;

import com.bigdata.rdf.sail.BigdataSail;
import com.bigdata.rdf.sail.BigdataSailRepository;

public class NanoPubCon {

    // NanoPubCon class constructor
    public NanoPubCon() {}

    // load a properties file from the classpath
    public Properties loadProperties(String resource) throws Exception {
        Properties p = new Properties();
        InputStream is = NanoPubCon.class.getResourceAsStream(resource);
        p.load(new InputStreamReader(new BufferedInputStream(is)));
        return p;
    }

    // count how many interactions there are, before deciding how many times to advance the offset
    public int count(Repository repo, String query, QueryLanguage ql) throws Exception {
        /*
         * With MVCC, you read from a historical state to avoid blocking and
         * being blocked by writers. BigdataSailRepository.getQueryConnection
         * gives you a view of the repository at the last commit point.
         */
        RepositoryConnection cxn;
        if (repo instanceof BigdataSailRepository) {
            cxn = ((BigdataSailRepository) repo).getReadOnlyConnection();
        } else {
            cxn = repo.getConnection();
        }
        final TupleQuery tupleQuery = cxn.prepareTupleQuery(ql, query);
        // tupleQuery.setIncludeInferred(true /* includeInferred */);
        TupleQueryResult result = tupleQuery.evaluate();
        // read the single ?no binding from the result
        BindingSet bindingSet = result.next();
        result.close();
        // close the repository connection
        cxn.close();
        return Integer.parseInt(bindingSet.getValue("no").stringValue());
    }

    // sparql update method
    public void executeUpdate(Repository repo, String query, QueryLanguage ql) throws Exception {
        RepositoryConnection cxn = repo.getConnection();
        try {
            // sparql update
            cxn.prepareUpdate(ql, query).execute();
        } finally {
            cxn.close();
        }
    }

    // main method
    public static void main(final String[] args) {
        final String propertiesFile = "RWStore.properties";
        try {
            final NanoPubCon npc = new NanoPubCon();
            System.out.println("[INFO] Reading properties from file: " + propertiesFile);
            final Properties properties = npc.loadProperties(propertiesFile);
            // instantiate a sail
            final BigdataSail sail = new BigdataSail(properties);
            final Repository repo = new BigdataSailRepository(sail);
            try {
                repo.initialize();
                String queryCount = "/home/Desktop/nanopub converting script/drugbank/drugbank_count.sparql";
                String queryPart1 = "/home/Desktop/nanopub converting script/drugbank/drugbank_part1.sparql";
                String queryPart2 = "/home/Desktop/nanopub converting script/drugbank/drugbank_part2.sparql";
                try {
                    System.out.println("[INFO] Loading query from: " + queryCount);
                    Scanner scan = new Scanner(new File(queryCount));
                    String queryContent = scan.useDelimiter("\\Z").next();
                    scan.close();
                    int count = npc.count(repo, queryContent, QueryLanguage.SPARQL);
                    System.out.println(count);
                    System.out.println("[INFO] Loading query from: " + queryPart1);
                    System.out.println("[INFO] Loading query from: " + queryPart2);
                    Scanner scan1 = new Scanner(new File(queryPart1));
                    Scanner scan2 = new Scanner(new File(queryPart2));
                    String queryContent1 = scan1.useDelimiter("\\Z").next();
                    String queryContent2 = scan2.useDelimiter("\\Z").next();
                    scan1.close();
                    scan2.close();
                    // decide how many times the offset needs to be advanced
                    int offsetTimes;
                    if (count % 10000 == 0) offsetTimes = count / 10000;
                    else offsetTimes = count / 10000 + 1;
                    System.out.println("[INFO] Needs to run " + offsetTimes + " rounds");
                    for (int i = 0; i < offsetTimes; ++i) {
                        System.out.println("[INFO] Round " + i + " begins...");
                        // advance the OFFSET by the 10,000-row page size each round
                        String query = queryContent1 + i * 10000 + queryContent2;
                        npc.executeUpdate(repo, query, QueryLanguage.SPARQL);
                        System.out.println("       Round " + i + " done!");
                    }
                    System.out.println("[INFO] All is finished");
                } catch (FileNotFoundException e) {
                    e.printStackTrace();
                }
            } finally {
                repo.shutDown();
            }
        } catch (Exception ex) {
            ex.printStackTrace();
        }
    }
}
My query file looks like this:
PREFIX np: <http://www.nanopub.org/nschema#>
PREFIX iv: <http://bio2rdf.org/irefindex_vocabulary:>
PREFIX pav: <http://purl.org/pav/2.0/>
PREFIX sio: <http://semanticscience.org/resource/>
PREFIX void: <http://rdfs.org/ns/void#>
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dcterms: <http://purl.org/dc/terms/>
insert{
GRAPH ?nanopub
{
?nanopub a np:Nanopublication ;
np:hasAssertion ?assertionGraph ;
np:hasProvenance ?provenanceGraph ;
np:hasPublicationInfo ?publicationInfoGraph .
?assertionGraph a np:Assertion .
?provenanceGraph a np:Provenance .
?publicationInfoGraph a np:PublicationInfo .
}
GRAPH ?assertionGraph
{
?interaction a ?interaction_type ;
rdfs:label ?interaction_label ;
iv:interactor_a ?a ;
iv:interactor_b ?b ;
sio:has-participant ?a ;
sio:has-target ?b .
?a a ?a_type .
?a rdfs:label ?a_label .
?a_type dcterms:title ?a_type_title .
?b a ?b_type .
?b rdfs:label ?b_label .
?b_type dcterms:title ?b_type_title .
}
GRAPH ?provenanceGraph
{
?assertionGraph prov:wasGeneratedBy [
a prov:Activity, ?method
].
# specific paper (sio_000772: has evidence)
?assertionGraph sio:SIO_000772 ?article .
# specific source (sio_000253: has source)
?assertionGraph sio:SIO_000253 ?dataset .
}
GRAPH ?publicationInfoGraph
{
?nanopub pav:authoredBy <http://tw.rpi.edu/web/person/RuiYan> ;
pav:createdBy <http://tw.rpi.edu/web/person/RuiYan> ;
dcterms:created ?now ;
dcterms:rights <http://opendatacommons.org/licenses/odbl/1.0/> ;
dcterms:rightsHolder <http://tw.rpi.edu> .
}
}
where{
# bio2rdf iRefindex endpoint
service <http://cu.irefindex.bio2rdf.org/sparql> {
# openlife iRefindex endpoint
#service <http://beta.openlifedata.org/irefindex/sparql> {
select distinct ?interaction ?interaction_type ?interaction_label ?a ?a_type ?a_type_title ?b ?b_type ?b_type_title ?method ?article ?dataset
where {
{
select distinct *
where {
# info of interaction itself
?interaction a ?interaction_type .
?interaction rdfs:label ?interaction_label .
FILTER REGEX(STR(?interaction_type), "^.*?(?<!Resource)$", "i")
BIND(iri(concat(str(?interaction),"_nanopub")) as ?nanopub)
BIND(iri(concat(str(?nanopub),"_assertion")) as ?assertionGraph)
BIND(iri(concat(str(?nanopub),"_provenance")) as ?provenanceGraph)
BIND(iri(concat(str(?nanopub),"_publicationInfo")) as ?publicationInfoGraph)
BIND(NOW() as ?now)
}
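Since the update is split into drugbank_part1.sparql and drugbank_part2.sparql with the OFFSET value concatenated between them each round, the assembly looks conceptually like this (the part contents below are abbreviated placeholders, not the real files):

```java
public class QueryAssembly {
    public static void main(String[] args) {
        // Hypothetical, abbreviated contents of the two query parts: the
        // script concatenates part1 + (i * pageSize) + part2 each round.
        String part1 = "... } LIMIT 10000 OFFSET ";
        String part2 = " } ...";
        int pageSize = 10000;
        for (int i = 0; i < 3; i++) {
            String query = part1 + i * pageSize + part2;
            System.out.println(query);
        }
    }
}
```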
Ray,
Thanks for your interest. I think you are failing to call cxn.commit() in
your executeUpdate() method, so the changes are being discarded when you
close() the update connection.
Thanks,
Bryan
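To illustrate the point, here is a toy, self-contained model of a transactional connection (the ToyConnection class is hypothetical, not the Sesame/Blazegraph API): an update buffered on the connection vanishes when close() runs without commit().

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-in for a transactional connection; NOT the real
// RepositoryConnection API, just an illustration of commit vs. close.
public class ToyConnection {
    private final List<String> store;               // "committed" state
    private final List<String> pending = new ArrayList<>();

    ToyConnection(List<String> store) { this.store = store; }

    void executeUpdate(String update) { pending.add(update); } // buffered only
    void commit() { store.addAll(pending); pending.clear(); }  // make durable
    void close()  { pending.clear(); }                         // uncommitted work is dropped

    public static void main(String[] args) {
        List<String> store = new ArrayList<>();

        ToyConnection c1 = new ToyConnection(store);
        c1.executeUpdate("INSERT DATA { ... }");
        c1.close();                                 // no commit -> discarded
        System.out.println(store.size());           // prints 0

        ToyConnection c2 = new ToyConnection(store);
        c2.executeUpdate("INSERT DATA { ... }");
        c2.commit();                                // commit before close
        c2.close();
        System.out.println(store.size());           // prints 1
    }
}
```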
Thank you very much Bryan,
Yes, it worked after I added cxn.commit() after cxn.prepareUpdate(ql, query).execute();
Thank you very much!
Also, I would like to reaffirm that we users are hoping for tutorial code that introduces every feature of the Blazegraph Java API, like the AllegroGraph one here.
Ray,
Thank you. We have a new Blazegraph user's manual in the works. You
should see it around the time of the 1.5.2 release, toward the end of this
month.
Improved Sesame documentation and support is a focus of this release.
Thanks, --Brad
On Fri, Apr 3, 2015 at 5:04 PM, Ray raymondino@users.sf.net wrote: