
Bigdata Sesame SPARQL Update Won't Work

Help
Ray
2015-04-03
  • Ray

    Ray - 2015-04-03

    Hi, I would like to express my gratitude to all of you who have put so much effort into developing Blazegraph. Without you, we wouldn't be able to use such an excellent triplestore!

    I am currently developing an embedded script using bigdata-bundle.jar and Sesame to run a SPARQL UPDATE with a federated query. The idea is to use the federated query to pull data from other endpoints and insert the results into a Blazegraph instance.

    Since I need to pull a lot of data, and those endpoints are deployed with Virtuoso, which returns at most 10000 results per query, I plan to use a subquery and increase the offset in my script to retrieve all the data.

    The script actually runs fine, HOWEVER, the data is not being inserted into the triplestore.

    I have attached my code (I presume something goes wrong in my executeUpdate() method, but I am not sure).

    I couldn't find any sample code for embedded SPARQL UPDATE with Blazegraph, which is sad. So I really hope that Blazegraph will provide a detailed programming tutorial like the one AllegroGraph provides here, so that a lot of people can benefit from it.

    Thank you very much for your help!
    Ray

    The following is my code:
    public class NanoPubCon {

    // NanoPubCon class constructor
    public NanoPubCon() {}
    
    // load properties file
    public Properties loadProperties(String resource) throws Exception {
        Properties p = new Properties();
        InputStream is = NanoPubCon.class.getResourceAsStream(resource);
        p.load(new InputStreamReader(new BufferedInputStream(is)));
        return p;
    }
    
    // count how many interactions there are, to decide how many times to advance the offset
    public int count(Repository repo, String query, 
        QueryLanguage ql) throws Exception {
    
        /*
         * With MVCC, you read from a historical state to avoid blocking and
         * being blocked by writers.  BigdataSailRepository.getQueryConnection
         * gives you a view of the repository at the last commit point.
         */
        RepositoryConnection cxn;
        if (repo instanceof BigdataSailRepository) { 
            cxn = ((BigdataSailRepository) repo).getReadOnlyConnection();
        } else {
            cxn = repo.getConnection();
        }
        final TupleQuery tupleQuery = cxn.prepareTupleQuery(ql, query);
        //tupleQuery.setIncludeInferred(true /* includeInferred */);
        TupleQueryResult result = tupleQuery.evaluate();
        // read the single result row containing the count
        BindingSet bindingSet = result.next();
        // close the repository connection
        cxn.close();
        return Integer.parseInt(bindingSet.getValue("no").stringValue());        
    }
    
    // sparql update method
    public void executeUpdate(Repository repo, String query, 
            QueryLanguage q1) throws Exception{
    
        RepositoryConnection cxn;        
        cxn = repo.getConnection();
    
        try {
            // sparql update
            cxn.prepareUpdate(q1, query).execute();    
        } finally {
            cxn.close();
        }
    }
    
    // main method
    public static void main(final String[] args){
        final String propertiesFile = "RWStore.properties";
        try{            
            final NanoPubCon npc = new NanoPubCon();
            System.out.println("[INFO] Reading properties from file: " + propertiesFile);
            final Properties properties = npc.loadProperties(propertiesFile);
    
            // instantiate a sail
            final BigdataSail sail = new BigdataSail(properties);
            final Repository repo = new BigdataSailRepository(sail);
    
            try{
                repo.initialize();          
                String queryCount = "/home/Desktop/nanopub converting script/drugbank/drugbank_count.sparql";
                String queryPart1 = "/home/Desktop/nanopub converting script/drugbank/drugbank_part1.sparql";
                String queryPart2 = "/home/Desktop/nanopub converting script/drugbank/drugbank_part2.sparql";   
                try {
                    System.out.println("[INFO] Loading query from : " + queryCount);
                    Scanner scan = new Scanner(new File(queryCount));
                    String queryContent = scan.useDelimiter("\\Z").next();
                    scan.close();
    
                    int count = npc.count(repo, queryContent, QueryLanguage.SPARQL);
                    System.out.println(count);
    
                    System.out.println("[INFO] Loading query from : " + queryPart1);
                    System.out.println("[INFO] Loading query from : " + queryPart2);
                    Scanner scan1 = new Scanner(new File(queryPart1));
                    Scanner scan2 = new Scanner(new File(queryPart2));
                    String queryContent1 = scan1.useDelimiter("\\Z").next();        
                    String queryContent2 = scan2.useDelimiter("\\Z").next();
                    scan1.close();
                    scan2.close();
    
                    // compute how many rounds (offset increments) are needed
                    int offsetTimes = 0;
                    if(count % 10000 == 0)
                        offsetTimes = count / 10000;
                    else
                        offsetTimes = count / 10000 + 1;
                    System.out.println("[INFO] Needs to run " + offsetTimes + " rounds");
    
                    for(int i=0; i < offsetTimes; ++i){
                        System.out.println("[INfO] Round " + i + " begins...");
                        // advance the offset by one 10000-row page per round
                        String query = queryContent1 + i*10000 + queryContent2;
                        npc.executeUpdate(repo, query, QueryLanguage.SPARQL);
                        System.out.println("       Round " + i + " done !");
                    }
                    System.out.println("[INFO] All is finished");
    
                } catch (FileNotFoundException e){
                    e.printStackTrace();
                }               
            } finally {
                repo.shutDown();
            }           
        } catch (Exception ex){
            ex.printStackTrace();
        }
    }
    

    }

    My query file looks like this:
    PREFIX np: <http://www.nanopub.org/nschema#>
    PREFIX iv: <http://bio2rdf.org/irefindex_vocabulary:>
    PREFIX pav: <http://purl.org/pav/2.0/>
    PREFIX sio: <http://semanticscience.org/resource/>
    PREFIX void: <http://rdfs.org/ns/void#>
    PREFIX prov: <http://www.w3.org/ns/prov#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX dcterms: <http://purl.org/dc/terms/>

    insert{
    GRAPH ?nanopub
    {
    ?nanopub a np:Nanopublication ;
    np:hasAssertion ?assertionGraph ;
    np:hasProvenance ?provenanceGraph ;
    np:hasPublicationInfo ?publicationInfoGraph .
    ?assertionGraph a np:Assertion .
    ?provenanceGraph a np:Provenance .
    ?publicationInfoGraph a np:PublicationInfo .
    }
    GRAPH ?assertionGraph
    {
    ?interaction a ?interaction_type ;
    rdfs:label ?interaction_label ;
    iv:interactor_a ?a ;
    iv:interactor_b ?b ;
    sio:has-participant ?a ;
    sio:has-target ?b .
    ?a a ?a_type .
    ?a rdfs:label ?a_label .
    ?a_type dcterms:title ?a_type_title .
    ?b a ?b_type .
    ?b rdfs:label ?b_label .
    ?b_type dcterms:title ?b_type_title .
    }
    GRAPH ?provenanceGraph
    {
    ?assertionGraph prov:wasGeneratedBy [
    a prov:Activity, ?method
    ]
    .
    # specific paper (sio_000772: has evidence)
    ?assertionGraph sio:SIO_000772 ?article .
    # specific source (sio_000253: has source)
    ?assertionGraph sio:SIO_000253 ?dataset .
    }
    GRAPH ?publicationInfoGraph
    {
    ?nanopub pav:authoredBy <http://tw.rpi.edu/web/person/RuiYan> ;
    pav:createdBy <http://tw.rpi.edu/web/person/RuiYan> ;
    dcterms:created ?now ;
    dcterms:rights <http://opendatacommons.org/licenses/odbl/1.0/> ;
    dcterms:rightsHolder <http://tw.rpi.edu> .
    }
    }

    where{
    # bio2rdf iRefindex endpoint
    service <http://cu.irefindex.bio2rdf.org/sparql> {
    # openlife iRefindex endpoint
    #service <http://beta.openlifedata.org/irefindex/sparql> {
    select distinct ?interaction ?interaction_type ?interaction_label ?a ?a_type ?a_type_title ?b ?b_type ?b_type_title ?method ?article ?dataset
    where {
    {
    select distinct *
    where {
    # info of interaction itself
    ?interaction a ?interaction_type .
    ?interaction rdfs:label ?interaction_label .
    FILTER REGEX(?interaction_type, "^.*?(?<!Resource)$", "i")

                # info of interaction a
                ?interaction iv:interactor_a ?a .
                ?a a ?a_type .
                ?a_type dcterms:title ?a_type_title .
                #info of interaction b
                ?interaction iv:interactor_b ?b .
                ?b a ?b_type .
                ?b_type dcterms:title ?b_type_title .
    
                #provenance
                ?interaction iv:method ?method .
                ?interaction iv:article ?article .
                ?interaction void:inDataset ?dataset .
             }
             order by ?interaction
          }             
        }
       limit 10000
       offset  0
     }
    

    BIND(iri(concat(str(?interaction),"_nanopub")) as ?nanopub)
    BIND(iri(concat(str(?nanopub),"_assertion")) as ?assertionGraph)
    BIND(iri(concat(str(?nanopub),"_provenance")) as ?provenanceGraph)
    BIND(iri(concat(str(?nanopub),"_publicationInfo")) as ?publicationInfoGraph)
    BIND(NOW() as ?now)
    }

     
  • Bryan Thompson

    Bryan Thompson - 2015-04-03

    Ray,

    Thanks for your interest. I think you are failing to call cxn.commit() in
    your executeUpdate() method, so the changes are being discarded when you
    close() the update connection.
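
    Something along these lines should work (an untested sketch of your executeUpdate() with the commit added; the rollback() on failure is just defensive, not part of the fix):

    // SPARQL UPDATE method: commit before closing, otherwise the mutation is discarded
    public void executeUpdate(Repository repo, String query,
            QueryLanguage ql) throws Exception {

        RepositoryConnection cxn = repo.getConnection();

        try {
            // run the sparql update
            cxn.prepareUpdate(ql, query).execute();
            // make the update durable and visible; without this, close() discards the changes
            cxn.commit();
        } catch (Exception ex) {
            // discard the partial write if the update fails
            cxn.rollback();
            throw ex;
        } finally {
            cxn.close();
        }
    }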

    Thanks,
    Bryan

     
    • Ray

      Ray - 2015-04-03

      Thank you very much Bryan,

      Yes, it worked after I added cxn.commit() after cxn.prepareUpdate(q1, query).execute();

      Thank you very much!
      Also, I would like to reaffirm that we users are hoping you will write tutorial code that introduces every feature of the Blazegraph Java API, like the AllegroGraph one here.

       
