Re: [dotNetRDF-Develop] Integrating SPIN into dotnetrdf

Rob/Tom,

Thank you for your engagement and valuable feedback.  I will defer to you
guys and develop SPIN dotNetRDF from scratch, using other implementations
only as a reference.  Preserving the unit tests sounds very smart, and I will
keep it in mind.

I will sign up for the SPIN dotNetRDF task, but it may not always proceed
quickly.  Since I plan to use SPIN dotNetRDF in my application, I have a
strong incentive to see it through to completion.  To that end, can you guys
continue to answer questions and provide snippets of code?  My reading has
led to numerous questions, which would be confusing to place in a single
email thread.  My plan is to organize the questions and start separate SPIN
integration emails.

Thanks,
Kevin

-----Original Message-----
From: Rob Vesse [mailto:rv...@do...] 
Sent: Friday, March 22, 2013 1:59 PM
To: dotNetRDF Developer Discussion and Feature Request
Subject: Re: [dotNetRDF-Develop] Integrating SPIN into dotnetrdf

Sorry I've been out of the discussion, but this week is really busy for me.

Comments inline:

On 3/19/13 10:35 AM, "Tomasz Pluskiewicz" <tom...@gm...>
wrote:

>Comments inline
>
>On Sun, Mar 17, 2013 at 4:56 AM, Kevin <ke...@th...> wrote:
>> Rob, Tom
>>
>>
>>
>> After more reading I believe a dotnetrdf/SPIN/Fluent SPARQL combination
>> would be an excellent solution.  In fact I would think people would
>> convert to C# dotnetrdf due to this compelling story.
>>
>>
>>
>> Strategy Questions:
>>
>>
>>
>> Due to my knowledge, time constraints, and past experience I know that if
>> I attempt SPIN dotnetrdf alone it will drag out and never complete.  Can
>> you let me know if any of the strategies below make sense?  Both
>> strategies involve splitting the work, with the understanding that we are
>> all busy and there is no real obligation.  Also I assume any knowledgeable
>> newcomer is welcome to contribute to the cause.  I believe the effort will
>> obviously benefit the community, but also provide deep insight to the
>> contributing developers.
>>
>>
>>
>> -Convert the Java TopBraid SPIN API (uses the Jena interface) into C#,
>> starting with a conversion tool like Tangible Java->C# (I will purchase).
>> The tool will help with most of the syntax, but there will be a lot of
>> manual work.  The conversion will obviously entail replacing the Jena
>> interface with dotnetrdf.  The benefit of this strategy is that we harness
>> the completeness, robustness, and future updates (re-port changes) of the
>> TopBraid SPIN API.  The conversion should provide insights into the
>> implementation, enabling custom tweaks.  It should also be possible to
>> split the converted Java files into 3 groups if you guys are up for the
>> task.  I prefer this solution primarily because we are essentially
>> creating a SPIN inference engine based upon code from the people who
>> invented SPIN.
>>
>>
>
>Has TopBraid opened all of their SPIN API? Last time I looked only 
>parts were open source. I wonder that it got no attention at 
>semanticweb.com. Or did it?
>
>Regarding the first strategy I understand it is very appealing but I'm 
>afraid converting API calls from Jena to dotNetRDF could be more work 
>than you expect. I don't know Jena very well but I expect that a lot of 
>the functionality isn't just a simple 1:1 relation. In this case we are 
>talking about a number of areas:
>
>- Graph traversal for reading SPIN queries in RDF form
>- SPARQL query and its programmatic counterpart 
>(VDS.RDF.Query.SparqlQuery and others)
>- Query execution
>- Reasoner

I would echo what Tom said: write code from scratch, don't try to port the
existing Java.  As someone who also works on Jena I can confirm what Tom
said, that the APIs around queries are different enough that you would spend
far more time extricating Jena from the converted code than if you just
wrote the relevant code yourself.

Also, converting code that was open sourced under a different license than
the one dotNetRDF uses creates a somewhat grey area about whether we can
re-license the code, since we would not be the original copyright holders.
We would likely shoot ourselves in the foot wrt making our SPIN
implementation usable in a corporate setting (where they are going to care
about legal stuff like this), if it would be usable there at all.

>
>However what we should definitely try to preserve are any unit tests 
>and possibly some kind of test harness. I hope the guys at TopBraid 
>have SPIN thoroughly tested and we should reuse those tests on our 
>implementation.

This is a good point, if they have open sourced the tests these would be
useful references to write our own tests from.

>
>>
>> -Follow your initial start at SPIN dotnetrdf and maybe use the TopBraid
>> SPIN API as a reference (difficult because the approaches will diverge).
>> Rob, are you able to take any small essential blocks and coordinate the
>> overall effort?  Tom, are you able to help out, especially where Fluent
>> SPARQL is utilized?  Again I think the alternate strategy above has a lot
>> of merit, but it must align with Rob's vision.  No matter the strategy,
>> please look at the implementation questions below to help clarify the
>> high/low level picture.
>>
>
>Now, if we had not only the Java API reference but also a complete 
>automated test suite from the start I would opt for the second 
>approach.
>
>If not available or somehow unusable for a .NET project I think we 
>could resort to http://spinservices.org/spinrdfconverter.html for 
>validating our efforts.

I think the early tests I wrote did use this as well, I likely need to drag
those over from the old branch as well.

>
>Whatever the case I think that starting from scratch rather than 
>porting Java is likely a better approach in the long run.

As I said above +1

>
>>
>>
>> Implementation Questions:
>>
>>
>>
>> -Can you provide a high level overview of a SPIN dotnetrdf implementation
>> so I can ensure the final solution is acceptable?  Let's say you have the
>> SPIN rule for "adult rdfs:subclassof person" at the top level (i.e.
>> owl:Thing) and the user queries whether Elvis is an Adult.  Would
>> dotnetrdf SPIN spawn numerous intermediate SPARQL queries to gather the
>> important SPIN rules that must be executed?  If the RDF database is remote
>> (e.g. dbpedia), could this cumulative delay become unacceptable (over 3
>> seconds in my application)?  Alternatively, is everything required by the
>> user query obtained in 1 or 2 complex intermediate queries which dotnetrdf
>> SPIN creates and digests?  Basically if you could paint the high level
>> overview with an example I could then move to the details.
>>
>>
>
>I think that a complete dotNetRDF implementation of SPIN would require 
>at least two components.
>
>1. A converter from SPIN/RDF to in-memory queries (not necessarily 
>VDS.RDF.Query.SparqlQuery - see below)
>2. A SPIN runner, which will take a SPIN query and execute it in an 
>instance's context
>3. A reasoner, which uses the above converter and executes the SPIN rules 
>and constraints against a graph or triple store

Yes that is pretty much what I was going to suggest
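
To make the converter idea a bit more concrete, here is a rough sketch of
walking the RDF form of a SPIN SELECT query and emitting SPARQL text.  It is
plain Python rather than the IGraph or Fluent SPARQL APIs, and it makes
several simplifying assumptions: triples are Python tuples, RDF lists are
flattened to Python lists, only sp:Select with plain triple patterns is
handled, and the function names are made up for illustration.  The sp: terms
used (sp:Select, sp:where, sp:resultVariables, sp:subject, sp:predicate,
sp:object, sp:varName) come from the SPIN SPARQL syntax vocabulary.

```python
# Toy SPIN RDF -> SPARQL text converter (illustrative sketch only).
SP = "http://spinrdf.org/sp#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def objects(graph, subject, predicate):
    """Return every object of (subject, predicate, ?) in the triple list."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def term_to_sparql(graph, node):
    """Render a node as ?name when it carries sp:varName, else as <iri>."""
    names = objects(graph, node, SP + "varName")
    return "?" + names[0] if names else "<" + node + ">"


def spin_select_to_sparql(graph, query_node):
    """Walk the SPIN description of a SELECT query and emit SPARQL text."""
    if SP + "Select" not in objects(graph, query_node, RDF_TYPE):
        raise ValueError("only sp:Select is handled in this sketch")
    # Assumption: RDF lists have already been flattened to Python lists.
    result_vars = objects(graph, query_node, SP + "resultVariables")[0]
    patterns = objects(graph, query_node, SP + "where")[0]
    projection = " ".join(term_to_sparql(graph, v) for v in result_vars)
    lines = []
    for pattern in patterns:
        parts = [
            term_to_sparql(graph, objects(graph, pattern, SP + role)[0])
            for role in ("subject", "predicate", "object")
        ]
        lines.append("  " + " ".join(parts) + " .")
    return "SELECT " + projection + "\nWHERE {\n" + "\n".join(lines) + "\n}"


# RDF encoding of: SELECT ?person WHERE { ?person a <http://example.org/Adult> }
graph = [
    ("_:q", RDF_TYPE, SP + "Select"),
    ("_:q", SP + "resultVariables", ["_:person"]),
    ("_:q", SP + "where", ["_:t1"]),
    ("_:t1", SP + "subject", "_:person"),
    ("_:t1", SP + "predicate", RDF_TYPE),
    ("_:t1", SP + "object", "http://example.org/Adult"),
    ("_:person", SP + "varName", "person"),
]
sparql = spin_select_to_sparql(graph, "_:q")
print(sparql)
```

A real converter would also need to cover the other query forms, filters,
optionals, literals, and so on, and would more likely build a query object
than a string.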

>
>>
>>
>> -Can you provide a low-level overview of your SPIN dotnetrdf
>> implementation strategy?  I only partially understood your explanation
>> below, which I know will be a little clearer after looking at the code.
>> What is spin-sparql-syntax.ttl and how does it fit into the puzzle?  What
>> does it mean to convert a query into its SPIN RDF representation?  My
>> naive picture is that a user query is converted into internal query(s) to
>> obtain SPIN RDF rules, which are then executed by the new SPIN dotnetrdf
>> engine.  Lastly, please explain a little more your view of turning a SPIN
>> query into a query.
>>
>>
>
>Basically I think that a complete solution should work like this:
>1. Prepare a metadata (ontology) graph with your SPIN rules/constraints 
>2. SPIN rules are converted to SPARQL queries
>3. Queries are executed against graph/store to produce constraint 
>violations and inferences/assertions

I agree. For point 3 we could even go so far as to have a SPIN constraint
validator as a decorator over a dataset, which would fire off the constraints
in the Flush() method (i.e. the transaction commit) and fail the transaction
if constraints have been violated.  The same goes for running inferences.
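
As a rough illustration of the decorator idea, here is a minimal Python
sketch.  Everything in it is hypothetical and not dotNetRDF API: the
InMemoryDataset and ConstraintCheckingDataset classes, their method names,
and the use of plain Python functions as stand-ins for SPIN constraints that
would really be run as SPARQL queries.

```python
class ConstraintViolation(Exception):
    """Raised when a commit is rejected; carries the violated constraint names."""


class InMemoryDataset:
    """Minimal stand-in for a transactional dataset (hypothetical)."""

    def __init__(self):
        self.committed = set()
        self.pending = set()

    def add(self, triple):
        self.pending.add(triple)

    def view(self):
        # What the data would look like if the pending triples were committed.
        return self.committed | self.pending

    def flush(self):
        self.committed |= self.pending
        self.pending.clear()

    def discard(self):
        self.pending.clear()


class ConstraintCheckingDataset:
    """Decorator: same interface as the wrapped dataset, but flush() validates."""

    def __init__(self, inner, constraints):
        self.inner = inner
        # name -> function(triples) returning True when the constraint is
        # violated; each function stands in for a SPIN constraint query.
        self.constraints = constraints

    def add(self, triple):
        self.inner.add(triple)

    def view(self):
        return self.inner.view()

    def discard(self):
        self.inner.discard()

    def flush(self):
        candidate = self.inner.view()
        violated = [n for n, check in self.constraints.items() if check(candidate)]
        if violated:
            self.inner.discard()  # fail the transaction
            raise ConstraintViolation(violated)
        self.inner.flush()


def negative_age(triples):
    """Violated when any ex:age value is below zero."""
    return any(p == "ex:age" and o < 0 for _, p, o in triples)


dataset = ConstraintCheckingDataset(InMemoryDataset(), {"negative-age": negative_age})
dataset.add(("ex:Elvis", "ex:age", 42))
dataset.flush()  # no violation, so the triple is committed

dataset.add(("ex:Kevin", "ex:age", -1))
try:
    dataset.flush()
    rejected = False
except ConstraintViolation:
    rejected = True  # the pending triple was discarded
```

Because the wrapper exposes the same methods as the dataset it wraps, calling
code would not need to know whether validation is switched on.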

>
>If you look at the TopBraid videos and SPIN Modelling vocabulary you 
>will notice that there is much more to SPIN than just converting 
>queries from RDF representation to SPARQL:
>- rule modularization templates
>- constructors
>- extending SPARQL Engine with functions and magic properties
>
>I agree with Rob that first we would need a converter from SPIN RDF to 
>SPARQL but its design should be influenced by how we will use it and we 
>already know very well what that would be. Unfortunately a lot seems 
>unclear to me when I read the SPIN Modelling documentation. For one 
>thing there can be many rules and as Kevin wrote, executing many such 
>rules against a store may not be a good idea.

I think the immediate goal would be to make SPIN runnable against in-memory
data, a longer term goal can be to make this run against arbitrary SPARQL
endpoints.
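
To give a feel for what running rules against in-memory data might involve,
here is a small Python sketch that applies rules repeatedly until nothing new
is inferred.  It is only an illustration under stated assumptions: triples
are tuples of prefixed-name strings, and the rule is a hand-written Python
function standing in for a SPIN rule that has been converted to a SPARQL
CONSTRUCT query.  The names run_rules and subclass_rule are made up, and the
data reuses Kevin's Adult/Person example.

```python
def run_rules(triples, rules):
    """Apply every rule repeatedly until no rule adds a new triple."""
    inferred = set(triples)
    changed = True
    while changed:
        changed = False
        for rule in rules:
            new = rule(inferred) - inferred
            if new:
                inferred |= new
                changed = True
    return inferred


def subclass_rule(triples):
    """Stand-in for:
    CONSTRUCT { ?x a ?super }
    WHERE { ?x a ?sub . ?sub rdfs:subClassOf ?super }
    """
    supers = {(s, o) for s, p, o in triples if p == "rdfs:subClassOf"}
    return {
        (x, "rdf:type", sup)
        for x, p, cls in triples if p == "rdf:type"
        for sub, sup in supers if sub == cls
    }


data = {
    ("ex:Adult", "rdfs:subClassOf", "ex:Person"),
    ("ex:Person", "rdfs:subClassOf", "owl:Thing"),
    ("ex:Elvis", "rdf:type", "ex:Adult"),
}
result = run_rules(data, [subclass_rule])
```

Here the loop needs two productive passes: the first infers that Elvis is a
Person, the second that Elvis is an owl:Thing.  Against a remote endpoint
each pass would mean at least one round trip per rule, which is where the
latency concern raised earlier in the thread comes in.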

Rob

>
>Kevin, what do you mean by your last question? You have RDF triples, 
>which represent a query and you traverse the graph to create a query 
>(string). The IGraph and Fluent SPARQL APIs combined should make this 
>work easy.
>
>Hope this helps, for now ;)
>Tom
>
>>
>>
>> Thanks,
>>
>> Kevin
>>
>>
>>
>> From: Rob Vesse [mailto:rv...@do...]
>> Sent: Friday, March 15, 2013 2:57 PM
>> To: Kevin
>> Cc: dotNetRDF Developer Discussion and Feature Request
>> Subject: Re: Integrating SPIN into dotnetrdf
>>
>>
>>
>> Hey Kevin
>>
>>
>>
>> Discussion inline:
>>
>>
>>
>> From: Kevin <ke...@th...>
>> Date: Wednesday, March 13, 2013 7:40 PM
>> To: Rob Vesse <rv...@do...>
>> Subject: Integrating SPIN into dotnetrdf
>>
>>
>>
>> Rob,
>>
>>
>>
>> First, thank you for the quality work you have done on the dotnetrdf
>> project.  I have seen a few different posts about your initiative to
>> integrate SPIN into dotnetrdf (i.e. SPIN Post).  After much reading on the
>> subject it seems that SPIN would really propel/complement dotnetrdf.  I
>> believe SPIN not only makes up for the missing OWL inference (via the SPIN
>> OWL-RL implementation), it can also expand to suit the modeler's
>> imagination.  The fact that the rules are in SPARQL makes for an
>> unbeatable solution.  In case it matters, my current effort involves
>> querying a Virtuoso database (some OWL support) with dotnetrdf.  I would
>> really appreciate you taking a look at the questions below:
>>
>>
>>
>> -Have you made any further progress on integrating SPIN into dotnetrdf?
>> Would you allow me to have the source code in its current state?  Could I
>> possibly be a contributor on this cause, as I am not really equipped for
>> the full task?  In any case I would appreciate any source code which I
>> could use as a learning tool.
>>
>>
>>
>> No I haven't had time to do anything on SPIN for a long time now.  I've
>> been primarily concentrating on getting core features stabilized such as
>> the SPARQL engine which are obviously fairly key to building stuff like
>> SPIN on top.
>>
>>
>>
>> However I still don't have time to work on SPIN directly so if you want
>> to work on this please feel free, find the code in the Mercurial
>> repository at https://bitbucket.org/dotnetrdf/dotnetrdf
>>
>>
>>
>> The previous and very minimal SPIN stub is under Libraries\Query\Spin,
>> create your own fork and then you can send pull requests as and when you
>> have something to
>>
>>
>>
>> The key things that need to be done to get the core of SPIN implemented
>> are as follows:
>>
>> - Update the current spin-sparql-syntax.ttl to a current version, it
>>   likely doesn't represent the current version of the spec (this is
>>   primarily a convenience reference for developers)
>> - Finish the existing stubs for converting queries into their SPIN RDF
>>   representation (see SpinSyntax.cs)
>> - Write code to turn an RDF encoding of a SPIN query into a query
>>
>> The middle one would be the easiest to start with since there are already
>> some partial stubs to get you started.
>>
>>
>>
>>
>>
>> -From the available TopQuadrant documentation I have tried to deduce how
>> dotnetrdf might implement SPIN.  According to the SPIN tutorial, TopBraid
>> finds all SPIN inferencer rules and runs them when you hit play.  Would
>> the dotnetrdf SPIN inferencer only run the rules that are associated with
>> the class structure being queried?  Basically I am confused about how
>> dotnetrdf decides when/how/which SPIN rules to run for a given query.
>>
>>
>>
>> That's an implementation detail, we would control how and when rules get
>> run.  We need to get the basic implementation of SPIN done first before
>> this aspect of things gets implemented anyway.
>>
>>
>>
>>
>>
>> -How much of SPIN could dotnetrdf possibly support?  It appears SPIN
>> contains Inference Rules, Constraint Checking, and the ability to isolate
>> rules for certain conditions.  Also the TopBraid tool seems to have
>> "User-Defined SPARQL functions" and "SPIN Query Templates".  I imagine
>> dotnetrdf would have to keep up with any SPIN improvements.
>>
>>
>>
>> All of those are supportable in some shape or form, but until we have the
>> core of SPIN up and running we can't really implement those.  Most of
>> those features run on top of the SPIN core and so will ultimately just be
>> implementation details once we have a core to build upon.  User defined
>> SPARQL functions are basically just SPARQL queries that return a single
>> value, and query templates are just parameterized queries, both of which
>> the existing SPARQL engine is capable of supporting in one way or another.
>> So it is just a case of exposing that functionality in the SPIN style.
>>
>>
>>
>> Hope this is enough to get you started, if not please let us know,
>>
>>
>>
>> Rob
>>
>>
>>
>>
>>
>> Regards,
>>
>> Kevin
>>
>>
>> 
>> ------------------------------------------------------------------------------
>> Everyone hates slow websites. So do we.
>> Make your web apps faster with AppDynamics
>> Download AppDynamics Lite for free today:
>> http://p.sf.net/sfu/appdyn_d2d_mar
>> _______________________________________________
>> dotNetRDF-develop mailing list
>> dot...@li...
>> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-develop
>>
>