Hello Paolo,

1) Do you see a sense to load an rdf model containing a formal description of the exist content?

Yes! First of all, doing simple inference over RDF content within eXist would be a very useful rule test. There are many places where this could come in handy, such as checking the consistency of models and instances or of IDs and ID references. I know that there has been discussion of using eXist as a back end for OWL tools such as Protégé. So this is a great idea, and I want to encourage you to continue your research.
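
To make the ID / ID-reference case concrete, here is a minimal sketch of such a check. It is written in Python rather than XQuery purely so that it is self-contained. The attribute names `id` and `ref`, the sample document, and the function name `dangling_refs` are all assumptions for illustration, not anything eXist provides.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: two authors, two books pointing at authors.
SAMPLE = """<library>
  <author id="a1"/>
  <author id="a2"/>
  <book id="b1" ref="a1"/>
  <book id="b2" ref="a9"/>
</library>"""


def dangling_refs(xml_text, id_attr="id", ref_attr="ref"):
    """Return (tag, reference) pairs whose reference matches no declared id."""
    root = ET.fromstring(xml_text)
    # Collect every declared identifier in the document.
    ids = {e.get(id_attr) for e in root.iter() if e.get(id_attr) is not None}
    # Report each reference that does not resolve to one of those identifiers.
    return [
        (e.tag, e.get(ref_attr))
        for e in root.iter()
        if e.get(ref_attr) is not None and e.get(ref_attr) not in ids
    ]


if __name__ == "__main__":
    for tag, ref in dangling_refs(SAMPLE):
        print(f"{tag} refers to unknown id {ref}")
```

The same idea should carry over to XQuery as a comparison between the set of declared ids and the set of references.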

1.1) If yes: is there an (easy) way to embed a tool able to verify the internal coherence of an existing content against a given metamodel?

There are simple XQuery scripts that you can run to do some consistency checks. For example, you can test that each class has a valid superclass, that all properties are connected to at least one class, etc. Beyond these checks, though, I have found very little XQuery code that validates RDF structures. There are about a dozen or so checks that I think would be useful. Perhaps we should set up a place to keep an XQuery module that checks these items.
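
As a rough illustration of the two checks just mentioned, here is a small sketch. It is in Python rather than XQuery so that it runs on its own, and it makes several assumptions: the schema is plain RDF/XML using `rdfs:Class`, `rdfs:subClassOf`, `rdfs:domain` and `rdfs:range`; a "valid superclass" means that every declared superclass is itself declared in the same document; and a property is "connected" when its domain or range names a declared class. The sample schema and the function name `check_schema` are made up.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

# Hypothetical schema with one bad superclass and one unconnected property.
SAMPLE = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:about="#Agent"/>
  <rdfs:Class rdf:about="#Person">
    <rdfs:subClassOf rdf:resource="#Agent"/>
  </rdfs:Class>
  <rdfs:Class rdf:about="#Robot">
    <rdfs:subClassOf rdf:resource="#Machine"/>
  </rdfs:Class>
  <rdf:Property rdf:about="#name">
    <rdfs:domain rdf:resource="#Person"/>
  </rdf:Property>
  <rdf:Property rdf:about="#orphan"/>
</rdf:RDF>"""


def q(ns, local):
    """Build the '{namespace}local' name form that ElementTree expects."""
    return "{" + ns + "}" + local


def check_schema(rdf_xml):
    """Return a list of (subject, problem) pairs found in the schema."""
    root = ET.fromstring(rdf_xml)
    classes = {c.get(q(RDF, "about")) for c in root.iter(q(RDFS, "Class"))}
    problems = []

    # Check 1: every declared superclass must itself be a declared class.
    for cls in root.iter(q(RDFS, "Class")):
        for sup in cls.findall(q(RDFS, "subClassOf")):
            target = sup.get(q(RDF, "resource"))
            if target not in classes:
                problems.append(
                    (cls.get(q(RDF, "about")), "unknown superclass " + str(target))
                )

    # Check 2: every property must name a declared class in domain or range.
    for prop in root.iter(q(RDF, "Property")):
        links = prop.findall(q(RDFS, "domain")) + prop.findall(q(RDFS, "range"))
        targets = [link.get(q(RDF, "resource")) for link in links]
        if not any(t in classes for t in targets):
            problems.append((prop.get(q(RDF, "about")), "not connected to any class"))

    return problems


if __name__ == "__main__":
    for subject, problem in check_schema(SAMPLE):
        print(subject, "->", problem)
```

A real module would also have to handle the other RDF/XML serialization forms (for example `rdf:ID` and typed `rdf:Description` nodes), which this sketch ignores.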

1.2) And (more important) do you think could be easy to incorporate a trigger able to check the consistency before doing some xupdate operations?

Yes! This would be a very good idea. Any incoming RDF document could fire a trigger that stores the document as XML, in a triple store like BigData, or both. The trigger could also do things like scan for RDFa attributes within an HTML file. One other thing we could do is integrate an external reasoner such as Pellet and see whether we could write an XQuery interface module to it for consistency checking. The catch is that such reasoners expose Java APIs, not XQuery.

2) Intuitively, in term of performance, my opinion is that an rdf based knowledge base is more (too) resource expensive than its equivalent model directly designed using xml. Has anyone experienced in this field?

Yes, I would agree. If knowledge is stored as annotations in text, such as TEI entity annotations (persons, dates, places, etc.), then getting each triple out of the document along with the CONTEXT of the triple (what speaker and what phrase of what line of what paragraph of what section of what chapter of what book of what volume) is very inefficient. A single in-context tag could require 20 RDF context triples. Keeping the facts within the text could be much more efficient, and things such as the context of an assertion could be extracted from the text when needed.

3) Can someone convince me that using rdf with exist is not a good idea? Or viceversa?

It all depends on what you are doing, of course. People like RDF and SPARQL because they are designed for reasoning and inference that are not normally done with XML, unless you have extensive experience storing ID/IDREF data within XML and running XQueries over these structures. In practice, RDF and OWL are often used as advanced rules engines that go beyond what XML Schema and Schematron are designed to do. In many cases they are the basis for advanced graph searches for complex structures that are difficult to find with a simple XQuery. RDF also has the potential to allow joins between data sets if those data sets share URI standards. So we cannot really recommend or discourage your use of RDF vs. XML with ID/IDREFs unless we know a little more about your goals.

4) Can you give me some link to these kind of discussions?

I think that we need to include people who have a deep understanding of both RDF in triple stores and XML in native XML databases. People like Chris Wallace clearly have experience in both areas. Jeni Tennison has been very concerned about overselling RDF when XML options are often much easier. She has been working in this area for some time. But I don't know whether anyone has yet tried to integrate eXist triggers directly with an RDF triple store. It would seem like a very good research project to start.

Hope that helps! - Dan

On Tue, Dec 28, 2010 at 1:31 PM, Paolo Di Pietro <pdipietro@diviana.net> wrote:

Hi all,

I'm doing some abstract reasoning about using rdf based content with eXist, and there are some questions I'd like to share.

1) Do you see a sense to load an rdf model containing a formal description of the exist content?

1.1) If yes: is there an (easy) way to embed a tool able to verify the internal coherence of an existing content against a given metamodel?

1.2) And (more important) do you think could be easy to incorporate a trigger able to check the consistency before doing some xupdate operations?

2) Intuitively, in term of performance, my opinion is that an rdf based knowledge base is more (too) resource expensive than its equivalent model directly designed using xml.

Has anyone experienced in this field?

3) Can someone convince me that using rdf with exist is not a good idea? Or viceversa?

4) Can you give me some link to these kind of discussions?

Thank you to everybody and have an happy new year

Paolo


--
Dan McCreary
Semantic Solutions Architect
office: (952) 931-9198
cell: (612) 986-1552