Re: [Sparql4j-devel] RE: XML parsing

>>If we have a factory that can handle RDF terms, adding support for
>>triples is trivial.
>
>In the sense that creating a triple is just a 3-slot object, yes.  But
>the factory idea means that objects specific to the local RDF toolkit are
>returned and it will have its own idea of a triple (e.g. in Jena, the
>application API object "Statement" is not a plain 3-slot triple - it
>knows which model it comes from).

There's no public API for constructing RDFNodes directly in Jena either,
so that might be a problem too. Wouldn't it at least bypass all (per-model)
node caches?

>The XML Results format does not return triples - it's only CONSTRUCT and
>DESCRIBE.

Good point.

>>I would find it more convenient to get the triples of the graph
>>returned as triples (i.e. triple per row) using the factory along with
>>pseudo column accessors. This way we would (first of all) avoid special
>>content-type handling.
>
>I don't follow this - the HTTP reply header has to be correctly parsed.
>Such content handling is easy.

But there's no standard way in JDBC for the user to access this information.
If the user is given access to the InputStream of the result, they also
need access to the content type.
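
To make the point concrete, here is a rough sketch of what handing out the stream would also require. The class and method names are hypothetical; nothing like this exists in JDBC or in sparql4j today. It also assumes that graph results arrive as RDF/XML.

```java
import java.io.InputStream;
import java.util.Locale;

// Hypothetical holder, not part of JDBC or sparql4j: keeps the raw response
// body together with the Content-Type header it arrived with.
final class RawSparqlResult {
    private final String contentType;
    private final InputStream body;

    RawSparqlResult(String contentType, InputStream body) {
        this.contentType = contentType;
        this.body = body;
    }

    InputStream getBody() {
        return body;
    }

    // Media type without parameters and in lower case, so that
    // "Application/RDF+XML; charset=utf-8" becomes "application/rdf+xml".
    String getMediaType() {
        int semicolon = contentType.indexOf(';');
        String type = semicolon < 0 ? contentType : contentType.substring(0, semicolon);
        return type.trim().toLowerCase(Locale.ROOT);
    }

    // Assumption: the server serialises CONSTRUCT/DESCRIBE graphs as RDF/XML.
    boolean isRdfXml() {
        return getMediaType().equals("application/rdf+xml");
    }
}
```

Without something along these lines, the caller holding only the InputStream cannot tell which parser to feed it to.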

>Could you give the use case you have in mind here? (why is it more
>convenient to have a stream of triples?)

I frequently use the Model.listStatements variants, and have done so in
every RDF-based application I've ever made using Jena or SIR ;-) I wouldn't
like the performance penalty or the increased memory requirements of having
to read the results into a model first just to iterate over them. One
could also argue that every (reading) RDF operation ultimately involves
a stream/iteration of triples. Sure, there are convenience accessors
that filter the objects of statements, and select-type queries that return
bindings, but these operations in turn rely on statement iteration.
[When building a generic program that doesn't have full control of all
input, select-query access is strictly speaking not usable if "told
bnodes" are not supported.]

One (internal to Jena/SIR) use case for this could be a (read-only)
graph wrapping some SPARQL-enabled repository, issuing queries when
necessary and possibly caching the results.

One (possibly quite far-fetched) use case is using this driver in a
generic SQL browser that lets the user issue queries and shows the results
in tabular form. Having a tabular form for RDF triples (exposed via
ResultSetMetaData) instead of a CLOB/BLOB might be more convenient.
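
As a rough illustration of the triple-per-row idea, the sketch below puts a forward-only cursor with three pseudo columns over an iterator of triples. All type and column names here are hypothetical. A real driver would implement java.sql.ResultSet and ResultSetMetaData instead of this cut-down class, and would use the toolkit's own term objects rather than plain strings.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical plain 3-slot triple; a real driver would obtain the terms
// from the RDF term factory discussed earlier in the thread.
final class Triple {
    final String subject;
    final String predicate;
    final String object;

    Triple(String subject, String predicate, String object) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
    }
}

// Forward-only cursor over a stream of triples, mimicking the
// next()/getString() style of java.sql.ResultSet.
final class TripleCursor {
    // Pseudo column names (an assumption, not an agreed convention).
    static final List<String> COLUMNS = Arrays.asList("subject", "predicate", "object");

    private final Iterator<Triple> source;
    private Triple current;

    TripleCursor(Iterator<Triple> source) {
        this.source = source;
    }

    // Advances to the next triple; returns false once the stream is exhausted.
    boolean next() {
        if (!source.hasNext()) {
            current = null;
            return false;
        }
        current = source.next();
        return true;
    }

    // Column access by 1-based index, as in JDBC.
    String getString(int columnIndex) {
        if (current == null) {
            throw new IllegalStateException("no current row");
        }
        switch (columnIndex) {
            case 1: return current.subject;
            case 2: return current.predicate;
            case 3: return current.object;
            default: throw new IndexOutOfBoundsException("column " + columnIndex);
        }
    }

    // Column access by pseudo column name.
    String getString(String columnLabel) {
        int index = COLUMNS.indexOf(columnLabel);
        if (index < 0) {
            throw new IllegalArgumentException("unknown column " + columnLabel);
        }
        return getString(index + 1);
    }
}
```

Because the cursor only ever holds the current triple, nothing has to be read into a model first, and a generic SQL browser would simply see a three-column table.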

>Toolkits have their own APIs to their parsers to generate triples - I
>guess most have a stream interface (ours do) but it would be more normal
>to parse directly into the graph
>and return a graph (yes - streaming is not possible).
>
>To get a stream of triples, do the query:
>
>SELECT * { ?s ?p ?o }
>
>maybe with an ORDER BY

That's certainly one alternative, but it gets pretty difficult with
templates that are even slightly more complex.

>As a graph isn't usable until all the triples are known (a triple can
>turn up at any point in the stream), an application would need to do the
>SELECT query to process results before the last is seen.

A graph may not be usable until then, but individual triples are usable
as such.

Also, I find stream-based access to the results quite usable
regardless of the result form - at least if it's XML and not N3 (e.g. with XSLT).

Perhaps we should discuss and document what kind of use cases we want to
support with sparql4j?

Br,
Samppa

-- 
Samppa Saarela <samppa.saarela at profium.com> Profium, Lars Sonckin kaari 12, 02600 Espoo, Finland Tel. +358 (0)9 855 98 000 Fax. +358 (0)9 855 98 002 Mob. +358 (0)41 515 1412  Internet: http://www.profium.com