RE: [Sparql4j-devel] RE: XML parsing
From: Seaborne, A. <and...@hp...> - 2006-01-03 14:05:15
-------- Original Message --------
> From: Samppa Saarela
> Date: 29 December 2005 08:40
>
> > There is a mismatch between the JDBC paradigm and the SAX paradigm.
> > JDBC is purely driven by the client application and there are no
> > mandatory callbacks. SAX is driven by the rate of arrival as given
> > by the parser, so it might have to accumulate results until the client
> > is ready.
> >
> > There could be a bounded pipe between application and SAX code but I
> > found it simpler to use a StAX parser (Woodstox) because the whole
> > results-consuming process is then determined by the application. It
> > is as easy to write StAX code as to write SAX code.
>
> StAX seems like a good choice.
>
> > For handling SELECT queries, we don't need a full API. We need to be
> > able to handle RDF terms, but not triples.
>
> If we have a factory that can handle RDF terms, adding support for
> triples is trivial.

In the sense that creating a triple is just a 3-slot object, yes. But the
factory idea means that objects specific to the local RDF toolkit are
returned, and it will have its own idea of a triple (e.g. in Jena, the
application API object "Statement" is not a plain 3-slot triple - it knows
which model it comes from).

The XML Results format does not return triples; only CONSTRUCT and
DESCRIBE do.

> > For CONSTRUCT, etc, it might be better to properly link the result to
> > the local RDF toolkit of choice (e.g. via an InputStream). c.f.
> > SQL Blobs/Clobs.
>
> I would find it more convenient to get the triples of the graph
> returned as triples (i.e. triple per row) using the factory along with
> pseudo column accessors. This way we would (first of all) avoid special
> content-type handling.

I don't follow this - the HTTP reply header has to be correctly parsed.
Such content handling is easy. Could you give the use case you have in
mind here? (Why is it more convenient to have a stream of triples?)

> When using the InputStream approach the user
> should be in control of the requested content-type. However, since
> InputStreams are more convenient in some situations (e.g. when using
> XSLT to process the results) maybe the best alternative would be to
> provide both... and since at least some of the use cases for
> InputStream access to the results are the same regardless of the type
> of the query, I see no reason to limit the usage of InputStream results
> only to construct and describe. To support these use cases we could
> overload Statement's execute with one that returns an InputStream. E.g.
>
> Connection c = datasource.getConnection();
>
> Statement select = c.createStatement();
> InputStream results = select.executeRaw("<any sparql query>");
>
> The preferred RDF serialization could be provided to the statement via
> select.setRdfLang("N3") similarly to other hints (e.g. fetch size,
> escape processing...).
>
> Br,
> Samppa

Toolkits have their own APIs to their parsers to generate triples - I
guess most have a stream interface (ours do) but it would be more normal
to parse directly into the graph and return a graph (yes - streaming is
not possible).

To get a stream of triples, do the query:

SELECT * { ?s ?p ?o }

maybe with an ORDER BY.

As a graph isn't usable until all the triples are known (a triple can turn
up at any point in the stream), an application would need to do the SELECT
query to process results before the last is seen.

Andy

> --
> Samppa Saarela <samppa.saarela at profium.com> Profium, Lars Sonckin
> kaari 12, 02600 Espoo, Finland Tel. +358 (0)9 855 98 000 Fax. +358 (0)9
> 855 98 002 Mob. +358 (0)41 515 1412 Internet: http://www.profium.com
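
For illustration, the application-driven parsing discussed in this thread might look roughly like the sketch below. It is not code from Sparql4j: the class name PullResults and the method nextRow() are hypothetical, and each row is reduced to a map from variable name to the term's text (datatypes and language tags are ignored). It uses the standard javax.xml.stream API, of which Woodstox is one implementation, and assumes the input follows the SPARQL Query Results XML Format element names (result, binding, uri, literal, bnode).

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: pulls one SELECT result row at a time, so the
// application (not the parser) decides when more input is consumed.
final class PullResults {
    private final XMLStreamReader reader;

    PullResults(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Query results need no DTD; switching it off avoids entity tricks.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        reader = factory.createXMLStreamReader(new StringReader(xml));
    }

    // Returns the next row as variable -> term text, or null when exhausted.
    Map<String, String> nextRow() throws XMLStreamException {
        Map<String, String> row = null;
        String variable = null;
        while (reader.hasNext()) {
            int event = reader.next();
            if (event == XMLStreamConstants.START_ELEMENT) {
                String name = reader.getLocalName();
                if (name.equals("result")) {
                    row = new LinkedHashMap<>();
                } else if (name.equals("binding") && row != null) {
                    variable = reader.getAttributeValue(null, "name");
                } else if (variable != null
                        && (name.equals("uri") || name.equals("literal") || name.equals("bnode"))) {
                    // getElementText() leaves the reader on the matching end tag.
                    row.put(variable, reader.getElementText());
                    variable = null;
                }
            } else if (event == XMLStreamConstants.END_ELEMENT
                    && reader.getLocalName().equals("result")) {
                return row; // one complete row; parsing pauses here
            }
        }
        return null;
    }
}
```

Because nothing is read until nextRow() is called, this maps naturally onto a JDBC-style ResultSet.next(), with no buffering or bounded pipe between parser and application.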