Re: [Sparql4j-devel] SPARQL4J : a few first thoughts
From: Alberto R. <al...@as...> - 2005-01-15 16:38:38
Hello,
On Jan 15, 2005, at 10:18 AM, Janne Saarela wrote:
> I very much agree with your goals for this project.
Me too - let's start simple with SELECT over HTTP GET - but the design
should also accommodate future extensions eventually - Andy's plug-in
idea is a good starting hook for other things.
> The non-RDF applications will find it easy to access SPARQL-enabled
> repositories. The easiness comes via using the familiar programming
> concepts relating to accessing relational databases using JDBC. In
> addition, the easiness comes via the use of one single JDBC driver
> instead of having to download a separate one for each repository.
I agree - we should start by providing an SQL-like tabular interface to
RDF result sets, rather than graphs.
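To make the goal concrete, here is a minimal sketch of what client code
could look like - note the driver class name and the jdbc:sparql: URL
scheme are placeholders for discussion, not agreed names:

import java.sql.*;

public class SelectExample {
    public static void main(String[] args) throws Exception {
        // Driver class and URL scheme are assumptions for illustration only.
        Class.forName("org.sparql4j.Driver");
        Connection con = DriverManager.getConnection(
            "jdbc:sparql:http://example.org/sparql");
        Statement stmt = con.createStatement();
        ResultSet rs = stmt.executeQuery(
            "PREFIX foaf: <http://xmlns.com/foaf/0.1/> " +
            "SELECT ?name ?mbox WHERE { ?x foaf:name ?name . ?x foaf:mbox ?mbox }");
        while (rs.next()) {
            // One row per query solution; columns named after the SELECT variables.
            System.out.println(rs.getString("name") + " " + rs.getString("mbox"));
        }
        rs.close();
        stmt.close();
        con.close();
    }
}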
>
>> This, together with the scoping of JDBC
>>
>> """ javadoc java.sql (1.5.0)
>
> I would be in favor of targeting the 1.4.2 JDK to start with. This is
> due to our product, which ships with 1.4.2 support as we speak, with
> 1.5 support coming in the future.
>
> I should check what changes there are in java.sql from 1.4.2 to 1.5.
> Would you remember by heart?
I cannot help much here - I guess Andy knows more about these
issues - or we can check the Sun specs. The latest spec is JDBC 3.0.
>
>> means I see SPARQL4J as a JDBC driver that is mainly about issuing
>> SPARQL SELECT queries, rather than CONSTRUCT or DESCRIBE. A release
>> with just SPARQL SELECT, using the plain XML result format, would be
>> very useful to application writers - one, conventional, interface to
>> RDF published data. Toolkit independent.
>
> SELECT we start with - let's see how CONSTRUCT and DESCRIBE can be
> tweaked in the long run.
Exactly - the design should accommodate extensions for more
graph-like queries.
Speaking of the Perl DBI world, we have added some extra methods in
addition to the traditional SQL/relational operations (I would call it
RDBC, for RDF DataBase Connectivity):
fetchrow_XML()
-> fetch the next XML chunk using the DAWG XML result format
fetchall_XML()
-> fetch the whole XML result set in one go using the DAWG XML format
fetchsubgraph_serialize()
-> return a serialization (RDF/XML, N-Triples or other) of the next
subgraph resulting from the query (either SELECT, DESCRIBE or CONSTRUCT)
fetchallgraph_serialize()
-> return a serialization (RDF/XML, N-Triples or other) of the whole
subgraph resulting from the query (the merge of all subgraphs of the
previous method)
fetchsubgraph()
-> return the next subgraph resulting from the query (e.g. as a GraphModel)
fetchallgraph()
-> return the whole subgraph resulting from the query
The Perl API also has an explicit method called func() to call an
extension function/method - I guess we will be able to do something
similar in JDBC, perhaps by extending/sub-classing.
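In JDBC terms a sub-interface could carry these extras - a rough sketch
only, with hypothetical names mirroring the Perl methods above:

import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical extension interface; none of these names are agreed yet.
public interface SparqlResultSet extends ResultSet {
    String fetchrowXML() throws SQLException;   // next solution as a DAWG XML chunk
    String fetchallXML() throws SQLException;   // whole result set as DAWG XML
    // syntax is e.g. "RDF/XML" or "N-Triples"
    String fetchSubgraphSerialize(String syntax) throws SQLException;
    String fetchAllGraphSerialize(String syntax) throws SQLException;
}

Client code would then down-cast the plain java.sql.ResultSet when it
wants the RDF-specific behaviour, which keeps the "pure" JDBC surface
untouched.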
>
>> We could be merely inspired by JDBC but actually produce a new
>> interface that is more SPARQL-suitable. I'd like to avoid this for
>> now and try to implement "pure" JDBC.
>
> My goal is the very same - let's not get into extensions right away.
Incremental - let's expose canonical (old) RDQL (SELECT-only)
functionality through JDBC first - then in the meantime start on the
rest of the features.
>
>> I'm assuming that the connection to the database (the RDF store, the
>> knowledge base) is HTTP. Now I would like to be able to take the
>> SPARQL4J codebase and plug in an adapter, instead of HTTP, to get a
>> JDBC driver for ARQ [0] directly for local use. We need a plug-in
>> layer for that, which is a connection layer with SPARQL-centric
>> mechanisms, and then have common code for the presentation of results
>> as JDBC methods.
>
> Ok, I see. Internally we can create a factory that provides the
> protocol part implementation for the other parts of the driver. From
> the user perspective this protocol should perhaps be visible in the
> datasource string? What do you think? Let's start a separate thread on
> the datasource.
It might be that the JDBC database metadata (catalog) methods will help
here to bootstrap and negotiate the protocol part - or we carefully
define an initial list of possible protocols in addition to HTTP - for
sure a local/in-process one will be needed.
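For example - a sketch only, every name here is an assumption - the
factory could key off the datasource string, with HTTP GET as the first
protocol implementation:

import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLEncoder;

// Plug-in layer: the protocol part behind the common JDBC presentation code.
interface ProtocolConnection {
    InputStream execSelect(String sparql) throws IOException; // raw XML results
}

// The HTTP GET starting point discussed above.
class HttpProtocolConnection implements ProtocolConnection {
    private final String endpoint;
    HttpProtocolConnection(String endpoint) { this.endpoint = endpoint; }
    public InputStream execSelect(String sparql) throws IOException {
        URL url = new URL(endpoint + "?query=" + URLEncoder.encode(sparql, "UTF-8"));
        return url.openStream();
    }
}

// Chooses an implementation from a URL like "jdbc:sparql:http://host/sparql".
class ProtocolFactory {
    static ProtocolConnection open(String jdbcUrl) {
        String target = jdbcUrl.substring("jdbc:sparql:".length());
        if (target.startsWith("http:")) {
            return new HttpProtocolConnection(target);
        }
        throw new IllegalArgumentException("unknown protocol part: " + target);
    }
}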
>
>> Update - out of scope: Until there is a standard (de facto or de
>> jure) language, servers won't implement a common way to do RDF update
>> - there are several major decisions, like handling bNodes, to be
>> settled first.
>
> Agreed - out of scope for now.
+1
>
>> A quick review of the JDBC interfaces shows a few tricky parts:
>>
>> 1/ NULLs. SQL has NULLs; SPARQL has unbound variables. That in itself
>> is not important - NULL would just be a way of saying "not bound". But
>> the getXXX methods must return a value of the given type, and getInt
>> returns 0 on NULL, which isn't a very distinguishing value. But in RDF
>> there isn't always a value in the result set for a given row - NULLs
>> become more common and the "return zero" solution is a bit weak.
>>
>> Solutions?: Default values that are more unusual (e.g. MIN_VALUE+2),
>> or ones that can be set by the app (requires an extension to JDBC).
>
> I would be happy with a default value. Let's see what the eventual
> other users think.
Again - we might need to take a better look at the JDBC
metadata/catalog layer to see what it offers for negotiating this, if
possible.
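Worth noting, though, that plain JDBC already has ResultSet.wasNull()
for exactly this, so a driver that maps "unbound" to SQL NULL lets
clients distinguish a genuine 0 - a small sketch (the unbound-to-NULL
mapping is our choice, not something the spec dictates):

import java.sql.ResultSet;
import java.sql.SQLException;

class UnboundCheck {
    // Returns the bound value, or null when the variable was unbound.
    static Integer getBoundInt(ResultSet rs, String var) throws SQLException {
        int v = rs.getInt(var);            // getInt() returns 0 on NULL/unbound
        return rs.wasNull() ? null : new Integer(v);
    }
}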
>
>> 2/ Metadata. Each solution to a SPARQL query can have different types
>> of values for the same property. Returning anything meaningful as
>> "metadata" will need design work. Solution?: For now, return very
>> little and see if applications use the metadata information much.
>
> Let's make, if not a dummy, then a very basic implementation of the
> result set metadata and listen for use cases that would require a more
> elaborate implementation.
Agreed - let's try to get basic conjunctive queries to work (even
without OPTIONAL, if that's too hard) - then move on to the next step.
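A sketch of how little the first cut could expose - a plain class
showing the idea, not a full java.sql.ResultSetMetaData implementation;
since any RDF term can appear in any column, everything is reported as
VARCHAR for now:

import java.sql.Types;

class BasicSparqlMetaData {
    private final String[] variables;  // the SELECT variables, e.g. {"name", "mbox"}

    BasicSparqlMetaData(String[] variables) {
        this.variables = variables;
    }

    int getColumnCount() {
        return variables.length;
    }

    String getColumnName(int column) {
        return variables[column - 1];  // JDBC columns are 1-based
    }

    int getColumnType(int column) {
        return Types.VARCHAR;          // every value presented as a string for now
    }
}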
>
>> 3/ Error conditions: The JDBC interface assumes a conventional
>> connection to a local database. SPARQL is a web protocol and many
>> error conditions matter - the difference between soft errors like
>> "can't contact" and hard errors like "invalid query" makes more of a
>> difference to a web app.
>
> I think the different errors could be modelled as a hierarchy of
> Exceptions subclassed from SQLException. This would enable client code
> to determine the different flavours of errors without having to do
> string parsing on a SQLException to see what happened.
I agree here too - but if we have the HTTP protocol and XML results, we
might also have status/errors encoded in the data - or do we want to
avoid that for JDBC?
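Along the lines Janne suggests - a minimal sketch with hypothetical
class names:

import java.sql.SQLException;

// Root of the driver's exception hierarchy.
class SparqlException extends SQLException {
    SparqlException(String msg) { super(msg); }
}

// Soft error: the endpoint could not be contacted; a retry may succeed.
class SparqlTransportException extends SparqlException {
    SparqlTransportException(String msg) { super(msg); }
}

// Hard error: the server rejected the query itself (e.g. a parse error).
class SparqlQueryException extends SparqlException {
    SparqlQueryException(String msg) { super(msg); }
}

Client code can then catch SparqlTransportException to retry and
SparqlQueryException to give up, without any string parsing.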
It would be interesting to look at how, for example, the XQuery or
other XML-DB people have done similar things over JDBC.
Alberto