Re: [dotNetRDF-bugs] Problems with SPARQL queries

Also, here's a test repo https://bitbucket.org/tpluscode/sparql-test


On Wed, May 21, 2014 at 2:18 PM, Tomek Pluskiewicz <to...@pl...> wrote:

> Hi Rob
>
> We've been developing an ORM solution, complete with LINQ support, for some
> time now. It will be open-sourced at some point. Currently we're experiencing
> problems with query speed and reliability. Let me acquaint you with how
> things work.
>
> Each resource is contained within its own named graph and additionally
> there is a meta-graph, which connects graphs and the described entities
> (there could be many graphs for one resource). For example
>
> # meta graph
> <http://foo.com/productList/>
> {
>   ex:Wrench1 foaf:primaryTopic ex:Wrench1 .
> }
>
> # wrench
> ex:Wrench1 { ex:Wrench1 a sch:Product ; sch:name "Wrench" . }
>
> The problem is with the following query:
>
> PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX schema: <http://schema.org/>
> PREFIX foaf: <http://xmlns.com/foaf/0.1/>
>
> SELECT ?s ?p ?o ?Gp0 ?p0
> WHERE
> {
>   GRAPH ?Gp0
>   {
>     ?s ?p ?o .
>     ?p0_sub schema:name ?name0_sub .
>     FILTER (CONTAINS(UCASE(?name0_sub), "W"^^xsd:string))
>     ?p0 rdf:type schema:Product .
>     {
>       SELECT DISTINCT ?p0_sub
>       WHERE
>       {
>         GRAPH ?Gp0_sub
>         {
>           ?p0_sub rdf:type schema:Product .
>           ?p0_sub schema:name ?name0_sub .
>           FILTER (CONTAINS(UCASE(?name0_sub), "W"^^xsd:string))
>         }
>         GRAPH <http://foo.com/productList/>
>         {
>           ?Gp0_sub foaf:primaryTopic ?p0_sub .
>         }
>       }
>       #ORDER BY ?p0_sub
>       LIMIT 2
>     }
>     FILTER(?p0_sub=?p0)
>   }
>
>   GRAPH <http://foo.com/productList/>
>   {
>     ?Gp0 foaf:primaryTopic ?p0 .
>   }
> }
>
> It is generated from the following LINQ expression:
>
> Query<IProduct>()
>     .Where(p => p.Name.ToUpper().Contains(name.ToUpper()))
>     .Take(2)
>
> There are two problems here. The query returns different results on
> subsequent runs against the same dataset, and it runs very slowly.
> Uncommenting the ORDER BY helps with the varying result count, though I'm
> not exactly sure why it should be necessary. However, I'm not sure what is
> causing the performance problem. Obviously it has something to do with the
> subquery, but I was unable to alter this SELECT so that it executed
> quickly. Even a dataset as small as 9 quads (3 resources * (2 triples +
> 1 meta-triple)) takes 1 second to complete, and the time seems to increase
> exponentially. At 90 quads/30 graphs it is already taking close to 3 minutes.
>
> We first observed the performance problems with version 1.0.4, but with
> a synthetic dataset the same issues arise in previous releases and in 1.0.5+.
>
> Hope you can help. Would you like any additional info?
>
> Regards,
> Tom
>
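
For anyone reading along without the test repo, the layout described in the quoted message (one named graph per resource plus a meta-graph) can be sketched in plain Python. This is only an illustration using the standard library, not dotNetRDF code. The helper names `build_dataset`, `matching_products` and `take`, and the `ex:WrenchN` resource names, are hypothetical. It also illustrates a general SPARQL point that may bear on the varying results: a LIMIT without ORDER BY may return any subset of the matches, since solution order is not otherwise defined.

```python
from itertools import islice

META_GRAPH = "http://foo.com/productList/"


def build_dataset(n):
    """Return a set of (graph, subject, predicate, object) quads for n products."""
    quads = set()
    for i in range(1, n + 1):
        res = f"ex:Wrench{i}"
        # Each resource sits in its own named graph (here named after the resource).
        quads.add((res, res, "rdf:type", "schema:Product"))
        quads.add((res, res, "schema:name", f"Wrench {i}"))
        # The meta-graph connects the named graph to the entity it describes.
        quads.add((META_GRAPH, res, "foaf:primaryTopic", res))
    return quads


def matching_products(quads, needle):
    """Products whose upper-cased name contains needle, in iteration order."""
    products = {s for g, s, p, o in quads
                if p == "rdf:type" and o == "schema:Product"}
    return [s for g, s, p, o in quads
            if p == "schema:name" and s in products
            and needle.upper() in o.upper()]


def take(items, limit, ordered):
    """Mimic LIMIT; sort first only when ordered is True (like ORDER BY)."""
    source = sorted(items) if ordered else items
    return list(islice(source, limit))
```

With this sketch, `len(build_dataset(3))` is 9 and `len(build_dataset(30))` is 90, matching the sizes quoted above. Calling `take(..., 2, ordered=False)` on the same matches in two different iteration orders can yield two different pairs, while `ordered=True` always yields the same pair.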