From: Ivan M. <imi...@op...> - 2010-07-23 17:57:07
On Fri, 2010-07-23 at 11:44 +0100, Vanessa Lopez wrote:
> Hello,
>
> I am trying to optimize time performance as much as possible on my
> full text index queries. To query for classes in DBpedia that contains
> "person" in the label, I send the query
>
> SELECT DISTINCT ?s ?o FROM <http://dbpedia.org> WHERE {{?s rdfs:label
> ?o.[] a ?s .FILTER( bif:contains(?o, "person" ) )}}LIMIT 15
>
> or also (if I want to check dbprop:name):
>
> SELECT DISTINCT ?s ?o FROM <http://dbpedia.org> WHERE {{?s rdfs:label
> ?o.[] a ?s .FILTER( bif:contains(?o, "person" ) ) } UNION { ?s
> dbpprop:name
> ?o.[] a ?s .FILTER( bif:contains(?o, "person" ) )}}LIMIT 10

The performance of bif:contains is "self-protected", I'd say. When the
optimizer is unable to find a good join with an appropriate variable, it
reports an error. Both of these queries are OK.

> Can I optimize this query in any way? Does it make any different if I
> put the bif:contains out of the FILTER, e.g:
>
> SELECT DISTINCT ?s ?o FROM <http://dbpedia.org> WHERE {{?s rdfs:label
> ?o.[] a ?s. ?o bif:contains "person"}}LIMIT 15

No difference; bif:contains as a "magic predicate" is no more than
syntactic sugar. Both the FILTER version and the "magic predicate"
version boil down to a join between the table that has the variable in
the object position and the free-text index on the object column.

Moreover, the constant graph http://dbpedia.org adds its own
optimization effect to the text search. For each graph, a special "graph
keyword" is created, and every object used in the graph is indexed in
such a way that the graph keyword appears to be part of the object. So
the actual search is for "graph keyword for http://dbpedia.org" AND
"person". This does not matter if almost all data are in one graph, but
it helps in other cases.

Best Regards,

Ivan Mikhailov
OpenLink Software
http://virtuoso.openlinksw.com
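
To make the "graph keyword" idea above concrete, here is a toy Python sketch of an inverted index in which every literal is indexed together with an extra token for its graph. It is an illustration under assumptions, not Virtuoso's actual implementation: the class ToyTextIndex, the function graph_keyword and the "__graph__:" token format are all hypothetical, as are the sample data and the second graph IRI.

```python
from collections import defaultdict


def graph_keyword(graph_iri):
    # Hypothetical encoding of the per-graph keyword; the real format
    # used by Virtuoso is internal and not described in this thread.
    return "__graph__:" + graph_iri


class ToyTextIndex:
    """Toy inverted index: token -> set of object ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, obj_id, graph_iri, literal):
        # Index the words of the literal plus the graph keyword, so the
        # graph keyword looks like one more word of the object.
        tokens = set(literal.lower().split())
        tokens.add(graph_keyword(graph_iri))
        for token in tokens:
            self.postings[token].add(obj_id)

    def contains(self, graph_iri, word):
        # The search is "graph keyword" AND "word": intersect both
        # posting lists instead of filtering by graph afterwards.
        in_graph = self.postings.get(graph_keyword(graph_iri), set())
        with_word = self.postings.get(word.lower(), set())
        return sorted(in_graph & with_word)


index = ToyTextIndex()
index.add(1, "http://dbpedia.org", "Person")
index.add(2, "http://dbpedia.org", "Place")
index.add(3, "http://example.org/other", "Missing person report")

print(index.contains("http://dbpedia.org", "person"))  # [1]
```

Object 3 also contains "person" but sits in a different graph, so the AND with the graph keyword drops it inside the text index itself. When nearly all data live in one graph, the graph keyword's posting list covers almost everything and the intersection saves little, which matches the remark above.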