From: Seaborne, A. <and...@hp...> - 2007-01-23 15:26:23
-------- Original Message --------
> From: Christian Weiske <>
> Date: 22 January 2007 18:23
>
> Hello all,
>
>
> I face a very big problem regarding my SparqlEngineDb implementation:
> Sorting.
>
> One can sort by any variable of any type in Sparql. The data type of the
> values is stored in statements/l_datatype column and can hardly be
> directly used in a SQL query.
>
> Imagine the following query:
> ----------------------
> SELECT ?name ?emp
> WHERE { ?x foaf:name ?name ;
>            ex:empId ?emp
>       }
> ORDER BY ASC(?emp)
> ----------------------
>
> Now ?emp is an object following an "ex:empId" predicate, making the
> datatype of ?emp xsd:integer. Making the SQL query in this case is easy;
> just "ORDER BY CAST(emp as INTEGER) ASC" and it's done.
>
> Main problem is that I don't know in advance which data type the column
> will have (as there could be statements like "?a ?b ?c .", ordering by
> ?c) and that, even more serious, there could be multiple data types in
> this data.
>
> I don't know what to do here. The only ideas I came up with are:
> 1) Use a stored procedure which does comparison. This will render the
> engine useless for older db systems such as mysql < 5. I also need to
> get into stored procedures, but I think it is possible to hook into
> sorting.
> 2) Sort on the client side. I would like to omit this since it would
> decrease performance greatly. Further, I couldn't use server-side
> LIMITs, making performance even worse.
> 3) Provide data type hints in the query, which would make the sparql
> implementation understand proprietary queries and still wouldn't work
> with normal queries.
>
> Anybody an idea what I could do?

The ordering of unlike things is partially defined in SPARQL. I'm not
sure what your DB schema is, but if the type of the node (URI, bnode,
datatype of literal) is stored as some carefully chosen integer, then

    ORDER BY ?emp

is (nearly)

    ORDER BY emp.type, emp.lexicalform.

That is, the kind of node is the more significant sort key. The numbers
need to follow the order of these conditions:

[[
1. (Lowest) no value assigned to the variable or expression in this
   solution.
2. Blank nodes
3. IRIs
4. RDF literals
5. A plain literal is lower than an RDF literal with type xsd:string of
   the same lexical form.
]]

Sorting by lexical form is too weak for value sorting of numbers. This
approach might be extensible to cover that.

And what is the SQL schema, by the way?

>
>
> Btw, RDQL doesn't even provide ORDER support. I think I know why.. :-)

	Andy
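
To illustrate the type-rank ordering described above, here is a minimal sketch. It is not the actual schema (which is not known here): the table name `emp` and its `type` and `lexicalform` columns simply mirror the example in the reply, the rank integers are hypothetical values chosen to follow the quoted SPARQL order, and SQLite (via Python's standard `sqlite3` module) stands in for whatever database the engine targets.

```python
import sqlite3

# Hypothetical rank codes (assumption): lower sorts first, following the
# quoted SPARQL order of blank nodes, then IRIs, then RDF literals.
RANK_BNODE, RANK_IRI, RANK_LITERAL = 2, 3, 4

conn = sqlite3.connect(":memory:")
# Assumed layout: one row per value bound to ?emp.
conn.execute("CREATE TABLE emp (type INTEGER, lexicalform TEXT)")
conn.executemany(
    "INSERT INTO emp (type, lexicalform) VALUES (?, ?)",
    [
        (RANK_LITERAL, "10"),
        (RANK_IRI, "http://example.org/a"),
        (RANK_LITERAL, "9"),
        (RANK_BNODE, "b0"),
    ],
)

# The kind of node is the most significant key, the lexical form the next.
rows = conn.execute(
    "SELECT type, lexicalform FROM emp ORDER BY type, lexicalform"
).fetchall()

for row in rows:
    print(row)
# (2, 'b0')
# (3, 'http://example.org/a')
# (4, '10')
# (4, '9')
```

The last two rows show why lexical form alone is too weak for numbers: as text, "10" sorts before "9". Covering that case would need some further sort key based on the numeric value, which this sketch does not attempt.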