From: Jim B. <ba...@ne...> - 2014-11-05 20:43:35
I thought that different types of strings might affect the ordering, so I also tried this as the last line of the query:

ORDER BY STR(?term_label)

This also results in similar incorrect ordering. Would you expect this to be enough to remove any problems due to different literal types? Based on the standard, my expectation is that this would.

Thank you,
Jim

> On Nov 5, 2014, at 2:14 PM, Bryan Thompson <br...@sy...> wrote:
>
> Jim,
>
> If you look at the SPARQL output, the labels appear to be present twice because some of them are:
>
> <literal xml:lang='en'>anterior humeral ridge</literal>
>
> and some are:
>
> <literal datatype='http://www.w3.org/2001/XMLSchema#string'>1st arch mandibular component</literal>
>
> So they are not the same "type" of literal.
>
> You can probably cast everything to a single type to get around this.
>
> Please check with the standard, but I am not sure that there is a bug here.
>
> Thanks,
> Bryan
>
>
> ----
> Bryan Thompson
> Chief Scientist & Founder
> SYSTAP, LLC
> 4501 Tower Road
> Greensboro, NC 27410
> br...@sy...
> http://bigdata.com
> http://mapgraph.io
>
>
>
> On Wed, Nov 5, 2014 at 10:36 AM, Jim Balhoff <ba...@ne...> wrote:
>
> On Nov 5, 2014, at 10:20 AM, Bryan Thompson <br...@sy...> wrote:
> >
> > Is there a public endpoint and query that I can use to test this?
>
> I will send you a separate email with this.
>
> >
> > If this is local data, is there a small data set that we can use to replicate the problem?
>
> I am using the same dataset in a local instance as in the original ticket: http://purl.obolibrary.org/obo/uberon/releases/2014-10-26/ext.owl
>
> Just the triples in that file. Query:
>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX owl: <http://www.w3.org/2002/07/owl#>
>
> SELECT DISTINCT ?term ?term_label
> WHERE
> {
> ?term rdf:type owl:Class .
> ?term rdfs:label ?term_label .
> }
> ORDER BY ?term_label
>
> >
> > In general, the ORDER BY operator should execute once ALL solutions have been materialized within that operator. It then applies the sort and the solutions are reported.
> >
> > My questions would be:
> >
> > - What is the EXPLAIN of the query?
>
> I attached a copy of the EXPLAIN output, in HTML format to preserve the table. To me it looks like the sort is not happening at the end, but instead earlier, but I don't have much confidence in my understanding of everything being reported.
>
> > - Does a simple unit test of the MemorySortOp show the same problem? That is, is this related to the MemorySortOp implementation or the query engine / query plan generator?
>
> I've only tested SPARQL queries so far.
>
> > - Are there any odd things going on with the unicode setup? Are the characters "a" and "a" really the same characters.
>
> Not that I know of. I can create a new ticket for this if you would like.
>
> Thanks,
> Jim
>
>
> _______________________________________________
> Bigdata-developers mailing list
> Big...@li...
> https://lists.sourceforge.net/lists/listinfo/bigdata-developers
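
As a rough illustration of the expectation stated at the top of this message, the Python sketch below models the two literal forms quoted from the SPARQL output and compares two sort keys. It is not Bigdata code: the Literal tuple, both key functions, and the choice to group language-tagged literals ahead of xsd:string literals are assumptions made for illustration, and the last two labels are invented rather than taken from the dataset. Sorting on the lexical form alone is what applying STR() to the sort expression would be expected to produce, since STR() returns only the lexical form of a literal.

```python
from typing import NamedTuple, Optional

XSD_STRING = "http://www.w3.org/2001/XMLSchema#string"


class Literal(NamedTuple):
    """Hypothetical stand-in for an RDF literal in a result set."""
    lexical: str
    lang: Optional[str] = None
    datatype: Optional[str] = None


def grouped_key(lit: Literal):
    # Assumed behaviour: literals are first grouped by kind
    # (language-tagged before datatyped), then sorted within each group.
    kind = 0 if lit.lang is not None else 1
    return (kind, lit.lexical)


def str_key(lit: Literal):
    # Mirrors ORDER BY STR(?term_label): compare lexical forms only,
    # ignoring language tag and datatype.
    return lit.lexical


labels = [
    Literal("anterior humeral ridge", lang="en"),                    # from the thread
    Literal("1st arch mandibular component", datatype=XSD_STRING),   # from the thread
    Literal("zygomatic arch", lang="en"),                            # invented example
    Literal("basal plate", datatype=XSD_STRING),                     # invented example
]

by_kind = [lit.lexical for lit in sorted(labels, key=grouped_key)]
by_str = [lit.lexical for lit in sorted(labels, key=str_key)]

print("grouped by literal kind:", by_kind)
print("sorted by lexical form: ", by_str)
```

With the lexical-form key the two kinds of literal interleave in a single sequence. With the grouped key each kind is sorted separately, which would look like the labels being ordered twice.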