From: Bryan T. <br...@sy...> - 2014-11-06 15:10:44
Jim,

502 is about support for expressions (other than simple variables) in ORDER BY. If there is an issue with DISTINCT + ORDER BY then this would be a new ticket. Just post the EXPLAIN (attach it to the email) for the moment. I want to see how this is being generated. We should then check the specification and make sure that the correct behavior is DISTINCT followed by ORDER BY, with any limit applied after the ORDER BY. I can then check the code for how we are handling this.

The relevant logic is in AST2BOpUtility at line 451. You can see that it is already attempting to handle this and that there was a historical ticket for this issue (#563).

        /*
         * Note: The DISTINCT operators also enforce the projection.
         *
         * Note: REDUCED allows, but does not require, either complete or
         * partial filtering of duplicates. It is part of what openrdf does
         * for a DESCRIBE query.
         *
         * Note: We do not currently have special operator for REDUCED. One
         * could be created using chunk wise DISTINCT. Note that REDUCED may
         * not change the order in which the solutions appear (but we are
         * evaluating it before ORDER BY so that is Ok.)
         *
         * TODO If there is an ORDER BY and a DISTINCT then the sort can be
         * used to impose the distinct without the overhead of a hash index
         * by filtering out the duplicate solutions after the sort.
         */

        // When true, DISTINCT must preserve ORDER BY ordering.
        final boolean preserveOrder;

        if (orderBy != null && !orderBy.isEmpty()) {

            /**
             * Note: ORDER BY before DISTINCT, so DISTINCT must preserve
             * order.
             *
             * @see https://sourceforge.net/apps/trac/bigdata/ticket/563
             *      (ORDER BY + DISTINCT)
             */
            preserveOrder = true;

            left = addOrderBy(left, queryBase, orderBy, ctx);

        } else {

            preserveOrder = false;

        }

        if (projection.isDistinct() || projection.isReduced()) {

            left = addDistinct(left, queryBase, preserveOrder, ctx);

        }

    } else {

        /*
         * TODO Under what circumstances can the projection be [null]?
         */
        if (orderBy != null && !orderBy.isEmpty()) {

            left = addOrderBy(left, queryBase, orderBy, ctx);

        }

    }

Bryan

----
Bryan Thompson
Chief Scientist & Founder
SYSTAP, LLC
4501 Tower Road
Greensboro, NC 27410
br...@sy...
http://bigdata.com
http://mapgraph.io

CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments.

On Thu, Nov 6, 2014 at 10:03 AM, Jim Balhoff <ba...@ne...> wrote:
> Hi Bryan,
>
> Just to clarify, would you like me to attach the info to ticket 502, or
> continue posting to the developer list?
>
> Thanks,
> Jim
>
> > On Nov 6, 2014, at 8:28 AM, Bryan Thompson <br...@sy...> wrote:
> >
> > The ticket for allowing aggregates in ORDER BY is:
> >
> > - http://trac.bigdata.com/ticket/502 (Allow aggregates in ORDER BY clause)
> >
> > Can you attach the EXPLAIN of the query with and without DISTINCT. The
> > issue may be that the DISTINCT is being applied after the ORDER BY. I seem
> > to remember some issue historically with operations being performed
> > before/after the ORDER BY, but I do not have any distinct recollection of a
> > problematic interaction between DISTINCT and ORDER BY.
> >
> > Bryan
> >
> > On Wed, Nov 5, 2014 at 6:14 PM, Jim Balhoff <ba...@ne...> wrote:
> > > On Nov 5, 2014, at 5:46 PM, Jeremy J Carroll <jj...@sy...> wrote:
> > > >
> > > >> On Nov 5, 2014, at 1:02 PM, Bryan Thompson <br...@sy...> wrote:
> > > >>
> > > >> There could be an issue with ORDER BY operating on an anonymous and
> > > >> non-projected variable. Try declaring and binding a variable for
> > > >> STR(?label) inside of the query and then using that variable in the
> > > >> ORDER BY clause.
> > > >
> > > > Yes I tend to find the results of ORDER BY are more what I expect if I
> > > > do not include an expression in the ORDER BY but simply variables. I BIND
> > > > any expression before the ORDER BY.
> > > >
> > > > I believe there is a trac item for this, but since the workaround is
> > > > easy, I have never seen it as high priority
> > >
> > > As suggested I tried binding a variable as `BIND (STR(?term_label) AS
> > > ?string_label)` and using that to sort. Still incorrect ordering. But, I
> > > tried removing DISTINCT, and then the ordering is correct. Even going back
> > > to the anonymous `ORDER BY STR(?term_label)`, ordering is still correct if
> > > I remove DISTINCT. For this specific query DISTINCT is not needed, but I do
> > > need it for my application. Is there a reason to not expect DISTINCT to
> > > work correctly with ORDER BY?
> > >
> > > Thanks both of you for all of your help,
> > > Jim
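
For anyone following the thread, here is a minimal sketch of what "DISTINCT must preserve order" means when the ORDER BY runs first. It is plain Java and not Bigdata code: the class name OrderByThenDistinct and both method names are hypothetical, and solutions are modeled as simple strings whose natural order stands in for the ORDER BY key. The second method illustrates the idea in the TODO comment (using the sort to impose the distinct without a hash index); that shortcut is only safe under the assumption that the sort key covers the whole solution, so that equal solutions end up adjacent.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashSet;
import java.util.List;

/** Hypothetical illustration only; not part of Bigdata. */
final class OrderByThenDistinct {

    /**
     * ORDER BY first, then a DISTINCT that keeps the sorted order.
     * Assumes the list holds no null elements.
     */
    static List<String> orderByThenDistinct(List<String> solutions) {
        List<String> sorted = new ArrayList<>(solutions);
        sorted.sort(Comparator.naturalOrder());
        // LinkedHashSet iterates in first-insertion order, so the sort
        // order survives duplicate removal. A plain HashSet would not
        // give that guarantee.
        return new ArrayList<>(new LinkedHashSet<>(sorted));
    }

    /**
     * DISTINCT without a hash index: once the input is sorted on the
     * whole solution, duplicates are adjacent and can be skipped by
     * comparing each element with the previous one.
     */
    static List<String> distinctAfterSort(List<String> sortedSolutions) {
        List<String> out = new ArrayList<>();
        String previous = null;
        for (String s : sortedSolutions) {
            if (out.isEmpty() || !s.equals(previous)) {
                out.add(s);
            }
            previous = s;
        }
        return out;
    }
}
```

Calling OrderByThenDistinct.orderByThenDistinct(List.of("b", "a", "b")) returns [a, b]. If the duplicate filter used an unordered hash structure instead, the result could come back in any order, which is the kind of symptom described above.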