This list is closed; nobody may subscribe to it.
From: Jim B. <ba...@ne...> - 2014-11-06 17:13:44
Oh no, I added the BIND to get rid of that. I'll redo and update you.

> On Nov 6, 2014, at 12:01 PM, Bryan Thompson <br...@sy...> wrote:
>
> What happens if you replace that last line with:
>
>     ORDER BY ?string_label
>
> rather than
>
>     ORDER BY STR(?string_label)
>
> Remember, it is assuming that the ORDER BY is using simple variables.
>
> Bryan
From: Bryan T. <br...@sy...> - 2014-11-06 17:01:12
What happens if you replace that last line with:

    ORDER BY ?string_label

rather than

    ORDER BY STR(?string_label)

Remember, it is assuming that the ORDER BY is using simple variables.

Bryan
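Bryan's suggestion rests on the expected evaluation order: sort the solutions first, then remove duplicates without disturbing the sort. A minimal Python sketch (an illustration only, not Bigdata code; the tuples stand in for `(?term, ?string_label)` solution bindings) of an order-preserving DISTINCT applied after ORDER BY:

```python
# Hypothetical solutions as (?term, ?string_label) pairs, with one duplicate.
solutions = [
    ("t2", "apple"),
    ("t1", "banana"),
    ("t1", "banana"),  # duplicate solution
    ("t3", "cherry"),
]

# ORDER BY ?string_label -- sorting on a simple variable.
ordered = sorted(solutions, key=lambda s: s[1])

# DISTINCT applied *after* the sort. dict.fromkeys() drops duplicates
# while preserving insertion order, so the sorted order survives.
distinct = list(dict.fromkeys(ordered))

labels = [label for _term, label in distinct]
assert labels == sorted(labels)  # ordering preserved
assert len(distinct) == 3        # duplicate removed
```

A hash-based DISTINCT that re-buckets solutions after the sort would break exactly this invariant, producing the out-of-order blocks described in the thread.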
From: Jim B. <ba...@ne...> - 2014-11-06 16:58:12
Here is the exact query (with or without DISTINCT) for the linked results:

    PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    PREFIX owl: <http://www.w3.org/2002/07/owl#>

    SELECT DISTINCT ?term ?string_label
    WHERE
    {
      ?term rdf:type owl:Class .
      ?term rdfs:label ?term_label .
      BIND (STR(?term_label) AS ?string_label)
    }
    ORDER BY STR(?string_label)

Results (same number of rows either way):

SELECT DISTINCT:
explain: https://dl.dropboxusercontent.com/u/6704325/bigdata/2014-11-6/with_distinct_explain.html
result: https://dl.dropboxusercontent.com/u/6704325/bigdata/2014-11-6/with_distinct_result.csv

SELECT:
explain: https://dl.dropboxusercontent.com/u/6704325/bigdata/2014-11-6/no_distinct_explain.html
result: https://dl.dropboxusercontent.com/u/6704325/bigdata/2014-11-6/no_distinct_result.csv

You can diff the two results files to see the out-of-order blocks.

I suppose it does look like the DISTINCT query plan has ORDER BY applied before DISTINCT, if I am reading it right.

Thanks,
Jim

> On Nov 6, 2014, at 10:10 AM, Bryan Thompson <br...@sy...> wrote:
>
> Jim,
>
> 502 is about support for expressions (other than simple variables) in ORDER_BY.
>
> If there is an issue with DISTINCT + ORDER_BY then this would be a new ticket.
>
> Just post the EXPLAIN (attach to the email) for the moment. I want to see how this is being generated. We should then check the specification and make sure that the correct behavior is DISTINCT followed by ORDER BY with any limit applied after the ORDER BY. I can then check the code for how we are handling this.
>
> The relevant logic is in AST2BOpUtility at line 451. You can see that it is already attempting to handle this and that there was a historical ticket for this issue (#563).
>
>     /*
>      * Note: The DISTINCT operators also enforce the projection.
>      *
>      * Note: REDUCED allows, but does not require, either complete or
>      * partial filtering of duplicates. It is part of what openrdf does
>      * for a DESCRIBE query.
>      *
>      * Note: We do not currently have special operator for REDUCED. One
>      * could be created using chunk wise DISTINCT. Note that REDUCED may
>      * not change the order in which the solutions appear (but we are
>      * evaluating it before ORDER BY so that is Ok.)
>      *
>      * TODO If there is an ORDER BY and a DISTINCT then the sort can be
>      * used to impose the distinct without the overhead of a hash index
>      * by filtering out the duplicate solutions after the sort.
>      */
>
>     // When true, DISTINCT must preserve ORDER BY ordering.
>     final boolean preserveOrder;
>
>     if (orderBy != null && !orderBy.isEmpty()) {
>
>         /*
>          * Note: ORDER BY before DISTINCT, so DISTINCT must preserve
>          * order.
>          *
>          * @see https://sourceforge.net/apps/trac/bigdata/ticket/563
>          * (ORDER BY + DISTINCT)
>          */
>
>         preserveOrder = true;
>
>         left = addOrderBy(left, queryBase, orderBy, ctx);
>
>     } else {
>
>         preserveOrder = false;
>
>     }
>
>     if (projection.isDistinct() || projection.isReduced()) {
>
>         left = addDistinct(left, queryBase, preserveOrder, ctx);
>
>     }
>
> } else {
>
>     /*
>      * TODO Under what circumstances can the projection be [null]?
>      */
>
>     if (orderBy != null && !orderBy.isEmpty()) {
>
>         left = addOrderBy(left, queryBase, orderBy, ctx);
>
>     }
>
> }
>
> Bryan
>
> ----
> Bryan Thompson
> Chief Scientist & Founder
> SYSTAP, LLC
> 4501 Tower Road
> Greensboro, NC 27410
> br...@sy...
> http://bigdata.com
> http://mapgraph.io
> CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments.
>
> On Thu, Nov 6, 2014 at 10:03 AM, Jim Balhoff <ba...@ne...> wrote:
>> Hi Bryan,
>>
>> Just to clarify, would you like me to attach the info to ticket 502, or continue posting to the developer list?
>>
>> Thanks,
>> Jim
>>
>>> On Nov 6, 2014, at 8:28 AM, Bryan Thompson <br...@sy...> wrote:
>>>
>>> The ticket for allowing aggregates in ORDER BY is:
>>>
>>> - http://trac.bigdata.com/ticket/502 (Allow aggregates in ORDER BY clause)
>>>
>>> Can you attach the EXPLAIN of the query with and without DISTINCT. The issue may be that the DISTINCT is being applied after the ORDER BY. I seem to remember some issue historically with operations being performed before/after the ORDER BY, but I do not have any distinct recollection of a problematic interaction between DISTINCT and ORDER BY.
>>>
>>> Bryan
>>>
>>> On Wed, Nov 5, 2014 at 6:14 PM, Jim Balhoff <ba...@ne...> wrote:
>>>> On Nov 5, 2014, at 5:46 PM, Jeremy J Carroll <jj...@sy...> wrote:
>>>>
>>>>> On Nov 5, 2014, at 1:02 PM, Bryan Thompson <br...@sy...> wrote:
>>>>>
>>>>> There could be an issue with ORDER BY operating on an anonymous and non-projected variable. Try declaring and binding a variable for STR(?label) inside of the query and then using that variable in the ORDER BY clause.
>>>>
>>>> Yes, I tend to find the results of ORDER BY are more what I expect if I do not include an expression in the ORDER BY but simply variables. I BIND any expression before the ORDER BY.
>>>>
>>>> I believe there is a trac item for this, but since the workaround is easy, I have never seen it as high priority.
>>>
>>> As suggested I tried binding a variable as `BIND (STR(?term_label) AS ?string_label)` and using that to sort. Still incorrect ordering. But, I tried removing DISTINCT, and then the ordering is correct. Even going back to the anonymous `ORDER BY STR(?term_label)`, ordering is still correct if I remove DISTINCT. For this specific query DISTINCT is not needed, but I do need it for my application. Is there a reason to not expect DISTINCT to work correctly with ORDER BY?
>>>
>>> Thanks both of you for all of your help,
>>> Jim
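The AST2BOpUtility fragment quoted in this message boils down to: append ORDER BY to the plan first, then append DISTINCT carrying a preserveOrder flag. A Python paraphrase (hypothetical names; the real code assembles query-engine operators, not strings) makes the intended plan shape explicit:

```python
def build_tail(plan, order_by, distinct):
    """Append ORDER BY / DISTINCT to a query plan, mirroring the quoted logic."""
    preserve_order = False
    if order_by:
        # ORDER BY is appended first, so any later DISTINCT must keep the order.
        preserve_order = True
        plan = plan + ["ORDER_BY"]
    if distinct:
        plan = plan + ["DISTINCT(preserveOrder=%s)" % preserve_order]
    return plan

# With both modifiers present, the plan is ORDER_BY followed by an
# order-preserving DISTINCT.
plan = build_tail(["JOIN"], order_by=True, distinct=True)
```

If the DISTINCT operator ignores its preserveOrder flag (for example, by re-bucketing solutions through an unordered hash index), the plan shape is right but the output ordering is still lost, which would match the out-of-order blocks Jim observes.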
From: Bryan T. <br...@sy...> - 2014-11-06 15:36:31
|
This is a critical fix release of bigdata(R). All users are encouraged to upgrade immediately. Bigdata is a horizontally-scaled, open-source architecture for indexed data with an emphasis on RDF capable of loading 1B triples in under one hour on a 15 node cluster. Bigdata operates in both a single machine mode (Journal), highly available replication cluster mode (HAJournalServer), and a horizontally sharded cluster mode (BigdataFederation). The Journal provides fast scalable ACID indexed storage for very large data sets, up to 50 billion triples / quads. The HAJournalServer adds replication, online backup, horizontal scaling of query, and high availability. The federation provides fast scalable shard-wise parallel indexed storage using dynamic sharding and shard-wise ACID updates and incremental cluster size growth. Both platforms support fully concurrent readers with snapshot isolation. Distributed processing offers greater throughput but does not reduce query or update latency. Choose the Journal when the anticipated scale and throughput requirements permit. Choose the HAJournalServer for high availability and linear scaling in query throughput. Choose the BigdataFederation when the administrative and machine overhead associated with operating a cluster is an acceptable tradeoff to have essentially unlimited data scaling and throughput. See [1,2,8] for instructions on installing bigdata(R), [4] for the javadoc, and [3,5,6] for news, questions, and the latest developments. For more information about SYSTAP, LLC and bigdata, see [7]. Starting with the 1.0.0 release, we offer a WAR artifact [8] for easy installation of the single machine RDF database. For custom development and cluster installations we recommend checking out the code from SVN using the tag for this release. The code will build automatically under eclipse. You can also build the code using the ant script. The cluster installer requires the use of the ant script. 
Starting with the 1.3.0 release, we offer a tarball artifact [10] for easy installation of the HA replication cluster. You can download the WAR (standalone) or HA artifacts from: http://sourceforge.net/projects/bigdata/ You can checkout this release from: https://svn.code.sf.net/p/bigdata/code/tags/BIGDATA_RELEASE_1_3_4 Critical or otherwise of note in this minor release: *- #1036 (Journal leaks storage with SPARQL UPDATE and REST API)* New features in 1.3.x: - Java 7 is now required. - High availability [10]. - High availability load balancer. - New RDF/SPARQL workbench. - Blueprints API. - RDF Graph Mining Service (GASService) [12]. - Reification Done Right (RDR) support [11]. - Property Path performance enhancements. - Plus numerous other bug fixes and performance enhancements. Feature summary: - Highly Available Replication Clusters (HAJournalServer [10]) - Single machine data storage to ~50B triples/quads (RWStore); - Clustered data storage is essentially unlimited (BigdataFederation); - Simple embedded and/or webapp deployment (NanoSparqlServer); - Triples, quads, or triples with provenance (SIDs); - Fast RDFS+ inference and truth maintenance; - Fast 100% native SPARQL 1.1 evaluation; - Integrated "analytic" query package; - %100 Java memory manager leverages the JVM native heap (no GC); Road map [3]: - Column-wise indexing; - Runtime Query Optimizer for quads; - Performance optimization for scale-out clusters; and - Simplified deployment, configuration, and administration for scale-out clusters. Change log: Note: Versions with (*) MAY require data migration. For details, see [9]. 
1.3.4: - http://trac.bigdata.com/ticket/946 (Empty PROJECTION causes IllegalArgumentException) - http://trac.bigdata.com/ticket/1036 (Journal leaks storage with SPARQL UPDATE and REST API) - http://trac.bigdata.com/ticket/1008 (remote service queries should put parameters in the request body when using POST) 1.3.3: - http://trac.bigdata.com/ticket/980 (Object position of query hint is not a Literal (partial resolution - see #1028 as well)) - http://trac.bigdata.com/ticket/1018 (Add the ability to track and cancel all queries issued through a BigdataSailRemoteRepositoryConnection) - http://trac.bigdata.com/ticket/1021 (Add critical section protection to AbstractJournal.abort() and BigdataSailConnection.rollback()) - http://trac.bigdata.com/ticket/1024 (GregorianCalendar? does weird things before 1582) - http://trac.bigdata.com/ticket/1026 (SPARQL UPDATE with runtime errors causes problems with lexicon indices) - http://trac.bigdata.com/ticket/1028 (very rare NotMaterializedException: XSDBoolean(true)) - http://trac.bigdata.com/ticket/1029 (RWStore commit state not correctly rolled back if abort fails on empty journal) - http://trac.bigdata.com/ticket/1030 (RWStorage stats cleanup) 1.3.2: - http://trac.bigdata.com/ticket/1016 (Jetty/LBS issues when deployed as WAR under tomcat) - http://trac.bigdata.com/ticket/1010 (Upgrade apache http components to 1.3.1 (security)) - http://trac.bigdata.com/ticket/1005 (Invalidate BTree objects if error occurs during eviction) - http://trac.bigdata.com/ticket/1004 (Concurrent binding problem) - http://trac.bigdata.com/ticket/1002 (Concurrency issues in JVMHashJoinUtility caused by MAX_PARALLEL query hint override) - http://trac.bigdata.com/ticket/1000 (Add configuration option to turn off bottom-up evaluation) - http://trac.bigdata.com/ticket/999 (Extend BigdataSailFactory to take arbitrary properties) - http://trac.bigdata.com/ticket/998 (SPARQL Update through BigdataGraph) - http://trac.bigdata.com/ticket/996 (Add custom prefix 
support for query results)
- http://trac.bigdata.com/ticket/995 (Allow general purpose SPARQL queries through BigdataGraph)
- http://trac.bigdata.com/ticket/992 (Deadlock between AbstractRunningQuery.cancel(), QueryLog.log(), and ArbitraryLengthPathTask)
- http://trac.bigdata.com/ticket/990 (Query hints not recognized in FILTERs)
- http://trac.bigdata.com/ticket/989 (Stored query service)
- http://trac.bigdata.com/ticket/988 (Bad performance for FILTER EXISTS)
- http://trac.bigdata.com/ticket/987 (maven build is broken)
- http://trac.bigdata.com/ticket/986 (Improve locality for small allocation slots)
- http://trac.bigdata.com/ticket/985 (Deadlock in BigdataTriplePatternMaterializer)
- http://trac.bigdata.com/ticket/975 (HA Health Status Page)
- http://trac.bigdata.com/ticket/974 (Name2Addr.indexNameScan(prefix) uses scan + filter)
- http://trac.bigdata.com/ticket/973 (RWStore.commit() should be more defensive)
- http://trac.bigdata.com/ticket/971 (Clarify HTTP Status codes for CREATE NAMESPACE operation)
- http://trac.bigdata.com/ticket/968 (no link to wiki from workbench)
- http://trac.bigdata.com/ticket/966 (Failed to get namespace under concurrent update)
- http://trac.bigdata.com/ticket/965 (Can not run LBS mode with HA1 setup)
- http://trac.bigdata.com/ticket/961 (Clone/modify namespace to create a new one)
- http://trac.bigdata.com/ticket/960 (Export namespace properties in XML/Java properties text format)
- http://trac.bigdata.com/ticket/938 (HA Load Balancer)
- http://trac.bigdata.com/ticket/936 (Support larger metabits allocations)
- http://trac.bigdata.com/ticket/932 (Bigdata/Rexster integration)
- http://trac.bigdata.com/ticket/919 (Formatted Layout for Status pages)
- http://trac.bigdata.com/ticket/899 (REST API Query Cancellation)
- http://trac.bigdata.com/ticket/885 (Panels do not appear on startup in Firefox)
- http://trac.bigdata.com/ticket/884 (Executing a new query should clear the old query results from the console)
- http://trac.bigdata.com/ticket/882 (Abbreviate URIs that can be namespaced with one of the defined common namespaces)
- http://trac.bigdata.com/ticket/880 (Can't explore an absolute URI with < >)
- http://trac.bigdata.com/ticket/878 (Explore page looks weird when empty)
- http://trac.bigdata.com/ticket/873 (Allow user to go use browser back & forward buttons to view explore history)
- http://trac.bigdata.com/ticket/865 (OutOfMemoryError instead of Timeout for SPARQL Property Paths)
- http://trac.bigdata.com/ticket/858 (Change explore URLs to include URI being clicked so user can see what they've clicked on before)
- http://trac.bigdata.com/ticket/855 (AssertionError: Child does not have persistent identity)
- http://trac.bigdata.com/ticket/850 (Search functionality in workbench)
- http://trac.bigdata.com/ticket/847 (Query results panel should recognize well known namespaces for easier reading)
- http://trac.bigdata.com/ticket/845 (Display the properties for a namespace)
- http://trac.bigdata.com/ticket/843 (Create new tabs for status & performance counters, and add per namespace service/VoID description links)
- http://trac.bigdata.com/ticket/837 (Configurator for new namespaces)
- http://trac.bigdata.com/ticket/836 (Allow user to create namespace in the workbench)
- http://trac.bigdata.com/ticket/830 (Output RDF data from queries in table format)
- http://trac.bigdata.com/ticket/829 (Export query results)
- http://trac.bigdata.com/ticket/828 (Save selected namespace in browser)
- http://trac.bigdata.com/ticket/827 (Explore tab in workbench)
- http://trac.bigdata.com/ticket/826 (Create shortcut to execute load/query)
- http://trac.bigdata.com/ticket/823 (Disable textarea when a large file is selected)
- http://trac.bigdata.com/ticket/820 (Allow non-file:// URLs to be loaded)
- http://trac.bigdata.com/ticket/819 (Retrieve default namespace on page load)
- http://trac.bigdata.com/ticket/772 (Query timeout only checked at operator start/stop)
- http://trac.bigdata.com/ticket/765 (order by expr skips invalid expressions)
- http://trac.bigdata.com/ticket/587 (JSP page to configure KBs)
- http://trac.bigdata.com/ticket/343 (Stochastic assert in AbstractBTree#writeNodeOrLeaf() in CI)

1.3.1:
- http://trac.bigdata.com/ticket/242 (Deadlines do not play well with GROUP_BY, ORDER_BY, etc.)
- http://trac.bigdata.com/ticket/256 (Amortize RTO cost)
- http://trac.bigdata.com/ticket/257 (Support BOP fragments in the RTO.)
- http://trac.bigdata.com/ticket/258 (Integrate RTO into SAIL)
- http://trac.bigdata.com/ticket/259 (Dynamically increase RTO sampling limit.)
- http://trac.bigdata.com/ticket/526 (Reification done right)
- http://trac.bigdata.com/ticket/580 (Problem with the bigdata RDF/XML parser with sids)
- http://trac.bigdata.com/ticket/622 (NSS using jetty+windows can lose connections (windows only; jdk 6/7 bug))
- http://trac.bigdata.com/ticket/624 (HA Load Balancer)
- http://trac.bigdata.com/ticket/629 (Graph processing API)
- http://trac.bigdata.com/ticket/721 (Support HA1 configurations)
- http://trac.bigdata.com/ticket/730 (Allow configuration of embedded NSS jetty server using jetty-web.xml)
- http://trac.bigdata.com/ticket/759 (multiple filters interfere)
- http://trac.bigdata.com/ticket/763 (Stochastic results with Analytic Query Mode)
- http://trac.bigdata.com/ticket/774 (Converge on Java 7.)
- http://trac.bigdata.com/ticket/779 (Resynchronization of socket level write replication protocol (HA))
- http://trac.bigdata.com/ticket/780 (Incremental or asynchronous purge of HALog files)
- http://trac.bigdata.com/ticket/782 (Wrong serialization version)
- http://trac.bigdata.com/ticket/784 (Describe Limit/offset don't work as expected)
- http://trac.bigdata.com/ticket/787 (Update documentations and samples, they are OUTDATED)
- http://trac.bigdata.com/ticket/788 (Name2Addr does not report all root causes if the commit fails.)
- http://trac.bigdata.com/ticket/789 (ant task to build sesame fails, docs for setting up bigdata for sesame are ancient)
- http://trac.bigdata.com/ticket/790 (should not be pruning any children)
- http://trac.bigdata.com/ticket/791 (Clean up query hints)
- http://trac.bigdata.com/ticket/793 (Explain reports incorrect value for opCount)
- http://trac.bigdata.com/ticket/796 (Filter assigned to sub-query by query generator is dropped from evaluation)
- http://trac.bigdata.com/ticket/797 (add sbt setup to getting started wiki)
- http://trac.bigdata.com/ticket/798 (Solution order not always preserved)
- http://trac.bigdata.com/ticket/799 (mis-optimation of quad pattern vs triple pattern)
- http://trac.bigdata.com/ticket/802 (Optimize DatatypeFactory instantiation in DateTimeExtension)
- http://trac.bigdata.com/ticket/803 (prefixMatch does not work in full text search)
- http://trac.bigdata.com/ticket/804 (update bug deleting quads)
- http://trac.bigdata.com/ticket/806 (Incorrect AST generated for OPTIONAL { SELECT })
- http://trac.bigdata.com/ticket/808 (Wildcard search in bigdata for type suggessions)
- http://trac.bigdata.com/ticket/810 (Expose GAS API as SPARQL SERVICE)
- http://trac.bigdata.com/ticket/815 (RDR query does too much work)
- http://trac.bigdata.com/ticket/816 (Wildcard projection ignores variables inside a SERVICE call.)
- http://trac.bigdata.com/ticket/817 (Unexplained increase in journal size)
- http://trac.bigdata.com/ticket/821 (Reject large files, rather then storing them in a hidden variable)
- http://trac.bigdata.com/ticket/831 (UNION with filter issue)
- http://trac.bigdata.com/ticket/841 (Using "VALUES" in a query returns lexical error)
- http://trac.bigdata.com/ticket/848 (Fix SPARQL Results JSON writer to write the RDR syntax)
- http://trac.bigdata.com/ticket/849 (Create writers that support the RDR syntax)
- http://trac.bigdata.com/ticket/851 (RDR GAS interface)
- http://trac.bigdata.com/ticket/852 (RemoteRepository.cancel() does not consume the HTTP response entity.)
- http://trac.bigdata.com/ticket/853 (Follower does not accept POST of idempotent operations (HA))
- http://trac.bigdata.com/ticket/854 (Allow override of maximum length before converting an HTTP GET to an HTTP POST)
- http://trac.bigdata.com/ticket/855 (AssertionError: Child does not have persistent identity)
- http://trac.bigdata.com/ticket/862 (Create parser for JSON SPARQL Results)
- http://trac.bigdata.com/ticket/863 (HA1 commit failure)
- http://trac.bigdata.com/ticket/866 (Batch remove API for the SAIL)
- http://trac.bigdata.com/ticket/867 (NSS concurrency problem with list namespaces and create namespace)
- http://trac.bigdata.com/ticket/869 (HA5 test suite)
- http://trac.bigdata.com/ticket/872 (Full text index range count optimization)
- http://trac.bigdata.com/ticket/874 (FILTER not applied when there is UNION in the same join group)
- http://trac.bigdata.com/ticket/876 (When I upload a file I want to see the filename.)
- http://trac.bigdata.com/ticket/877 (RDF Format selector is invisible)
- http://trac.bigdata.com/ticket/883 (CANCEL Query fails on non-default kb namespace on HA follower.)
- http://trac.bigdata.com/ticket/886 (Provide workaround for bad reverse DNS setups.)
- http://trac.bigdata.com/ticket/887 (BIND is leaving a variable unbound)
- http://trac.bigdata.com/ticket/892 (HAJournalServer does not die if zookeeper is not running)
- http://trac.bigdata.com/ticket/893 (large sparql insert optimization slow?)
- http://trac.bigdata.com/ticket/894 (unnecessary synchronization)
- http://trac.bigdata.com/ticket/895 (stack overflow in populateStatsMap)
- http://trac.bigdata.com/ticket/902 (Update Basic Bigdata Chef Cookbook)
- http://trac.bigdata.com/ticket/904 (AssertionError: PropertyPathNode got to ASTJoinOrderByType.optimizeJoinGroup)
- http://trac.bigdata.com/ticket/905 (unsound combo query optimization: union + filter)
- http://trac.bigdata.com/ticket/906 (DC Prefix Button Appends "</li>")
- http://trac.bigdata.com/ticket/907 (Add a quick-start ant task for the BD Server "ant start")
- http://trac.bigdata.com/ticket/912 (Provide a configurable IAnalyzerFactory)
- http://trac.bigdata.com/ticket/913 (Blueprints API Implementation)
- http://trac.bigdata.com/ticket/914 (Settable timeout on SPARQL Query (REST API))
- http://trac.bigdata.com/ticket/915 (DefaultAnalyzerFactory issues)
- http://trac.bigdata.com/ticket/920 (Content negotiation orders accept header scores in reverse)
- http://trac.bigdata.com/ticket/939 (NSS does not start from command line: bigdata-war/src not found.)
- http://trac.bigdata.com/ticket/940 (ProxyServlet in web.xml breaks tomcat WAR (HA LBS))

1.3.0:
- http://trac.bigdata.com/ticket/530 (Journal HA)
- http://trac.bigdata.com/ticket/621 (Coalesce write cache records and install reads in cache)
- http://trac.bigdata.com/ticket/623 (HA TXS)
- http://trac.bigdata.com/ticket/639 (Remove triple-buffering in RWStore)
- http://trac.bigdata.com/ticket/645 (HA backup)
- http://trac.bigdata.com/ticket/646 (River not compatible with newer 1.6.0 and 1.7.0 JVMs)
- http://trac.bigdata.com/ticket/648 (Add a custom function to use full text index for filtering.)
- http://trac.bigdata.com/ticket/651 (RWS test failure)
- http://trac.bigdata.com/ticket/652 (Compress write cache blocks for replication and in HALogs)
- http://trac.bigdata.com/ticket/662 (Latency on followers during commit on leader)
- http://trac.bigdata.com/ticket/663 (Issue with OPTIONAL blocks)
- http://trac.bigdata.com/ticket/664 (RWStore needs post-commit protocol)
- http://trac.bigdata.com/ticket/665 (HA3 LOAD non-responsive with node failure)
- http://trac.bigdata.com/ticket/666 (Occasional CI deadlock in HALogWriter testConcurrentRWWriterReader)
- http://trac.bigdata.com/ticket/670 (Accumulating HALog files cause latency for HA commit)
- http://trac.bigdata.com/ticket/671 (Query on follower fails during UPDATE on leader)
- http://trac.bigdata.com/ticket/673 (DGC in release time consensus protocol causes native thread leak in HAJournalServer at each commit)
- http://trac.bigdata.com/ticket/674 (WCS write cache compaction causes errors in RWS postHACommit())
- http://trac.bigdata.com/ticket/676 (Bad patterns for timeout computations)
- http://trac.bigdata.com/ticket/677 (HA deadlock under UPDATE + QUERY)
- http://trac.bigdata.com/ticket/678 (DGC Thread and Open File Leaks: sendHALogForWriteSet())
- http://trac.bigdata.com/ticket/679 (HAJournalServer can not restart due to logically empty log file)
- http://trac.bigdata.com/ticket/681 (HAJournalServer deadlock: pipelineRemove() and getLeaderId())
- http://trac.bigdata.com/ticket/684 (Optimization with skos altLabel)
- http://trac.bigdata.com/ticket/686 (Consensus protocol does not detect clock skew correctly)
- http://trac.bigdata.com/ticket/687 (HAJournalServer Cache not populated)
- http://trac.bigdata.com/ticket/689 (Missing URL encoding in RemoteRepositoryManager)
- http://trac.bigdata.com/ticket/690 (Error when using the alias "a" instead of rdf:type for a multipart insert)
- http://trac.bigdata.com/ticket/691 (Failed to re-interrupt thread in HAJournalServer)
- http://trac.bigdata.com/ticket/692 (Failed to re-interrupt thread)
- http://trac.bigdata.com/ticket/693 (OneOrMorePath SPARQL property path expression ignored)
- http://trac.bigdata.com/ticket/694 (Transparently cancel update/query in RemoteRepository)
- http://trac.bigdata.com/ticket/695 (HAJournalServer reports "follower" but is in SeekConsensus and is not participating in commits.)
- http://trac.bigdata.com/ticket/701 (Problems in BackgroundTupleResult)
- http://trac.bigdata.com/ticket/702 (InvocationTargetException on / namespace call)
- http://trac.bigdata.com/ticket/704 (ask does not return json)
- http://trac.bigdata.com/ticket/705 (Race between QueryEngine.putIfAbsent() and shutdownNow())
- http://trac.bigdata.com/ticket/706 (MultiSourceSequentialCloseableIterator.nextSource() can throw NPE)
- http://trac.bigdata.com/ticket/707 (BlockingBuffer.close() does not unblock threads)
- http://trac.bigdata.com/ticket/708 (BIND heisenbug - race condition on select query with BIND)
- http://trac.bigdata.com/ticket/711 (sparql protocol: mime type application/sparql-query)
- http://trac.bigdata.com/ticket/712 (SELECT ?x { OPTIONAL { ?x eg:doesNotExist eg:doesNotExist } } incorrect)
- http://trac.bigdata.com/ticket/715 (Interrupt of thread submitting a query for evaluation does not always terminate the AbstractRunningQuery)
- http://trac.bigdata.com/ticket/716 (Verify that IRunningQuery instances (and nested queries) are correctly cancelled when interrupted)
- http://trac.bigdata.com/ticket/718 (HAJournalServer needs to handle ZK client connection loss)
- http://trac.bigdata.com/ticket/720 (HA3 simultaneous service start failure)
- http://trac.bigdata.com/ticket/723 (HA asynchronous tasks must be canceled when invariants are changed)
- http://trac.bigdata.com/ticket/725 (FILTER EXISTS in subselect)
- http://trac.bigdata.com/ticket/726 (Logically empty HALog for committed transaction)
- http://trac.bigdata.com/ticket/727 (DELETE/INSERT fails with OPTIONAL non-matching WHERE)
- http://trac.bigdata.com/ticket/728 (Refactor to create HAClient)
- http://trac.bigdata.com/ticket/729 (ant bundleJar not working)
- http://trac.bigdata.com/ticket/731 (CBD and Update leads to 500 status code)
- http://trac.bigdata.com/ticket/732 (describe statement limit does not work)
- http://trac.bigdata.com/ticket/733 (Range optimizer not optimizing Slice service)
- http://trac.bigdata.com/ticket/734 (two property paths interfere)
- http://trac.bigdata.com/ticket/736 (MIN() malfunction)
- http://trac.bigdata.com/ticket/737 (class cast exception)
- http://trac.bigdata.com/ticket/739 (Inconsistent treatment of bind and optional property path)
- http://trac.bigdata.com/ticket/741 (ctc-striterators should build as independent top-level project (Apache2))
- http://trac.bigdata.com/ticket/743 (AbstractTripleStore.destroy() does not filter for correct prefix)
- http://trac.bigdata.com/ticket/746 (Assertion error)
- http://trac.bigdata.com/ticket/747 (BOUND bug)
- http://trac.bigdata.com/ticket/748 (incorrect join with subselect renaming vars)
- http://trac.bigdata.com/ticket/754 (Failure to setup SERVICE hook and changeLog for Unisolated and Read/Write connections)
- http://trac.bigdata.com/ticket/755 (Concurrent QuorumActors can interfere leading to failure to progress)
- http://trac.bigdata.com/ticket/756 (order by and group_concat)
- http://trac.bigdata.com/ticket/760 (Code review on 2-phase commit protocol)
- http://trac.bigdata.com/ticket/764 (RESYNC failure (HA))
- http://trac.bigdata.com/ticket/770 (alpp ordering)
- http://trac.bigdata.com/ticket/772 (Query timeout only checked at operator start/stop.)
- http://trac.bigdata.com/ticket/776 (Closed as duplicate of #490)
- http://trac.bigdata.com/ticket/778 (HA Leader fail results in transient problem with allocations on other services)
- http://trac.bigdata.com/ticket/783 (Operator Alerts (HA))

1.2.4:
- http://trac.bigdata.com/ticket/777 (ConcurrentModificationException in ASTComplexOptionalOptimizer)

1.2.3:
- http://trac.bigdata.com/ticket/168 (Maven Build)
- http://trac.bigdata.com/ticket/196 (Journal leaks memory)
- http://trac.bigdata.com/ticket/235 (Occasional deadlock in CI runs in com.bigdata.io.writecache.TestAll)
- http://trac.bigdata.com/ticket/312 (CI (mock) quorums deadlock)
- http://trac.bigdata.com/ticket/405 (Optimize hash join for subgroups with no incoming bound vars.)
- http://trac.bigdata.com/ticket/412 (StaticAnalysis#getDefinitelyBound() ignores exogenous variables.)
- http://trac.bigdata.com/ticket/485 (RDFS Plus Profile)
- http://trac.bigdata.com/ticket/495 (SPARQL 1.1 Property Paths)
- http://trac.bigdata.com/ticket/519 (Negative parser tests)
- http://trac.bigdata.com/ticket/531 (SPARQL UPDATE for SOLUTION SETS)
- http://trac.bigdata.com/ticket/535 (Optimize JOIN VARS for Sub-Selects)
- http://trac.bigdata.com/ticket/555 (Support PSOutputStream/InputStream at IRawStore)
- http://trac.bigdata.com/ticket/559 (Use RDFFormat.NQUADS as the format identifier for the NQuads parser)
- http://trac.bigdata.com/ticket/570 (MemoryManager Journal does not implement all methods)
- http://trac.bigdata.com/ticket/575 (NSS Admin API)
- http://trac.bigdata.com/ticket/577 (DESCRIBE with OFFSET/LIMIT needs to use sub-select)
- http://trac.bigdata.com/ticket/578 (Concise Bounded Description (CBD))
- http://trac.bigdata.com/ticket/579 (CONSTRUCT should use distinct SPO filter)
- http://trac.bigdata.com/ticket/583 (VoID in ServiceDescription)
- http://trac.bigdata.com/ticket/586 (RWStore immedateFree() not removing Checkpoint addresses from the historical index cache.)
- http://trac.bigdata.com/ticket/590 (nxparser fails with uppercase language tag)
- http://trac.bigdata.com/ticket/592 (Optimize RWStore allocator sizes)
- http://trac.bigdata.com/ticket/593 (Ugrade to Sesame 2.6.10)
- http://trac.bigdata.com/ticket/594 (WAR was deployed using TRIPLES rather than QUADS by default)
- http://trac.bigdata.com/ticket/596 (Change web.xml parameter names to be consistent with Jini/River)
- http://trac.bigdata.com/ticket/597 (SPARQL UPDATE LISTENER)
- http://trac.bigdata.com/ticket/598 (B+Tree branching factor and HTree addressBits are confused in their NodeSerializer implementations)
- http://trac.bigdata.com/ticket/599 (BlobIV for blank node : NotMaterializedException)
- http://trac.bigdata.com/ticket/600 (BlobIV collision counter hits false limit.)
- http://trac.bigdata.com/ticket/601 (Log uncaught exceptions)
- http://trac.bigdata.com/ticket/602 (RWStore does not discard logged deletes on reset())
- http://trac.bigdata.com/ticket/607 (History service / index)
- http://trac.bigdata.com/ticket/608 (LOG BlockingBuffer not progressing at INFO or lower level)
- http://trac.bigdata.com/ticket/609 (bigdata-ganglia is required dependency for Journal)
- http://trac.bigdata.com/ticket/611 (The code that processes SPARQL Update has a typo)
- http://trac.bigdata.com/ticket/612 (Bigdata scale-up depends on zookeper)
- http://trac.bigdata.com/ticket/613 (SPARQL UPDATE response inlines large DELETE or INSERT triple graphs)
- http://trac.bigdata.com/ticket/614 (static join optimizer does not get ordering right when multiple tails share vars with ancestry)
- http://trac.bigdata.com/ticket/615 (AST2BOpUtility wraps UNION with an unnecessary hash join)
- http://trac.bigdata.com/ticket/616 (Row store read/update not isolated on Journal)
- http://trac.bigdata.com/ticket/617 (Concurrent KB create fails with "No axioms defined?")
- http://trac.bigdata.com/ticket/618 (DirectBufferPool.poolCapacity maximum of 2GB)
- http://trac.bigdata.com/ticket/619 (RemoteRepository class should use application/x-www-form-urlencoded for large POST requests)
- http://trac.bigdata.com/ticket/620 (UpdateServlet fails to parse MIMEType when doing conneg.)
- http://trac.bigdata.com/ticket/626 (Expose performance counters for read-only indices)
- http://trac.bigdata.com/ticket/627 (Environment variable override for NSS properties file)
- http://trac.bigdata.com/ticket/628 (Create a bigdata-client jar for the NSS REST API)
- http://trac.bigdata.com/ticket/631 (ClassCastException in SIDs mode query)
- http://trac.bigdata.com/ticket/632 (NotMaterializedException when a SERVICE call needs variables that are provided as query input bindings)
- http://trac.bigdata.com/ticket/633 (ClassCastException when binding non-uri values to a variable that occurs in predicate position)
- http://trac.bigdata.com/ticket/638 (Change DEFAULT_MIN_RELEASE_AGE to 1ms)
- http://trac.bigdata.com/ticket/640 (Conditionally rollback() BigdataSailConnection if dirty)
- http://trac.bigdata.com/ticket/642 (Property paths do not work inside of exists/not exists filters)
- http://trac.bigdata.com/ticket/643 (Add web.xml parameters to lock down public NSS end points)
- http://trac.bigdata.com/ticket/644 (Bigdata2Sesame2BindingSetIterator can fail to notice asynchronous close())
- http://trac.bigdata.com/ticket/650 (Can not POST RDF to a graph using REST API)
- http://trac.bigdata.com/ticket/654 (Rare AssertionError in WriteCache.clearAddrMap())
- http://trac.bigdata.com/ticket/655 (SPARQL REGEX operator does not perform case-folding correctly for Unicode data)
- http://trac.bigdata.com/ticket/656 (InFactory bug when IN args consist of a single literal)
- http://trac.bigdata.com/ticket/647 (SIDs mode creates unnecessary hash join for GRAPH group patterns)
- http://trac.bigdata.com/ticket/667 (Provide NanoSparqlServer initialization hook)
- http://trac.bigdata.com/ticket/669 (Doubly nested subqueries yield no results with LIMIT)
- http://trac.bigdata.com/ticket/675 (Flush indices in parallel during checkpoint to reduce IO latency)
- http://trac.bigdata.com/ticket/682 (AtomicRowFilter UnsupportedOperationException)

1.2.2:
- http://trac.bigdata.com/ticket/586 (RWStore immedateFree() not removing Checkpoint addresses from the historical index cache.)
- http://trac.bigdata.com/ticket/602 (RWStore does not discard logged deletes on reset())
- http://trac.bigdata.com/ticket/603 (Prepare critical maintenance release as branch of 1.2.1)

1.2.1:
- http://trac.bigdata.com/ticket/533 (Review materialization for inline IVs)
- http://trac.bigdata.com/ticket/539 (NotMaterializedException with REGEX and Vocab)
- http://trac.bigdata.com/ticket/540 (SPARQL UPDATE using NSS via index.html)
- http://trac.bigdata.com/ticket/541 (MemoryManaged backed Journal mode)
- http://trac.bigdata.com/ticket/546 (Index cache for Journal)
- http://trac.bigdata.com/ticket/549 (BTree can not be cast to Name2Addr (MemStore recycler))
- http://trac.bigdata.com/ticket/550 (NPE in Leaf.getKey() : root cause was user error)
- http://trac.bigdata.com/ticket/558 (SPARQL INSERT not working in same request after INSERT DATA)
- http://trac.bigdata.com/ticket/562 (Sub-select in INSERT cause NPE in UpdateExprBuilder)
- http://trac.bigdata.com/ticket/563 (DISTINCT ORDER BY)
- http://trac.bigdata.com/ticket/567 (Failure to set cached value on IV results in incorrect behavior for complex UPDATE operation)
- http://trac.bigdata.com/ticket/568 (DELETE WHERE fails with Java AssertionError)
- http://trac.bigdata.com/ticket/569 (LOAD-CREATE-LOAD using virgin journal fails with "Graph exists" exception)
- http://trac.bigdata.com/ticket/571 (DELETE/INSERT WHERE handling of blank nodes)
- http://trac.bigdata.com/ticket/573 (NullPointerException when attempting to INSERT DATA containing a blank node)

1.2.0: (*)
- http://trac.bigdata.com/ticket/92 (Monitoring webapp)
- http://trac.bigdata.com/ticket/267 (Support evaluation of 3rd party operators)
- http://trac.bigdata.com/ticket/337 (Compact and efficient movement of binding sets between nodes.)
- http://trac.bigdata.com/ticket/433 (Cluster leaks threads under read-only index operations: DGC thread leak)
- http://trac.bigdata.com/ticket/437 (Thread-local cache combined with unbounded thread pools causes effective memory leak: termCache memory leak & thread-local buffers)
- http://trac.bigdata.com/ticket/438 (KeyBeforePartitionException on cluster)
- http://trac.bigdata.com/ticket/439 (Class loader problem)
- http://trac.bigdata.com/ticket/441 (Ganglia integration)
- http://trac.bigdata.com/ticket/443 (Logger for RWStore transaction service and recycler)
- http://trac.bigdata.com/ticket/444 (SPARQL query can fail to notice when IRunningQuery.isDone() on cluster)
- http://trac.bigdata.com/ticket/445 (RWStore does not track tx release correctly)
- http://trac.bigdata.com/ticket/446 (HTTP Repostory broken with bigdata 1.1.0)
- http://trac.bigdata.com/ticket/448 (SPARQL 1.1 UPDATE)
- http://trac.bigdata.com/ticket/449 (SPARQL 1.1 Federation extension)
- http://trac.bigdata.com/ticket/451 (Serialization error in SIDs mode on cluster)
- http://trac.bigdata.com/ticket/454 (Global Row Store Read on Cluster uses Tx)
- http://trac.bigdata.com/ticket/456 (IExtension implementations do point lookups on lexicon)
- http://trac.bigdata.com/ticket/457 ("No such index" on cluster under concurrent query workload)
- http://trac.bigdata.com/ticket/458 (Java level deadlock in DS)
- http://trac.bigdata.com/ticket/460 (Uncaught interrupt resolving RDF terms)
- http://trac.bigdata.com/ticket/461 (KeyAfterPartitionException / KeyBeforePartitionException on cluster)
- http://trac.bigdata.com/ticket/463 (NoSuchVocabularyItem with LUBMVocabulary for DerivedNumericsExtension)
- http://trac.bigdata.com/ticket/464 (Query statistics do not update correctly on cluster)
- http://trac.bigdata.com/ticket/465 (Too many GRS reads on cluster)
- http://trac.bigdata.com/ticket/469 (Sail does not flush assertion buffers before query)
- http://trac.bigdata.com/ticket/472 (acceptTaskService pool size on cluster)
- http://trac.bigdata.com/ticket/475 (Optimize serialization for query messages on cluster)
- http://trac.bigdata.com/ticket/476 (Test suite for writeCheckpoint() and recycling for BTree/HTree)
- http://trac.bigdata.com/ticket/478 (Cluster does not map input solution(s) across shards)
- http://trac.bigdata.com/ticket/480 (Error releasing deferred frees using 1.0.6 against a 1.0.4 journal)
- http://trac.bigdata.com/ticket/481 (PhysicalAddressResolutionException against 1.0.6)
- http://trac.bigdata.com/ticket/482 (RWStore reset() should be thread-safe for concurrent readers)
- http://trac.bigdata.com/ticket/484 (Java API for NanoSparqlServer REST API)
- http://trac.bigdata.com/ticket/491 (AbstractTripleStore.destroy() does not clear the locator cache)
- http://trac.bigdata.com/ticket/492 (Empty chunk in ThickChunkMessage (cluster))
- http://trac.bigdata.com/ticket/493 (Virtual Graphs)
- http://trac.bigdata.com/ticket/496 (Sesame 2.6.3)
- http://trac.bigdata.com/ticket/497 (Implement STRBEFORE, STRAFTER, and REPLACE)
- http://trac.bigdata.com/ticket/498 (Bring bigdata RDF/XML parser up to openrdf 2.6.3.)
- http://trac.bigdata.com/ticket/500 (SPARQL 1.1 Service Description)
- http://www.openrdf.org/issues/browse/SES-884 (Aggregation with a solution set as input should produce an empty solution as output)
- http://www.openrdf.org/issues/browse/SES-862 (Incorrect error handling for SPARQL aggregation; fix in 2.6.1)
- http://www.openrdf.org/issues/browse/SES-873 (Order the same Blank Nodes together in ORDER BY)
- http://trac.bigdata.com/ticket/501 (SPARQL 1.1 BINDINGS are ignored)
- http://trac.bigdata.com/ticket/503 (Bigdata2Sesame2BindingSetIterator throws QueryEvaluationException where it should throw NoSuchElementException)
- http://trac.bigdata.com/ticket/504 (UNION with Empty Group Pattern)
- http://trac.bigdata.com/ticket/505 (Exception when using SPARQL sort & statement identifiers)
- http://trac.bigdata.com/ticket/506 (Load, closure and query performance in 1.1.x versus 1.0.x)
- http://trac.bigdata.com/ticket/508 (LIMIT causes hash join utility to log errors)
- http://trac.bigdata.com/ticket/513 (Expose the LexiconConfiguration to Function BOPs)
- http://trac.bigdata.com/ticket/515 (Query with two "FILTER NOT EXISTS" expressions returns no results)
- http://trac.bigdata.com/ticket/516 (REGEXBOp should cache the Pattern when it is a constant)
- http://trac.bigdata.com/ticket/517 (Java 7 Compiler Compatibility)
- http://trac.bigdata.com/ticket/518 (Review function bop subclass hierarchy, optimize datatype bop, etc.)
- http://trac.bigdata.com/ticket/520 (CONSTRUCT WHERE shortcut)
- http://trac.bigdata.com/ticket/521 (Incremental materialization of Tuple and Graph query results)
- http://trac.bigdata.com/ticket/525 (Modify the IChangeLog interface to support multiple agents)
- http://trac.bigdata.com/ticket/527 (Expose timestamp of LexiconRelation to function bops)
- http://trac.bigdata.com/ticket/532 (ClassCastException during hash join (can not be cast to TermId))
- http://trac.bigdata.com/ticket/533 (Review materialization for inline IVs)
- http://trac.bigdata.com/ticket/534 (BSBM BI Q5 error using MERGE JOIN)

1.1.0 (*)
- http://trac.bigdata.com/ticket/23 (Lexicon joins)
- http://trac.bigdata.com/ticket/109 (Store large literals as "blobs")
- http://trac.bigdata.com/ticket/181 (Scale-out LUBM "how to" in wiki and build.xml are out of date.)
- http://trac.bigdata.com/ticket/203 (Implement a persistence capable hash table to support analytic query)
- http://trac.bigdata.com/ticket/209 (AccessPath should visit binding sets rather than elements for high level query.)
- http://trac.bigdata.com/ticket/227 (SliceOp appears to be necessary when operator plan should suffice without)
- http://trac.bigdata.com/ticket/232 (Bottom-up evaluation semantics)
- http://trac.bigdata.com/ticket/246 (Derived xsd numeric data types must be inlined as extension types.)
- http://trac.bigdata.com/ticket/254 (Revisit pruning of intermediate variable bindings during query execution)
- http://trac.bigdata.com/ticket/261 (Lift conditions out of subqueries.)
- http://trac.bigdata.com/ticket/300 (Native ORDER BY)
- http://trac.bigdata.com/ticket/324 (Inline predeclared URIs and namespaces in 2-3 bytes)
- http://trac.bigdata.com/ticket/330 (NanoSparqlServer does not locate "html" resources when run from jar)
- http://trac.bigdata.com/ticket/334 (Support inlining of unicode data in the statement indices.)
- http://trac.bigdata.com/ticket/364 (Scalable default graph evaluation)
- http://trac.bigdata.com/ticket/368 (Prune variable bindings during query evaluation)
- http://trac.bigdata.com/ticket/370 (Direct translation of openrdf AST to bigdata AST)
- http://trac.bigdata.com/ticket/373 (Fix StrBOp and other IValueExpressions)
- http://trac.bigdata.com/ticket/377 (Optimize OPTIONALs with multiple statement patterns.)
- http://trac.bigdata.com/ticket/380 (Native SPARQL evaluation on cluster)
- http://trac.bigdata.com/ticket/387 (Cluster does not compute closure)
- http://trac.bigdata.com/ticket/395 (HTree hash join performance)
- http://trac.bigdata.com/ticket/401 (inline xsd:unsigned datatypes)
- http://trac.bigdata.com/ticket/408 (xsd:string cast fails for non-numeric data)
- http://trac.bigdata.com/ticket/421 (New query hints model.)
- http://trac.bigdata.com/ticket/431 (Use of read-only tx per query defeats cache on cluster)

1.0.3
- http://trac.bigdata.com/ticket/217 (BTreeCounters does not track bytes released)
- http://trac.bigdata.com/ticket/269 (Refactor performance counters using accessor interface)
- http://trac.bigdata.com/ticket/329 (B+Tree should delete bloom filter when it is disabled.)
- http://trac.bigdata.com/ticket/372 (RWStore does not prune the CommitRecordIndex)
- http://trac.bigdata.com/ticket/375 (Persistent memory leaks (RWStore/DISK))
- http://trac.bigdata.com/ticket/385 (FastRDFValueCoder2: ArrayIndexOutOfBoundsException)
- http://trac.bigdata.com/ticket/391 (Release age advanced on WORM mode journal)
- http://trac.bigdata.com/ticket/392 (Add a DELETE by access path method to the NanoSparqlServer)
- http://trac.bigdata.com/ticket/393 (Add "context-uri" request parameter to specify the default context for INSERT in the REST API)
- http://trac.bigdata.com/ticket/394 (log4j configuration error message in WAR deployment)
- http://trac.bigdata.com/ticket/399 (Add a fast range count method to the REST API)
- http://trac.bigdata.com/ticket/422 (Support temp triple store wrapped by a BigdataSail)
- http://trac.bigdata.com/ticket/424 (NQuads support for NanoSparqlServer)
- http://trac.bigdata.com/ticket/425 (Bug fix to DEFAULT_RDF_FORMAT for bulk data loader in scale-out)
- http://trac.bigdata.com/ticket/426 (Support either lockfile (procmail) and dotlockfile (liblockfile1) in scale-out)
- http://trac.bigdata.com/ticket/427 (BigdataSail#getReadOnlyConnection() race condition with concurrent commit)
- http://trac.bigdata.com/ticket/435 (Address is 0L)
- http://trac.bigdata.com/ticket/436 (TestMROWTransactions failure in CI)

1.0.2
- http://trac.bigdata.com/ticket/32 (Query time expansion of (foo rdf:type rdfs:Resource) drags in SPORelation for scale-out.)
- http://trac.bigdata.com/ticket/181 (Scale-out LUBM "how to" in wiki and build.xml are out of date.)
- http://trac.bigdata.com/ticket/356 (Query not terminated by error.)
- http://trac.bigdata.com/ticket/359 (NamedGraph pattern fails to bind graph variable if only one binding exists.)
- http://trac.bigdata.com/ticket/361 (IRunningQuery not closed promptly.)
- http://trac.bigdata.com/ticket/371 (DataLoader fails to load resources available from the classpath.)
- http://trac.bigdata.com/ticket/376 (Support for the streaming of bigdata IBindingSets into a sparql query.)
- http://trac.bigdata.com/ticket/378 (ClosedByInterruptException during heavy query mix.)
- http://trac.bigdata.com/ticket/379 (NotSerializableException for SPOAccessPath.)
- http://trac.bigdata.com/ticket/382 (Change dependencies to Apache River 2.2.0)

1.0.1 (*)
- http://trac.bigdata.com/ticket/107 (Unicode clean schema names in the sparse row store)
- http://trac.bigdata.com/ticket/124 (TermIdEncoder should use more bits for scale-out)
- http://trac.bigdata.com/ticket/225 (OSX requires specialized performance counter collection classes)
- http://trac.bigdata.com/ticket/348 (BigdataValueFactory.asValue() must return new instance when DummyIV is used)
- http://trac.bigdata.com/ticket/349 (TermIdEncoder limits Journal to 2B distinct RDF Values per triple/quad store instance)
- http://trac.bigdata.com/ticket/351 (SPO not Serializable exception in SIDS mode (scale-out))
- http://trac.bigdata.com/ticket/352 (ClassCastException when querying with binding-values that are not known to the database)
- http://trac.bigdata.com/ticket/353 (UnsupportedOperatorException for some SPARQL queries)
- http://trac.bigdata.com/ticket/355 (Query failure when comparing with non materialized value)
- http://trac.bigdata.com/ticket/357 (RWStore reports "FixedAllocator returning null address, with freeBits".)
- http://trac.bigdata.com/ticket/359 (NamedGraph pattern fails to bind graph variable if only one binding exists.)
- http://trac.bigdata.com/ticket/362 (log4j - slf4j bridge.)
For more information about bigdata(R), please see the following links:

[1] http://wiki.bigdata.com/wiki/index.php/Main_Page
[2] http://wiki.bigdata.com/wiki/index.php/GettingStarted
[3] http://wiki.bigdata.com/wiki/index.php/Roadmap
[4] http://www.bigdata.com/bigdata/docs/api/
[5] http://sourceforge.net/projects/bigdata/
[6] http://www.bigdata.com/blog
[7] http://www.systap.com/bigdata.htm
[8] http://sourceforge.net/projects/bigdata/files/bigdata/
[9] http://wiki.bigdata.com/wiki/index.php/DataMigration
[10] http://wiki.bigdata.com/wiki/index.php/HAJournalServer
[11] http://www.bigdata.com/whitepapers/reifSPARQL.pdf
[12] http://wiki.bigdata.com/wiki/index.php/RDF_GAS_API

About bigdata:

Bigdata(R) is a horizontally scaled, general-purpose storage and computing fabric for ordered data (B+Trees), designed to operate on either a single server or a cluster of commodity hardware. Bigdata(R) uses dynamically partitioned key-range shards in order to remove any realistic scaling limits - in principle, bigdata(R) may be deployed on 10s, 100s, or even thousands of machines, and new capacity may be added incrementally without requiring a full reload of all data. The bigdata(R) RDF database supports RDFS and OWL Lite reasoning, high-level query (SPARQL), and datum-level provenance.
From: Bryan T. <br...@sy...> - 2014-11-06 15:10:44
|
Jim,

502 is about support for expressions (other than simple variables in ORDER_BY). If there is an issue with DISTINCT + ORDER_BY then this would be a new ticket.

Just post the EXPLAIN (attach to the email) for the moment. I want to see how this is being generated. We should then check the specification and make sure that the correct behavior is DISTINCT followed by ORDER BY with any limit applied after the ORDER BY. I can then check the code for how we are handling this. The relevant logic is in AST2BOpUtility at line 451. You can see that it is already attempting to handle this and that there was a historical ticket for this issue (#563).

    /*
     * Note: The DISTINCT operators also enforce the projection.
     *
     * Note: REDUCED allows, but does not require, either complete or
     * partial filtering of duplicates. It is part of what openrdf does
     * for a DESCRIBE query.
     *
     * Note: We do not currently have special operator for REDUCED. One
     * could be created using chunk wise DISTINCT. Note that REDUCED may
     * not change the order in which the solutions appear (but we are
     * evaluating it before ORDER BY so that is Ok.)
     *
     * TODO If there is an ORDER BY and a DISTINCT then the sort can be
     * used to impose the distinct without the overhead of a hash index
     * by filtering out the duplicate solutions after the sort.
     */

    // When true, DISTINCT must preserve ORDER BY ordering.
    final boolean preserveOrder;

    if (orderBy != null && !orderBy.isEmpty()) {

        /*
         * Note: ORDER BY before DISTINCT, so DISTINCT must preserve
         * order.
         *
         * @see https://sourceforge.net/apps/trac/bigdata/ticket/563
         *      (ORDER BY + DISTINCT)
         */
        preserveOrder = true;

        left = addOrderBy(left, queryBase, orderBy, ctx);

    } else {

        preserveOrder = false;

    }

    if (projection.isDistinct() || projection.isReduced()) {

        left = addDistinct(left, queryBase, preserveOrder, ctx);

    }

} else {

    /*
     * TODO Under what circumstances can the projection be [null]?
     */

    if (orderBy != null && !orderBy.isEmpty()) {

        left = addOrderBy(left, queryBase, orderBy, ctx);

    }

}

Bryan ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://bigdata.com http://mapgraph.io CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. On Thu, Nov 6, 2014 at 10:03 AM, Jim Balhoff <ba...@ne...> wrote: > Hi Bryan, > > Just to clarify, would you like me to attach the info to ticket 502, or > continue posting to the developer list? > > Thanks, > Jim > > > > On Nov 6, 2014, at 8:28 AM, Bryan Thompson <br...@sy...> wrote: > > > > The ticket for allowing aggregates in ORDER BY is: > > > > - http://trac.bigdata.com/ticket/502 (Allow aggregates in ORDER BY > clause) > > > > Can you attach the EXPLAIN of the query with and without DISTINCT. The > issue may be that the DISTINCT is being applied after the ORDER BY. I seem > to remember some issue historically with operations being performed > before/after the ORDER BY, but I do not have any distinct recollection of a > problematic interaction between DISTINCT and ORDER BY. > > > > Bryan > > > > ---- > > Bryan Thompson > > Chief Scientist & Founder > > SYSTAP, LLC > > 4501 Tower Road > > Greensboro, NC 27410 > > br...@sy... > > http://bigdata.com > > http://mapgraph.io > > CONFIDENTIALITY NOTICE: This email and its contents and attachments are > for the sole use of the intended recipient(s) and are confidential or > proprietary to SYSTAP.
Any unauthorized review, use, disclosure, > dissemination or copying of this email or its contents or attachments is > prohibited. If you have received this communication in error, please notify > the sender by reply email and permanently delete all copies of the email > and its contents and attachments. > > > > > > > On Wed, Nov 5, 2014 at 6:14 PM, Jim Balhoff <ba...@ne...> wrote: > > > On Nov 5, 2014, at 5:46 PM, Jeremy J Carroll <jj...@sy...> wrote: > > > > > > > > >> On Nov 5, 2014, at 1:02 PM, Bryan Thompson <br...@sy...> wrote: > > >> > > >> There could be an issue with ORDER BY operating on an anonymous and > non-projected variable. Try declaring and binding a variable for > STR(?label) inside of the query and then using that variable in the ORDER > BY clause. > > > > > > > > > Yes I tend to find the results of ORDER BY are more what I expect if I > do not include an expression in the ORDER BY but simply variables. I BIND > any expression before the ORDER BY. > > > > > > I believe there is a trac item for this, but since the workaround is > easy, I have never seen it as high priority > > > > > > > As suggested I tried binding a variable as `BIND (STR(?term_label) AS > ?string_label)` and using that to sort. Still incorrect ordering. But, I > tried removing DISTINCT, and then the ordering is correct. Even going back > to the anonymous `ORDER BY STR(?term_label)`, ordering is still correct if > I remove DISTINCT. For this specific query DISTINCT is not needed, but I do > need it for my application. Is there a reason to not expect DISTINCT to > work correctly with ORDER BY? > > > > Thanks both of you for all of your help, > > Jim > > > > > > |
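[Editor's note] The preserveOrder logic discussed in the message above can be sketched in a few lines. This is plain Python, not bigdata code, and the function names are invented for illustration: a DISTINCT placed downstream of ORDER BY must filter duplicates while streaming, rather than rebuilding the solution set through an unordered hash index.

```python
def distinct_preserving_order(sorted_solutions):
    """Order-preserving DISTINCT: stream through already-sorted solutions
    and drop duplicates, keeping first-seen order -- what the preserveOrder
    flag is meant to guarantee."""
    seen = set()
    result = []
    for s in sorted_solutions:
        if s not in seen:
            seen.add(s)
            result.append(s)
    return result


def distinct_via_hash_index(sorted_solutions):
    """A DISTINCT that collects solutions into an unordered hash index
    discards the upstream sort -- a model of the reported misbehavior."""
    return list(set(sorted_solutions))


labels = ["abdomen", "abdomen", "anterior humeral ridge", "columnar area"]
print(distinct_preserving_order(labels))
# ['abdomen', 'anterior humeral ridge', 'columnar area']
```

If the hash-index variant runs after the sort, the output order is whatever the hash iteration happens to yield, which is consistent with the out-of-order blocks Jim observed.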
From: Bryan T. <br...@sy...> - 2014-11-06 13:28:14
|
The ticket for allowing aggregates in ORDER BY is: - http://trac.bigdata.com/ticket/502 (Allow aggregates in ORDER BY clause) Can you attach the EXPLAIN of the query with and without DISTINCT. The issue may be that the DISTINCT is being applied after the ORDER BY. I seem to remember some issue historically with operations being performed before/after the ORDER BY, but I do not have any distinct recollection of a problematic interaction between DISTINCT and ORDER BY. Bryan ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://bigdata.com http://mapgraph.io CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. On Wed, Nov 5, 2014 at 6:14 PM, Jim Balhoff <ba...@ne...> wrote: > > On Nov 5, 2014, at 5:46 PM, Jeremy J Carroll <jj...@sy...> wrote: > > > > > >> On Nov 5, 2014, at 1:02 PM, Bryan Thompson <br...@sy...> wrote: > >> > >> There could be an issue with ORDER BY operating on an anonymous and > non-projected variable. Try declaring and binding a variable for > STR(?label) inside of the query and then using that variable in the ORDER > BY clause. > > > > > > Yes I tend to find the results of ORDER BY are more what I expect if I > do not include an expression in the ORDER BY but simply variables. I BIND > any expression before the ORDER BY. > > > > I believe there is a trac item for this, but since the workaround is > easy, I have never seen it as high priority > > > > As suggested I tried binding a variable as `BIND (STR(?term_label) AS > ?string_label)` and using that to sort. Still incorrect ordering. 
But, I > tried removing DISTINCT, and then the ordering is correct. Even going back > to the anonymous `ORDER BY STR(?term_label)`, ordering is still correct if > I remove DISTINCT. For this specific query DISTINCT is not needed, but I do > need it for my application. Is there a reason to not expect DISTINCT to > work correctly with ORDER BY? > > Thanks both of you for all of your help, > Jim > > |
From: Jeremy J C. <jj...@sy...> - 2014-11-05 23:14:51
|
> On Nov 5, 2014, at 1:02 PM, Bryan Thompson <br...@sy...> wrote: > > There could be an issue with ORDER BY operating on an anonymous and non-projected variable. Try declaring and binding a variable for STR(?label) inside of the query and then using that variable in the ORDER BY clause. Yes I tend to find the results of ORDER BY are more what I expect if I do not include an expression in the ORDER BY but simply variables. I BIND any expression before the ORDER BY. I believe there is a trac item for this, but since the workaround is easy, I have never seen it as high priority Jeremy |
From: Jim B. <ba...@ne...> - 2014-11-05 23:14:42
|
> On Nov 5, 2014, at 5:46 PM, Jeremy J Carroll <jj...@sy...> wrote: > > >> On Nov 5, 2014, at 1:02 PM, Bryan Thompson <br...@sy...> wrote: >> >> There could be an issue with ORDER BY operating on an anonymous and non-projected variable. Try declaring and binding a variable for STR(?label) inside of the query and then using that variable in the ORDER BY clause. > > > Yes I tend to find the results of ORDER BY are more what I expect if I do not include an expression in the ORDER BY but simply variables. I BIND any expression before the ORDER BY. > > I believe there is a trac item for this, but since the workaround is easy, I have never seen it as high priority > As suggested I tried binding a variable as `BIND (STR(?term_label) AS ?string_label)` and using that to sort. Still incorrect ordering. But, I tried removing DISTINCT, and then the ordering is correct. Even going back to the anonymous `ORDER BY STR(?term_label)`, ordering is still correct if I remove DISTINCT. For this specific query DISTINCT is not needed, but I do need it for my application. Is there a reason to not expect DISTINCT to work correctly with ORDER BY? Thanks both of you for all of your help, Jim |
From: Bryan T. <br...@sy...> - 2014-11-05 22:53:04
|
I think that this is an issue that appeared between sparql 1.1 last call working draft and the sparql 1.1 recommendation. Last moment change to the spec. Bryan On Nov 5, 2014 5:46 PM, "Jeremy J Carroll" <jj...@sy...> wrote: > > > On Nov 5, 2014, at 1:02 PM, Bryan Thompson <br...@sy...> wrote: > > There could be an issue with ORDER BY operating on an anonymous and > non-projected variable. Try declaring and binding a variable for > STR(?label) inside of the query and then using that variable in the ORDER > BY clause. > > > > Yes I tend to find the results of ORDER BY are more what I expect if I do > not include an expression in the ORDER BY but simply variables. I BIND any > expression before the ORDER BY. > > I believe there is a trac item for this, but since the workaround is easy, > I have never seen it as high priority > > Jeremy > > |
From: Bryan T. <br...@sy...> - 2014-11-05 21:53:58
|
All, the 1.3.4 release [1] is now in CI. This is primarily a critical bug fix release. There is a problem in 1.3.3 with SPARQL UPDATE through the REST API preventing storage recycling in the RWStore. This issue has been resolved and is the basis for the 1.3.4 release. We will follow up quickly on this with another release including openrdf 2.7 support and several performance optimizations. The key points for this release are also described at [1]. Thanks, Bryan [1] http://trac.bigdata.com/ticket/1032 |
From: Bryan T. <br...@sy...> - 2014-11-05 21:02:39
|
Does the SPARQL result set show uniform type of RDF Literal? If not, then it is not a sufficient mechanism (de-facto). Again, the correct behavior of STR() in that role would be determined by the SPARQL specification.

Here is a snip of the code for STR() which is in StrBOp.java. You can see that a simple literal is just returned and otherwise a new literal is created from just the label. There could be an issue with ORDER BY operating on an anonymous and non-projected variable. Try declaring and binding a variable for STR(?label) inside of the query and then using that variable in the ORDER BY clause.

    // use to create my simple literals
    final BigdataValueFactory vf = getValueFactory();

    if (val instanceof Literal) {

        final Literal lit = (Literal) val;

        if (lit.getDatatype() == null && lit.getLanguage() == null) {
            // if simple literal return it
            return iv;
        } else {
            // else return new simple literal using Literal.getLabel
            final BigdataLiteral str = vf.createLiteral(lit.getLabel());
            return super.asIV(str, bs);
        }

    } else if (val instanceof URI) {

        // return new simple literal using URI label
        final BigdataLiteral str = vf.createLiteral(val.stringValue());
        return super.asIV(str, bs);

    } else {

        throw new SparqlTypeErrorException();

    }

Bryan ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://bigdata.com http://mapgraph.io CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments.
On Wed, Nov 5, 2014 at 3:43 PM, Jim Balhoff <ba...@ne...> wrote: > I thought that different types of strings might affect the ordering, so I > also tried this as the last line of the query: > > ORDER BY STR(?term_label) > > This also results in similar incorrect ordering. Would you expect this to > be enough to remove any problems due to different literal types? Based on > the standard, my expectation is that this would. > > Thank you, > Jim > > > > On Nov 5, 2014, at 2:14 PM, Bryan Thompson <br...@sy...> wrote: > > > > Jim, > > > > If you look at the SPARQL output, the labels appear to be present twice > because some of them are: > > > > <literal xml:lang='en'>anterior humeral > ridge</literal> > > > > and some are: > > > > <literal datatype=' > http://www.w3.org/2001/XMLSchema#string'>1st arch mandibular > component</literal> > > > > So they are not the same "type" of literal. > > > > You can probably cast everything to a single type to get around this. > > > > Please check with the standard, but I am not sure that there is a bug > here. > > > > Thanks, > > Bryan > > > > > > ---- > > Bryan Thompson > > Chief Scientist & Founder > > SYSTAP, LLC > > 4501 Tower Road > > Greensboro, NC 27410 > > br...@sy... > > http://bigdata.com > > http://mapgraph.io > > CONFIDENTIALITY NOTICE: This email and its contents and attachments are > for the sole use of the intended recipient(s) and are confidential or > proprietary to SYSTAP. Any unauthorized review, use, disclosure, > dissemination or copying of this email or its contents or attachments is > prohibited. If you have received this communication in error, please notify > the sender by reply email and permanently delete all copies of the email > and its contents and attachments. > > > > > > > On Wed, Nov 5, 2014 at 10:36 AM, Jim Balhoff <ba...@ne...> > wrote: > > > On Nov 5, 2014, at 10:20 AM, Bryan Thompson <br...@sy...> wrote: > > > > > > Is there a public endpoint and query that I can use to test this? 
> > > > I will send you a separate email with this. > > > > > > > > If this is local data, is there a small data set that we can use to > replicate the problem? > > > > I am using the same dataset in a local instance as in the original > ticket: http://purl.obolibrary.org/obo/uberon/releases/2014-10-26/ext.owl > > > > Just the triples in that file. Query: > > > > PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> > > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> > > PREFIX owl: <http://www.w3.org/2002/07/owl#> > > > > SELECT DISTINCT ?term ?term_label > > WHERE > > { > > ?term rdf:type owl:Class . > > ?term rdfs:label ?term_label . > > } > > ORDER BY ?term_label > > > > > > > > > In general, the ORDER BY operator should execute once ALL solutions > have been materialized within that operator. It then applies the sort and > the solutions are reported. > > > > > > My questions would be: > > > > > > - What is the EXPLAIN of the query? > > > > I attached a copy of the EXPLAIN output, in HTML format to preserve the > table. To me it looks like the sort is not happening at the end, but > instead earlier, but I don't have much confidence in my understanding of > everything being reported. > > > > > - Does a simple unit test of the MemorySortOp show the same problem? > That is, is this related to the MemorySortOp implementation or the query > engine / query plan generator? > > > > I've only tested SPARQL queries so far. > > > > > - Are there any odd things going on with the unicode setup? Are the > characters "a" and "a" really the same characters. > > > > Not that I know of. I can create a new ticket for this if you would like. > > > > Thanks, > > Jim > > > > > > > ------------------------------------------------------------------------------ > > _______________________________________________ > > Bigdata-developers mailing list > > Big...@li... > > https://lists.sourceforge.net/lists/listinfo/bigdata-developers > > |
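[Editor's note] The StrBOp excerpt quoted in the message above can be mirrored in a small sketch. This is plain Python for illustration only; the Literal tuple is a stand-in for the openrdf Literal interface, not real bigdata API.

```python
from collections import namedtuple

# Stand-in for an RDF literal: a label plus optional datatype / language tag.
Literal = namedtuple("Literal", ["label", "datatype", "language"])


def sparql_str(val):
    """Mimics the StrBOp logic: a simple literal passes through unchanged;
    a typed or language-tagged literal is replaced by a plain literal
    holding only its label; a URI (modeled here as a bare str) becomes a
    plain literal of its string form; anything else is a type error."""
    if isinstance(val, Literal):
        if val.datatype is None and val.language is None:
            return val  # simple literal: return it as-is
        return Literal(val.label, None, None)  # keep only the label
    if isinstance(val, str):
        return Literal(val, None, None)  # URI -> simple literal
    raise TypeError("SparqlTypeErrorException")


tagged = Literal("anterior humeral ridge", None, "en")
print(sparql_str(tagged).language)  # None: the language tag is stripped
```

Under this model, STR() collapses the lang-tagged and xsd:string variants to the same simple-literal form, which is why binding STR(?label) before sorting was expected to normalize the comparison.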
From: Jim B. <ba...@ne...> - 2014-11-05 20:43:38
|
I thought that different types of strings might affect the ordering, so I also tried this as the last line of the query: ORDER BY STR(?term_label) This also results in similar incorrect ordering. Would you expect this to be enough to remove any problems due to different literal types? Based on the standard, my expectation is that this would. Thank you, Jim > On Nov 5, 2014, at 2:14 PM, Bryan Thompson <br...@sy...> wrote: > > Jim, > > If you look at the SPARQL output, the labels appear to be present twice because some of them are: > > <literal xml:lang='en'>anterior humeral ridge</literal> > > and some are: > > <literal datatype='http://www.w3.org/2001/XMLSchema#string'>1st arch mandibular component</literal> > > So they are not the same "type" of literal. > > You can probably cast everything to a single type to get around this. > > Please check with the standard, but I am not sure that there is a bug here. > > Thanks, > Bryan > > > ---- > Bryan Thompson > Chief Scientist & Founder > SYSTAP, LLC > 4501 Tower Road > Greensboro, NC 27410 > br...@sy... > http://bigdata.com > http://mapgraph.io > CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. > > > > On Wed, Nov 5, 2014 at 10:36 AM, Jim Balhoff <ba...@ne...> wrote: > > On Nov 5, 2014, at 10:20 AM, Bryan Thompson <br...@sy...> wrote: > > > > Is there a public endpoint and query that I can use to test this? > > I will send you a separate email with this. > > > > > If this is local data, is there a small data set that we can use to replicate the problem? 
> > I am using the same dataset in a local instance as in the original ticket: http://purl.obolibrary.org/obo/uberon/releases/2014-10-26/ext.owl > > Just the triples in that file. Query: > > PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> > PREFIX owl: <http://www.w3.org/2002/07/owl#> > > SELECT DISTINCT ?term ?term_label > WHERE > { > ?term rdf:type owl:Class . > ?term rdfs:label ?term_label . > } > ORDER BY ?term_label > > > > > In general, the ORDER BY operator should execute once ALL solutions have been materialized within that operator. It then applies the sort and the solutions are reported. > > > > My questions would be: > > > > - What is the EXPLAIN of the query? > > I attached a copy of the EXPLAIN output, in HTML format to preserve the table. To me it looks like the sort is not happening at the end, but instead earlier, but I don't have much confidence in my understanding of everything being reported. > > > - Does a simple unit test of the MemorySortOp show the same problem? That is, is this related to the MemorySortOp implementation or the query engine / query plan generator? > > I've only tested SPARQL queries so far. > > > - Are there any odd things going on with the unicode setup? Are the characters "a" and "a" really the same characters. > > Not that I know of. I can create a new ticket for this if you would like. > > Thanks, > Jim > > > ------------------------------------------------------------------------------ > _______________________________________________ > Bigdata-developers mailing list > Big...@li... > https://lists.sourceforge.net/lists/listinfo/bigdata-developers |
From: Bryan T. <br...@sy...> - 2014-11-05 19:14:32
|
Jim, If you look at the SPARQL output, the labels appear to be present twice because some of them are: <literal xml:lang='en'>anterior humeral ridge</literal> and some are: <literal datatype='http://www.w3.org/2001/XMLSchema#string'>1st arch mandibular component</literal> So they are not the same "type" of literal. You can probably cast everything to a single type to get around this. Please check with the standard, but I am not sure that there is a bug here. Thanks, Bryan ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://bigdata.com http://mapgraph.io CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. On Wed, Nov 5, 2014 at 10:36 AM, Jim Balhoff <ba...@ne...> wrote: > > On Nov 5, 2014, at 10:20 AM, Bryan Thompson <br...@sy...> wrote: > > > > Is there a public endpoint and query that I can use to test this? > > I will send you a separate email with this. > > > > > If this is local data, is there a small data set that we can use to > replicate the problem? > > I am using the same dataset in a local instance as in the original ticket: > http://purl.obolibrary.org/obo/uberon/releases/2014-10-26/ext.owl > > Just the triples in that file. Query: > > PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> > PREFIX owl: <http://www.w3.org/2002/07/owl#> > > SELECT DISTINCT ?term ?term_label > WHERE > { > ?term rdf:type owl:Class . > ?term rdfs:label ?term_label . 
> } > ORDER BY ?term_label > > > > > In general, the ORDER BY operator should execute once ALL solutions have > been materialized within that operator. It then applies the sort and the > solutions are reported. > > > > My questions would be: > > > > - What is the EXPLAIN of the query? > > I attached a copy of the EXPLAIN output, in HTML format to preserve the > table. To me it looks like the sort is not happening at the end, but > instead earlier, but I don't have much confidence in my understanding of > everything being reported. > > > - Does a simple unit test of the MemorySortOp show the same problem? > That is, is this related to the MemorySortOp implementation or the query > engine / query plan generator? > > I've only tested SPARQL queries so far. > > > - Are there any odd things going on with the unicode setup? Are the > characters "a" and "a" really the same characters. > > Not that I know of. I can create a new ticket for this if you would like. > > Thanks, > Jim > > |
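[Editor's note] Bryan's point in the message above, that a mix of language-tagged and xsd:string literals can sort into separate blocks when the comparator considers the literal type before the label, can be sketched as follows. This is plain Python; the (type, label) tuples are only an illustration, not how bigdata encodes literals.

```python
# Labels drawn from Jim's result set, each carrying its literal "type".
labels = [
    ("xsd:string", "1st arch mandibular component"),
    ("@en", "anterior humeral ridge"),
    ("@en", "abdomen"),
    ("xsd:string", "YSL"),
]

# Sorting on (type, label) yields two internally-ordered blocks, one per
# literal type -- matching the "large blocks of correct ordering" observed.
by_type_then_label = sorted(labels)

# Normalizing with STR() first (dropping the type) sorts all labels together.
by_plain_label = sorted(label for (_type, label) in labels)

print(by_plain_label)
# ['1st arch mandibular component', 'YSL', 'abdomen', 'anterior humeral ridge']
```

Note that even the normalized order is by code point, so "YSL" still precedes "abdomen" (uppercase before lowercase), exactly as in Jim's CSV; a collation-aware sort would interleave them differently.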
From: Jim B. <ba...@ne...> - 2014-11-05 15:36:26
|
> On Nov 5, 2014, at 10:20 AM, Bryan Thompson <br...@sy...> wrote: > > Is there a public endpoint and query that I can use to test this? I will send you a separate email with this. > > If this is local data, is there a small data set that we can use to replicate the problem? I am using the same dataset in a local instance as in the original ticket: http://purl.obolibrary.org/obo/uberon/releases/2014-10-26/ext.owl Just the triples in that file. Query: PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX owl: <http://www.w3.org/2002/07/owl#> SELECT DISTINCT ?term ?term_label WHERE { ?term rdf:type owl:Class . ?term rdfs:label ?term_label . } ORDER BY ?term_label > In general, the ORDER BY operator should execute once ALL solutions have been materialized within that operator. It then applies the sort and the solutions are reported. > > My questions would be: > > - What is the EXPLAIN of the query? I attached a copy of the EXPLAIN output, in HTML format to preserve the table. To me it looks like the sort is not happening at the end, but instead earlier, but I don't have much confidence in my understanding of everything being reported. > - Does a simple unit test of the MemorySortOp show the same problem? That is, is this related to the MemorySortOp implementation or the query engine / query plan generator? I've only tested SPARQL queries so far. > - Are there any odd things going on with the unicode setup? Are the characters "a" and "a" really the same characters. Not that I know of. I can create a new ticket for this if you would like. Thanks, Jim |
From: Bryan T. <br...@sy...> - 2014-11-05 15:21:03
|
Is there a public endpoint and query that I can use to test this? If this is local data, is there a small data set that we can use to replicate the problem? In general, the ORDER BY operator should execute once ALL solutions have been materialized within that operator. It then applies the sort and the solutions are reported. My questions would be: - What is the EXPLAIN of the query? - Does a simple unit test of the MemorySortOp show the same problem? That is, is this related to the MemorySortOp implementation or the query engine / query plan generator? - Are there any odd things going on with the unicode setup? Are the characters "a" and "a" really the same characters. Bryan ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://bigdata.com http://mapgraph.io CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. On Wed, Nov 5, 2014 at 10:15 AM, Jim Balhoff <ba...@ne...> wrote: > > On Nov 5, 2014, at 6:22 AM, Bryan Thompson <br...@sy...> wrote: > > > > Jim, > > > > I know that Jeremy had made a change to the order by operator a few > months ago. Are you using the current code? > > I am testing with branch BIGDATA_RELEASE_1_3_0, revision 8702, which I > think is the latest. > > > > > Can you online an example of the out of order results? 
> > Here is the result of the below query: > https://dl.dropboxusercontent.com/u/6704325/bigdata/2014-11-5/bigdata_order_by.csv > > At the beginning you will see: > anterior humeral ridge > anteroventral process of cleithrum > columnar area > deltoid process > > Then at line 286 the 'a's start again: > YSL > Zymbal's gland > abdomen > abdomen blood vessel > abdomen connective tissue > abdomen element > abdomen musculature > > > Thanks, > Jim > > > > > > > > Bryan > > > > On Tuesday, November 4, 2014, Jim Balhoff <ba...@ne...> wrote: > > Hi Bryan, > > > > I think that the text search may have been a red herring. I tried a > query without search to compare the EXPLAIN results and realized that ORDER > BY is not working for a basic query of all labels. Has anyone been using it > successfully? My query is this: > > > > ************** > > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> > > PREFIX owl: <http://www.w3.org/2002/07/owl#> > > > > SELECT DISTINCT ?term ?term_label > > WHERE > > { > > ?term rdf:type owl:Class . > > ?term rdfs:label ?term_label . > > } > > ORDER BY ?term_label > > ************** > > > > I also tried `ORDER BY STR(?term_label)`. The results are different but > still not fully in order. The results have large blocks of correct ordering > but contain misplaced blocks as well. I'm not really sure how to interpret > all of the EXPLAIN output. I attached the explanation in case you have time > to take a look at it. Perhaps I am missing something obvious! > > > > Thank you, > > Jim > > > > > > > > > On Nov 4, 2014, at 7:53 PM, Bryan Thompson <br...@sy...> wrote: > > > > > > Jim, > > > > > > Today, and for probably 1-2 years now, search is translated into a > SERVICE call. Triple patterns are pulled into that service call based on > the shared variable bindings. However, the recommended approach is to > Actually specify the service call directly. This will probably give you > the desired control over the ordering of the results. 
> > > > > > I am not sure why the ordering was not obeyed. Can you look into the > EXPLAIN of the query and see if you can identify what is going on? I may > then be able to point you towards how to resolve the ticket. > > > > > > Thanks, > > > Bryan > > > > > > > > > > > > -- > > > ---- > > > Bryan Thompson > > > Chief Scientist & Founder > > > SYSTAP, LLC > > > 4501 Tower Road > > > Greensboro, NC 27410 > > > br...@sy... > > > http://bigdata.com > > > http://mapgraph.io > > > CONFIDENTIALITY NOTICE: This email and its contents and attachments > are for the sole use of the intended recipient(s) and are confidential or > proprietary to SYSTAP. Any unauthorized review, use, disclosure, > dissemination or copying of this email or its contents or attachments is > prohibited. If you have received this communication in error, please notify > the sender by reply email and permanently delete all copies of the email > and its contents and attachments. > > > > > > > > > > > > > > -- > > ---- > > Bryan Thompson > > Chief Scientist & Founder > > SYSTAP, LLC > > 4501 Tower Road > > Greensboro, NC 27410 > > br...@sy... > > http://bigdata.com > > http://mapgraph.io > > CONFIDENTIALITY NOTICE: This email and its contents and attachments are > for the sole use of the intended recipient(s) and are confidential or > proprietary to SYSTAP. Any unauthorized review, use, disclosure, > dissemination or copying of this email or its contents or attachments is > prohibited. If you have received this communication in error, please notify > the sender by reply email and permanently delete all copies of the email > and its contents and attachments. > > > > > > > |
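A note on the plan ordering discussed above: if the EXPLAIN confirms that the sort is applied before the DISTINCT, a general SPARQL-level workaround (a standard rewriting shown here against Jim's query, not something verified against this particular Bigdata build) is to push the DISTINCT into a subselect so that the outer ORDER BY is necessarily evaluated last:

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>

SELECT ?term ?term_label
WHERE
{
  # The subselect is evaluated first, so DISTINCT is applied to the
  # solutions before the outer ORDER BY sorts the deduplicated set.
  {
    SELECT DISTINCT ?term ?term_label
    WHERE
    {
      ?term rdf:type owl:Class .
      ?term rdfs:label ?term_label .
    }
  }
}
ORDER BY ?term_label
```

The outer ORDER BY still uses a simple variable, consistent with the advice elsewhere in this thread to avoid expressions in ORDER BY.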
From: Jim B. <ba...@ne...> - 2014-11-05 15:16:04
|
> On Nov 5, 2014, at 6:22 AM, Bryan Thompson <br...@sy...> wrote: > > Jim, > > I know that Jeremy had made a change to the order by operator a few months ago. Are you using the current code? I am testing with branch BIGDATA_RELEASE_1_3_0, revision 8702, which I think is the latest. > > Can you share an example of the out of order results? Here is the result of the below query: https://dl.dropboxusercontent.com/u/6704325/bigdata/2014-11-5/bigdata_order_by.csv At the beginning you will see: anterior humeral ridge anteroventral process of cleithrum columnar area deltoid process Then at line 286 the 'a's start again: YSL Zymbal's gland abdomen abdomen blood vessel abdomen connective tissue abdomen element abdomen musculature Thanks, Jim > > Bryan > > On Tuesday, November 4, 2014, Jim Balhoff <ba...@ne...> wrote: > Hi Bryan, > > I think that the text search may have been a red herring. I tried a query without search to compare the EXPLAIN results and realized that ORDER BY is not working for a basic query of all labels. Has anyone been using it successfully? My query is this: > > ************** > PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> > PREFIX owl: <http://www.w3.org/2002/07/owl#> > > SELECT DISTINCT ?term ?term_label > WHERE > { > ?term rdf:type owl:Class . > ?term rdfs:label ?term_label . > } > ORDER BY ?term_label > ************** > > I also tried `ORDER BY STR(?term_label)`. The results are different but still not fully in order. The results have large blocks of correct ordering but contain misplaced blocks as well. I'm not really sure how to interpret all of the EXPLAIN output. I attached the explanation in case you have time to take a look at it. Perhaps I am missing something obvious! > > Thank you, > Jim > > > > > On Nov 4, 2014, at 7:53 PM, Bryan Thompson <br...@sy...> wrote: > > > > Jim, > > > > Today, and for probably 1-2 years now, search is translated into a SERVICE call. 
Triple patterns are pulled into that service call based on the shared variable bindings. However, the recommended approach is to Actually specify the service call directly. This will probably give you the desired control over the ordering of the results. > > > > I am not sure why the ordering was not obeyed. Can you look into the EXPLAIN of the query and see if you can identify what is going on? I may then be able to point you towards how to resolve the ticket. > > > > Thanks, > > Bryan > > > > > > > > -- > > ---- > > Bryan Thompson > > Chief Scientist & Founder > > SYSTAP, LLC > > 4501 Tower Road > > Greensboro, NC 27410 > > br...@sy... > > http://bigdata.com > > http://mapgraph.io > > CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. > > > > > > > > > -- > ---- > Bryan Thompson > Chief Scientist & Founder > SYSTAP, LLC > 4501 Tower Road > Greensboro, NC 27410 > br...@sy... > http://bigdata.com > http://mapgraph.io > CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. > > > |
From: Bryan T. <br...@sy...> - 2014-11-05 11:22:17
|
Jim, I know that Jeremy had made a change to the order by operator a few months ago. Are you using the current code? Can you share an example of the out of order results? Bryan On Tuesday, November 4, 2014, Jim Balhoff <ba...@ne...> wrote: > Hi Bryan, > > I think that the text search may have been a red herring. I tried a query > without search to compare the EXPLAIN results and realized that ORDER BY is > not working for a basic query of all labels. Has anyone been using it > successfully? My query is this: > > ************** > PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> > PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> > PREFIX owl: <http://www.w3.org/2002/07/owl#> > > SELECT DISTINCT ?term ?term_label > WHERE > { > ?term rdf:type owl:Class . > ?term rdfs:label ?term_label . > } > ORDER BY ?term_label > ************** > > I also tried `ORDER BY STR(?term_label)`. The results are different but > still not fully in order. The results have large blocks of correct ordering > but contain misplaced blocks as well. I'm not really sure how to interpret > all of the EXPLAIN output. I attached the explanation in case you have time > to take a look at it. Perhaps I am missing something obvious! > > Thank you, > Jim > > > > > On Nov 4, 2014, at 7:53 PM, Bryan Thompson <br...@sy...> wrote: > > > > Jim, > > > > Today, and for probably 1-2 years now, search is translated into a > SERVICE call. Triple patterns are pulled into that service call based on > the shared variable bindings. However, the recommended approach is to > actually specify the service call directly. This will probably give you > the desired control over the ordering of the results. > > > > I am not sure why the ordering was not obeyed. Can you look into the > EXPLAIN of the query and see if you can identify what is going on? I may > then be able to point you towards how to resolve the ticket. 
> > > > Thanks, > > Bryan > > > > > > -- > > ---- > > Bryan Thompson > > Chief Scientist & Founder > > SYSTAP, LLC > > 4501 Tower Road > > Greensboro, NC 27410 > > br...@sy... > > http://bigdata.com > > http://mapgraph.io > > CONFIDENTIALITY NOTICE: This email and its contents and attachments are > for the sole use of the intended recipient(s) and are confidential or > proprietary to SYSTAP. Any unauthorized review, use, disclosure, > dissemination or copying of this email or its contents or attachments is > prohibited. If you have received this communication in error, please notify > the sender by reply email and permanently delete all copies of the email > and its contents and attachments. > > > > > > -- ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://bigdata.com http://mapgraph.io CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. |
From: Jim B. <ba...@ne...> - 2014-11-05 03:15:51
|
Hi Bryan, I think that the text search may have been a red herring. I tried a query without search to compare the EXPLAIN results and realized that ORDER BY is not working for a basic query of all labels. Has anyone been using it successfully? My query is this: ************** PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> PREFIX owl: <http://www.w3.org/2002/07/owl#> SELECT DISTINCT ?term ?term_label WHERE { ?term rdf:type owl:Class . ?term rdfs:label ?term_label . } ORDER BY ?term_label ************** I also tried `ORDER BY STR(?term_label)`. The results are different but still not fully in order. The results have large blocks of correct ordering but contain misplaced blocks as well. I'm not really sure how to interpret all of the EXPLAIN output. I attached the explanation in case you have time to take a look at it. Perhaps I am missing something obvious! Thank you, Jim > On Nov 4, 2014, at 7:53 PM, Bryan Thompson <br...@sy...> wrote: > > Jim, > > Today, and for probably 1-2 years now, search is translated into a SERVICE call. Triple patterns are pulled into that service call based on the shared variable bindings. However, the recommended approach is to actually specify the service call directly. This will probably give you the desired control over the ordering of the results. > > I am not sure why the ordering was not obeyed. Can you look into the EXPLAIN of the query and see if you can identify what is going on? I may then be able to point you towards how to resolve the ticket. > > Thanks, > Bryan > > > > -- > ---- > Bryan Thompson > Chief Scientist & Founder > SYSTAP, LLC > 4501 Tower Road > Greensboro, NC 27410 > br...@sy... > http://bigdata.com > http://mapgraph.io > CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. 
If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. > > > |
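On the `ORDER BY STR(?term_label)` variant mentioned above: Bigdata's ORDER BY historically expects simple variables (expression support is the subject of ticket 502, mentioned elsewhere in this thread), so one way to test the STR semantics is to bind the expression to a variable first and sort on that variable. A sketch against the query above:

```sparql
PREFIX rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX owl:  <http://www.w3.org/2002/07/owl#>

SELECT ?term ?string_label
WHERE
{
  ?term rdf:type owl:Class .
  ?term rdfs:label ?term_label .
  # Materialize the expression so ORDER BY sees a plain variable.
  BIND (STR(?term_label) AS ?string_label)
}
ORDER BY ?string_label
```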
From: Bryan T. <br...@sy...> - 2014-11-05 00:54:06
|
Jim, Today, and for probably 1-2 years now, search is translated into a SERVICE call. Triple patterns are pulled into that service call based on the shared variable bindings. However, the recommended approach is to actually specify the service call directly. This will probably give you the desired control over the ordering of the results. I am not sure why the ordering was not obeyed. Can you look into the EXPLAIN of the query and see if you can identify what is going on? I may then be able to point you towards how to resolve the ticket. Thanks, Bryan -- ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://bigdata.com http://mapgraph.io CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. |
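To make the recommended explicit form concrete: with Bigdata's full-text search vocabulary (the `bds` namespace below is from the project's full-text-search documentation; treat the exact SERVICE syntax as an assumption to confirm against your release), a direct service call looks roughly like this:

```sparql
PREFIX bds:  <http://www.bigdata.com/rdf/search#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?term ?term_label
WHERE
{
  # Invoke the full-text search explicitly rather than relying on the
  # optimizer to lift the bds:search magic predicate into a SERVICE call.
  SERVICE bds:search
  {
    ?term_label bds:search "abdomen" .
  }
  ?term rdfs:label ?term_label .
}
```

Writing the SERVICE group by hand controls which triple patterns are pulled into the search and how the result is joined back to the rest of the query.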
From: Bryan T. <br...@sy...> - 2014-11-04 13:47:47
|
I have gone through the developers list and the code and summarized the issues involved in configuring inference for bigdata at http://wiki.bigdata.com/wiki/index.php/InferenceAndTruthMaintenance Thanks, Bryan ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://bigdata.com http://mapgraph.io CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. |
From: Bryan T. <br...@sy...> - 2014-11-03 11:42:09
|
Those sample code examples are for the embedded version of bigdata. For HA, use the workbench and the SPARQL and SPARQL update endpoints. Bryan On Monday, November 3, 2014, Ravi Prakash Putchala < rav...@fa...> wrote: > Hi, > > I am new to Bigdata and am trying to setup and use HAJournalServer. I > hope this is the mailing list to seek help regarding the usage of > Bigdata. Else please point me in the right direction. > > I configured 3 servers and installed zookeeper and HAJournalServer by > following the "Basic Deployment" section in the wikipage HAJournalServer > (http://wiki.bigdata.com/wiki/index.php/HAJournalServer). Now I would > like to use this setup to load and query some data just like > bigdata-sails/src/samples/com/bigdata/samples/SampleCode.java does. > I just got stuck here and do not know how to connect to the cluster, > load, query etc. Could you please help by providing some pointers? > > I am using version 1.3.3. Please let me know if I need to provide more > information. > > Thank you. > > Regards, > > Ravi > > > > > ------------------------------------------------------------------------------ > _______________________________________________ > Bigdata-developers mailing list > Big...@li... <javascript:;> > https://lists.sourceforge.net/lists/listinfo/bigdata-developers > -- ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://bigdata.com http://mapgraph.io CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. |
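Concretely, once the NSS endpoint of the HA cluster is up, loading and querying can be done entirely over HTTP with SPARQL Update and SPARQL query (the host, port, and file path below are placeholders; check the endpoint URL against the NanoSparqlServer wiki page):

```sparql
# SPARQL UPDATE request body, POSTed to http://<host>:9999/bigdata/sparql
# (the server resolves the file URI, so the file must be readable by it):
LOAD <file:///path/to/data.ttl> ;
```

A follow-up query such as `SELECT * WHERE { ?s ?p ?o } LIMIT 10` against the same endpoint then verifies that the load succeeded.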
From: Ravi P. P. <rav...@fa...> - 2014-11-03 07:50:37
|
Hi, I am new to Bigdata and am trying to setup and use HAJournalServer. I hope this is the mailing list to seek help regarding the usage of Bigdata. Else please point me in the right direction. I configured 3 servers and installed zookeeper and HAJournalServer by following the "Basic Deployment" section in the wikipage HAJournalServer (http://wiki.bigdata.com/wiki/index.php/HAJournalServer). Now I would like to use this setup to load and query some data just like bigdata-sails/src/samples/com/bigdata/samples/SampleCode.java does. I just got stuck here and do not know how to connect to the cluster, load, query etc. Could you please help by providing some pointers? I am using version 1.3.3. Please let me know if I need to provide more information. Thank you. Regards, Ravi |
From: Alice E. <ali...@ya...> - 2014-11-01 15:14:35
|
Ok. Thanks a lot. I'll ask my manager to contact you soon then :) On Saturday, 1 November 2014 8:12 PM, Bryan Thompson <br...@sy...> wrote: Try using SPARQL. Performance depends greatly on how the platform is configured. We can help you maximize performance for your application under our accelerator program. This program goes beyond our basic developer support program and is designed to give you access to the core development team to help you develop your application and get it ready for the market, including our support for your internal performance tuning of your application. We do not provide these as free services. Proper benchmarking and performance tuning are complex and time-consuming activities. On the other hand, we can provide references to existing customers that have been deeply satisfied by their engagement with Systap around the bigdata platform. If you do your own performance testing, you should realize that misconfigured deployments can result in substantial bias in the test results. Thanks, Bryan On Saturday, November 1, 2014, Alice Everett <ali...@ya...> wrote: Sorry for a chain of mails. If SPARQL* does not make a difference to the performance then it's ok; I can work with SPARQL too as I have shown in my example. > > > >On Saturday, 1 November 2014 5:01 PM, Alice Everett <ali...@ya...> wrote: > > > >ant start-bigdata is giving me the following output. Thank you again for the help thus far. God Bless You. I think some of the things related to my environment are given below. Can you please share details as to which environments bigdata will work in -- I'll happily change my environment. 
> > >Buildfile: /home/bigdataAnt/bigdata/build.xml > > >prepare: > [echo] version=bigdata-1.3.2-20141101 > [echo] svn.checkout=true > > >buildinfo: > [echo] > [echo] package com.bigdata; > [echo] public class BuildInfo { > [echo] public static final String buildVersion="1.3.2"; > [echo] public static final String buildVersionOSGI="1.0"; > [echo] public static final String svnRevision="8685"; > [echo] public static final String svnURL="svn://svn.code.sf.net/p/bigdata/code/branches/BIGDATA_RELEASE_1_3_0"; > [echo] public static final String buildTimestamp="2014/11/01 16:12:54 IST"; > [echo] public static final String buildUser=""; > [echo] public static final String buildHost="${env.COMPUTERNAME}"; > [echo] public static final String osArch="amd64"; > [echo] public static final String osName="Linux"; > [echo] public static final String osVersion="3.8.0-44-generic"; > [echo] } > > >compile: > [echo] javac > [echo] destdir="ant-build" > [echo] fork="yes" > [echo] memorymaximumsize="1g" > [echo] debug="yes" > [echo] debuglevel="lines,vars,source" > [echo] verbose="off" > [echo] encoding="Cp1252" > [echo] source="1.7" > [echo] target="1.7" > [javac] Compiling 1 source file to /home/bigdataAnt/bigdata/ant-build/classes > [javac] javac 1.7.0_65 > > >start-bigdata: > [java] > [java] BIGDATA(R) > [java] > [java] Flexible > [java] Reliable > [java] Affordable > [java] Web-Scale Computing for the Enterprise > [java] > [java] Copyright SYSTAP, LLC 2006-2013. All rights reserved. 
> [java] > [java] -HP-ProBook-4430s > [java] Sat Nov 01 16:12:58 IST 2014 > [java] Linux/3.8.0-44-generic amd64 > [java] Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz Family 6 Model 58 Stepping 9, GenuineIntel #CPU=4 > [java] Oracle Corporation 1.7.0_65 > [java] freeMemory=113623880 > [java] buildVersion=1.3.2 > [java] > [java] Dependency License > [java] ICU http://source.icu-project.org/repos/icu/icu/trunk/license.html > [java] bigdata-ganglia http://www.apache.org/licenses/LICENSE-2.0.html > [java] blueprints-core https://github.com/tinkerpop/blueprints/blob/master/LICENSE.txt > [java] colt http://acs.lbl.gov/software/colt/license.html > [java] commons-codec http://www.apache.org/licenses/LICENSE-2.0.html > [java] commons-fileupload http://www.apache.org/licenses/LICENSE-2.0.html > [java] commons-io http://www.apache.org/licenses/LICENSE-2.0.html > [java] commons-logging http://www.apache.org/licenses/LICENSE-2.0.html > [java] dsiutils http://www.gnu.org/licenses/lgpl-2.1.html > [java] fastutil http://www.apache.org/licenses/LICENSE-2.0.html > [java] flot http://www.opensource.org/licenses/mit-license.php > [java] high-scale-lib http://creativecommons.org/licenses/publicdomain > [java] httpclient http://www.apache.org/licenses/LICENSE-2.0.html > [java] httpclient-cache http://www.apache.org/licenses/LICENSE-2.0.html > [java] httpcore http://www.apache.org/licenses/LICENSE-2.0.html > [java] httpmime http://www.apache.org/licenses/LICENSE-2.0.html > [java] jackson-core http://www.apache.org/licenses/LICENSE-2.0.html > [java] jetty http://www.apache.org/licenses/LICENSE-2.0.html > [java] jquery https://github.com/jquery/jquery/blob/master/MIT-LICENSE.txt > [java] log4j http://www.apache.org/licenses/LICENSE-2.0.html > [java] lucene http://www.apache.org/licenses/LICENSE-2.0.html > [java] nanohttp http://elonen.iki.fi/code/nanohttpd/#license > [java] nxparser http://sw.deri.org/2006/08/nxparser/license.txt > [java] rexster-core 
https://github.com/tinkerpop/rexster/blob/master/LICENSE.txt > [java] river http://www.apache.org/licenses/LICENSE-2.0.html > [java] servlet-api http://www.apache.org/licenses/LICENSE-2.0.html > [java] sesame http://www.openrdf.org/download.jsp > [java] slf4j http://www.slf4j.org/license.html > [java] zookeeper http://www.apache.org/licenses/LICENSE-2.0.html > [java] > [java] INFO: com.bigdata.util.config.LogUtil: Configure and watch: bigdata-war/src/WEB-INF/classes/log4j.properties > [java] WARN : NanoSparqlServer.java:476: Starting NSS > [java] WARN : ServiceRegistry.java:47: New service class org.openrdf.rio.ntriples.NTriplesParserFactory replaces existing service class com.bigdata.rdf.rio.ntriples.BigdataNTriplesParserFactory > [java] WARN : ServiceRegistry.java:47: New service class org.openrdf.rio.turtle.TurtleParserFactory replaces existing service class com.bigdata.rdf.rio.turtle.BigdataTurtleParserFactory > [java] WARN : ServiceRegistry.java:47: New service class org.openrdf.query.resultio.sparqljson.SPARQLResultsJSONWriterFactory replaces existing service class com.bigdata.rdf.rio.json.BigdataSPARQLResultsJSONWriterFactoryForSelect > [java] WARN : ServiceRegistry.java:47: New service class org.openrdf.rio.turtle.TurtleWriterFactory replaces existing service class com.bigdata.rdf.rio.turtle.BigdataTurtleWriterFactory > [java] serviceURL: http://192.168.145.1:9999 > > > >On Saturday, 1 November 2014 4:56 PM, Alice Everett <ali...@ya...> wrote: > > > >Ok I am using Ubuntu 12.04. So do you suggest I should try in Windows environment. I trying to first make the product workable on my lenovo T430 laptop. > > >Actually we are testing a number of products like Virtuoso, BigData, Jena, etc. And my manager says the company will buy the product which performs best for open source software, as our company would not like to buy support for all the products. Therefore I am asking for a little help with the initial set-up (as otherwise we will not be able to test it). 
I'll be very grateful if you can help me with this a bit. > > >I'll speak to my manager to see if we can have a call with you -- but my manager says first I should show some performance on open source, then definitely we'll buy the product support. Hope you understand my stance and help me with this a bit. > > > >On Saturday, 1 November 2014 4:47 PM, Bryan Thompson <br...@sy...> wrote: > > > >Alice, > > >The issue may be the openrdf parsers are not being correctly overridden by the bigdata RDR parsers in your deployment environment. If you want to attempt to diagnose this yourself, it might be a class path ordering issue or a jar metadata ordering issue. We have noticed this in some environments, but have not yet reduced it to a root cause. > > >Thanks, >Bryan > >On Saturday, November 1, 2014, Bryan Thompson <br...@sy...> wrote: > >Alice, >> >> >>We do have paid developer support subscriptions for small projects at $500/month. Paid developer support allows us to prioritize your requests. We also have production support subscriptions that provide direct access to the core bigdata team for support of production deployments. >> >> >>The open source support channel is provided as a kindness to the community. It is not an appropriate forum if you have an internal project deadline. Instead, I suggest that you start a developer support subscription. If necessary, we can even do this as a paypal transaction. This will allow us to prioritize your issues along with those of other paying customers. >> >> >>For the moment, it sounds like you have a workaround for this specific issue since you can query the data using the reified triple patterns. >> >> >>If you would like to move forward, I suggest that we also schedule a meeting for next week so we can understand a little more about your use case and applications and help you understand more about the features and offerings for bigdata. 
>> >> >>Thanks, >>Bryan >> >>On Friday, October 31, 2014, Alice Everett <ali...@ya...> wrote: >> >>That's ok. I need to give a presentation on Monday. So probably you can help on Sunday. >>> >>> >>>Actually, I don't have an issue with the framework; I am just not getting how to use it to insert data using RDR mode using CURL. Perhaps a little example from you can help me with this big time. >>> >>> >>> >>>On Saturday, 1 November 2014 1:35 AM, Bryan Thompson <br...@sy...> wrote: >>> >>> >>> >>>Alice, >>> >>> >>>I am in meetings with a customer today. I could look at this next week. >>> >>> >>>FYI, from the project forum page. If we can not easily recreate the issue then it will not receive any priority under open source support. It is up to you to make the issue as easy to recreate as possible. You can file a ticket and (preferably) create a unit test for the problem. >>> >>> >>>You may use this forum to request help. If you have a bug or a feature request, please log an issue on the tracker [1] and include a unit test which demonstrates the bug. Please follow the instructions [2] when submitting a bug report. >>> >>>If you are interested in services for custom feature development, integration, architecture, or support, please contact the project leads directly. >>>[1] http://trac.bigdata.com/ >>>[2] http://wiki.bigdata.com/wiki/index.php/Submitting_Bugs >>> >>> >>>Thanks, >>>Bryan >>> >>> >>>---- >>>Bryan Thompson >>> >>>Chief Scientist & Founder >>>SYSTAP, LLC >>> >>>4501 Tower Road >>>Greensboro, NC 27410 >>> >>>br...@sy... >>> >>>http://bigdata.com >>> >>>http://mapgraph.io >>> >>>CONFIDENTIALITY NOTICE: This email and its contents and attachments are for the sole use of the intended recipient(s) and are confidential or proprietary to SYSTAP. Any unauthorized review, use, disclosure, dissemination or copying of this email or its contents or attachments is prohibited. 
If you have received this communication in error, please notify the sender by reply email and permanently delete all copies of the email and its contents and attachments. >>> >>>On Fri, Oct 31, 2014 at 3:40 PM, Alice Everett <ali...@ya...> wrote: >>> >>>Dear Bryan, >>>> >>>> >>>>I'll be thankful if you can help me with this a bit. Actually I need to give a small presentation in my company regarding how can frameworks like Bigdata help us. It will be great if I can accompany the presentation with a small demo. >>>> >>>> >>>> >>>> >>>>Cheers, >>>>Alice >>>> >>>> >>>> >>>>On Friday, 31 October 2014 7:55 PM, Alice Everett <ali...@ya...> wrote: >>>> >>>> >>>> >>>>Thanks for the reply Rose but I already tried it..although the loading works perfectly fine yet the database does not contain any data: >>>> >>>> >>>>root:~/bigdataAnt$ curl -X POST http://192.168.145.1:9999/bigdata/namespace/reificationRDR/sparql --data-urlencode 'query=SELECT * {<<?s ?p ?o>> ?p1 ?o1 }' -H 'Accept:application/rdf+xml' >>>><?xml version='1.0' encoding='UTF-8'?> >>>><sparql xmlns='http://www.w3.org/2005/sparql-results#'> >>>><head> >>>><variable name='s'/> >>>><variable name='p'/> >>>><variable name='o'/> >>>><variable name='-sid-1'/> >>>><variable name='p1'/> >>>><variable name='o1'/> >>>></head> >>>><results> >>>></results> >>>></sparql> >>>> >>>> >>>> >>>> >>>>I loaded the following file using in reificationRDR namespace: >>>>@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . >>>>@prefix dc: <http://purl.org/dc/elements/1.1/> . >>>>@prefix : <http://example/ns#> . >>>> >>>> >>>>_:c rdf:subject <http://example.org/book/book11> . >>>>_:c rdf:predicate dc:title1 . >>>>_:c rdf:object "a" . >>>>_:c :saidBy "b" . >>>> >>>> >>>> >>>> >>>>But in the output it does not show any result. I dont know where am I going wrong perhaps BigData developers can help with this. I am waiting for their response. 
>>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>>On Friday, 31 October 2014 7:51 PM, Rose Beck <ros...@gm...> wrote: >>>> >>>> >>>> >>>>I tried without tmp.xml and the loading worked perfectly fine with me: >>>> >>>> >>>>curl -X POST --data-binary >>>>'uri=file:///home/bigdataAnt/SmallYagoFacts.ttl' >>>>http://194.668.5.1:9999/bigdata/namespace/reificationRDR/sparql >>>> >>>>On Fri, Oct 31, 2014 at 6:20 PM, Alice Everett <ali...@ya...> wrote: >>>>> Thanks Rose. But I dont think so.. as it works perfectly with google.com >>>>> >>>>> root:~/bigdataAnt$ curl -v google.com >>>>> * About to connect() to google.com port 80 (#0) >>>>> * Trying 74.125.236.68... connected >>>>>> GET / HTTP/1.1 >>>>>> User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.1 >>>>>> zlib/1.2.3.4 libidn/1.23 librtmp/2.3 >>>>>> Host: google.com >>>>>> Accept: */* >>>>>> >>>>> < HTTP/1.1 302 Found >>>>> < Cache-Control: private >>>>> < Content-Type: text/html; charset=UTF-8 >>>>> < Location: http://www.google.co.in/?gfe_rd=cr&ei=bYVTVL-gG8jM8gfBzoDgCw >>>>> < Content-Length: 261 >>>>> < Date: Fri, 31 Oct 2014 12:49:49 GMT >>>>> < Server: GFE/2.0 >>>>> < Alternate-Protocol: 80:quic,p=0.01 >>>>> < >>>>> <HTML><HEAD><meta http-equiv="content-type" >>>>> content="text/html;charset=utf-8"> >>>>> <TITLE>302 Moved</TITLE></HEAD><BODY> >>>>> <H1>302 Moved</H1> >>>>> The document has moved >>>>> <A >>>>> HREF="http://www.google.co.in/?gfe_rd=cr&ei=bYVTVL-gG8jM8gfBzoDgCw">here</A>. >>>>> </BODY></HTML> >>>>> * Connection #0 to host google.com left intact >>>>> * Closing connection #0 >>>>> >>>>> >>>>> >>>>> On Friday, 31 October 2014 6:19 PM, Rose Beck <ros...@gm...> wrote: >>>>> >>>>> >>>>> I think its a dns error..can you try doing; >>>>> >>>>> curl -v google.com >>>>> >>>>> >>>>> On Fri, Oct 31, 2014 at 6:02 PM, Bryan Thompson <br...@sy...> wrote: >>>>>> If you use POST with a URL of the resource to be loaded (see the NSS wiki >>>>>> page) then the URL must be accessible by bigdata. 
If you are using the >>>>>> form >>>>>> of POST that sends the data in the http request body (which is the case >>>>>> here), then it only needs to be visible to the client making the request. >>>>>> >>>>>> Thanks, >>>>>> Bryan >>>>>> >>>>>> ---- >>>>>> Bryan Thompson >>>>>> Chief Scientist & Founder >>>>>> SYSTAP, LLC >>>>>> 4501 Tower Road >>>>>> Greensboro, NC 27410 >>>>>> br...@sy... >>>>>> http://bigdata.com >>>>>> http://mapgraph.io >>>>>> >>>>>> CONFIDENTIALITY NOTICE: This email and its contents and attachments are >>>>>> for >>>>>> the sole use of the intended recipient(s) and are confidential or >>>>>> proprietary to SYSTAP. Any unauthorized review, use, disclosure, >>>>>> dissemination or copying of this email or its contents or attachments is >>>>>> prohibited. If you have received this communication in error, please >>>>>> notify >>>>>> the sender by reply email and permanently delete all copies of the email >>>>>> and >>>>>> its contents and attachments. >>>>>> >>>>>> >>>>>> On Fri, Oct 31, 2014 at 8:30 AM, Alice Everett <ali...@ya...> >>>>>> wrote: >>>>>>> >>>>>>> Thanks Jennifer. But even keeping tmp.xml within the bigdata folder is >>>>>>> not >>>>>>> helping. >>>>>>> >>>>>>> >>>>>>> On Friday, 31 October 2014 5:57 PM, Jennifer >>>>>>> <jen...@re...> wrote: >>>>>>> >>>>>>> >>>>>>> I think she is missing as to where tmp.xml should be kept within her >>>>>>> bigdata/Ant folder as I think bigdata is not able to find tmp.xml. >>>>>>> >>>>>>> Alice I think you should keep tmp.xml within the bigdata folder which you >>>>>>> downloaded. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> From: Alice Everett <ali...@ya...> >>>>>>> Sent: Fri, 31 Oct 2014 17:47:26 >>>>>>> To: Bryan Thompson <br...@sy...> >>>>>>> Cc: "big...@li..." >>>>>>> <big...@li...> >>>>>>> Subject: Re: [Bigdata-developers] How to use RDR with Curl >>>>>>> Ok. Thanks a ton. But still I am a little lost. I used two methods of >>>>>>> inserting as explained below. 
My namespace's name is reificationRDR. >>>>>>> I'll be very grateful if you can help me with this a bit. >>>>>>> >>>>>>> Insert Method1: >>>>>>> root:~/bigdataAnt$ curl -v -X POST --data-binary >>>>>>> 'uri=file:///home/bigdataAnt/SmallYagoFacts.ttl' @tmp.xml >>>>>>> http://192.168.145.1:9999/bigdata/sparql >>>>>>> output: >>>>>>> * getaddrinfo(3) failed for tmp.xml:80 >>>>>>> * Couldn't resolve host 'tmp.xml' >>>>>>> * Closing connection #0 >>>>>>> curl: (6) Couldn't resolve host 'tmp.xml' >>>>>>> * About to connect() to 192.168.145.1 port 9999 (#0) >>>>>>> * Trying 192.168.145.1... connected >>>>>>> > POST /bigdata/sparql HTTP/1.1 >>>>>>> > User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 >>>>>>> > OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3 >>>>>>> > Host: 192.168.145.1:9999 >>>>>>> > Accept: */* >>>>>>> > Content-Length: 52 >>>>>>> > Content-Type: application/x-www-form-urlencoded >>>>>>> > >>>>>>> * upload completely sent off: 52out of 52 bytes >>>>>>> < HTTP/1.1 200 OK >>>>>>> < Content-Type: application/xml; charset=ISO-8859-1 >>>>>>> < Transfer-Encoding: chunked >>>>>>> < Server: Jetty(9.1.4.v20140401) >>>>>>> < >>>>>>> * Connection #0 to host 192.168.145.1 left intact >>>>>>> * Closing connection #0 >>>>>>> >>>>>>> >>>>>>> Insert Method 2: >>>>>>> root:~/bigdataAnt/bigdata$ curl -v -X POST --data-binary >>>>>>> 'uri=file:///home/bigdataAnt/SmallYagoFacts.ttl' >>>>>>> @/home/bigdataAnt/tmp.xml >>>>>>> http://192.168.145.1:9999/bigdata/namespace/reificationRDR/sparql >>>>>>> * getaddrinfo(3) failed for :80 >>>>>>> output >>>>>>> * Couldn't resolve host '' >>>>>>> * Closing connection #0 >>>>>>> curl: (6) Couldn't resolve host '' >>>>>>> * About to connect() to 192.168.145.1 port 9999 (#0) >>>>>>> * Trying 192.168.145.1... 
connected >>>>>>> > POST /bigdata/namespace/reificationRDR/sparql HTTP/1.1 >>>>>>> > User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 >>>>>>> > OpenSSL/1.0.1 zlib/1.2.3.4 libidn/1.23 librtmp/2.3 >>>>>>> > Host: 192.168.145.1:9999 >>>>>>> > Accept: */* >>>>>>> > Content-Length: 52 >>>>>>> > Content-Type: application/x-www-form-urlencoded >>>>>>> > >>>>>>> * upload completely sent off: 52out of 52 bytes >>>>>>> < HTTP/1.1 500 Server Error >>>>>>> < Content-Type: text/plain >>>>>>> < Transfer-Encoding: chunked >>>>>>> < Server: Jetty(9.1.4.v20140401) >>>>>>> < >>>>>>> uri=[file:/home/bigdataAnt/SmallYagoFacts.ttl], context-uri=[] >>>>>>> java.util.concurrent.ExecutionException: java.lang.RuntimeException: Not >>>>>>> found: namespace=reificationRDR >>>>>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122) >>>>>>> at java.util.concurrent.FutureTask.get(FutureTask.java:188) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.webapp.InsertServlet.doPostWithURIs(InsertServlet.java:401) >>>>>>> at >>>>>>> com.bigdata.rdf.sail.webapp.InsertServlet.doPost(InsertServlet.java:117) >>>>>>> at com.bigdata.rdf.sail.webapp.RESTServlet.doPost(RESTServlet.java:267) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.webapp.MultiTenancyServlet.doPost(MultiTenancyServlet.java:144) >>>>>>> at javax.servlet.http.HttpServlet.service(HttpServlet.java:707) >>>>>>> at javax.servlet.http.HttpServlet.service(HttpServlet.java:790) >>>>>>> at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:738) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:551) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:568) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:221) >>>>>>> at >>>>>>> >>>>>>> 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1111) >>>>>>> at >>>>>>> org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:478) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:183) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1045) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:199) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:109) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) >>>>>>> at org.eclipse.jetty.server.Server.handle(Server.java:462) >>>>>>> at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:279) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:232) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.io.AbstractConnection$2.run(AbstractConnection.java:534) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:607) >>>>>>> at >>>>>>> >>>>>>> org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:536) >>>>>>> at java.lang.Thread.run(Thread.java:745) >>>>>>> Caused by: java.lang.RuntimeException: Not found: >>>>>>> namespace=reificationRDR >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.task.AbstractApiTask.getUnisolatedConnection(AbstractApiTask.java:217) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.webapp.InsertServlet$InsertWithURLsTask.call(InsertServlet.java:457) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.webapp.InsertServlet$InsertWithURLsTask.call(InsertServlet.java:414) >>>>>>> at >>>>>>> >>>>>>> 
com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:67) >>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.task.AbstractApiTask.submitApiTask(AbstractApiTask.java:293) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.webapp.BigdataServlet.submitApiTask(BigdataServlet.java:220) >>>>>>> ... 26 more >>>>>>> * Connection #0 to host 192.168.145.1 left intact >>>>>>> * Closing connection #0 >>>>>>> >>>>>>> >>>>>>> Query: >>>>>>> curl -X POST >>>>>>> http://192.168.145.1:9999/bigdata/namespace/reificationRDR/sparql >>>>>>> --data-urlencode 'query=SELECT * {<<?s ?p ?o>> ?p1 ?o1 }' -H >>>>>>> 'Accept:application/rdf+xml' >>>>>>> >>>>>>> tmp.xml: >>>>>>> <?xml version="1.0" encoding="UTF-8" standalone="no"?> >>>>>>> <!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd"> >>>>>>> <properties> >>>>>>> <!-- --> >>>>>>> <!-- NEW KB NAMESPACE (required). --> >>>>>>> <!-- --> >>>>>>> <entry key="com.bigdata.rdf.sail.namespace">reificationRDR</entry> >>>>>>> <!-- --> >>>>>>> <!-- Specify any KB specific properties here to override defaults for the >>>>>>> BigdataSail --> >>>>>>> <!-- AbstractTripleStore, or indices in the namespace of the new KB >>>>>>> instance. --> >>>>>>> <!-- --> >>>>>>> <entry >>>>>>> >>>>>>> key="com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers">true</entry> >>>>>>> </properties> >>>>>>> >>>>>>> >>>>>>> >>>>>>> On Friday, 31 October 2014 5:30 PM, Bryan Thompson <br...@sy...> >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> What is the namespace for the RDR graph? >>>>>>> >>>>>>> The URL you need to be using is >>>>>>> >>>>>>> http://192.168.145.1:9999/bigdata/namespace/MY-GRAPH-NAMESPACE/sparql >>>>>>> >>>>>>> How to address a specific namespace is explicitly covered if you read the >>>>>>> wiki section on the multitenant interface that I linked in my previous >>>>>>> response. 
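The two-step sequence Bryan describes -- create the namespace first, then load into it -- can be sketched as follows. This is a minimal sketch: the host, file paths, and properties are taken from the thread, and the `/bigdata/namespace` management endpoint is assumed from the NanoSparqlServer wiki page; the curl calls are left commented because they need a running server.

```shell
# Step 1: write the namespace properties file (content as given in the thread).
cat > tmp.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <entry key="com.bigdata.rdf.sail.namespace">reificationRDR</entry>
  <entry key="com.bigdata.rdf.store.AbstractTripleStore.statementIdentifiers">true</entry>
</properties>
EOF

# Step 2 (assumed endpoint): create the namespace by POSTing the properties
# file to the namespace management endpoint. Without this step the server
# reports "Not found: namespace=reificationRDR", as seen above.
# curl -X POST -H 'Content-Type: application/xml' --data-binary @tmp.xml \
#      http://192.168.145.1:9999/bigdata/namespace

# Step 3: load the file into the new namespace. The properties file is NOT
# passed here -- interleaving 'uri=...' with @tmp.xml is what made curl try
# to resolve "tmp.xml" as a host.
# curl -X POST --data-binary 'uri=file:///home/bigdataAnt/SmallYagoFacts.ttl' \
#      http://192.168.145.1:9999/bigdata/namespace/reificationRDR/sparql
```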
>>>>>>> >>>>>>> Thanks, >>>>>>> Bryan >>>>>>> >>>>>>> On Friday, October 31, 2014, Alice Everett <ali...@ya...> wrote: >>>>>>> >>>>>>> Thanks a lot for the help. >>>>>>> >>>>>>> But I don't know where I am still going wrong: >>>>>>> I inserted data using: curl -v -X POST --data-binary >>>>>>> 'uri=file:///home/reifiedTriples.ttl' @tmp.xml >>>>>>> http://192.168.145.1:9999/bigdata/sparql >>>>>>> And then queried it using: curl -X POST >>>>>>> http://192.168.145.1:9999/bigdata/sparql --data-urlencode @tmp.xml >>>>>>> 'query=SELECT * { <<?s ?p ?o>> ?p ?o }' -H 'Accept:application/rdr' >>>>>>> curl: (6) Couldn't resolve host 'query=SELECT * <<' >>>>>>> Content-Type not recognized as RDF: application/x-www-form-urlencoded >>>>>>> >>>>>>> >>>>>>> On Friday, 31 October 2014 3:55 PM, Bryan Thompson <br...@sy...> >>>>>>> wrote: >>>>>>> >>>>>>> >>>>>>> Alice, >>>>>>> >>>>>>> The workbench choice of the "in use" namespace is recorded in JavaScript >>>>>>> in your browser client. That choice does not affect other workbench >>>>>>> clients >>>>>>> and does not affect the behavior of the various endpoints when using >>>>>>> command >>>>>>> line tools to query or update data in the database. Thus your command >>>>>>> line >>>>>>> requests are being made against a namespace that is not configured for >>>>>>> RDR >>>>>>> support. >>>>>>> >>>>>>> If you want to address a non-default bigdata namespace using curl or >>>>>>> wget, >>>>>>> you must use the appropriate URL for that namespace. This is all >>>>>>> described >>>>>>> on wiki.bigdata.com on the page for the NanoSparqlServer in the section >>>>>>> on >>>>>>> multi-tenancy. 
>>>>>>> >>>>>>> See >>>>>>> http://wiki.bigdata.com/wiki/index.php/NanoSparqlServer#Multi-Tenancy_API >>>>>>> >>>>>>> Thanks, >>>>>>> Bryan >>>>>>> >>>>>>> On Thursday, October 30, 2014, Alice Everett <ali...@ya...> >>>>>>> wrote: >>>>>>> >>>>>>> I found out an awesome feature in Bigdata called RDR and I am trying to >>>>>>> explore that too. Can you please let me know as to where am I going wrong >>>>>>> while querying RDR data (http://trac.bigdata.com/ticket/815). (My sample >>>>>>> RDF >>>>>>> data, contains reification in its standard form: >>>>>>> http://www.w3.org/2001/sw/DataAccess/rq23/#queryReification) >>>>>>> Loading: >>>>>>> curl -X POST --data-binary 'uri=file:///home/SmallFacts.ttl' >>>>>>> http://192.168.145.1:9999/bigdata/sparql >>>>>>> (Additionally I changed my current namespace within the workbench opened >>>>>>> in my browser to RDR mode). >>>>>>> >>>>>>> After this I fired the following query and got the following error (Can >>>>>>> you please correct me as to where am I going wrong. 
I'll be very grateful >>>>>>> to >>>>>>> you for the same): >>>>>>> @HP-ProBook-4430s:~/bigdataAnt$ curl -X POST >>>>>>> http://192.168.145.1:9999/bigdata/sparql --header >>>>>>> "X-BIGDATA-MAX-QUERY-MILLIS" --data-urlencode 'query=SELECT * {<<?s ?p >>>>>>> ?o>> >>>>>>> ?p1 ?o1 }' -H 'Accept:application/rdr' >>>>>>> >>>>>>> SELECT * {<<?s ?p ?o>> ?p1 ?o1 } >>>>>>> java.util.concurrent.ExecutionException: >>>>>>> org.openrdf.query.QueryEvaluationException: java.lang.RuntimeException: >>>>>>> java.util.concurrent.ExecutionException: java.lang.RuntimeException: >>>>>>> java.util.concurrent.ExecutionException: java.lang.Exception: >>>>>>> >>>>>>> task=ChunkTask{query=eeb24f0d-29b7-49d1-bddf-14869c463e76,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, >>>>>>> cause=java.util.concurrent.ExecutionException: >>>>>>> java.lang.RuntimeException: >>>>>>> java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: 0 >>>>>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122) >>>>>>> at java.util.concurrent.FutureTask.get(FutureTask.java:188) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:1277) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask.call(BigdataRDFContext.java:503) >>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262) >>>>>>> at >>>>>>> >>>>>>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >>>>>>> at >>>>>>> >>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) >>>>>>> at java.lang.Thread.run(Thread.java:745) >>>>>>> Caused by: org.openrdf.query.QueryEvaluationException: >>>>>>> java.lang.RuntimeException: java.util.concurrent.ExecutionException: >>>>>>> java.lang.RuntimeException: java.util.concurrent.ExecutionException: >>>>>>> java.lang.Exception: >>>>>>> >>>>>>> 
task=ChunkTask{query=eeb24f0d-29b7-49d1-bddf-14869c463e76,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, >>>>>>> cause=java.util.concurrent.ExecutionException: >>>>>>> java.lang.RuntimeException: >>>>>>> java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: 0 >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:188) >>>>>>> at >>>>>>> >>>>>>> org.openrdf.query.impl.TupleQueryResultImpl.hasNext(TupleQueryResultImpl.java:90) >>>>>>> at org.openrdf.query.QueryResultUtil.report(QueryResultUtil.java:52) >>>>>>> at >>>>>>> >>>>>>> org.openrdf.repository.sail.SailTupleQuery.evaluate(SailTupleQuery.java:63) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.webapp.BigdataRDFContext$TupleQueryTask.doQuery(BigdataRDFContext.java:1386) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask$SparqlRestApiTask.call(BigdataRDFContext.java:1221) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.webapp.BigdataRDFContext$AbstractQueryTask$SparqlRestApiTask.call(BigdataRDFContext.java:1171) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.task.ApiTaskForIndexManager.call(ApiTaskForIndexManager.java:67) >>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.task.AbstractApiTask.submitApiTask(AbstractApiTask.java:293) >>>>>>> ... 
6 more >>>>>>> Caused by: java.lang.RuntimeException: >>>>>>> java.util.concurrent.ExecutionException: java.lang.RuntimeException: >>>>>>> java.util.concurrent.ExecutionException: java.lang.Exception: >>>>>>> >>>>>>> task=ChunkTask{query=eeb24f0d-29b7-49d1-bddf-14869c463e76,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, >>>>>>> cause=java.util.concurrent.ExecutionException: >>>>>>> java.lang.RuntimeException: >>>>>>> java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: 0 >>>>>>> at >>>>>>> >>>>>>> com.bigdata.relation.accesspath.BlockingBuffer$BlockingIterator.checkFuture(BlockingBuffer.java:1523) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.relation.accesspath.BlockingBuffer$BlockingIterator._hasNext(BlockingBuffer.java:1710) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.relation.accesspath.BlockingBuffer$BlockingIterator.hasNext(BlockingBuffer.java:1563) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.striterator.AbstractChunkedResolverator._hasNext(AbstractChunkedResolverator.java:365) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.striterator.AbstractChunkedResolverator.hasNext(AbstractChunkedResolverator.java:341) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.Bigdata2Sesame2BindingSetIterator.hasNext(Bigdata2Sesame2BindingSetIterator.java:134) >>>>>>> ... 
15 more >>>>>>> Caused by: java.util.concurrent.ExecutionException: >>>>>>> java.lang.RuntimeException: java.util.concurrent.ExecutionException: >>>>>>> java.lang.Exception: >>>>>>> >>>>>>> task=ChunkTask{query=eeb24f0d-29b7-49d1-bddf-14869c463e76,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, >>>>>>> cause=java.util.concurrent.ExecutionException: >>>>>>> java.lang.RuntimeException: >>>>>>> java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: 0 >>>>>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122) >>>>>>> at java.util.concurrent.FutureTask.get(FutureTask.java:188) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.relation.accesspath.BlockingBuffer$BlockingIterator.checkFuture(BlockingBuffer.java:1454) >>>>>>> ... 20 more >>>>>>> Caused by: java.lang.RuntimeException: >>>>>>> java.util.concurrent.ExecutionException: java.lang.Exception: >>>>>>> >>>>>>> task=ChunkTask{query=eeb24f0d-29b7-49d1-bddf-14869c463e76,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, >>>>>>> cause=java.util.concurrent.ExecutionException: >>>>>>> java.lang.RuntimeException: >>>>>>> java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: 0 >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:59) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.RunningQueryCloseableIterator.close(RunningQueryCloseableIterator.java:73) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.RunningQueryCloseableIterator.hasNext(RunningQueryCloseableIterator.java:82) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.striterator.ChunkedWrappedIterator.hasNext(ChunkedWrappedIterator.java:197) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.striterator.AbstractChunkedResolverator$ChunkConsumerTask.call(AbstractChunkedResolverator.java:222) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.striterator.AbstractChunkedResolverator$ChunkConsumerTask.call(AbstractChunkedResolverator.java:197) >>>>>>> >>>>>>> ... 
4 more >>>>>>> Caused by: java.util.concurrent.ExecutionException: java.lang.Exception: >>>>>>> >>>>>>> task=ChunkTask{query=eeb24f0d-29b7-49d1-bddf-14869c463e76,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, >>>>>>> cause=java.util.concurrent.ExecutionException: >>>>>>> java.lang.RuntimeException: >>>>>>> java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: 0 >>>>>>> at com.bigdata.util.concurrent.Haltable.get(Haltable.java:273) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:1476) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.bop.engine.AbstractRunningQuery.get(AbstractRunningQuery.java:103) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.rdf.sail.RunningQueryCloseableIterator.checkFuture(RunningQueryCloseableIterator.java:46) >>>>>>> ... 9 more >>>>>>> Caused by: java.lang.Exception: >>>>>>> >>>>>>> task=ChunkTask{query=eeb24f0d-29b7-49d1-bddf-14869c463e76,bopId=4,partitionId=-1,sinkId=5,altSinkId=null}, >>>>>>> cause=java.util.concurrent.ExecutionException: >>>>>>> java.lang.RuntimeException: >>>>>>> java.lang.RuntimeException: java.lang.ArrayIndexOutOfBoundsException: 0 >>>>>>> at >>>>>>> >>>>>>> com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1335) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTaskWrapper.run(ChunkedRunningQuery.java:894) >>>>>>> at >>>>>>> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) >>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262) >>>>>>> at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.bop.engine.ChunkedRunningQuery$ChunkFutureTask.run(ChunkedRunningQuery.java:789) >>>>>>> ... 
3 more >>>>>>> Caused by: java.util.concurrent.ExecutionException: >>>>>>> java.lang.RuntimeException: java.lang.RuntimeException: >>>>>>> java.lang.ArrayIndexOutOfBoundsException: 0 >>>>>>> at java.util.concurrent.FutureTask.report(FutureTask.java:122) >>>>>>> at java.util.concurrent.FutureTask.get(FutureTask.java:188) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1315) >>>>>>> ... 8 more >>>>>>> Caused by: java.lang.RuntimeException: java.lang.RuntimeException: >>>>>>> java.lang.ArrayIndexOutOfBoundsException: 0 >>>>>>> at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:643) >>>>>>> at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:343) >>>>>>> at java.util.concurrent.FutureTask.run(FutureTask.java:262) >>>>>>> at com.bigdata.concurrent.FutureTaskMon.run(FutureTaskMon.java:63) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.bop.engine.ChunkedRunningQuery$ChunkTask.call(ChunkedRunningQuery.java:1314) >>>>>>> ... 8 more >>>>>>> Caused by: java.lang.RuntimeException: >>>>>>> java.lang.ArrayIndexOutOfBoundsException: 0 >>>>>>> at >>>>>>> >>>>>>> com.bigdata.bop.join.PipelineJoin$JoinTask$BindingSetConsumerTask.call(PipelineJoin.java:988) >>>>>>> at >>>>>>> >>>>>>> com.bigdata.bop.join.PipelineJoin$JoinTask.consumeSource(PipelineJoin.java:700) >>>>>>> at com.bigdata.bop.join.PipelineJoin$JoinTask.call(PipelineJoin.java:584) >>>> -- ---- Bryan Thompson Chief Scientist & Founder SYSTAP, LLC 4501 Tower Road Greensboro, NC 27410 br...@sy... http://bigdata.com http://mapgraph.io
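For reference, the query failure earlier in the thread ("Couldn't resolve host 'query=SELECT * <<'") comes from curl's argument parsing, not from SPARQL: the interleaved @tmp.xml consumed the --data-urlencode value, so the query string itself was parsed as a URL. A sketch of the corrected invocation (endpoint and Accept header as used in the thread; it assumes the reificationRDR namespace already exists, so the curl call is left commented):

```shell
# Keep the SPARQL* query in a variable so the << ... >> syntax survives
# shell quoting.
QUERY='SELECT * { <<?s ?p ?o>> ?p1 ?o1 }'

# Corrected query command: --data-urlencode must be immediately followed by
# its value, and no properties file belongs in a query request.
# curl -X POST http://192.168.145.1:9999/bigdata/namespace/reificationRDR/sparql \
#      --data-urlencode "query=$QUERY" \
#      -H 'Accept: application/rdr'

echo "query=$QUERY"
```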