|
From: Jeremy J C. <jj...@sy...> - 2013-10-10 16:57:37
|
I can revert the MIN/MAX change from yesterday (which resurrects the old semantics) and then we can discuss tomorrow how to proceed. -- I did some digging: > Those semantics for error handling were hammered out with lee feigenbaum. I would not comment out or disable that behavior without verifying the correct interpretation if the spec for errors. B The spec has been fairly clear since the introduction of MIN in http://www.w3.org/TR/2010/WD-sparql11-query-20100126/ [[ Definition: Min The multiset of values passed as an argument is converted to a sequence S, this sequence is ordered as per the ORDER BY ASC clause. Min(S) = S0 ]] The current key phrase [[ The[y] make use of the SPARQL ORDER BY ordering definition, to allow ordering over arbitrarily typed expressions. ]] is in http://www.w3.org/TR/2010/WD-sparql11-query-20100601/ Lee himself seems to argue for the current semantics in http://lists.w3.org/Archives/Public/public-rdf-dawg/2009OctDec/0408.html All of this predates the test which seem to be introduced around Dec 2011 in http://bigdata.svn.sourceforge.net/viewvc/bigdata?logsort=cvs&view=revision&revision=5822 Looking at the rec further, there should be a test that unbound comes before other values hmm something like SELECT MIN(?x) MAX(?x) { { BIND(1 as ?x)} UNION { BIND(1 +"x" as ?x)} } I think should return (UNBOUND, 1) === Jeremy J Carroll Principal Architect Syapse, Inc. On Oct 10, 2013, at 8:26 AM, Bryan Thompson <br...@sy...> wrote: > Those semantics for error handling were hammered out with lee feigenbaum. I would not comment out or disable that behavior without verifying the correct interpretation if the spec for errors. B > > > -------- Original message -------- > From: Jeremy J Carroll <jj...@sy...> > Date: 10/10/2013 11:02 AM (GMT-05:00) > To: Bryan Thompson <br...@sy...> > Cc: Big...@li... > Subject: Re: [Bigdata-developers] MIN and MAX aggregates fixed: trac 736 (with question?) > > > Shall I update these tests to not check for error, but to check for giving the correct min or max. > > It will lose checking that the error is sticky, but that is hard since I don't believe we should have expected errors for MIN and MAX? > > (We could discuss tomorrow) > > === > > Rationale: > > These tests both check for the old behavior which explicitly does not implement the recommendation (and had the FIXME for fixing it). > > The key part for both tests is: > > > final IBindingSet data [] = new IBindingSet [] > { > new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth1, book1, price9 } ) > , new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth1, book2, price5 } ) > , new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org1, auth2, book3, auth2 } ) > , new ListBindingSet ( new IVariable<?> [] { org, auth, book, lprice }, new IConstant [] { org2, auth3, book4, price7 } ) > }; > > They then call MAX or MIN on the ?lprice. > The old behavior was to raise an error, my guess is that the new behavior is to give auth2 MIN and price9 MAX (or it might be price5 MIN and auth2 MAX) > > > Jeremy J Carroll > Principal Architect > Syapse, Inc. > > > > On Oct 10, 2013, at 7:49 AM, Jeremy J Carroll <jj...@sy...> wrote: > >> thanks >> Jeremy J Carroll >> Principal Architect >> Syapse, Inc. >> >> >> >> On Oct 10, 2013, at 2:01 AM, Bryan Thompson <br...@sy...> wrote: >> >>> Jeremy, >>> >>> com.bigdata.bop.rdf.aggregate.TestMAX.test_max_with_errors (from com.bigdata.rdf.TestAll) >>> >>> And also TestMIN.test_min_with_errors are failing in CI. >>> >>> Bryan >>> >>> On Oct 9, 2013, at 8:30 PM, "Jeremy J Carroll" <jj...@sy...> wrote: >>> >>>> I have verified that MIN and MAX both inherit from INeedsMaterialization >>>> and added tests that use some of the literals from TestIVComparator that need materialization. >>>> >>>> Unless Jenkins says otherwise I believe the change is good >>>> >>>> Jeremy J Carroll >>>> Principal Architect >>>> Syapse, Inc. >>>> >>>> >>>> >>>> On Oct 9, 2013, at 5:04 PM, Jeremy J Carroll <jj...@sy...> wrote: >>>> >>>>> This is really helpful, I was just about to go home, but think I will spend a few more minutes on this! >>>>> >>>>> >>>>> Jeremy J Carroll >>>>> Principal Architect >>>>> Syapse, Inc. >>>>> >>>>> >>>>> >>>>> On Oct 9, 2013, at 5:00 PM, Bryan Thompson <br...@sy...> wrote: >>>>> >>>>>> From the ticket you mentioned -- the INeedsMaterialization interface is >>>>>> used to declare which bops require materialization. This gets lifted from >>>>>> subexpressions to decide if we need to do materialization before entering >>>>>> the expression. >>>>>> >>>>>>>>> SNIP<<< >>>>>> There are some known issues with comparisons of inline and non-inline >>>>>> IVs which MikeP will be addressing shortly. Basically, we need an >>>>>> INeedsMaterialization option for operators which always require the >>>>>> materialization of non-inline IVs. ORDER_BY and aggregation both have >>>>>> this characteristic since they must run over all solutions at once and >>>>>> can not re-try solutions which have failed with a >>>>>> NotMaterializedException? >>>>>> <https://sourceforge.net/apps/trac/bigdata/wiki/NotMaterializedException>. >>>>>> As an initial step in that direction, I have defined an IVComparator >>>>>> and a test suite for that class and also added a >>>>>> compareLiteral(IV,Literal) to IVUtility. While this covers some cases, >>>>>> it does not cover all as demonstrated by TestIVComparator. >>>>>> >>>>>>>>> SNIP<<< >>>>>> >>>>>> I think that the question for MIN and MAX is just whether they are >>>>>> imposing the total ordering for SPARQL over RDF Values. Again, I would >>>>>> have to look at the code. That might be an old FIXME. Or it might be a >>>>>> valid problem. However, I think that there are DAWG test cases that cover >>>>>> this. I would check there first and make sure that there is a problem (or >>>>>> an absence of test coverage). >>>>>> >>>>>> >>>>>> Bryan >>>>>> >>>>>> >>>>>> On 10/9/13 7:26 PM, "Jeremy J Carroll" <jj...@sy...> wrote: >>>>>> >>>>>>> >>>>>>> Please glance at this code (from MIN.java or MAX.java) >>>>>>> >>>>>>> ... >>>>>>> >>>>>>> + private static IVComparator comparator = new IVComparator(); >>>>>>> >>>>>>> Š >>>>>>> >>>>>>> >>>>>>> - /** >>>>>>> - * FIXME This needs to use the ordering define by >>>>>>> ORDER_BY. The >>>>>>> - * CompareBOp imposes the ordering defined for the "<" >>>>>>> operator >>>>>>> - * which is less robust and will throw a type exception >>>>>>> if you >>>>>>> - * attempt to compare unlike Values. >>>>>>> - * >>>>>>> - * @see >>>>>>> https://sourceforge.net/apps/trac/bigdata/ticket/300#comment:5 >>>>>>> - */ >>>>>>> - if (CompareBOp.compare(iv, min, CompareOp.LT)) { >>>>>>> + if (comparator.compare(iv, min)<0) { >>>>>>> >>>>>>> min = iv; >>>>>>> >>>>>>> } >>>>>>> >>>>>>> >>>>>>> >>>>>>> It seems to work, but I read something about requiring materialization >>>>>>> which I did not understand and chose to ignore - was that a mistake? >>>>>>> >>>>>>> Jeremy J Carroll >>>>>>> Principal Architect >>>>>>> Syapse, Inc. >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> -------------------------------------------------------------------------- >>>>>>> ---- >>>>>>> October Webinars: Code for Performance >>>>>>> Free Intel webinars can help you accelerate application performance. >>>>>>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most >>>>>>> from >>>>>>> the latest Intel processors and coprocessors. See abstracts and register > >>>>>>> http://pubads.g.doubleclick.net/gampad/clk?id=60134071&iu=/4140/ostg.clktr >>>>>>> k >>>>>>> _______________________________________________ >>>>>>> Bigdata-developers mailing list >>>>>>> Big...@li... >>>>>>> https://lists.sourceforge.net/lists/listinfo/bigdata-developers >>>>>> >>>>> >>>> >> > |