From: Max - M. <ma...@mi...> - 2015-06-01 17:20:46
Ok Rob,
Sadly, I must agree with you after reviewing the recommendation in depth,
and even more sadly this is yet another SPARQL disappointment :(
They must have had reasons to define it this way, but I feel they broke
left-join logic: when an LHS binding errors or is left unbound, the left
join may end up behaving like a full outer join, whereas they follow
natural-join logic (empty joined-variable bindings being filtered out)
when dealing with relational-like structures...
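
A minimal Python sketch of the behaviour in question, assuming solution mappings are plain dicts and that an unbound variable is simply absent from the dict (as in §18.3, where only variables in both domains are compared). The names compatible, extend, left_join and make_tmp_graph, and the sample data, are hypothetical and are not dotNetRDF or ARQ code; the OPTIONAL filter is left out for brevity.

```python
from typing import Callable, Dict, List

Solution = Dict[str, str]  # variable name -> RDF term, modelled as a string


def compatible(m1: Solution, m2: Solution) -> bool:
    """Mappings are compatible if they agree on every variable bound in both."""
    return all(m1[v] == m2[v] for v in m1.keys() & m2.keys())


def extend(omega: List[Solution], var: str,
           expr: Callable[[Solution], str]) -> List[Solution]:
    """BIND: bind var to expr(mu); if expr errors, var stays unbound."""
    out = []
    for mu in omega:
        try:
            out.append({**mu, var: expr(mu)})
        except TypeError:
            out.append(dict(mu))  # evaluation continues without the binding
    return out


def left_join(lhs: List[Solution], rhs: List[Solution]) -> List[Solution]:
    """OPTIONAL (no filter): keep each LHS row, merged with compatible RHS rows."""
    out = []
    for m1 in lhs:
        matches = [{**m1, **m2} for m2 in rhs if compatible(m1, m2)]
        out.extend(matches if matches else [dict(m1)])
    return out


def make_tmp_graph(mu: Solution) -> str:
    """Hypothetical stand-in for IRI(CONCAT("tmp:", STR(?g)))."""
    g = mu["g"]
    if g.startswith("_:"):  # models the expression erroring on a blank node
        raise TypeError("cannot build an IRI from a blank node")
    return "tmp:" + g


lhs = extend([{"g": "ex:g1"}, {"g": "_:b0"}], "tmpG", make_tmp_graph)
rhs = [{"tmpG": "tmp:ex:g1", "x": "a"}, {"tmpG": "tmp:ex:g2", "x": "b"}]
result = left_join(lhs, rhs)

for row in result:
    print(row)
```

In this sketch the row where ?tmpG is bound matches only the RHS row with the same term, while the row where the BIND errored shares no variable with the RHS and is therefore merged with every RHS row.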
2015-06-01 15:37 GMT+02:00 Rob Vesse <rv...@do...>:
> No I don't think so
>
> Consider:
>
> {
> ?s ?p ?o .
> BIND (?o / 0 AS ?example)
> }
>
> ?o is a fixed variable, however regardless of whether it is fixed/floating
> the expression it is involved in may error (or in this example always
> error) and so we always have to treat ?example as a floating variable
>
> That modification will only work for trivial left joins (and sometimes not
> even then if you have FILTERs over the left join), deep left joins will
> almost certainly be broken by that change.
>
> The logic around whether we flow results is based on the logic used by
> Apache Jena ARQ which is pretty much the reference implementation of SPARQL
> since it is maintained by the editor of the SPARQL Query specification.
>
> Rob
>
> From: Max - Micrologiciel <ma...@mi...>
> Reply-To: dotNetRDF Developer Discussion and Feature Request <
> dot...@li...>
> Date: Thursday, 28 May 2015 16:01
> To: dotNetRDF Developer Discussion and Feature Request <
> dot...@li...>
> Subject: Re: [dotNetRDF-Develop] About PR#36
>
> Ok forget this, I just realized I was running the test-case against my
> Sesame Store instead of the InMemory (which effectively handles the join
> example correctly).
> I'll file a bug report with them.
>
> Sorry for the time wasted and thanks for your patience.
>
> However for the BIND LHS with left join, the variables I use in the
> expression are also bound from a triplePattern. Here is an example
>
> ex:cmd ex:hasDefaultGraph ?g .
> BIND(IRI(CONCAT("tmp:", STR(?g))) AS ?tmpG)
> OPTIONAL { GRAPH ?tmpG { ...
>
> Could it be safe to say that the BIND variable is floating if any of the
> expression variables is floating and otherwise make it fixed ?
> And modify the CanFlowResultsToRhs l.312 into
>
> if (rhsFloating.Any(v => lhsFloating.Contains(v) /* ||
> lhsFixed.Contains(v) */ )) return false;
>
> This seems to do the trick in my case.
>
> Max.
>
>
>
> 2015-05-28 15:32 GMT+02:00 Rob Vesse <rv...@do...>:
>
>> Comments inline
>>
>>
>> From: Max - Micrologiciel <ma...@mi...>
>> Date: Thursday, 28 May 2015 13:36
>> To: Rob Vesse <rv...@do...>
>> Cc: dotNetRDF Developer Discussion and Feature Request <
>> dot...@li...>
>> Subject: Re: About PR#36
>>
>> Sorry but I missed the conclusion of the demonstration :
>>
>> This to show that after a variable is defined by a BIND statement it
>> should not be considered as a floating variable.
>>
>>
>> However your conclusion about floating variables and BIND is wrong at
>> least according to how we define and use the concept of a floating variable
>> within the Leviathan engine
>>
>> A floating variable is a variable whose value is not guaranteed to be
>> bound which as I pointed out is the exact definition of a BIND variable,
>> the expression could error and so the variable could be unbound
>>
>>
>>
>> To my understanding, the only floating variables to be considered should
>> come from VALUES clauses.
>>
>>
>> Floating variables can come from BIND, OPTIONAL, VALUES, SELECT
>> expressions and aggregates, i.e. anywhere an expression is evaluated or
>> where it is possible to have unbound values
>>
>>
>> Max.
>>
>> 2015-05-28 14:31 GMT+02:00 Max - Micrologiciel <ma...@mi...>:
>>
>>> Rob,
>>>
>>> thanks for the answers.
>>>
>>> Concerning the extend case, I then definitely believe there is a flaw in
>>> our evaluation logic.
>>> *To me, the join operation should behave as it would in relational logic
>>> meaning comparing NULL with NULL will always return false so no result.*
>>>
>>
>> It already does, you seem to be conflating joins with left joins
>> (OPTIONAL) and the two are not the same
>>
>>
>>> Here's my demonstration of the case.
>>>
>>> First about the join evaluation, based on the recommendation we get :
>>>
>>> 1. §18.5 : Join(Ω1, Ω2) = { merge(μ1, μ2) | μ1 in Ω1 and μ2 in Ω2,
>>> and μ1 and μ2 are compatible }
>>> 2. §18.3 : Two solution mappings μ1 and μ2 are compatible if, for
>>> every variable v in dom(μ1) and in dom(μ2), μ1(v) = μ2(v)
>>> Here, μ1(v) = μ2(v) means that μ1(v) and μ2(v) are the same RDF term.
>>>
>>> Inferred from this, the join definition would be equivalent to
>>> Join(Ω1, Ω2) = { merge(μ1, μ2) | μ1 in Ω1 and μ2 in Ω2, and for each
>>> variable v in intersect(dom(μ1), dom(μ2)) sameterm(μ1(v), μ2(v)) is true }
>>> which means the join
>>>
>>> ?s ?p1 ?o1 .
>>> ?s ?p2 ?o2
>>>
>>> is equivalent to
>>>
>>> ?s1 ?p1 ?o1 .
>>> ?s2 ?p2 ?o2
>>> FILTER (sameterm(?s1, ?s2))
>>>
>>> But we also have :
>>>
>>> 1. §17 Specifically, FILTERs eliminate any solutions that, when
>>> substituted into the expression, either result in an effective boolean
>>> value of false or produce an error.
>>> 2. §17.2 sameterm will produce a type error if any arguments are
>>> unbound
>>>
>>>
>>> Then about the extend case, let's say we have this graph pattern:
>>>
>>> ?s ?p ?o . FILTER(isLiteral(?o))
>>> ?s2 ?p2 ?o2 .
>>>
>>> The evaluation will return a cross join of both triple pattern multisets
>>> since, according to §18.3, they are compatible because they have no
>>> variable in common.
>>>
>>> On the other hand, given the following pattern,
>>>
>>> {?s ?p ?o . FILTER(isBlank(?o)) }
>>> BIND (iri(?o) as ?s2) .
>>> ?s2 ?p2 ?o2
>>>
>>> Under your logic, the join would return me the same results, since
>>> iri(?o) will produce a type error: ?o is a blank node, which is not
>>> accepted by the IRI function.
>>>
>>
>> Nowhere did I say this
>>
>> With regards to index joins we were talking specifically about the case
>> of a BIND being on the LHS of an OPTIONAL which is completely different
>> because left joins are not commutative
>>
>> Rob
>>
>>
>>>
>>> I do not agree with this since :
>>>
>>> 1. §10.1 Use of BIND ends the preceding basic graph pattern.
>>> 2. If the evaluation of the expression produces an error, the
>>> variable remains unbound for that solution but the query evaluation
>>> continues.
>>>
>>> Which means to me that in fact we now have to perform a join between the
>>> two multisets μ1[?s ?p ?o ?s2] and μ2[?s2 ?p2 ?o2]
>>>
>>>
>>> So, still according to §18.5 and §18.3, both multisets are now
>>> incompatible, since they share the ?s2 variable, which cannot be compared
>>> under the sameterm conditions.
>>>
>>> Thus we should get no result back from the query.
>>>
>>>
>>> 2015-05-28 12:50 GMT+02:00 Rob Vesse <rv...@do...>:
>>>
>>>> Max
>>>>
>>>> Comments inline:
>>>>
>>>> From: Max - Micrologiciel <ma...@mi...>
>>>> Date: Wednesday, 27 May 2015 13:48
>>>> To: Rob Vesse <rv...@do...>
>>>> Subject: About PR#36
>>>>
>>>> Hi Rob,
>>>>
>>>> just been reviewing some comment you made in the #36 PR
>>>> <https://bitbucket.org/dotnetrdf/dotnetrdf/pull-request/36/new-spin-library> about
>>>> a change I made at first with the order of join arguments between the
>>>> query's algebra and any possible BindingPattern.
>>>>
>>>> You wrote :
>>>> "Though I think our handling of VALUES may already be broken in some
>>>> cases anyway e.g. interaction with GROUP BY"
>>>>
>>>> Would you have some example that exposes the problem, so I can have a
>>>> look into it ?
>>>>
>>>>
>>>> If memory serves the problem is that we apply VALUES too soon. It
>>>> should apply after any GROUP BY, HAVING and SELECT expressions but we apply
>>>> it before those. This is a fairly trivial fix which I simply haven't got
>>>> round to because it is a rare enough case that nobody has ever complained
>>>> that it is broken (NB - It's fixed in the new Medusa engine on the 1.9
>>>> branch)
>>>>
>>>>
>>>>
>>>> On the other hand, I do not agree with you when you say that inverting
>>>> the join parameters would break our compliance with the spec: since the
>>>> join operation is normally commutative (and the recommendation does not
>>>> explicitly specify in which order the sets are to be joined), we should
>>>> be able to join the arguments in either order and get the same results.
>>>>
>>>>
>>>> In principle yes, however once you start doing indexed joins this does
>>>> have the potential to break things if you aren't careful, though we are
>>>> fairly careful these days so it probably doesn't make a difference nowadays
>>>>
>>>> Moreover, evaluating the bindings first could also lead to better
>>>> performance, since injecting bound variables into the RHS whenever
>>>> possible would lighten the multiset to join with.
>>>>
>>>> There is also an issue I encountered and I'd like to discuss with the
>>>> Extend algebra.
>>>> When used as a left join LHS, it prevents injecting the bound variables
>>>> into the Rhs due to the CanFlowResultsToRhs workings and how the extended
>>>> variable is always treated as floating.
>>>>
>>>>
>>>> Well anything introduced by Extend always has to be treated as floating
>>>> because the expression could produce an error or an unbound value
>>>>
>>>> There are a couple of cases when the expression is a constant value or
>>>> a copy of a variable (provided we know that variable to be fixed) that we
>>>> could special case but otherwise we can't do anything more.
>>>>
>>>> If you are generating Extends simply to introduce constants generating
>>>> Values instead may be a better approach and will benefit from index joins
>>>> as you note.
>>>>
>>>> Rob
>>>>
>>>>
>>>> Perhaps it would be better to discuss these live, if you're available ?
>>>>
>>>> Cheers,
>>>> Max.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>
>>
>> ------------------------------------------------------------------------------
>>
>> _______________________________________________
>> dotNetRDF-develop mailing list
>> dot...@li...
>> https://lists.sourceforge.net/lists/listinfo/dotnetrdf-develop
>>
>>