From: Rob V. <rv...@do...> - 2015-05-28 10:51:34
Max

Comments inline:

From: Max - Micrologiciel <ma...@mi...>
Date: Wednesday, 27 May 2015 13:48
To: Rob Vesse <rv...@do...>
Subject: About PR#36

> Hi Rob,
>
> just been reviewing some comments you made in the #36 PR
> <https://bitbucket.org/dotnetrdf/dotnetrdf/pull-request/36/new-spin-library>
> about a change I made at first with the order of join arguments between the
> query's algebra and any possible BindingPattern.
>
> You wrote:
> "Though I think our handling of VALUES may already be broken in some cases
> anyway e.g. interaction with GROUP BY"
>
> Would you have some example that exposes the problem, so I can have a look
> into it?

If memory serves, the problem is that we apply VALUES too soon. It should apply after any GROUP BY, HAVING and SELECT expressions, but we apply it before those. This is a fairly trivial fix which I simply haven't got round to because it is a rare enough case that nobody has ever complained that it is broken (NB - It's fixed in the new Medusa engine on the 1.9 branch)

>
>
> On the other hand, I do not agree with you when you say that inverting the
> join parameters would break our compliance with the spec: since the join
> operation is normally commutative (nor does the recommendation specify
> explicitly in which order the sets are to be joined), we should be
> able to join the arguments in both orders and get the same results.

In principle, yes. However, once you start doing indexed joins this does have the potential to break things if you aren't careful, though we are fairly careful these days so it probably doesn't make a difference nowadays.

> Moreover, evaluating the bindings first could also lead to better performance
> since bound variables injection into the RHS whenever possible would lighten
> the multiset to join with.
>
> There is also an issue I encountered and I'd like to discuss with the Extend
> algebra.
> When used as a left join LHS, it prevents injecting the bound variables into
> the RHS due to the CanFlowResultsToRhs workings and how the extended variable
> is always treated as floating.

Well, anything introduced by Extend always has to be treated as floating because the expression could produce an error or an unbound value.

There are a couple of cases, when the expression is a constant value or a copy of a variable (provided we know that variable to be fixed), that we could special case, but otherwise we can't do anything more.

If you are generating Extends simply to introduce constants, generating Values instead may be a better approach and will benefit from index joins as you note.

Rob

>
> Perhaps it would be better to discuss these live, if you're available?
>
> Cheers,
> Max.
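
For readers following the thread, the two points discussed (join commutativity, and VALUES interacting with GROUP BY) can be illustrated with a minimal Python sketch. This is not dotNetRDF code: solution mappings are modelled as plain dicts, and `compatible`, `join`, `as_multiset` and `group_count` are hypothetical helpers written only for this illustration. The data set is made up.

```python
from collections import Counter

def compatible(m1, m2):
    # Two solution mappings are compatible if they agree on every shared variable.
    return all(m2[var] == val for var, val in m1.items() if var in m2)

def join(left, right):
    # Naive nested-loop join of two multisets of solution mappings.
    return [{**l, **r} for l in left for r in right if compatible(l, r)]

def as_multiset(solutions):
    # Order-insensitive view of a result, so two join orders can be compared.
    return Counter(frozenset(s.items()) for s in solutions)

def group_count(solutions, key, out):
    # Stand-in for GROUP BY ?key with COUNT(*) AS ?out.
    counts = Counter(s[key] for s in solutions if key in s)
    return [{key: k, out: n} for k, n in counts.items()]

# Made-up results of a graph pattern binding ?s and ?o.
pattern = [
    {"s": "a", "o": 1},
    {"s": "a", "o": 2},
    {"s": "b", "o": 1},
]
# Made-up trailing VALUES data: VALUES ?o { 1 }
values = [{"o": 1}]

# Swapping the join arguments yields the same multiset of solutions.
assert as_multiset(join(pattern, values)) == as_multiset(join(values, pattern))

# VALUES applied before grouping restricts the rows that get counted.
early = group_count(join(pattern, values), "s", "n")

# VALUES applied after grouping: ?o is no longer in scope, so the join
# only adds the binding and the counts are left untouched.
late = join(group_count(pattern, "s", "n"), values)

print(early)
print(late)
```

In this toy data the count for `a` differs between the two orders, which is the kind of case where applying VALUES too early changes the answer. The nested-loop join here is symmetric by construction; the caveat about indexed joins in the reply concerns implementations that treat one side specially.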