From: Henning T. <has...@he...> - 2008-07-10 21:24:10
|
On Thu, 10 Jul 2008, Justin Bailey wrote: > By the way, another way to write your query is: > > do > p <- table Points.points > unique > project (Points.c << p!Points.c) >From the description of 'unique' it was not clear to me, which 'project' the 'unique' refers to. At first I thought, 'unique' refers to the last 'project' before 'unique'. But then I thought, that 'project' is just like 'return' and thus 'unique' cannot "see" it. So, is the location of 'unique' irrelevant? Isn't it better then to make it a function of type unique :: Query (Rel r) -> Query (Rel r) which is not applied by monadic binding, but as a transformation of the whole query? Currently it's implemented by GROUP BY but it could also be implemented by DISTINCT, right? Now I thought a bit more about grouping and aggregations. When writing database queries in the Query monad then I compare that with the list monad. Tables or multi-sets could just be seen as lists with irrelevant order. However, in the list monad a 'unique' function, that is applied by monadic bind, could not be implemented, because the monadic bind always feeds single elements to its second operand. But 'unique' must have access to the whole table in order to check for duplicates. So, is it sensible to have the 'unique' function how it currently is, if it can only be implemented in terms of query expressions but not in terms of real data (namely lists)? I think a big deficiency of SQL is, that it tries to handle three types of query answers in the same query form. There are r -- scalar types as answers produced by aggregations Rel r -- the answer of a regular query Rel (Rel r) -- an intermediate type that arises when grouping I hoped 'unique' would give us a way to avoid the last type, but it seems, that I was wrong. Firstly I liked, if HaskellDB would be more precise than SQL with respect to the types. If aggregations would have type (Rel r -> r) then it would not be possible to accidentally apply an aggregation twice (AVG(AVG(x2))), mix aggregations with simple column accesses (SELECT avg(x1), x2 FROM ...) and there would be no need to check whether the list returned by the query indeed consists of a single element. Then there are aggregations in conjunction with grouping, where they turn an intermediate nested relation of type (Rel (Rel a)) back into a (Rel b). A function which groups and aggregates may have type groupAndAggregate :: (r -> a) -- select columns or other values to group for -> (a -> Rel r -> b) -- aggregation for each group with access to the grouping criterion -> Rel r -> Rel b |