All,
I've created an initial implementation which removes the 'distinct'
option attached to all queries, and implements an alternate method of
specifying 'distinct' rows. I added a 'unique' function, exported by
HaskellDB.Query, which ensures that all projected attributes
ultimately end up in a 'group by' clause, unless the attribute is
itself an aggregate already.
Technically, I added a 'Group' constructor to the 'PrimQuery' data
type. I did not modify the existing aggregate expression
implementation. I believe my change does not affect that at all. The
'Group' could really just be named 'Unique' for now, since it's only
used to indicate all rows should be unique, but I suspect it will
expand into other "aggregate" operators from the relational algebra.
I then added another function the the generator which handles groups.
When converting from PrimQuery to SqlSelect, defaultSqlGroup (in
Default.hs) is used to ensure all non-aggregate columns are in the
'group by' clause. Because I do not get the "PrimExpr" values that
define the attributes (by this point I only get [(SqlColumn,
SqlExpression)] pairs if I'm lucky), I rely on a hack to determine if
a column is aggregate or not[1]. If the column is defined as a
FunSqlExpression, I assume it is aggregate. This hack is implemented
in "isAggregateSqlExpr", which is exported by Sql.hs.
I'd like comments on the code and the approach. Please don't commit
the patch as I haven't tested it much and I'm sure my implementation
will continue to evolve.
Justin
[1] I think I now see what Jeremy meant when he said the "conversion
to internal representation" happened too early. Not having this
information made life painful.
[2] A quick example of how this is used (this uses my local table
definitions so it won't compile directly). The two queries are:
allROs = table DB.ro_table
unique_ROs = do
ro_hdr <- allROs
unique ro_hdr
And the SQL produced (eliding many columns):
*Main> putStrLn $ showSql allROs
SELECT ...
mod_dt
FROM ro_table as T1
*Main> putStrLn $ showSql unique_ROs
SELECT ...
mod_dt1 as mod_dt
FROM (SELECT ...
mod_dt as mod_dt1
FROM ro_table as T1
GROUP BY ...
mod_dt) as T1
|