From: Justin B. <jgb...@gm...> - 2008-03-25 19:32:17
|
One of the benefits of haskelldb is the ability to decompose a query into component parts and put them together in different ways. I believe I have a found a bug in the library that makes this hard to do. This literate email demonstrates the bug. The full source is given at the end, but I want to talk about the cause here. Imagine two queries that project one aggregate column each: tbl1 = do x <- table ... restrict ... project $ ... id1 << count ( ... ) tbl2 = do x <- table ... restrict .. project $ .. id2 << count ( ... ) When each query is run individually, the SQL produced is as expected (i.e., "select xx as id1 from (select count(...) as xx from ..)"). When the queries are combined by hand: cmb1 = do x1 <- table .. x2 <- table .. restrict ( x1 ! ...) restrict (x2 ! ...) project $ id1 << count ( ... ) # id2 << count (...) The SQL produced is also as expected: SELECT id13 as id1, id23 as id2 FROM (SELECT COUNT(id11) as id13, COUNT(id22) as id23 FROM (SELECT id2 as id22 FROM sample_table as T1 WHERE id1 < 0) as T1, (SELECT id1 as id11 FROM sample_table as T1 WHERE id1 > 0) as T2) as T1 The aggregates are at the same level, which is very important. However, if the individual queries are reused: cmb2 = do x1 <- tbl1 x2 <- tbl2 project $ id1 << x1 ! id1 # id2 << x2 ! id 2 the SQL produced is NOT correct. It will execute but the aggregates are at different levels: SELECT id12 as id1, id24 as id2 FROM (SELECT COUNT(id23) as id24, id12 FROM (SELECT id2 as id23 FROM sample_table as T1 WHERE id1 < 0) as T1, (SELECT COUNT(id11) as id12 FROM (SELECT id1 as id11 FROM sample_table as T1) as T1 WHERE id11 > 0) as T2 GROUP BY id12) as T1 When this query is executed, it will return NO rows if "select ID2 as ID23 .." returns no rows. In the correct query, 0 is returned in both columns (as expected with aggregates). That looks like a bug to me (though possibly its a quirk in postgres). The same SQL is produced by this, which is definitely counter-intuitive: cmb3 = do t1 <- do { x <- table ...; restrict ...; project $ id1 << count (x ! id1) } t2 <- do { x <- table ...; restrict ...; project $ id2 << count (x ! id2) } project $ id1 << t1 ! id1 # id2 << t2 ! id2 The bug occurs when the t2 value is constructed. The "table" operation causes a "Binary Times" value to be placed in the PrimQuery being built, BEFORE the aggregate projection is added. I would expect the query for t2 to be built in isolation first, and then added via Times to the existing PrimQuery. This structure can be seen by loading this email into GHCi and entering cmb1Qry, cmb2Qry, and cmb3Qry at the prompt. tbl1Qry and tbl2Qry also show the individual queries in isolation. tbl1SQL, tbl2SQL, etc. are provided which show the SQL generated by each different query. There really isn't a workaround for this bug, except to redefine queries. I am not sure how to fix it as I tried many different approaches. Any suggestions? Justin \begin{code} import Database.HaskellDB import Database.HaskellDB.DBLayout tbl1 :: Query (Rel (RecCons Id1 (Expr Int) RecNil)) tbl1 = do t1 <- table sample_table restrict (t1 ! id1 .>. constant 0) project $ id1 << count (t1 ! id1) tbl2 :: Query (Rel (RecCons Id2 (Expr Int) RecNil)) tbl2 = do t1 <- table sample_table restrict (t1 ! id1 .<. constant 0) project $ id2 << count (t1 ! id2) cmb1 :: Query (Rel (RecCons Id1 (Expr Int) (RecCons Id2 (Expr Int) RecNil))) cmb1 = do t1 <- table sample_table t2 <- table sample_table restrict (t1 ! id1 .>. constant 0) restrict (t2 ! id1 .<. constant 0) project $ id1 << count(t1 ! id1) # id2 << count(t2 ! id2) cmb2 :: Query (Rel (RecCons Id1 (Expr Int) (RecCons Id2 (Expr Int) RecNil))) cmb2 = do t1 <- tbl1 t2 <- tbl2 project $ id1 << t1 ! id1 # id2 << t2 ! id2 cmb3 :: Query (Rel (RecCons Id1 (Expr Int) (RecCons Id2 (Expr Int) RecNil))) cmb3 = do t1 <- do { x <- table sample_table; restrict (x ! id1 .>. constant 0); project $ id1 << count (x ! id1) } t2 <- do { x <- table sample_table; restrict (x ! id1 .<. constant 0); project $ id2 << count (x ! id2) } project $ id1 << t1 ! id1 # id2 << t2 ! id2 tbl1SQL = putStrLn $ showSql tbl1 tbl2SQL = putStrLn $ showSql tbl2 cmb1SQL = putStrLn $ showSql cmb1 cmb2SQL = putStrLn $ showSql cmb2 cmb3SQL = putStrLn $ showSql cmb3 tbl1Qry = putStrLn $ showQuery tbl1 tbl2Qry = putStrLn $ showQuery tbl2 cmb1Qry = putStrLn $ showQuery cmb1 cmb2Qry = putStrLn $ showQuery cmb2 cmb3Qry = putStrLn $ showQuery cmb3 data Id1 = Id1 data Id2 = Id2 instance FieldTag Id1 where fieldName _ = "id1" instance FieldTag Id2 where fieldName _ = "id2" id1 :: Attr Id1 Int id1 = mkAttr Id1 id2 :: Attr Id2 Int id2 = mkAttr Id2 sample_table :: Table (RecCons Id1 (Expr Int) (RecCons Id2 (Expr Int) RecNil)) sample_table = baseTable "sample_table" $ hdbMakeEntry Id1 # hdbMakeEntry Id2 \end{code} |