From: Pavan D. <pav...@gm...> - 2013-07-11 08:48:37
|
Hello All, The following case shows a bug in the FQS logic. I am using fairly recent master branch (may be a few days old). So if this has been fixed recently, please let me know: test=# CREATE TABLE ltab_repl (a int, b char(10)) DISTRIBUTE BY REPLICATION; CREATE TABLE test=# CREATE TABLE rtab (ar int, br char(10)); CREATE TABLE test=# INSERT INTO ltab_repl SELECT generate_series(1,5), 'foo'; INSERT 0 5 test=# INSERT INTO rtab SELECT generate_series(1,4), 'bar'; INSERT 0 4 test=# SELECT * FROM ltab_repl ; a | b ---+------------ 1 | foo 2 | foo 3 | foo 4 | foo 5 | foo (5 rows) test=# SELECT * FROM rtab; ar | br ----+------------ 1 | bar 2 | bar 3 | bar 4 | bar (4 rows) test=# set enable_fast_query_shipping TO off; SET test=# SELECT * FROM ltab_repl LEFT JOIN rtab ON (a = ar); a | b | ar | br ---+------------+----+------------ 1 | foo | 1 | bar 2 | foo | 2 | bar 3 | foo | 3 | bar 4 | foo | 4 | bar 5 | foo | | (5 rows) test=# set enable_fast_query_shipping TO on; SET test=# SELECT * FROM ltab_repl LEFT JOIN rtab ON (a = ar); a | b | ar | br ---+------------+----+------------ 1 | foo | 1 | bar 2 | foo | 2 | bar 3 | foo | | 4 | foo | | 5 | foo | | 1 | foo | | 2 | foo | | 3 | foo | 3 | bar 4 | foo | 4 | bar 5 | foo | | (10 rows) test=# EXPLAIN VERBOSE SELECT * FROM ltab_repl LEFT JOIN rtab ON (a = ar); QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------- Data Node Scan on "__REMOTE_FQS_QUERY__" (cost=0.00..0.00 rows=0 width=0) Output: ltab_repl.a, ltab_repl.b, rtab.ar, rtab.br Node/s: d1, d2 Remote query: SELECT ltab_repl.a, ltab_repl.b, rtab.ar, rtab.br FROM (public.ltab_repl LEFT JOIN public.rtab ON ((ltab_repl.a = rtab.ar))) (4 rows) As you would see the query is giving a wrong result when FQS is turned ON. I don't think its correct to push down the query as it is to the datanodes. What's happening really is that each datanode is returning 5 rows each (since ltab_repl is a replicated table, contains 5 rows and is at the left side of the left join) and they are being appended together. I think I'd suggested in the past to run regression by turning these various optimization GUCs on/off and comparing the results. While we might see some row ordering issues when these GUCs are turned on/off, the final result should remain the same. Such an exercise will help to uncover such bugs. Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee |