Menu

#50 Join on varchar not pushing down properly

9.2rc
closed
mason_s
None
2
None
2015-03-08
2015-03-03
mason_s
No

Join push down is not being done properly for varchar. For example:
create table kc21
(akc190 varchar, akc193 varchar, col2 int)
distribute by hash(akc190);

create table kc24
(akc190 varchar, col3 int)
distribute by hash(akc190);

create table kc60
(akc190 varchar, col4 int)
distribute by hash(akc190);

create table ka06
(akc193 varchar, col5 int)
distribute by replication;

analyze kc21;
analyze kc24;
analyze kc60;
analyze ka06;

explain
select *
from kc21
inner join kc24 on kc21.akc190 = kc24.akc190
inner join kc60 on kc24.akc190 = kc60.akc190
inner join ka06 on kc21.akc193 = ka06.akc193;

                                           QUERY PLAN

------------------------------------------------------------------------------------------------

Remote Subquery Scan on all (dn1,dn2) (cost=709.87..3743.89 rows=193048 width=176)
-> Merge Join (cost=709.87..3743.89 rows=193048 width=176)
Merge Cond: ((kc24.akc190)::text = (kc21.akc190)::text)
-> Remote Subquery Scan on all (dn1,dn2) (cost=270.85..1517.50 rows=7564 width=72)
Distribute results by R
-> Merge Join (cost=170.85..290.46 rows=7564 width=72)
Merge Cond: ((kc24.akc190)::text = (kc60.akc190)::text)
-> Remote Subquery Scan on all (dn1,dn2) (cost=100.00..217.01 rows=1230 w
idth=36)
Distribute results by R
-> Sort (cost=280.14..283.21 rows=1230 width=36)
Sort Key: kc24.akc190
-> Seq Scan on kc24 (cost=0.00..22.30 rows=1230 width=36)
-> Sort (cost=85.43..88.50 rows=1230 width=36)
Sort Key: kc60.akc190
-> Seq Scan on kc60 (cost=0.00..22.30 rows=1230 width=36)
-> Sort (cost=539.02..551.78 rows=5104 width=104)
Sort Key: kc21.akc190
-> Merge Join (cost=143.97..224.68 rows=5104 width=104)
Merge Cond: ((kc21.akc193)::text = (ka06.akc193)::text)
-> Sort (cost=58.54..60.62 rows=830 width=68)
Sort Key: kc21.akc193
-> Seq Scan on kc21 (cost=0.00..18.30 rows=830 width=68)
-> Sort (cost=85.43..88.50 rows=1230 width=36)
Sort Key: ka06.akc193
-> Seq Scan on ka06 (cost=0.00..22.30 rows=1230 width=36)
(25 rows)

Note the extra "Remote Subquery Scan" nodes

but, changing the data type to text works ok:

drop table kc21;
drop table kc24;
drop table kc60;
drop table ka06;

create table kc21
(akc190 text, akc193 text, col2 int)
distribute by hash(akc190);

create table kc24
(akc190 text, col3 int)
distribute by hash(akc190);

create table kc60
(akc190 text, col4 int)
distribute by hash(akc190);

create table ka06
(akc193 text, col5 int)
distribute by replication;

analyze kc21;
analyze kc24;
analyze kc60;
analyze ka06;

explain
select *
from kc21
inner join kc24 on kc21.akc190 = kc24.akc190
inner join kc60 on kc24.akc190 = kc60.akc190
inner join ka06 on kc21.akc193 = ka06.akc193;

                                  QUERY PLAN

Remote Subquery Scan on all (dn1,dn2) (cost=709.87..3743.89 rows=193048 width=176)
-> Merge Join (cost=709.87..3743.89 rows=193048 width=176)
Merge Cond: (kc24.akc190 = kc21.akc190)
-> Merge Join (cost=170.85..290.46 rows=7564 width=72)
Merge Cond: (kc24.akc190 = kc60.akc190)
-> Sort (cost=85.43..88.50 rows=1230 width=36)
Sort Key: kc24.akc190
-> Seq Scan on kc24 (cost=0.00..22.30 rows=1230 width=36)
-> Sort (cost=85.43..88.50 rows=1230 width=36)
Sort Key: kc60.akc190
-> Seq Scan on kc60 (cost=0.00..22.30 rows=1230 width=36)
-> Sort (cost=539.02..551.78 rows=5104 width=104)
Sort Key: kc21.akc190
-> Merge Join (cost=143.97..224.68 rows=5104 width=104)
Merge Cond: (kc21.akc193 = ka06.akc193)
-> Sort (cost=58.54..60.62 rows=830 width=68)
Sort Key: kc21.akc193
-> Seq Scan on kc21 (cost=0.00..18.30 rows=830 width=68)
-> Sort (cost=85.43..88.50 rows=1230 width=36)
Sort Key: ka06.akc193
-> Seq Scan on ka06 (cost=0.00..22.30 rows=1230 width=36)
(21 rows)

We need to make sure that varchar can be pushed down.

Discussion

  • Andrei Martsinchyk

    Fixed by 0de5ca63d97c406fc1581abc4434cb897c147c9f
    Solution is to strip off RelabelType from the expression in join condition when evaluating join distribution

     
  • Andrei Martsinchyk

    • status: open --> closed
     

Log in to post a comment.