Postgres-XL / Tickets / #62 Co-distributed joins still getting re-distributed

#62 Co-distributed joins still getting re-distributed

Milestone: 9.2rc

Status: open

Owner: Koichi Suzuki

Labels: None

Priority (1-5): 3

Est. Days: None

Assigned To: nobody

Updated: 2015-06-10

Created: 2015-06-10

Creator: Bob A.

Private: No

Hello, I have installed postgres-xl on 2 servers: gtm + coord1 + datanode1 on server A, and coord2 + datanode2 on server B. I have 2 fairly large tables distributed by hash(c) and a fairly small master data table that is replicated to all nodes.

In my first example: you can see the problem: I've joined 2 very large tables by their distribution key, but the explain plan definitely shows pg-xl redistributing both of the tables by hash(c) again (even though it was already distributed by hash(c) ). I thought the ability to push-down co-distributed joins without re-distributing was a big difference between pg-xc and pg-xl:

[local]:5432 pgxl = # explain select sum(my_data1.b) from my_data1, my_data2 **where my_data1.c = my_data2.c**;
                         QUERY PLAN
 Aggregate  (cost=20580574.24..20580574.25 rows=1 width=4)
   ->  Remote Subquery Scan on all (dn1,dn2)  (cost=3364881.84..20330900.00 rows=99869696 width=4)
         ->  Aggregate  (cost=3364881.84..20330900.00 rows=1 width=4)
               ->  Hash Join  (cost=3364881.84..20330900.00 rows=99869696 width=4)
                     Hash Cond: ((my_data2.c)::text = (my_data1.c)::text)
                     ->  **Remote Subquery Scan on all** (dn1,dn2)  (cost=100.00..3064213.62 rows=99869696 width=6)
                           **Distribute results by H: c**
                           ->  Seq Scan on my_data2  (cost=0.00..1965546.96 rows=99869696 width=6)
                     ->  Hash  (cost=3326710.21..3326710.21 rows=99974304 width=12)
                           ->  **Remote Subquery Scan on all** (dn1,dn2)  (cost=100.00..3326710.21 rows=99974304 width=12)
                                 **Distribute results by H: c**
                                 ->  Seq Scan on my_data1  (cost=0.00..1627047.04 rows=99974304 width=12)

In my second example, you'll notice the same explain plan as above, even though the 2nd query is not joining on the distribution key:

[local]:5432 pgxl = # explain select sum(my_data1.b) from my_data1, my_data2 where my_data1.b = my_data2.b;
                        QUERY PLAN

 Aggregate  (cost=20385312.24..20385312.25 rows=1 width=4)
   ->  Remote Subquery Scan on all (dn1,dn2)  (cost=3267250.84..20135638.00 rows=99869696 width=4)
         ->  Aggregate  (cost=3267250.84..20135638.00 rows=1 width=4)
               ->  Hash Join  (cost=3267250.84..20135638.00 rows=99869696 width=4)
                     Hash Cond: (my_data2.b = my_data1.b)
                     ->**  Remote Subquery Scan on all (dn1,dn2)**  (cost=100.00..2864474.22 rows=99869696 width=4)
                           **Distribute results by H: b**
                           ->  Seq Scan on my_data2  (cost=0.00..1965546.96 rows=99869696 width=4)
                     ->  Hash  (cost=2526915.78..2526915.78 rows=99974304 width=4)
                           ->  **Remote Subquery Scan on all** (dn1,dn2)  (cost=100.00..2526915.78 rows=99974304 width=4)
                                 **Distribute results by H: b**
                                 ->  Seq Scan on my_data1  (cost=0.00..1627047.04 rows=99974304 width=4)
(12 rows)

In my third example, you can see pg-xl doing a great job of joining to a replicated table. There is no re-distribution of the data.

[local]:5432 pgxl = # explain select sum(b) from my_data1, master_data where my_data1.a = master_data.a;
                       QUERY PLAN
 Aggregate  (cost=3251630.55..3251630.56 rows=1 width=4)
   ->  Remote Subquery Scan on all (dn1,dn2)  (cost=1.07..3001694.79 rows=99974304 width=4)
         ->  Aggregate  (cost=1.07..3001694.79 rows=1 width=4)
               ->  Hash Join  (cost=1.07..3001694.79 rows=99974304 width=4)
                     Hash Cond: ((my_data1.a)::text = (master_data.a)::text)
                     ->  Seq Scan on my_data1  (cost=0.00..1627047.04 rows=99974304 width=6)
                     ->  Hash  (cost=1.03..1.03 rows=3 width=2)
                           ->  Seq Scan on master_data  (cost=0.00..1.03 rows=3 width=2)

Is there a way to force pg-xl to join co-distributed tables without re-distributing the data?

Thank you,
Bob

Discussion

Bob A. - 2015-06-10

Sorry, I forgot to mention that the 2nd and 3rd examples above have the good-looking explain plans that I expected. But I expected the 1st explain plan (co-distributed join) to look more like the 3rd explain plan (join to a replicated table) because each node has all of the data it needs to perform the join without re-distribution. Unfortunately, the 1st explain plan (co-distributed join) is identical to the 2nd explain plan (join on non-distributed field).

My expectation is based on this quote from the pg-xl documentation:

If the equi-join is done by distribution columns and they share the distribution method (hash/modulo), the Coordinator can tell the Datanode to perform join. If not, it may tell Datanodes to ship rows to other Datanodes and expect rows from other Datanodes to be shipped to it.

Otherwise, I really do like the product. I just need to figure out this one piece.

If you would like to refer to this comment somewhere else in this project, copy and paste the following link:

Co-distributed joins still getting re-distributed

Milestone

Searches

Help

#62 Co-distributed joins still getting re-distributed

Discussion