Menu

#52 Performance Optimization for WindowAgg

9.2rc
open
mason_s
None
4
None
nobody
2015-03-11
2015-03-11
mason_s
No

Reported by Kaijiang Chen:

I have 1 coordinator and 2 data nodes. I create a table as:

create table kakou (uid int, timest as int, kkname as text) distribute by hash(uid);

explain select uid,name1,time1,name2,time2 from

( select r1.uid as uid, r1.kkname as name1,r1.timest as time1, r2.kkname as name2, r2.timest as time2, rank()
over (partition by r1.uid order by r1.timest desc)
as pos from kakou as r1 join kakou as r2 on r1.uid = r2.uid where r1.kkname < r2.kkname and abs(r1.timest-r2.timest) < 30 and r1.timest>10 and r2.timest>10 )

as tmp1 where pos=1;

I got:

Subquery Scan on tmp1 (cost=34.17..86.31 rows=1 width=76)
Filter: (tmp1.pos = 1)
-> WindowAgg (cost=34.17..85.28 rows=83 width=76)
-> Remote Subquery Scan on all (dn1,dn2) (cost=34.17..83.82 rows=83 width=76)
-> Sort (cost=86.47..86.68 rows=83 width=76)
Sort Key: r1.uid, r1.timest
-> Hash Join (cost=34.17..83.82 rows=83 width=76)
Hash Cond: (r1.uid = r2.uid)
Join Filter: ((r1.kkname < r2.kkname) AND (abs((r1.timest - r2.timest)) < 30))
-> Bitmap Heap Scan on kakou r1 (cost=7.25..22.09 rows=387 width=40)
Recheck Cond: (timest > 10)
-> Bitmap Index Scan on kakou_timest (cost=0.00..7.15 rows=387 width=0)
Index Cond: (timest > 10)
-> Hash (cost=22.09..22.09 rows=387 width=40)
-> Bitmap Heap Scan on kakou r2 (cost=7.25..22.09 rows=387 width=40)
Recheck Cond: (timest > 10)
-> Bitmap Index Scan on kakou_timest (cost=0.00..7.15 rows=387 width=0)
Index Cond: (timest > 10)

I think the computing of the WindowAgg can be done in 2 data nodes because the WindowAgg uses r1.uid (the distribution key) as the aggregate key. But in the above XL's plan, the WindowAgg is done in coordinator.

Pushing down the WindowAgg to data nodes should be a performance boost.

Discussion


Log in to post a comment.

Want the latest updates on software, tech news, and AI?
Get latest updates about software, tech news, and AI from SourceForge directly in your inbox once a month.