Re: [Postgres-xc-general] pushing order by clause to datanodes

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

I maybe missing something. Is it the case that postgres or pgxc doesn't 
support
streaming aggregates? That would allow aggregation over sorted streams 
to produce a
aggregated stream that is also sorted.
However, I am not too sure about this.

-Sandeep

On 10/17/2013 07:13 AM, Ashutosh Bapat wrote:
> That's not possible right now. For grouping, either the grouped input 
> from the datanode gets shuffled at the coordinator or ordered on the 
> grouping column. Thus even if we get the ordered intput from datanode 
> on a particular column that order is disturbed at the coordinator 
> because of grouping.
>
>
> On Thu, Oct 17, 2013 at 4:37 PM, Sandeep Gupta 
> <gup...@gm... <mailto:gup...@gm...>> wrote:
>
>     How about a plan where the datanodes perform the sort as well and the
>     coordinator performs a sorted merge?
>
>     Are such plans not part of the query planner?
>
>     -Sandeep
>
>
>     On 10/17/2013 07:04 AM, Ashutosh Bapat wrote:
>>     There is GROUP BY clause that needs to be evaluated before the
>>     result can be ordered. Thus GROUP BY is sent to the datanode but
>>     not ORDER BY.
>>
>>
>>     On Thu, Oct 17, 2013 at 4:31 PM, Sandeep Gupta
>>     <gup...@gm... <mailto:gup...@gm...>> wrote:
>>
>>         Hi Ashutosh,
>>
>>
>>           Attached below is the query and the corresponding query
>>         plan. I am using version 1.1.
>>         Thanks for taking a look at this.
>>
>>         -Sandeep
>>
>>         SELECT exposed_time effectedDate, ROUND(COUNT(a.pid)/10) COUNT FROM public.vt_demography_info_xc d, public.ses_vt_20130805_xc a WHERE d.pid=a.pid AND d.countyid='50015' AND d.age BETWEEN 5 AND 18 AND d.gender=1 GROUP BY exposed_time ORDER BY exposed_time;
>>
>>
>>
>>
>>                                             QUERY PLAN (Coordinator)
>>
>>           Sort  (cost=10000000005.03..10000000005.03 rows=1 width=8)
>>             Output: a.exposed_time, (round(((count((count(a.pid))) / 10))::double precision))
>>             Sort Key: a.exposed_time
>>             ->  HashAggregate  (cost=5.00..5.02 rows=1 width=8)
>>                   Output: a.exposed_time, round(((count((count(a.pid))) / 10))::double precision)
>>                   ->  Data Node Scan on "__REMOTE_GROUP_QUERY__"  (cost=0.00..0.00 rows=1000 width=8)
>>                         Output: a.exposed_time, (count(a.pid))
>>                         Node/s: datanode1, datanode10, datanode11, datanode12, datanode13, datanode14, datanode15, datanode16, datanode2, datanode3, datanode4, datanode5, datanode6, datanode7, datanode8, datanode9
>>                         Remote query: SELECT r.a_1, count(r.a_2) FROM ((SELECT d.pid FROM ONLY public.vt_demography_info_xc d WHERE ((d.age >= 5) AND (d.age <= 18) AND ((d.countyid)::text = '50015'::text) AND (d.gender = 1))) l(a_1) JOIN (SELECT a.exposed_time, a.pid FROM ONLY public.ses_vt_20130805_xc a WHERE ((a.exposed_time >= 4667) AND (a.exposed_time <= 5031))) r(a_1, a_2) ON (true)) WHERE (l.a_1 = r.a_2) GROUP BY 1
>>         (9 rows)
>>
>>                                               QUERY PLAN (Datanode)
>>
>>           GroupAggregate  (cost=0.00..47862.29 rows=225 width=8)
>>             Output: a.exposed_time, round(((count(a.pid) / 10))::double precision)
>>             ->  Nested Loop  (cost=0.00..47856.05 rows=460 width=8)
>>                   Output: a.exposed_time, a.pid
>>                   ->  Index Scan using et_ses on public.ses_vt_20130805_xc a  (cost=0.00..7283.10 rows=129583 width=8)
>>                         Output: a.pid, a.rep, a.exposed_time, a.infectious_time, a.recovered_time
>>                         Index Cond: ((a.exposed_time >= 4667) AND (a.exposed_time <= 5031))
>>                   ->  Index Scan using pid_demo on public.vt_demography_info_xc d  (cost=0.00..0.30 rows=1 width=4)
>>                         Output: d.pid, d.hid, d.age, d.gender, d.zipode, d.blockgroupid, d.longitude, d.lattitude, d.county, d.countyid
>>                         Index Cond: (d.pid = a.pid)
>>                         Filter: ((d.age >= 5) AND (d.age <= 18) AND ((d.countyid)::text = '50015'::text) AND (d.gender = 1))
>>         (11 rows)
>>
>>
>>
>>
>>         On Thu, Oct 17, 2013 at 12:12 AM, Ashutosh Bapat
>>         <ash...@en...
>>         <mailto:ash...@en...>> wrote:
>>
>>             Sandeep,
>>             It would be nice if you mention the version of XC in your
>>             mail. Sort push down is available from 1.1 onwards. If
>>             you do not see sort getting pushed down in 1.1, please
>>             report detailed definitions of the tables, query and the
>>             EXPLAIN output.
>>
>>
>>             On Thu, Oct 17, 2013 at 1:09 AM, Sandeep Gupta
>>             <gup...@gm...
>>             <mailto:gup...@gm...>> wrote:
>>
>>                 Hi,
>>
>>                  In an another query that requires the result to be
>>                 aggregated and ordered by a field (lets say timeo)
>>                 the query planner currently pulls  the results and
>>                 then performs a sort with hash aggregate.
>>
>>                 The table at the datanodes are clustered by timeo. I
>>                 was wondering if it possible
>>                 for query planner to push down the order by clause at
>>                 the datanode and then perform
>>                 sort-merge aggregate at the coordinator. Surely, that
>>                 would be a better query plan.
>>
>>                 We have tried enable_sort=off etc. but that doesn't
>>                 work.
>>
>>                 Thanks.
>>                 Sandeep
>>
>>
>>
>>                 ------------------------------------------------------------------------------
>>                 October Webinars: Code for Performance
>>                 Free Intel webinars can help you accelerate
>>                 application performance.
>>                 Explore tips for MPI, OpenMP, advanced profiling, and
>>                 more. Get the most from
>>                 the latest Intel processors and coprocessors. See
>>                 abstracts and register >
>>                 http://pubads.g.doubleclick.net/gampad/clk?id=60135031&iu=/4140/ostg.clktrk
>>                 _______________________________________________
>>                 Postgres-xc-general mailing list
>>                 Pos...@li...
>>                 <mailto:Pos...@li...>
>>                 https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
>>
>>
>>
>>
>>             -- 
>>             Best Wishes,
>>             Ashutosh Bapat
>>             EnterpriseDB Corporation
>>             The Postgres Database Company
>>
>>
>>
>>
>>
>>     -- 
>>     Best Wishes,
>>     Ashutosh Bapat
>>     EnterpriseDB Corporation
>>     The Postgres Database Company
>
>
>
>
> -- 
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company