From: Manikandan S. <ma...@vt...> - 2015-06-04 03:57:26
Hi,

Please see below.

\d+ Table1

                              Table "Table1"
 Column |         Type         | Modifiers | Storage  | Stats target | Description
--------+----------------------+-----------+----------+--------------+-------------
 id     | numeric              |           | main     | 1000         |
 age    | numeric              |           | main     | 1000         |
 gender | numeric              |           | main     | 1000         |
 col1   | character varying(5) |           | extended | 1000         |
Indexes:
    "index_age_table1" btree (age)
    "index_col1_table1" btree (col1)
    "index_gender_table1" btree (gender)
    "index_id_table1" btree (id)
Has OIDs: no
Distribute By: HASH(id)
Location Nodes: ALL DATANODES

\d+ Table2

                     Table "Table2"
 Column |  Type   | Modifiers | Storage | Stats target | Description
--------+---------+-----------+---------+--------------+-------------
 id     | numeric |           | main    | 1000         |
 rep    | numeric |           | main    | 1000         |
 time   | numeric |           | main    | 1000         |
Indexes:
    "index_time_table2" btree (time)
    "index_id_table2" btree (id)
    "index_rep_table2" btree (rep)
Has OIDs: no
Distribute By: HASH(id)
Location Nodes: ALL DATANODES

EXPLAIN VERBOSE output of the query:

GroupAggregate  (cost=0.07..0.10 rows=1 width=56)
  Output: d.col1, round(((count(d.id) / 10))::double precision)
  ->  Sort  (cost=0.07..0.08 rows=1 width=56)
        Output: d.col1, d.id
        Sort Key: d.col1
        ->  Hash Join  (cost=0.01..0.06 rows=1 width=56)
              Output: d.col1, d.id
              Hash Cond: (d.id = i.id)
              ->  Data Node Scan on Table1 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000 width=56)
                    Output: d.col1, d.id
                    Node/s: datanode_c2_d1, datanode_c3_d1, datanode_c4_d1, datanode_c5_d1
                    Remote query: SELECT col1, id FROM ONLY Table1 d WHERE true
              ->  Hash  (cost=0.00..0.00 rows=1000 width=32)
                    Output: i.id
                    ->  Data Node Scan on Table2 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000 width=32)
                          Output: i.id
                          Node/s: datanode_c2_d1, datanode_c3_d1, datanode_c4_d1, datanode_c5_d1
                          Remote query: SELECT id, time FROM ONLY Table2 i WHERE true
                          Coordinator quals: (((i.time)::double precision <= trunc(((('now'::cstring)::date - to_date('01 Jan 2000'::text, 'DD Mon YYYY'::text)))::double precision)) AND ((i.time)::double precision >= trunc(((((('now'::cstring)::date - to_date('01 Jan 2000'::text, 'DD Mon YYYY'::text)) + 1) - 365))::double precision)))

Also, one weird thing I noticed: if I leave out the trunc() date manipulation and just give literal numbers in the BETWEEN clause, like 'BETWEEN 5266 AND 5632', I get a different query plan and the query executes in just a few seconds. The following is the planner output when I give the numbers directly in the BETWEEN clause:

GroupAggregate  (cost=49.83..57.35 rows=1 width=56)
  Output: d.col1, round(((count((count(d.id))) / 10))::double precision)
  ->  Sort  (cost=49.83..52.33 rows=1000 width=56)
        Output: d.col1, (count(d.id))
        Sort Key: d.col1
        ->  Data Node Scan on "__REMOTE_GROUP_QUERY__"  (cost=0.00..0.00 rows=1000 width=56)
              Output: d.col1, (count(d.id))
              Node/s: datanode_c2_d1, datanode_c3_d1, datanode_c4_d1, datanode_c5_d1
              Remote query: SELECT l.a_1, count(l.a_2) FROM ((SELECT d.col1, d.id FROM ONLY Table1 d WHERE true) l(a_1, a_2) JOIN (SELECT i.id FROM ONLY Table2 i WHERE ((i.time >= 5266::numeric) AND (i.time <= 5632::numeric))) r(a_1) ON (true)) WHERE (l.a_2 = r.a_1) GROUP BY 1 ORDER BY 1 DESC

But my requirement is with the trunc() date manipulation.

Thanks
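Comparing the two plans, the only input difference is whether the BETWEEN bounds are computed from CURRENT_DATE or given as literals; in the first plan the time qual stays at the coordinator ("Coordinator quals"), while in the second it is shipped to the datanodes inside the remote query. One possible client-side workaround, just an untested sketch (the :lo and :hi placeholders are illustrative psql-style variables, not something verified on XC), would be to compute the bounds in a first round trip and then substitute them as literals:

    -- Step 1: compute the bounds once, with the same expressions the
    -- original query uses:
    SELECT TRUNC(CURRENT_DATE - TO_DATE('01 Jan 2000','DD Mon YYYY') + 1 - 365) AS lo,
           TRUNC(CURRENT_DATE - TO_DATE('01 Jan 2000','DD Mon YYYY')) AS hi;

    -- Step 2: re-issue the join with the bounds pasted in as literals,
    -- which should produce the fast, pushed-down plan shown above:
    SELECT d.col1 RVALUE, ROUND(COUNT(d.col2)/10) COUNT
    FROM Table1 d, Table2 i
    WHERE d.id = i.id
      AND i.time BETWEEN :lo AND :hi   -- e.g. BETWEEN 5266 AND 5632
    GROUP BY d.col1 ORDER BY d.col1 DESC;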
On Wed, 2015-06-03 at 18:08 -0700, Abbas Butt wrote:
> On Wed, Jun 3, 2015 at 5:10 PM, Manikandan Soundarapandian
> <ma...@vt...> wrote:
> > Hi Abbas,
> >
> > I just looked at the planner output for the query that I am running.
> > This is the query:
> >
> > SELECT d.col1 RVALUE, ROUND(COUNT(d.col2)/10) COUNT
> > FROM Table1 d, Table2 i
> > WHERE d.id = i.id
> >   AND i.time BETWEEN TRUNC(CURRENT_DATE - TO_DATE('01 Jan 2000','DD Mon YYYY') + 1 - 365)
> >                  AND TRUNC(CURRENT_DATE - TO_DATE('01 Jan 2000','DD Mon YYYY'))
> > GROUP BY d.col1 ORDER BY d.col1 DESC;
> >
> > And the query plan for a configuration with 4 datanodes and 1 coordinator:
> >
> > GroupAggregate  (cost=0.07..0.10 rows=1 width=56)
> >   ->  Sort  (cost=0.07..0.08 rows=1 width=56)
> >         Sort Key: d.col1
> >         ->  Hash Join  (cost=0.01..0.06 rows=1 width=56)
> >               Hash Cond: (d.id = i.id)
> >               ->  Data Node Scan on Table1 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000 width=56)
> >                     Node/s: datanode_c2_d1, datanode_c3_d1, datanode_c4_d1, datanode_c5_d1
> >               ->  Hash  (cost=0.00..0.00 rows=1000 width=32)
> >                     ->  Data Node Scan on Table2 "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000 width=32)
> >                           Node/s: datanode_c2_d1, datanode_c3_d1, datanode_c4_d1, datanode_c5_d1
> >                           Coordinator quals: (((i.time)::double precision <= trunc(((('now'::cstring)::date - to_date('01 Jan 2000'::text, 'DD Mon YYYY'::text)))::double precision)) AND ((i.time)::double precision >= trunc(((((('now'::cstring)::date - to_date('01 Jan 2000'::text, 'DD Mon YYYY'::text)) + 1) - 365))::double precision)))
> >
> > Table1 contains 22 million records and Table2 contains 10 million
> > records. This query took several minutes (almost 35) to get results,
> > and one third of that time is spent in FetchTuple. Do you see
> > something wrong with the query plan?
>
> Can you provide the output of
>
> \d+ Table1
> \d+ Table2
>
> I am interested in the distribution column of the tables.
>
> Also please provide the output of EXPLAIN VERBOSE on the query.
>
> > Since the multiplexed IO isn't the real bottleneck here, my proposal
> > would be to change the way the TupleStore at the coordinator is
> > populated and introduce parallelism there. Is this the right
> > direction? Please let me know your thoughts.
> >
> > Thanks
> > Mani
> >
> > On Wed, 2015-06-03 at 01:49 -0700, Abbas Butt wrote:
> > > On Tue, Jun 2, 2015 at 10:17 PM, Manikandan Soundarapandian
> > > <ma...@vt...> wrote:
> > > > Hi Abbas,
> > > >
> > > > Thanks for the reply. I understand how pgxc_node_receive works to
> > > > get incoming data from the datanodes.
> > > > Yes, I have run experiments to study the Postgres-XC system in a
> > > > setup with 12 datanodes and a coordinator, performing join, group
> > > > by and order by operations on two tables with around 20 million
> > > > and 10 million records respectively. I found that around 20-30%
> > > > of the time is spent in the FetchTuple method, which reads one
> > > > datarow at a time into the provided tuple slot from the
> > > > combiner's buffer, which is in turn filled by pgxc_node_receive.
> > >
> > > XC planner pushes parts of query processing to datanodes to speed
> > > things up.
> > >
> > > In the queries you tried, did the query planner perform this
> > > optimization?
> > >
> > > If not, then you can gain an advantage by improving the planner to
> > > generate optimized plans that avoid unnecessary materialization at
> > > the coordinator.
> > >
> > > This will bring the FetchTuple time down, since there will be
> > > fewer tuples to fetch.
> > >
> > > If the planner is already producing optimized plans, then what's
> > > your proposal?
> > > >
> > > > Thanks
> > > >
> > > > On Wed, 2015-06-03 at 09:45 +0500, Abbas Butt wrote:
> > > > > On Wed, Jun 3, 2015 at 4:11 AM, Manikandan Soundarapandian
> > > > > <ma...@vt...> wrote:
> > > > > > Hi,
> > > > > >
> > > > > > I am a graduate student working on my research in parallel
> > > > > > databases. I would like to know how the Postgres-XC
> > > > > > coordinator works. I understand that the datanodes run the
> > > > > > query in parallel and the results are collected by the
> > > > > > coordinator, which runs any further computation that is
> > > > > > required or just provides the output to the client that
> > > > > > requested the query. I would like to know whether the
> > > > > > coordinator does this data collection from the datanodes in
> > > > > > a sequential fashion.
> > > > >
> > > > > The coordinator uses multiplexed IO, using select on all the
> > > > > fds of the datanodes. For more details please see the
> > > > > pgxc_node_receive function in pgxcnode.c. The loop that reads
> > > > > data on all the set fds is sequential, but the coordinator
> > > > > does not wait for data from the datanode to which it sent the
> > > > > query first.
> > > > >
> > > > > > For example, let's say we want to run the following query on
> > > > > > table table_x, which is hash-distributed among 10 datanodes:
> > > > > >
> > > > > > select count(*) from table_x;
> > > > > >
> > > > > > Each datanode will run the query and return its local count,
> > > > > > and the coordinator has to collect the individual counts and
> > > > > > come up with the final count before sending the output. Is
> > > > > > the data collection process at the coordinator done in a
> > > > > > sequential fashion? I am actually looking to introduce some
> > > > > > kind of parallelism in this data collection if it is
> > > > > > sequential, and do performance studies. Please clarify.
> > > > >
> > > > > To improve the performance of any system, first study the
> > > > > bottleneck, and target to widen that. Have you done any study
> > > > > of Postgres-XC to find where the performance bottleneck is?
> > > > >
> > > > > > --
> > > > > > Thanks
> > > > > > Mani
> > > > > > Department of Computer Science
> > > > > > Virginia Tech
>
> --
> Abbas
> Architect
>
> Ph: 92.334.5100153
> Skype ID: gabbasb
> www.enterprisedb.com
>
> Follow us on Twitter
> @EnterpriseDB
>
> Visit EnterpriseDB for tutorials, webinars, whitepapers and more
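P.S. On the count(*) example from the start of the thread: when the planner does push the aggregate down, each datanode computes a local count and the coordinator only combines one partial result per node, which is the shape of the fast "__REMOTE_GROUP_QUERY__" plan above. A rough stand-alone illustration of that two-phase aggregation (the per-datanode numbers below are made up, purely for illustration):

    -- Phase 1, executed independently on each datanode:
    --   SELECT count(*) FROM ONLY table_x;
    -- Phase 2, on the coordinator, combining the partial counts; here
    -- the four per-datanode results are mocked up with a VALUES list:
    SELECT sum(partial_count) AS total
    FROM (VALUES (2501345), (2499822), (2500017), (2498816)) AS t(partial_count);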