|
From: Amit K. <ami...@en...> - 2012-07-04 08:37:43
|
On 4 July 2012 11:35, Aris Setyawan <ari...@gm...> wrote:
> > XC planner is pretty smart, all the clauses are analyzed at the
> Coordinator level.
>
> If I'm not mistaken, in WITH clause, after a first query run, many sub
> of first query will be produced and these sub queries may produce
> another queries too (or go to termination condition ). This is a run
> time query.
>
> Every sub query produced from another query will be send to
> coordinator, to distributed to some of data nodes.
>
> Consider this example from postgres documentation.
>
> WITH RECURSIVE search_graph(id, link, data, depth, path, cycle) AS (
> SELECT g.id, g.link, g.data, 1,
> ARRAY[ROW(g.f1, g.f2)],
> false
> FROM graph g
> UNION ALL
> SELECT g.id, g.link, g.data, sg.depth + 1,
> path || ROW(g.f1, g.f2),
> ROW(g.f1, g.f2) = ANY(path)
> FROM graph g, search_graph sg
> WHERE g.id = sg.link AND NOT cycle
> )
> SELECT * FROM search_graph;
>
> I think many cross node join (intermediated with coordinator) will be
> happened.
> And then WITH clause (in graph search case) will always longer
> executed in a cluster than in a single node.
>
>
Hi Aris,
In the above query, the recursive part is iteratively re-run. So suppose
the recursive part query is planned as a hash join of the table 'graph' and
the intermediate work table. For each iteration, the Hash Join plan is
*rescanned*, so I don't think there would be a new join created for each
iteration, rather, the same hash is re-used.
Also the Work Table Scan is materialized at the coordinator. It does not
keep fetching the data again and again.
Check the explain output for this query below, which might clarify the
above explaination for you. But please let me know for any more issues you
have.
QUERY
PLAN
----------------------------------------------------------------------------------------------------------------
Sort (cost=2308.17..2311.30 rows=1250 width=73)
Output: search_graph.f, search_graph.t, search_graph.label,
search_graph.path, search_graph.cycle
Sort Key: search_graph.path
CTE search_graph
-> Recursive Union (cost=0.00..2218.88 rows=1250 width=72)
-> Data Node Scan on graph "_REMOTE_TABLE_QUERY_"
(cost=0.00..0.00 rows=1000 width=40)
Output: g.f, g.t, g.label, ARRAY[ROW(g.f, g.t)], false
Node/s: data_node_1, data_node_2
Remote query: SELECT f, t, label FROM ONLY graph g WHERE
true
-> Hash Join (cost=0.01..219.39 rows=25 width=72)
Output: g.f, g.t, g.label, (sg.path || ROW(g.f, g.t)),
(ROW(g.f, g.t) = ANY (sg.path))
Hash Cond: (sg.t = g.f)
-> WorkTable Scan on search_graph sg (cost=0.00..200.00
rows=5000 width=36)
Output: sg.f, sg.t, sg.label, sg.path, sg.cycle
Filter: (NOT sg.cycle)
-> Hash (cost=0.00..0.00 rows=1000 width=40)
Output: g.f, g.t, g.label
-> Data Node Scan on graph "_REMOTE_TABLE_QUERY_"
(cost=0.00..0.00 rows=1000 width=40)
Output: g.f, g.t, g.label
Node/s: data_node_1, data_node_2
Remote query: SELECT f, t, label FROM ONLY
graph g WHERE true
-> CTE Scan on search_graph (cost=0.00..25.00 rows=1250 width=73)
Output: search_graph.f, search_graph.t, search_graph.label,
search_graph.path, search_graph.cycle
(23 rows)
On 7/4/12, Michael Paquier <mic...@gm...> wrote:
> > On Wed, Jul 4, 2012 at 2:38 PM, Aris Setyawan <ari...@gm...>
> wrote:
> >
> >> Hi All,
> >>
> >> > Hi Aris,
> >> > We found that documents were not updated. WITH clause is supported in
> >> XC. Please try
> >> > to use it and let us know if it doesn't work for you. Thanks for
> >> pointing it out.
> >>
> >> But how the coordinator will split the WITH clause (recursive) query?
> >> If the query wrongly splitted, then many cross datanode join will
> >> occurred.
> >> This is a well known issue in a graph partitioned database.
> >>
> > XC planner is pretty smart, all the clauses are analyzed at the
> Coordinator
> > level.
> > Then only the necessary clauses and expressions are shipped to the
> > necessary remote nodes depending on the table distribution.
> > It may be possible that a lot of data is fetched back to Coordinator, but
> > this depends on how you defined the table distribution strategy of your
> > application.
> > --
> > Michael Paquier
> > http://michael.otacoo.com
> >
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Postgres-xc-general mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
>
|