Re: [Postgres-xc-general] modifying values at the datanode

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Hi Michael,

 Sure. I am using pgxc v.1.
For the query explain verbose update person set intervened = 84 from d1_sum
WHERE person.pid=d1_sum.pid;

Query Plan:

 Update on public.person  (cost=0.00..0.00 rows=1000 width=24)
   Node/s: datanode1
   Node expr: person.pid
   Remote query: UPDATE ONLY public.person SET intervened = $3 WHERE
((person.ctid = $4) AND (person.xc_node_id = $5))
   ->  Data Node Scan on "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00 rows=1000
width=24)
         Output: person.pid, person.persons, 84, person.pid, person.ctid,
person.xc_node_id, d1_sum.ctid
         Node/s: datanode1
         Remote query: SELECT l.a_1, l.a_2, l.a_3, l.a_4, r.a_1 FROM
((SELECT person.pid, person.persons, person.ctid, person.xc_node_i
d FROM ONLY public.person WHERE true) l(a_1, a_2, a_3, a_4) JOIN (SELECT
d1_sum.ctid, d1_sum.pid FROM ONLY public.d1_sum WHERE true) r(
a_1, a_2) ON (true)) WHERE (l.a_1 = r.a_2)
(8 rows)

For the second style

explain verbose update person set intervened = 84 where person.pid =
(select d1_sum.pid from d1_sum,person WHERE person.pid=d1_sum.pid);

 Update on public.person  (cost=0.00..0.00 rows=1000 width=18)
   Node/s: datanode1
   Node expr: public.person.pid
   Remote query: UPDATE ONLY public.person SET intervened = $3 WHERE
((person.ctid = $4) AND (person.xc_node_id = $5))
   InitPlan 1 (returns $0)
     ->  Data Node Scan on "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00
rows=1000 width=4)
           Output: d1_sum.pid
           Node/s: datanode1
           Remote query: SELECT l.a_1 FROM ((SELECT d1_sum.pid FROM ONLY
public.d1_sum WHERE true) l(a_1) JOIN (SELECT person.pid FROM
ONLY public.person WHERE true) r(a_1) ON (true)) WHERE (l.a_1 = r.a_1)
   ->  Data Node Scan on person "_REMOTE_TABLE_QUERY_"  (cost=0.00..0.00
rows=1000 width=18)
         Output: public.person.pid, public.person.persons, 84,
public.person.pid, public.person.ctid, public.person.xc_node_id
         Node/s: datanode1
         Remote query: SELECT pid, persons, ctid, xc_node_id FROM ONLY
public.person WHERE true
         Coordinator quals: (public.person.pid = $0)

In both the scenarios the planner breaks it into two parts: update and
join. The results of join is pulled up at the coordinator and then shipped
one by one for update.

Thanks for taking a look.

-Sandeep

On Sat, Oct 5, 2013 at 10:26 AM, Michael Paquier
<mic...@gm...>wrote:

> On Sat, Oct 5, 2013 at 11:14 PM, Sandeep Gupta <gup...@gm...>
> wrote:
> > Thanks Michael. I understand. The only issue is that we have an update
> > query as
> >
> > update T set T.a = -1 from A where A.x = T.x
> >
> >
> > Both A and T and distributed by x column. The problem is that coordinator
> > first does the join and then
> > calls update several times at each datanode. This is turning out to be
> too
> > slow. Would have
> > been better if the entire query was shipped to the datanodes.
> Hum?! Logically, I would imagine that if A and T are distributed by x
> this WHERE clause should be pushed down as the SET clause is a
> constant. However perhaps UPDATE FROM does not have an explicit
> support... Could you provide the version number and an EXPLAIN VERBOSE
> output?
>
> What if you put the where join in a subquery or a WITH clause? Like
> that for example:
> update T set T.a = -1 where A.x = (select A.x from A,T where A.x = T.x);
> --
> Michael
>