Re: [Postgres-xc-developers] Replicated update and delete

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Yes, we need to focus on such general solution for replicated tuple identification.

I'm afraid it may take much more research and implementation work.   I believe the submitted patch handles tuple replica based on the primary key or other equivalents if available.   If not, the code falls into the current case, local CTID.   The latter could lead to inconsistent replica but it is far better than the current situation.

For short-term solution, I think Mason's code looks reasonable if I understand the patch correctly.

Mason, do you have any more thoughts/comments?
---
Koichi Suzuki

On 2013/11/07, at 14:21, Amit Khandekar <ami...@en...<mailto:ami...@en...>>
 wrote:

On 6 November 2013 18:31, Michael Paquier <mic...@gm...<mailto:mic...@gm...>> wrote:
On Wed, Nov 6, 2013 at 3:28 PM, Amit Khandekar
<ami...@en...<mailto:ami...@en...>> wrote:
> What exactly does the PostgreSQL FDW doc say about updates and primary key ?
By having a look here:
http://www.postgresql.org/docs/9.3/static/fdw-callbacks.html#FDW-CALLBACKS-UPDATE
It is recommended to use a kind of row ID or the primary key columns.
In the case of XC row ID = CTID, and its uniqueness is not guaranteed
except if coupled with a node ID, which I think it has... Using a CTID
+ node ID combination makes the analysis of tuple uniqueness
impossible for replicated tables either way, so a primary key would be
better IMO.

> How does the postgres_fdw update a table that has no primary or unique key ?
It uses the CTID when scanning remote tuples for UPDATE/DELETE, thing
guarantying that tuples are unique in this case as the FDW deals with
a single server, here is for example the case of 2 nodes listening
ports 5432 and 5433.
$ psql -p 5433 -c "CREATE TABLE aa (a int, b int);"
CREATE TABLE

On server with port 5432:
=# CREATE EXTENSION postgres_fdw;
CREATE EXTENSION
=# CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw
OPTIONS (host 'localhost', port '5432', dbname 'ioltas');
CREATE SERVER
=# CREATE USER MAPPING FOR PUBLIC SERVER postgres_server OPTIONS (password '');
CREATE USER MAPPING
=# CREATE FOREIGN TABLE aa_foreign (a int, b int) SERVER
postgres_server OPTIONS (table_name 'aa');
CREATE FOREIGN TABLE
=# explain verbose update  aa_foreign set a = 1, b=2 where a = 1;
                                   QUERY PLAN
--------------------------------------------------------------------------------
 Update on public.aa_foreign  (cost=100.00..144.40 rows=14 width=6)
   Remote SQL: UPDATE public.aa SET a = $2, b = $3 WHERE ctid = $1
   ->  Foreign Scan on public.aa_foreign  (cost=100.00..144.40 rows=14 width=6)
         Output: 1, 2, ctid
         Remote SQL: SELECT ctid FROM public.aa WHERE ((a = 1)) FOR UPDATE
(5 rows)
And ctid is used for scanning...

> In the patch, what do we do when the replicated table has no unique/primary
> key ?
I didn't look at the patch, but I think that replicated tables should
also need a primary key. Let's imagine something like that with
sessions S1 and S2 for a replication table, and 2 datanodes (1 session
runs in common on 1 Coordinator and each Datanode):
S1: INSERT VALUES foo in Dn1
S2: INSERT VALUES foo2 in Dn1
S2: INSERT VALUES foo2 in Dn2
S1: INSERT VALUES foo in Dn2
This will imply that those tuples have a different CTID, so a primary
key would be necessary as I think that this is possible.

If the patch does not handle the case of replicated table without unique key, I think we should have a common solution which takes care of this case also. Or else, if this solution can be extended to handle no-unique-key case, then that would be good. But I think we would end up in having two different implementations, one for unique-key method, and another for the other method, which does not seem good.

The method I had in mind was :
In the scan plan, fetch ctid, node_id from all the datanodes.  Use UPDATE where ctd = ? , but use nodeid-based method to  generate the ExecNodes at execute-time  (enhance ExecNodes->en_expr evaluation so as to use the nodeid from source plan, as against the distribution column that it currently uses for distributed tables) .
But this method will not work as-is in case of non-shippable row triggers. Because trigger needs to be fired only once per row, and we are going to execute UPDATE for all of the ctids of a given row corresponding to all of the datanodes. So somehow we should fire triggers only once. This method will also hit performance, because currently we fetch *all* columns and not just ctid, so it's better to first do that optimization of fetching only reqd columns (there's one pending patch submitted in the mailing list, which fixes this).

This is just one approach, there might be better approaches.

Overall, I think if we decide to get this issue solved (and I think we should really, this is a serious issue), sufficient resource time needs to be given to think over and have discussions before we finalize the approach.

--
Michael

------------------------------------------------------------------------------
November Webinars for C, C++, Fortran Developers
Accelerate application performance with scalable programming models. Explore
techniques for threading, error checking, porting, and tuning. Get the most
from the latest Intel processors and coprocessors. See abstracts and register
http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk_______________________________________________
Postgres-xc-developers mailing list
Pos...@li...
https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers