From: Michael P. <mic...@gm...> - 2013-11-06 13:01:11
|
On Wed, Nov 6, 2013 at 3:28 PM, Amit Khandekar <ami...@en...> wrote: > What exactly does the PostgreSQL FDW doc say about updates and primary key ? By having a look here: http://www.postgresql.org/docs/9.3/static/fdw-callbacks.html#FDW-CALLBACKS-UPDATE It is recommended to use a kind of row ID or the primary key columns. In the case of XC row ID = CTID, and its uniqueness is not guaranteed except if coupled with a node ID, which I think it has... Using a CTID + node ID combination makes the analysis of tuple uniqueness impossible for replicated tables either way, so a primary key would be better IMO. > How does the postgres_fdw update a table that has no primary or unique key ? It uses the CTID when scanning remote tuples for UPDATE/DELETE, thing guarantying that tuples are unique in this case as the FDW deals with a single server, here is for example the case of 2 nodes listening ports 5432 and 5433. $ psql -p 5433 -c "CREATE TABLE aa (a int, b int);" CREATE TABLE On server with port 5432: =# CREATE EXTENSION postgres_fdw; CREATE EXTENSION =# CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host 'localhost', port '5432', dbname 'ioltas'); CREATE SERVER =# CREATE USER MAPPING FOR PUBLIC SERVER postgres_server OPTIONS (password ''); CREATE USER MAPPING =# CREATE FOREIGN TABLE aa_foreign (a int, b int) SERVER postgres_server OPTIONS (table_name 'aa'); CREATE FOREIGN TABLE =# explain verbose update aa_foreign set a = 1, b=2 where a = 1; QUERY PLAN -------------------------------------------------------------------------------- Update on public.aa_foreign (cost=100.00..144.40 rows=14 width=6) Remote SQL: UPDATE public.aa SET a = $2, b = $3 WHERE ctid = $1 -> Foreign Scan on public.aa_foreign (cost=100.00..144.40 rows=14 width=6) Output: 1, 2, ctid Remote SQL: SELECT ctid FROM public.aa WHERE ((a = 1)) FOR UPDATE (5 rows) And ctid is used for scanning... > In the patch, what do we do when the replicated table has no unique/primary > key ? I didn't look at the patch, but I think that replicated tables should also need a primary key. Let's imagine something like that with sessions S1 and S2 for a replication table, and 2 datanodes (1 session runs in common on 1 Coordinator and each Datanode): S1: INSERT VALUES foo in Dn1 S2: INSERT VALUES foo2 in Dn1 S2: INSERT VALUES foo2 in Dn2 S1: INSERT VALUES foo in Dn2 This will imply that those tuples have a different CTID, so a primary key would be necessary as I think that this is possible. -- Michael |