Re: [Postgres-xc-developers] Replicated update and delete

SourceForge Headquarters 225 Broadway Suite 1600 San Diego, CA 92101 +1 (858) 422-6466

Isn't it fair to error out if there are no primary/unique keys on the
replicated table? IMO, Mason's approach handles things pretty ok for tables
with unique type of keys.

Regards,
Nikhils

On Thu, Nov 7, 2013 at 11:15 AM, 鈴木 幸市 <ko...@in...> wrote:

>  Yes, we need to focus on such general solution for replicated tuple
> identification.
>
>  I'm afraid it may take much more research and implementation work.   I
> believe the submitted patch handles tuple replica based on the primary key
> or other equivalents if available.   If not, the code falls into the
> current case, local CTID.   The latter could lead to inconsistent replica
> but it is far better than the current situation.
>
>  For short-term solution, I think Mason's code looks reasonable if I
> understand the patch correctly.
>
>  Mason, do you have any more thoughts/comments?
> ---
> Koichi Suzuki
>
>  On 2013/11/07, at 14:21, Amit Khandekar <ami...@en...>
>  wrote:
>
>
>
>
> On 6 November 2013 18:31, Michael Paquier <mic...@gm...>wrote:
>
>> On Wed, Nov 6, 2013 at 3:28 PM, Amit Khandekar
>> <ami...@en...> wrote:
>> > What exactly does the PostgreSQL FDW doc say about updates and primary
>> key ?
>>  By having a look here:
>>
>> http://www.postgresql.org/docs/9.3/static/fdw-callbacks.html#FDW-CALLBACKS-UPDATE
>> It is recommended to use a kind of row ID or the primary key columns.
>> In the case of XC row ID = CTID, and its uniqueness is not guaranteed
>> except if coupled with a node ID, which I think it has... Using a CTID
>> + node ID combination makes the analysis of tuple uniqueness
>> impossible for replicated tables either way, so a primary key would be
>> better IMO.
>>
>> > How does the postgres_fdw update a table that has no primary or unique
>> key ?
>>  It uses the CTID when scanning remote tuples for UPDATE/DELETE, thing
>> guarantying that tuples are unique in this case as the FDW deals with
>> a single server, here is for example the case of 2 nodes listening
>> ports 5432 and 5433.
>> $ psql -p 5433 -c "CREATE TABLE aa (a int, b int);"
>> CREATE TABLE
>>
>> On server with port 5432:
>> =# CREATE EXTENSION postgres_fdw;
>> CREATE EXTENSION
>> =# CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw
>> OPTIONS (host 'localhost', port '5432', dbname 'ioltas');
>> CREATE SERVER
>> =# CREATE USER MAPPING FOR PUBLIC SERVER postgres_server OPTIONS
>> (password '');
>> CREATE USER MAPPING
>> =# CREATE FOREIGN TABLE aa_foreign (a int, b int) SERVER
>> postgres_server OPTIONS (table_name 'aa');
>> CREATE FOREIGN TABLE
>> =# explain verbose update  aa_foreign set a = 1, b=2 where a = 1;
>>                                    QUERY PLAN
>>
>> --------------------------------------------------------------------------------
>>  Update on public.aa_foreign  (cost=100.00..144.40 rows=14 width=6)
>>    Remote SQL: UPDATE public.aa SET a = $2, b = $3 WHERE ctid = $1
>>    ->  Foreign Scan on public.aa_foreign  (cost=100.00..144.40 rows=14
>> width=6)
>>          Output: 1, 2, ctid
>>          Remote SQL: SELECT ctid FROM public.aa WHERE ((a = 1)) FOR UPDATE
>> (5 rows)
>> And ctid is used for scanning...
>>
>> > In the patch, what do we do when the replicated table has no
>> unique/primary
>> > key ?
>>  I didn't look at the patch, but I think that replicated tables should
>> also need a primary key. Let's imagine something like that with
>> sessions S1 and S2 for a replication table, and 2 datanodes (1 session
>> runs in common on 1 Coordinator and each Datanode):
>> S1: INSERT VALUES foo in Dn1
>> S2: INSERT VALUES foo2 in Dn1
>> S2: INSERT VALUES foo2 in Dn2
>> S1: INSERT VALUES foo in Dn2
>> This will imply that those tuples have a different CTID, so a primary
>> key would be necessary as I think that this is possible.
>>
>
>  If the patch does not handle the case of replicated table without unique
> key, I think we should have a common solution which takes care of this case
> also. Or else, if this solution can be extended to handle no-unique-key
> case, then that would be good. But I think we would end up in having two
> different implementations, one for unique-key method, and another for the
> other method, which does not seem good.
>
>  The method I had in mind was :
> In the scan plan, fetch ctid, node_id from all the datanodes.  Use UPDATE
> where ctd = ? , but use nodeid-based method to  generate the ExecNodes at
> execute-time  (enhance ExecNodes->en_expr evaluation so as to use the
> nodeid from source plan, as against the distribution column that it
> currently uses for distributed tables) .
> But this method will not work as-is in case of non-shippable row triggers.
> Because trigger needs to be fired only once per row, and we are going to
> execute UPDATE for all of the ctids of a given row corresponding to all of
> the datanodes. So somehow we should fire triggers only once. This method
> will also hit performance, because currently we fetch *all* columns and not
> just ctid, so it's better to first do that optimization of fetching only
> reqd columns (there's one pending patch submitted in the mailing list,
> which fixes this).
>
>  This is just one approach, there might be better approaches..
>
>  Overall, I think if we decide to get this issue solved (and I think we
> should really, this is a serious issue), sufficient resource time needs to
> be given to think over and have discussions before we finalize the approach.
>
>
>  --
>> Michael
>>
>
>
> ------------------------------------------------------------------------------
> November Webinars for C, C++, Fortran Developers
> Accelerate application performance with scalable programming models.
> Explore
> techniques for threading, error checking, porting, and tuning. Get the
> most
> from the latest Intel processors and coprocessors. See abstracts and
> register
>
> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk_______________________________________________
> Postgres-xc-developers mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>
>
>
>
> ------------------------------------------------------------------------------
> November Webinars for C, C++, Fortran Developers
> Accelerate application performance with scalable programming models.
> Explore
> techniques for threading, error checking, porting, and tuning. Get the most
> from the latest Intel processors and coprocessors. See abstracts and
> register
> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk
> _______________________________________________
> Postgres-xc-developers mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers
>
>

-- 
StormDB - http://www.stormdb.com
The Database Cloud