From: Abbas B. <abb...@en...> - 2014-02-12 12:16:02

On Wed, Feb 12, 2014 at 3:47 PM, Mason Sharp <ms...@tr...> wrote:

> It seems unnecessary if the table already has a primary key or unique
> index. Anyway, approach C is the approach that I originally took with
> GridSQL/Stado, adding something called xrowid, but we later disabled it
> by default.

Was there any other reason for disabling it, other than code simplicity
and maintainability?

> In hindsight I would have saved the trouble and not implemented it, to
> keep the code simpler and easier to maintain, and just left it up to the
> user to use a key.
>
> To summarize, I would go with B.

What is your stance on the fact that going with option B makes us
backward incompatible?

--
Abbas
Architect
www.enterprisedb.com
From: Mason S. <ms...@tr...> - 2014-02-12 10:55:39

On Wed, Feb 12, 2014 at 1:08 AM, 鈴木 幸市 <ko...@in...> wrote:

> On 2014/02/12 15:00, Ashutosh Bapat <ash...@en...> wrote:
>
>> On Tue, Feb 11, 2014 at 8:03 PM, Abbas Butt <abb...@en...> wrote:
>>>
>>> Approach B: (Suggested by many)
>>> If the replicated table does not have a primary or unique not-null
>>> key, then error out on a non-shippable update or delete; otherwise
>>> use the patch sent by Mason, after some testing and refactoring.
>>
>> This would break backward compatibility. Also, a table which is fairly
>> stable and doesn't have a primary key or unique key will need to be
>> distributed even though it's a perfect candidate for being a
>> replicated table. Also, we have to see if updating the primary or
>> unique key would cause a problem.
>
> Then we should keep using the same WHERE clause in the shipped
> statement too. As pointed out, using ctid in a replicated table is very
> dangerous.

I agree.

I don't think it is unreasonable at all to require a primary key or
unique index for replicated tables... normally one would want to do
that. If they don't have a primary key, they can just add a SERIAL at
creation time and use that.

As an alternative, all columns could be used as a fake primary key to
try to find the particular row. In GridSQL we used that approach, but it
does not seem so clean... I believe there is a check in there so that if
multiple rows match the criteria the operation fails, since the row is
not uniquely identifiable. In hindsight, I wish we had not bothered.

>>> Approach C: (Suggested by Amit)
>>> Always have some kind of a hidden (or system) column for replicated
>>> tables. Its type can be serial, or an int column with default
>>> nextval('sequence_type'), so that it will always be evaluated on the
>>> coordinator; use this column as the primary key.
>>
>> This looks a better approach, but it also means that inserts into the
>> replicated table have to be driven through the coordinator. This might
>> not be that stringent a condition, given that replicated tables are
>> expected to be fairly stable. Any replicated table being inserted into
>> so often would run into the performance problem anyway.
>
> I'm afraid it would take a long effort to fix all the consequences of
> this change. How do you think about this? As I noted, approach C has
> good points. The issue is how long it takes. With approach B, we can
> easily change this handling to approach C later. I'd like to have your
> opinion on this.

It seems unnecessary if the table already has a primary key or unique
index. Anyway, approach C is the approach that I originally took with
GridSQL/Stado, adding something called xrowid, but we later disabled it
by default. In hindsight I would have saved the trouble and not
implemented it, to keep the code simpler and easier to maintain, and
just left it up to the user to use a key.

To summarize, I would go with B.

>>> My vote is for approach B.
>>
>> Whatever approach is taken, we will have to stick to it in future
>> versions of XC. We cannot keep changing user-visible functionality
>> with every version. Until 1.2 we didn't have the restriction that
>> replicated tables should have a primary key. Introducing that
>> requirement now means users have to modify their applications. If we
>> change it again in the next version, we will break compatibility
>> again. So, considering the long-term benefit, we should go with C.

This is reasonable to require at this stage, since how it is done now is
very dangerous. As previously mentioned, I have seen this cause problems
with production data. I think people would gladly add a key in exchange
for not having bad things happen to their data.

Also, I would not release any new version of Postgres-XC without this
fix. If a 1.2 release is a ways away, there should be an intermediate
1.1.x release that fixes this soon. I would not recommend Postgres-XC
for people who will be updating replicated tables without the patch I
submitted; it is too dangerous.

--
Mason Sharp

TransLattice - http://www.translattice.com
Distributed and Clustered Database Solutions
From: Koichi S. <koi...@gm...> - 2014-02-12 10:05:39

It is just to maintain the order of rows; there is no other intention. It
is just like adding an ORDER BY clause, or NUM NODES OFF and NODES OFF,
to keep the .out files identical. The RETURNING clause reports rows, but
the order may vary occasionally, unlike vanilla PG. We need a means to
maintain the row order in such a case.

Regards;
---
Koichi Suzuki

2014-02-12 18:48 GMT+09:00 Ashutosh Bapat <ash...@en...>:
> Hi
> Just a note: while changing the testcases like this, please make sure
> that the original intention of the testcase is not challenged. Can you
> please paste some examples?
From: Ashutosh B. <ash...@en...> - 2014-02-12 09:48:14

Hi,

Just a note: while changing the testcases like this, please make sure
that the original intention of the testcase is not challenged. Can you
please paste some examples?

On Wed, Feb 12, 2014 at 3:03 PM, 鈴木 幸市 <ko...@in...> wrote:
> How about SELECT * FROM (UPDATE ... RETURNING *) AS X ORDER BY 1, 2; ?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
From: 鈴木 幸市 <ko...@in...> - 2014-02-12 09:33:49

How about SELECT * FROM (UPDATE ... RETURNING *) AS X ORDER BY 1, 2; ?
---
Koichi Suzuki

On 2014/02/12 17:18, Abbas Butt <abb...@en...> wrote:
> Yes, you can use WITH, e.g.
>
> with t as
> (
>     insert into rep_foo values (3,4), (5,6), (7,8)
>     returning b, a, b, b, b+a, b-a, ctid
> ) select * from t order by 1, 2;
>
> We have used this technique in xc_returning.sql
From: Abbas B. <abb...@en...> - 2014-02-12 08:18:13

On Wed, Feb 12, 2014 at 12:14 PM, Koichi Suzuki <koi...@gm...> wrote:
> I'm working on the XC 1.2 release, making the regression tests more
> robust.
>
> I found that the RETURNING clause occasionally returns rows in a
> different order. It is what we should expect, but it is not good
> because the regression tests fail.
>
> Does anybody know how to order rows from a RETURNING clause?

Yes, you can use WITH, e.g.

with t as
(
    insert into rep_foo values (3,4), (5,6), (7,8)
    returning b, a, b, b, b+a, b-a, ctid
) select * from t order by 1, 2;

We have used this technique in xc_returning.sql

--
Abbas
Architect
www.enterprisedb.com
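
The same WITH trick covers UPDATE and DELETE as well. Note that
PostgreSQL only accepts data-modifying statements inside WITH, not in a
FROM subquery, so the derived-table form suggested above has to be
spelled as a CTE. A rough, untested sketch reusing the rep_foo table from
the example above:

    with t as
    (
        -- data-modifying CTE; the outer SELECT imposes a stable row order
        update rep_foo set b = b + 10 where a < 5
        returning a, b
    ) select * from t order by 1, 2;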
From: Koichi S. <koi...@gm...> - 2014-02-12 07:14:23

I'm working on the XC 1.2 release, making the regression tests more
robust.

I found that the RETURNING clause occasionally returns rows in a
different order. It is what we should expect, but it is not good because
the regression tests fail.

Does anybody know how to order rows from a RETURNING clause?

Regards;
---
Koichi Suzuki
From: 鈴木 幸市 <ko...@in...> - 2014-02-12 06:09:17

On 2014/02/12 15:00, Ashutosh Bapat <ash...@en...> wrote:

> On Tue, Feb 11, 2014 at 8:03 PM, Abbas Butt <abb...@en...> wrote:
>
>> Approach B: (Suggested by many)
>> If the replicated table does not have a primary or unique not-null
>> key, then error out on a non-shippable update or delete; otherwise use
>> the patch sent by Mason, after some testing and refactoring.
>
> This would break backward compatibility. Also, a table which is fairly
> stable and doesn't have a primary key or unique key will need to be
> distributed even though it's a perfect candidate for being a replicated
> table. Also, we have to see if updating the primary or unique key would
> cause a problem.

Then we should keep using the same WHERE clause in the shipped statement
too. As pointed out, using ctid in a replicated table is very dangerous.

>> Approach C: (Suggested by Amit)
>> Always have some kind of a hidden (or system) column for replicated
>> tables. Its type can be serial, or an int column with default
>> nextval('sequence_type'), so that it will always be evaluated on the
>> coordinator; use this column as the primary key.
>
> This looks a better approach, but it also means that inserts into the
> replicated table have to be driven through the coordinator. This might
> not be that stringent a condition, given that replicated tables are
> expected to be fairly stable. Any replicated table being inserted into
> so often would run into the performance problem anyway.

I'm afraid it would take a long effort to fix all the consequences of
this change. How do you think about this? As I noted, approach C has good
points. The issue is how long it takes. With approach B, we can easily
change this handling to approach C later. I'd like to have your opinion
on this.

Regards;
---
Koichi Suzuki
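
A user-level approximation of what approach C would do automatically for
every replicated table; the sequence, table and column names here are
invented, and a real implementation would presumably keep the column
hidden rather than user-visible:

    CREATE SEQUENCE rep_rowid_seq;

    CREATE TABLE rep_items (
        xrowid  bigint PRIMARY KEY DEFAULT nextval('rep_rowid_seq'),
        payload text
    ) DISTRIBUTE BY REPLICATION;

    -- the default is evaluated on the coordinator, so every datanode
    -- stores the same xrowid for a logical row; a non-shippable
    -- UPDATE/DELETE can then use WHERE xrowid = $1 instead of
    -- WHERE ctid = $1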
From: Ashutosh B. <ash...@en...> - 2014-02-12 06:00:56

On Tue, Feb 11, 2014 at 8:03 PM, Abbas Butt <abb...@en...> wrote:
>
> The summary of the discussion so far:
>
> Approach A: (Suggested by Amit)
> In the scan plan, fetch ctid, node_id from all the datanodes. While
> scanning, the tuples need to be fetched in the same order, maybe using
> ORDER BY 1, 2, 3, ... Use UPDATE ... WHERE ctid = ?, but use a
> nodeid-based method to generate the ExecNodes at execute time (enhance
> the ExecNodes->en_expr evaluation so as to use the nodeid from the
> source plan, as against the distribution column that it currently uses
> for distributed tables). This method will not work as-is in case of
> non-shippable row triggers, because a trigger needs to be fired only
> once per row, and we are going to execute the UPDATE for all of the
> ctids of a given row corresponding to all of the datanodes; so somehow
> we should fire triggers only once. This method will also hit
> performance, because currently we fetch *all* columns and not just
> ctid, so it's better to first do the optimization of fetching only the
> required columns (there's one pending patch submitted on the mailing
> list which fixes this).
>
> Approach B: (Suggested by many)
> If the replicated table does not have a primary or unique not-null key,
> then error out on a non-shippable update or delete; otherwise use the
> patch sent by Mason, after some testing and refactoring.

This would break backward compatibility. Also, a table which is fairly
stable and doesn't have a primary key or unique key will need to be
distributed even though it's a perfect candidate for being a replicated
table. Also, we have to see if updating the primary or unique key would
cause a problem.

> Approach C: (Suggested by Amit)
> Always have some kind of a hidden (or system) column for replicated
> tables. Its type can be serial, or an int column with default
> nextval('sequence_type'), so that it will always be evaluated on the
> coordinator; use this column as the primary key.

This looks a better approach, but it also means that inserts into the
replicated table have to be driven through the coordinator. This might
not be that stringent a condition, given that replicated tables are
expected to be fairly stable. Any replicated table being inserted into so
often would run into the performance problem anyway.

> My vote is for approach B.

Whatever approach is taken, we will have to stick to it in future
versions of XC. We cannot keep changing user-visible functionality with
every version. Until 1.2 we didn't have the restriction that replicated
tables should have a primary key. Introducing that requirement now means
users have to modify their applications. If we change it again in the
next version, we will break compatibility again. So, considering the
long-term benefit, we should go with C.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
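
For comparison, the statements that approach A would generate would look
roughly as follows. This is purely illustrative and not taken from any
patch; xc_node_id is assumed to be the XC system column reporting which
datanode a row came from, and rep_foo is the sample table used elsewhere
in the thread:

    -- scan phase, run on every datanode: fetch the row identity along
    -- with the data columns, sorted the same way everywhere so the
    -- per-node ctids line up row by row
    SELECT ctid, xc_node_id, a, b FROM rep_foo WHERE a < 5 ORDER BY 3, 4;

    -- write phase: one parameterized UPDATE per fetched row, routed to
    -- the datanode its ctid was read from rather than by distribution
    -- column
    UPDATE rep_foo SET b = $2 WHERE ctid = $1;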
From: Koichi S. <koi...@gm...> - 2014-02-12 04:40:22
|
I'd vote for approach B. Some more inputs: I'm not sure if Approach A works fine. Approach C is what I suggested as well. It is just OID but local to a given table. This looks to work against both replicated and distributed table. On the other hand, influence of this is much bigger than approach B and could be much more than XC can handle as its own feature. Approach B looks reasonable. One concern is if it works with current regression. I hope it's not serious because most of the tests use distributed tables, not replicated ones. Regards; --- Koichi Suzuki 2014-02-11 23:33 GMT+09:00 Abbas Butt <abb...@en...>: > > The summary of the discussion so far: > > Approach A: (Suggested by Amit) > In the scan plan, fetch ctid, node_id from all the datanodes. > While scanning, the tuples need to be fetched in the same order, > may be using order by 1, 2, 3, ... > Use UPDATE where ctd = ? , but use nodeid-based method to > generate the ExecNodes at execute-time (enhance ExecNodes->en_expr > evaluation so as to use the nodeid from source plan, as against > the distribution column that it currently uses for distributed tables). > This method will not work as-is in case of non-shippable row triggers. > Because trigger needs to be fired only once per row, and we are going > to execute UPDATE for all of the ctids of a given row corresponding > to all of the datanodes. So somehow we should fire triggers only once. > This method will also hit performance, because currently we fetch *all* > columns and not just ctid, so it's better to first do that optimization > of fetching only reqd columns (there's one pending patch submitted in > the mailing list, which fixes this). > > Approach B: (Suggested by many) > If the replicated table does not have primary or unique not null key > then error out on a non-shippable update or delete otherwise use the > patch sent by Mason after some testing and refactoring. > > Approach C: (Suggested by Amit) > Always have some kind of a hidden (or system) column for replicated tables. > Its type can be serial type, or an int column with default > nextval('sequence_type') > so that it will always be executed on coordinator and use this colum as > primary key. > > My vote is for approach B. > > Comments? > > Best Regards > > > > On Fri, Nov 8, 2013 at 10:05 AM, Amit Khandekar > <ami...@en...> wrote: >> >> >> >> >> On 7 November 2013 19:50, Mason Sharp <ms...@tr...> wrote: >>> >>> >>> >>> >>> On Thu, Nov 7, 2013 at 12:45 AM, 鈴木 幸市 <ko...@in...> wrote: >>>> >>>> Yes, we need to focus on such general solution for replicated tuple >>>> identification. >>>> >>>> I'm afraid it may take much more research and implementation work. I >>>> believe the submitted patch handles tuple replica based on the primary key >>>> or other equivalents if available. If not, the code falls into the current >>>> case, local CTID. The latter could lead to inconsistent replica but it is >>>> far better than the current situation. >>>> >>>> For short-term solution, I think Mason's code looks reasonable if I >>>> understand the patch correctly. >>>> >>>> Mason, do you have any more thoughts/comments? >>> >>> >>> I don't think it is unreasonable if a primary/unique key is required to >>> handle this case. >>> >>> I did some testing, but it would be nice if someone else gave it a test >>> as well. I enabled statement logging to make sure it was doing the right >>> thing. >>> >>> I see someone mentioned a patch out there to only get needed columns. 
At >>> the moment extra columns may be used, but at least the data remains >>> consistent across the cluster, which is most important here. Unfortunately, >>> I do not have time at the moment to improve this further. If someone else >>> has time, that would be great, or done as a separate commit. >>> >>> Anyway, since this is a critical issue, I think it should get committed >>> to STABLE_1_1 once reviewed. >>> >>> >>> >>>> >>>> --- >>>> Koichi Suzuki >>>> >>>> On 2013/11/07, at 14:21, Amit Khandekar >>>> <ami...@en...> >>>> wrote: >>>> >>>> >>>> >>>> >>>> On 6 November 2013 18:31, Michael Paquier <mic...@gm...> >>>> wrote: >>>>> >>>>> On Wed, Nov 6, 2013 at 3:28 PM, Amit Khandekar >>>>> <ami...@en...> wrote: >>>>> > What exactly does the PostgreSQL FDW doc say about updates and >>>>> > primary key ? >>>>> By having a look here: >>>>> >>>>> http://www.postgresql.org/docs/9.3/static/fdw-callbacks.html#FDW-CALLBACKS-UPDATE >>>>> It is recommended to use a kind of row ID or the primary key columns. >>>>> In the case of XC row ID = CTID, and its uniqueness is not guaranteed >>>>> except if coupled with a node ID, which I think it has... Using a CTID >>>>> + node ID combination makes the analysis of tuple uniqueness >>>>> impossible for replicated tables either way, so a primary key would be >>>>> better IMO. >>>>> >>>>> > How does the postgres_fdw update a table that has no primary or >>>>> > unique key ? >>>>> It uses the CTID when scanning remote tuples for UPDATE/DELETE, thing >>>>> guarantying that tuples are unique in this case as the FDW deals with >>>>> a single server, here is for example the case of 2 nodes listening >>>>> ports 5432 and 5433. >>>>> $ psql -p 5433 -c "CREATE TABLE aa (a int, b int);" >>>>> CREATE TABLE >>>>> >>>>> On server with port 5432: >>>>> =# CREATE EXTENSION postgres_fdw; >>>>> CREATE EXTENSION >>>>> =# CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw >>>>> OPTIONS (host 'localhost', port '5432', dbname 'ioltas'); >>>>> CREATE SERVER >>>>> =# CREATE USER MAPPING FOR PUBLIC SERVER postgres_server OPTIONS >>>>> (password ''); >>>>> CREATE USER MAPPING >>>>> =# CREATE FOREIGN TABLE aa_foreign (a int, b int) SERVER >>>>> postgres_server OPTIONS (table_name 'aa'); >>>>> CREATE FOREIGN TABLE >>>>> =# explain verbose update aa_foreign set a = 1, b=2 where a = 1; >>>>> QUERY PLAN >>>>> >>>>> -------------------------------------------------------------------------------- >>>>> Update on public.aa_foreign (cost=100.00..144.40 rows=14 width=6) >>>>> Remote SQL: UPDATE public.aa SET a = $2, b = $3 WHERE ctid = $1 >>>>> -> Foreign Scan on public.aa_foreign (cost=100.00..144.40 rows=14 >>>>> width=6) >>>>> Output: 1, 2, ctid >>>>> Remote SQL: SELECT ctid FROM public.aa WHERE ((a = 1)) FOR >>>>> UPDATE >>>>> (5 rows) >>>>> And ctid is used for scanning... >>>>> >>>>> > In the patch, what do we do when the replicated table has no >>>>> > unique/primary >>>>> > key ? >>>>> I didn't look at the patch, but I think that replicated tables should >>>>> also need a primary key. Let's imagine something like that with >>>>> sessions S1 and S2 for a replication table, and 2 datanodes (1 session >>>>> runs in common on 1 Coordinator and each Datanode): >>>>> S1: INSERT VALUES foo in Dn1 >>>>> S2: INSERT VALUES foo2 in Dn1 >>>>> S2: INSERT VALUES foo2 in Dn2 >>>>> S1: INSERT VALUES foo in Dn2 >>>>> This will imply that those tuples have a different CTID, so a primary >>>>> key would be necessary as I think that this is possible. 
>>>> >>>> >>>> If the patch does not handle the case of replicated table without unique >>>> key, I think we should have a common solution which takes care of this case >>>> also. Or else, if this solution can be extended to handle no-unique-key >>>> case, then that would be good. But I think we would end up in having two >>>> different implementations, one for unique-key method, and another for the >>>> other method, which does not seem good. >>>> >>>> The method I had in mind was : >>>> In the scan plan, fetch ctid, node_id from all the datanodes. Use >>>> UPDATE where ctd = ? , but use nodeid-based method to generate the >>>> ExecNodes at execute-time (enhance ExecNodes->en_expr evaluation so as to >>>> use the nodeid from source plan, as against the distribution column that it >>>> currently uses for distributed tables) . >>> >>> >>> Would that work in all cases? What if 2 tuples on each node fulfill the >>> criteria and a sequence is value being assigned? Might the tuples be >>> processed in a different order on each node the data ends up being >>> inconsistent (tuple A gets the value 101, B gets the value 102 on node 1, >>> and B gets 101, A gets 102 on node 2). I am not sure it is worth trying to >>> handle the case, and just require a primary key or unique index. >> >> Yes, the tuples need to be fetched in the same order. May be using order >> by 1, 2, 3, ... . (I hope that if one column is found to be having unique >> values for a set of rows, sorting is not attempted again for the same set of >> rows using the remaining columns). >> >> Another approach can be considered where we would always have some kind of >> a hidden (or system) column for replicated tables Its type can be serial >> type, or an int column with default nextval('sequence_type') so that it will >> always be executed on coordinator. And use this one as against the primary >> key. Or may be create table with oids. But not sure if OIDs get >> incremented/wrapped around exactly the same way regardless of anything. >> Also, they are documented as having deprecated. >> >> These are just some thoughts for a long term solution. I myself don't have >> the bandwidth to work on XC at this moment, so won't be able to review the >> patch, so I don't want to fall into a situation where the patch is reworked >> and I won't review it after all. But these are just points to be thought of >> in case the would-be reviewer or the submitter feels like >> extending/modifying this patch for a long term solution taking care of >> tables without primary key. >> >>> >>>> >>>> But this method will not work as-is in case of non-shippable row >>>> triggers. Because trigger needs to be fired only once per row, and we are >>>> going to execute UPDATE for all of the ctids of a given row corresponding to >>>> all of the datanodes. So somehow we should fire triggers only once. This >>>> method will also hit performance, because currently we fetch *all* columns >>>> and not just ctid, so it's better to first do that optimization of fetching >>>> only reqd columns (there's one pending patch submitted in the mailing list, >>>> which fixes this). >>>> >>>> This is just one approach, there might be better approaches.. >>>> >>>> Overall, I think if we decide to get this issue solved (and I think we >>>> should really, this is a serious issue), sufficient resource time needs to >>>> be given to think over and have discussions before we finalize the approach. 
>>>> >>>> >>>>> -- >>>>> Michael >>>> >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> November Webinars for C, C++, Fortran Developers >>>> Accelerate application performance with scalable programming models. >>>> Explore >>>> techniques for threading, error checking, porting, and tuning. Get the >>>> most >>>> from the latest Intel processors and coprocessors. See abstracts and >>>> register >>>> >>>> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk_______________________________________________ >>>> Postgres-xc-developers mailing list >>>> Pos...@li... >>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>> >>>> >>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> November Webinars for C, C++, Fortran Developers >>>> Accelerate application performance with scalable programming models. >>>> Explore >>>> techniques for threading, error checking, porting, and tuning. Get the >>>> most >>>> from the latest Intel processors and coprocessors. See abstracts and >>>> register >>>> >>>> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk >>>> _______________________________________________ >>>> Postgres-xc-developers mailing list >>>> Pos...@li... >>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>>> >>> >>> >>> >>> -- >>> Mason Sharp >>> >>> TransLattice - http://www.translattice.com >>> Distributed and Clustered Database Solutions >>> >>> >> >> >> >> ------------------------------------------------------------------------------ >> November Webinars for C, C++, Fortran Developers >> Accelerate application performance with scalable programming models. >> Explore >> techniques for threading, error checking, porting, and tuning. Get the >> most >> from the latest Intel processors and coprocessors. See abstracts and >> register >> >> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk >> _______________________________________________ >> Postgres-xc-developers mailing list >> Pos...@li... >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >> > > > > -- > -- > Abbas > Architect > > Ph: 92.334.5100153 > Skype ID: gabbasb > www.enterprisedb.com > > Follow us on Twitter > @EnterpriseDB > > Visit EnterpriseDB for tutorials, webinars, whitepapers and more > > ------------------------------------------------------------------------------ > Android apps run on BlackBerry 10 > Introducing the new BlackBerry 10.2.1 Runtime for Android apps. > Now with support for Jelly Bean, Bluetooth, Mapview and more. > Get your Android app in front of a whole new audience. Start now. > http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > |
From: Michael P. <mic...@gm...> - 2014-02-12 01:24:25
|
On Tue, Feb 11, 2014 at 12:06 PM, Koichi Suzuki <koi...@gm...> wrote: > Yes, I made a mistake to begin this branch from wrong commit point. > Correction has been pushed with beta tag. > > Sorry for the inconvenience. No pb. Thanks! -- Michael |
From: Abbas B. <abb...@en...> - 2014-02-11 14:33:11
|
The summary of the discussion so far: Approach A: (Suggested by Amit) In the scan plan, fetch ctid, node_id from all the datanodes. While scanning, the tuples need to be fetched in the same order, may be using order by 1, 2, 3, ... Use UPDATE where ctd = ? , but use nodeid-based method to generate the ExecNodes at execute-time (enhance ExecNodes->en_expr evaluation so as to use the nodeid from source plan, as against the distribution column that it currently uses for distributed tables). This method will not work as-is in case of non-shippable row triggers. Because trigger needs to be fired only once per row, and we are going to execute UPDATE for all of the ctids of a given row corresponding to all of the datanodes. So somehow we should fire triggers only once. This method will also hit performance, because currently we fetch *all* columns and not just ctid, so it's better to first do that optimization of fetching only reqd columns (there's one pending patch submitted in the mailing list, which fixes this). Approach B: (Suggested by many) If the replicated table does not have primary or unique not null key then error out on a non-shippable update or delete otherwise use the patch sent by Mason after some testing and refactoring. Approach C: (Suggested by Amit) Always have some kind of a hidden (or system) column for replicated tables. Its type can be serial type, or an int column with default nextval('sequence_type') so that it will always be executed on coordinator and use this colum as primary key. My vote is for approach B. Comments? Best Regards On Fri, Nov 8, 2013 at 10:05 AM, Amit Khandekar < ami...@en...> wrote: > > > > On 7 November 2013 19:50, Mason Sharp <ms...@tr...> wrote: > >> >> >> >> On Thu, Nov 7, 2013 at 12:45 AM, 鈴木 幸市 <ko...@in...> wrote: >> >>> Yes, we need to focus on such general solution for replicated tuple >>> identification. >>> >>> I'm afraid it may take much more research and implementation work. I >>> believe the submitted patch handles tuple replica based on the primary key >>> or other equivalents if available. If not, the code falls into the >>> current case, local CTID. The latter could lead to inconsistent replica >>> but it is far better than the current situation. >>> >>> For short-term solution, I think Mason's code looks reasonable if I >>> understand the patch correctly. >>> >>> Mason, do you have any more thoughts/comments? >>> >> >> I don't think it is unreasonable if a primary/unique key is required to >> handle this case. >> >> I did some testing, but it would be nice if someone else gave it a test >> as well. I enabled statement logging to make sure it was doing the right >> thing. >> >> I see someone mentioned a patch out there to only get needed columns. At >> the moment extra columns may be used, but at least the data remains >> consistent across the cluster, which is most important here. Unfortunately, >> I do not have time at the moment to improve this further. If someone else >> has time, that would be great, or done as a separate commit. >> >> Anyway, since this is a critical issue, I think it should get committed >> to STABLE_1_1 once reviewed. >> >> >> >> >>> --- >>> Koichi Suzuki >>> >>> On 2013/11/07, at 14:21, Amit Khandekar < >>> ami...@en...> >>> wrote: >>> >>> >>> >>> >>> On 6 November 2013 18:31, Michael Paquier <mic...@gm...>wrote: >>> >>>> On Wed, Nov 6, 2013 at 3:28 PM, Amit Khandekar >>>> <ami...@en...> wrote: >>>> > What exactly does the PostgreSQL FDW doc say about updates and >>>> primary key ? 
>>>> By having a look here: >>>> >>>> http://www.postgresql.org/docs/9.3/static/fdw-callbacks.html#FDW-CALLBACKS-UPDATE >>>> It is recommended to use a kind of row ID or the primary key columns. >>>> In the case of XC row ID = CTID, and its uniqueness is not guaranteed >>>> except if coupled with a node ID, which I think it has... Using a CTID >>>> + node ID combination makes the analysis of tuple uniqueness >>>> impossible for replicated tables either way, so a primary key would be >>>> better IMO. >>>> >>>> > How does the postgres_fdw update a table that has no primary or >>>> unique key ? >>>> It uses the CTID when scanning remote tuples for UPDATE/DELETE, thing >>>> guarantying that tuples are unique in this case as the FDW deals with >>>> a single server, here is for example the case of 2 nodes listening >>>> ports 5432 and 5433. >>>> $ psql -p 5433 -c "CREATE TABLE aa (a int, b int);" >>>> CREATE TABLE >>>> >>>> On server with port 5432: >>>> =# CREATE EXTENSION postgres_fdw; >>>> CREATE EXTENSION >>>> =# CREATE SERVER postgres_server FOREIGN DATA WRAPPER postgres_fdw >>>> OPTIONS (host 'localhost', port '5432', dbname 'ioltas'); >>>> CREATE SERVER >>>> =# CREATE USER MAPPING FOR PUBLIC SERVER postgres_server OPTIONS >>>> (password ''); >>>> CREATE USER MAPPING >>>> =# CREATE FOREIGN TABLE aa_foreign (a int, b int) SERVER >>>> postgres_server OPTIONS (table_name 'aa'); >>>> CREATE FOREIGN TABLE >>>> =# explain verbose update aa_foreign set a = 1, b=2 where a = 1; >>>> QUERY PLAN >>>> >>>> -------------------------------------------------------------------------------- >>>> Update on public.aa_foreign (cost=100.00..144.40 rows=14 width=6) >>>> Remote SQL: UPDATE public.aa SET a = $2, b = $3 WHERE ctid = $1 >>>> -> Foreign Scan on public.aa_foreign (cost=100.00..144.40 rows=14 >>>> width=6) >>>> Output: 1, 2, ctid >>>> Remote SQL: SELECT ctid FROM public.aa WHERE ((a = 1)) FOR >>>> UPDATE >>>> (5 rows) >>>> And ctid is used for scanning... >>>> >>>> > In the patch, what do we do when the replicated table has no >>>> unique/primary >>>> > key ? >>>> I didn't look at the patch, but I think that replicated tables should >>>> also need a primary key. Let's imagine something like that with >>>> sessions S1 and S2 for a replication table, and 2 datanodes (1 session >>>> runs in common on 1 Coordinator and each Datanode): >>>> S1: INSERT VALUES foo in Dn1 >>>> S2: INSERT VALUES foo2 in Dn1 >>>> S2: INSERT VALUES foo2 in Dn2 >>>> S1: INSERT VALUES foo in Dn2 >>>> This will imply that those tuples have a different CTID, so a primary >>>> key would be necessary as I think that this is possible. >>>> >>> >>> If the patch does not handle the case of replicated table without >>> unique key, I think we should have a common solution which takes care of >>> this case also. Or else, if this solution can be extended to handle >>> no-unique-key case, then that would be good. But I think we would end up in >>> having two different implementations, one for unique-key method, and >>> another for the other method, which does not seem good. >>> >>> The method I had in mind was : >>> In the scan plan, fetch ctid, node_id from all the datanodes. Use >>> UPDATE where ctd = ? , but use nodeid-based method to generate the >>> ExecNodes at execute-time (enhance ExecNodes->en_expr evaluation so as to >>> use the nodeid from source plan, as against the distribution column that it >>> currently uses for distributed tables) . >>> >>> >> Would that work in all cases? 
What if 2 tuples on each node fulfill the >> criteria and a sequence is value being assigned? Might the tuples be >> processed in a different order on each node the data ends up being >> inconsistent (tuple A gets the value 101, B gets the value 102 on node 1, >> and B gets 101, A gets 102 on node 2). I am not sure it is worth trying to >> handle the case, and just require a primary key or unique index. >> > Yes, the tuples need to be fetched in the same order. May be using order > by 1, 2, 3, ... . (I hope that if one column is found to be having unique > values for a set of rows, sorting is not attempted again for the same set > of rows using the remaining columns). > > Another approach can be considered where we would always have some kind of > a hidden (or system) column for replicated tables Its type can be serial > type, or an int column with default nextval('sequence_type') so that it > will always be executed on coordinator. And use this one as against the > primary key. Or may be create table with oids. But not sure if OIDs get > incremented/wrapped around exactly the same way regardless of anything. > Also, they are documented as having deprecated. > > These are just some thoughts for a long term solution. I myself don't have > the bandwidth to work on XC at this moment, so won't be able to review the > patch, so I don't want to fall into a situation where the patch is reworked > and I won't review it after all. But these are just points to be thought of > in case the would-be reviewer or the submitter feels like > extending/modifying this patch for a long term solution taking care of > tables without primary key. > > >> >>> But this method will not work as-is in case of non-shippable row >>> triggers. Because trigger needs to be fired only once per row, and we are >>> going to execute UPDATE for all of the ctids of a given row corresponding >>> to all of the datanodes. So somehow we should fire triggers only once. This >>> method will also hit performance, because currently we fetch *all* columns >>> and not just ctid, so it's better to first do that optimization of fetching >>> only reqd columns (there's one pending patch submitted in the mailing list, >>> which fixes this). >>> >>> This is just one approach, there might be better approaches.. >>> >>> Overall, I think if we decide to get this issue solved (and I think we >>> should really, this is a serious issue), sufficient resource time needs to >>> be given to think over and have discussions before we finalize the approach. >>> >>> >>> -- >>>> Michael >>>> >>> >>> >>> ------------------------------------------------------------------------------ >>> November Webinars for C, C++, Fortran Developers >>> Accelerate application performance with scalable programming models. >>> Explore >>> techniques for threading, error checking, porting, and tuning. Get the >>> most >>> from the latest Intel processors and coprocessors. See abstracts and >>> register >>> >>> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk_______________________________________________ >>> Postgres-xc-developers mailing list >>> Pos...@li... >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>> >>> >>> >>> >>> ------------------------------------------------------------------------------ >>> November Webinars for C, C++, Fortran Developers >>> Accelerate application performance with scalable programming models. >>> Explore >>> techniques for threading, error checking, porting, and tuning. 
Get the >>> most >>> from the latest Intel processors and coprocessors. See abstracts and >>> register >>> >>> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk >>> _______________________________________________ >>> Postgres-xc-developers mailing list >>> Pos...@li... >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>> >>> >> >> >> -- >> Mason Sharp >> >> TransLattice - http://www.translattice.com >> Distributed and Clustered Database Solutions >> >> >> > > > ------------------------------------------------------------------------------ > November Webinars for C, C++, Fortran Developers > Accelerate application performance with scalable programming models. > Explore > techniques for threading, error checking, porting, and tuning. Get the most > from the latest Intel processors and coprocessors. See abstracts and > register > http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > -- -- *Abbas* Architect Ph: 92.334.5100153 Skype ID: gabbasb www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> *Follow us on Twitter* @EnterpriseDB Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> |
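For reference, Approach C approximated with an ordinary visible column looks like the following; the real proposal would make it a hidden system column, and the serial default is what forces the value to be assigned once on the coordinator rather than independently on each datanode.

    -- Approach C, sketched with a visible surrogate column.
    CREATE TABLE ref_country (
        xc_row_id  bigserial,   -- assigned on the coordinator, identical on all replicas
        name       text,
        population bigint
    ) DISTRIBUTE BY REPLICATION;

    INSERT INTO ref_country (name, population) VALUES ('Japan', 127000000);

    -- A non-shippable UPDATE can then be driven row by row as
    --     UPDATE ref_country SET population = $2 WHERE xc_row_id = $1
    -- which addresses the same logical row on every datanode, with no
    -- primary key required from the user.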
From: Koichi S. <koi...@gm...> - 2014-02-11 04:45:42
|
Postgres-XC 1.2 beta is now out. Thank you very much for waiting so long. This release includes many major new features from PostgreSQL 9.3. This release also moves the release schedule from June - August to January - February to make it easier to catch up with PostgreSQL development. Source tarball is available at https://sourceforge.net/projects/postgres-xc/files/Version_1.2/ Document page is at http://postgres-xc.sourceforge.net/docs/1_2_beta/ Release note is at http://postgres-xc.sourceforge.net/docs/1_2_beta/release-xc-1-2.html This release enables all the major new PostgreSQL features, as follows: 1) Materialized views, by Ashutosh Bapat 2) Event triggers, by Koichi Suzuki 3) Automatically updatable views, by Abbas Butt 4) LATERAL, by Ashutosh Bapat Due to many internal API changes in the PostgreSQL planner, the corresponding Postgres-XC planner needed many changes too. This effort includes the following by Abbas Butt: 1) Fix false reports of partition column updates. 2) Fix an alias problem in the returning list of INSERT statements. 3) Fix a problem in RETURNING support when not all items of the list are shippable. 4) Change the way UPDATEs are handled, by updating all the columns of the result relation instead of only the columns that were in the target list. 5) Fix the problems arising because the query deparsing system now generates unique rtable names by appending digits. Initdb message improvements were done by Masataka Saito. The Postgres-XC development group appreciates all the input and help from PGXC'ers. Enjoy. ---------- Koichi Suzuki --- Koichi Suzuki |
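For anyone who wants a quick smoke test of the headline features against the beta, the statements below exercise them using stock PostgreSQL 9.3 syntax; the table, view and trigger names are made up for the example.

    -- 1) Materialized view
    CREATE TABLE orders (id serial, customer text, amount numeric);
    INSERT INTO orders (customer, amount) VALUES ('acme', 10), ('acme', 20), ('initech', 5);
    CREATE MATERIALIZED VIEW order_totals AS
        SELECT customer, sum(amount) AS total FROM orders GROUP BY customer;
    REFRESH MATERIALIZED VIEW order_totals;

    -- 2) Event trigger
    CREATE FUNCTION log_ddl() RETURNS event_trigger LANGUAGE plpgsql AS
    $$ BEGIN RAISE NOTICE 'DDL command: %', tg_tag; END $$;
    CREATE EVENT TRIGGER log_ddl_trig ON ddl_command_start EXECUTE PROCEDURE log_ddl();

    -- 3) Automatically updatable view
    CREATE VIEW acme_orders AS SELECT id, amount FROM orders WHERE customer = 'acme';
    UPDATE acme_orders SET amount = 15 WHERE id = 1;

    -- 4) LATERAL
    SELECT c.customer, top.amount
    FROM (SELECT DISTINCT customer FROM orders) c,
         LATERAL (SELECT amount FROM orders o
                  WHERE o.customer = c.customer
                  ORDER BY amount DESC LIMIT 2) top;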
From: Koichi S. <koi...@gm...> - 2014-02-11 03:06:12
|
Yes, I made a mistake to begin this branch from wrong commit point. Correction has been pushed with beta tag. Sorry for the inconvenience. --- Koichi Suzuki 2014-02-11 11:37 GMT+09:00 Michael Paquier <mic...@gm...>: > On Tue, Feb 11, 2014 at 11:35 AM, Koichi Suzuki > <koi...@us...> wrote: >> Project "Postgres-XC". >> >> The branch, REL1_2_STABLE has been deleted >> was 544dee10e2d9538be67677b8abbe728edd76b2ff >> >> ----------------------------------------------------------------------- >> 544dee10e2d9538be67677b8abbe728edd76b2ff This is second REL1_2_STABLE commit which is successful in the binary build. I commit this to make smaller step visible to public. >> ----------------------------------------------------------------------- > Isn't it a problem to remove this branch? > -- > Michael > > ------------------------------------------------------------------------------ > Android apps run on BlackBerry 10 > Introducing the new BlackBerry 10.2.1 Runtime for Android apps. > Now with support for Jelly Bean, Bluetooth, Mapview and more. > Get your Android app in front of a whole new audience. Start now. > http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers |
From: Michael P. <mic...@gm...> - 2014-02-11 02:37:53
|
On Tue, Feb 11, 2014 at 11:35 AM, Koichi Suzuki <koi...@us...> wrote: > Project "Postgres-XC". > > The branch, REL1_2_STABLE has been deleted > was 544dee10e2d9538be67677b8abbe728edd76b2ff > > ----------------------------------------------------------------------- > 544dee10e2d9538be67677b8abbe728edd76b2ff This is second REL1_2_STABLE commit which is successful in the binary build. I commit this to make smaller step visible to public. > ----------------------------------------------------------------------- Isn't it a problem to remove this branch? -- Michael |
From: Koichi S. <koi...@gm...> - 2014-02-08 06:43:06
|
I agree that XC should have its own monitoring and failover feature. Now pgxc_ctl provides monitoring and failover feature of GTM, coordinator and datanode. Basic missing features are, among others, as follows: 1) Automatic failover, 2) Agent to allow more than one pgxc_ctl instance to share the cluster status. We can extend pgxc_ctl with these features as a part of release 1.3. Pgxc_ctl may need additional interface to accept reports from external monitoring and failover tools such as Pacemaker/Corosync to detect hardware and network failure as well as other general failure and their failover. In such a way, I'd like to make XC HA-ready and simpler to integrate with other failover tools. Thoughts? --- Koichi Suzuki 2014-02-08 11:13 GMT+09:00 ZhangJulian <jul...@ou...>: > Hi Mason, > > Yes, if I can help I am honored to join such a development work. > > I was trying to build a HA environment for PG, and then PGXC, but still > haven't achieve it by far. My OS is RHEL 6.4, do you have some documents or > web links sharing with me? > > I have another idea which may not be a good one, but still would like to > post here and have your advice. For the HA component, it looks like donot > need to be installed on all the machines. We can develop a seperated module > just like GTM, I call it XCMon temporarily which is suggested to be > installed on the server of GTM and GTM slave. > Machine1 (GTM, XCMon) ---> Machine2(GTM slave, XCMon slave) > XCMon will monitor all the components by sending SELECT 1+1 to all > Coordinators and all data nodes, and trying to get snapshot from GTM and all > GTM proxies periodically. XCMon slave will detect XCMon master and prepare > to take the control once XCMon master failed. Further, we can even combine > XCMon functions into GTM and maker the deployment more eaiser. > > Thanks > Julian > > > ________________________________ > Date: Fri, 7 Feb 2014 10:49:23 -0500 > Subject: Re: [Postgres-xc-developers] What is best choice to set up a PGXC > HA environment? > From: ms...@tr... > To: jul...@ou... > CC: his...@la...; koi...@gm...; > pos...@li... > > > > > > On Fri, Feb 7, 2014 at 2:39 AM, ZhangJulian <jul...@ou...> > wrote: > > Hi Hisada, > > It is great to know the resource agent will be released within a few > monthes. Thank your for your work and I am glad to be one of your first > batch of users. > > About the feature of PGXC internal HA, I just think it is a attractive > feature from a user's perspective, you had mentioned it has some advantanges > such as no need to install other external tools. Just now, I read the Admin > Guide of Greenplum, it seems that GP has the internal HA support by a > process named ftsprobe. > > I was thinking each Coordinator will fork one more process at the starting > time along with the autovacuum/bgwriter processes, and the new process will > do all the work as the Pacemaker does. > > When the GTM is down, each Coordinator will recognize it when it fetches the > snapshots from GTM, then it will talk with other Coordinators and negociate > to restart the GTM master or promote the GTM slave to master. But I am not > sure how to send the RESTART GTM or PROMOTE GTM SLAVE command from a > Coordinator process. Maybe the PROMOTE command can be replaced by a API > invocation to the GTM component. 
> > When one coordinator is down, when the other coordinators execute a DDL (or > each coordinator could send SELECT 1+1 to other coordinators periodically to > verify it they are all alive), they will find the failed coordinator, then > the alived coordinators can decide to remove the failed coordinator from the > pgxc_nodes. > > When one datanode is down, the coordinator will know it when it sends the > REMOTE QUEYR to data node, or it can also send the SELECT 1+1 to each > datanodes periodically. Then all the coordinator will negociate to promote > the DataNode slave to master. > > > But maybe it is not a better solution if the Pacemaker is easier to use? for > example, we can develop a PGXC-Pacemaker Glue layer which can fetch all the > cluster configuration from PGXC and then configure Pacemaker > automatically.... > > > > These are all good thoughts and somewhat along the lines of what I have been > thinking as well. > > We have been using Corosync/Pacemaker for quite some time. It works, but in > hindsight I wish we would have put effort into an internal solution. While > the current solution works, we have spent a lot of time tweaking and > maintaining. In the past we have had seen aggressive failovers > unnecessarily, for example. Also, it takes some resources and it does not > like to manage too many components at once. In our case, we like to have two > replicas of each data node on the other servers that have masters. Making > node membership more flexible and getting components to agree when to > failover is likely better long term solution. There would be more upfront > effort, but easier installation and less management and maintenance long > term. Let me know if you have the time to collaborate on such a development > effort if we undertake this at some point. > > Our other product, TED (unrelated to Postgres-XC), manages failover > internally and works well, including automatic recovery of downed nodes. We > can perhaps draw on lessons there, too. > > -- > Mason Sharp > > TransLattice - http://www.translattice.com > Distributed and Clustered Database Solutions > > |
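For context, the manual flow that such an agent would automate is already available from pgxc_ctl, roughly as below. The command names follow the pgxc_ctl documentation, but the node name is an assumption and the exact invocation should be checked against your build.

    $ pgxc_ctl monitor all                   # health of GTM, coordinators and datanodes
    $ pgxc_ctl failover datanode datanode1   # promote datanode1's configured slave
    $ pgxc_ctl failover gtm                  # promote the GTM slave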
From: ZhangJulian <jul...@ou...> - 2014-02-08 02:14:01
|
Hi Mason, Yes, if I can help I am honored to join such a development work. I was trying to build a HA environment for PG, and then PGXC, but still haven't achieve it by far. My OS is RHEL 6.4, do you have some documents or web links sharing with me? I have another idea which may not be a good one, but still would like to post here and have your advice. For the HA component, it looks like donot need to be installed on all the machines. We can develop a seperated module just like GTM, I call it XCMon temporarily which is suggested to be installed on the server of GTM and GTM slave. Machine1 (GTM, XCMon) ---> Machine2(GTM slave, XCMon slave) XCMon will monitor all the components by sending SELECT 1+1 to all Coordinators and all data nodes, and trying to get snapshot from GTM and all GTM proxies periodically. XCMon slave will detect XCMon master and prepare to take the control once XCMon master failed. Further, we can even combine XCMon functions into GTM and maker the deployment more eaiser. Thanks Julian Date: Fri, 7 Feb 2014 10:49:23 -0500 Subject: Re: [Postgres-xc-developers] What is best choice to set up a PGXC HA environment? From: ms...@tr... To: jul...@ou... CC: his...@la...; koi...@gm...; pos...@li... On Fri, Feb 7, 2014 at 2:39 AM, ZhangJulian <jul...@ou...> wrote: Hi Hisada, It is great to know the resource agent will be released within a few monthes. Thank your for your work and I am glad to be one of your first batch of users. About the feature of PGXC internal HA, I just think it is a attractive feature from a user's perspective, you had mentioned it has some advantanges such as no need to install other external tools. Just now, I read the Admin Guide of Greenplum, it seems that GP has the internal HA support by a process named ftsprobe. I was thinking each Coordinator will fork one more process at the starting time along with the autovacuum/bgwriter processes, and the new process will do all the work as the Pacemaker does. When the GTM is down, each Coordinator will recognize it when it fetches the snapshots from GTM, then it will talk with other Coordinators and negociate to restart the GTM master or promote the GTM slave to master. But I am not sure how to send the RESTART GTM or PROMOTE GTM SLAVE command from a Coordinator process. Maybe the PROMOTE command can be replaced by a API invocation to the GTM component. When one coordinator is down, when the other coordinators execute a DDL (or each coordinator could send SELECT 1+1 to other coordinators periodically to verify it they are all alive), they will find the failed coordinator, then the alived coordinators can decide to remove the failed coordinator from the pgxc_nodes. When one datanode is down, the coordinator will know it when it sends the REMOTE QUEYR to data node, or it can also send the SELECT 1+1 to each datanodes periodically. Then all the coordinator will negociate to promote the DataNode slave to master. But maybe it is not a better solution if the Pacemaker is easier to use? for example, we can develop a PGXC-Pacemaker Glue layer which can fetch all the cluster configuration from PGXC and then configure Pacemaker automatically.... These are all good thoughts and somewhat along the lines of what I have been thinking as well. We have been using Corosync/Pacemaker for quite some time. It works, but in hindsight I wish we would have put effort into an internal solution. While the current solution works, we have spent a lot of time tweaking and maintaining. 
In the past we have had seen aggressive failovers unnecessarily, for example. Also, it takes some resources and it does not like to manage too many components at once. In our case, we like to have two replicas of each data node on the other servers that have masters. Making node membership more flexible and getting components to agree when to failover is likely better long term solution. There would be more upfront effort, but easier installation and less management and maintenance long term. Let me know if you have the time to collaborate on such a development effort if we undertake this at some point. Our other product, TED (unrelated to Postgres-XC), manages failover internally and works well, including automatic recovery of downed nodes. We can perhaps draw on lessons there, too. -- Mason Sharp TransLattice - http://www.translattice.com Distributed and Clustered Database Solutions |
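The per-tick probe Julian describes is essentially the following; host names, ports, and the choice of psql as the probe client are assumptions for the sketch, and GTM itself has no SQL interface, so its check would have to go through the GTM client library or gtm_ctl instead.

    $ psql -h coord1 -p 5432 -tAc 'SELECT 1+1' postgres      # coordinator alive?
    $ psql -h datanode1 -p 15432 -tAc 'SELECT 1+1' postgres  # datanode alive?
    $ # a non-zero exit status (or no "2" printed) marks the component as suspect
    $ # and starts the restart/promote negotiation among the XCMon instances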
From: Mason S. <ms...@tr...> - 2014-02-07 15:55:13
|
On Fri, Feb 7, 2014 at 2:39 AM, ZhangJulian <jul...@ou...>wrote: > Hi Hisada, > > It is great to know the resource agent will be released within a few > monthes. Thank your for your work and I am glad to be one of your first > batch of users. > > About the feature of PGXC internal HA, I just think it is a attractive > feature from a user's perspective, you had mentioned it has some > advantanges such as no need to install other external tools. Just now, I > read the Admin Guide of Greenplum, it seems that GP has the internal HA > support by a process named *ftsprobe*. > > I was thinking each Coordinator will fork one more process at the starting > time along with the autovacuum/bgwriter processes, and the new process will > do all the work as the Pacemaker does. > > When the GTM is down, each Coordinator will recognize it when it fetches > the snapshots from GTM, then it will talk with other Coordinators and > negociate to restart the GTM master or promote the GTM slave to master. But > I am not sure how to send the RESTART GTM or PROMOTE GTM SLAVE command from > a Coordinator process. Maybe the PROMOTE command can be replaced by a API > invocation to the GTM component. > > When one coordinator is down, when the other coordinators execute a DDL > (or each coordinator could send SELECT 1+1 to other coordinators > periodically to verify it they are all alive), they will find the failed > coordinator, then the alived coordinators can decide to remove the failed > coordinator from the pgxc_nodes. > > When one datanode is down, the coordinator will know it when it sends the > REMOTE QUEYR to data node, or it can also send the SELECT 1+1 to each > datanodes periodically. Then all the coordinator will negociate to promote > the DataNode slave to master. > > > But maybe it is not a better solution if the Pacemaker is easier to use? > for example, we can develop a PGXC-Pacemaker Glue layer which can fetch all > the cluster configuration from PGXC and then configure Pacemaker > automatically.... > These are all good thoughts and somewhat along the lines of what I have been thinking as well. We have been using Corosync/Pacemaker for quite some time. It works, but in hindsight I wish we would have put effort into an internal solution. While the current solution works, we have spent a lot of time tweaking and maintaining. In the past we have had seen aggressive failovers unnecessarily, for example. Also, it takes some resources and it does not like to manage too many components at once. In our case, we like to have two replicas of each data node on the other servers that have masters. Making node membership more flexible and getting components to agree when to failover is likely better long term solution. There would be more upfront effort, but easier installation and less management and maintenance long term. Let me know if you have the time to collaborate on such a development effort if we undertake this at some point. Our other product, TED (unrelated to Postgres-XC), manages failover internally and works well, including automatic recovery of downed nodes. We can perhaps draw on lessons there, too. -- Mason Sharp TransLattice - http://www.translattice.com Distributed and Clustered Database Solutions |
From: ZhangJulian <jul...@ou...> - 2014-02-07 07:39:17
|
Hi Hisada, It is great to know the resource agent will be released within a few monthes. Thank your for your work and I am glad to be one of your first batch of users. About the feature of PGXC internal HA, I just think it is a attractive feature from a user's perspective, you had mentioned it has some advantanges such as no need to install other external tools. Just now, I read the Admin Guide of Greenplum, it seems that GP has the internal HA support by a process named ftsprobe. I was thinking each Coordinator will fork one more process at the starting time along with the autovacuum/bgwriter processes, and the new process will do all the work as the Pacemaker does. When the GTM is down, each Coordinator will recognize it when it fetches the snapshots from GTM, then it will talk with other Coordinators and negociate to restart the GTM master or promote the GTM slave to master. But I am not sure how to send the RESTART GTM or PROMOTE GTM SLAVE command from a Coordinator process. Maybe the PROMOTE command can be replaced by a API invocation to the GTM component. When one coordinator is down, when the other coordinators execute a DDL (or each coordinator could send SELECT 1+1 to other coordinators periodically to verify it they are all alive), they will find the failed coordinator, then the alived coordinators can decide to remove the failed coordinator from the pgxc_nodes. When one datanode is down, the coordinator will know it when it sends the REMOTE QUEYR to data node, or it can also send the SELECT 1+1 to each datanodes periodically. Then all the coordinator will negociate to promote the DataNode slave to master. But maybe it is not a better solution if the Pacemaker is easier to use? for example, we can develop a PGXC-Pacemaker Glue layer which can fetch all the cluster configuration from PGXC and then configure Pacemaker automatically.... Thanks Julian > From: his...@la... > To: koi...@gm...; jul...@ou... > CC: pos...@li... > Subject: RE: [Postgres-xc-developers] What is best choice to set up a PGXC HA environment? > Date: Tue, 4 Feb 2014 11:23:05 +0900 > > Hi, Julian, > > > > I am thinking 3 choices as below: > > > > > > 1. Pacemaker and Corosync. > > > I have little experience on Linux HA, so one week passed, I even can > > > not install them successfully, including > Pacemaker/Corosync/crmsh/resouce > > agent. > > > There are some website mentioned Pacemaker/corosync can help PGXC to > > > build a HA infrastructure, but I can not find a comprehensive guide to > > > do it. There are much more commponents in PGXC than PG, I think I > > > should learn how to build it based on PG first. > > > > I know separate XC project to provide Pacemaker/Corosync resource > > agent for XC. Please let me push them to provide info. > > We are planning to release resource agent for pacemaker/heartbeat within a > few months. > Basic idea is to manage pairs of Master-Slave for Datanode, Coordinator and > GTM at each server by pacemaker. > Hopefully this could be one of the solution to HA feature at XC. > > > > 2. Zookeeper > > > It seems that Zookeeper has the ability to build a HA solution for > > > PGXC, which have the similar function with Pacemaker, but I have to > > > develop the heartbeat function for Zookeeper to > > > start/stop/monitor/failover PGXC. And I do not know if my understand is > > right. > > > > Sorry, I'm not familiar with Zookeeper. > > > > > 3. PGXC support HA internally. 
> > > Because the table of pgxc_nodes in coordinator already have some > > > information about the cluster, it can be enhanced to save the > > > Master/Slave relations, it is replicated between all coordinators, > > > then it can used as a CRM(Cluster Resource Management, as Pacemaker) > > compoment. > > > And the coordinator will connect to datanode/gtm/other coordinator in > > > its regular work, so the heartbeat function exists natually. Even when > > > the database is in the spare time, the coordinator can send a simple > > > query as "select 1+1" to datanodes as the heartbeat ticks. > > > What need to do is that, the coordinator will start a new process when > > > starting, the new process will act as a heartbeat /resouce_agent to > > > monitor the cluster status, and restart/failover once one commponent > fails. > > How about monitoring coordinator and GTM? Do you have any idea? > > > > As my initial understanding, Choice 3 is better than Choice 2 which is > > > better than Choice 1. But for the development effort, the order is > > > reversed, Choice 1 is easy achieved based on current existing codes. > > What do we mean by better? > My requirement is as follows : > > Availability : > - Shorten Failure detection > - Shorten downtime at Failover / Switchover > > Node management usability : > - We can manage Slave node as well as Master node into XC Cluster > - Enables node monitoring and management at psql > - No need to install / configure external tools : pacemaker / colosync > > What else? > > Regards, > > Hisada > > > > I am very appreciated that you can share your advice with me. > > > > Yes, I do agree with this solution. I'd like to have this as a part > > of XC release 1.3. > > > > > PGXC internal HA should be integrated with other monitoring feature such > as > > server hardware, power and network. > > > > It will be exciting to begin this discussion in this mailing list. > > > > Regards; > > --- > > Koichi Suzuki > > > > > > > > Thanks > > > Julian > > > > > > > > > > > > ---------------------------------------------------------------------- > > > -------- CenturyLink Cloud: The Leader in Enterprise Cloud Services. > > > Learn Why More Businesses Are Choosing CenturyLink Cloud For Critical > > > Workloads, Development Environments & Everything In Between. > > > Get a Quote or Start a Free Trial Today. > > > http://pubads.g.doubleclick.net/gampad/clk?id=119420431&iu=/4140/ostg. > > > clktrk _______________________________________________ > > > Postgres-xc-developers mailing list > > > Pos...@li... > > > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > > > > > > ------------------------------------------------------------------------- > > ----- > > WatchGuard Dimension instantly turns raw network data into actionable > security > > intelligence. It gives you real-time visual feedback on key security > issues > > and trends. Skip the complicated setup - simply import a virtual > appliance > > and go from zero to informed in seconds. > > http://pubads.g.doubleclick.net/gampad/clk?id=123612991&iu=/4140/ostg.clk > > trk > > _______________________________________________ > > Postgres-xc-developers mailing list > > Pos...@li... > > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > |
From: David E. W. <da...@ju...> - 2014-02-06 17:12:00
|
On Feb 5, 2014, at 5:33 PM, Tatsuo Ishii <is...@po...> wrote: > What about this? > > SELECT count(*) from > (SELECT has_function_privilege('%s', 'pgxc_version(text)', 'execute') > WHERE EXISTS(SELECT * FROM pg_catalog.pg_proc AS p WHERE p.proname = 'pgxc_version')) AS s" I don’t much care about permission, so I would just do SELECT TRUE FROM pg_catalog.pg_proc p JOIN pg_catalog.pg_namespace n ON p.pronamespace = n.oid WHERE nspname = 'pg_catalog' AND proname = 'pgxc_version'; Best, David |
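One wrinkle with the bare SELECT TRUE form is that it returns zero rows on stock PostgreSQL, so the client has to treat "no row" as "not XC". Wrapping it in EXISTS always returns exactly one boolean row, which can be easier to consume; this is only a minor variation on David's query, not a different check.

    SELECT EXISTS (
        SELECT 1
        FROM pg_catalog.pg_proc p
        JOIN pg_catalog.pg_namespace n ON p.pronamespace = n.oid
        WHERE n.nspname = 'pg_catalog'
          AND p.proname = 'pgxc_version'
    ) AS is_postgres_xc;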
From: Ahsan H. <ahs...@en...> - 2014-02-06 07:18:00
|
Abbas, Can you test this with 1.1? Perhaps we fixed this issue in the master which means that it will be part of 1.2 beta that is planned for this month. -- Ahsan On Wed, Feb 5, 2014 at 10:03 PM, David E. Wheeler <da...@ju...>wrote: > On Feb 4, 2014, at 10:18 PM, Abbas Butt <abb...@en...> > wrote: > > > I had to change the insert query a little bit and had to do some dummy > inserts in projects and changes too, but the query worked fine for me. BTW > I tried on current master, I am not sure which version you are using to > test. > > The 1.1 release. > > > Here are the queries I tried. > > > > After creating the three tables, I did > > > > insert into projects values('?', '?', DEFAULT, '?', '?'); > > > > insert into changes values('?', '?', '?', '?', DEFAULT, '?', '?', > clock_timestamp(), '?', '?'); > > > > INSERT INTO tags (tag_id, tag, project, change_id, note, committer_name, > committer_email, planned_at, planner_name, planner_email) > > SELECT tid, tg, proj, chid, n, name, email, at, pname, pemail FROM ( > VALUES ('?', '?', '?', '?', '?', '?', '?', clock_timestamp()::timestamptz, > '?', '?')) i(tid, tg, proj, chid, n, name, email, at, pname, pemail) LEFT > JOIN tags ON i.tid = tags.tag_id WHERE tags.tag_id IS NULL; > > On 1.1: > > dwheeler=# insert into sqitch.projects values('?', '?', DEFAULT, '?', '?'); > INSERT 0 1 > Time: 48.720 ms > dwheeler=# insert into sqitch.changes values('?', '?', '?', '?', DEFAULT, > '?', '?', clock_timestamp(), '?', '?'); > INSERT 0 1 > Time: 43.704 ms > dwheeler=# INSERT INTO sqitch.tags (tag_id, tag, project, change_id, note, > committer_name, committer_email, planned_at, planner_name, planner_email) > dwheeler=# INSERT INTO sqitch.tags (tag_id, tag, project, change_id, note, > committer_name, committer_email, planned_at, planner_name, planner_email) > dwheeler-# SELECT tid, tg, proj, chid, n, name, email, at, pname, pemail > FROM ( VALUES ('?', '?', '?', '?', '?', '?', '?', > clock_timestamp()::timestamptz, '?', '?')) i(tid, tg, proj, chid, n, name, > email, at, pname, pemail) LEFT JOIN sqitch.tags ON i.tid = tags.tag_id > WHERE tags.tag_id IS NULL; > ERROR: unexpected varno 6 in JOIN RTE 5 > Time: 4.064 ms > > Best, > > David > > > > ------------------------------------------------------------------------------ > Managing the Performance of Cloud-Based Applications > Take advantage of what the Cloud has to offer - Avoid Common Pitfalls. > Read the Whitepaper. > > http://pubads.g.doubleclick.net/gampad/clk?id=121051231&iu=/4140/ostg.clktrk > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > -- Ahsan Hadi Snr Director Product Development EnterpriseDB Corporation The Enterprise Postgres Company Phone: +92-51-8358874 Mobile: +92-333-5162114 Website: www.enterprisedb.com EnterpriseDB Blog: http://blogs.enterprisedb.com/ Follow us on Twitter: http://www.twitter.com/enterprisedb This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. 
If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message. |
From: Tatsuo I. <is...@po...> - 2014-02-06 02:08:29
|
> PGXC Hackers, > > What is the simplest way to tell if the server one has connected to is XC? Try to call pgxc_version()? `SHOW gtm_host`? Or is there something else to check, maybe something that doesn't throw an exception? What about this? SELECT count(*) from (SELECT has_function_privilege('%s', 'pgxc_version(text)', 'execute') WHERE EXISTS(SELECT * FROM pg_catalog.pg_proc AS p WHERE p.proname = 'pgxc_version')) AS s Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese: http://www.sraoss.co.jp |
From: David E. W. <da...@ju...> - 2014-02-06 01:28:43
|
On Feb 4, 2014, at 5:36 PM, Koichi Suzuki <koi...@gm...> wrote: > Because XC didn't change libpq at all, I cannot find explicit way if > you're connecting to PG or XC. Even psql binary from PG works with > XC. I'd suggest to test if pgxc_class, pgxc_node and pgxc_nodegroup > catalog is available, which are all XC specifyc. To avoid name > conflict (you can create table pgxc_class in PG), you can specify > pgxc_class oid value, which will be release-specific though. I don’t think I need to be *that* anal. If someone creates a table named pgxc_class in pg_catalog then the problem is theirs, IMHO. Thanks, David |
From: David E. W. <da...@ju...> - 2014-02-06 01:25:50
|
On Feb 4, 2014, at 5:56 PM, Michael Paquier <mic...@gm...> wrote: > How do you actually do for PG itself? Do you use PG_VERSION_NUM or similar? I don’t. Previously I did not have to distinguish Postgres from anything else. David |
From: David E. W. <da...@ju...> - 2014-02-05 17:04:00
|
On Feb 4, 2014, at 10:18 PM, Abbas Butt <abb...@en...> wrote: > I had to change the insert query a little bit and had to do some dummy inserts in projects and changes too, but the query worked fine for me. BTW I tried on current master, I am not sure which version you are using to test. The 1.1 release. > Here are the queries I tried. > > After creating the three tables, I did > > insert into projects values('?', '?', DEFAULT, '?', '?'); > > insert into changes values('?', '?', '?', '?', DEFAULT, '?', '?', clock_timestamp(), '?', '?'); > > INSERT INTO tags (tag_id, tag, project, change_id, note, committer_name, committer_email, planned_at, planner_name, planner_email) > SELECT tid, tg, proj, chid, n, name, email, at, pname, pemail FROM ( VALUES ('?', '?', '?', '?', '?', '?', '?', clock_timestamp()::timestamptz, '?', '?')) i(tid, tg, proj, chid, n, name, email, at, pname, pemail) LEFT JOIN tags ON i.tid = tags.tag_id WHERE tags.tag_id IS NULL; On 1.1: dwheeler=# insert into sqitch.projects values('?', '?', DEFAULT, '?', '?'); INSERT 0 1 Time: 48.720 ms dwheeler=# insert into sqitch.changes values('?', '?', '?', '?', DEFAULT, '?', '?', clock_timestamp(), '?', '?'); INSERT 0 1 Time: 43.704 ms dwheeler=# INSERT INTO sqitch.tags (tag_id, tag, project, change_id, note, committer_name, committer_email, planned_at, planner_name, planner_email) dwheeler-# SELECT tid, tg, proj, chid, n, name, email, at, pname, pemail FROM ( VALUES ('?', '?', '?', '?', '?', '?', '?', clock_timestamp()::timestamptz, '?', '?')) i(tid, tg, proj, chid, n, name, email, at, pname, pemail) LEFT JOIN sqitch.tags ON i.tid = tags.tag_id WHERE tags.tag_id IS NULL; ERROR: unexpected varno 6 in JOIN RTE 5 Time: 4.064 ms Best, David |
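Until the planner bug is fixed, one possible workaround is to drop the LEFT JOIN from the insert-if-absent statement and use NOT EXISTS instead, which avoids the JOIN range-table entry that trips the error. This is only a sketch and has not been verified against 1.1.

    INSERT INTO sqitch.tags (tag_id, tag, project, change_id, note,
                             committer_name, committer_email, planned_at,
                             planner_name, planner_email)
    SELECT i.tid, i.tg, i.proj, i.chid, i.n, i.name, i.email, i.at, i.pname, i.pemail
    FROM (VALUES ('?', '?', '?', '?', '?', '?', '?',
                  clock_timestamp()::timestamptz, '?', '?'))
         AS i(tid, tg, proj, chid, n, name, email, at, pname, pemail)
    WHERE NOT EXISTS (SELECT 1 FROM sqitch.tags t WHERE t.tag_id = i.tid);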