From: Amit K. <ami...@en...> - 2014-02-18 18:50:46
|
On 13 February 2014 11:54, Ashutosh Bapat <ash...@en...>wrote: > One more solution would be to use cursors for replicated tables. The idea > is to open cursors on all the copies of the table and append the query with > an ORDER BY clause on all the columns. Thus we are sure that the current of > each of these cursors point to same row on all the copies. While fetching a > row from a replicated table, we fetch from all the cursors and choose only > one row for the data processing. While updating or deleting we send UPDATE > or DELETE with WHERE CURRENT OF. The down side of this approach is that, if > there are coordinator quals, we will end up locking more rows than > necessary, increasing the probability of the deadlock but at least there > won't be a necessary restriction of having primary or unique key and we > won't break backward compatibility. > > If there two identical rows, we might mix the update from different nodes, > but then who knew which of them were corresponded across the nodes to start > with. > > Locking all rows doesn't look good especially because we are looking for a permanent long term solution. If we can come up with some other solution that avoids this, we better avoid this compromise. For a replicated table with 10000 rows, all concurrent updates will be serialized even if they are updating a different row. Other thing is datanode performance impact for ORDER BY all columns, especially with many large size columns. I had also mentioned about ORDER BY in approach A. I am not sure whether there is some kind of optimization in the sort, such as: if we find unique rows with the first n columns, it does not compare the rest of the columns. I think declaring cursors is a cool idea in general for DMLs but it requires refactoring of DML planning, and also it requires ORDER BY. There is a concurrent update issue #398 for which we do require a refactor of DML handling. While doing that it will be clearer whether declaring cursor is really beneficial or if it's not feasible. For ORDER BY, again, for long term, we should have a primary key or an internal unique key so that rows can be ordered on that single column as against all columns. So again, we still are better of with a new system column. As regards to approach C, if we find a way to uniquely generate a new row id independently, then the task of generating rowid will be pretty lightweight. We won't require any other table to store it or generate it. The coordinator will generate it at each insert (both fqs and non-fqs), or may be datanodes themselves find a way to generate a new rowid which is always the same regardless of the datanode. A combination of gxid, timestamp and cmd id can be used to construct a unique rowid at the coordinator. I think one action plan can be : 1. Use Mason's patch and tweak it so that it needs very little modification later on if and when we add the system rowid column. 2. Check in the patch but let it not error out if the primary key is not there. This way we would at least make the replicated tables with primary keys work without data issues, but continue to work as it is now for tables without primary key. 3. Lastly support the new system row id implementation, and do an incremental change in Mason's checked in changes to use this id instead of primary key. > On Thu, Feb 13, 2014 at 9:45 AM, Koichi Suzuki <koi...@gm...>wrote: > >> Hi, >> >> I tested the patch and found that primary key is mandatory. We need >> to modify regression test considerably to give each replicated table >> primary keys. >> >> I think this patch helps but I'm not afraid this is good, especially >> when we try to take XC features back to PG. >> >> Did you post another patch to use all column values if primary key is >> not available? >> >> I think better way is as follows: >> >> 1) If primary key is defined, use it, >> 2) If not, create a primary key as system column, the size should be >> 64bit. >> 3) If primary key is added to a replicated table, remove system primary >> key. >> >> The value of primary key can be obtained as follows: >> >> 1) add new column to pgxc_class catalog to represent maximum value of >> the system primary key, >> 2) when first "insert" is done to the primary node, system primary key >> value is taken from 1) and 1) is updated. The value is returned to >> the coordinator to be propagated to other nodes. >> 3) when subsequent "insert" is being done, system primary key value is >> added to the column value. In this case, each datanode updates 1) >> column value if it is larger than the current maximum value. >> >> 3) is important to change primary node to another. This is needed to >> carry over the primary node to another. >> >> ALTER TABLE should take care of them. >> >> Other issues are: >> >> 4) pg_dump/pg_dumpall should not include this system column value, >> 5) cluster may need to handle this too to repack system primary key >> value (not now but at least in 1.3 or later). >> >> Regards; >> --- >> Koichi Suzuki >> >> >> 2013-11-02 9:26 GMT+09:00 Mason Sharp <ms...@tr...>: >> > Please see attached patch that tries to address the issue of XC using >> CTID >> > for replicated updates and deletes when it is evaluated at a coordinator >> > instead of being pushed down. >> > >> > The problem here is that CTID could be referring to a different tuple >> > altogether on a different data node, which is what happened for one of >> our >> > Postgres-XC support customers, leading to data issues. >> > >> > Instead, the patch looks for a primary key or unique index (with the >> primary >> > key preferred) and uses those values instead of CTID. >> > >> > The patch could be improved further. Extra parameters are set even if >> not >> > used in the execution of the prepared statement sent down to the data >> nodes. >> > >> > Regards, >> > >> > >> > -- >> > Mason Sharp >> > >> > TransLattice - http://www.translattice.com >> > Distributed and Clustered Database Solutions >> > >> > >> ------------------------------------------------------------------------------ >> > November Webinars for C, C++, Fortran Developers >> > Accelerate application performance with scalable programming models. >> Explore >> > techniques for threading, error checking, porting, and tuning. Get the >> most >> > from the latest Intel processors and coprocessors. See abstracts and >> > register >> > >> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk >> > _______________________________________________ >> > Postgres-xc-developers mailing list >> > Pos...@li... >> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >> > >> >> >> ------------------------------------------------------------------------------ >> Android apps run on BlackBerry 10 >> Introducing the new BlackBerry 10.2.1 Runtime for Android apps. >> Now with support for Jelly Bean, Bluetooth, Mapview and more. >> Get your Android app in front of a whole new audience. Start now. >> >> http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk >> _______________________________________________ >> Postgres-xc-developers mailing list >> Pos...@li... >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >> > > > > -- > Best Wishes, > Ashutosh Bapat > EnterpriseDB Corporation > The Postgres Database Company > > > ------------------------------------------------------------------------------ > Android apps run on BlackBerry 10 > Introducing the new BlackBerry 10.2.1 Runtime for Android apps. > Now with support for Jelly Bean, Bluetooth, Mapview and more. > Get your Android app in front of a whole new audience. Start now. > > http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > |