From: Abbas B. <abb...@en...> - 2014-02-20 10:38:43
|
On Thu, Feb 20, 2014 at 3:25 PM, Ahsan Hadi <ahs...@en...>wrote: > > > > On Wed, Feb 19, 2014 at 3:56 PM, Abbas Butt <abb...@en...>wrote: > >> Lets do this >> 1. If the replicated table has a unique key defined , then use Mason's >> patch after testing and refinement if required. >> 2. If the replicated table has no key, then we can use cursors idea >> proposed by Ashutosh. >> >> Does every body agree? >> > > Yes. > > Can you provide a patch for the above before monday? > I will try my best to. > > >> >> >> >> On Wed, Feb 19, 2014 at 12:15 PM, Koichi Suzuki <koi...@gm...>wrote: >> >>> 2014-02-19 16:00 GMT+09:00 Ahsan Hadi <ahs...@en...>: >>> > >>> > >>> > >>> > On Tue, Feb 18, 2014 at 11:26 PM, Amit Khandekar >>> > <ami...@en...> wrote: >>> >> >>> >> >>> >> >>> >> >>> >> On 13 February 2014 11:54, Ashutosh Bapat >>> >> <ash...@en...> wrote: >>> >>> >>> >>> One more solution would be to use cursors for replicated tables. The >>> idea >>> >>> is to open cursors on all the copies of the table and append the >>> query with >>> >>> an ORDER BY clause on all the columns. Thus we are sure that the >>> current of >>> >>> each of these cursors point to same row on all the copies. While >>> fetching a >>> >>> row from a replicated table, we fetch from all the cursors and >>> choose only >>> >>> one row for the data processing. While updating or deleting we send >>> UPDATE >>> >>> or DELETE with WHERE CURRENT OF. The down side of this approach is >>> that, if >>> >>> there are coordinator quals, we will end up locking more rows than >>> >>> necessary, increasing the probability of the deadlock but at least >>> there >>> >>> won't be a necessary restriction of having primary or unique key and >>> we >>> >>> won't break backward compatibility. >>> >>> >>> >>> If there two identical rows, we might mix the update from different >>> >>> nodes, but then who knew which of them were corresponded across the >>> nodes to >>> >>> start with. >>> >>> >>> >> >>> >> Locking all rows doesn't look good especially because we are looking >>> for a >>> >> permanent long term solution. If we can come up with some other >>> solution >>> >> that avoids this, we better avoid this compromise. For a replicated >>> table >>> >> with 10000 rows, all concurrent updates will be serialized even if >>> they are >>> >> updating a different row. >>> >> >>> >> Other thing is datanode performance impact for ORDER BY all columns, >>> >> especially with many large size columns. I had also mentioned about >>> ORDER BY >>> >> in approach A. I am not sure whether there is some kind of >>> optimization in >>> >> the sort, such as: if we find unique rows with the first n columns, >>> it does >>> >> not compare the rest of the columns. >>> >> >>> >> I think declaring cursors is a cool idea in general for DMLs but it >>> >> requires refactoring of DML planning, and also it requires ORDER BY. >>> There >>> >> is a concurrent update issue #398 for which we do require a refactor >>> of DML >>> >> handling. While doing that it will be clearer whether declaring >>> cursor is >>> >> really beneficial or if it's not feasible. For ORDER BY, again, for >>> long >>> >> term, we should have a primary key or an internal unique key so that >>> rows >>> >> can be ordered on that single column as against all columns. So >>> again, we >>> >> still are better of with a new system column. >>> >> >>> >> >>> >> As regards to approach C, if we find a way to uniquely generate a new >>> row >>> >> id independently, then the task of generating rowid will be pretty >>> >> lightweight. We won't require any other table to store it or generate >>> it. >>> >> The coordinator will generate it at each insert (both fqs and >>> non-fqs), or >>> >> may be datanodes themselves find a way to generate a new rowid which >>> is >>> >> always the same regardless of the datanode. A combination of gxid, >>> timestamp >>> >> and cmd id can be used to construct a unique rowid at the coordinator. >>> >> >>> >> I think one action plan can be : >>> >> 1. Use Mason's patch and tweak it so that it needs very little >>> >> modification later on if and when we add the system rowid column. >>> >> 2. Check in the patch but let it not error out if the primary key is >>> not >>> >> there. This way we would at least make the replicated tables with >>> primary >>> >> keys work without data issues, but continue to work as it is now for >>> tables >>> >> without primary key. >>> >> 3. Lastly support the new system row id implementation, and do an >>> >> incremental change in Mason's checked in changes to use this id >>> instead of >>> >> primary key. >>> >> >>> > >>> > Sounds good but as i understand 3 is not targeted for 1.2? >>> >>> I agree on this too. System column will benefit not only this, but >>> also WCO, more variety of the cursor, and Triggers. I agree this is >>> targeted to 1.3 or later. >>> >>> Regards; >>> --- >>> Koichi Suzuki >>> >>> > >>> >> >>> >> >>> >>> >>> >>> On Thu, Feb 13, 2014 at 9:45 AM, Koichi Suzuki < >>> koi...@gm...> >>> >>> wrote: >>> >>>> >>> >>>> Hi, >>> >>>> >>> >>>> I tested the patch and found that primary key is mandatory. We >>> need >>> >>>> to modify regression test considerably to give each replicated table >>> >>>> primary keys. >>> >>>> >>> >>>> I think this patch helps but I'm not afraid this is good, especially >>> >>>> when we try to take XC features back to PG. >>> >>>> >>> >>>> Did you post another patch to use all column values if primary key >>> is >>> >>>> not available? >>> >>>> >>> >>>> I think better way is as follows: >>> >>>> >>> >>>> 1) If primary key is defined, use it, >>> >>>> 2) If not, create a primary key as system column, the size should be >>> >>>> 64bit. >>> >>>> 3) If primary key is added to a replicated table, remove system >>> primary >>> >>>> key. >>> >>>> >>> >>>> The value of primary key can be obtained as follows: >>> >>>> >>> >>>> 1) add new column to pgxc_class catalog to represent maximum value >>> of >>> >>>> the system primary key, >>> >>>> 2) when first "insert" is done to the primary node, system primary >>> key >>> >>>> value is taken from 1) and 1) is updated. The value is returned to >>> >>>> the coordinator to be propagated to other nodes. >>> >>>> 3) when subsequent "insert" is being done, system primary key value >>> is >>> >>>> added to the column value. In this case, each datanode updates 1) >>> >>>> column value if it is larger than the current maximum value. >>> >>>> >>> >>>> 3) is important to change primary node to another. This is needed >>> to >>> >>>> carry over the primary node to another. >>> >>>> >>> >>>> ALTER TABLE should take care of them. >>> >>>> >>> >>>> Other issues are: >>> >>>> >>> >>>> 4) pg_dump/pg_dumpall should not include this system column value, >>> >>>> 5) cluster may need to handle this too to repack system primary key >>> >>>> value (not now but at least in 1.3 or later). >>> >>>> >>> >>>> Regards; >>> >>>> --- >>> >>>> Koichi Suzuki >>> >>>> >>> >>>> >>> >>>> 2013-11-02 9:26 GMT+09:00 Mason Sharp <ms...@tr...>: >>> >>>> > Please see attached patch that tries to address the issue of XC >>> using >>> >>>> > CTID >>> >>>> > for replicated updates and deletes when it is evaluated at a >>> >>>> > coordinator >>> >>>> > instead of being pushed down. >>> >>>> > >>> >>>> > The problem here is that CTID could be referring to a different >>> tuple >>> >>>> > altogether on a different data node, which is what happened for >>> one of >>> >>>> > our >>> >>>> > Postgres-XC support customers, leading to data issues. >>> >>>> > >>> >>>> > Instead, the patch looks for a primary key or unique index (with >>> the >>> >>>> > primary >>> >>>> > key preferred) and uses those values instead of CTID. >>> >>>> > >>> >>>> > The patch could be improved further. Extra parameters are set >>> even if >>> >>>> > not >>> >>>> > used in the execution of the prepared statement sent down to the >>> data >>> >>>> > nodes. >>> >>>> > >>> >>>> > Regards, >>> >>>> > >>> >>>> > >>> >>>> > -- >>> >>>> > Mason Sharp >>> >>>> > >>> >>>> > TransLattice - http://www.translattice.com >>> >>>> > Distributed and Clustered Database Solutions >>> >>>> > >>> >>>> > >>> >>>> > >>> ------------------------------------------------------------------------------ >>> >>>> > November Webinars for C, C++, Fortran Developers >>> >>>> > Accelerate application performance with scalable programming >>> models. >>> >>>> > Explore >>> >>>> > techniques for threading, error checking, porting, and tuning. >>> Get the >>> >>>> > most >>> >>>> > from the latest Intel processors and coprocessors. See abstracts >>> and >>> >>>> > register >>> >>>> > >>> >>>> > >>> http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk >>> >>>> > _______________________________________________ >>> >>>> > Postgres-xc-developers mailing list >>> >>>> > Pos...@li... >>> >>>> > >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>> >>>> > >>> >>>> >>> >>>> >>> >>>> >>> ------------------------------------------------------------------------------ >>> >>>> Android apps run on BlackBerry 10 >>> >>>> Introducing the new BlackBerry 10.2.1 Runtime for Android apps. >>> >>>> Now with support for Jelly Bean, Bluetooth, Mapview and more. >>> >>>> Get your Android app in front of a whole new audience. Start now. >>> >>>> >>> >>>> >>> http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk >>> >>>> _______________________________________________ >>> >>>> Postgres-xc-developers mailing list >>> >>>> Pos...@li... >>> >>>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> -- >>> >>> Best Wishes, >>> >>> Ashutosh Bapat >>> >>> EnterpriseDB Corporation >>> >>> The Postgres Database Company >>> >>> >>> >>> >>> >>> >>> ------------------------------------------------------------------------------ >>> >>> Android apps run on BlackBerry 10 >>> >>> Introducing the new BlackBerry 10.2.1 Runtime for Android apps. >>> >>> Now with support for Jelly Bean, Bluetooth, Mapview and more. >>> >>> Get your Android app in front of a whole new audience. Start now. >>> >>> >>> >>> >>> http://pubads.g.doubleclick.net/gampad/clk?id=124407151&iu=/4140/ostg.clktrk >>> >>> _______________________________________________ >>> >>> Postgres-xc-developers mailing list >>> >>> Pos...@li... >>> >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>> >>> >>> >> >>> >> >>> >> >>> >> >>> ------------------------------------------------------------------------------ >>> >> Managing the Performance of Cloud-Based Applications >>> >> Take advantage of what the Cloud has to offer - Avoid Common Pitfalls. >>> >> Read the Whitepaper. >>> >> >>> >> >>> http://pubads.g.doubleclick.net/gampad/clk?id=121054471&iu=/4140/ostg.clktrk >>> >> >>> >> _______________________________________________ >>> >> Postgres-xc-developers mailing list >>> >> Pos...@li... >>> >> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>> >> >>> > >>> > >>> > >>> > -- >>> > Ahsan Hadi >>> > Snr Director Product Development >>> > EnterpriseDB Corporation >>> > The Enterprise Postgres Company >>> > >>> > Phone: +92-51-8358874 >>> > Mobile: +92-333-5162114 >>> > >>> > Website: www.enterprisedb.com >>> > EnterpriseDB Blog: http://blogs.enterprisedb.com/ >>> > Follow us on Twitter: http://www.twitter.com/enterprisedb >>> > >>> > This e-mail message (and any attachment) is intended for the use of the >>> > individual or entity to whom it is addressed. This message contains >>> > information from EnterpriseDB Corporation that may be privileged, >>> > confidential, or exempt from disclosure under applicable law. If you >>> are not >>> > the intended recipient or authorized to receive this for the intended >>> > recipient, any use, dissemination, distribution, retention, archiving, >>> or >>> > copying of this communication is strictly prohibited. If you have >>> received >>> > this e-mail in error, please notify the sender immediately by reply >>> e-mail >>> > and delete this message. >>> > >>> > >>> ------------------------------------------------------------------------------ >>> > Managing the Performance of Cloud-Based Applications >>> > Take advantage of what the Cloud has to offer - Avoid Common Pitfalls. >>> > Read the Whitepaper. >>> > >>> http://pubads.g.doubleclick.net/gampad/clk?id=121054471&iu=/4140/ostg.clktrk >>> > _______________________________________________ >>> > Postgres-xc-developers mailing list >>> > Pos...@li... >>> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>> > >>> >>> >>> ------------------------------------------------------------------------------ >>> Managing the Performance of Cloud-Based Applications >>> Take advantage of what the Cloud has to offer - Avoid Common Pitfalls. >>> Read the Whitepaper. >>> >>> http://pubads.g.doubleclick.net/gampad/clk?id=121054471&iu=/4140/ostg.clktrk >>> _______________________________________________ >>> Postgres-xc-developers mailing list >>> Pos...@li... >>> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers >>> >> >> >> >> -- >> -- >> *Abbas* >> Architect >> >> Ph: 92.334.5100153 >> Skype ID: gabbasb >> www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> >> >> *Follow us on Twitter* >> @EnterpriseDB >> >> Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> >> > > > > -- > Ahsan Hadi > Snr Director Product Development > EnterpriseDB Corporation > The Enterprise Postgres Company > > Phone: +92-51-8358874 > Mobile: +92-333-5162114 > > Website: www.enterprisedb.com > EnterpriseDB Blog: http://blogs.enterprisedb.com/ > Follow us on Twitter: http://www.twitter.com/enterprisedb > > This e-mail message (and any attachment) is intended for the use of the > individual or entity to whom it is addressed. This message contains > information from EnterpriseDB Corporation that may be privileged, > confidential, or exempt from disclosure under applicable law. If you are > not the intended recipient or authorized to receive this for the intended > recipient, any use, dissemination, distribution, retention, archiving, or > copying of this communication is strictly prohibited. If you have received > this e-mail in error, please notify the sender immediately by reply e-mail > and delete this message. > -- -- *Abbas* Architect Ph: 92.334.5100153 Skype ID: gabbasb www.enterprisedb.co <http://www.enterprisedb.com/>m<http://www.enterprisedb.com/> *Follow us on Twitter* @EnterpriseDB Visit EnterpriseDB for tutorials, webinars, whitepapers<http://www.enterprisedb.com/resources-community>and more<http://www.enterprisedb.com/resources-community> |