From: Abbas B. <abb...@en...> - 2013-02-23 09:29:31
Attachments:
1_fix_dump.patch
|
Hi,

PFA a patch to fix a crash when COPY TO is used on a replicated table.

This test case produces a crash:

    create table tab_rep(a int, b int) distribute by replication;
    insert into tab_rep values(1,2), (3,4), (5,6), (7,8);
    COPY tab_rep (a, b) TO stdout;

Here is a description of the problem and the fix:
In case of a read from a replicated table, GetRelationNodes() returns all nodes and expects that the planner can choose one depending on the rest of the join tree. In case of COPY TO we should choose the first one in the node list. This fixes a system crash and makes pg_dump work fine.

-- 
Abbas
Architect
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

Phone: 92-334-5100153

Website: www.enterprisedb.com
EnterpriseDB Blog: http://blogs.enterprisedb.com/
Follow us on Twitter: http://www.twitter.com/enterprisedb

This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message.
|
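The situation described in the message above can be modelled in a few lines of standalone C. This is only a sketch under stated assumptions: `NodeList`, `candidate_nodes_for_replicated()` and `narrow_to_single_node()` are hypothetical names invented for illustration, not the Postgres-XC `List` type or the patched code. It shows a lookup that returns every node for a replicated table and a caller that, lacking a planner to choose, keeps only the first entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a list of datanode indexes; this is NOT the
 * PostgreSQL/Postgres-XC List type. */
typedef struct NodeList {
    int    nodes[16];
    size_t count;
} NodeList;

/* Models a lookup for a replicated table: every node holds a full copy,
 * so all of them are returned as candidates for a read. */
NodeList candidate_nodes_for_replicated(size_t total_nodes)
{
    NodeList list = { {0}, 0 };

    assert(total_nodes <= 16);
    for (size_t i = 0; i < total_nodes; i++)
        list.nodes[list.count++] = (int) i;
    return list;
}

/* A caller with no planner step (such as COPY TO in the message above)
 * has to narrow the candidates itself; here it keeps only the first one. */
NodeList narrow_to_single_node(NodeList candidates)
{
    assert(candidates.count > 0);
    candidates.count = 1;
    return candidates;
}
```

Calling `narrow_to_single_node(candidate_nodes_for_replicated(4))` leaves a one-element list holding node 0, so the read is issued to exactly one copy of the table.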
From: Ashutosh B. <ash...@en...> - 2013-02-25 05:19:55
|
Thanks a lot, Abbas, for this quick fix.

I am sorry, it's caused by my refactoring of GetRelationNodes().

If possible, can you please examine the other callers of GetRelationNodes() which would face the same problem, esp. the ones for DML and utilities? This is another instance where deciding the nodes to execute on at the time of execution will help.

About the fix:
Can you please use GetPreferredReplicationNode() instead of list_truncate()? It will pick the preferred node instead of the first one. If you find more places where we need this fix, it might be better to create a wrapper function and use it at those places.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company
|
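The wrapper suggested above could look roughly like the following standalone C sketch. It is not the signature or behaviour of the real GetPreferredReplicationNode(); the `DataNode` struct, its `preferred` flag, the `pick_read_node()` name and the fall-back to the first node when nothing is marked preferred are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical description of one datanode; not a Postgres-XC catalog struct. */
typedef struct DataNode {
    int  id;
    bool preferred;   /* assumed flag: node is preferred for reads */
} DataNode;

/* Returns the id of the first node marked preferred. When no node is
 * marked, falls back to the first node in the array (an assumption of
 * this sketch), so every caller always gets exactly one node. */
int pick_read_node(const DataNode *nodes, size_t count)
{
    assert(nodes != NULL && count > 0);

    for (size_t i = 0; i < count; i++)
    {
        if (nodes[i].preferred)
            return nodes[i].id;
    }
    return nodes[0].id;
}
```

Keeping the choice in one helper means each caller that reads from a replicated table applies the same policy, and a later change to that policy touches a single function.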
From: Ashutosh B. <ash...@en...> - 2013-02-25 05:21:27
|
Oh, BTW, I had run regression before committing the fix, but it didn't catch this crash. One reason is that we do not have PG tests running for replicated tables. But in this specific case, maybe we should add a test in xc_misc or something.

-- 
Best Wishes,
Ashutosh Bapat
|
From: Michael P. <mic...@gm...> - 2013-02-25 05:40:52
|
On Mon, Feb 25, 2013 at 2:18 PM, Ashutosh Bapat <ash...@en...> wrote:

> Thanks a lot Abbas for this quick fix.
>
> I am sorry, it's caused by my refactoring of GetRelationNodes().
>
> If possible, can you please examine the other callers of
> GetRelationNodes() which would face the problems, esp. the ones for DML and
> utilities. This is other instance, where deciding the nodes to execute on
> at the time of execution will help.
>
> About the fix
> Can you please use GetPreferredReplicationNode() instead of
> list_truncate()? It will pick the preferred node instead of first one. If
> you find more places where we need this fix, it might be better to create a
> wrapper function and use it at those places.
>

Also could you add a test in xc_copy to be sure that this does not happen again? I think there are already safeguards in this test but just to be sure...
-- 
Michael
|
From: Abbas B. <abb...@en...> - 2013-03-08 07:25:54
Attachments:
11_fix_dump.patch
|
Attached please find the revised patch, which provides the following in addition to what it did earlier:

1. Uses GetPreferredReplicationNode() instead of list_truncate().
2. Adds test cases to xc_alter_table and xc_copy.

I tested the following in reasonable detail to find out whether any other caller of GetRelationNodes() needs fixing, and found that none of the other callers does. I tested:

a) copy
b) alter table redistribute
c) utilities
d) dmls etc.

However, while testing ALTER TABLE, I found that redistribution from replicated to hash is not working correctly. This test case fails, since only SIX rows are expected in the final result.

    test=# create table t_r_n12(a int, b int) distribute by replication to node (DATA_NODE_1, DATA_NODE_2);
    CREATE TABLE
    test=# insert into t_r_n12 values(1,777),(3,4),(5,6),(20,30),(NULL,999), (NULL, 999);
    INSERT 0 6
    test=# -- rep to hash
    test=# ALTER TABLE t_r_n12 distribute by hash(a);
    ALTER TABLE
    test=# SELECT * FROM t_r_n12 order by 1;
     a  |  b
    ----+-----
      1 | 777
      3 |   4
      5 |   6
     20 |  30
        | 999
        | 999
        | 999
        | 999
    (8 rows)

    test=# drop table t_r_n12;
    DROP TABLE

I have added a SourceForge bug tracker ID for this case (Artifact 3607290 <https://sourceforge.net/tracker/?func=detail&aid=3607290&group_id=311227&atid=1310232>). The reason for this error is that the function distrib_delete_hash does not take into account that the distribution column can be null. I will provide a separate fix for that one.

Regression shows no extra failure, except that test case xc_alter_table will fail until 3607290 is fixed.

Regards

-- 
Abbas
|
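One plausible way a NULL distribution column could produce the eight rows shown above is sketched below in standalone C. This is a toy model, not the code of distrib_delete_hash: the modulo placement rule, the choice of node 0 for NULL keys, and all type and function names are assumptions for illustration. Each node starts with a full copy of the replicated table and deletes the rows that belong elsewhere. If the delete condition compares a value derived from NULL, the result is unknown rather than true, so the NULL rows survive on every node.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A row's distribution column: either a value or SQL NULL. */
typedef struct DistKey {
    bool is_null;
    int  value;
} DistKey;

/* Toy placement rule (assumption): non-negative value modulo node count.
 * This is not the hash function Postgres-XC uses. */
static int target_node(int value, int node_count)
{
    int m = value % node_count;
    return m < 0 ? m + node_count : m;
}

/* Naive cleanup: delete only when "target <> this node" is definitely true.
 * For a NULL key the comparison is unknown, so the row is never deleted. */
bool naive_should_delete(DistKey key, int this_node, int node_count)
{
    assert(node_count > 0);
    if (key.is_null)
        return false;
    return target_node(key.value, node_count) != this_node;
}

/* NULL-aware cleanup: NULL keys are kept on node 0 only (an assumption of
 * this sketch) and deleted everywhere else. */
bool null_aware_should_delete(DistKey key, int this_node, int node_count)
{
    assert(node_count > 0);
    if (key.is_null)
        return this_node != 0;
    return target_node(key.value, node_count) != this_node;
}

/* Counts the rows left cluster-wide after every node, starting from a
 * full copy of the table, applies the given delete rule. */
size_t rows_after_redistribution(const DistKey *rows, size_t row_count,
                                 int node_count,
                                 bool (*should_delete)(DistKey, int, int))
{
    size_t remaining = 0;

    for (int node = 0; node < node_count; node++)
        for (size_t i = 0; i < row_count; i++)
            if (!should_delete(rows[i], node, node_count))
                remaining++;
    return remaining;
}
```

With the six keys from the test case (1, 3, 5, 20, NULL, NULL) and two nodes, the naive rule leaves eight rows in this model, because both NULL rows stay on both nodes, while the NULL-aware rule leaves the expected six.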