From: Kaiji C. <ch...@im...> - 2013-03-29 10:35:49
|
Hi, I'm working on a data partitioning project on PostgreSQL by adding a middleware between the database cluster interface and applications that modify the SQL statement to specific data nodes. I just find that PostgresXC has a nice GTM that can help me do the distributed transaction management works, I considered to transfer my project on it. It seems the sliders (http://wiki.postgresql.org/images/f/f6/PGXC_Scalability_PGOpen2012.pdf) intend that user defined table distribution is not available, but the coordinator can choose specific data node when processing the queries, and the table will be distributed to by default if DISTRIBUTED BY is not specified. Then I wonder if I can specify a data node in each query and stop the default auto distributing process. |
From: Michael P. <mic...@gm...> - 2013-03-30 04:55:12
|
On Fri, Mar 29, 2013 at 7:19 PM, Kaiji Chen <ch...@im...> wrote: > Hi, > I'm working on a data partitioning project on PostgreSQL by adding a > middleware between the database cluster interface and applications that > modify the SQL statement to specific data nodes. I just find that > PostgresXC has a nice GTM that can help me do the distributed transaction > management works, I considered to transfer my project on it. > It seems the sliders ( > http://wiki.postgresql.org/images/f/f6/PGXC_Scalability_PGOpen2012.pdf) > intend that user defined table distribution is not available, but the > coordinator can choose specific data node when processing the queries, and > the table will be distributed to by default if DISTRIBUTED BY is not > specified. Then I wonder if I can specify a data node in each query and > stop the default auto distributing process. > For SELECT queries, you can use EXECUTE DIRECT: http://postgres-xc.sourceforge.net/docs/1_0_2/sql-executedirect.html The results you get might not be exact as not global query planning is not done and the query string is sent as-is. Note that you cannot use EXECUTE DIRECT with DML or the whole cluster consistency would be broken. -- Michael |
From: Kaiji C. <ch...@im...> - 2013-03-30 09:12:24
|
Thanks for your reply! It seems ok if I use EXECUTE DIRECT and manually maintain the data concurrency and a global index in my middleware. But it looks like I've skipped the PostgresXC coordinator, it will not be the best choice. I just come up a idea applying external data partitioning design to the PostgresXC. As stated in the document XC can distribute tables to data nodes using hash function. Then can I manipulate the original table and add a new column as my partition decision and let the table distributed by this column. Then we add this column to the compound primary key of the table and let the coordinator deal with the query planning work. I think this can be done if for different tables, the same hash value will be partitioned to the same data node if there is no modification to the set of data nodes. Yours, Kaiji Chen PhD Candidate<mailto:ch...@im...> IMADA, Southern Denmark University Email: ch...@im...<mailto:ch...@im...> ________________________________ From: Michael Paquier [mic...@gm...] Sent: Saturday, March 30, 2013 5:55 AM To: Kaiji Chen Cc: pos...@li... Subject: Re: [Postgres-xc-developers] Manually Table Partitioning On Fri, Mar 29, 2013 at 7:19 PM, Kaiji Chen <ch...@im...<mailto:ch...@im...>> wrote: Hi, I'm working on a data partitioning project on PostgreSQL by adding a middleware between the database cluster interface and applications that modify the SQL statement to specific data nodes. I just find that PostgresXC has a nice GTM that can help me do the distributed transaction management works, I considered to transfer my project on it. It seems the sliders (http://wiki.postgresql.org/images/f/f6/PGXC_Scalability_PGOpen2012.pdf) intend that user defined table distribution is not available, but the coordinator can choose specific data node when processing the queries, and the table will be distributed to by default if DISTRIBUTED BY is not specified. Then I wonder if I can specify a data node in each query and stop the default auto distributing process. For SELECT queries, you can use EXECUTE DIRECT: http://postgres-xc.sourceforge.net/docs/1_0_2/sql-executedirect.html The results you get might not be exact as not global query planning is not done and the query string is sent as-is. Note that you cannot use EXECUTE DIRECT with DML or the whole cluster consistency would be broken. -- Michael |
From: Ashutosh B. <ash...@en...> - 2013-04-01 04:24:37
|
Kaiji, EXECUTE DIRECT is only for debugging and troubleshooting, it shouldn't be used in applications or tools based on XC, (except for the purposes stated). We may change or deprecate the EXECUTE DIRECT usage in future. On Sat, Mar 30, 2013 at 2:39 PM, Kaiji Chen <ch...@im...> wrote: > Thanks for your reply! > > It seems ok if I use EXECUTE DIRECT and manually maintain the data > concurrency and a global index in my middleware. But it looks like I've > skipped the PostgresXC coordinator, it will not be the best choice. > > I just come up a idea applying external data partitioning design to the > PostgresXC. As stated in the document XC can distribute tables to data > nodes using hash function. Then can I manipulate the original table and add > a new column as my partition decision and let the table distributed by this > column. Then we add this column to the compound primary key of the table > and let the coordinator deal with the query planning work. I think this can > be done if for different tables, the same hash value will be partitioned to > the same data node if there is no modification to the set of data nodes. > > > > > Yours, > Kaiji Chen > PhD Candidate <ch...@im...> > IMADA, Southern Denmark University > Email: ch...@im... > ------------------------------ > *From:* Michael Paquier [mic...@gm...] > *Sent:* Saturday, March 30, 2013 5:55 AM > *To:* Kaiji Chen > *Cc:* pos...@li... > *Subject:* Re: [Postgres-xc-developers] Manually Table Partitioning > > > > > On Fri, Mar 29, 2013 at 7:19 PM, Kaiji Chen <ch...@im...> wrote: > >> Hi, >> I'm working on a data partitioning project on PostgreSQL by adding a >> middleware between the database cluster interface and applications that >> modify the SQL statement to specific data nodes. I just find that >> PostgresXC has a nice GTM that can help me do the distributed transaction >> management works, I considered to transfer my project on it. >> It seems the sliders ( >> http://wiki.postgresql.org/images/f/f6/PGXC_Scalability_PGOpen2012.pdf) >> intend that user defined table distribution is not available, but the >> coordinator can choose specific data node when processing the queries, and >> the table will be distributed to by default if DISTRIBUTED BY is not >> specified. Then I wonder if I can specify a data node in each query and >> stop the default auto distributing process. >> > For SELECT queries, you can use EXECUTE DIRECT: > http://postgres-xc.sourceforge.net/docs/1_0_2/sql-executedirect.html > The results you get might not be exact as not global query planning is > not done and the query string is sent as-is. > > Note that you cannot use EXECUTE DIRECT with DML or the whole cluster > consistency would be broken. > -- > Michael > > > ------------------------------------------------------------------------------ > Own the Future-Intel(R) Level Up Game Demo Contest 2013 > Rise to greatness in Intel's independent game demo contest. Compete > for recognition, cash, and the chance to get your game on Steam. > $5K grand prize plus 10 genre and skill prizes. Submit your demo > by 6/6/13. http://altfarm.mediaplex.com/ad/ck/12124-176961-30367-2 > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > -- Best Wishes, Ashutosh Bapat EntepriseDB Corporation The Enterprise Postgres Company |
From: Abbas B. <abb...@en...> - 2013-03-30 11:40:33
|
On Fri, Mar 29, 2013 at 3:19 PM, Kaiji Chen <ch...@im...> wrote: > Hi, > I'm working on a data partitioning project on PostgreSQL by adding a > middleware between the database cluster interface and applications that > modify the SQL statement to specific data nodes. I just find that > PostgresXC has a nice GTM that can help me do the distributed transaction > management works, I considered to transfer my project on it. > It seems the sliders ( > http://wiki.postgresql.org/images/f/f6/PGXC_Scalability_PGOpen2012.pdf) > intend that user defined table distribution is not available, but the > coordinator can choose specific data node when processing the queries, and > the table will be distributed to by default if DISTRIBUTED BY is not > specified. Then I wonder if I can specify a data node in each query and > stop the default auto distributing process. > Here is what you can do. Add a column of type int in the table and distribute the table by modulo of the added column. Now if you want to specify in your query that the insert should go to first data node use value 0 for the added column, for second data node use 1 and so on. Off course a better way would be a add support for a user defined function for computing target data node in XC, but the above idea is valid for the current implementation. > > > ------------------------------------------------------------------------------ > Own the Future-Intel(R) Level Up Game Demo Contest 2013 > Rise to greatness in Intel's independent game demo contest. Compete > for recognition, cash, and the chance to get your game on Steam. > $5K grand prize plus 10 genre and skill prizes. Submit your demo > by 6/6/13. http://altfarm.mediaplex.com/ad/ck/12124-176961-30367-2 > _______________________________________________ > Postgres-xc-developers mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers > > -- -- Abbas Architect EnterpriseDB Corporation The Enterprise PostgreSQL Company Phone: 92-334-5100153 Website: www.enterprisedb.com EnterpriseDB Blog: http://blogs.enterprisedb.com/ Follow us on Twitter: http://www.twitter.com/enterprisedb This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message. |
From: Kaiji C. <ch...@im...> - 2013-03-30 12:35:26
|
Thanks for your help, I'll use this solution in my project. On Mar 30, 2013, at 12:40 PM, "Abbas Butt" <abb...@en...<mailto:abb...@en...>> wrote: On Fri, Mar 29, 2013 at 3:19 PM, Kaiji Chen <ch...@im...<mailto:ch...@im...>> wrote: Hi, I'm working on a data partitioning project on PostgreSQL by adding a middleware between the database cluster interface and applications that modify the SQL statement to specific data nodes. I just find that PostgresXC has a nice GTM that can help me do the distributed transaction management works, I considered to transfer my project on it. It seems the sliders (http://wiki.postgresql.org/images/f/f6/PGXC_Scalability_PGOpen2012.pdf) intend that user defined table distribution is not available, but the coordinator can choose specific data node when processing the queries, and the table will be distributed to by default if DISTRIBUTED BY is not specified. Then I wonder if I can specify a data node in each query and stop the default auto distributing process. Here is what you can do. Add a column of type int in the table and distribute the table by modulo of the added column. Now if you want to specify in your query that the insert should go to first data node use value 0 for the added column, for second data node use 1 and so on. Off course a better way would be a add support for a user defined function for computing target data node in XC, but the above idea is valid for the current implementation. ------------------------------------------------------------------------------ Own the Future-Intel(R) Level Up Game Demo Contest 2013 Rise to greatness in Intel's independent game demo contest. Compete for recognition, cash, and the chance to get your game on Steam. $5K grand prize plus 10 genre and skill prizes. Submit your demo by 6/6/13. http://altfarm.mediaplex.com/ad/ck/12124-176961-30367-2 _______________________________________________ Postgres-xc-developers mailing list Pos...@li...<mailto:Pos...@li...> https://lists.sourceforge.net/lists/listinfo/postgres-xc-developers -- -- Abbas Architect EnterpriseDB Corporation The Enterprise PostgreSQL Company Phone: 92-334-5100153 Website: www.enterprisedb.com<http://www.enterprisedb.com/> EnterpriseDB Blog: http://blogs.enterprisedb.com/ Follow us on Twitter: http://www.twitter.com/enterprisedb This e-mail message (and any attachment) is intended for the use of the individual or entity to whom it is addressed. This message contains information from EnterpriseDB Corporation that may be privileged, confidential, or exempt from disclosure under applicable law. If you are not the intended recipient or authorized to receive this for the intended recipient, any use, dissemination, distribution, retention, archiving, or copying of this communication is strictly prohibited. If you have received this e-mail in error, please notify the sender immediately by reply e-mail and delete this message. |