Re: [Postgres-xc-developers] Manually Table Partitioning


Thanks for your reply!

It seems OK if I use EXECUTE DIRECT and manually maintain data concurrency and a global index in my middleware. But then I would be bypassing the Postgres-XC coordinator, so it is probably not the best choice.

I just came up with an idea: applying an external data partitioning design to Postgres-XC. As stated in the documentation, XC can distribute tables to data nodes using a hash function. Could I then alter the original table, add a new column that holds my partition decision, and have the table distributed by this column? We would then add this column to the table's compound primary key and let the coordinator handle the query planning. I think this works if, across different tables, the same hash value is always assigned to the same data node as long as the set of data nodes is not modified.
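The co-location assumption above can be sketched in a few lines of Python. This is only an illustration: the node names, the two tables, and the `node_for` helper are hypothetical, and the hash used here is not the one Postgres-XC applies internally. It shows that with a deterministic hash and a fixed node list, rows from different tables that share a partition-decision value map to the same node.

```python
import hashlib

# Hypothetical data node names; a real cluster defines its own.
DATA_NODES = ["dn1", "dn2", "dn3"]


def node_for(partition_key, nodes=DATA_NODES):
    """Map a partition key to a node with a stable hash.

    Assumption: this is NOT the hash function Postgres-XC uses; any
    deterministic hash illustrates the same co-location property.
    """
    digest = hashlib.sha256(str(partition_key).encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def placement(rows):
    """Return {row_id: node} for rows given as (row_id, partition_key)."""
    return {row_id: node_for(key) for row_id, key in rows}


# Two hypothetical tables that carry the same partition-decision column.
orders = [(101, 7), (102, 8), (103, 7)]   # (order_id, part_key)
customers = [(1, 7), (2, 8)]              # (customer_id, part_key)

order_nodes = placement(orders)
customer_nodes = placement(customers)

# Rows with part_key 7 share a node, regardless of which table they are in.
same_node = order_nodes[101] == order_nodes[103] == customer_nodes[1]
print("part_key 7 co-located:", same_node)
```

If the node list changes, the modulo step may send the same key to a different node, which is why the idea depends on the set of data nodes staying fixed.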

Yours,
Kaiji Chen
PhD Candidate
IMADA, Southern Denmark University
Email: ch...@im...
________________________________
From: Michael Paquier [mic...@gm...]
Sent: Saturday, March 30, 2013 5:55 AM
To: Kaiji Chen
Cc: pos...@li...
Subject: Re: [Postgres-xc-developers] Manually Table Partitioning

On Fri, Mar 29, 2013 at 7:19 PM, Kaiji Chen <ch...@im...> wrote:
Hi,
I'm working on a data partitioning project on PostgreSQL that adds a middleware between the database cluster interface and the applications; the middleware rewrites SQL statements to target specific data nodes. I just found that Postgres-XC has a nice GTM that can help me with distributed transaction management, so I am considering moving my project onto it.
It seems the slides (http://wiki.postgresql.org/images/f/f6/PGXC_Scalability_PGOpen2012.pdf) imply that user-defined table distribution is not available, but that the coordinator can choose a specific data node when processing queries, and that a table is distributed by default if DISTRIBUTE BY is not specified. So I wonder whether I can specify a data node in each query and turn off the default automatic distribution.
For SELECT queries, you can use EXECUTE DIRECT:
http://postgres-xc.sourceforge.net/docs/1_0_2/sql-executedirect.html
The results you get might not be exact, as no global query planning is done and the query string is sent as-is.
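One way to read this caveat, sketched in Python under stated assumptions: the node names and row values below are made up, and the two functions only simulate what a single-node query and a coordinator-planned query would see. A count sent as-is to one node reflects only that node's share of a distributed table.

```python
# Hypothetical contents of one table hash-distributed over two data nodes.
ROWS_BY_NODE = {
    "dn1": [10, 20, 30],
    "dn2": [40, 50],
}


def count_on_node(node):
    """Simulate a count sent as-is to a single node: only local rows are seen."""
    return len(ROWS_BY_NODE[node])


def count_global():
    """Simulate a globally planned count: per-node results are combined."""
    return sum(len(rows) for rows in ROWS_BY_NODE.values())


print("dn1 only:", count_on_node("dn1"))
print("all nodes:", count_global())
```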

Note that you cannot use EXECUTE DIRECT with DML or the whole cluster consistency would be broken.
--
Michael