From: Koichi S. <koi...@gm...> - 2012-07-04 09:31:43
2012/7/4 Michael Paquier <mic...@gm...>:
>
>
> On Wed, Jul 4, 2012 at 2:31 PM, Joseph Glanville
> <jos...@or...> wrote:
>>
>> Hey guys,
>>
>> This is more of a feature request/question regarding how HA could be
>> implemented with PostgreXC in the future.
>>
>> Could it be possible to have a composite table type which could
>> replicate to X nodes and distribute to Y nodes in such a way that
>> at least X copies of every row are maintained but the table is sharded
>> across Y data nodes?
>
> The answer is yes. It is possible.
>>
>>
>> For example, in a cluster of 6 nodes one would be able to configure a
>> table with REPLICATION 2, DISTRIBUTE 3 BY HASH etc (I can't remember
>> what the table definitions look like) such that the table would be
>> replicated to 2 sets of 3 nodes.
>
> As you seem to be aware, XC currently only supports horizontal partitioning,
> meaning that tuples are present on each node in a complete form with all the
> column data.
> So let's call your feature partial horizontal partitioning... Or
> something like this.

Maybe multiple distribution, for example,

    CREATE TABLE T ... DISTRIBUTE BY HASH(a) TO (node1, node2), REPLICATE TO (node3);

This has another application, such as

    CREATE TABLE T ... DISTRIBUTE BY HASH(a), HASH(b);

In this case, we can choose which distribution is more suitable for a given
SELECT statement. If WHERE T.a = xxx, then we can choose the HASH(a)
distribution, and if WHERE T.b = yyy, then choose HASH(b).

This is not only for HA arrangement but can also enable more sophisticated
query planning.

Vertical partitioning is another issue and could be very challenging.

>
>>
>> This is interesting because it can provide a flexible tradeoff between
>> full write scalability (current PostgresXC distribute) and full read
>> scalability (PostgresXC replicate or other slave solutions).
>> What is most useful about this setup is that, using PostgresXC, this can be
>> maintained transparently without middleware and configured to be fully
>> sync multi-master etc.
>
> Do you have some examples of applications that may require that?
>
>>
>>
>> Are there significant technical challenges to the above and is this
>> something the PostgresXC team would be interested in?
>
> The code would need to be changed in many places and might require some
> effort, especially for cursors and join determination on the planner side.
>
> Another critical choice I see here is related to the preferential strategy
> for node choice.
> For example, in your case, the table is replicated on 3 nodes, and
> distributed on 3 nodes by hash.
> When a simple read query arrives at XC level, we need to make XC aware of
> which set of nodes to choose in priority.
> A simple table-based session parameter could manage that,
> but is it user-friendly?
> A way to choose the set of nodes automatically would be to evaluate, with a
> global system of statistics, the read/write load on each table
> for each set of nodes, and to choose the least-loaded set of nodes at the
> moment the query is planned. This is considerably more complicated,
> however.
> --
> Michael Paquier
> http://michael.otacoo.com
>
> _______________________________________________
> Postgres-xc-general mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
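
To make the multiple-distribution idea in this message a bit more concrete (a table declared with both HASH(a) and HASH(b), with the planner picking whichever distribution matches the WHERE clause), here is a minimal Python sketch. It is not Postgres-XC planner code: the names Distribution, node_for and plan_nodes are hypothetical, zlib.crc32 only stands in for whatever hash function XC actually uses, and the fallback of scanning the first full copy is an assumption made for illustration.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Distribution:
    """One hash distribution of the table (hypothetical model)."""
    column: str             # distribution column, e.g. "a"
    nodes: Tuple[str, ...]  # datanodes holding this copy's shards


def node_for(dist: Distribution, value: object) -> str:
    """Map a key value to one node of the given distribution.

    crc32 is a stand-in hash chosen because it is stable across runs.
    """
    h = zlib.crc32(repr(value).encode("utf-8"))
    return dist.nodes[h % len(dist.nodes)]


def plan_nodes(distributions: List[Distribution],
               equality_preds: Dict[str, object]) -> List[str]:
    """Return the nodes a SELECT has to visit.

    equality_preds maps column name -> constant from "WHERE col = const".
    If some distribution column is constrained, a single node suffices.
    Otherwise fall back to scanning every shard of the first copy.
    """
    for dist in distributions:
        if dist.column in equality_preds:
            return [node_for(dist, equality_preds[dist.column])]
    return list(distributions[0].nodes)


# Table T distributed twice: HASH(a) on node1/node2, HASH(b) on node3/node4.
dists = [
    Distribution("a", ("node1", "node2")),
    Distribution("b", ("node3", "node4")),
]
print(plan_nodes(dists, {"a": 42}))     # one node from the HASH(a) copy
print(plan_nodes(dists, {"b": "yyy"}))  # one node from the HASH(b) copy
print(plan_nodes(dists, {"c": 1}))      # no match: all shards of one copy
```

The sketch only covers reads. A write would have to reach one shard in every distribution, which is where the HA property of keeping more than one copy of each row would come from.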