From: Andrei M. <and...@gm...> - 2012-07-04 16:25:09
2012/7/4 Joseph Glanville <jos...@or...>

> On 4 July 2012 22:36, Andrei Martsinchyk <and...@gm...> wrote:
> > Hi Joseph,
> >
> > If you just need HA you may configure standbys for your datanodes.
> > Postgres-XC supports synchronous and asynchronous replication.
> > There is a pitfall if you try to make your database highly available
> > using combined hash/replicated distribution: if a replicated datanode
> > fails you will not be able to write to the table, because the
> > coordinator cannot update the replica.
> > With standby datanodes you may have your tables replicated and any
> > change will be automatically propagated to the standbys, and the system
> > will keep working if any standby fails. However, you need an external
> > solution to monitor the master datanodes and promote a standby on
> > failover.
>
> I understand this, and it is the reason why I was proposing a future
> movement towards a more integrated HA solution.
> It's more of a personal opinion than one purely grounded in technical
> merit, which is why I enquired as to whether this is compatible with
> XC's goals.
>
> To me this has been a massive thing missing from the open-source
> databases for a really long time and I would be happy to help make it
> happen.
> The biggest barrier has always been the PostgreSQL core team's
> opposition to built-in distributed operation; however, if XC gains
> enough steam this might no longer be an issue.

Definitely, data distribution will become more flexible and HA-related
options will be integrated. I am just pointing out a solution which is
already available.

> > 2012/7/4 Joseph Glanville <jos...@or...>
> >>
> >> On 4 July 2012 17:40, Michael Paquier <mic...@gm...> wrote:
> >> >
> >> > On Wed, Jul 4, 2012 at 2:31 PM, Joseph Glanville
> >> > <jos...@or...> wrote:
> >> >>
> >> >> Hey guys,
> >> >>
> >> >> This is more of a feature request/question regarding how HA could
> >> >> be implemented with Postgres-XC in the future.
> >> >>
> >> >> Could it be possible to have a composite table type which could
> >> >> replicate to X nodes and distribute to Y nodes, in such a way that
> >> >> at least X copies of every row are maintained while the table is
> >> >> sharded across Y datanodes?
> >> >
> >> > The answer is yes. It is possible.
> >> >>
> >> >> For example, in a cluster of 6 nodes one would be able to configure
> >> >> a table with REPLICATION 2, DISTRIBUTE 3 BY HASH etc. (I can't
> >> >> remember what the table definitions look like) such that the table
> >> >> would be replicated to 2 sets of 3 nodes.
> >> >
> >> > As you seem to be aware, XC currently only supports horizontal
> >> > partitioning, meaning that tuples are present on each node in a
> >> > complete form with all the column data.
> >> > So let's call your feature partial horizontal partitioning... or
> >> > something like this.
> >>
> >> I prefer to think of it as true horizontal scaling rather than a form
> >> of partitioning, as partitioning is only part of what it would do. :)
> >>
> >> >>
> >> >> This is interesting because it can provide a flexible tradeoff
> >> >> between full write scalability (the current Postgres-XC distribute)
> >> >> and full read scalability (Postgres-XC replicate or other slave
> >> >> solutions).
> >> >> What is most useful about this setup is that with Postgres-XC it
> >> >> can be maintained transparently, without middleware, and configured
> >> >> to be fully synchronous multi-master etc.
> >> >
> >> > Do you have some example of applications that may require that?
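(Since the exact definitions came up above: today's distribution DDL looks
roughly like the following, and the composite form would need new syntax.
The node names dn1..dn6 are invented, the TO NODE spelling differs between
XC releases, and the REPLICATE clause below is purely hypothetical, so
treat this as a sketch rather than tested DDL.)

    -- what XC supports today: either distribute or replicate a table
    CREATE TABLE orders (id bigint, payload text)
        DISTRIBUTE BY HASH (id) TO NODE (dn1, dn2, dn3);

    CREATE TABLE countries (code char(2), name text)
        DISTRIBUTE BY REPLICATION TO NODE (dn1, dn2, dn3);

    -- hypothetical composite form discussed in this thread:
    -- rows hashed across 3 shards, each shard kept on 2 of the 6 nodes
    -- CREATE TABLE orders (id bigint, payload text)
    --     DISTRIBUTE BY HASH (id) REPLICATE 2
    --     TO NODE (dn1, dn2, dn3, dn4, dn5, dn6);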
> >>
> >> The applications are no different, merely the SLA/uptime requirements,
> >> plus an overall reduction in complexity.
> >>
> >> In the current XC architecture datanodes need to be highly available;
> >> this change would shift the onus of high availability away from the
> >> individual datanodes to the coordinators etc.
> >> The main advantage here is the reduction in moving parts and better
> >> awareness by the query engine of the state of the system.
> >>
> >> In theory, if something along these lines could be implemented, you
> >> could use the REPLICATE/DISTRIBUTE strategy below to keep servicing
> >> queries with up to 3 out of 6 servers down, as long as you lost the
> >> right 3 (the entirety of one DISTRIBUTE cluster).
> >>
> >> As you are probably already aware, current replication solutions for
> >> Postgres don't play nicely with each other or with middleware, as
> >> there hasn't really been any integration up until now (streaming
> >> replication is starting to change this, but its overall integration
> >> with other middleware and applications is still poor).
> >>
> >> >>
> >> >> Are there significant technical challenges to the above, and is
> >> >> this something the Postgres-XC team would be interested in?
> >> >
> >> > The code would need to be changed in many places and might require
> >> > some effort, especially for cursors and join determination on the
> >> > planner side.
> >> >
> >> > Another critical choice I see here is related to the preferential
> >> > strategy for node choice.
> >> > For example, in your case, the table is replicated on 3 nodes and
> >> > distributed on 3 nodes by hash.
> >> > When a simple read query arrives at the XC level, we need to make XC
> >> > aware of which set of nodes to choose in priority.
> >> > A simple table-based session parameter could manage that, but is it
> >> > user-friendly?
> >> > A way to choose the set of nodes automatically would be to evaluate,
> >> > with a global system of statistics, the read/write load on each
> >> > table for each set of nodes, and pick the least-loaded set of nodes
> >> > at the moment the query is planned. This is largely more
> >> > complicated, however.
> >>
> >> This is true. My first thought was quite similar.
> >> Taking the same example as above, where one has a total of 6 datanodes
> >> and 2 sets of a 3-node distributed table, you have 2 nodes that can
> >> service each read request.
> >> One could use a simple round-robin approach to generate the
> >> aforementioned table, which would look somewhat like this:
> >>
> >>        | shard1 | shard2 | shard3
> >>   rep1 |    1   |    2   |    1
> >>   rep2 |    2   |    1   |    2
> >>
> >> This would allow both online and offline optimisation, either by
> >> internal processes or by manual intervention by the operator.
> >> Being so simple, such a table is very easy to autogenerate. For a
> >> HASH-style distribution, read queries should then be uniformly
> >> distributed across the shard replicas.
> >>
> >> Personally I think the more complicated bit is restoring shard
> >> replicas that have left the cluster for some time.
> >> In my opinion it would be best to have XC do a row-based restore,
> >> because XC has a lot of information that could make this process very
> >> fast.
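To make the round-robin mapping above concrete, it could live in a small
catalog table on the coordinators. Everything below (table name, columns,
node names) is invented for illustration; nothing like this exists in XC
today:

    -- hypothetical read-preference catalog: one row per (replica set, shard)
    CREATE TABLE shard_replica_pref (
        rep_set  int,    -- which copy of the data (rep1 = 1, rep2 = 2)
        shard    int,    -- hash bucket (shard1 = 1, shard2 = 2, shard3 = 3)
        datanode text,   -- node holding this shard in this replica set
        prio     int     -- 1 = preferred for reads, 2 = fallback
    );

    INSERT INTO shard_replica_pref VALUES
        (1, 1, 'dn1', 1), (1, 2, 'dn2', 2), (1, 3, 'dn3', 1),
        (2, 1, 'dn4', 2), (2, 2, 'dn5', 1), (2, 3, 'dn6', 2);

    -- preferred replica for a read that hashes to shard 2:
    SELECT datanode
      FROM shard_replica_pref
     WHERE shard = 2
     ORDER BY prio
     LIMIT 1;

A coordinator planning a read for a given shard would take the lowest prio
entry, and the prio values could be flipped online to rebalance load or to
route around a failed replica.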
> >>
> >> Assuming the case where one has many replicas configured (say 3 or
> >> more), the read queries required to bring either an out-of-date
> >> replica back up to speed, or a completely new and empty replica up to
> >> date, could themselves be distributed across the other replica
> >> members.
> >>
> >> > --
> >> > Michael Paquier
> >> > http://michael.otacoo.com
> >>
> >> I am aware that the proposal is quite broad (from a technical
> >> perspective), but what I am trying to ascertain is whether it is in
> >> conflict with the current XC team's vision.
> >>
> >> Joseph.

> Joseph.
>
> --
> CTO | Orion Virtualisation Solutions | www.orionvm.com.au
> Phone: 1300 56 99 52 | Mobile: 0428 754 846

--
Andrei Martsinchyk

StormDB - http://www.stormdb.com
The Database Cloud
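P.S. The per-datanode standby setup mentioned at the top of the thread is
plain PostgreSQL streaming replication configured on each datanode. A
minimal sketch; the hostnames, user and application_name below are made
up, and you would leave synchronous_standby_names empty for asynchronous
replication:

    # postgresql.conf on the master datanode
    wal_level = hot_standby
    max_wal_senders = 3
    synchronous_standby_names = 'dn1_standby'
    # (pg_hba.conf also needs a "replication" entry for the standby host)

    # recovery.conf on the standby datanode
    standby_mode = 'on'
    primary_conninfo = 'host=dn1 port=5432 user=replicator application_name=dn1_standby'

    # postgresql.conf on the standby; allows read-only connections while in recovery
    hot_standby = on

Promotion on failover still has to be driven by an external monitor, as
noted above.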