From: Ashutosh B. <ash...@en...> - 2012-07-06 05:17:55
Hi Joseph,

I have come across this question about supporting a mixed distribution
strategy a few times by now. We have to judge its advantages (taking into
consideration that there can be solutions outside of core XC for the same)
against the effort required to implement and maintain it. If the pain of
1. using third-party/outside-XC solutions and 2. implementing and
maintaining it in core and using it is more or less the same, we may have
to leave it out of the core, at least for the near future. If we take
option 2 and find that using it is as painful as option 1, we have wasted
our effort. In order to judge the second point, we can look at other DBMSs
that have these features and how they perform from various aspects. So the
following questions are relevant:

- Is there another distributed database with a similar mixed distribution
  scheme available?
- How (and how widely) is that feature being used in the field?
- What are the pain points in using such a feature?

On Wed, Jul 4, 2012 at 7:35 PM, Joseph Glanville <jos...@or...> wrote:

> On 4 July 2012 22:36, Andrei Martsinchyk <and...@gm...> wrote:
> > Hi Joseph,
> >
> > If you just need HA you may configure standbys for your datanodes.
> > Postgres-XC supports synchronous and asynchronous replication.
> > There is a pitfall if you try to make your database highly available
> > using a combined hash/replicated distribution. Basically, if a
> > replicated datanode failed you would not be able to write to the
> > table; the coordinator would not be able to update the replica.
> > With standby datanodes you may have your tables replicated and any
> > change will be automatically propagated to the standbys, and the
> > system will work fine if any standby fails. However, you need an
> > external solution to monitor the master datanodes and promote a
> > standby on failover.
>
> I understand this, and it is the reason why I was proposing a future
> movement towards a more integrated HA solution.
> It's more of a personal opinion rather than one purely grounded in
> technical merit, which is why I enquired as to whether this is
> compatible with XC's goals.
>
> To me this has been a massive thing missing from the open source
> databases for a really long time and I would be happy to help make it
> happen.
> The biggest barrier has always been the PostgreSQL core team's
> opposition to built-in distributed operation; however, if XC gains
> enough steam this might no longer be an issue.
>
> > 2012/7/4 Joseph Glanville <jos...@or...>
> >>
> >> On 4 July 2012 17:40, Michael Paquier <mic...@gm...> wrote:
> >> >
> >> > On Wed, Jul 4, 2012 at 2:31 PM, Joseph Glanville
> >> > <jos...@or...> wrote:
> >> >>
> >> >> Hey guys,
> >> >>
> >> >> This is more of a feature request/question regarding how HA could
> >> >> be implemented with Postgres-XC in the future.
> >> >>
> >> >> Could it be possible to have a composite table type which could
> >> >> replicate to X nodes and distribute to Y nodes in such a way that
> >> >> at least X copies of every row are maintained but the table is
> >> >> sharded across Y datanodes?
> >> >
> >> > The answer is yes. It is possible.
> >> >>
> >> >> For example, in a cluster of 6 nodes one would be able to
> >> >> configure a table with REPLICATION 2, DISTRIBUTE 3 BY HASH etc.
> >> >> (I can't remember what the table definitions look like) such that
> >> >> the table would be replicated to 2 sets of 3 nodes.
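To make Joseph's example concrete: the sketch below shows roughly what the
DDL looks like. The first two statements use the distribution syntax
Postgres-XC already has (written from memory, so the TO NODE clause may
differ slightly between versions); the commented-out statement is purely
hypothetical syntax for the proposed composite strategy, with invented node
names dn1..dn6, and is not accepted by any XC release.

    -- What XC supports today: a table is either hash-distributed across
    -- the listed nodes or fully replicated on them, never both.
    -- Node names dn1..dn3 are placeholders.
    CREATE TABLE accounts (id int, balance numeric)
        DISTRIBUTE BY HASH (id) TO NODE (dn1, dn2, dn3);

    CREATE TABLE countries (code char(2), name text)
        DISTRIBUTE BY REPLICATION TO NODE (dn1, dn2, dn3);

    -- Hypothetical syntax for the composite strategy discussed above:
    -- hash-distribute over 3 nodes and keep 2 full copies of each shard,
    -- i.e. 6 datanodes forming 2 replica sets of 3. Not valid in any
    -- existing XC release.
    -- CREATE TABLE accounts (id int, balance numeric)
    --     DISTRIBUTE BY HASH (id) REPLICATION 2
    --     TO NODE (dn1, dn2, dn3), (dn4, dn5, dn6);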
> >> >
> >> > As you seem to be aware of, now XC only supports horizontal
> >> > partitioning, meaning that tuples are present on each node in a
> >> > complete form with all the column data.
> >> > So let's call your feature partial horizontal partitioning... or
> >> > something like this.
> >>
> >> I prefer to think of it as true horizontal scaling rather than a form
> >> of partitioning, as partitioning is only part of what it would do. :)
> >>
> >> >
> >> >> This is interesting because it can provide a flexible trade-off
> >> >> between full write scalability (the current Postgres-XC distribute)
> >> >> and full read scalability (Postgres-XC replicate or other slave
> >> >> solutions).
> >> >> What is most useful about this setup is that, using Postgres-XC,
> >> >> it can be maintained transparently without middleware and
> >> >> configured to be fully sync multi-master etc.
> >> >
> >> > Do you have some example of applications that may require that?
> >>
> >> The applications are no different, merely the SLA/uptime requirements
> >> and an overall reduction in complexity.
> >>
> >> In the current XC architecture datanodes need to be highly available;
> >> this change would shift the onus of high availability away from
> >> individual datanodes to the coordinators etc.
> >> The main advantage here is the reduction in moving parts and better
> >> awareness by the query engine of the state of the system.
> >>
> >> In theory, if something along the lines of this could be implemented,
> >> you could use the below REPLICATE/DISTRIBUTE strategy to maintain the
> >> ability to service queries with up to 3 out of 6 servers down, as
> >> long as you lost the right 3 (the entirety of one DISTRIBUTE cluster).
> >>
> >> As you are probably already aware, current replication solutions for
> >> Postgres don't play nicely with each other or with middleware, as
> >> there hasn't really been any integration up until now (streaming
> >> replication is starting to change this but its overall integration is
> >> still poor with other middleware and applications).
> >>
> >> >
> >> >> Are there significant technical challenges to the above and is
> >> >> this something the Postgres-XC team would be interested in?
> >> >
> >> > The code would need to be changed in many places and might require
> >> > some effort, especially for cursors and join determination on the
> >> > planner side.
> >> >
> >> > Another critical choice I see here is related to the preferential
> >> > strategy for node choice.
> >> > For example, in your case, the table is replicated on 3 nodes, and
> >> > distributed on 3 nodes by hash.
> >> > When a simple read query arrives at the XC level, we need to make
> >> > XC aware of which set of nodes to choose in priority.
> >> > A simple table-based session parameter could manage that, but is it
> >> > user-friendly?
> >> > A way to choose the set of nodes automatically would be to use a
> >> > global system of statistics to evaluate the read/write load on each
> >> > table for each set of nodes, and choose the least loaded set of
> >> > nodes at the moment the query is fired, when planning it. This is
> >> > largely more complicated, however.
> >>
> >> This is true. My first thought was quite similar.
> >> If you have the same example as above, where one has a total of 6
> >> datanodes and 2 sets of a 3-node distribute table, you have 2 nodes
> >> that can service each read request.
> >> One could use a simple round-robin approach to generate the
> >> aforementioned table, which would look somewhat similar to the below:
> >>
> >>        | shard1 | shard2 | shard3
> >>   rep1 |   1    |   2    |   1
> >>   rep2 |   2    |   1    |   2
> >>
> >> This would allow both online and offline optimisation by either
> >> internal processes or manual intervention by the operator.
> >> Being so simple, it is very easy to autogenerate said table. For a
> >> HASH-style distribution, read queries should be uniformly distributed
> >> across shard replicas.
> >>
> >> Personally I think the more complicated bit becomes restoring shard
> >> replicas that have left the cluster for some time.
> >> In my opinion it would be best to have XC do a row-based restore,
> >> because XC has a lot of information that could make this process very
> >> fast.
> >>
> >> Assuming the case where one has many replicas configured (say 3 or
> >> more), the read queries required to bring either an out-of-date
> >> replica up to speed or a completely new and empty replica to an
> >> up-to-date status could be distributed across the other replica
> >> members.
> >>
> >> > --
> >> > Michael Paquier
> >> > http://michael.otacoo.com
> >>
> >> I am aware that the proposal is quite broad (from a technical
> >> perspective), but what I am trying to ascertain is whether it is in
> >> conflict with the current XC team's vision.
> >>
> >> Joseph.
> >>
> >> --
> >> CTO | Orion Virtualisation Solutions | www.orionvm.com.au
> >> Phone: 1300 56 99 52 | Mobile: 0428 754 846
> >
> > --
> > Andrei Martsinchyk
> >
> > StormDB - http://www.stormdb.com
> > The Database Cloud
>
> Joseph.
>
> --
> CTO | Orion Virtualisation Solutions | www.orionvm.com.au
> Phone: 1300 56 99 52 | Mobile: 0428 754 846

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Enterprise Postgres Company
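As a concrete sketch of the per-shard read-routing table Joseph describes
in the thread above, here is one way it could be expressed as plain data.
The table and column names are invented for illustration; no such catalog
exists in Postgres-XC, and a coordinator (or external router) consulting
it is purely speculative.

    -- Hypothetical per-shard read preference for 3 hash shards stored on
    -- 2 replica sets, assigned round-robin so read load is spread evenly.
    CREATE TABLE shard_read_preference (
        shard_id      text PRIMARY KEY,  -- hash shard (a third of the key space)
        first_choice  int  NOT NULL,     -- replica set to try first for reads
        second_choice int  NOT NULL      -- replica set to fall back to
    );

    INSERT INTO shard_read_preference VALUES
        ('shard1', 1, 2),
        ('shard2', 2, 1),
        ('shard3', 1, 2);

An operator or a background process could rewrite such a table when a
replica set falls behind or leaves the cluster, which corresponds to the
"online and offline optimisation" Joseph mentions.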