From: Joseph G. <jos...@or...> - 2012-07-04 14:05:47
On 4 July 2012 22:36, Andrei Martsinchyk <and...@gm...> wrote:
> Hi Joseph,
>
> If you just need HA you may configure standbys for your datanodes.
> Postgres-XC supports synchronous and asynchronous replication.
> There is a pitfall if you try to make your database highly available
> using a combined hash/replicated distribution. Basically, if a replicated
> datanode failed you would not be able to write to the table; the
> coordinator would not be able to update the replica.
> With standby datanodes you may have your tables replicated and any change
> will be automatically propagated to the standbys, and the system will
> work fine if any standby fails. However, you need an external solution to
> monitor the master datanodes and promote a standby on failover.

I understand this, and it is the reason why I was proposing a future move
towards a more integrated HA solution. It is more of a personal opinion than
one purely grounded in technical merit, which is why I enquired as to whether
it is compatible with XC's goals.
To me this has been a massive thing missing from open-source databases for a
really long time, and I would be happy to help make it happen. The biggest
barrier has always been the PostgreSQL core team's opposition to built-in
distributed operation; however, if XC gains enough steam this might no longer
be an issue.

>
> 2012/7/4 Joseph Glanville <jos...@or...>
>>
>> On 4 July 2012 17:40, Michael Paquier <mic...@gm...> wrote:
>> >
>> > On Wed, Jul 4, 2012 at 2:31 PM, Joseph Glanville
>> > <jos...@or...> wrote:
>> >>
>> >> Hey guys,
>> >>
>> >> This is more of a feature request/question regarding how HA could be
>> >> implemented with Postgres-XC in the future.
>> >>
>> >> Would it be possible to have a composite table type which could
>> >> replicate to X nodes and distribute to Y nodes, in such a way that
>> >> at least X copies of every row are maintained but the table is
>> >> sharded across Y datanodes?
>> >
>> > The answer is yes. It is possible.
>>
>> >>
>> >> For example, in a cluster of 6 nodes one would be able to configure a
>> >> table with REPLICATION 2, DISTRIBUTE 3 BY HASH, etc. (I can't remember
>> >> what the table definitions look like) such that the table would be
>> >> replicated to 2 sets of 3 nodes.
>> >
>> > As you seem to be aware, XC currently only supports horizontal
>> > partitioning, meaning that tuples are present on each node in a
>> > complete form with all the column data.
>> > So let's call your feature partial horizontal partitioning... or
>> > something like that.
>>
>> I prefer to think of it as true horizontal scaling rather than a form
>> of partitioning, as partitioning is only part of what it would do. :)
>>
>> >>
>> >> This is interesting because it can provide a flexible trade-off
>> >> between full write scalability (the current Postgres-XC distribute)
>> >> and full read scalability (Postgres-XC replicate or other slave
>> >> solutions).
>> >> What is most useful about this setup is that with Postgres-XC it can
>> >> be maintained transparently, without middleware, and configured to be
>> >> fully synchronous multi-master, etc.
>> >
>> > Do you have some example of applications that may require that?
>>
>> The applications are no different, merely the SLA/uptime requirements,
>> and there is an overall reduction in complexity.
>>
>> In the current XC architecture datanodes need to be highly available;
>> this change would shift the onus of high availability away from
>> individual datanodes to the coordinators etc.
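To make the REPLICATION 2 / DISTRIBUTE 3 BY HASH layout quoted above a little
more concrete, here is a rough sketch. The first two statements use the
existing Postgres-XC distribution syntax as far as I remember it; the third is
purely hypothetical, and its keywords and datanode group names are just
placeholders for the combined layout (2 full copies, each copy hash-sharded
over a group of 3 datanodes):

    -- Existing Postgres-XC distribution styles (from memory, may not be exact):
    CREATE TABLE orders_hash (
        id      bigint,
        payload text
    ) DISTRIBUTE BY HASH(id);          -- rows sharded across the datanodes

    CREATE TABLE countries_rep (
        code text,
        name text
    ) DISTRIBUTE BY REPLICATION;       -- a full copy on every datanode

    -- Hypothetical combined form (placeholder syntax only): keep 2 full
    -- copies of every row, each copy hash-sharded over a group of 3 datanodes.
    CREATE TABLE orders_proposed (
        id      bigint,
        payload text
    ) DISTRIBUTE BY HASH(id)
      REPLICATE TO GROUPS (dn_group_a, dn_group_b);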
>> The main advantage here is the reduction in moving parts and the query
>> engine's better awareness of the state of the system.
>>
>> In theory, if something along these lines could be implemented, you
>> could use the REPLICATE/DISTRIBUTE strategy below to maintain the
>> ability to service queries with up to 3 out of 6 servers down, as long
>> as you lost the right 3 (the entirety of one DISTRIBUTE cluster).
>>
>> As you are probably already aware, current replication solutions for
>> Postgres don't play nicely with each other or with middleware, as there
>> hasn't really been any integration up until now (streaming replication
>> is starting to change this, but its integration with other middleware
>> and applications is still poor).
>>
>> >>
>> >> Are there significant technical challenges to the above, and is this
>> >> something the Postgres-XC team would be interested in?
>> >
>> > The code would need to be changed in many places and might require
>> > some effort, especially for cursors and join determination on the
>> > planner side.
>> >
>> > Another critical choice I see here is related to the preferential
>> > strategy for node choice.
>> > For example, in your case, the table is replicated on 3 nodes, and
>> > distributed on 3 nodes by hash.
>> > When a simple read query arrives at the XC level, we need to make XC
>> > aware of which set of nodes to choose in priority.
>> > A simple table-based session parameter could manage that, but is it
>> > user-friendly?
>> > A way to choose the set of nodes automatically would be to evaluate,
>> > with a global statistics system, the read/write load on each table for
>> > each set of nodes, and choose the least loaded set at the moment the
>> > query is planned. This is far more complicated, however.
>>
>> This is true. My first thought was quite similar.
>> Taking the same example as above, with a total of 6 datanodes forming
>> 2 sets of a 3-node distributed table, you have 2 nodes that can service
>> each read request.
>> One could use a simple round-robin approach to generate the
>> aforementioned table, which would look somewhat like this:
>>
>>        | shard1 | shard2 | shard3
>>   rep1 |   1    |   2    |   1
>>   rep2 |   2    |   1    |   2
>>
>> This would allow both online and offline optimisation, either by
>> internal processes or by manual operator intervention.
>> Being so simple, such a table is very easy to autogenerate. With a
>> HASH-style distribution, read queries should be uniformly distributed
>> across shard replicas.
>>
>> Personally I think the more complicated part is restoring shard
>> replicas that have left the cluster for some time.
>> In my opinion it would be best to have XC do a row-based restore,
>> because XC has a lot of information that could make this process very
>> fast.
>>
>> Assuming many replicas are configured (say 3 or more), the read queries
>> required to bring an out-of-date replica back up to speed, or a
>> completely new and empty replica up to date, could be distributed
>> across the other replica members.
>>
>> > --
>> > Michael Paquier
>> > http://michael.otacoo.com
>>
>> I am aware that the proposal is quite broad (from a technical
>> perspective), but what I am really trying to ascertain is whether it
>> conflicts with the current XC team's vision.
>>
>> Joseph.
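For what it's worth, the routing table above could live as an ordinary table
on the coordinator. A minimal sketch, with all names made up, of how reads for
a given shard could be round-robined across its two replicas:

    -- Illustrative only: one row per (shard, replica) pair, with a priority
    -- column the coordinator rotates to balance reads.
    CREATE TABLE shard_read_priority (
        shard_id  int,
        replica   int,      -- 1 = rep1, 2 = rep2
        node_name text,
        priority  int,      -- lower value is read first
        PRIMARY KEY (shard_id, replica)
    );

    -- Pick the preferred replica for, say, shard 2 ...
    SELECT node_name
      FROM shard_read_priority
     WHERE shard_id = 2
     ORDER BY priority
     LIMIT 1;

    -- ... then swap the priorities so the other replica serves the next read.
    UPDATE shard_read_priority
       SET priority = CASE WHEN priority = 1 THEN 2 ELSE 1 END
     WHERE shard_id = 2;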
>
> --
> Andrei Martsinchyk
>
> StormDB - http://www.stormdb.com
> The Database Cloud

Joseph.

--
CTO | Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 846