Re: [Postgres-xc-general] Pgxc_ctl Primer draft

Is it possible to do the row visibility check with a time-based policy? That is:

1. Each data node maintains a data structure mapping gtid -> (start time, end time). It contains only the gtids that modified data on that data node.
2. Each data node tracks its oldest alive gtid, which need not be updated synchronously.
3. GTM is only responsible for generating the sequence of GTIDs, each of which is just an integer value.
4. The clocks on different data nodes may not be consistent, but I think in some scenarios the application can tolerate the small difference.
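
To make points 1 and 2 concrete, here is a minimal sketch of what I have in mind. It is only an illustration under my own assumptions: the names (`TxnTimes`, `DataNode`, `row_visible`) are hypothetical and not part of Postgres-XC, times come from the node's local clock, aborted transactions are not modelled, and old entries are never pruned.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class TxnTimes:
    start_time: float                 # local time when the gtid first wrote on this node
    end_time: Optional[float] = None  # local commit time; None while still running


class DataNode:
    """Hypothetical per-node bookkeeping for a time-based visibility check."""

    def __init__(self) -> None:
        # gtid -> times, only for gtids that modified data on this node (point 1)
        self.txns: Dict[int, TxnTimes] = {}

    def begin(self, gtid: int, now: float) -> None:
        self.txns[gtid] = TxnTimes(start_time=now)

    def commit(self, gtid: int, now: float) -> None:
        self.txns[gtid].end_time = now

    def committed_by(self, gtid: int, snapshot_time: float) -> bool:
        # A gtid counts as committed only if it ended at or before the snapshot time.
        t = self.txns.get(gtid)
        return t is not None and t.end_time is not None and t.end_time <= snapshot_time

    def row_visible(self, xmin: int, xmax: Optional[int], snapshot_time: float) -> bool:
        # Visible if the creating gtid committed in time and the deleting gtid
        # (if any) had not committed by the snapshot time.
        if not self.committed_by(xmin, snapshot_time):
            return False
        return xmax is None or not self.committed_by(xmax, snapshot_time)

    def oldest_alive_gtid(self) -> Optional[int]:
        # Point 2: the smallest gtid that is still running on this node.
        alive = [g for g, t in self.txns.items() if t.end_time is None]
        return min(alive) if alive else None


node = DataNode()
node.begin(gtid=10, now=1.0)
node.commit(gtid=10, now=2.0)
node.begin(gtid=11, now=3.0)            # gtid 11 is still running
print(node.row_visible(xmin=10, xmax=None, snapshot_time=2.5))
print(node.row_visible(xmin=11, xmax=None, snapshot_time=3.5))
print(node.oldest_alive_gtid())
```

In this sketch the snapshot is just a timestamp, so if a reader carries one snapshot time to several data nodes, any clock difference between nodes directly changes which commits it sees on each node. That is the inconsistency I refer to in point 4.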

Are there any potential issues?

Thanks

> Date: Sun, 4 May 2014 19:36:20 +0900
> From: koi...@gm...
> To: dor...@gm...
> CC: pos...@li...
> Subject: Re: [Postgres-xc-general] Pgxc_ctl Primer draft
> 
> As discussed at last year's XC-day, GTM proxy should be integrated
> as a postmaster backend. Maybe GTM can be, too. Coordinator/Datanode
> can also be integrated into one.
> 
> Apparently, this is the direction we should take. At first, we had
> no such experience to start with. Before version 1.0, we
> determined that the datanode and the coordinator can share the same
> binary. It is true that we started with the idea of providing
> cluster-wide MVCC, and now we have found the next direction.
> 
> With this integration, when we start with only one node, we don't need
> GTM, and it looks identical to standalone PG. When we add a server,
> at present we do need GTM. Only accumulating local transactions in
> the nodes cannot maintain cluster-wide database consistency.
> 
> I'm still investigating how to get rid of GTM. We need to do
> the following:
> 
> 1) Provide cluster-wide MVCC,
> 2) Provide a good means to determine which rows can be vacuumed.
> 
> My current idea is: if we associate every local XID with the root
> transaction (the transaction which the application created), we may be
> able to provide cluster-wide MVCC by calculating a cluster-wide snapshot
> when needed. I don't know how efficient it is, and I don't have a good
> idea of how to determine whether a given row can be vacuumed.
> 
> This is the current situation.
> 
> Hope to have much more input on this.
> 
> Anyway, I hope my draft helps people who are trying to use Postgres-XC.
> 
> Best;
> ---
> Koichi Suzuki
> 
> 
> 2014-05-04 19:05 GMT+09:00 Dorian Hoxha <dor...@gm...>:
> > Probably even the gtm-proxy needs to be merged with datanode+coordinator, from
> > what I read.
> >
> > If you make only local transactions (inside 1 datanode) and don't use global
> > sequences, will there be no traffic to the GTM for that transaction?
> >
> >
> > On Sun, May 4, 2014 at 6:24 AM, Michael Paquier <mic...@gm...>
> > wrote:
> >>
> >> On Sun, May 4, 2014 at 12:59 AM, Dorian Hoxha <dor...@gm...>
> >> wrote:
> >> >> You just need commodity INTEL server runnign Linux.
> >> > Is an INTEL CPU required? If not, can "INTEL" be removed? (Also the
> >> > "running" typo.)
> >> Not really... I agree with what you mean here.
> >>
> >> >> For datawarehouse
> >> >>
> >> >> applications, you may need separate patch which devides complexed query
> >> >> into smaller
> >> >>
> >> >> chunks which run in datanodes in parallel.    StormDB will provide such
> >> >> patche.
> >> >
> >> > Wasn't StormDB bought by another company? Is there an open-source
> >> > alternative? Fix the "patche" typo?
> >> >
> >> > A way to make it simpler is by merging coordinator and datanode into one
> >> > and making it possible for a 'node' to not hold data (be a coordinator
> >> > only), like in Elasticsearch, but you probably already know that.
> >> +1. This would reduce data transfer for cross-node joins where
> >> Coordinator and Datanodes are on separate servers. You could always
> >> have both nodes on the same server with XC as it is now... But that's
> >> double the number of nodes to monitor.
> >>
> >> > What exact things does the gtm-proxy do? For example, a single-row
> >> > insert wouldn't need the gtm (the coordinator just inserts it into the
> >> > right data-node), assuming no sequences, since the gtm is needed for those?
> >> Grouping messages between Coordinator/Datanode and GTM to reduce
> >> package interferences and improve performance.
> >>
> >> > If multiple tables are sharded on the same key (example: user_id), will
> >> > all the rows from the same user in different tables be in the same
> >> > data-node?
> >> Yep. The node choice algorithm is based on the data type of the key.
> >> --
> >> Michael
> >
> >
> 
> _______________________________________________
> Postgres-xc-general mailing list
> Pos...@li...
> https://lists.sourceforge.net/lists/listinfo/postgres-xc-general