From: Koichi S. <koi...@gm...> - 2014-05-27 02:01:35
|
As mentioned, counting the lines of code was done in a very rough way. I'd like to make the comparison a bit more carefully and post a correction. 30k is a very small estimate: at least, gtm has more than 10k lines, pgxc_ctl more than 10k, and there are many others including the pooler, FQS, planner, executor, DDL, and so on.

Regards;
---
Koichi Suzuki


2014-05-27 8:37 GMT+09:00 Michael Paquier <mic...@gm...>:
> On Tue, May 27, 2014 at 3:55 AM, Josh Berkus <jo...@ag...> wrote:
>> All:
>>
>> The meeting notes from last week's meeting are up on the wiki. If you
>> presented slides, please link them from the wiki page.
> With a link it is better:
> https://wiki.postgresql.org/wiki/Pgcon2014PostgresXCmeeting
> I cannot really believe that XC has 470k of diffs with upstream
> Postgres. With my last estimations it was closer to 30k.
> --
> Michael
From: Koichi S. <koi...@gm...> - 2014-05-27 01:59:03
|
Yeah, this is what's needed now. I think we should have some more discussion on it to get to the best approach. I understand that streaming replication is not for this use case and that we need something better. As Josh suggested, I'd like to gather many more ideas and requirements on this.

Thank you;
---
Koichi Suzuki


2014-05-27 9:10 GMT+09:00 Tim Uckun <tim...@gm...>:
> Would it be possible to keep the shards in multiple data nodes so that if
> one data node failed you'd just replace it when you can get around to it.
>
> Elasticsearch uses this strategy.
>
>
> On Sun, May 25, 2014 at 8:04 AM, Koichi Suzuki <koi...@gm...> wrote:
>>
>> At present, XC advises to make a replica with synchronize replication.
>> Pgxc_ctl configures slaves in this way.
>>
>> I understand that this is not for performance and we may need some
>> other solution for this.
>>
>> To begin with, there are a couple of ideas for this.
>>
>> 1. To allow async., when a node fails, fall back whole cluster status
>> to the latest consistent state, such as pointed by a barrier. I can
>> provide some detailed thought on this if interesting.
>>
>> 2. Allow to have a copy of shards to another node at planner/executor
>> level.
>>
>> 3. Implement another replication better for XC using BDR, just for
>> distributed tables, for example.
>>
>> At present, XC uses hash value of the node name to determine each row
>> location for distributed tables. For ideas 2 and 3, we need to add
>> some infrastructure to make this allocation more flexible.
>>
>> Further input is welcome.
>>
>> Thank you.
>> ---
>> Koichi Suzuki
>>
>>
>> 2014-05-24 14:53 GMT-04:00 Josh Berkus <jo...@ag...>:
>> > All:
>> >
>> > So, in addition to the stability issues raised at the PostgresXC summit,
>> > I need to raise something which is a deficiency of both XC and XL and
>> > should be (in my opinion) our #2 priority after stability. And that's
>> > node/shard redundancy.
>> >
>> > Right now, if single node fails, the cluster is frozen for writes ...
>> > and fails some reads ... until the node is replaced by the user from a
>> > replica. It's also not clear that we *can* actually replace a node from
>> > a replica because the replica will be async rep, and thus not at exactly
>> > the same GXID as the rest of the cluster. This makes XC a
>> > low-availability solution.
>> >
>> > The answer for this is to do the same thing which every other clustering
>> > system has done: write each shard to multiple locations. Default would
>> > be two. If each shard is present on two different nodes, then losing a
>> > node is just a performance problem, not a downtime event.
>> >
>> > Thoughts?
>> >
>> > --
>> > Josh Berkus
>> > PostgreSQL Experts Inc.
>> > http://pgexperts.com
From: Tim U. <tim...@gm...> - 2014-05-27 00:10:10
|
Would it be possible to keep the shards in multiple data nodes so that if one data node failed you'd just replace it when you can get around to it. Elasticsearch uses this strategy. On Sun, May 25, 2014 at 8:04 AM, Koichi Suzuki <koi...@gm...>wrote: > At present, XC advises to make a replica with synchronize replication. > Pgxc_ctl configures slaves in this way. > > I understand that this is not for performance and we may need some > other solution for this. > > To begin with, there are a couple of ideas for this. > > 1. To allow async., when a node fails, fall back whole cluster status > to the latest consistent state, such as pointed by a barrier. I can > provide some detailed thought on this if interesting. > > 2. Allow to have a copy of shards to another node at planner/executor > level. > > 3. Implement another replication better for XC using BDR, just for > distributed tables, for example. > > At present, XC uses hash value of the node name to determine each row > location for distributed tables. For ideas 2 and 3, we need to add > some infrastructure to make this allocation more flexible. > > Further input is welcome. > > Thank you. > --- > Koichi Suzuki > > > 2014-05-24 14:53 GMT-04:00 Josh Berkus <jo...@ag...>: > > All: > > > > So, in addition to the stability issues raised at the PostgresXC summit, > > I need to raise something which is a deficiency of both XC and XL and > > should be (in my opinion) our #2 priority after stability. And that's > > node/shard redundancy. > > > > Right now, if single node fails, the cluster is frozen for writes ... > > and fails some reads ... until the node is replaced by the user from a > > replica. It's also not clear that we *can* actually replace a node from > > a replica because the replica will be async rep, and thus not at exactly > > the same GXID as the rest of the cluster. This makes XC a > > low-availability solution. > > > > The answer for this is to do the same thing which every other clustering > > system has done: write each shard to multiple locations. Default would > > be two. If each shard is present on two different nodes, then losing a > > node is just a performance problem, not a downtime event. > > > > Thoughts? > > > > -- > > Josh Berkus > > PostgreSQL Experts Inc. > > http://pgexperts.com > > > > > ------------------------------------------------------------------------------ > > "Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE > > Instantly run your Selenium tests across 300+ browser/OS combos. > > Get unparalleled scalability from the best Selenium testing platform > available > > Simple to use. Nothing to install. Get started now for free." > > http://p.sf.net/sfu/SauceLabs > > _______________________________________________ > > Postgres-xc-general mailing list > > Pos...@li... > > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > > > ------------------------------------------------------------------------------ > "Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE > Instantly run your Selenium tests across 300+ browser/OS combos. > Get unparalleled scalability from the best Selenium testing platform > available > Simple to use. Nothing to install. Get started now for free." > http://p.sf.net/sfu/SauceLabs > _______________________________________________ > Postgres-xc-general mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > |
From: Michael P. <mic...@gm...> - 2014-05-26 23:37:42
|
On Tue, May 27, 2014 at 3:55 AM, Josh Berkus <jo...@ag...> wrote:
> All:
>
> The meeting notes from last week's meeting are up on the wiki. If you
> presented slides, please link them from the wiki page.

With a link it is better:
https://wiki.postgresql.org/wiki/Pgcon2014PostgresXCmeeting

I cannot really believe that XC has 470k of diffs with upstream Postgres. With my last estimations it was closer to 30k.
--
Michael
From: Josh B. <jo...@ag...> - 2014-05-26 18:56:05
|
All:

The meeting notes from last week's meeting are up on the wiki. If you presented slides, please link them from the wiki page.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
From: Josh B. <jo...@ag...> - 2014-05-26 18:03:47
|
On 05/24/2014 04:02 PM, Koichi Suzuki wrote:
> 2014-05-24 17:10 GMT-04:00 Josh Berkus <jo...@ag...>:
>> Koichi,
>>
>>> 1. To allow async., when a node fails, fall back whole cluster status
>>> to the latest consistent state, such as pointed by a barrier. I can
>>> provide some detailed thought on this if interesting.
>>
>> This is not interesting to me. If I have to accept major data loss for
>> a single node failure, then I can use solutions which do not require an GTM.
>>
>>> 2. Allow to have a copy of shards to another node at planner/executor level.
>>
>> Yes. This should be at the executor level, in my opinion. All writes
>> go to all shards and do not complete until they all succeed or the shard
>> times out (and then is marked disabled).
>>
>> What to do with reads is more nuanced. If we load-balance reads, then
>> we are increasing throughput of the cluster. If we send each read to
>> all duplicate shards, then we are improving response times while
>> decreasing throughput. I think that deserves some testing.
>
> Planner needs some more to choose the best one which pushdown is the
> best path to do. Also, to handle conflicting writes in different
> coordinators, we may need to define node priority where to go first.

I guess I'm not clear on how we could have a conflict in the first place?

As far as reads are concerned, I can only see two options:

1) Push read down to one random shard -- maximizes throughput
2) Push read down to all shards, take first response -- minimizes response time

The choice of (1) or (2) is application-specific, so ultimately I think we will need to implement both and allow the user to choose, maybe as a config option. The best functionality, of course, would be to provide the user an option to choose as a userset GUC, so that they could switch on a per-query basis.

>>
>>> 3. Implement another replication better for XC using BDR, just for
>>> distributed tables, for example.
>>
>> This has the same problems as solution #1.
>
> We can implement better synchronization suitable for XC need. Also,
> only shards can be replicated to reduce the overhead. I think this
> has better potential than streaming replication.

Ah, ok. When you get back home, maybe you can sketch out what you're thinking with BDR?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
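To make the per-query switch concrete, here is a minimal SQL sketch. The parameter name, its values, and the xc_proposal prefix are all invented for illustration (the two-part prefix only makes the placeholder settable on a stock build); nothing like it is implemented, and the table and column names are just examples borrowed from the COPY thread elsewhere on this list.

    -- Hypothetical policy switch illustrating options (1) and (2) above.
    SET xc_proposal.replica_read_policy = 'one_random';         -- (1) maximize throughput
    SELECT count(*) FROM detail;

    BEGIN;
    SET LOCAL xc_proposal.replica_read_policy = 'all_fastest';  -- (2) minimize response time
    SELECT * FROM detail WHERE productid = 500;
    COMMIT;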
From: Koichi S. <koi...@gm...> - 2014-05-25 18:04:42
|
I see. Your have good usecase for read-only transactions. Because of the nature of log shipping and sharing/clustering, it is not simple to provide read-only transaction in XC. Two essential reasons: 1. Delay in WAL playback in each slave may be different. It makes providing consistent database view extremely difficult. 2. At present, slave calculates snapshot of the transaction from the WAL. Current code does not allow missing XIDs. There will be memory leak and crash by OOM if there's many missing XIDs in the WAL stream. In XC, it is disabled and the database view may be inconsistent. Please note that this does not affect recovery and promotion. Read only scalability is obviously a candidate of our TODO list. Based on the discussion in PGCon cluster summit, we will open-up our roadmap discussion this week and ask for input of feature/performance/quality discussion at our general/developer mailing list. I hope you can post your usecase and requirement to this discussion. Best Regards; --- Koichi Suzuki 2014-05-25 5:23 GMT-07:00 ZhangJulian <jul...@ou...>: > Hi Koichi, > > Thanks for the explaination. > > We have a system which has some OLTP applications and some REPORT > applications, and the REPORT system can bear some inconsistency. We do not > want the REPORT system influencing the statibility of the OLTP system, so > read/write separation is applicable in this scenario. > > From your advices, I feel I should limit the use cases to a smaller > scenario, for example, even the GUC is enabled, only the SELECT statement > under the autocommit=true could be routed to the slaves. > > From the other mail thread, the community has planned some other approachs > to achive the similar goal. Because our team has no much experience on the > development, we plan to train ourselves by this task even it will not be > adopted by the community. > > We may ask help from you if we have some questions, thanks in advance! > > Thanks > Julian > >> Date: Fri, 23 May 2014 08:52:28 -0400 > >> Subject: Re: [Postgres-xc-general] Do you think the new feature is >> meaningful? - Read/Write Separation >> From: koi...@gm... >> To: jul...@ou... >> CC: pos...@li... >> >> Hello; >> >> Find my reply inline. >> >> Thank you; >> --- >> Koichi Suzuki >> >> >> 2014-05-22 23:49 GMT-04:00 ZhangJulian <jul...@ou...>: >> > Hi Koichi, >> > >> > Thanks for your comments! >> > >> > 1. pgxc_node issue. >> > I feel the pgxc_node in data node have no use currently, right? >> > In current codebase, if a coordinator slave or a data node slave is >> > promoted, ALTER NODE statement must be executed in all the coordinators >> > since the pgxc_node table is a local table in each node. >> > Assume the feature is applied, ALTER NODE/CREATE NODE syntax also will >> > be >> > updated to update the master and slave together. Once a coordinator >> > slave or >> > a data node slave is prompted, the information in other coordinators and >> > the >> > prompted coordinator could be updated as the previous behavior. >> >> I understand your goal and it sounds attractive to have such >> master-slave info inside the database. Maybe we need better idea >> which survives slave promotion. >> >> > >> > 2. the data between the master and the slave may not be consistency >> > every >> > time. >> > It should be a common issue on PostgreSQL, and other non-cluster >> > database >> > platform. There are many users who use the master-slave infrastructure >> > to >> > achive the read/write separation. 
If the user open the feature, they >> > should >> > know the risk. >> >> The use case should be limited. The transaction has to be read only. >> We cannot transfer statement-by-statement basis. Even with >> transaction-basis transfer, we may be suffered from such >> inconsistency. I'm afraid this may not be understood widely. >> Given this, anyway, synchronizing WAL playback in slaves is essential >> issue to provide read transaction on slaves. This was discussed in >> the cluster summit at PGCon this Tuesday. >> >> > >> > 3. the GXID issue. >> > It is too complex to me, I can not understand it thoroughly, :) But if >> > the >> > user can bear the data is not consistency in a short time, it will be >> > not a >> > issue, right? >> >> GXID issue is a solution to provide "atomic visibility" among read and >> write distributed transactions. It is quite new and may need another >> material to understand. Let me prepare a material to describe why it >> is needed and what issues this solves. >> >> This kind of thing is essential to provide consistent database view in >> the cluster. >> >> Please allow me a bit to provide background information on this. >> >> > >> > Thanks >> > Julian >> > >> >> Date: Thu, 22 May 2014 09:21:28 -0400 >> >> Subject: Re: [Postgres-xc-general] Do you think the new feature is >> >> meaningful? - Read/Write Separation >> >> From: koi...@gm... >> >> To: jul...@ou... >> >> CC: pos...@li... >> > >> >> >> >> Hello; >> >> >> >> Thanks a lot for the idea. Please find my comments inline. >> >> >> >> Hope you consider them and more forward to make your goal more >> >> feasible? >> >> >> >> Regards; >> >> --- >> >> Koichi Suzuki >> >> >> >> >> >> 2014-05-22 4:19 GMT-04:00 ZhangJulian <jul...@ou...>: >> >> > Hi All, >> >> > >> >> > I plan to implement it as the below idea. >> >> > 1. add a new GUC to the coordinator configuration, which control the >> >> > READ/WRITE Separation feature is ON/OFF. >> >> > 2. extend the catalog table pgxc_node by adding new columns: >> >> > slave1_host, >> >> > slave1_port, slave1_id, slave2_host, slave2_port, slave2_id. Suppose >> >> > at >> >> > most >> >> > two slaves are supported. >> >> >> >> I don't think this is a good idea. If we have these info in the >> >> catalog, this will all goes to the slave by WAL shipping and will be >> >> used when a slave is promoted. >> >> >> >> This information is not valid when the master is gone and one of the >> >> slaves is promoted. >> >> >> >> > 3. a read only transaction or the front read only part of a >> >> > transaction >> >> > will >> >> > be routed to the slave node to execute. >> >> >> >> In current WAL shipping, we have to expect some difference when a >> >> transaction or statement update is visible to the slave. At least, >> >> even with >> >> synchronized replication, there's slight delay after the WAL record is >> >> received and is replayed to be available to hot standby. There's >> >> even a chance that such update is visible before it is visible at the >> >> master. >> >> >> >> Therefore, usecase of current hot standby should allow such >> >> differences. I don't think your example allow such WAL shipping >> >> replication characteristics. >> >> >> >> Moreover, current hot standby implementation assumes the slave will >> >> receive every XID in updates. It does not assume there could be >> >> missing XIDs and this assumption is used to generate snapshot to >> >> enforce update visibility. >> >> >> >> In XC, because of GXID nature, some GXID may be missing at some slave. 
>> >> >> >> At present, because we didn't have sufficient resources, snapshot >> >> generation is disabled. >> >> >> >> In addition to this, local snapshot may not work. We need global XID >> >> (GXID) to get consistent result. >> >> >> >> By such reasons, it is not simple to provide consistent database view >> >> from slaves. >> >> >> >> I discussed this in PGCon cluster summit this Tuesday and I'm afraid >> >> this need much more analysis, research and design. >> >> >> >> > >> >> > For example, >> >> > begin; >> >> > select ....; ==>go to slave node >> >> > select ....; ==>go to slave node >> >> > insert ....; ==>go to master node >> >> > select ....; ==>go to master node, since it may visit the row >> >> > inserted >> >> > by >> >> > the previous insert statement. >> >> > end; >> >> > >> >> > By this, in a cluster, >> >> > some coordinator can be configured to support the OLTP system, the >> >> > query >> >> > will be routed to the master data nodes; >> >> > others coordinators can be configured to support the report system, >> >> > the >> >> > query will be routed to the slave data nodes; >> >> > the different wordloads will be applied to different coordinators and >> >> > data >> >> > nodes, then they can be isolated. >> >> > >> >> > Do you think if it is valuable? Do you have some advices? >> >> > >> >> > Thanks >> >> > Julian >> >> > >> >> > >> >> > >> >> > ------------------------------------------------------------------------------ >> >> > "Accelerate Dev Cycles with Automated Cross-Browser Testing - For >> >> > FREE >> >> > Instantly run your Selenium tests across 300+ browser/OS combos. >> >> > Get unparalleled scalability from the best Selenium testing platform >> >> > available >> >> > Simple to use. Nothing to install. Get started now for free." >> >> > http://p.sf.net/sfu/SauceLabs >> >> > _______________________________________________ >> >> > Postgres-xc-general mailing list >> >> > Pos...@li... >> >> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general >> >> > |
From: ZhangJulian <jul...@ou...> - 2014-05-25 12:23:59
|
Hi Koichi, Thanks for the explaination. We have a system which has some OLTP applications and some REPORT applications, and the REPORT system can bear some inconsistency. We do not want the REPORT system influencing the statibility of the OLTP system, so read/write separation is applicable in this scenario. From your advices, I feel I should limit the use cases to a smaller scenario, for example, even the GUC is enabled, only the SELECT statement under the autocommit=true could be routed to the slaves. From the other mail thread, the community has planned some other approachs to achive the similar goal. Because our team has no much experience on the development, we plan to train ourselves by this task even it will not be adopted by the community. We may ask help from you if we have some questions, thanks in advance! Thanks Julian > Date: Fri, 23 May 2014 08:52:28 -0400 > Subject: Re: [Postgres-xc-general] Do you think the new feature is meaningful? - Read/Write Separation > From: koi...@gm... > To: jul...@ou... > CC: pos...@li... > > Hello; > > Find my reply inline. > > Thank you; > --- > Koichi Suzuki > > > 2014-05-22 23:49 GMT-04:00 ZhangJulian <jul...@ou...>: > > Hi Koichi, > > > > Thanks for your comments! > > > > 1. pgxc_node issue. > > I feel the pgxc_node in data node have no use currently, right? > > In current codebase, if a coordinator slave or a data node slave is > > promoted, ALTER NODE statement must be executed in all the coordinators > > since the pgxc_node table is a local table in each node. > > Assume the feature is applied, ALTER NODE/CREATE NODE syntax also will be > > updated to update the master and slave together. Once a coordinator slave or > > a data node slave is prompted, the information in other coordinators and the > > prompted coordinator could be updated as the previous behavior. > > I understand your goal and it sounds attractive to have such > master-slave info inside the database. Maybe we need better idea > which survives slave promotion. > > > > > 2. the data between the master and the slave may not be consistency every > > time. > > It should be a common issue on PostgreSQL, and other non-cluster database > > platform. There are many users who use the master-slave infrastructure to > > achive the read/write separation. If the user open the feature, they should > > know the risk. > > The use case should be limited. The transaction has to be read only. > We cannot transfer statement-by-statement basis. Even with > transaction-basis transfer, we may be suffered from such > inconsistency. I'm afraid this may not be understood widely. > Given this, anyway, synchronizing WAL playback in slaves is essential > issue to provide read transaction on slaves. This was discussed in > the cluster summit at PGCon this Tuesday. > > > > > 3. the GXID issue. > > It is too complex to me, I can not understand it thoroughly, :) But if the > > user can bear the data is not consistency in a short time, it will be not a > > issue, right? > > GXID issue is a solution to provide "atomic visibility" among read and > write distributed transactions. It is quite new and may need another > material to understand. Let me prepare a material to describe why it > is needed and what issues this solves. > > This kind of thing is essential to provide consistent database view in > the cluster. > > Please allow me a bit to provide background information on this. 
> > > > > Thanks > > Julian > > > >> Date: Thu, 22 May 2014 09:21:28 -0400 > >> Subject: Re: [Postgres-xc-general] Do you think the new feature is > >> meaningful? - Read/Write Separation > >> From: koi...@gm... > >> To: jul...@ou... > >> CC: pos...@li... > > > >> > >> Hello; > >> > >> Thanks a lot for the idea. Please find my comments inline. > >> > >> Hope you consider them and more forward to make your goal more feasible? > >> > >> Regards; > >> --- > >> Koichi Suzuki > >> > >> > >> 2014-05-22 4:19 GMT-04:00 ZhangJulian <jul...@ou...>: > >> > Hi All, > >> > > >> > I plan to implement it as the below idea. > >> > 1. add a new GUC to the coordinator configuration, which control the > >> > READ/WRITE Separation feature is ON/OFF. > >> > 2. extend the catalog table pgxc_node by adding new columns: > >> > slave1_host, > >> > slave1_port, slave1_id, slave2_host, slave2_port, slave2_id. Suppose at > >> > most > >> > two slaves are supported. > >> > >> I don't think this is a good idea. If we have these info in the > >> catalog, this will all goes to the slave by WAL shipping and will be > >> used when a slave is promoted. > >> > >> This information is not valid when the master is gone and one of the > >> slaves is promoted. > >> > >> > 3. a read only transaction or the front read only part of a transaction > >> > will > >> > be routed to the slave node to execute. > >> > >> In current WAL shipping, we have to expect some difference when a > >> transaction or statement update is visible to the slave. At least, > >> even with > >> synchronized replication, there's slight delay after the WAL record is > >> received and is replayed to be available to hot standby. There's > >> even a chance that such update is visible before it is visible at the > >> master. > >> > >> Therefore, usecase of current hot standby should allow such > >> differences. I don't think your example allow such WAL shipping > >> replication characteristics. > >> > >> Moreover, current hot standby implementation assumes the slave will > >> receive every XID in updates. It does not assume there could be > >> missing XIDs and this assumption is used to generate snapshot to > >> enforce update visibility. > >> > >> In XC, because of GXID nature, some GXID may be missing at some slave. > >> > >> At present, because we didn't have sufficient resources, snapshot > >> generation is disabled. > >> > >> In addition to this, local snapshot may not work. We need global XID > >> (GXID) to get consistent result. > >> > >> By such reasons, it is not simple to provide consistent database view > >> from slaves. > >> > >> I discussed this in PGCon cluster summit this Tuesday and I'm afraid > >> this need much more analysis, research and design. > >> > >> > > >> > For example, > >> > begin; > >> > select ....; ==>go to slave node > >> > select ....; ==>go to slave node > >> > insert ....; ==>go to master node > >> > select ....; ==>go to master node, since it may visit the row inserted > >> > by > >> > the previous insert statement. > >> > end; > >> > > >> > By this, in a cluster, > >> > some coordinator can be configured to support the OLTP system, the query > >> > will be routed to the master data nodes; > >> > others coordinators can be configured to support the report system, the > >> > query will be routed to the slave data nodes; > >> > the different wordloads will be applied to different coordinators and > >> > data > >> > nodes, then they can be isolated. > >> > > >> > Do you think if it is valuable? Do you have some advices? 
> >> > > >> > Thanks > >> > Julian > >> > > >> > > >> > ------------------------------------------------------------------------------ > >> > "Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE > >> > Instantly run your Selenium tests across 300+ browser/OS combos. > >> > Get unparalleled scalability from the best Selenium testing platform > >> > available > >> > Simple to use. Nothing to install. Get started now for free." > >> > http://p.sf.net/sfu/SauceLabs > >> > _______________________________________________ > >> > Postgres-xc-general mailing list > >> > Pos...@li... > >> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > >> > |
From: Koichi S. <koi...@gm...> - 2014-05-24 23:03:06
|
2014-05-24 17:10 GMT-04:00 Josh Berkus <jo...@ag...>:
> Koichi,
>
>> 1. To allow async., when a node fails, fall back whole cluster status
>> to the latest consistent state, such as pointed by a barrier. I can
>> provide some detailed thought on this if interesting.
>
> This is not interesting to me. If I have to accept major data loss for
> a single node failure, then I can use solutions which do not require an GTM.
>
>> 2. Allow to have a copy of shards to another node at planner/executor level.
>
> Yes. This should be at the executor level, in my opinion. All writes
> go to all shards and do not complete until they all succeed or the shard
> times out (and then is marked disabled).
>
> What to do with reads is more nuanced. If we load-balance reads, then
> we are increasing throughput of the cluster. If we send each read to
> all duplicate shards, then we are improving response times while
> decreasing throughput. I think that deserves some testing.

The planner also needs some more work to choose the best path, that is, which pushdown is the best one to do. Also, to handle conflicting writes arriving through different coordinators, we may need to define a node priority that decides where to go first.

>
>> 3. Implement another replication better for XC using BDR, just for
>> distributed tables, for example.
>
> This has the same problems as solution #1.

We can implement better synchronization suited to XC's needs. Also, only shards can be replicated, to reduce the overhead. I think this has better potential than streaming replication.

Regards;
---
Koichi Suzuki

>
>> At present, XC uses hash value of the node name to determine each row
>> location for distributed tables. For ideas 2 and 3, we need to add
>> some infrastructure to make this allocation more flexible.
>
> Yes. We would need a shard ID which is separate from the node name.
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
From: Josh B. <jo...@ag...> - 2014-05-24 21:10:47
|
Koichi,

> 1. To allow async., when a node fails, fall back whole cluster status
> to the latest consistent state, such as pointed by a barrier. I can
> provide some detailed thought on this if interesting.

This is not interesting to me. If I have to accept major data loss for a single node failure, then I can use solutions which do not require a GTM.

> 2. Allow to have a copy of shards to another node at planner/executor level.

Yes. This should be at the executor level, in my opinion. All writes go to all shards and do not complete until they all succeed or the shard times out (and then is marked disabled).

What to do with reads is more nuanced. If we load-balance reads, then we are increasing throughput of the cluster. If we send each read to all duplicate shards, then we are improving response times while decreasing throughput. I think that deserves some testing.

> 3. Implement another replication better for XC using BDR, just for
> distributed tables, for example.

This has the same problems as solution #1.

> At present, XC uses hash value of the node name to determine each row
> location for distributed tables. For ideas 2 and 3, we need to add
> some infrastructure to make this allocation more flexible.

Yes. We would need a shard ID which is separate from the node name.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
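As a rough illustration of that indirection, here is a minimal SQL sketch of a shard map that separates the hash bucket from the node it lives on. The table, its columns, and the use of hashtext() are assumptions made only for the sketch, not XC's actual catalog or hash function; the node names are reused from the EXPLAIN output elsewhere on this list.

    -- Illustrative only: a shard map decoupling hash buckets from node names,
    -- so one bucket can be stored on two nodes (a primary and a copy).
    CREATE TABLE shard_map (
        shard_id   int,
        node_name  text,
        is_primary boolean
    );
    INSERT INTO shard_map VALUES
        (0, 'node_pgs01_1', true), (0, 'node_pgs02_1', false),
        (1, 'node_pgs01_2', true), (1, 'node_pgs02_2', false);

    -- A row's bucket would then be derived from its distribution column rather
    -- than from a node name (hashtext() stands in for the real hash here):
    SELECT abs(hashtext('500')) % 2 AS shard_id;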
From: Koichi S. <koi...@gm...> - 2014-05-24 20:04:56
|
At present, XC advises making a replica with synchronous replication; pgxc_ctl configures slaves in this way.

I understand that this is not for performance, and we may need some other solution for this.

To begin with, there are a couple of ideas:

1. Allow async replication; when a node fails, fall back the whole cluster status to the latest consistent state, such as one pointed to by a barrier. I can provide some detailed thoughts on this if there is interest.

2. Allow a copy of each shard to be kept on another node, handled at the planner/executor level.

3. Implement another replication mechanism better suited to XC, using BDR, just for distributed tables, for example.

At present, XC uses the hash value of the node name to determine each row's location for distributed tables. For ideas 2 and 3, we need to add some infrastructure to make this allocation more flexible.

Further input is welcome.

Thank you.
---
Koichi Suzuki


2014-05-24 14:53 GMT-04:00 Josh Berkus <jo...@ag...>:
> All:
>
> So, in addition to the stability issues raised at the PostgresXC summit,
> I need to raise something which is a deficiency of both XC and XL and
> should be (in my opinion) our #2 priority after stability. And that's
> node/shard redundancy.
>
> Right now, if single node fails, the cluster is frozen for writes ...
> and fails some reads ... until the node is replaced by the user from a
> replica. It's also not clear that we *can* actually replace a node from
> a replica because the replica will be async rep, and thus not at exactly
> the same GXID as the rest of the cluster. This makes XC a
> low-availability solution.
>
> The answer for this is to do the same thing which every other clustering
> system has done: write each shard to multiple locations. Default would
> be two. If each shard is present on two different nodes, then losing a
> node is just a performance problem, not a downtime event.
>
> Thoughts?
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com
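For idea 1, a rough sketch using the existing barrier facility might look like the following; the barrier name is made up, and this assumes the standard CREATE BARRIER command together with the recovery_target_barrier recovery setting rather than any new mechanism.

    -- Record a cluster-wide consistent point in every node's WAL (name is arbitrary):
    CREATE BARRIER 'before_async_window';

    -- If a node is later lost, each surviving node could be rolled back to that
    -- consistent point with point-in-time recovery, e.g. in recovery.conf:
    --   recovery_target_barrier = 'before_async_window'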
From: Josh B. <jo...@ag...> - 2014-05-24 18:53:13
|
All:

So, in addition to the stability issues raised at the PostgresXC summit, I need to raise something which is a deficiency of both XC and XL and should be (in my opinion) our #2 priority after stability. And that's node/shard redundancy.

Right now, if a single node fails, the cluster is frozen for writes ... and fails some reads ... until the node is replaced by the user from a replica. It's also not clear that we *can* actually replace a node from a replica, because the replica will be async rep and thus not at exactly the same GXID as the rest of the cluster. This makes XC a low-availability solution.

The answer for this is to do the same thing which every other clustering system has done: write each shard to multiple locations. Default would be two. If each shard is present on two different nodes, then losing a node is just a performance problem, not a downtime event.

Thoughts?

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
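For context, the table-distribution choices that exist today are all-or-nothing per table; roughly, and assuming the 1.2-era DDL (the table, column, and node names below are only examples):

    -- What exists today, approximately: a table is either hash-distributed with
    -- each row on exactly one node, or fully replicated to every listed node.
    CREATE TABLE detail_distributed (id int, payload text)
        DISTRIBUTE BY HASH (id)
        TO NODE (node_pgs01_1, node_pgs01_2, node_pgs02_1, node_pgs02_2);

    CREATE TABLE detail_replicated (id int, payload text)
        DISTRIBUTE BY REPLICATION
        TO NODE (node_pgs01_1, node_pgs02_1);

    -- The proposal above is the middle ground: each hash shard stored on (by
    -- default) two of those nodes, so a single node failure does not block writes.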
From: Koichi S. <koi...@gm...> - 2014-05-24 16:09:36
|
Sorry for the late response. What version are you using? 1.2.1 includes several fix for GTM connectivity. --- Koichi Suzuki 2014-05-22 12:28 GMT-04:00 Aaron Jackson <aja...@re...>: > Given my past experience with compiler issues, I'm a little hesitant to even > report this. That said, I have a three node cluster, each with a > coordinator, data node and gtm proxy. I have a standalone gtm instance > without a slave. Often, when I come in after the servers have been up for a > while, I'm greeted with a variety of issues. > > There are several warnings in the coordinator and data node logs, that read > "Do not have a GTM snapshot available" - I've discarded these as mostly > benign for the moment. > > The coordinator is much worse.. > > 30770 | 2014-05-22 15:53:06 UTC | ERROR: current transaction is aborted, > commands ignored until end of transaction block > 30770 | 2014-05-22 15:53:06 UTC | STATEMENT: DISCARD ALL > 4560 | 2014-05-22 15:54:30 UTC | LOG: failed to connect to Datanode > 4560 | 2014-05-22 15:54:30 UTC | LOG: failed to connect to Datanode > 4560 | 2014-05-22 15:54:30 UTC | WARNING: can not connect to node 16390 > 30808 | 2014-05-22 15:54:30 UTC | LOG: failed to acquire connections > > > Usually, I reset the coordinator and datanode and the world is happy again. > However, it makes me somewhat concerned that I'm seeing these kinds of > failures on a daily basis. I wouldn't rule out the compiler again as it's > been the reason for previous failures, but has anyone else seen anything > like this?? > > Aaron > > ------------------------------------------------------------------------------ > "Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE > Instantly run your Selenium tests across 300+ browser/OS combos. > Get unparalleled scalability from the best Selenium testing platform > available > Simple to use. Nothing to install. Get started now for free." > http://p.sf.net/sfu/SauceLabs > _______________________________________________ > Postgres-xc-general mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > |
From: Koichi S. <koi...@gm...> - 2014-05-23 12:52:35
|
Hello; Find my reply inline. Thank you; --- Koichi Suzuki 2014-05-22 23:49 GMT-04:00 ZhangJulian <jul...@ou...>: > Hi Koichi, > > Thanks for your comments! > > 1. pgxc_node issue. > I feel the pgxc_node in data node have no use currently, right? > In current codebase, if a coordinator slave or a data node slave is > promoted, ALTER NODE statement must be executed in all the coordinators > since the pgxc_node table is a local table in each node. > Assume the feature is applied, ALTER NODE/CREATE NODE syntax also will be > updated to update the master and slave together. Once a coordinator slave or > a data node slave is prompted, the information in other coordinators and the > prompted coordinator could be updated as the previous behavior. I understand your goal and it sounds attractive to have such master-slave info inside the database. Maybe we need better idea which survives slave promotion. > > 2. the data between the master and the slave may not be consistency every > time. > It should be a common issue on PostgreSQL, and other non-cluster database > platform. There are many users who use the master-slave infrastructure to > achive the read/write separation. If the user open the feature, they should > know the risk. The use case should be limited. The transaction has to be read only. We cannot transfer statement-by-statement basis. Even with transaction-basis transfer, we may be suffered from such inconsistency. I'm afraid this may not be understood widely. Given this, anyway, synchronizing WAL playback in slaves is essential issue to provide read transaction on slaves. This was discussed in the cluster summit at PGCon this Tuesday. > > 3. the GXID issue. > It is too complex to me, I can not understand it thoroughly, :) But if the > user can bear the data is not consistency in a short time, it will be not a > issue, right? GXID issue is a solution to provide "atomic visibility" among read and write distributed transactions. It is quite new and may need another material to understand. Let me prepare a material to describe why it is needed and what issues this solves. This kind of thing is essential to provide consistent database view in the cluster. Please allow me a bit to provide background information on this. > > Thanks > Julian > >> Date: Thu, 22 May 2014 09:21:28 -0400 >> Subject: Re: [Postgres-xc-general] Do you think the new feature is >> meaningful? - Read/Write Separation >> From: koi...@gm... >> To: jul...@ou... >> CC: pos...@li... > >> >> Hello; >> >> Thanks a lot for the idea. Please find my comments inline. >> >> Hope you consider them and more forward to make your goal more feasible? >> >> Regards; >> --- >> Koichi Suzuki >> >> >> 2014-05-22 4:19 GMT-04:00 ZhangJulian <jul...@ou...>: >> > Hi All, >> > >> > I plan to implement it as the below idea. >> > 1. add a new GUC to the coordinator configuration, which control the >> > READ/WRITE Separation feature is ON/OFF. >> > 2. extend the catalog table pgxc_node by adding new columns: >> > slave1_host, >> > slave1_port, slave1_id, slave2_host, slave2_port, slave2_id. Suppose at >> > most >> > two slaves are supported. >> >> I don't think this is a good idea. If we have these info in the >> catalog, this will all goes to the slave by WAL shipping and will be >> used when a slave is promoted. >> >> This information is not valid when the master is gone and one of the >> slaves is promoted. >> >> > 3. a read only transaction or the front read only part of a transaction >> > will >> > be routed to the slave node to execute. 
>> >> In current WAL shipping, we have to expect some difference when a >> transaction or statement update is visible to the slave. At least, >> even with >> synchronized replication, there's slight delay after the WAL record is >> received and is replayed to be available to hot standby. There's >> even a chance that such update is visible before it is visible at the >> master. >> >> Therefore, usecase of current hot standby should allow such >> differences. I don't think your example allow such WAL shipping >> replication characteristics. >> >> Moreover, current hot standby implementation assumes the slave will >> receive every XID in updates. It does not assume there could be >> missing XIDs and this assumption is used to generate snapshot to >> enforce update visibility. >> >> In XC, because of GXID nature, some GXID may be missing at some slave. >> >> At present, because we didn't have sufficient resources, snapshot >> generation is disabled. >> >> In addition to this, local snapshot may not work. We need global XID >> (GXID) to get consistent result. >> >> By such reasons, it is not simple to provide consistent database view >> from slaves. >> >> I discussed this in PGCon cluster summit this Tuesday and I'm afraid >> this need much more analysis, research and design. >> >> > >> > For example, >> > begin; >> > select ....; ==>go to slave node >> > select ....; ==>go to slave node >> > insert ....; ==>go to master node >> > select ....; ==>go to master node, since it may visit the row inserted >> > by >> > the previous insert statement. >> > end; >> > >> > By this, in a cluster, >> > some coordinator can be configured to support the OLTP system, the query >> > will be routed to the master data nodes; >> > others coordinators can be configured to support the report system, the >> > query will be routed to the slave data nodes; >> > the different wordloads will be applied to different coordinators and >> > data >> > nodes, then they can be isolated. >> > >> > Do you think if it is valuable? Do you have some advices? >> > >> > Thanks >> > Julian >> > >> > >> > ------------------------------------------------------------------------------ >> > "Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE >> > Instantly run your Selenium tests across 300+ browser/OS combos. >> > Get unparalleled scalability from the best Selenium testing platform >> > available >> > Simple to use. Nothing to install. Get started now for free." >> > http://p.sf.net/sfu/SauceLabs >> > _______________________________________________ >> > Postgres-xc-general mailing list >> > Pos...@li... >> > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general >> > |
From: ZhangJulian <jul...@ou...> - 2014-05-23 03:49:11
|
Hi Koichi, Thanks for your comments! 1. pgxc_node issue. I feel the pgxc_node in data node have no use currently, right? In current codebase, if a coordinator slave or a data node slave is promoted, ALTER NODE statement must be executed in all the coordinators since the pgxc_node table is a local table in each node. Assume the feature is applied, ALTER NODE/CREATE NODE syntax also will be updated to update the master and slave together. Once a coordinator slave or a data node slave is prompted, the information in other coordinators and the prompted coordinator could be updated as the previous behavior. 2. the data between the master and the slave may not be consistency every time. It should be a common issue on PostgreSQL, and other non-cluster database platform. There are many users who use the master-slave infrastructure to achive the read/write separation. If the user open the feature, they should know the risk. 3. the GXID issue. It is too complex to me, I can not understand it thoroughly, :) But if the user can bear the data is not consistency in a short time, it will be not a issue, right? Thanks Julian > Date: Thu, 22 May 2014 09:21:28 -0400 > Subject: Re: [Postgres-xc-general] Do you think the new feature is meaningful? - Read/Write Separation > From: koi...@gm... > To: jul...@ou... > CC: pos...@li... > > Hello; > > Thanks a lot for the idea. Please find my comments inline. > > Hope you consider them and more forward to make your goal more feasible? > > Regards; > --- > Koichi Suzuki > > > 2014-05-22 4:19 GMT-04:00 ZhangJulian <jul...@ou...>: > > Hi All, > > > > I plan to implement it as the below idea. > > 1. add a new GUC to the coordinator configuration, which control the > > READ/WRITE Separation feature is ON/OFF. > > 2. extend the catalog table pgxc_node by adding new columns: slave1_host, > > slave1_port, slave1_id, slave2_host, slave2_port, slave2_id. Suppose at most > > two slaves are supported. > > I don't think this is a good idea. If we have these info in the > catalog, this will all goes to the slave by WAL shipping and will be > used when a slave is promoted. > > This information is not valid when the master is gone and one of the > slaves is promoted. > > > 3. a read only transaction or the front read only part of a transaction will > > be routed to the slave node to execute. > > In current WAL shipping, we have to expect some difference when a > transaction or statement update is visible to the slave. At least, > even with > synchronized replication, there's slight delay after the WAL record is > received and is replayed to be available to hot standby. There's > even a chance that such update is visible before it is visible at the > master. > > Therefore, usecase of current hot standby should allow such > differences. I don't think your example allow such WAL shipping > replication characteristics. > > Moreover, current hot standby implementation assumes the slave will > receive every XID in updates. It does not assume there could be > missing XIDs and this assumption is used to generate snapshot to > enforce update visibility. > > In XC, because of GXID nature, some GXID may be missing at some slave. > > At present, because we didn't have sufficient resources, snapshot > generation is disabled. > > In addition to this, local snapshot may not work. We need global XID > (GXID) to get consistent result. > > By such reasons, it is not simple to provide consistent database view > from slaves. 
> > I discussed this in PGCon cluster summit this Tuesday and I'm afraid > this need much more analysis, research and design. > > > > > For example, > > begin; > > select ....; ==>go to slave node > > select ....; ==>go to slave node > > insert ....; ==>go to master node > > select ....; ==>go to master node, since it may visit the row inserted by > > the previous insert statement. > > end; > > > > By this, in a cluster, > > some coordinator can be configured to support the OLTP system, the query > > will be routed to the master data nodes; > > others coordinators can be configured to support the report system, the > > query will be routed to the slave data nodes; > > the different wordloads will be applied to different coordinators and data > > nodes, then they can be isolated. > > > > Do you think if it is valuable? Do you have some advices? > > > > Thanks > > Julian > > > > ------------------------------------------------------------------------------ > > "Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE > > Instantly run your Selenium tests across 300+ browser/OS combos. > > Get unparalleled scalability from the best Selenium testing platform > > available > > Simple to use. Nothing to install. Get started now for free." > > http://p.sf.net/sfu/SauceLabs > > _______________________________________________ > > Postgres-xc-general mailing list > > Pos...@li... > > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > > |
From: Aaron J. <aja...@re...> - 2014-05-22 16:28:27
|
Given my past experience with compiler issues, I'm a little hesitant to even report this. That said, I have a three node cluster, each with a coordinator, data node and gtm proxy. I have a standalone gtm instance without a slave. Often, when I come in after the servers have been up for a while, I'm greeted with a variety of issues.

There are several warnings in the coordinator and data node logs, that read "Do not have a GTM snapshot available" - I've discarded these as mostly benign for the moment.

The coordinator is much worse..

30770 | 2014-05-22 15:53:06 UTC | ERROR: current transaction is aborted, commands ignored until end of transaction block
30770 | 2014-05-22 15:53:06 UTC | STATEMENT: DISCARD ALL
4560 | 2014-05-22 15:54:30 UTC | LOG: failed to connect to Datanode
4560 | 2014-05-22 15:54:30 UTC | LOG: failed to connect to Datanode
4560 | 2014-05-22 15:54:30 UTC | WARNING: can not connect to node 16390
30808 | 2014-05-22 15:54:30 UTC | LOG: failed to acquire connections

Usually, I reset the coordinator and datanode and the world is happy again. However, it makes me somewhat concerned that I'm seeing these kinds of failures on a daily basis. I wouldn't rule out the compiler again as it's been the reason for previous failures, but has anyone else seen anything like this??

Aaron
From: Koichi S. <koi...@gm...> - 2014-05-22 13:21:35
|
Hello; Thanks a lot for the idea. Please find my comments inline. Hope you consider them and more forward to make your goal more feasible? Regards; --- Koichi Suzuki 2014-05-22 4:19 GMT-04:00 ZhangJulian <jul...@ou...>: > Hi All, > > I plan to implement it as the below idea. > 1. add a new GUC to the coordinator configuration, which control the > READ/WRITE Separation feature is ON/OFF. > 2. extend the catalog table pgxc_node by adding new columns: slave1_host, > slave1_port, slave1_id, slave2_host, slave2_port, slave2_id. Suppose at most > two slaves are supported. I don't think this is a good idea. If we have these info in the catalog, this will all goes to the slave by WAL shipping and will be used when a slave is promoted. This information is not valid when the master is gone and one of the slaves is promoted. > 3. a read only transaction or the front read only part of a transaction will > be routed to the slave node to execute. In current WAL shipping, we have to expect some difference when a transaction or statement update is visible to the slave. At least, even with synchronized replication, there's slight delay after the WAL record is received and is replayed to be available to hot standby. There's even a chance that such update is visible before it is visible at the master. Therefore, usecase of current hot standby should allow such differences. I don't think your example allow such WAL shipping replication characteristics. Moreover, current hot standby implementation assumes the slave will receive every XID in updates. It does not assume there could be missing XIDs and this assumption is used to generate snapshot to enforce update visibility. In XC, because of GXID nature, some GXID may be missing at some slave. At present, because we didn't have sufficient resources, snapshot generation is disabled. In addition to this, local snapshot may not work. We need global XID (GXID) to get consistent result. By such reasons, it is not simple to provide consistent database view from slaves. I discussed this in PGCon cluster summit this Tuesday and I'm afraid this need much more analysis, research and design. > > For example, > begin; > select ....; ==>go to slave node > select ....; ==>go to slave node > insert ....; ==>go to master node > select ....; ==>go to master node, since it may visit the row inserted by > the previous insert statement. > end; > > By this, in a cluster, > some coordinator can be configured to support the OLTP system, the query > will be routed to the master data nodes; > others coordinators can be configured to support the report system, the > query will be routed to the slave data nodes; > the different wordloads will be applied to different coordinators and data > nodes, then they can be isolated. > > Do you think if it is valuable? Do you have some advices? > > Thanks > Julian > > ------------------------------------------------------------------------------ > "Accelerate Dev Cycles with Automated Cross-Browser Testing - For FREE > Instantly run your Selenium tests across 300+ browser/OS combos. > Get unparalleled scalability from the best Selenium testing platform > available > Simple to use. Nothing to install. Get started now for free." > http://p.sf.net/sfu/SauceLabs > _______________________________________________ > Postgres-xc-general mailing list > Pos...@li... > https://lists.sourceforge.net/lists/listinfo/postgres-xc-general > |
From: ZhangJulian <jul...@ou...> - 2014-05-22 08:19:43
|
Hi All,

I plan to implement it as the idea below.

1. Add a new GUC to the coordinator configuration, which controls whether the READ/WRITE separation feature is ON or OFF.
2. Extend the catalog table pgxc_node by adding new columns: slave1_host, slave1_port, slave1_id, slave2_host, slave2_port, slave2_id. Suppose at most two slaves are supported.
3. A read-only transaction, or the leading read-only part of a transaction, will be routed to the slave node to execute.

For example,

begin;
select ....; ==> go to slave node
select ....; ==> go to slave node
insert ....; ==> go to master node
select ....; ==> go to master node, since it may visit the row inserted by the previous insert statement
end;

By this, in a cluster, some coordinators can be configured to support the OLTP system, where queries will be routed to the master data nodes; other coordinators can be configured to support the report system, where queries will be routed to the slave data nodes. The different workloads are then applied to different coordinators and data nodes, so they can be isolated.

Do you think it is valuable? Do you have any advice?

Thanks
Julian
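For illustration only, the catalog change proposed in item 2 could look roughly like this. The column names come from the proposal; the data types and the standalone table are assumptions, since the real change would extend the pgxc_node system catalog itself rather than create a new table.

    -- Sketch of the columns proposed for pgxc_node (types assumed):
    CREATE TABLE pgxc_node_sketch (
        node_name    name,      -- existing key, one row per master node
        slave1_host  text,
        slave1_port  integer,
        slave1_id    oid,
        slave2_host  text,
        slave2_port  integer,
        slave2_id    oid
    );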
From: Ashutosh B. <ash...@en...> - 2014-05-21 05:00:38
|
Hi Aaron,

From the plan you have given we can see that the INSERT is happening on the coordinator, inserting one row at a time. Although the INSERT statement is prepared on the datanode, each EXECUTE incurs the libpq and execution overheads on the datanode. What should ideally happen is that all the rows to be inserted on the same datanode are stored in some sort of file and bulk inserted (using the COPY protocol). But this is not implemented yet, because

1. We do not have the resources to implement it.
2. We do not have global statistics at the coordinator to estimate how many rows the SELECT is going to return, and hence cannot decide whether to use a single insert at a time (for a small number of rows) or a bulk insert (for a large number of rows).

On Tue, Apr 29, 2014 at 10:08 PM, Aaron Jackson <aja...@re...> wrote:
> When I load data into my table "detail" with COPY, the table loads at a
> rate of about 56k rows per second. The data is distributed on a key to
> achieve this rate of insert (width is 678). However, when I do the
> following:
>
> INSERT INTO DETAIL SELECT 123 as Id, ... FROM DETAIL WHERE Id = 500;
>
> I see the write performance drop to only 2.5K rows per second. The
> total data set loaded from Id = 500 is 200k rows and takes about 7s to load
> into the data coordinator. So, I can attribute almost all of the time
> (about 80 seconds) directly to the insert.
>
> Insert on detail (cost=0.00..10.00 rows=1000 width=678) (actual time=79438.038..79438.038 rows=0 loops=1)
>   Node/s: node_pgs01_1, node_pgs01_2, node_pgs02_1, node_pgs02_2
>   Node expr: productid
>   -> Data Node Scan on detail "_REMOTE_TABLE_QUERY_" (cost=0.00..10.00 rows=1000 width=678) (actual time=3.917..2147.231 rows=200000 loops=1)
>      Node/s: node_pgs01_1, node_pgs01_2, node_pgs02_1, node_pgs02_2
>
> IMO, it seems like an insert like this should approach the performance
> of a COPY. Am I missing something or can you recommend a different
> approach?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
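Until something like that exists, one possible workaround is to stage the rows through COPY yourself so the datanodes take the bulk path. This is only a sketch: the file path is made up, server-side COPY needs superuser rights and a path the server can read, and the comment stands in for the target list that was elided in the original INSERT ... SELECT.

    -- Stage the SELECT's output to a file, then bulk-load it with COPY.
    COPY (SELECT /* same target list as the original INSERT ... SELECT */ *
          FROM detail WHERE id = 500)
        TO '/tmp/detail_500.dat';

    COPY detail FROM '/tmp/detail_500.dat';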
From: Koichi S. <koi...@gm...> - 2014-05-21 01:53:31
|
Thank you Josh. This could be because of the transmission overhead. In XC, the file to copy is first transferred to each target datanode before being handled by the copy handler.

Regards;
---
Koichi Suzuki

2014-05-20 12:23 GMT-04:00 Josh Berkus <jo...@ag...>:
> On 04/29/2014 12:38 PM, Aaron Jackson wrote:
>> When I load data into my table "detail" with COPY, the table loads at a rate of about 56k rows per second. The data is distributed on a key to achieve this rate of insert (width is 678). However, when I do the following:
>>
>> INSERT INTO DETAIL SELECT 123 as Id, ... FROM DETAIL WHERE Id = 500;
>>
>> I see the write performance drop to only 2.5K rows per second. The total data set loaded from Id = 500 is 200k rows and takes about 7s to load into the data coordinator. So, I can attribute almost all of the time (about 80 seconds) directly to the insert.
>>
>> Insert on detail (cost=0.00..10.00 rows=1000 width=678) (actual time=79438.038..79438.038 rows=0 loops=1)
>>   Node/s: node_pgs01_1, node_pgs01_2, node_pgs02_1, node_pgs02_2
>>   Node expr: productid
>>   ->  Data Node Scan on detail "_REMOTE_TABLE_QUERY_" (cost=0.00..10.00 rows=1000 width=678) (actual time=3.917..2147.231 rows=200000 loops=1)
>>       Node/s: node_pgs01_1, node_pgs01_2, node_pgs02_1, node_pgs02_2
>>
>> IMO, it seems like an insert like this should approach the performance of a COPY. Am I missing something or can you recommend a different approach?
>
> Well, COPY is much faster on vanilla Postgres, for a variety of optimization reasons. I don't see why PostgresXC would be different.
>
> Admittedly, the 20X differential is higher than single-node Postgres, so that seems worth investigating.
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com |
From: Josh B. <jo...@ag...> - 2014-05-20 16:23:58
|
On 04/29/2014 12:38 PM, Aaron Jackson wrote:
> When I load data into my table "detail" with COPY, the table loads at a rate of about 56k rows per second. The data is distributed on a key to achieve this rate of insert (width is 678). However, when I do the following:
>
> INSERT INTO DETAIL SELECT 123 as Id, ... FROM DETAIL WHERE Id = 500;
>
> I see the write performance drop to only 2.5K rows per second. The total data set loaded from Id = 500 is 200k rows and takes about 7s to load into the data coordinator. So, I can attribute almost all of the time (about 80 seconds) directly to the insert.
>
> Insert on detail (cost=0.00..10.00 rows=1000 width=678) (actual time=79438.038..79438.038 rows=0 loops=1)
>   Node/s: node_pgs01_1, node_pgs01_2, node_pgs02_1, node_pgs02_2
>   Node expr: productid
>   ->  Data Node Scan on detail "_REMOTE_TABLE_QUERY_" (cost=0.00..10.00 rows=1000 width=678) (actual time=3.917..2147.231 rows=200000 loops=1)
>       Node/s: node_pgs01_1, node_pgs01_2, node_pgs02_1, node_pgs02_2
>
> IMO, it seems like an insert like this should approach the performance of a COPY. Am I missing something or can you recommend a different approach?

Well, COPY is much faster on vanilla Postgres, for a variety of optimization reasons. I don't see why PostgresXC would be different.

Admittedly, the 20X differential is higher than single-node Postgres, so that seems worth investigating.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com |
From: Mason S. <ms...@tr...> - 2014-05-17 16:47:33
|
Hi Dorian,

On Sat, May 17, 2014 at 3:25 AM, Dorian Hoxha <dor...@gm...> wrote:

> Postgres-XL Released: Scale-out PostgreSQL Cluster
>
> http://www.postgresql.org/about/news/1523/
>

Yes, Koichi Suzuki asked me to explain more about Postgres-XL at the Postgres-XC meeting. I will put together some slides to highlight the enhancements and differences. I will also give a very brief update at the Clustering Summit.

> On Sat, May 17, 2014 at 12:33 AM, Josh Berkus <jo...@ag...> wrote:
>
>> All:
>>
>> The PostgresXC Meeting and the Clustering Summit at pgCon next week have
>> been moved to University Center Room 205 in order to accommodate a
>> larger-than-expected group.
>>
>> --
>> Josh Berkus
>> PostgreSQL Experts Inc.
>> http://pgexperts.com
>

--
Mason Sharp
TransLattice - http://www.translattice.com
Distributed and Clustered Database Solutions |
From: Dorian H. <dor...@gm...> - 2014-05-17 07:26:21
|
Postgres-XL Released: Scale-out PostgreSQL Cluster

http://www.postgresql.org/about/news/1523/

On Sat, May 17, 2014 at 12:33 AM, Josh Berkus <jo...@ag...> wrote:

> All:
>
> The PostgresXC Meeting and the Clustering Summit at pgCon next week have
> been moved to University Center Room 205 in order to accommodate a
> larger-than-expected group.
>
> --
> Josh Berkus
> PostgreSQL Experts Inc.
> http://pgexperts.com |
From: Josh B. <jo...@ag...> - 2014-05-16 22:51:16
|
All:

The PostgresXC Meeting and the Clustering Summit at pgCon next week have been moved to University Center Room 205 in order to accommodate a larger-than-expected group.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com |
From: Aaron J. <aja...@re...> - 2014-05-16 14:24:35
|
Yes, all nodes were running fine. However, I didn't account for the fact that I rebuilt the server to add PAM support. When I did that, I reset autoconf and it built with my apparently braindead version of gcc-4.8. So, the issue was resolved once I rebuilt with gcc-4.7 and redistributed the proper binary.

Thanks, sorry for the goose chase.

________________________________
From: Pavan Deolasee [pav...@gm...]
Sent: Monday, May 12, 2014 11:22 AM
To: Aaron Jackson
Cc: pos...@li...
Subject: Re: [Postgres-xc-general] Failed to get pooled connections - overnight

Sent from my iPhone

On 12-May-2014, at 8:51 pm, Aaron Jackson <aja...@re...> wrote:

This morning I came in and connected to my coordinator. I issued a query to count table A and this succeeded. I then asked it to count table B and it failed with "Failed to get pooled connections" - I did an explain on both tables and this is what it told me..

explain select count(*) from tableA;
 Aggregate (cost=2.50..2.51 rows=1 width=0)
   ->  Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..0.00 rows=1000 width=0)
       Node/s: node_pgs01_1, node_pgs02_1, node_pgs03_1
(3 rows)

explain select count(*) from tableB;
 Aggregate (cost=2.50..2.51 rows=1 width=0)
   ->  Data Node Scan on "__REMOTE_GROUP_QUERY__" (cost=0.00..0.00 rows=1000 width=0)
       Node/s: node_pgs01_1, node_pgs01_1, node_pgs02_1, node_pgs01_1, node_pgs03_1, node_pgs01_1
(3 rows)

I've seen this twice now, so I figured that maybe the pool needed to be reloaded... so I issued pgxc_pool_reload() but that did not help. Restarting the coordinator did not change anything, so it appears that the metadata for table B is bad? No nodes have been added or removed since this table was created. Any thoughts?

Did you check if all the nodes are running fine?

Thanks,
Pavan

Aaron |
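For anyone hitting a similar symptom, a duplicated node list like the one shown for tableB above is usually worth checking against the coordinator's node and distribution catalogs before rebuilding anything. A short diagnostic sketch; the pgxc_class column names follow the XC 1.x catalogs and may differ slightly between versions, and 'tableb' is a placeholder for the affected table:

    -- nodes this coordinator knows about
    SELECT node_name, node_type, node_host, node_port FROM pgxc_node;

    -- distribution metadata for the affected table
    SELECT pcrelid::regclass, pclocatortype, nodeoids
      FROM pgxc_class
     WHERE pcrelid = 'tableb'::regclass;

    -- refresh pooled connections after any change to node definitions
    SELECT pgxc_pool_reload();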