|
From: Juned K. <jkh...@gm...> - 2014-04-28 11:22:00
|
Hi all,
I set up PGXC, but today when I tried to run a manual query I got the message below.
I am not able to show table descriptions either.
database=# \d
ERROR: could not access status of transaction 0
DETAIL: Could not write to file "pg_clog/000B" at offset 40960: No space left on device.
Sometimes it shows the table description in the format below.
database=# \dt
<table border="1">
<caption>List of relations</caption>
<tr>
<th align="center">Schema</th>
<th align="center">Name</th>
<th align="center">Type</th>
<th align="center">Owner</th>
</tr>
--
Thanks,
Juned Khan
<http://www.inextrix.com/>
|
|
From: Juned K. <jkh...@gm...> - 2014-04-28 11:31:19
|
In the logs I found this:
DETAIL: The failed archive command was: rsync pg_xlog/000000010000001C0000006A postgres@db01:/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A
rsync: writefd_unbuffered failed to write 4 bytes to socket [sender]: Broken pipe (32)
rsync: write failed on "/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A": No space left on device (28)
rsync error: error in file IO (code 11) at receiver.c(322) [receiver=3.0.9]
rsync: connection unexpectedly closed (28 bytes received so far) [sender]
rsync error: error in rsync protocol data stream (code 12) at io.c(605) [sender=3.0.9]
LOG: archive command failed with exit code 12
--
Thanks,
Juned Khan
iNextrix Technologies Pvt Ltd.
www.inextrix.com
|
|
From: Juned K. <jkh...@gm...> - 2014-05-01 06:00:21
|
Can anyone please help me solve this issue?
|
|
From: 鈴木 幸市 <ko...@in...> - 2014-05-01 06:22:17
|
Could you share the gtm.control file, which is located in the same directory as gtm.conf? I’m afraid this file has been removed or corrupted. Did the cluster work before this problem appeared? If so, did you run into any other issues during operation? Do you really have enough disk space on the datanode? The xlog archive consumes a lot of disk space. This is no different from PostgreSQL.
Did you set archive_cleanup_command in your recovery.conf file to clean up old archive logs? The pg_archivecleanup utility will help.
Regards;
---
Koichi Suzuki
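For readers following along, the recovery.conf setting mentioned above can be sketched as follows. This is only an illustration, not the poster's configuration: the helper name `print_cleanup_setting` is hypothetical, and the archive directory is taken from the rsync log quoted earlier in the thread. The `pg_archivecleanup <archive dir> %r` form is the one described in the PostgreSQL documentation, and the setting only takes effect on a standby that reads recovery.conf.

```shell
# Hypothetical helper: print the recovery.conf line that lets a standby
# prune archived WAL files it no longer needs (%r is expanded by the server
# to the oldest file that must be kept).
print_cleanup_setting() {
    printf "archive_cleanup_command = 'pg_archivecleanup %s %%r'\n" "$1"
}

# Archive directory as it appears in the failed rsync command in this thread.
print_cleanup_setting /home/postgres/pgxc/nodes/datanode_archlog
```

The printed line would go into the standby's recovery.conf. It does not help on a master whose archive target is already full, so it complements rather than replaces checking disk space.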
Postgres-xc-general mailing list
Pos...@li...
https://lists.sourceforge.net/lists/listinfo/postgres-xc-general
|
|
From: Juned K. <jkh...@gm...> - 2014-05-01 06:32:54
|
Hi Koichi,
Yes, my gtm.control file exists, but it contains no data. Yes, it used to work earlier; this problem happened suddenly. There is still 123G of space available on the server. I didn't set up anything in the recovery.conf file manually.
|
|
From: Michael P. <mic...@gm...> - 2014-05-01 07:20:49
|
On Mon, Apr 28, 2014 at 8:31 PM, Juned Khan <jkh...@gm...> wrote:
> "/home/postgres/pgxc/nodes/datanode_archlog/000000010000001C0000006A": No
> space left on device (28)
This means that the partition containing the folder /home/postgres/pgxc/nodes/datanode_archlog/ is full. Try running df and monitor the size of the partition/disk that you are using for your archive files.
--
Michael
|
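A minimal sketch of the df check suggested above, assuming a shell on the host that receives the archive (the failed rsync in this thread writes to db01). The function name `check_fs` is made up for illustration. It reports both block and inode usage, because "No space left on device" can be returned when either one is exhausted, which is one possible reason a disk can look like it still has free gigabytes.

```shell
# Report free space and free inodes for the filesystem that holds a directory.
check_fs() {
    dir=$1
    echo "== blocks: $dir"
    df -P "$dir"     # -P: POSIX output format, one line per filesystem
    echo "== inodes: $dir"
    df -i "$dir"     # -i: inode usage instead of block usage
}

# Example on the current directory; on the archive host one would pass the
# archive directory from the log instead.
check_fs .
```

Running it against /home/postgres/pgxc/nodes/datanode_archlog on db01 and against the datanode directory on db02 would show whether the two paths sit on the same partition and which one is actually full.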
|
From: Juned K. <jkh...@gm...> - 2014-05-01 10:02:29
|
I am not sure, but the datanode_archlog directory may not exist in my case; it may have a different name. However, the size of the /home/postgres/pgxc/nodes/dn_master directory is *42G* and /home/postgres/pgxc/nodes/gtm_pxy is *31G*.
Please suggest.
|
|
From: 鈴木 幸市 <ko...@in...> - 2014-05-01 10:20:58
|
I need to know how you configured your cluster: manually or with pgxc_ctl. If with pgxc_ctl, I need your configuration file.
Also, it’s very strange that gtm.control is missing. If so, the next thing you can do is:
1) Stop all the coordinators/datanodes/gtm, master and slave.
2) Take a file-level backup of all the resources (cold backup).
3) Detach the coordinator/datanode slaves and see if the masters run normally.
4) If so, then reconstruct the slaves.
5) If the masters do not work correctly, you need to examine each node’s pg_control to find the current GXID value and restore gtm.control with this value to restart gtm.
Even if gtm fails, gtm.control is available to indicate a safe restart point. Do you have any idea how it was lost?
Regards;
---
Koichi Suzuki
|
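As a rough illustration of step 5, the next transaction ID recorded in a node's pg_control can be read with the standard pg_controldata tool. The sketch below only extracts that field. The function name `next_xid_from_controldata` is hypothetical, and how the value is then fed back into gtm.control is not covered here and should be checked against the Postgres-XC documentation.

```shell
# Extract the "Latest checkpoint's NextXID" value from pg_controldata output
# read on stdin. In 9.x-era output the value has the form epoch/xid.
next_xid_from_controldata() {
    awk -F: '/NextXID/ { gsub(/^[ \t]+/, "", $2); print $2 }'
}

# Intended use on a stopped node (data directory path taken from this thread):
#   pg_controldata /home/postgres/pgxc/nodes/dn_master | next_xid_from_controldata
```

Comparing the value across all coordinators and datanodes, after the cold backup in step 2, would give the highest ID seen by any node.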
|
From: Juned K. <jkh...@gm...> - 2014-05-01 10:45:26
|
What will be the impact if I remove the files under the pg_xlog directory?
root@db02:/home/postgres/pgxc/nodes/dn_master/pg_xlog# du -sh
33G .
On Thu, May 1, 2014 at 4:05 PM, Juned Khan <jkh...@gm...> wrote:
> As of now I am attaching my pgxc_ctl.conf file; I set up everything
> using pgxc_ctl. This database is in production, so to perform all the steps above
> I need to schedule downtime. When I get a chance I will perform these steps.
> Please tell me if I have misconfigured anything in this file, so I can
> update it at that time.
|
|
From: Michael P. <mic...@gm...> - 2014-05-01 11:22:01
|
On Thu, May 1, 2014 at 7:45 PM, Juned Khan <jkh...@gm...> wrote:
> what will be the impact if i remove files under pg_xlog directory ?
>
> root@db02:/home/postgres/pgxc/nodes/dn_master/pg_xlog# du -sh
> 33G .
Don't do it. This is one of the shortest ways to corrupt your database server. You have something like 2050 WAL files here, which is a lot to replay in case of a crash. You should first try to reduce checkpoint_segments, or force a checkpoint to flush all the pages to disk, so that the existing WAL files can be recycled properly.
--
Michael
|
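The figure of roughly 2050 WAL files follows from the directory size: each segment is 16 MB by default, and du rounds its human-readable total upward. The sketch below reproduces that arithmetic and, as a separate check, counts segments that are still waiting to be archived. In PostgreSQL a segment is not recycled until archive_command has succeeded for it, so the failing rsync reported earlier in the thread is one plausible reason for the build-up. Both function names are hypothetical, and the 16 MB segment size is an assumption about this build.

```shell
# Number of 16 MB WAL segments that fit in a given number of GiB.
# Assumption: the default segment size was not changed at build time.
wal_segments_for_gib() {
    echo $(( $1 * 1024 / 16 ))
}

# Count segments still waiting for archive_command to succeed. PostgreSQL
# marks them with <segment>.ready files under pg_xlog/archive_status.
count_unarchived() {
    find "$1/pg_xlog/archive_status" -name '*.ready' 2>/dev/null | grep -c . || true
}

wal_segments_for_gib 33   # prints 2112, the upper bound for a 33G directory
```

On the datanode master, `count_unarchived /home/postgres/pgxc/nodes/dn_master` would show how many segments are pinned by the archiver. If that number is close to the total, freeing space on the archive target is likely to matter more than tuning checkpoint_segments.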
|
From: 鈴木 幸市 <ko...@in...> - 2014-05-01 10:49:56
|
Do you mean pg_xlog at the slave? Because pg_xlog contains all the redo logs needed for recovery, removing them is quite harmful. Removing pg_xlog at the slave is harmful too. Pgxc_ctl configures the slave to remove unnecessary files under pg_xlog automatically.
Anyway, the problem looks to be caused by the missing gtm.control. Do you have any indication of how it was lost? And what is the current status of your database? Do you have any chance to stop the database, take backups and see what’s going on?
Regards;
---
Koichi Suzuki
|
|
From: Juned K. <jkh...@gm...> - 2014-05-01 11:12:06
|
No, I mean pg_xlog on the datanode master. The datanode slave is already down due to a connection-related problem <http://sourceforge.net/p/postgres-xc/mailman/message/32232987/>.
My current database status is:
PGXC monitor all
Running: gtm master
Running: gtm slave
Running: gtm proxy gtm_pxy1
Running: gtm proxy gtm_pxy2
Running: coordinator master coord1
Running: coordinator master coord2
Running: datanode master datanode1
Not running: datanode slave datanode1
Only the datanode master is up as of now, and I cannot stop my database to check such things. Also, if something goes wrong while restarting components, my whole system will go down, so I cannot take that chance at this moment. I hope you understand.
|
|
From: Michael P. <mic...@gm...> - 2014-05-01 11:23:52
|
On Thu, May 1, 2014 at 7:49 PM, 鈴木 幸市 <ko...@in...> wrote:
> Do you mean pg_xlog at the slave? Because pg_xlog contains all the redo
> logs needed to recovery, it is quite harmful to do so. If you are removing
> pg_xlog at the slave, it is also harmful too. Pgxc_ctl configures the
> slave to remove unnecessary files under pg_xlog automatically.
Not harmful. Catastrophic enough to be haunted for 3 generations by your users if you do it.
--
Michael
|
|
From: 鈴木 幸市 <ko...@in...> - 2014-05-02 01:01:12
|
It is safe only if you’re quite sure that the node will not fail before the removed xlog is no longer needed for recovery.
---
Koichi Suzuki
|
|
From: Michael P. <mic...@gm...> - 2014-05-02 02:04:46
|
On Fri, May 2, 2014 at 10:01 AM, 鈴木 幸市 <ko...@in...> wrote:
> If you’re quite sure that the node does not fail until removed xlog is not used at the recovery.
Juned meant on master AFAIK. Per se:
On Thu, May 1, 2014 at 7:45 PM, Juned Khan <jkh...@gm...> wrote:
> what will be the impact if i remove files under pg_xlog directory ?
>
> root@db02:/home/postgres/pgxc/nodes/dn_master/pg_xlog# du -sh
> 33G .
--
Michael
|
|
From: Masataka S. <pg...@gm...> - 2014-05-02 02:46:09
|
I'm almost sure this issue is caused by a shortage of storage.
Michael suggested the following:
* The partition for XC is full.
* The df command will help you. (df is a basic enough command that all Linux engineers can be assumed to know it.)
You could do many things based on that suggestion. Did you run the df command on every server? Did you check your free space? Did you check the quota settings? Did you determine whether you can write data into the directory?
Also, the disk usage of the GTM proxy is unusual. I guess that the proxy produced an enormous amount of error logs about communication with the GTM. As the GTM is not working well, the logs may be of no use and you can remove them.
After that you need to recover gtm.control from the GTM slave. If the slave's copy is also broken, I think it will be difficult to recover the cluster.
Regards.
|
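Two of the checks listed above, whether a directory is writable and what is taking up space inside it, can be sketched as follows. The function names `can_write` and `largest_entries` are hypothetical, and the probe file name is arbitrary. Nothing here deletes existing files.

```shell
# Return success if a small file can be created and written in the directory.
can_write() {
    probe=$(mktemp "$1/.write_probe.XXXXXX" 2>/dev/null) || return 1
    echo probe > "$probe" 2>/dev/null
    status=$?
    rm -f "$probe"
    return $status
}

# List the largest entries directly under a directory, sizes in KB.
# The second argument is the number of entries to show (default 5).
largest_entries() {
    du -sk "$1"/* 2>/dev/null | sort -rn | head -n "${2:-5}"
}
```

For example, `largest_entries /home/postgres/pgxc/nodes/gtm_pxy` would show whether the 31G reported for the GTM proxy directory is mostly log output, as guessed above, before anything is removed.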
|
From: Juned K. <jkh...@gm...> - 2014-05-02 05:02:07
|
I know the df command, but I am not sure what to remove from which Postgres-XC directory. As I mentioned in an earlier reply, the size of the pg_xlog directory is 33G, and yes, 123G of space is still available on the DB server.
|
|
From: Michael P. <mic...@gm...> - 2014-05-02 05:01:40
|
On Fri, May 2, 2014 at 11:46 AM, Masataka Saito <pg...@gm...> wrote:
> Michael suggested the following:
> * The partition for XC is full.
> * The df command will help you. (df is basic enough that every
> Linux engineer can be assumed to know it.)

Reducing checkpoint_segments would also help. My guess is that Juned set it to a very high value so that the server would not trigger checkpoints during his benchmark tests, avoiding the I/O spikes caused by dirty page flushes.
--
Michael |
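As a rough cross-check of the checkpoint_segments theory: the PostgreSQL documentation for releases before 9.5 states that pg_xlog will normally hold no more than (2 + checkpoint_completion_target) * checkpoint_segments + 1 files, each 16 MB by default. The sketch below evaluates that bound in shell. The helper name max_wal_mb is hypothetical, and the completion target is passed as a percentage so that integer arithmetic is enough.

```shell
#!/bin/sh
# max_wal_mb SEGMENTS TARGET_PERCENT
# Upper bound on the usual pg_xlog size in MB, from the documented formula
# (2 + checkpoint_completion_target) * checkpoint_segments + 1 files,
# assuming the default 16 MB segment size.
max_wal_mb() {
    segments=$1
    target_pct=$2
    echo $(( ((200 + target_pct) * segments + 100) * 16 / 100 ))
}

max_wal_mb 3 50     # defaults: checkpoint_segments = 3, target = 0.5
max_wal_mb 64 90    # a deliberately large setting, for comparison
```

With the defaults the bound is 136 MB, so a 33G pg_xlog is far beyond it. That is consistent either with a very large checkpoint_segments or with segments being retained because archive_command keeps failing, as in the rsync errors quoted earlier in the thread.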
|
From: Juned K. <jkh...@gm...> - 2014-05-02 05:06:38
|
Actually, I haven't set anything related to checkpoint_segments manually.

--
Thanks,
Juned Khan
iNextrix Technologies Pvt Ltd.
www.inextrix.com |
|
From: Juned K. <jkh...@gm...> - 2014-05-02 06:07:43
|
astpp=# select count(*) from accounts;
ERROR: could not access status of transaction 0
DETAIL: Could not write to file "pg_clog/000B" at offset 40960: No space left on device.

postgres@db02:~/pgxc/nodes/dn_master/pg_clog$ ls
0000 0001 0002 0003 0004 0005 0006 0007 0008 0009 000A 000B
000C 000D 000E 000F 0010 0011 0012 0013 0014 0015 0016 0017
0018 0019 001A

Can I remove this file? The directory itself is not very big, though; it is just 6.8M.

--
Thanks,
Juned Khan
iNextrix Technologies Pvt Ltd.
www.inextrix.com |
|
From: 鈴木 幸市 <ko...@in...> - 2014-05-02 07:42:49
|
The size of the directory does not matter. The problem is that the file 000B cannot be extended. I hope you have plenty of space available in the file system; you can check this with the df command. If there is sufficient space available, this error should not occur.

I’m curious why the transaction is writing to the quite old clog file 000B. I’m afraid that, as suggested, gtm.control is corrupted. Could you share the first line of gtm.control, which indicates the GXID value to restart from?

Regards;
---
Koichi Suzuki |
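To illustrate why 000B counts as an old clog file, the sketch below maps a pg_clog file name and byte offset to the range of transaction IDs stored on that page. It assumes the stock PostgreSQL layout (8192-byte pages, 2 status bits per transaction, 32 pages per segment file); a build with a different block size would give other numbers. The helper name clog_xid_range is made up for illustration.

```shell
#!/bin/sh
# clog_xid_range SEGMENT_HEX BYTE_OFFSET
# Prints the first and last transaction ID stored on the clog page that
# starts at BYTE_OFFSET in pg_clog/SEGMENT_HEX.
# Assumed layout: 8192-byte pages holding 32768 transactions each,
# 32 pages per segment file.
clog_xid_range() {
    seg=$(printf '%d' "0x$1")          # segment name is a hex number
    page_in_seg=$(( $2 / 8192 ))       # page index inside the segment
    first=$(( (seg * 32 + page_in_seg) * 32768 ))
    echo "$first $(( first + 32767 ))"
}

clog_xid_range 000B 40960
```

Under these assumptions, the failing write targets the page that covers transaction IDs from 11698176 to 11730943.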
|
From: Juned K. <jkh...@gm...> - 2014-05-02 11:09:49
|
Sorry for the late reply. Here are the first 2 lines of the gtm.control file:

28875757
opensips.public.aliases_id_seq\00 34160 1 1 1 9223372036854775807 f t 1

--
Thanks,
Juned Khan
iNextrix Technologies Pvt Ltd.
www.inextrix.com |
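For comparison with the pg_clog listing, the GXID from gtm.control can be mapped to the clog file that would hold it. This is a sketch under the assumption of the stock layout of 1048576 transactions per segment file (32 pages of 32768); clog_segment_for_xid is a hypothetical helper name.

```shell
#!/bin/sh
# clog_segment_for_xid XID
# Prints the pg_clog file name that holds the status of XID, assuming
# 1048576 transactions per segment file.
clog_segment_for_xid() {
    printf '%04X\n' $(( $1 / 1048576 ))
}

clog_segment_for_xid 28875757   # value from the first line of gtm.control
```

Under that assumption the value falls in segment 001B, right after the newest file 001A in the listing, while 000B belongs to much older transactions.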
|
From: Juned K. <jkh...@gm...> - 2014-05-03 05:32:06
|
Hi Koichi,

I tried to follow the steps of removing the datanode slave and adding it again, but without success. Earlier only the datanode slave was not working, but now the gtm slave, gtm_pxy2, coord2 and dn_slave are not starting up. I have removed the GTM slave and the datanode slave.

Here is the current status of my pgxc cluster:

PGXC monitor all
Running: gtm master
Running: gtm proxy gtm_pxy1
Not running: gtm proxy gtm_pxy2
Running: coordinator master coord1
Not running: coordinator master coord2
Running: datanode master datanode1

Now I have to connect to db02 every time (i.e. PGXC Psql -h db02 -d mydatabase), but it does not allow me to modify the table structure and gives me the error below.

mydatabase=# alter table invoice_summary_data add countrycode text not null default ''::character(1);
ERROR: Failed to get pooled connections
CONTEXT: SQL statement "EXECUTE DIRECT ON (coord2) 'SELECT pg_catalog.pg_try_advisory_xact_lock_shared(65535, 0)'"

So for now I am planning to remove the components which are not working (gtm_pxy2, coord2), so that nothing tries to connect to coord2, which is down.

What will be the impact of this? I don't want to lose database access completely.

Please suggest.

Regards
Juned Khan |
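Before removing anything, it can help to extract the list of stopped components from the monitor output so it can be recorded alongside each step taken. A small sketch, assuming the output format shown in the message above; the helper name not_running is made up for illustration.

```shell
#!/bin/sh
# not_running: read `monitor all` output on stdin and print the name
# (last word) of every component reported as "Not running".
not_running() {
    awk '/^Not running:/ { print $NF }'
}

# Sample input copied from the message above.
not_running <<'EOF'
Running: gtm master
Running: gtm proxy gtm_pxy1
Not running: gtm proxy gtm_pxy2
Running: coordinator master coord1
Not running: coordinator master coord2
Running: datanode master datanode1
EOF
```

The stopped coordinator matters for the ALTER TABLE failure: the CONTEXT line shows the statement trying to reach coord2, which this list reports as down.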
|
From: Koichi S. <koi...@gm...> - 2014-05-04 10:40:32
|
Did you run any tests before you took action on your database? Also, could you share what you did to end up in this situation? Anyway, in this situation, I believe no DDL has been handled except for temporary objects, which are session specific.

So I believe you can restart the GTM proxy gtm_pxy2 (you don't have to reinitialize it as long as you keep gtm_proxy.conf) and coord2.

It is essential to record and share what you did. More importantly, test each step in a non-production environment to see what happens, and review every step before you take it.

Hope to have more info on this.

Best Regards;
---
Koichi Suzuki |
|
From: Juned K. <jkh...@gm...> - 2014-05-06 12:50:23
|
No, I did those steps directly on the live server. I don't know exactly what caused the problem. One day I just tried to run some queries manually and got those errors. I have tried to restart all components several times without success.

Now another problem is that I am not able to take a dump of the database. I am getting this error:

postgres@db02:~$ pg_dump -h db02 --exclude-table=cdrs db -f db.sql
pg_dump: [archiver (db)] query failed: ERROR: Failed to get pooled connections
pg_dump: [archiver (db)] query was: LOCK TABLE public.accounts IN ACCESS SHARE MODE

Please suggest.

--
Thanks,
Juned Khan
iNextrix Technologies Pvt Ltd.
www.inextrix.com |