From: Simone M. <ml...@kn...> - 2009-03-04 17:09:03

Hi,

Is there a way to check for corrupted files in BackupPC's pools?

If my BackupPC server hangs, gets reset, or something else goes wrong, it runs fsck on the filesystems at the next boot. Suppose fsck then puts some files in lost+found. Is there a way to discover which backups should be considered corrupted, or which ones have lost files?

Thanks in advance.
From: Nate <nm...@vi...> - 2009-03-04 23:43:41

We seem to be routinely having an issue where the server BackupPC runs on throws a kernel panic and hard-locks the machine. It's completely random: sometimes it happens daily, sometimes we get a lucky 2-3 weeks without a lockup. I've taken a screenshot and posted it here:

http://locu.net/misc/kernelp_backuppc.jpg

This hardware had been in use for years without so much as a burp before we started using BackupPC, so I suspect this could be an ext3 issue with the multitude of files and ext3's inability to handle them. Prior to using BackupPC, we backed up the same data in flat .tgz files.

System info:
kernel: 2.6.18-92.1.22.el5 #1 SMP Tue Dec 16 11:57:43 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
distro: CentOS 5.2
hw: Athlon 64 3200+, 4GB RAM

Any thoughts?

Thanks,
Nathan
From: Chris R. <cro...@gc...> - 2009-03-05 00:07:32

Nate wrote:
> We seem to be routinely having this issue where the server backuppc
> is running on throws a kernel panic and thus hard locks the machine.
> [...]
> Any thoughts?

What kind of drive controller are you using? What kind of drives?

Further searching finds a similar issue:
http://bugs.centos.org/view.php?id=2321

Chris
From: Stephen V. <ste...@gm...> - 2009-03-05 00:13:43

Yeah, I can't see how this is a BackupPC issue.

On Thu, Mar 5, 2009 at 11:07 AM, Chris Robertson <cro...@gc...> wrote:
> What kind of drive controller are you using? What kind of drives?
>
> Further searching finds a similar issue:
> http://bugs.centos.org/view.php?id=2321

-- 
Best Regards,
Stephen
Sent from: Sydney Nsw Australia.
From: Nate <nm...@vi...> - 2009-03-05 01:09:49

Yeah, I doubt very much it's a BackupPC issue; sorry if I implied that. I'm fairly confident it's an ext3/driver issue. But as this popped up when we began using BackupPC, I suspect it may have to do with the massive quantities of files, and I had hoped another BackupPC user had encountered and solved it.

controller is onboard nvidia, using the sata_nv driver
drives are all Seagate ST31500341AS 1.5TB drives
drives are tethered by LVM to a 4.1TB single ext3 partition

The CentOS bug report also seems to involve LVM, so perhaps there's a tie, but going by the crash output it's ext3 that's crashing.

At 04:13 PM 3/4/2009, Stephen Vaughan wrote:
>Yeah I can't see how this is a backuppc issue
From: Chris R. <cro...@gc...> - 2009-03-05 01:53:00

Nate wrote:
> controller is onboard nvidia, using the sata_nv driver
> drives are all Seagate ST31500341AS 1.5TB drives

Hmmm... http://techreport.com/discussions.x/15863

> drives are tethered by LVM to a 4.1TB single ext3 partition
>
> Seems that centos reported bug also is using LVM, perhaps a tie, but
> it's ext3 that's crashing by the crash output.

It's likely the kernel trying to access the ext3 data and not getting a response. What does "fgrep frozen /var/log/messages" show?

You might try disabling the write cache on the physical drives (hdparm -W0 /dev/sd{a,b,c}), as some have reported that this solves the "stuttering" at the cost of performance.

Chris
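
The two checks suggested in the message above can be wrapped in a small script. This is only a sketch under stated assumptions: the function names `count_frozen` and `write_cache_off_cmds` are made up for illustration, and the second function deliberately prints the `hdparm` commands instead of running them, so nothing changes until the output has been reviewed and run as root.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not part of BackupPC or hdparm.

# Count the lines in a log file that contain the fixed string "frozen".
# grep -F is the current spelling of fgrep; -c prints the number of matches.
count_frozen() {
    grep -F -c frozen "$1"
}

# Print one "hdparm -W0" command per device passed as an argument.
# Nothing is executed here; review the output, then run it as root.
write_cache_off_cmds() {
    for dev in "$@"; do
        printf 'hdparm -W0 %s\n' "$dev"
    done
}
```

After sourcing the file, `count_frozen /var/log/messages` reports how many matching lines the log holds, and `write_cache_off_cmds /dev/sda /dev/sdb /dev/sdc` lists the commands for the three drives named in the message. Running `hdparm -W /dev/sda` without a value reports the current write-cache state. On many drives the setting is not retained across a power cycle, so on a machine that keeps hard-locking it may need to be reapplied at boot for the test to mean anything.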
From: Nate <nm...@vi...> - 2009-03-06 18:42:01

At 05:52 PM 3/4/2009, Chris Robertson wrote:
>It's likely the kernel trying to access the ext3 data and not getting a
>response. What does "fgrep frozen /var/log/messages" show?

Nothing is returned for the current or past messages files.

>You might try disabling the write cache on the physical drives (hdparm
>-W0 /dev/sd{a,b,c}), as some have reported that solves the "stuttering"
>at the cost of performance.

I'm going to give this a shot. I'm concerned about the impact it may have on BackupPC's operations, as disk I/O seems to be the biggest bottleneck anyway. If it solves the problem, I should know in under 2 weeks, provided there are no lockups in that time.
From: Les M. <les...@gm...> - 2009-03-06 19:25:23

Nate wrote:
>> You might try disabling the write cache on the physical drives
>> [...]
>
> I'm going to give this a shot. I'm concerned at the impact it may
> have on backuppc's operations as disk i/o seems to be the biggest
> bottleneck anyways. If it solves the problem I should know in < 2
> weeks hopefully if there are no lockups.

With current hardware prices you are probably better off just replacing anything that doesn't work right, especially if it fails intermittently, because you'll never know when you can trust it. I wasted a lot of time on a machine that had a RAM problem (it took days of memtest86 to show it), and even after it was fixed the problems kept popping back up as the alternate copies of the RAID mirrors were read.

-- 
Les Mikesell
les...@gm...
From: Nate <nm...@vi...> - 2009-06-26 19:27:53

At 04:26 PM 3/4/2009, Nate wrote:
>We seem to be routinely having this issue where the server backuppc
>is running on throws a kernel panic and thus hard locks the machine.
>[...]

FYI, I thought I'd update the list. We tried just about everything, including new hardware. The problem went away completely when we stopped using LVM and switched to plain RAID. We're left with less flexibility in adding new drives, but at least the system doesn't crash.

LVM + ext3 + BackupPC = *boom*
From: Peter W. <pw...@it...> - 2009-06-26 20:40:58

Nate wrote:
> FYI, I thought I'd update the list. We tried about everything,
> including new hardware. The problem went away completely when we
> switched off LVM and just started using RAID. We're left with less
> flexibility in adding new drives, but at least the system doesn't crash.
>
> LVM + ext3 + BackupPC = *boom*

I run LVM + ext3 + BackupPC with CentOS 5.2 and 5.3 and the systems are rock-solid. You might want to consider the thread at http://bugs.centos.org/view.php?id=2321 before declaring you know for sure what the problem is.

Peter