From: Adam G. <mai...@we...> - 2013-10-30 22:05:04
On 31/10/13 07:51, Holger Parplies wrote:
> Yes, as it's basically an extension of "start off fresh" with the addition
> of "keep old history around in parallel". The notable thing is that you
> need to *make sure* you have eliminated the problem for there to be any
> point in starting over.
>
> Aside from that, I would think it might be worth the effort of determining
> whether all hosts are affected or not (though I can't really see why there
> should be a difference between hosts). If some aren't, you could at least
> keep their history.

I suspect at least some hosts OR some backups are correct, or else the OP
wouldn't have anything in the pool.

If you find the problem affects all hosts (the du command discussed
previously will tell you that), then you might want to look at one
individual host like this:

du -sm /backup/pool /backup/cpool /backup/pc/host1/*

This should be a *lot* quicker than the previous du command, and it should
also show minimal disk usage for each of host1's backups. It is quicker
because you are only looking at the set of files for the pool plus one
host.

PS: at this stage, you may want to look at the recent thread regarding disk
caches and caching directory entries instead of file contents. It might
help with all the directory-based searches you are doing to find the
problem. Long term, you may (or may not) want to keep those settings.

Regards,
Adam

--
Adam Goryachev
Website Managers
www.websitemanagers.com.au
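[Editorial note: the du trick works because du charges each hard-linked
inode only to the first path that reaches it within a single invocation, so
scanning pool and cpool first leaves a correctly pooled host with near-zero
apparent usage. A sketch that repeats the check for every host; the /backup
paths are assumptions carried over from this thread:

  # one du invocation per host; the pool is scanned first so pooled inodes
  # are charged to pool/cpool, and an affected host shows a large number
  for h in /backup/pc/*; do
      echo "== $h"
      du -sm /backup/pool /backup/cpool "$h" | tail -n 1
  done
]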
From: Craig O'B. <co...@fi...> - 2013-11-01 13:48:30
> This error shows BackupPC_dump segfault, and pointing to libperl.so
> How do you install your BackupPC? From source or from RPM?

I did a yum install backuppc, which got it from EPEL.

> That tells you it was unmounted cleanly last time, not that everything
> checks out OK. Try it with the -f option to make it do the actual checks.

bash-4.1$ fsck -f /dev/sda1
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sda1: 20074505/2929688576 files (0.3% non-contiguous), 2775975116/2929686016 blocks
bash-4.1$

> What distro are you using? (I use CentOS/RHEL)

CentOS release 6.4

> How many backups are/were you running in parallel?

Typically 4. But when I switched everything to rsync I wanted fulls done on
all the PCs, so I was running up to 8 at a time.

> I think that segfault in a perl process needs to be tracked down before
> expecting anything else to make sense.
> Either bad RAM or mismatching perl libs could break about anything else.

I installed perl-libs with yum as well. A yum info perl-libs tells me it
was installed from the updates repo.

I think what I'm going to try at this point is to delete the bad backups,
reinstall perl from EPEL, and keep an eye on it to see if it balloons up
again. Thanks for all your help!

Regards,
Craig

On Thu, Oct 31, 2013 at 10:09 PM, Sharuzzaman Ahmat Raslan
<sha...@gm...> wrote:
> In my experience, segfault in libraries usually caused by installing it
> from different source.
> [...]
From: Craig O'B. <co...@fi...> - 2013-11-01 14:11:14
> And this would explain why the elements are not being linked properly to
> the pool -- though I would have thought the more likely result would be a
> duplicate pool entry than an unlinked pool entry...
>
> It might be interesting to look for pool chains with the same
> (uncompressed) content and with links < HardLinkMax (typically 31999) to
> see if pool entries are being unnecessarily duplicated.
>
> Try: (cd /var/lib/BackupPC/cpool; find . -type f -links -3198 -name "*_*"
> -exec md5sum {} \;) | sort | uniq -d -w32
>
> Note this will find if there are any unnecessarily duplicated pool chains
> (beyond the base one). Note that to keep it fast and simple I am skipping
> the elements without a suffix... with the assumption being that if there
> are duplicated elements then there will probably be whole chains of
> them...

bash-4.1$ find . -type f -links -3198 -name "*_*" -exec md5sum {} \; | sort | uniq -d -w32
71f4cd3f08af68c2ab20c268d86fa9f3  ./c/9/0/c900361b8dc42b2094d836d43504708a_0
bash-4.1$

Looks like this did find something. What should I do with it?

Regards,
Craig
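[Editorial note: before deciding what to do with a duplicate chain entry,
it can help to see what it actually is. A sketch, with the path taken from
the output above and BackupPC's InstallDir taken from the config posted
later in this thread; BackupPC_zcat ships with BackupPC and decompresses a
pool file to stdout:

  cd /var/lib/BackupPC/cpool/c/9/0
  # inode, link count and size of every member of the chain
  ls -ial c900361b8dc42b2094d836d43504708a*
  # peek at the decompressed content to identify the file
  /usr/share/BackupPC/bin/BackupPC_zcat c900361b8dc42b2094d836d43504708a_0 | head
]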
From: Les M. <les...@gm...> - 2013-11-01 14:54:10
On Fri, Nov 1, 2013 at 8:48 AM, Craig O'Brien <co...@fi...> wrote:
>> This error shows BackupPC_dump segfault, and pointing to libperl.so
>> How do you install your BackupPC? From source or from RPM?
>
> I did a yum install backuppc, which got it from epel

Do you see any other segfaults in your logs (not necessarily just from
backuppc)?

>> How many backups are/were you running in parallel?
>
> Typically 4. But when I switched everything to rsync I wanted fulls done
> on all the pc's so I was running up to 8 at a time.

Most machines would get better overall throughput with a max of 2
concurrent runs (depending on a lot of things, of course...).

>> I think that segfault in a perl process needs to be tracked down before
>> expecting anything else to make sense.
>> Either bad RAM or mismatching perl libs could break about anything else.
>
> I installed perl-libs with yum as well. A yum info perl-libs tells me it
> was installed from the updates repo

Have you installed anything from repos other than the CentOS base and EPEL?
You shouldn't have any trouble with anything from those.

--
Les Mikesell
les...@gm...
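[Editorial note: on a CentOS 6 box like this one, the kernel logs
user-space segfaults to the system log, so Les's question can be answered
with something like the following, run as root:

  # list every recorded segfault, whatever the process
  grep -i segfault /var/log/messages*
  # and the kernel ring buffer, in case messages was already rotated
  dmesg | grep -i segfault
]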
From: Timothy J M. <tm...@ob...> - 2013-11-01 15:15:51
"Craig O'Brien" <co...@fi...> wrote on 11/01/2013 09:48:23 AM:

> > This error shows BackupPC_dump segfault, and pointing to libperl.so
> > How do you install your BackupPC? From source or from RPM?
>
> I did a yum install backuppc, which got it from epel

That's how I do it.

> > That tells you it was unmounted cleanly last time, not that everything
> > checks out OK. Try it with the -f option to make it do the actual
> > checks.
>
> bash-4.1$ fsck -f /dev/sda1
> fsck from util-linux-ng 2.17.2
> e2fsck 1.41.12 (17-May-2010)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> /dev/sda1: 20074505/2929688576 files (0.3% non-contiguous), 2775975116/2929686016 blocks
> bash-4.1$

Good. I think we've eliminated a disk or filesystem issue, and I think
we're pretty comfortable it's a BackupPC corruption issue. It was hard to
tell when your error messages said that it could not seek to a particular
point in a file.

> > What distro are you using? (I use CentOS/RHEL)
>
> CentOS release 6.4

Same here.

> > I think that segfault in a perl process needs to be tracked down before
> > expecting anything else to make sense.
> > Either bad RAM or mismatching perl libs could break about anything
> > else.
>
> I installed perl-libs with yum as well. A yum info perl-libs tells me it
> was installed from the updates repo
>
> I think what I'm going to try at this point is to delete the bad backups,
> reinstall perl from epel, and keep an eye on it to see if it balloons up
> again. Thanks for all your help!

That's a very reasonable, if not very subtle, solution. I think you need
to monitor /var/log/messages for errors that mention backup, and see if
the crash returns. Jeff is (justifiably) worried that the crash caused
your corruption, but it could just as easily be the other way around. Once
you clean up from this, you want to make sure that nothing comes back.

If you've got the time, running memtest for a weekend might be a good
idea, too. The only thing it would cost is the downtime...

Tim Massey
Out of the Box Solutions, Inc.
Creative IT Solutions Made Simple!
http://www.OutOfTheBoxSolutions.com
tm...@ob...
22108 Harper Ave., St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)  Cell: (586)945-8796
From: Timothy J M. <tm...@ob...> - 2013-10-31 18:01:04
Les Mikesell <les...@gm...> wrote on 10/31/2013 01:54:24 PM:

> On Thu, Oct 31, 2013 at 12:33 PM, Craig O'Brien <co...@fi...> wrote:
> >
> >> fsck the filesystem.
> >
> > bash-4.1$ fsck /dev/sda1
> > fsck from util-linux-ng 2.17.2
> > e2fsck 1.41.12 (17-May-2010)
> > /dev/sda1: clean, 20074506/2929688576 files, 2775975889/2929686016 blocks
> > bash-4.1$
>
> That tells you it was unmounted cleanly last time, not that everything
> checks out OK. Try it with the -f option to make it do the actual checks.

Good catch! This should take a long time: 20 minutes to an hour? Maybe
more: the drives are full.

Tim Massey
Out of the Box Solutions, Inc.
Creative IT Solutions Made Simple!
http://www.OutOfTheBoxSolutions.com
tm...@ob...
22108 Harper Ave., St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)  Cell: (586)945-8796
From: Holger P. <wb...@pa...> - 2013-10-31 19:22:21
Hi,

I've spent far too long writing an email and trying to make it make sense
and then discarding it again. Just one thought I want to rescue: the RStmp
file was really *large* (something like 1.5 GB), your backup trees are
really *large* (1.4 TB), your pool FS is really *full* (27.5 GB free).
Running out of space during a backup is a bad idea. Both the RStmp file(s)
will be truncated (though that should trigger a second error when it is
*written*, just before it is read again) and the NewFileList, which would,
in turn, lead to BackupPC_link missing new files it would be supposed to
link into the pool (resulting in unlinked files).

That doesn't explain your situation, but it still might be something to
think about (and we might be seeing one problem on top of and as a result
of another). I agree with Jeffrey - an "Unable to read ..." error *without*
a preceding "Can't write len=... to .../RStmp" sounds like a mismatch
between the file length according to the attrib file and the result of
decompression of the compressed file - probably caused by corruption of
the compressed file (or the attrib file, though unlikely, because the size
is not "way off").

How many backups are/were you running in parallel?

Hope that helps.

Regards,
Holger
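[Editorial note: since an out-of-space condition mid-dump is what truncates
RStmp and NewFileList, one defensive option (a sketch, not something
proposed in this thread) is a free-space guard wired in via
$Conf{DumpPreUserCmd}, with $Conf{UserCmdCheckStatus} set to 1 so a
non-zero exit status aborts the dump. The script name, mount point, and
threshold below are all assumptions:

  #!/bin/sh
  # hypothetical /etc/BackupPC/pre-dump-space-check.sh
  # refuse to start a dump if the pool FS has less than 50 GB free
  min_free_kb=$((50 * 1024 * 1024))
  free_kb=$(df -P /backup | awk 'NR==2 {print $4}')
  if [ "$free_kb" -lt "$min_free_kb" ]; then
      echo "pool filesystem below ${min_free_kb} KB free; refusing to dump" >&2
      exit 1
  fi
  exit 0
]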
From: Les M. <les...@gm...> - 2013-10-31 19:57:45
On Thu, Oct 31, 2013 at 2:20 PM, Holger Parplies <wb...@pa...> wrote:
>
> That doesn't explain your situation, but it still might be something to
> think about (and we might be seeing one problem on top of and as a result
> of another). I agree with Jeffrey - an "Unable to read ..." error
> *without* a preceding "Can't write len=... to .../RStmp" sounds like a
> mismatch between the file length according to the attrib file and the
> result of decompression of the compressed file - probably caused by
> corruption of the compressed file (or the attrib file, though unlikely,
> because the size is not "way off").

I think that segfault in a perl process needs to be tracked down before
expecting anything else to make sense. Either bad RAM or mismatching perl
libs could break about anything else.

--
Les Mikesell
les...@gm...
From: Sharuzzaman A. R. <sha...@gm...> - 2013-11-01 02:10:03
In my experience, a segfault in a library is usually caused by installing
it from a different source.

For example, when I install BackupPC for CentOS, I use the one in the EPEL
repo. I make sure that all the libraries (perl and others) come only from
the CentOS base repo, and not from anywhere else, as installing them from
somewhere else might cause incompatibilities.

In fact, sometimes the EPEL repo also provides a perl library that
conflicts with the CentOS base repo, but I just ignore it and stick to the
base repo.

On Fri, Nov 1, 2013 at 3:57 AM, Les Mikesell <les...@gm...> wrote:
> [...]
> I think that segfault in a perl process needs to be tracked down
> before expecting anything else to make sense. Either bad RAM or
> mismatching perl libs could break about anything else.

--
Sharuzzaman Ahmat Raslan
From: Les M. <les...@gm...> - 2013-10-31 17:54:30
On Thu, Oct 31, 2013 at 12:33 PM, Craig O'Brien <co...@fi...> wrote:
>
>> fsck the filesystem.
>
> bash-4.1$ fsck /dev/sda1
> fsck from util-linux-ng 2.17.2
> e2fsck 1.41.12 (17-May-2010)
> /dev/sda1: clean, 20074506/2929688576 files, 2775975889/2929686016 blocks
> bash-4.1$

That tells you it was unmounted cleanly last time, not that everything
checks out OK. Try it with the -f option to make it do the actual checks.

> I don't suppose this helps give any insight to what happened? Thanks for
> all your help!

I think it is related to that RStmp file that isn't uncompressing
correctly so rsync can merge the changes - I'm not sure what happens after
that error, though, or how to find the compressed file that is probably
causing it.

--
Les Mikesell
les...@gm...
From: <bac...@ko...> - 2013-10-29 19:56:12
Craig O'Brien wrote at about 13:53:31 -0400 on Tuesday, October 29, 2013:
> On the General Server Information page, it says "Pool is 2922.42GB
> comprising 6061942 files and 4369 directories," but our pool file system
> which contains nothing but backuppc and is 11 TB in size is 100% full.
>
> I'm confused how this happened and even ran the BackupPC_nightly script
> by hand, which didn't seem to clear up any space. Judging by the reported
> pool size it should be less than 30% full. I could really use some help.
> Thanks in advance for any ideas on how to go about troubleshooting this.
>
> Regards,
> Craig

I suspect that you have (multiple) backups in the pc tree that are not
linked to the pool...

The following is a quick-and-dirty hack to find non-zero length *files* in
the pc tree that have fewer than 2 hard links (note this will miss files
that are linked to each other but not to the pool):

cd /var/lib/BackupPC/pc/
find */*/* -type f -links -2 -size +0 | grep -v "^[^/]*/[0-9]*/backupInfo"

(The grep pattern has no leading slash because find prints paths relative
to the pc directory; backupInfo is the one non-zero file per backup that is
legitimately unlinked.)
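[Editorial note: to turn that listing into a figure comparable with the
pool-vs-filesystem discrepancy, a sketch that totals the size of the
unlinked files (GNU find assumed, as on the CentOS box in this thread):

  cd /var/lib/BackupPC/pc/
  # sum the bytes held by single-link (i.e. unpooled) files,
  # excluding the legitimately unlinked backupInfo files
  find */*/* -type f -links -2 -size +0 ! -name backupInfo -printf '%s\n' \
      | awk '{s += $1} END {printf "%.1f GB not linked to the pool\n", s/1024/1024/1024}'
]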
From: Tyler J. W. <ty...@to...> - 2013-10-30 14:39:49
On 2013-10-29 17:53, Craig O'Brien wrote:
> On the General Server Information page, it says "Pool is 2922.42GB
> comprising 6061942 files and 4369 directories," but our pool file system
> which contains nothing but backuppc and is 11 TB in size is 100% full.

Did you forget to exclude the path to TopDir (usually /var/lib/backuppc)
from the backup of the BackupPC server itself? I've seen that before.
Heck, I've DONE that before.

Regards,
Tyler

--
"The intellectual is constantly betrayed by his vanity. Godlike he blandly
assumes that he can express everything in words; whereas the things one
loves, lives, and dies for are not, in the last analysis completely
expressible in words."
    -- Anne Morrow Lindbergh
From: Craig O'B. <co...@fi...> - 2013-10-30 15:07:23
> I'm fairly sure:
> du -sm /backup/pool /backup/cpool /backup/pc/*
> It should count all the data under pool and cpool, and there should be
> minimal space used for the pc folders (because it counts the space for
> the first time the inode is seen)

I'm trying that now. I'll report back when it finishes.

> To delete a host, hit the Delete button. For Add, Delete, and
> configuration copy, changes don't take effect until you select Save.
> None of the deleted host's backups will be removed, so if you accidently
> delete a host, simply re-add it. To completely remove a host's backups,
> you need to manually remove the files below /var/lib/backuppc/pc/HOST

This is how I've done it when I've removed a host. I would delete the
/backup/pc/host directory and remove the entry from the /etc/BackupPC/hosts
file.

> I would not stake my life on this, but I would bet a pretty substantial
> amount of money: you did something to break the pooling. Most likely by
> copying backups around. This undid the hardlinks and left you with
> individual copies of the files.

I don't doubt the pooling is probably broken, but I haven't moved any
backups around. For what it's worth, before I switched all the PCs to use
rsync instead of smb a couple months ago, my pool file system was sitting
at 30%. I don't know if that's relevant, but it does seem odd that my
problems seem to have started with that.

> Or punt completely: rebuild the BackupPC server and start over. You could
> do almost as well by confirming that your latest backups *are*
> hardlinking properly and then deleting all of the old backups except
> maybe a copy or two. I would not delete the copies by hand, but rather
> change the configuration to only keep 1 full and 1 incremental. It might
> be a good idea to make some archives to make sure you have a good copy
> somewhere. In any case, once BackupPC has deleted all of the old backups,
> go into your pc directories and make sure that there are indeed only the
> backups listed in the GUI in the folder structure. Then, change the
> incremental and full keep counts back to what they should be and allow it
> to rebuild.

I'll probably have to do that. At this point I'm just trying to add to the
knowledge base and figure out how it went wrong so it doesn't just happen
again.

> My thought was to parse the output of "df /path/to/drive" and confirm
> that it was mounted correctly.

Just in case it helps at all:

bash-4.1$ df -h /backup
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              11T  9.7T   28G 100% /backup
bash-4.1$

> Did you forget to exclude the path to TopDir (usually /var/lib/backuppc)
> from the backup of the BackupPC server itself? I've seen that before.
> Heck, I've DONE that before.

I don't have the server backing itself up. Here's my config file (with
#comment lines removed) just in case that helps at all.
-----------------------------------------
$Conf{ServerHost} = 'localhost';
$Conf{ServerPort} = -1;
$Conf{ServerMesgSecret} = '';
$Conf{MyPath} = '/bin';
$Conf{UmaskMode} = 23;
$Conf{WakeupSchedule} = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23];
$Conf{MaxBackups} = 4;
$Conf{MaxUserBackups} = 4;
$Conf{MaxPendingCmds} = 15;
$Conf{CmdQueueNice} = 10;
$Conf{MaxBackupPCNightlyJobs} = 4;
$Conf{BackupPCNightlyPeriod} = 1;
$Conf{MaxOldLogFiles} = 14;
$Conf{DfPath} = '/bin/df';
$Conf{DfCmd} = '$dfPath $topDir';
$Conf{SplitPath} = '/usr/bin/split';
$Conf{ParPath} = undef;
$Conf{CatPath} = '/bin/cat';
$Conf{GzipPath} = '/bin/gzip';
$Conf{Bzip2Path} = '/usr/bin/bzip2';
$Conf{DfMaxUsagePct} = 95;
$Conf{TrashCleanSleepSec} = 300;
$Conf{DHCPAddressRanges} = [];
$Conf{BackupPCUser} = 'backuppc';
$Conf{TopDir} = '/var/lib/BackupPC/';
$Conf{ConfDir} = '/etc/BackupPC/';
$Conf{LogDir} = '/var/log/BackupPC';
$Conf{InstallDir} = '/usr/share/BackupPC';
$Conf{CgiDir} = '/usr/share/BackupPC/sbin/';
$Conf{BackupPCUserVerify} = '1';
$Conf{HardLinkMax} = 31999;
$Conf{PerlModuleLoad} = undef;
$Conf{ServerInitdPath} = undef;
$Conf{ServerInitdStartCmd} = '';
$Conf{FullPeriod} = 90;
$Conf{IncrPeriod} = 7;
$Conf{FullKeepCnt} = [ 4 ];
$Conf{FullKeepCntMin} = 1;
$Conf{FullAgeMax} = 3000;
$Conf{IncrKeepCnt} = 13;
$Conf{IncrKeepCntMin} = 1;
$Conf{IncrAgeMax} = 45;
$Conf{IncrLevels} = [ 1 ];
$Conf{BackupsDisable} = 0;
$Conf{PartialAgeMax} = 3;
$Conf{IncrFill} = '0';
$Conf{RestoreInfoKeepCnt} = 5;
$Conf{ArchiveInfoKeepCnt} = 5;
$Conf{BackupFilesOnly} = {};
$Conf{BackupFilesExclude} = {};
$Conf{BlackoutBadPingLimit} = 3;
$Conf{BlackoutGoodCnt} = 7;
$Conf{BlackoutPeriods} = [ { 'hourEnd' => '19.5', 'weekDays' => [ 1, 2, 3, 4, 5 ], 'hourBegin' => 7 } ];
$Conf{BackupZeroFilesIsFatal} = '1';
$Conf{XferMethod} = 'rsyncd';
$Conf{XferLogLevel} = 1;
$Conf{ClientCharset} = '';
$Conf{ClientCharsetLegacy} = 'iso-8859-1';
$Conf{SmbShareName} = [ 'C$' ];
$Conf{SmbShareUserName} = '';
$Conf{SmbSharePasswd} = '';
$Conf{SmbClientPath} = '/usr/bin/smbclient';
$Conf{SmbClientFullCmd} = '$smbClientPath \\\\$host\\$shareName $I_option -U $userName -E -d 1 -c tarmode\\ full -Tc$X_option - $fileList';
$Conf{SmbClientIncrCmd} = '$smbClientPath \\\\$host\\$shareName $I_option -U $userName -E -d 1 -c tarmode\\ full -TcN$X_option $timeStampFile - $fileList';
$Conf{SmbClientRestoreCmd} = '$smbClientPath \\\\$host\\$shareName $I_option -U $userName -E -N -d 1 -c tarmode\\ full -Tx -';
$Conf{TarShareName} = [ '/' ];
$Conf{TarClientCmd} = 'sudo $tarPath -c -v -f -C $sharename -totals';
$Conf{TarFullArgs} = '$fileList+';
$Conf{TarIncrArgs} = '--newer=$incrDate+ $fileList+';
$Conf{TarClientRestoreCmd} = 'sudo $tarPath -x -v -f -C $sharename -totals';
$Conf{TarClientPath} = '/bin/gtar';
$Conf{RsyncClientPath} = '/usr/bin/rsync';
$Conf{RsyncClientCmd} = '$sshPath -q -x -l root $host $rsyncPath $argList+';
$Conf{RsyncClientRestoreCmd} = '$sshPath -q -x -l root $host $rsyncPath $argList+';
$Conf{RsyncShareName} = [ 'netbackup' ];
$Conf{RsyncdClientPort} = 873;
$Conf{RsyncdUserName} = ''; #Edited to remove detail
$Conf{RsyncdPasswd} = ''; #Edited to remove detail
$Conf{RsyncdAuthRequired} = '0';
$Conf{RsyncCsumCacheVerifyProb} = '0.01';
$Conf{RsyncArgs} = [ '--numeric-ids', '--perms', '--owner', '--group', '-D', '--links', '--hard-links', '--times', '--block-size=2048', '--recursive' ];
$Conf{RsyncArgsExtra} = [];
$Conf{RsyncRestoreArgs} = [ '--numeric-ids', '--perms', '--owner', '--group', '-D', '--links', '--hard-links', '--times', '--block-size=2048', '--relative', '--ignore-times', '--recursive' ];
$Conf{FtpShareName} = [ '' ];
$Conf{FtpUserName} = '';
$Conf{FtpPasswd} = '';
$Conf{FtpPassive} = '1';
$Conf{FtpBlockSize} = 10240;
$Conf{FtpPort} = 21;
$Conf{FtpTimeout} = 120;
$Conf{FtpFollowSymlinks} = '0';
$Conf{ArchiveDest} = '/tmp';
$Conf{ArchiveComp} = 'gzip';
$Conf{ArchivePar} = '0';
$Conf{ArchiveSplit} = 0;
$Conf{ArchiveClientCmd} = '$Installdir/bin/BackupPC_archiveHost $tarCreatePath $splitpath $parpath $host $backupnumber $compression $compext $splitsize $archiveloc $parfile *';
$Conf{SshPath} = '/usr/bin/ssh';
$Conf{NmbLookupPath} = '/usr/bin/nmblookup';
$Conf{NmbLookupCmd} = '$nmbLookupPath -A $host';
$Conf{NmbLookupFindHostCmd} = '$nmbLookupPath $host';
$Conf{FixedIPNetBiosNameCheck} = '0';
$Conf{PingPath} = '/bin/ping';
$Conf{PingCmd} = '$pingPath -c 1 -w 3 $host';
$Conf{PingMaxMsec} = 80;
$Conf{CompressLevel} = 3;
$Conf{ClientTimeout} = 172000;
$Conf{MaxOldPerPCLogFiles} = 12;
$Conf{DumpPreUserCmd} = undef;
$Conf{DumpPostUserCmd} = undef;
$Conf{DumpPreShareCmd} = undef;
$Conf{DumpPostShareCmd} = undef;
$Conf{RestorePreUserCmd} = undef;
$Conf{RestorePostUserCmd} = undef;
$Conf{ArchivePreUserCmd} = undef;
$Conf{ArchivePostUserCmd} = undef;
$Conf{UserCmdCheckStatus} = '0';
$Conf{ClientNameAlias} = undef;
$Conf{SendmailPath} = '/usr/sbin/sendmail';
$Conf{EMailNotifyMinDays} = '2.5';
$Conf{EMailFromUserName} = 'backuppc';
$Conf{EMailAdminUserName} = ''; #Edited to remove detail
$Conf{EMailUserDestDomain} = ''; #Edited to remove detail
$Conf{EMailNoBackupEverSubj} = undef;
$Conf{EMailNoBackupEverMesg} = undef;
$Conf{EMailNotifyOldBackupDays} = 7;
$Conf{EMailNoBackupRecentSubj} = undef;
$Conf{EMailNoBackupRecentMesg} = undef;
$Conf{EMailNotifyOldOutlookDays} = 5;
$Conf{EMailOutlookBackupSubj} = undef;
$Conf{EMailOutlookBackupMesg} = undef;
$Conf{EMailHeaders} = 'MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
';
$Conf{CgiAdminUserGroup} = 'admin';
$Conf{CgiAdminUsers} = ''; #Edited to remove detail
$Conf{CgiURL} = ''; #Edited to remove detail
$Conf{Language} = 'en';
$Conf{CgiUserHomePageCheck} = '';
$Conf{CgiUserUrlCreate} = 'mailto:%s';
$Conf{CgiDateFormatMMDD} = 1;
$Conf{CgiNavBarAdminAllHosts} = '1';
$Conf{CgiSearchBoxEnable} = '1';
$Conf{CgiNavBarLinks} = [
  { 'link' => '?action=view&type=docs', 'lname' => 'Documentation', 'name' => undef },
  { 'link' => 'http://backuppc.wiki.sourceforge.net', 'lname' => undef, 'name' => 'Wiki' },
  { 'link' => 'http://backuppc.sourceforge.net', 'lname' => undef, 'name' => 'SourceForge' }
];
$Conf{CgiStatusHilightColor} = {
  'Reason_backup_failed' => '#ffcccc',
  'Reason_backup_done' => '#ccffcc',
  'Reason_backup_canceled_by_user' => '#ff9900',
  'Reason_no_ping' => '#ffff99',
  'Disabled_OnlyManualBackups' => '#d1d1d1',
  'Status_backup_in_progress' => '#66cc99',
  'Disabled_AllBackupsDisabled' => '#d1d1d1'
};
$Conf{CgiHeaders} = '<meta http-equiv="pragma" content="no-cache">';
$Conf{CgiImageDir} = '/usr/share/BackupPC/html/';
$Conf{CgiExt2ContentType} = {};
$Conf{CgiImageDirURL} = '/BackupPC/images';
$Conf{CgiCSSFile} = 'BackupPC_stnd.css';
$Conf{CgiUserConfigEditEnable} = '1';
$Conf{CgiUserConfigEdit} = {
  'EMailOutlookBackupSubj' => '1', 'ClientCharset' => '1', 'TarFullArgs' => '1', 'RsyncdPasswd' => '1',
  'FtpBlockSize' => '1', 'IncrKeepCnt' => '1', 'PartialAgeMax' => '1', 'FixedIPNetBiosNameCheck' => '1',
  'SmbShareUserName' => '1', 'EMailFromUserName' => '1', 'ArchivePreUserCmd' => '0', 'PingCmd' => '0',
  'FullAgeMax' => '1', 'FtpUserName' => '1', 'PingMaxMsec' => '1', 'CompressLevel' => '1',
  'DumpPreShareCmd' => '0', 'BackupFilesOnly' => '1', 'EMailNotifyOldBackupDays' => '1', 'EMailAdminUserName' => '1',
  'RsyncCsumCacheVerifyProb' => '1', 'BlackoutPeriods' => '1', 'NmbLookupFindHostCmd' => '0', 'MaxOldPerPCLogFiles' => '1',
  'TarClientCmd' => '0', 'EMailNotifyOldOutlookDays' => '1', 'SmbSharePasswd' => '1', 'SmbClientIncrCmd' => '0',
  'FullKeepCntMin' => '1', 'RsyncArgs' => '1', 'FtpFollowSymlinks' => '1', 'ArchiveComp' => '1',
  'TarIncrArgs' => '1', 'EMailUserDestDomain' => '1', 'TarClientPath' => '0', 'RsyncClientCmd' => '0',
  'IncrFill' => '1', 'RestoreInfoKeepCnt' => '1', 'UserCmdCheckStatus' => '0', 'RsyncdClientPort' => '1',
  'IncrAgeMax' => '1', 'RsyncdUserName' => '1', 'RsyncRestoreArgs' => '1', 'ClientCharsetLegacy' => '1',
  'SmbClientFullCmd' => '0', 'ArchiveInfoKeepCnt' => '1', 'FtpShareName' => '1', 'BackupZeroFilesIsFatal' => '1',
  'EMailNoBackupRecentMesg' => '1', 'FtpPort' => '1', 'FullKeepCnt' => '1', 'TarShareName' => '1',
  'EMailNoBackupEverSubj' => '1', 'TarClientRestoreCmd' => '0', 'EMailNoBackupRecentSubj' => '1', 'ArchivePar' => '1',
  'XferLogLevel' => '1', 'ArchiveDest' => '1', 'RsyncdAuthRequired' => '1', 'ClientTimeout' => '1',
  'EMailNotifyMinDays' => '1', 'SmbClientRestoreCmd' => '0', 'ClientNameAlias' => '1', 'DumpPostShareCmd' => '0',
  'IncrLevels' => '1', 'EMailOutlookBackupMesg' => '1', 'BlackoutBadPingLimit' => '1', 'BackupFilesExclude' => '1',
  'FullPeriod' => '1', 'RsyncClientRestoreCmd' => '0', 'ArchivePostUserCmd' => '0', 'IncrPeriod' => '1',
  'RsyncShareName' => '1', 'FtpTimeout' => '1', 'RestorePostUserCmd' => '0', 'BlackoutGoodCnt' => '1',
  'ArchiveClientCmd' => '0', 'ArchiveSplit' => '1', 'FtpRestoreEnabled' => '1', 'XferMethod' => '1',
  'NmbLookupCmd' => '0', 'BackupsDisable' => '1', 'SmbShareName' => '1', 'FtpPasswd' => '1',
  'RestorePreUserCmd' => '0', 'RsyncArgsExtra' => '1', 'IncrKeepCntMin' => '1', 'EMailNoBackupEverMesg' => '1',
  'EMailHeaders' => '1', 'DumpPreUserCmd' => '0', 'RsyncClientPath' => '0', 'DumpPostUserCmd' => '0'
};
------------------------------------------------

Each of the config files in /etc/BackupPC/pc looks like this:

bash-4.1$ cat mypc.pl
$Conf{XferMethod} = 'rsyncd';
$Conf{RsyncdPasswd} = ''; #Edited to remove detail
$Conf{RsyncShareName} = [ 'fileshare' ];
bash-4.1$

Regards,
Craig
From: <bac...@ko...> - 2013-10-31 01:01:06
Holger Parplies wrote at about 16:48:11 +0100 on Wednesday, October 30, 2013:
> Jeffrey, I think we need a script to check pooling? My (still unfinished)
> BackupPC_copyPool can generate a (huge) list of files, which can be
> sort(1)ed by inode number. Parsing that should easily reveal anything not
> correctly linked in an acceptable time frame (of course *generating* the
> list takes one traversal of all pool and pc directories, but the rest
> would be fast enough). Does that help, or have you already got something
> more suited? Are you interested or should I be? ;-)

I have code that will do this in 2 related ways:

1. Run my routine "BackupPC_copyPcPool.pl" with the "--fixlinks|-f" option,
   which will fix missing (or invalid) pc-to-pool links on the fly as the
   routine crawls the pc tree (after creating an in-memory hash cache of
   the pool inodes).

2. Run my routine "BackupPC_copyPcPool.pl" to generate a list of the
   non-zero length, non-linked files in the pc tree (other than backupInfo,
   which is the only non-zero length file in the pc tree that should not be
   linked to the pool). The routine always creates this list, since these
   files would need to be transferred manually if not linked to the pool.
   Then pipe this list of unlinked files to my routine
   "BackupPc_relinkPcFiles.pl" to fix each of the non-linked files.

Note that BackupPC_copyPcPool works by caching the inode numbers of the
pool/cpool entries in a hash, which allows quick lookup and checking of
whether a pc tree file is linked to a valid pool file.

The above methods take care of the cases where pc tree files are:
1. Unlinked to anything else (nlinks = 1)
2. Linked to other pc files (but not to the pool)

Both methods properly either make a new link in the pool or delete the
existing pc file and link it to a pre-existing pool entry, depending on
whether or not a pool entry already exists.
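[Editorial note: the inode-cache idea can be approximated with standard
tools for a report-only check. A sketch under the default TopDir from this
thread (GNU find, sort, and join assumed; it lists but does not fix
anything):

  # inode numbers of every pool/cpool file, deduplicated
  find /var/lib/BackupPC/pool /var/lib/BackupPC/cpool -type f -printf '%i\n' \
      | sort -u > /tmp/pool.inodes
  # pc-tree files whose inode is not in that set, i.e. not linked to the
  # pool, whether nlinks is 1 or they are only linked to other pc files
  find /var/lib/BackupPC/pc -type f -size +0 ! -name backupInfo \
      -printf '%i %p\n' | sort -k1,1 | join -v 1 - /tmp/pool.inodes
]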
From: Les M. <les...@gm...> - 2013-10-31 14:38:04
On Thu, Oct 31, 2013 at 7:49 AM, Craig O'Brien <co...@fi...> wrote:
>
> Unable to read 8388608 bytes from
> /var/lib/BackupPC//pc/myfileserver/new//ffileshare/RStmp got=0,

What is the underlying storage here - nfs?

--
Les Mikesell
les...@gm...
From: Craig O'B. <co...@fi...> - 2013-10-31 14:47:27
> What is the underlying storage here - nfs?

Local SATA disks in a RAID 5 (5 disks, 3 TB each in capacity).

Regards,
Craig

On Thu, Oct 31, 2013 at 10:37 AM, Les Mikesell <les...@gm...> wrote:
> On Thu, Oct 31, 2013 at 7:49 AM, Craig O'Brien <co...@fi...> wrote:
> >
> > Unable to read 8388608 bytes from
> > /var/lib/BackupPC//pc/myfileserver/new//ffileshare/RStmp got=0,
>
> What is the underlying storage here - nfs?
> [...]
From: Les M. <les...@gm...> - 2013-10-31 15:15:50
On Thu, Oct 31, 2013 at 9:47 AM, Craig O'Brien <co...@fi...> wrote:
>> What is the underlying storage here - nfs?
>
> Local SATA disks in a RAID 5 (5 disks, 3TB each in capacity)

I think I'd force an fsck just on general principles, even though it will
take a long time to complete. Google turns up a few hits on similar
problems, but I don't see a definitive answer. RStmp is supposed to be
used to hold an uncompressed copy of the previous version of a large file
with changes so rsync can seek to match up the changed block positions, so
this error probably has something to do with your compressed copy being
corrupted and not uncompressing properly.

--
Les Mikesell
les...@gm...
From: Timothy J M. <tm...@ob...> - 2013-10-31 16:07:32
Holger Parplies <wb...@pa...> wrote on 10/30/2013 10:24:05 PM:

> as I understand it, the backups from before the change from smb to rsyncd
> are linked into the pool. Since the change, some or all are not. Whether
> the change of XferMethod has anything to do with the problem or whether
> it coincidentally happened at about the same point in time remains to be
> seen. I still suspect the link to $topDir as cause, and BackupPC_link is
> independent of the XferMethod used (so a change in XferMethod shouldn't
> have any influence).

To add my anecdote, I use a symbolic link for all of my BackupPC hosts (a
couple dozen?), and they all work fine. It's been my standard procedure
for almost as long as I've been using BackupPC. Example:

ls -l /var/lib
lrwxrwxrwx. 1 root root 22 Apr 22  2013 BackupPC -> /data/BackupPC/TopDir/

mount
/dev/sda1 on /data type ext4 (rw)

I understand phobias from earlier problems (see my earlier e-mail about my
thoughts on NFS and backups...), but I do not think this one is an issue.

> If the log files show nothing, we're back to finding the problem, but I
> doubt that. You can't "break pooling" by copying, as was suggested. Yes,
> you get independent copies of files, and they might stay independent, but
> changed files should get pooled again, and your file system usage
> wouldn't continue growing in such a way as it seems to be. If pooling is
> currently "broken", there's a reason for that, and there should be log
> messages indicating problems.

You are 100% correct, but it depends on how you define "break". Making a
copy of a backup will absolutely break pooling--for the copy you just
made! :) It won't prevent *future* copies from pooling, certainly. But it
sure can fill up a drive: even if pooling *is* working correctly for new
copies, the unpooled copies can still fill up the drive *and*
BackupPC_nightly won't do a thing about it.

Tim Massey
Out of the Box Solutions, Inc.
Creative IT Solutions Made Simple!
http://www.OutOfTheBoxSolutions.com
tm...@ob...
22108 Harper Ave., St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)  Cell: (586)945-8796
From: Marcel M. <mai...@fo...> - 2013-10-31 16:36:10
Hi,

> Example:
> ls -l /var/lib
> lrwxrwxrwx. 1 root root 22 Apr 22  2013 BackupPC -> /data/BackupPC/TopDir/
>
> mount
> /dev/sda1 on /data type ext4 (rw)

Out of curiosity - why don't you just configure /data/BackupPC/TopDir in
config.pl as the TopDir?

Regards
Marcel

--
Registered Linux User #307343
From: Les M. <les...@gm...> - 2013-10-31 16:59:33
On Thu, Oct 31, 2013 at 11:36 AM, Marcel Meckel <mai...@fo...> wrote:
>> Example:
>> ls -l /var/lib
>> lrwxrwxrwx. 1 root root 22 Apr 22  2013 BackupPC -> /data/BackupPC/TopDir/
>>
>> mount
>> /dev/sda1 on /data type ext4 (rw)
>
> Out of curiosity - why don't you just configure /data/BackupPC/TopDir
> in config.pl as the TopDir?

Versions earlier than 3.2 didn't allow that after the initial install -
and in distribution-packaged versions (rpm/deb) the location decision had
already been made by the packagers.

--
Les Mikesell
les...@gm...
From: <bac...@ko...> - 2013-11-01 16:18:30
Craig O'Brien wrote at about 10:11:07 -0400 on Friday, November 1, 2013:
> > And this would explain why the elements are not being linked properly
> > to the pool -- though I would have thought the more likely result would
> > be a duplicate pool entry than an unlinked pool entry...
> >
> > It might be interesting to look for pool chains with the same
> > (uncompressed) content and with links < HardLinkMax (typically 31999)
> > to see if pool entries are being unnecessarily duplicated.
> >
> > Try: (cd /var/lib/BackupPC/cpool; find . -type f -links -3198 -name
> > "*_*" -exec md5sum {} \;) | sort | uniq -d -w32
> >
> > Note this will find if there are any unnecessarily duplicated pool
> > chains (beyond the base one). Note to keep it fast and simple I am
> > skipping the elements without a suffix... with the assumption being
> > that if there are duplicated elements then there will probably be
> > whole chains of them...

I added some more bash-foo so that the following should find *any* and
*all* unnecessary pool dups...

(cd /var/lib/BackupPC/cpool; find . -name "*_0" | sed "s/_0$//" | (IFS=$'\n'; while read FILE; do find "${FILE}"* -links -3199 -exec md5sum {} \; | sort | uniq -D -w32 ; done))

Then do an 'ls -ial' to find the size and number of links each has. The
"-i" will also tell you the inode for later reference.
From: Holger P. <wb...@pa...> - 2013-11-01 17:59:05
Hi,

I get some diagnostics when reading this with 'use warnings
"wrong_numbers"' ...

bac...@ko... wrote on 2013-11-01 12:18:17 -0400 [Re: [BackupPC-users]
Disk space used far higher than reported pool size]:
> Craig O'Brien wrote at about 10:11:07 -0400 on Friday, November 1, 2013:
> > And this would explain why the elements are not being linked properly
> > to the pool -- though I would have thought the more likely result
> > would be a duplicate pool entry than an unlinked pool entry...
> >
> > It might be interesting to look for pool chains with the same
> > (uncompressed) content and with links < HardLinkMax (typically 31999)
> > to see if pool

this one looks correct. 31999. Unless of course you've changed it in
config.pl because your FS requirements differ.

> > entries are being unnecessarily duplicated.
> >
> > Try: (cd /var/lib/BackupPC/cpool; find . -type f -links -3198 -name
> > "*_*" -exec

This one doesn't.

> > md5sum {} \;) | sort | uniq -d -w32
> > [...]
>
> I added some more bash-foo so that the following should find *any* and
> *all* unnecessary pool dups...
>
> (cd /var/lib/BackupPC/cpool; find . -name "*_0" | sed "s/_0$//" |
> (IFS=$'\n'; while read FILE; do find "${FILE}"* -links -3199 -exec
> md5sum {} \; | sort | uniq -D -w32 ; done))

Nor does this one (the 3199 again). While it will find chain members with
fewer links than apparently necessary, it won't find all of them - only
those with a *far* too small link count. That might be sufficient,
depending on what we're looking for. You probably wouldn't have chosen the
(arbitrary) value "3199", though, if you hadn't in fact meant "31999" ;-).
And you wouldn't be saying "*any* and *all*" if you were meaning "some".

I'd like to point out three things:

1.) unnecessary duplication *within* the pool is not the problem we are
    looking for,
2.) if it were a problem, then because a duplicate was created way ahead
    of time and repeatedly, not because the overflow happens at 31950
    instead of 31999,
3.) finding "unnecessary duplicates" can have a normal explanation: if at
    some point you had more than 31999 copies of one file (content) in
    your backups, BackupPC would have created a pool duplicate. Some of
    the backups linking to the first copy would have expired over time,
    leaving behind a link count < 31999. Further rsync backups would tend
    to link to the second copy, at least for unchanging existing files (in
    full backups). In other cases, the first copy might be reused, but
    there's no guarantee the link count would be exactly 31999 (though it
    would probably tend to be).

    Having so many copies of identical file content in your backups would
    tend to happen for small files rather than huge ones, I would expect,
    and it doesn't seem to be very common anyway (in my pools, I find
    exactly one file with a link count of 60673 (XFS) and a total of five
    with more than 10000 links, the largest having 103 bytes (compressed)).

Regards,
Holger
From: <bac...@ko...> - 2013-11-01 18:47:05
Holger Parplies wrote at about 18:57:05 +0100 on Friday, November 1, 2013:
> I get some diagnostics when reading this with 'use warnings
> "wrong_numbers"' ...
>
> > > It might be interesting to look for pool chains with the same
> > > (uncompressed) content and with links < HardLinkMax (typically 31999)
> > > to see if pool
>
> this one looks correct. 31999. Unless of course you've changed it in
> config.pl because your FS requirements differ.
>
> > > Try: (cd /var/lib/BackupPC/cpool; find . -type f -links -3198 -name
> > > "*_*" -exec
>
> This one doesn't.

Oops, typo... dropped a 9.

> > (cd /var/lib/BackupPC/cpool; find . -name "*_0" | sed "s/_0$//" |
> > (IFS=$'\n'; while read FILE; do find "${FILE}"* -links -3199 -exec
> > md5sum {} \; | sort | uniq -D -w32 ; done))
>
> Nor does this one (the 3199 again).

Typo again... dropped a 9.

> I'd like to point out three things:
> 1.) unnecessary duplication *within* the pool is not the problem we are
>     looking for,

This is probably not his *primary* issue, since the pool is (only) ~3T.
But when he started talking about file read errors, I was concerned that
if the pool file reads were being truncated, then there would likely be
pool duplicates, since the byte-by-byte comparisons would fail for a given
partial file md5sum, leading to extra chain creation...

> 2.) if it were a problem, then because a duplicate was created way ahead
>     of time and repeatedly, not because the overflow happens at 31950
>     instead of 31999,
> 3.) finding "unnecessary duplicates" can have a normal explanation: if at
>     some point you had more than 31999 copies of one file (content) in
>     your backups, BackupPC would have created a pool duplicate. [...]
>     In other cases, the first copy might be reused, but there's no
>     guarantee the link count would be exactly 31999 (though it would
>     probably tend to be).

You are absolutely right that there are valid reasons for the link count
overflowing 31999 and then later dropping below it as links expire. To
tell you the truth, my use of "-links 31999" (corrected) was really more
pedantic -- in reality, I have never seen a case of link counts getting
that high... and if it does happen, it's probably extremely rare to have a
single non-zero file repeated that many times unless you have a huge set
of clients or a huge set of past full backups... (or some special
situation where users keep large numbers of copies of certain files).

So, basically, while there may be an exceptional case or two, anything
spewed back by my shell one-liner is worth looking at from the perspective
of potential issues with pool duplication.

> Having so many copies of identical file content in your backups would
> tend to happen for small files rather than huge ones, I would expect,
> and it doesn't seem to be very common anyway (in my pools, I find
> exactly one file with a link count of 60673 (XFS) and a total of five
> with more than 10000 links, the largest having 103 bytes (compressed)).

Exactly -- that's my point. So other than your one case of 60673 links,
any other case of a duplicate pool chain would be due to an error
somewhere... You may remain correct that adding "-nlinks 31999" or
"-nlinks 31500" or any similar number is not going to limit the search in
reality... and therefore won't make much of a practical difference...
From: Les M. <les...@gm...> - 2013-11-01 19:24:25
On Fri, Nov 1, 2013 at 1:46 PM, <bac...@ko...> wrote:
>
> This is probably not his *primary* issue, since the pool is (only) ~3T.
> But when he started talking about file read errors, I was concerned that
> if the pool file reads were being truncated, then there would likely be
> pool duplicates, since the byte-by-byte comparisons would fail for a
> given partial file md5sum, leading to extra chain creation...

The read errors were in the RStmp file, which is supposed to be the
uncompressed copy of a large compressed file so rsync can seek around
looking for a match. I wonder if there could be a file (huge database,
mailbox, etc.) that compresses to the point that, even with the safety
factor of backups not starting at 95% full, the uncompressed copy won't
fit. Or maybe a sparse dbm-type file where the original doesn't allocate
the space the length would indicate.

--
Les Mikesell
les...@gm...
From: <bac...@ko...> - 2013-11-01 19:20:21
Holger Parplies wrote at about 18:57:05 +0100 on Friday, November 1, 2013:
> 3.) finding "unnecessary duplicates" can have a normal explanation: if at
>     some point you had more than 31999 copies of one file (content) in
>     your backups, BackupPC would have created a pool duplicate. Some of
>     the backups linking to the first copy would have expired over time,
>     leaving behind a link count < 31999. Further rsync backups would tend
>     to link to the second copy, at least for unchanging existing files
>     (in full backups). In other cases, the first copy might be reused,
>     but there's no guarantee the link count would be exactly 31999
>     (though it would probably tend to be).

Interesting... I think this depends on the transfer method. The rsync
method looks to the immediately prior full for comparison, so new hard
links will be made to the same chain element as the last full. Thus, if
earlier elements in the chain have a reduced link count, they will tend
not to be filled back in. It seems like the other transfer methods
directly reference the PoolWrite package, which always crawls up the chain
looking for matches...

If true, it does seem that one could in general speed up fulls for the
other algorithms by putting a matching candidate from the previous full
(if any) first in the candidate match list, rather than matching in chain
order (or several simultaneously). In any case, if my quick reading of the
code is correct, then the other methods will tend to fill in earlier chain
elements first, so that the link count will march back up to 31999.