From: Craig O'B. <co...@fi...> - 2013-10-29 18:24:55
|
On the General Server Information page, it says "Pool is 2922.42GB comprising 6061942 files and 4369 directories," but our pool file system which contains nothing but backuppc and is 11 TB in size is 100% full. I'm confused how this happened and even ran the BackupPC_nightly script by hand which didn't seem to clear up any space. Judging by the reported pool size it should be less than 30% full. I could really use some help. Thanks in advance for any ideas on how to go about troubleshooting this. Regards, Craig |
From: Marcel M. <mai...@fo...> - 2013-10-29 18:35:59
|
Hi,

> On the General Server Information page, it says "Pool is 2922.42GB
> comprising 6061942 files and 4369 directories," but our pool file system
> which contains nothing but backuppc and is 11 TB in size is 100% full.

some details would be great!

It's a bit hard to guess your setup details...

Which type of filesystem? xfs/ext4/...
Output of "df"
Output of "df -i"
Mount options from /etc/fstab

Regards
Marcel

--
Registered Linux User #307343 |
From: Craig O'B. <co...@fi...> - 2013-10-29 18:48:54
|
File Type is ext4.

bash-4.1$ df
Filesystem                     1K-blocks        Used  Available Use% Mounted on
/dev/mapper/vg_harp-lv_root     51615740    10132616   40958900  20% /
tmpfs                            4019640         828    4018812   1% /dev/shm
/dev/md127p1                      495844      135596     334648  29% /boot
/dev/mapper/vg_harp-lv_home    178137608      191984  168896744   1% /home
/dev/sda1                    10985539464 10370739976   28862288 100% /backup
naslite:/export/Disk-1         961424128   622227456  339196672  65% /mnt/naslite

bash-4.1$ df -i
Filesystem                       Inodes    IUsed      IFree IUse% Mounted on
/dev/mapper/vg_harp-lv_root     3238400   171134    3067266    6% /
tmpfs                           1004910        4    1004906    1% /dev/shm
/dev/md127p1                     128016       66     127950    1% /boot
/dev/mapper/vg_harp-lv_home    11313152       28   11313124    1% /home
/dev/sda1                    2929688576 20084766 2909603810    1% /backup
naslite:/export/Disk-1        122109952   195254  121914698    1% /mnt/naslite
bash-4.1$

bash-4.1$ cat /etc/fstab
#
# /etc/fstab
# Created by anaconda on Fri Dec 16 12:02:05 2011
#
# Accessible filesystems, by reference, are maintained under '/dev/disk'
# See man pages fstab(5), findfs(8), mount(8) and/or blkid(8) for more info
#
/dev/mapper/vg_harp-lv_root / ext4 defaults 1 1
UUID=705ad8c5-3a17-4aac-aab2-e2fc1ce010c7 /boot ext4 defaults 1 2
/dev/mapper/vg_harp-lv_home /home ext4 defaults 1 2
/dev/mapper/vg_harp-lv_swap swap swap defaults 0 0
UUID=a76d2bf2-3649-4941-9db3-6129aa45d873 swap swap defaults 0 0
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
/dev/sda1 /backup ext4 defaults 0 4
naslite:/export/Disk-1 /mnt/naslite nfs rw,hard,intr 0 0
bash-4.1$

Backuppc is at version 3.2.1. OS is CentOS release 6.4

Regards,
Craig

On Tue, Oct 29, 2013 at 2:35 PM, Marcel Meckel <mai...@fo...> wrote:
> [...]
> _______________________________________________
> BackupPC-users mailing list
> Bac...@li...
> List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki: http://backuppc.wiki.sourceforge.net
> Project: http://backuppc.sourceforge.net/ |
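The gap between the reported pool size and the df figures above can be put into numbers with a little integer arithmetic. This is only a rough sketch: the block counts are copied from the df output for /backup, the pool size is rounded down to whole GB, and it assumes that the "GB" on the BackupPC status page and the GiB computed here are close enough to compare.

```shell
#!/bin/sh
# Figures copied from the df output for /backup (1K blocks).
used_kib=10370739976
total_kib=10985539464
# Pool size from the BackupPC status page, rounded down to whole GB.
# Assumption: this unit is comparable to the GiB values computed below.
pool_gb=2922

# Convert 1K blocks to whole GiB (1 GiB = 1048576 KiB), integer division.
used_gib=$((used_kib / 1048576))
total_gib=$((total_kib / 1048576))

# Space in use on /backup that the pool figure does not explain.
unexplained_gib=$((used_gib - pool_gb))

# Pool size as a percentage of the whole filesystem.
pool_pct=$((pool_gb * 100 / total_gib))

echo "used: ${used_gib} GiB of ${total_gib} GiB"
echo "pool: ${pool_gb} GB (about ${pool_pct}% of the filesystem)"
echo "not explained by the pool: ${unexplained_gib} GiB"
```

The pool accounts for a bit under 30% of the filesystem, as the original post estimated, which leaves several terabytes to be found either outside the BackupPC tree or in backups that were never linked into the pool.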
From: Timothy J M. <tm...@ob...> - 2013-10-29 19:11:00
|
"Craig O'Brien" <co...@fi...> wrote on 10/29/2013 01:53:31 PM:

> On the General Server Information page, it says "Pool is 2922.42GB
> comprising 6061942 files and 4369 directories," but our pool file
> system which contains nothing but backuppc and is 11 TB in size is 100% full.

My strong guess is that, while you *think* nothing else is out there, that is not the case! :)

> I'm confused how this happened and even ran the BackupPC_nightly
> script by hand which didn't seem to clear up any space. Judging by
> the reported pool size it should be less than 30% full. I could
> really use some help. Thanks in advance for any ideas on how to go
> about troubleshooting this.

From the other message, it seems that the filesystem you're worried about is /home. What is the TopDir of BackupPC? I assume it's something like /home/backuppc (or I sure hope it is!). Go to that path and type:

du -hs

This will take a *long* time: BackupPC has a *lot* of files. I would also hope that you see a number very similar to the pool size report above.

So, if you find that the size of your BackupPC TopDir (what you verified with du -hs) matches the GUI report, then you know that there's something on the drive but *outside* of the BackupPC TopDir. Find it and delete it.

Because du -hs of the pool takes so long, you could, of course, do it the *other* way: do a du -hs of each directory *besides* your TopDir and see how much space is being used by them. How hard that would be depends on how many other folders you have. But when you see some other folder using 8TB, you'll know where the space went!

However, if you find that your du -hs does *not* match your GUI report, then you have to look more closely. Have you *EVER* done anything from the command line inside of the TopDir? Given that you mention running BackupPC_nightly by hand, I suspect you of monkeying with things, and that very well may have broken things. Tell us what you did! :)

(It's not that you can't run BackupPC_nightly by hand; you can. It's more that if you're brave enough to run it by hand, there's no telling what *else* you might have done, and how you might have broken things! :) ).

If you can't tell, I suspect something outside of BackupPC has used the space, *or* that you moved/copied/etc. something using tools outside of the BackupPC system and have broken things unintentionally. It is very unlikely that BackupPC is wrong on its pool report. Most likely it's something *else* that is consuming the space, and because of that all the BackupPC_nightly in the world isn't going to free up the space.

Tim Massey
Out of the Box Solutions, Inc.
Creative IT Solutions Made Simple!
http://www.OutOfTheBoxSolutions.com
tm...@ob...
22108 Harper Ave.
St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)
Cell: (586)945-8796 |
From: Adam G. <mai...@we...> - 2013-10-30 13:19:16
|
On 31/10/13 00:04, Timothy J Massey wrote:
> The only other thing that I can think of is that you did something
> wrong with archiving and accidentally archived data somewhere within
> the BackupPC tree. In my case, I archive to a removable hard drive
> and sometimes the drive is not mounted when the archive runs. The
> archives are then put on the backup drive (because that's where the
> removable drive is mounted). That's tricky because you can't see the
> files when the drive *is* mounted (which is the vast majority of the
> time). I have to unmount the drive and then I can see terabytes of
> archive data that should have been written to a removable drive.
>
> I don't know if that might be part of your problem. But it's the only
> other thing I can think of.

Not really relevant to this thread, but I have in the past added an empty file to each of the removable drives, then tested whether the file exists before creating the archives. If the drive isn't mounted, the file won't exist, which prevents that issue.

I'm sure you've probably considered this already :)

Regards,
Adam

--
Adam Goryachev
Website Managers
www.websitemanagers.com.au |
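The marker-file check described above could look roughly like the following. This is a minimal sketch, not Adam's actual script: the marker name `.archive_target` and the mount point `/mnt/removable` are hypothetical, and the marker has to be created once on each removable drive while that drive is mounted.

```shell
#!/bin/sh
# Return 0 only if the marker file exists under the given mount point.
# ".archive_target" is a hypothetical marker name; it lives on the removable
# drive itself, so it is invisible whenever the drive is not mounted.
archive_target_ready() {
    mount_point=$1
    marker=${2:-.archive_target}
    [ -f "$mount_point/$marker" ]
}

# Example use in an archive wrapper script (the path is a placeholder):
# if archive_target_ready /mnt/removable; then
#     echo "drive present, starting archive"
# else
#     echo "removable drive not mounted, skipping archive" >&2
#     exit 1
# fi
```

Because the check looks for a file on the removable drive rather than for the directory, an empty mount point on the backup drive fails the test and nothing is written there.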
From: Timothy J M. <tm...@ob...> - 2013-10-30 13:39:49
|
Adam Goryachev <mai...@we...> wrote on 10/30/2013 09:18:59 AM:

> Not really relevant to this thread, but I have in the past added a
> empty file to each of the removable drives, then test if the file
> exists before creating the archives. If the drive isn't mounted, the
> file won't exist. Thus preventing that issue.
>
> I'm sure you've probably considered this previously already :)

Thank you for the suggestion! My thought was to parse the output of "df /path/to/drive" and confirm that it was mounted correctly. (I already do that in the scripts I use to mount the removable drive and re-format it.) If it happened more than once or twice a year, I probably would! :)

Your way certainly works, too. Because it's the root of the removable drive, I could simply look for lost+found, too! :)

Tim Massey |
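The df-based check mentioned above might be sketched as follows. The function names are made up for illustration, and the awk field position assumes a mount point without spaces in its name.

```shell
#!/bin/sh
# Print the mount point of the filesystem that holds the given path.
# "df -P" is the POSIX output format, which keeps each filesystem on a
# single line; the mount point is the sixth field of the second line.
# Assumption: the mount point contains no spaces.
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $6 }'
}

# Return 0 if the given path is itself a mount point, i.e. the filesystem
# holding it is mounted exactly there and not somewhere further up the tree.
is_mounted_at() {
    [ "$(mount_point_of "$1")" = "$1" ]
}

# Example (placeholder path): refuse to archive to an unmounted drive.
# is_mounted_at /mnt/removable || { echo "not mounted" >&2; exit 1; }
```

If the removable drive is missing, df reports the filesystem of the parent directory instead, so the comparison fails and the archive can be skipped.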
From: Holger P. <wb...@pa...> - 2013-10-31 02:26:05
|
Hi,

Adam Goryachev wrote on 2013-10-31 09:04:48 +1100 [Re: [BackupPC-users] Disk space used far higher than reported pool size]:
> On 31/10/13 07:51, Holger Parplies wrote:
> > [...]
> > Aside from that, I would think it might be worth the effort of determining
> > whether all hosts are affected or not (though I can't really see why there
> > should be a difference between hosts). If some aren't, you could at least
> > keep their history.
> I suspect at least some hosts OR some backups are correct, or else OP
> wouldn't have anything in the pool.

as I understand it, the backups from before the change from smb to rsyncd are linked into the pool. Since the change, some or all are not. Whether the change of XferMethod has anything to do with the problem or whether it coincidentally happened at about the same point in time remains to be seen. I still suspect the link to $topDir as cause, and BackupPC_link is independent of the XferMethod used (so a change in XferMethod shouldn't have any influence).

> [...] you might want to look at one individual host like this:
> du -sm /backup/pool /backup/cpool /backup/pc/host1/*
>
> This should be a *lot* quicker than the previous du command, and also
> should show minimal disk usage for each backup for host1. It is quicker
> because you are only looking at the set of files for the pool, plus one
> host.

Just keep in mind that *incrementals* might be small even if not linked to pool files.

Oh, and there is still another method that is *orders of magnitude* faster: look into the log file(s), or even at the *size* of the log files. If it happens every day, for each host, it shouldn't be hard to find. You can even write a Perl one-liner to show you which hosts it happens for (give me a sample log line and I will).

If the log files show nothing, we're back to finding the problem, but I doubt that. You can't "break pooling" by copying, as was suggested. Yes, you get independent copies of files, and they might stay independent, but changed files should get pooled again, and your file system usage wouldn't continue growing in such a way as it seems to be. If pooling is currently "broken", there's a reason for that, and there should be log messages indicating problems.

> PS, at this stage, you may want to look at the recent thread regarding
> disk caches, and caching directory entries instead of file contents. It
> might help with all the directory based searches you are doing to find
> the problem. Long term you may (or not) want to keep the settings.

Yes, but remember that for a similarly sized pool it used up about 32 GB of 96 GB available memory. If you can do your investigation on a reasonably idle system (i.e. not running backups, without long pauses), you should get all the benefits of caching your amount of memory allows without any tuning. And even tuning won't let you hold 32 GB of file system metadata in 4 GB of memory :-). It all depends on file count and hardware memory configuration.

Regards,
Holger |
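The log-based shortcut suggested above could be sketched in shell rather than Perl. This is only an illustration under stated assumptions: the function name is made up, it expects a directory layout of one subdirectory per host containing plain-text LOG.* files, and it skips the compressed *.z logs, which would have to be read with BackupPC's own BackupPC_zcat tool instead of grep.

```shell
#!/bin/sh
# Count lines matching a pattern in the uncompressed LOG.* files of every
# host directory below the given pc/ directory, and print "count host"
# pairs with the highest count first.
count_log_matches() {
    pc_dir=$1
    pattern=$2
    for host_dir in "$pc_dir"/*/; do
        [ -d "$host_dir" ] || continue
        host=$(basename "$host_dir")
        count=0
        for log in "$host_dir"LOG.*; do
            # Compressed logs are not plain text, so leave them out here.
            case $log in *.z) continue ;; esac
            [ -f "$log" ] || continue
            # grep -c prints 0 and exits non-zero when nothing matches.
            n=$(grep -c -- "$pattern" "$log" || true)
            count=$((count + n))
        done
        echo "$count $host"
    done | sort -rn
}

# Example (path as used in this thread, pattern is whatever the logs show):
# count_log_matches /backup/pc 'BackupPC_link got error'
```

Hosts with a count of zero still appear in the output, which makes it easy to see whether the problem affects every host or only a few.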
From: Craig O'B. <co...@fi...> - 2013-10-31 12:49:22
|
The du -hs /backup/pool /backup/cpool /backup/pc/* has finished. Basically I had 1 host that was taking up 6.9 TB of data with 2.8 TB in the cpool directory and most of the other hosts averaging a GB each.

The 1 host was our file server, which I happen to know has a 2 TB volume (1.3 TB currently used) that is our main fileshare.

I looked through the error log for this pc on backups with the most errors and found thousands of these:

Unable to read 8388608 bytes from /var/lib/BackupPC//pc/myfileserver/new//ffileshare/RStmp got=0, seekPosn=1501757440 (0,512,147872,1499463680,2422719488)

I didn't see any of the "BackupPC_link got error -4" errors.

So now I'm running this command:

du -hs /backup/pool /backup/cpool /backup/pc/myfileserver/*

to see which backups are doing the most damage. I'll report back once that finishes.

Thanks for all your help!

Regards,
Craig |
From: Timothy J M. <tm...@ob...> - 2013-10-31 15:58:59
|
"Craig O'Brien" <co...@fi...> wrote on 10/31/2013 08:49:15 AM:

> The du -hs /backup/pool /backup/cpool /backup/pc/* has finished.
> Basically I had 1 host that was taking up 6.9 TB of data with 2.8 TB
> in the cpool directory and most of the other hosts averaging a GB each.

Well, there's your problem.

> The 1 host was our file server (which I happen to know has a 2 TB
> volume (1.3 TB currently used) that is our main fileshare.
>
> I looked through the error log for this pc on backups with the most
> errors and found thousands of these:

Just out of curiosity, why hadn't you already done that?!?

> Unable to read 8388608 bytes from /var/lib/BackupPC//pc/myfileserver/new//ffileshare/RStmp got=0, seekPosn=1501757440 (0,512,147872,1499463680,2422719488)

Interesting. I'd make sure that the filesystem is OK before I went much farther... Stop BackupPC, unmount /backup and fsck /dev/<whatever>

> du -hs /backup/pool /backup/cpool /backup/pc/myfileserver/*
>
> to see which backups are doing the most damage. I'll report back
> once that finishes.

With that, you should be able to find the backup number(s) that are not linked. You can delete them and free up space.

The big question is, though, why they aren't linking. I'd really start at the bottom of the stack (the physical drives) and work your way up. Check dmesg for any hardware errors. fsck the filesystem.

Did I read correctly that this is connected via NFSv4? I sure hope not... (I'm willing to admit it's a phobia, but there's no *WAY* I would trust my backup to work across NFS...)

Tim Massey |
From: Craig O'B. <co...@fi...> - 2013-10-31 17:33:38
|
> Just out of curiosity, why hadn't you already done that?!?

I didn't know which host was the problem and didn't think of it. Although I'll readily admit it seems painfully obvious to me now. :)

> The big question is, though, why they aren't linking. I'd really start at
> the bottom of the stack (the physical drives) and work your way up. Check
> dmesg for any hardware errors.

bash-4.1$ grep -i backup /var/log/dmesg*
bash-4.1$

bash-4.1$ grep -i backup /var/log/messages*
messages-20131006:Sep 30 13:53:24 servername kernel: BackupPC_dump[15365]: segfault at a80 ip 000000310f695002 sp 00007fff438c9770 error 4 in libperl.so[310f600000+162000]
messages-20131006:Sep 30 13:53:27 servername abrtd: Package 'BackupPC' isn't signed with proper key
messages-20131020:Oct 19 01:24:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
messages-20131020:Oct 19 01:24:54 servername kernel: BackupPC_dump D 0000000000000001 0 11922 10626 0x00000080
messages-20131020:Oct 19 01:30:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
messages-20131020:Oct 19 01:30:54 servername kernel: BackupPC_dump D 0000000000000001 0 11922 10626 0x00000080
messages-20131020:Oct 19 01:32:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
messages-20131020:Oct 19 01:32:54 servername kernel: BackupPC_dump D 0000000000000001 0 11922 10626 0x00000080
messages-20131020:Oct 19 01:32:54 servername kernel: INFO: task BackupPC_nightl:18390 blocked for more than 120 seconds.
messages-20131020:Oct 19 01:32:54 servername kernel: BackupPC_nigh D 0000000000000001 0 18390 1262 0x00000080
messages-20131020:Oct 19 01:48:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
messages-20131020:Oct 19 01:48:54 servername kernel: BackupPC_dump D 0000000000000003 0 11922 10626 0x00000080
messages-20131020:Oct 19 01:52:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
messages-20131020:Oct 19 01:52:54 servername kernel: BackupPC_dump D 0000000000000001 0 11922 10626 0x00000080
messages-20131020:Oct 19 01:52:54 servername kernel: INFO: task BackupPC_nightl:18390 blocked for more than 120 seconds.
messages-20131020:Oct 19 01:52:54 servername kernel: BackupPC_nigh D 0000000000000001 0 18390 1262 0x00000080
messages-20131020:Oct 19 01:56:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
messages-20131020:Oct 19 01:56:54 servername kernel: BackupPC_dump D 0000000000000003 0 11922 10626 0x00000080
messages-20131020:Oct 19 02:10:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
messages-20131020:Oct 19 02:10:54 servername kernel: BackupPC_dump D 0000000000000001 0 11922 10626 0x00000080
messages-20131020:Oct 19 02:12:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
messages-20131020:Oct 19 02:12:54 servername kernel: BackupPC_dump D 0000000000000001 0 11922 10626 0x00000080
messages-20131027:Oct 23 09:00:02 servername abrtd: Package 'BackupPC' isn't signed with proper key

> fsck the filesystem.

bash-4.1$ fsck /dev/sda1
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
/dev/sda1: clean, 20074506/2929688576 files, 2775975889/2929686016 blocks
bash-4.1$

> Did I read correctly that this is connected via NFSv4? I sure hope not...
> (I'm willing to admit it's a phobia, but there's no *WAY* I would trust my
> backup to work across NFS...)

The drives are local SATA ones that I set up in a raid 5, directly mounted. Definitely not NFS. I had an unrelated drive mounted via NFS, but that had nothing to do with my backup system and that's probably the source of confusion.
So the du command finished, here's the result:

bash-4.1$ du -hs /backup/pool /backup/cpool /backup/pc/fileserver/*
4.0K    /backup/pool
2.8T    /backup/cpool
350M    /backup/pc/fileserver/223
361M    /backup/pc/fileserver/250
373M    /backup/pc/fileserver/278
325M    /backup/pc/fileserver/302
329M    /backup/pc/fileserver/331
330M    /backup/pc/fileserver/360
335M    /backup/pc/fileserver/388
338M    /backup/pc/fileserver/417
345M    /backup/pc/fileserver/446
346M    /backup/pc/fileserver/475
350M    /backup/pc/fileserver/503
450M    /backup/pc/fileserver/524
437M    /backup/pc/fileserver/525
437M    /backup/pc/fileserver/526
437M    /backup/pc/fileserver/527
2.5G    /backup/pc/fileserver/528
1.4T    /backup/pc/fileserver/529
438M    /backup/pc/fileserver/530
467M    /backup/pc/fileserver/531
438M    /backup/pc/fileserver/532
438M    /backup/pc/fileserver/533
1.4T    /backup/pc/fileserver/534
438M    /backup/pc/fileserver/535
1013M   /backup/pc/fileserver/536
442M    /backup/pc/fileserver/537
441M    /backup/pc/fileserver/538
441M    /backup/pc/fileserver/539
1.4T    /backup/pc/fileserver/540
441M    /backup/pc/fileserver/541
442M    /backup/pc/fileserver/542
442M    /backup/pc/fileserver/543
1.4T    /backup/pc/fileserver/544
442M    /backup/pc/fileserver/545
441M    /backup/pc/fileserver/546
442M    /backup/pc/fileserver/547
442M    /backup/pc/fileserver/548
1.3T    /backup/pc/fileserver/549
8.0K    /backup/pc/fileserver/backups
8.0K    /backup/pc/fileserver/backups.old
0       /backup/pc/fileserver/LOCK
8.0K    /backup/pc/fileserver/LOG.012013.z
4.0K    /backup/pc/fileserver/LOG.022013.z
4.0K    /backup/pc/fileserver/LOG.032013.z
4.0K    /backup/pc/fileserver/LOG.042013.z
4.0K    /backup/pc/fileserver/LOG.052013.z
4.0K    /backup/pc/fileserver/LOG.062013.z
4.0K    /backup/pc/fileserver/LOG.072013.z
4.0K    /backup/pc/fileserver/LOG.082013.z
4.0K    /backup/pc/fileserver/LOG.092013.z
12K     /backup/pc/fileserver/LOG.102013
4.0K    /backup/pc/fileserver/LOG.112012.z
8.0K    /backup/pc/fileserver/LOG.122012.z
4.0K    /backup/pc/fileserver/RestoreInfo.0
4.0K    /backup/pc/fileserver/RestoreInfo.1
4.0K    /backup/pc/fileserver/RestoreLOG.0.z
4.0K    /backup/pc/fileserver/RestoreLOG.1.z
4.0K    /backup/pc/fileserver/restores
4.0K    /backup/pc/fileserver/restores.old
15M     /backup/pc/fileserver/XferLOG.223.z
19M     /backup/pc/fileserver/XferLOG.250.z
16M     /backup/pc/fileserver/XferLOG.278.z
14M     /backup/pc/fileserver/XferLOG.302.z
14M     /backup/pc/fileserver/XferLOG.331.z
14M     /backup/pc/fileserver/XferLOG.360.z
15M     /backup/pc/fileserver/XferLOG.388.z
15M     /backup/pc/fileserver/XferLOG.417.z
15M     /backup/pc/fileserver/XferLOG.446.z
15M     /backup/pc/fileserver/XferLOG.475.z
15M     /backup/pc/fileserver/XferLOG.503.z
3.7M    /backup/pc/fileserver/XferLOG.524.z
1.3M    /backup/pc/fileserver/XferLOG.525.z
1.3M    /backup/pc/fileserver/XferLOG.526.z
1.3M    /backup/pc/fileserver/XferLOG.527.z
18M     /backup/pc/fileserver/XferLOG.528.z
1.3M    /backup/pc/fileserver/XferLOG.529.z
1.3M    /backup/pc/fileserver/XferLOG.530.z
1.2M    /backup/pc/fileserver/XferLOG.531.z
1.3M    /backup/pc/fileserver/XferLOG.532.z
1.3M    /backup/pc/fileserver/XferLOG.533.z
1.3M    /backup/pc/fileserver/XferLOG.534.z
1.3M    /backup/pc/fileserver/XferLOG.535.z
1.3M    /backup/pc/fileserver/XferLOG.536.z
1.3M    /backup/pc/fileserver/XferLOG.537.z
1.3M    /backup/pc/fileserver/XferLOG.538.z
1.4M    /backup/pc/fileserver/XferLOG.539.z
1.3M    /backup/pc/fileserver/XferLOG.540.z
1.3M    /backup/pc/fileserver/XferLOG.541.z
1.3M    /backup/pc/fileserver/XferLOG.542.z
1.3M    /backup/pc/fileserver/XferLOG.543.z
1.5M    /backup/pc/fileserver/XferLOG.544.z
1.3M    /backup/pc/fileserver/XferLOG.545.z
1.3M    /backup/pc/fileserver/XferLOG.546.z
1.3M    /backup/pc/fileserver/XferLOG.547.z
1.3M    /backup/pc/fileserver/XferLOG.548.z
1.3M    /backup/pc/fileserver/XferLOG.549.z
400K    /backup/pc/fileserver/XferLOG.bad.z.old
bash-4.1$

Some more info on the interesting ones:

1.4T /backup/pc/fileserver/529 (Incremental, level 1) (Error log contains thousands of "Unable to read 16384 bytes")
1.4T /backup/pc/fileserver/534 (Incremental, level 1) (Error log contains hundreds of "Unable to read 1507328 bytes")
1.4T /backup/pc/fileserver/540 (Incremental, level 2) (Error log contains hundreds of "Unable to read 1507328 bytes")
1.4T /backup/pc/fileserver/544 (incremental, level 1) (Error log contains hundreds of "Unable to read 1507328 bytes")
1.3T /backup/pc/fileserver/549 (incremental, level 1) (this one doesn't have any of those errors.)

Just in case it's helpful, the fileserver's config file looks like this:

bash-4.1$ cat /etc/BackupPC/pc/fileserver.pl
$Conf{RsyncShareName} = ['fileshare', 'servershare'];
$Conf{RsyncdPasswd} = ''; #Edited to remove detail
$Conf{XferMethod} = 'rsyncd';
$Conf{IncrKeepCnt} = 25;
$Conf{FullKeepCnt} = [ 12 ];
$Conf{FullPeriod} = 30;
$Conf{IncrLevels} = [ 1, 2, 3, 4, 5 ];
$Conf{IncrPeriod} = 1;

I don't suppose this helps give any insight to what happened? Thanks for all your help!

Regards,
Craig

On Thu, Oct 31, 2013 at 12:59 PM, Les Mikesell <les...@gm...> wrote:
> On Thu, Oct 31, 2013 at 11:36 AM, Marcel Meckel
> <mai...@fo...> wrote:
> > Hi,
> >
> >> Example:
> >> ls -l /var/lib
> >> lrwxrwxrwx. 1 root root 22 Apr 22 2013 BackupPC -> /data/BackupPC/TopDir/
> >>
> >> mount
> >> /dev/sda1 on /data type ext4 (rw)
> >
> > out of curiosity - why don't you just configure /data/BackupPC/TopDir
> > in config.pl as the TopDir?
>
> Versions earlier than 3.2 didn't allow that after the initial install
> - and in distribution-packaged version (rpm/deb) the location decision
> had already been made by the packagers.
>
> --
> Les Mikesell
> les...@gm... |
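Picking the oversized backups out of a long du listing like the one above can be automated. The following is a small sketch with a made-up function name and an arbitrary threshold. It expects the numeric output of du -sm (size in MB, then the path) rather than the human-readable -h format, so the sizes can be compared as numbers.

```shell
#!/bin/sh
# Read "size_in_MB<whitespace>path" lines (the format printed by "du -sm")
# on stdin and print the entries larger than the threshold, largest first.
flag_large_backups() {
    threshold_mb=$1
    awk -v limit="$threshold_mb" '$1 + 0 > limit + 0 { print }' | sort -rn
}

# Example (paths as used in this thread, threshold of roughly 100 GB):
# du -sm /backup/cpool /backup/pc/fileserver/[0-9]* | flag_large_backups 100000
```

Listing the cpool first matters: within one run, du counts a hard-linked file only once and charges it to the first argument under which it is found, so the per-backup figures show only data that is not linked into the pool. The cpool itself will appear in the flagged output as well.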
From: Timothy J M. <tm...@ob...> - 2013-10-31 17:57:53
|
"Craig O'Brien" <co...@fi...> wrote on 10/31/2013 01:33:30 PM: > > Just out of curiosity, why hadn't you already done that?!? > > I didn't know which host was the problem and didn't think of it. > Although I'll readily admit it seems painfully obvious to me now. :) Just so you're sufficiently humble... :) For everyone's future reference: ALWAYS check the server error log *and* the per-host logs... :) > >The big question is, though, why they aren't linking. I'd really > start at the bottom of the stack (the physical drives) and work your > way up. Check dmesg for any hardware errors. > > bash-4.1$ grep -i backup /var/log/dmesg* > bash-4.1$ Nice try, but won't help: you need to be looking for the correct sd or ata device that is used. Don't bother with a grep like that. do a dmesg > dmesg.txt and then vi (or whatever) dmesg.txt and look for scary errors... Look particularly for sda (or sdb or whatever), or ata0 (or 1 or whatever) messages, or possibly scsi messages (yes, SATA is SCSI to Linux) too. But if they're there, these should not be hard to find: there tends to be *LOTS* of them. > bash-4.1$ grep -i backup /var/log/messages* Mine comes back with nothing. > messages-20131006:Sep 30 13:53:24 servername kernel: BackupPC_dump > [15365]: segfault at a80 ip 000000310f695002 sp 00007fff438c9770 > error 4 in libperl.so[310f600000+162000] > messages-20131006:Sep 30 13:53:27 servername abrtd: Package > 'BackupPC' isn't signed with proper key > messages-20131020:Oct 19 01:24:54 servername kernel: INFO: task > BackupPC_dump:11922 blocked for more than 120 seconds. > messages-20131020:Oct 19 01:24:54 servername kernel: BackupPC_dump D > 0000000000000001 0 11922 10626 0x00000080 > messages-20131020:Oct 19 01:30:54 servername kernel: INFO: task > BackupPC_dump:11922 blocked for more than 120 seconds. 
> messages-20131020:Oct 19 01:30:54 servername kernel: BackupPC_dump D 0000000000000001 0 11922 10626 0x00000080
> messages-20131020:Oct 19 01:32:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
> messages-20131020:Oct 19 01:32:54 servername kernel: BackupPC_dump D 0000000000000001 0 11922 10626 0x00000080
> messages-20131020:Oct 19 01:32:54 servername kernel: INFO: task BackupPC_nightl:18390 blocked for more than 120 seconds.
> messages-20131020:Oct 19 01:32:54 servername kernel: BackupPC_nigh D 0000000000000001 0 18390 1262 0x00000080
> messages-20131020:Oct 19 01:48:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
> messages-20131020:Oct 19 01:48:54 servername kernel: BackupPC_dump D 0000000000000003 0 11922 10626 0x00000080
> messages-20131020:Oct 19 01:52:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
> messages-20131020:Oct 19 01:52:54 servername kernel: BackupPC_dump D 0000000000000001 0 11922 10626 0x00000080
> messages-20131020:Oct 19 01:52:54 servername kernel: INFO: task BackupPC_nightl:18390 blocked for more than 120 seconds.
> messages-20131020:Oct 19 01:52:54 servername kernel: BackupPC_nigh D 0000000000000001 0 18390 1262 0x00000080
> messages-20131020:Oct 19 01:56:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
> messages-20131020:Oct 19 01:56:54 servername kernel: BackupPC_dump D 0000000000000003 0 11922 10626 0x00000080
> messages-20131020:Oct 19 02:10:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
> messages-20131020:Oct 19 02:10:54 servername kernel: BackupPC_dump D 0000000000000001 0 11922 10626 0x00000080
> messages-20131020:Oct 19 02:12:54 servername kernel: INFO: task BackupPC_dump:11922 blocked for more than 120 seconds.
> messages-20131020:Oct 19 02:12:54 servername kernel: BackupPC_dump D 0000000000000001 0 11922 10626 0x00000080
> messages-20131027:Oct 23 09:00:02 servername abrtd: Package 'BackupPC' isn't signed with proper key

I'd try Googling those: they have no meaning for me (and my servers don't have them). What distro are you using? (I use CentOS/RHEL)

> > fsck the filesystem.
>
> bash-4.1$ fsck /dev/sda1
> fsck from util-linux-ng 2.17.2
> e2fsck 1.41.12 (17-May-2010)
> /dev/sda1: clean, 20074506/2929688576 files, 2775975889/2929686016 blocks
> bash-4.1$

Definitely a good sign.

> >Did I read correctly that this is connected via NFSv4? I sure hope not... (I'm willing to admit it's a phobia, but there's no *WAY* I would trust my backup to work across NFS...)
>
> The drives are local SATA ones that I set up in a raid 5, directly mounted. Def not NFS. I had an unrelated drive mounted via NFS, but that had nothing to do with my backup system and that's probably the source of confusion.

md raid5? What's the status of /proc/mdstat?

> So the du command finished, here's the result:
>
> bash-4.1$ du -hs /backup/pool /backup/cpool /backup/pc/fileserver/*
> 1.4T /backup/pc/fileserver/529
> 1.4T /backup/pc/fileserver/534
> 1.4T /backup/pc/fileserver/540
> 1.4T /backup/pc/fileserver/544
> 1.3T /backup/pc/fileserver/549

First, you may want to delete one or more of these to free up space.

Second, these are all 5 backups apart. 5 is an odd number. If they were fulls I would expect them to be *7* days apart, unless you have something crazy like it taking 3 days to run a full backup or something. But I'm going to assume that those are full backups.

Next, examine the logs for those backups and find out what went wrong. It's probably the error message that you already copied, which Jeff commented on. How many errors are we talking about? Find which files are causing the problem. Is it just a few large files, or a lot of little ones?
It's possible that those files have become corrupted *within* the pool and that's what's causing problems. If it's not an underlying device/filesystem problem, then it might be the compression as Jeff mentioned. (Reason #53 why I have *nothing* to do with compression with my backups!) You may be able to delete these files out of the pool and BackupPC will re-create them when you do your next backup.

In short, though, it seems that your pool is corrupted. I tend to be *VERY* conservative when it comes to my backups. When I don't need them, they are completely valueless. But when I need them, they are GOLD. So right this second, while you don't need them, I would suggest biting the bullet and rebuilding the pool. (In my book, rebuilding the pool means starting from scratch: re-create the array, reformat the partition and reinstall BackupPC.)

Of course, I wouldn't do that without some *other* sort of backup. But it seems you have less than 3TB of *total* data (for a single copy). I'd buy an external drive and do a backup of each and every system (using some other tool such as NTBackup or Windows Server backup for Windows, and a complete manual rsync for Linux) to it before I destroyed my BackupPC. But that's me. I'm extremely conservative with backup.

Unfortunately, now that you've localized the problem, I am unlikely to be able to help. I have no knowledge related to the error messages you've reported, and you (and others) can operate Google as well as I can...

Tim Massey
Out of the Box Solutions, Inc.
Creative IT Solutions Made Simple!
http://www.OutOfTheBoxSolutions.com
tm...@ob...
22108 Harper Ave.
St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)
Cell: (586)945-8796
|
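A rough sketch of the kernel-log check described in the message above, assuming GNU grep and that the output of dmesg has already been saved to a file. The function name scan_kernel_log and the keyword list are made up for illustration and come from neither BackupPC nor the thread; the filter is only a heuristic and does not replace reading the file.

```shell
# scan_kernel_log FILE
# Print lines of a saved dmesg/messages file that mention a disk-like
# device (sdX, ataN, scsi, mdN) together with a worrying keyword.
# Hypothetical helper; the keyword list is an assumption, not exhaustive.
scan_kernel_log() {
    grep -iE '(sd[a-z]|ata[0-9]|scsi|md[0-9])' "$1" |
        grep -iE '(error|fail|timeout|reset|i/o)' || true
}
```

Typical use would be `dmesg > dmesg.txt` followed by `scan_kernel_log dmesg.txt`. For a Linux md array, `cat /proc/mdstat` shows whether the array is degraded or resyncing.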
From: Craig O'B. <co...@fi...> - 2013-10-29 19:30:53
|
The topdir is /var/lib/BackupPC which is a link to /backup If I do an ls -l /var/lib I get a bunch of other directories as well as: lrwxrwxrwx. 1 root root 7 Dec 17 2011 BackupPC -> /backup bash-4.1$ ls -l /backup total 20 drwxr-x---. 18 backuppc root 4096 Oct 25 21:01 cpool drwx------. 2 root root 4096 Dec 17 2011 lost+found drwxr-x---. 76 backuppc root 4096 Oct 24 16:00 pc drwxr-x---. 2 backuppc root 4096 Dec 24 2012 pool drwxr-x---. 2 backuppc root 4096 Oct 29 01:05 trash bash-4.1$ It's only backuppc stuff on there. I did it this way to give the backuppc pool a really large drive to itself. As far as things done from the command line I've deleted computers inside of the pc directory that I no longer needed to backup. From my understanding that combined with removing the pc from the /etc/BackupPC/hosts file would free up any space those backups used to use in the pool. I've manually stopped and started the backuppc daemon when I've made config changes, or added/removed a pc. At one point I had almost all of the pc's being backed up with SMB, and switched them all to using rsync. I ran the du -hs command you recommended, I'll post the results when it eventually finishes. Thank you. |
From: Marcel M. <mai...@fo...> - 2013-10-29 19:48:32
|
> I've deleted computers inside of the pc directory that I no longer needed to backup. From my understanding that combined with removing the pc from the /etc/BackupPC/hosts file would free up any space those backups used to use in the pool.

No. You'll have to wait until the next BackupPC_nightly jobs have run through the pool completely. If only half the pool is traversed each night, that takes 2 days (and so on).

Regards
Marcel
--
Registered Linux User #307343 |
From: Timothy J M. <tm...@ob...> - 2013-10-29 21:37:03
|
"Craig O'Brien" <co...@fi...> wrote on 10/29/2013 03:30:46 PM:

> The topdir is /var/lib/BackupPC which is a link to /backup

I missed that in your previous e-mail. Stupid proportional fonts... (And you might want to add a -h for commands like du and df: the -h is for human-readable... When the numbers are for things in the *terabytes*, it's a lot of digits to manage...)

> If I do an ls -l /var/lib
> I get a bunch of other directories as well as:
> lrwxrwxrwx. 1 root root 7 Dec 17 2011 BackupPC -> /backup
>
> bash-4.1$ ls -l /backup
> total 20
> drwxr-x---. 18 backuppc root 4096 Oct 25 21:01 cpool
> drwx------. 2 root root 4096 Dec 17 2011 lost+found
> drwxr-x---. 76 backuppc root 4096 Oct 24 16:00 pc
> drwxr-x---. 2 backuppc root 4096 Dec 24 2012 pool
> drwxr-x---. 2 backuppc root 4096 Oct 29 01:05 trash
> bash-4.1$

That is much clearer, thank you.

> It's only backuppc stuff on there. I did it this way to give the backuppc pool a really large drive to itself.

Sounds reasonable.

> As far as things done from the command line I've deleted computers inside of the pc directory that I no longer needed to backup. From my understanding that combined with removing the pc from the /etc/BackupPC/hosts file would free up any space those backups used to use in the pool.

The way the pooling works is that any pool file with only one hardlink (that is, one that no backup references any more) is deleted from the pool. Given that BackupPC is saying that the pool is only <3TB big, then your problem is *not* that there are things in the pool that aren't anywhere else. Your problem is the exact opposite: you have files somewhere else that are *not* part of the pool!

> I've manually stopped and started the backuppc daemon when I've made config changes, or added/removed a pc. At one point I had almost all of the pc's being backed up with SMB, and switched them all to using rsync.

Again, I could imagine lots of ways this might explode your pool, but you have the exact opposite problem: your pool is too small!
Please run Jeff's command to find out where you have files that are not in the pool. That will be most informative. > I ran the du -hs command you recommended, I'll post the results when > it eventually finishes. Thank you. I doubt that will help if you did it from /backup. The point of that was to isolate other non-BackupPC folders. Check lost+found and trash while you're at it and see what's in there. They should both be empty. I'm with Jeff: I think that you have multiple PC trees that are not part of the pool. How you managed that I'm not sure. But you need to find those files and clean them up. Start with Jeff's command and go from there. Tim Massey Out of the Box Solutions, Inc. Creative IT Solutions Made Simple! http://www.OutOfTheBoxSolutions.com tm...@ob... 22108 Harper Ave. St. Clair Shores, MI 48080 Office: (800)750-4OBS (4627) Cell: (586)945-8796 |
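A minimal sketch of the "files with only one hardlink" check, assuming a find that supports the POSIX -links test and the /backup layout from this thread. This is not Jeff's command (it is not reproduced in this part of the thread); the function name list_unpooled is hypothetical.

```shell
# list_unpooled DIR
# Print regular files under DIR that have exactly one hard link,
# i.e. files that share their inode with nothing else on the disk.
# Hypothetical helper name, used here only for illustration.
list_unpooled() {
    find "$1" -type f -links 1 -print
}
```

Something like `list_unpooled /backup/pc/fileserver | head` would give a first impression without waiting for the whole tree to be walked. Some hits are probably harmless (per-host log files, for example), so the list is a starting point rather than a verdict.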
From: Les M. <les...@gm...> - 2013-10-29 21:51:18
|
On Tue, Oct 29, 2013 at 4:30 PM, Timothy J Massey <tm...@ob...> wrote: > > > Check lost+found and trash while you're at it and see what's in there. They should both be empty. > > I'm with Jeff: I think that you have multiple PC trees that are not part of the pool. How you managed that I'm not sure. But you need to find those files and clean them up. Start with Jeff's command and go from there. This could happen if the backups were originally on a different filesystem and were copied over without preserving the pool hardlinks. For example if you rsync an individual pc directory into place, subsequent rsync runs will link against those copies for existing files but will only make the pool links for new/changed files. -- Les Mikesell les...@gm... |
From: <bac...@ko...> - 2013-10-29 22:07:54
|
Les Mikesell wrote at about 16:51:12 -0500 on Tuesday, October 29, 2013: > On Tue, Oct 29, 2013 at 4:30 PM, Timothy J Massey <tm...@ob...> wrote: > > > > > > Check lost+found and trash while you're at it and see what's in there. They should both be empty. > > > > I'm with Jeff: I think that you have multiple PC trees that are not part of the pool. How you managed that I'm not sure. But you need to find those files and clean them up. Start with Jeff's command and go from there. > > This could happen if the backups were originally on a different > filesystem and were copied over without preserving the pool hardlinks. > For example if you rsync an individual pc directory into place, > subsequent rsync runs will link against those copies for existing > files but will only make the pool links for new/changed files. > > -- It also can happen if you have filesystems with flaky hard linking -- I once had that issue with a bad user-space nfs module. |
From: Craig O'B. <co...@fi...> - 2013-10-30 00:21:18
|
The folder /backup is the root of the disk. I mounted the disk there, doing the ls -l /backup showed all the root folders on the disk. Perhaps there is something going on with the PC folders, as the lost+found and trash folders are both empty. I'm not sure how I can go about determining if a particular backup is using the pool or just storing the files in the PC folder. What's the best way to check if a given backup set is represented in the pool or not? Would knowing the size of all the pc folders help narrow it down? I'm not sure if this is the best way to check the hard linking, but here's a test I thought might be helpful. I did this command to see if a common file in these backups are pointing to the same inodes. bash-4.1$ ls -i /backup/pc/*/*/ffileshare/fWindows/fexplorer.exe The output is long so I'll give a snippet: bash-4.1$ ls -i /backup/pc/*/*/ffileshare/fWindows/fexplorer.exe 635979167 /backup/pc/120p1m1/75/ffileshare/fWindows/fexplorer.exe 646452561 /backup/pc/7qk56d1/79/ffileshare/fWindows/fexplorer.exe 635979167 /backup/pc/120p1m1/76/ffileshare/fWindows/fexplorer.exe 646452561 /backup/pc/7qk56d1/80/ffileshare/fWindows/fexplorer.exe 635979167 /backup/pc/327kkn1/87/ffileshare/fWindows/fexplorer.exe 646452561 /backup/pc/7qk56d1/81/ffileshare/fWindows/fexplorer.exe 635979167 /backup/pc/327kkn1/88/ffileshare/fWindows/fexplorer.exe 646452561 /backup/pc/7qk56d1/82/ffileshare/fWindows/fexplorer.exe And it continued like that which shows me that a common file is going to the same inodes in these backups which tells me the pool should be working in theory. (I'm assuming the 2 variants account for different versions of windows.) So I'm pretty stumped at how to figure out what happened to it. 
Regards, Craig |
From: Adam G. <mai...@we...> - 2013-10-30 02:42:53
|
On 30/10/13 11:21, Craig O'Brien wrote: > The folder /backup is the root of the disk. I mounted the disk there, > doing the ls -l /backup showed all the root folders on the disk. > Perhaps there is something going on with the PC folders, as the > lost+found and trash folders are both empty. > > I'm not sure how I can go about determining if a particular backup is > using the pool or just storing the files in the PC folder. What's the > best way to check if a given backup set is represented in the pool or > not? Would knowing the size of all the pc folders help narrow it down? > > I'm not sure if this is the best way to check the hard linking, but > here's a test I thought might be helpful. I did this command to see if > a common file in these backups are pointing to the same inodes. > I'm fairly sure: du -sm /backup/pool /backup/cpool /backup/pc/* It should count all the data under pool and cpool, and there should be minimal space used for the pc folders (because it counts the space for the first time the inode is seen) The other way I've checked is with "stat filename" which will show the number of links to the file. Regards, Adam > On Tue, Oct 29, 2013 at 6:07 PM, <bac...@ko... > <mailto:bac...@ko...>> wrote: > > Les Mikesell wrote at about 16:51:12 -0500 on Tuesday, October 29, > 2013: > > On Tue, Oct 29, 2013 at 4:30 PM, Timothy J Massey > <tm...@ob... <mailto:tm...@ob...>> wrote: > > > > > > > > > Check lost+found and trash while you're at it and see what's > in there. They should both be empty. > > > > > > I'm with Jeff: I think that you have multiple PC trees that > are not part of the pool. How you managed that I'm not sure. But > you need to find those files and clean them up. Start with Jeff's > command and go from there. > > > > This could happen if the backups were originally on a different > > filesystem and were copied over without preserving the pool > hardlinks. 
-- Adam Goryachev Website Managers www.websitemanagers.com.au |
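A small sketch of the "stat filename" check mentioned above, assuming GNU coreutils stat (the -c '%h' format prints the hard link count). The helper names link_count and shared_or_not are made up for this example.

```shell
# link_count FILE
# Print the number of hard links to FILE (GNU stat format %h).
link_count() {
    stat -c '%h' "$1"
}

# shared_or_not FILE
# Report whether FILE shares its inode with at least one other name.
# Hypothetical helper names; nothing here is part of BackupPC.
shared_or_not() {
    if [ "$(link_count "$1")" -gt 1 ]; then
        echo "shared: $1"
    else
        echo "NOT shared: $1"
    fi
}
```

A count above 1 only shows that the inode has more than one name. It does not by itself prove that one of those names lives under cpool, since two backups can be linked to each other without either being linked to the pool.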
From: Sharuzzaman A. R. <sha...@gm...> - 2013-10-30 02:58:51
|
Have you removed some PC from the backup list? If you have, the folder to that PC is still available in /backup/pc/<pc name> . You have to remove the folder manually. I believe that will cause high disk usage, as it is not linking to the pool. Note at the bottom of Edit Hosts: To delete a host, hit the Delete button. For Add, Delete, and configuration copy, changes don't take effect until you select Save. None of the deleted host's backups will be removed, so if you accidently delete a host, simply re-add it. *To completely remove a host's backups, you need to manually remove the files below /var/lib/backuppc/pc/HOST * Thanks. On Wed, Oct 30, 2013 at 8:21 AM, Craig O'Brien <co...@fi...> wrote: > The folder /backup is the root of the disk. I mounted the disk there, > doing the ls -l /backup showed all the root folders on the disk. Perhaps > there is something going on with the PC folders, as the lost+found and > trash folders are both empty. > > I'm not sure how I can go about determining if a particular backup is > using the pool or just storing the files in the PC folder. What's the best > way to check if a given backup set is represented in the pool or not? Would > knowing the size of all the pc folders help narrow it down? > > I'm not sure if this is the best way to check the hard linking, but here's > a test I thought might be helpful. I did this command to see if a common > file in these backups are pointing to the same inodes. 
-- Sharuzzaman Ahmat Raslan |
From: Timothy J M. <tm...@ob...> - 2013-10-30 13:11:22
|
"Craig O'Brien" <co...@fi...> wrote on 10/29/2013 08:21:11 PM: > I'm not sure how I can go about determining if a particular backup > is using the pool or just storing the files in the PC folder. What's > the best way to check if a given backup set is represented in the > pool or not? Would knowing the size of all the pc folders help narrow it down? Nope. > I'm not sure if this is the best way to check the hard linking, but > here's a test I thought might be helpful. I did this command to see > if a common file in these backups are pointing to the same inodes. You want to look for files in the pc directory that have only one hardlink. These files are not in the pool and need to either be deleted or connected to files in the pool. Jeff Kosowsky gave you a command to list files with only one hardlink. Adam Goryachev gave you a good du command to find out how much space is being taken by the pc directory *after* counting the files in the pool separately. That command will take a long time to run, but it will give you a pretty clear idea of where the space is being consumed. I would not stake my life on this, but I would bet a pretty substantial amount of money: you did something to break the pooling. Most likely by copying backups around. This undid the hardlinks and left you with individual copies of the files. Or punt completely: rebuild the BackupPC server and start over. You could do almost as well by confirming that your latest backups *are* hardlinking properly and then deleting all of the old backups except maybe a copy or two. I would not delete the copies by hand, but rather change the configuration to only keep 1 full and 1 incremental. It might be a good idea to make some archives to make sure you have a good copy somewhere. In any case, once BackupPC has deleted all of the old backups, go into your pc directories and make sure that there is indeed only the backups listed in the GUI in the folder structure. 
Then, change the incremental and full keep counts back to what they should be and allow it to rebuild. The only other thing that I can think of is that you did something wrong with archiving and accidentally archived data somewhere within the BackupPC tree. In my case, I archive to a removable hard drive and sometimes the drive is not mounted when the archive runs. The archives are then put on the backup drive (because that's where the removable drive is mounted). That's tricky because you can't see the files when the drive *is* mounted (which is the vast majority of the time). I have to unmount the drive and then I can see terabytes of archive data that should have been written to a removable drive. I don't know if that might be part of your problem. But it's the only other thing I can think of. Tim Massey Out of the Box Solutions, Inc. Creative IT Solutions Made Simple! http://www.OutOfTheBoxSolutions.com tm...@ob... 22108 Harper Ave. St. Clair Shores, MI 48080 Office: (800)750-4OBS (4627) Cell: (586)945-8796 |
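The unmounted-drive trap described above can be guarded against with a check before anything is written. The sketch below assumes GNU coreutils stat and uses the classic trick of comparing a directory's device number with its parent's; the function name is_mount_point is hypothetical, and the mountpoint(1) tool from util-linux does the same job more thoroughly (this version will not notice a bind mount from the same filesystem, and it reports / as not mounted).

```shell
# is_mount_point DIR
# Succeed if DIR sits on a different device than its parent directory,
# which is the usual sign that something is mounted there.
# Hypothetical helper; a simplified stand-in for mountpoint(1).
is_mount_point() {
    [ -d "$1" ] || return 1
    [ "$(stat -c '%d' "$1")" != "$(stat -c '%d' "$1/..")" ]
}
```

An archive job could start with something like `is_mount_point /mnt/archive || exit 1`, where /mnt/archive stands in for wherever the removable drive is supposed to be mounted.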
From: Holger P. <wb...@pa...> - 2013-10-30 15:14:01
|
Hi, I'll reply here, because I think the issue is visible here. Craig O'Brien wrote on 2013-10-29 15:30:46 -0400 [Re: [BackupPC-users] Disk space used far higher than reported pool size]: > The topdir is /var/lib/BackupPC which is a link to /backup I believe that may be your problem. I'm not sure why, but I vaguely recall reports of linking problems related to softlinking the pool directory (though I'm not absolutely sure about the exact circumstances, hence "vaguely recall" ;-), though *in theory* it should work. I'd always prefer mounting the partition to /var/lib/backuppc instead of the softlink if you don't have a *very* good reason for requiring the pool to be mounted elsewhere. Like Tim, I also wouldn't bet my life on it, but I'm fairly sure you'll find large amounts of "BackupPC_link got error -4 when calling MakeFileLink" messages in your log files. Also note that at least for *rsync backups* files will be hardlinked to identical copies in previous backups even if pooling isn't working. Hope that helps, and wish I'd found the time yesterday, Regards, Holger |
From: Les M. <les...@gm...> - 2013-10-30 15:26:47
|
On Wed, Oct 30, 2013 at 10:12 AM, Holger Parplies <wb...@pa...> wrote: > > Craig O'Brien wrote on 2013-10-29 15:30:46 -0400 [Re: [BackupPC-users] Disk space used far higher than reported pool size]: >> The topdir is /var/lib/BackupPC which is a link to /backup > > I believe that may be your problem. I'm not sure why, but I vaguely recall > reports of linking problems related to softlinking the pool directory (though > I'm not absolutely sure about the exact circumstances, hence "vaguely > recall" ;-), though *in theory* it should work. I'd always prefer mounting the > partition to /var/lib/backuppc instead of the softlink if you don't have a > *very* good reason for requiring the pool to be mounted elsewhere. > > Like Tim, I also wouldn't bet my life on it, but I'm fairly sure you'll find > large amounts of "BackupPC_link got error -4 when calling MakeFileLink" > messages in your log files. If you can find those errors in the logs for files that still exist you might try doing the same link with the ln command to see if it gives a better diagnostic for why it fails. It could be something quirky like a small limit to the number of hardlinks allowed in your target filesystem, or something you are missing about permissions/ownership (nfsv4 can be weird). -- Les Mikesell les...@gm... |
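A sketch of both steps suggested above: counting the link errors in a log, then retrying one link by hand so that ln can report its own reason for failing. The function names are hypothetical, and the log file passed in is assumed to be plain text; rotated BackupPC logs may be compressed and would first need to be expanded (BackupPC ships BackupPC_zcat for that).

```shell
# count_link_errors LOGFILE...
# Count log lines that mention the BackupPC_link error quoted above.
count_link_errors() {
    cat "$@" | grep -c 'BackupPC_link got error' || true
}

# try_link SRC DST
# Attempt the hard link by hand. On success the test link is removed
# again; on failure ln's own message on stderr is the diagnostic.
# Hypothetical helper names, for illustration only.
try_link() {
    if ln "$1" "$2"; then
        echo "link ok"
        rm -f "$2"
    else
        echo "link failed"
        return 1
    fi
}
```

For try_link, SRC would be a file under the pc tree named in one of the error lines, and DST a scratch name that does not exist yet on the same filesystem.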
From: Holger P. <wb...@pa...> - 2013-10-30 15:50:09
|
Just to add two things ... Holger Parplies wrote on 2013-10-30 16:12:02 +0100 [Re: [BackupPC-users] Disk space used far higher than reported pool size]: > [...] > Like Tim, I also wouldn't bet my life on it, but I'm fairly sure you'll find > large amounts of "BackupPC_link got error -4 when calling MakeFileLink" > messages in your log files. That would be the main server log file ($topDir/log/LOG and the rotated older copies), I guess. > Also note that at least for *rsync backups* files will be hardlinked to > identical copies in previous backups even if pooling isn't working. Since you have just written that you are, in fact, using rsync, I should add that this will make recovery more difficult, since unchanged files will continue to be just linked to the version in the reference backup. This file should normally already be linked to the pool, and BackupPC makes no further effort to check and fix that. This means that if you delete all but a few recent backups after fixing the root cause of the problem, only newly changed files will be added to the pool, while unchanged files will remain outside the pool. This may or may not be a problem for you. If it is, you will need to fix pooling for (some) existing backups, which is Hard(tm) (i.e. costly in terms of CPU power). Jeffrey, I think we need a script to check pooling? My (still unfinished) BackupPC_copyPool can generate a (huge) list of files, which can be sort(1)ed by inode number. Parsing that should easily reveal anything not correctly linked in an acceptable time frame (of course *generating* the list takes one traversal of all pool and pc directories, but the rest would be fast enough). Does that help, or have you already got something more suited? Are you interested or should I be? ;-) Regards, Holger |
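The inode-based pooling check discussed above could look roughly like the sketch below. It is not BackupPC_copyPool and not an existing BackupPC script; it assumes GNU find (for -printf) and awk, uses an awk lookup table instead of a sort(1) pass, and the function name unpooled_by_inode is made up. File names containing newlines would confuse it, and on a pool with millions of files the lookup table will need a fair amount of memory.

```shell
# unpooled_by_inode POOLDIR PCDIR
# Print every regular file under PCDIR whose inode number does not
# appear anywhere under POOLDIR, i.e. files not linked into the pool.
# Hypothetical helper; both directories must be on the same filesystem.
unpooled_by_inode() {
    pool_inodes=$(mktemp) || return 1
    # One traversal of the pool: collect inode numbers.
    find "$1" -type f -printf '%i\n' > "$pool_inodes"
    # One traversal of the pc tree: keep paths with unknown inodes.
    find "$2" -type f -printf '%i %p\n' |
        awk -v pool="$pool_inodes" '
            BEGIN { while ((getline line < pool) > 0) seen[line] = 1 }
            { ino = $1; sub(/^[^ ]+ /, ""); if (!(ino in seen)) print }
        '
    rm -f "$pool_inodes"
}
```

On the layout from this thread the call would be along the lines of `unpooled_by_inode /backup/cpool /backup/pc/fileserver`. Unlike a plain link-count test, this also catches files that are linked between backups but not into the pool.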
From: Les M. <les...@gm...> - 2013-10-30 16:28:32
|
On Wed, Oct 30, 2013 at 10:48 AM, Holger Parplies <wb...@pa...> wrote:
>
>> Also note that at least for *rsync backups* files will be hardlinked to identical copies in previous backups even if pooling isn't working.
>
> Since you have just written that you are, in fact, using rsync, I should add that this will make recovery more difficult, since unchanged files will continue to be just linked to the version in the reference backup. This file should normally already be linked to the pool, and BackupPC makes no further effort to check and fix that.
>
> This means that if you delete all but a few recent backups after fixing the root cause of the problem, only newly changed files will be added to the pool, while unchanged files will remain outside the pool. This may or may not be a problem for you. If it is, you will need to fix pooling for (some) existing backups, which is Hard(tm) (i.e. costly in terms of CPU power).

If you can free up some space to work, you could try renaming the pc/host directories one at a time to hold a copy that you could access in an emergency and let it start over with a fresh full run where all files would be new and linked to the pool. Once you have sufficient history in the new tree, you can delete the old one that you renamed. This should eventually clean things up as long as you don't continue to have link errors.

Alternatively, if you don't need more than the last backup for one or more targets, you could archive it with the archive host setup or by running BackupPC_tarCreate onto some other media, then delete the host and add it back to get a clean start.

--
Les Mikesell
les...@gm... |
From: Holger P. <wb...@pa...> - 2013-10-30 20:53:25
|
Hi,

Les Mikesell wrote on 2013-10-30 11:28:26 -0500 [Re: [BackupPC-users] Disk space used far higher than reported pool size]:
> On Wed, Oct 30, 2013 at 10:48 AM, Holger Parplies <wb...@pa...> wrote:
> >> Also note that at least for *rsync backups* files will be hardlinked to
> >> identical copies in previous backups even if pooling isn't working.
> > [...]
> > This means that if you delete all but a few recent backups after fixing the
> > root cause of the problem, only newly changed files will be added to the
> > pool, while unchanged files will remain outside the pool. This may or may
> > not be a problem for you. If it is, you will need to fix pooling for (some)
> > existing backups, which is Hard(tm) (i.e. costly in terms of CPU power).

I intentionally did not go into detail, because we have not yet confirmed that this is indeed the problem, and if so, what the requirements of the OP are.

> If you can free up some space to work, you could try renaming the
> pc/host directories one at a time to hold a copy that you could access
> in an emergency and let it start over with a fresh full run where all
> files would be new and linked to the pool.

Correct, but with an already almost full pool file system you *will* run into problems, because even unchanged files will now require a new copy. "Some space" might be a bit of an understatement :).

> [...]
> This should eventually clean things up as long as you don't continue
> to have link errors.

Yes, as it's basically an extension of "start off fresh" with the addition of "keep old history around in parallel". The notable thing is that you need to *make sure* you have eliminated the problem for there to be any point in starting over.

Aside from that, I would think it might be worth the effort of determining whether all hosts are affected or not (though I can't really see why there should be a difference between hosts). If some aren't, you could at least keep their history.

Regards,
Holger |
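A cheap first pass at the "are all hosts affected?" question could look like the sketch below. It is an assumption-laden illustration, not part of BackupPC: the function name `count_single_link_files` is hypothetical, and it assumes one subdirectory per host under the pc directory. It counts non-empty regular files with a link count of 1, which cannot be linked to the pool. Because rsync backups hardlink unchanged files to earlier backups, unpooled files can still have a link count above 1, so the numbers are only a lower bound.

```python
import os
import stat


def count_single_link_files(pc_dir):
    """Return {host: number of non-empty regular files with exactly one link}."""
    counts = {}
    for host in sorted(os.listdir(pc_dir)):
        host_dir = os.path.join(pc_dir, host)
        if not os.path.isdir(host_dir):
            continue
        counts[host] = 0
        for dirpath, _dirnames, filenames in os.walk(host_dir):
            for name in filenames:
                st = os.lstat(os.path.join(dirpath, name))
                # A link count of 1 means no other name (pool or older backup)
                # refers to this file's data.
                if stat.S_ISREG(st.st_mode) and st.st_size > 0 and st.st_nlink == 1:
                    counts[host] += 1
    return counts


# Hypothetical usage (example path, adjust to the real $topDir):
#   for host, n in count_single_link_files("/backup/pc").items():
#       print(host, n)
```

Every host will show a small baseline from its own log and bookkeeping files, which are never pooled. A host whose count is far above that baseline is a likely candidate for broken pooling, while a host near the baseline may still need the full inode comparison before its history is trusted.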