"Craig O'Brien" <cobrien@fishman.com> wrote on 10/31/2013 08:49:15 AM:

> The du -hs /backup/pool /backup/cpool /backup/pc/* has finished.
> Basically I had 1 host that was taking up 6.9 TB of data with 2.8 TB
> in the cpool directory and most of the other hosts averaging a GB each.


Well, there's your problem.

> The 1 host was our file server (which I happen to know has a 2 TB
> volume (1.3 TB currently used) that is our main fileshare).

>
> I looked through the error log for this pc on backups with the most
> errors and found thousands of these: 


Just out of curiosity, why hadn't you already done that?!?

> Unable to read 8388608 bytes from /var/lib/BackupPC//pc/
> myfileserver/new//ffileshare/RStmp got=0, seekPosn=1501757440 (0,
> 512,147872,1499463680,2422719488)

Interesting.  I'd make sure that the filesystem is OK before I went much further...  Stop BackupPC, unmount /backup and fsck /dev/<whatever>.

> du -hs /backup/pool /backup/cpool /backup/pc/myfileserver/* 
>
> to see which backups are doing the most damage. I'll report back
> once that finishes.


With that, you should be able to find the backup number(s) that are not linked.  You can delete them and free up space.
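
If du alone doesn't make it obvious, here is a rough Python sketch of another way to spot them.  It assumes a BackupPC 3.x style layout where every backed-up file under pc/<host>/<backup number> is a hard link into the pool, so a regular file with a link count of 1 has not been linked.  The function names are mine, not anything BackupPC ships, and the sizes are apparent file sizes, not the disk blocks that du reports.

```python
import os
import stat
import sys


def unlinked_usage(backup_dir):
    """Return (file_count, total_bytes) for regular files under backup_dir
    whose hard-link count is 1, i.e. files with no second link in the pool."""
    count = 0
    total = 0
    for root, _dirs, files in os.walk(backup_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # unreadable or vanished file: skip it
            if stat.S_ISREG(st.st_mode) and st.st_nlink == 1:
                count += 1
                total += st.st_size
    return count, total


def report_host(host_dir):
    """Map each subdirectory of host_dir (one per backup) to its
    (file_count, total_bytes) of unlinked files."""
    result = {}
    for entry in sorted(os.listdir(host_dir)):
        path = os.path.join(host_dir, entry)
        if os.path.isdir(path) and not os.path.islink(path):
            result[entry] = unlinked_usage(path)
    return result


if __name__ == "__main__" and len(sys.argv) == 2:
    # Example argument (assumed path): /backup/pc/myfileserver
    for backup, (count, total) in report_host(sys.argv[1]).items():
        print(f"{backup}: {count} unlinked files, {total / 2**30:.1f} GiB")
```

As far as I know a handful of unlinked files per backup is normal (zero-length files and per-backup metadata are not pooled), so what you are looking for is a backup where the unlinked total runs into hundreds of GB.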

The big question is, though, why they aren't linking.  I'd really start at the bottom of the stack (the physical drives) and work your way up.  Check dmesg for any hardware errors.  fsck the filesystem.  Did I read correctly that this is connected via NFSv4?  I sure hope not...  (I'm willing to admit it's a phobia, but there's no *WAY* I would trust my backup to work across NFS...)

Tim Massey
 
Out of the Box Solutions, Inc.
Creative IT Solutions Made Simple!

http://www.OutOfTheBoxSolutions.com
tmassey@obscorp.com
22108 Harper Ave.
St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)
Cell: (586)945-8796