From: Adam G. <mai...@we...> - 2013-10-30 22:05:04
On 31/10/13 07:51, Holger Parplies wrote:
> Yes, as it's basically an extension of "start off fresh" with the addition
> of "keep old history around in parallel". The notable thing is that you
> need to *make sure* you have eliminated the problem for there to be any
> point in starting over.
>
> Aside from that, I would think it might be worth the effort of determining
> whether all hosts are affected or not (though I can't really see why there
> should be a difference between hosts). If some aren't, you could at least
> keep their history.

I suspect at least some hosts OR some backups are correct, or else the OP
wouldn't have anything in the pool.

If you find the problem affects all hosts (the du command discussed
previously will tell you that), then you might want to look at one
individual host like this:

du -sm /backup/pool /backup/cpool /backup/pc/host1/*

This should be a *lot* quicker than the previous du command, and it should
also show minimal disk usage for each of host1's backups. It is quicker
because you are only looking at the set of files for the pool plus one
host.

PS: at this stage, you may want to look at the recent thread regarding disk
caches and caching directory entries instead of file contents. It might
help with all the directory-based searches you are doing to find the
problem. Long term, you may (or may not) want to keep those settings.

Regards,
Adam

--
Adam Goryachev
Website Managers
www.websitemanagers.com.au
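[Editorial note: the du trick works because du charges each hard-linked
inode only to the first path that reaches it within a single invocation, so
scanning pool and cpool first leaves a correctly pooled host with near-zero
apparent usage. A sketch that repeats the check for every host; the /backup
paths are assumptions carried over from this thread:

  # one du invocation per host; the pool is scanned first so pooled inodes
  # are charged to pool/cpool, and an affected host shows a large number
  for h in /backup/pc/*; do
      echo "== $h"
      du -sm /backup/pool /backup/cpool "$h" | tail -n 1
  done
]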
From: Craig O'B. <co...@fi...> - 2013-11-01 13:48:30
> This error shows BackupPC_dump segfault, and pointing to libperl.so
> How do you install your BackupPC? From source or from RPM?

I did a yum install backuppc, which got it from EPEL.

> That tells you it was unmounted cleanly last time, not that everything
> checks out OK. Try it with the -f option to make it do the actual checks.

bash-4.1$ fsck -f /dev/sda1
fsck from util-linux-ng 2.17.2
e2fsck 1.41.12 (17-May-2010)
Pass 1: Checking inodes, blocks, and sizes
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
/dev/sda1: 20074505/2929688576 files (0.3% non-contiguous), 2775975116/2929686016 blocks
bash-4.1$

> What distro are you using? (I use CentOS/RHEL)

CentOS release 6.4

> How many backups are/were you running in parallel?

Typically 4. But when I switched everything to rsync I wanted fulls done on
all the PCs, so I was running up to 8 at a time.

> I think that segfault in a perl process needs to be tracked down before
> expecting anything else to make sense.
> Either bad RAM or mismatching perl libs could break about anything else.

I installed perl-libs with yum as well. A yum info perl-libs tells me it
was installed from the updates repo.

I think what I'm going to try at this point is to delete the bad backups,
reinstall perl from EPEL, and keep an eye on it to see if it balloons up
again. Thanks for all your help!

Regards,
Craig

On Thu, Oct 31, 2013 at 10:09 PM, Sharuzzaman Ahmat Raslan
<sha...@gm...> wrote:
> In my experience, segfault in libraries usually caused by installing it
> from different source.
> [...]
From: Craig O'B. <co...@fi...> - 2013-11-01 14:11:14
> And this would explain why the elements are not being linked properly to
> the pool -- though I would have thought the more likely result would be a
> duplicate pool entry than an unlinked pool entry...
>
> It might be interesting to look for pool chains with the same
> (uncompressed) content and with links < HardLinkMax (typically 31999) to
> see if pool entries are being unnecessarily duplicated.
>
> Try: (cd /var/lib/BackupPC/cpool; find . -type f -links -3198 -name "*_*"
> -exec md5sum {} \;) | sort | uniq -d -w32
>
> Note this will find if there are any unnecessarily duplicated pool chains
> (beyond the base one). Note that to keep it fast and simple I am skipping
> the elements without a suffix... with the assumption being that if there
> are duplicated elements then there will probably be whole chains of
> them...

bash-4.1$ find . -type f -links -3198 -name "*_*" -exec md5sum {} \; | sort | uniq -d -w32
71f4cd3f08af68c2ab20c268d86fa9f3  ./c/9/0/c900361b8dc42b2094d836d43504708a_0
bash-4.1$

Looks like this did find something. What should I do with it?

Regards,
Craig
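[Editorial note: before deciding what to do with a duplicate chain entry,
it can help to see what it actually is. A sketch, with the path taken from
the output above and BackupPC's InstallDir taken from the config posted
later in this thread; BackupPC_zcat ships with BackupPC and decompresses a
pool file to stdout:

  cd /var/lib/BackupPC/cpool/c/9/0
  # inode, link count and size of every member of the chain
  ls -ial c900361b8dc42b2094d836d43504708a*
  # peek at the decompressed content to identify the file
  /usr/share/BackupPC/bin/BackupPC_zcat c900361b8dc42b2094d836d43504708a_0 | head
]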
From: Les M. <les...@gm...> - 2013-11-01 14:54:10
On Fri, Nov 1, 2013 at 8:48 AM, Craig O'Brien <co...@fi...> wrote:
>> This error shows BackupPC_dump segfault, and pointing to libperl.so
>> How do you install your BackupPC? From source or from RPM?
>
> I did a yum install backuppc, which got it from epel

Do you see any other segfaults in your logs (not necessarily just from
backuppc)?

>> How many backups are/were you running in parallel?
>
> Typically 4. But when I switched everything to rsync I wanted fulls done
> on all the pc's so I was running up to 8 at a time.

Most machines would get better overall throughput with a max of 2
concurrent runs (depending on a lot of things, of course...).

>> I think that segfault in a perl process needs to be tracked down before
>> expecting anything else to make sense.
>> Either bad RAM or mismatching perl libs could break about anything else.
>
> I installed perl-libs with yum as well. A yum info perl-libs tells me it
> was installed from the updates repo

Have you installed anything from repos other than the CentOS base and EPEL?
You shouldn't have any trouble with anything from those.

--
Les Mikesell
les...@gm...
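[Editorial note: on a CentOS 6 box like this one, the kernel logs
user-space segfaults to the system log, so Les's question can be answered
with something like the following, run as root:

  # list every recorded segfault, whatever the process
  grep -i segfault /var/log/messages*
  # and the kernel ring buffer, in case messages was already rotated
  dmesg | grep -i segfault
]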
From: Timothy J M. <tm...@ob...> - 2013-11-01 15:15:51
"Craig O'Brien" <co...@fi...> wrote on 11/01/2013 09:48:23 AM:

> > This error shows BackupPC_dump segfault, and pointing to libperl.so
> > How do you install your BackupPC? From source or from RPM?
>
> I did a yum install backuppc, which got it from epel

That's how I do it.

> > That tells you it was unmounted cleanly last time, not that everything
> > checks out OK. Try it with the -f option to make it do the actual
> > checks.
>
> bash-4.1$ fsck -f /dev/sda1
> fsck from util-linux-ng 2.17.2
> e2fsck 1.41.12 (17-May-2010)
> Pass 1: Checking inodes, blocks, and sizes
> Pass 2: Checking directory structure
> Pass 3: Checking directory connectivity
> Pass 4: Checking reference counts
> Pass 5: Checking group summary information
> /dev/sda1: 20074505/2929688576 files (0.3% non-contiguous), 2775975116/2929686016 blocks
> bash-4.1$

Good. I think we've eliminated a disk or filesystem issue, and I think
we're pretty comfortable it's a BackupPC corruption issue. It was hard to
tell when your error messages said that it could not seek to a particular
point in a file.

> > What distro are you using? (I use CentOS/RHEL)
>
> CentOS release 6.4

Same here.

> > I think that segfault in a perl process needs to be tracked down before
> > expecting anything else to make sense.
> > Either bad RAM or mismatching perl libs could break about anything
> > else.
>
> I installed perl-libs with yum as well. A yum info perl-libs tells me it
> was installed from the updates repo
>
> I think what I'm going to try at this point is to delete the bad backups,
> reinstall perl from epel, and keep an eye on it to see if it balloons up
> again. Thanks for all your help!

That's a very reasonable, if not very subtle, solution. I think you need
to monitor /var/log/messages for errors that mention backup, and see if
the crash returns. Jeff is (justifiably) worried that the crash caused
your corruption, but it could just as easily be the other way around. Once
you clean up from this, you want to make sure that nothing comes back.

If you've got the time, running memtest for a weekend might be a good
idea, too. The only thing it would cost is the downtime...

Tim Massey
Out of the Box Solutions, Inc.
Creative IT Solutions Made Simple!
http://www.OutOfTheBoxSolutions.com
tm...@ob...
22108 Harper Ave., St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)  Cell: (586)945-8796
From: Timothy J M. <tm...@ob...> - 2013-10-31 18:01:04
Les Mikesell <les...@gm...> wrote on 10/31/2013 01:54:24 PM:

> On Thu, Oct 31, 2013 at 12:33 PM, Craig O'Brien <co...@fi...> wrote:
> >
> >> fsck the filesystem.
> >
> > bash-4.1$ fsck /dev/sda1
> > fsck from util-linux-ng 2.17.2
> > e2fsck 1.41.12 (17-May-2010)
> > /dev/sda1: clean, 20074506/2929688576 files, 2775975889/2929686016 blocks
> > bash-4.1$
>
> That tells you it was unmounted cleanly last time, not that everything
> checks out OK. Try it with the -f option to make it do the actual checks.

Good catch! This should take a long time: 20 minutes to an hour? Maybe
more: the drives are full.

Tim Massey
Out of the Box Solutions, Inc.
Creative IT Solutions Made Simple!
http://www.OutOfTheBoxSolutions.com
tm...@ob...
22108 Harper Ave., St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)  Cell: (586)945-8796
From: Holger P. <wb...@pa...> - 2013-10-31 19:22:21
Hi,

I've spent far too long writing an email and trying to make it make sense
and then discarding it again. Just one thought I want to rescue: the RStmp
file was really *large* (something like 1.5 GB), your backup trees are
really *large* (1.4 TB), your pool FS is really *full* (27.5 GB free).
Running out of space during a backup is a bad idea. Both the RStmp file(s)
will be truncated (though that should trigger a second error when it is
*written*, just before it is read again) and the NewFileList, which would,
in turn, lead to BackupPC_link missing new files it would be supposed to
link into the pool (resulting in unlinked files).

That doesn't explain your situation, but it still might be something to
think about (and we might be seeing one problem on top of and as a result
of another). I agree with Jeffrey - an "Unable to read ..." error *without*
a preceding "Can't write len=... to .../RStmp" sounds like a mismatch
between the file length according to the attrib file and the result of
decompression of the compressed file - probably caused by corruption of
the compressed file (or the attrib file, though unlikely, because the size
is not "way off").

How many backups are/were you running in parallel?

Hope that helps.

Regards,
Holger
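[Editorial note: since an out-of-space condition mid-dump is what truncates
RStmp and NewFileList, one defensive option (a sketch, not something
proposed in this thread) is a free-space guard wired in via
$Conf{DumpPreUserCmd}, with $Conf{UserCmdCheckStatus} set to 1 so a
non-zero exit status aborts the dump. The script name, mount point, and
threshold below are all assumptions:

  #!/bin/sh
  # hypothetical /etc/BackupPC/pre-dump-space-check.sh
  # refuse to start a dump if the pool FS has less than 50 GB free
  min_free_kb=$((50 * 1024 * 1024))
  free_kb=$(df -P /backup | awk 'NR==2 {print $4}')
  if [ "$free_kb" -lt "$min_free_kb" ]; then
      echo "pool filesystem below ${min_free_kb} KB free; refusing to dump" >&2
      exit 1
  fi
  exit 0
]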
From: Les M. <les...@gm...> - 2013-10-31 19:57:45
On Thu, Oct 31, 2013 at 2:20 PM, Holger Parplies <wb...@pa...> wrote:
>
> That doesn't explain your situation, but it still might be something to
> think about (and we might be seeing one problem on top of and as a result
> of another). I agree with Jeffrey - an "Unable to read ..." error
> *without* a preceding "Can't write len=... to .../RStmp" sounds like a
> mismatch between the file length according to the attrib file and the
> result of decompression of the compressed file - probably caused by
> corruption of the compressed file (or the attrib file, though unlikely,
> because the size is not "way off").

I think that segfault in a perl process needs to be tracked down before
expecting anything else to make sense. Either bad RAM or mismatching perl
libs could break about anything else.

--
Les Mikesell
les...@gm...
From: Sharuzzaman A. R. <sha...@gm...> - 2013-11-01 02:10:03
In my experience, a segfault in a library is usually caused by installing
it from a different source.

For example, when I install BackupPC for CentOS, I use the one in the EPEL
repo. I make sure that all the libraries (perl and others) come only from
the CentOS base repo, and not from anywhere else, as installing them from
somewhere else might cause incompatibilities.

In fact, sometimes the EPEL repo also provides a perl library that
conflicts with the CentOS base repo, but I just ignore it and stick to the
base repo.

On Fri, Nov 1, 2013 at 3:57 AM, Les Mikesell <les...@gm...> wrote:
> [...]
> I think that segfault in a perl process needs to be tracked down
> before expecting anything else to make sense. Either bad RAM or
> mismatching perl libs could break about anything else.

--
Sharuzzaman Ahmat Raslan
From: Les M. <les...@gm...> - 2013-10-31 17:54:30
On Thu, Oct 31, 2013 at 12:33 PM, Craig O'Brien <co...@fi...> wrote:
>
>> fsck the filesystem.
>
> bash-4.1$ fsck /dev/sda1
> fsck from util-linux-ng 2.17.2
> e2fsck 1.41.12 (17-May-2010)
> /dev/sda1: clean, 20074506/2929688576 files, 2775975889/2929686016 blocks
> bash-4.1$

That tells you it was unmounted cleanly last time, not that everything
checks out OK. Try it with the -f option to make it do the actual checks.

> I don't suppose this helps give any insight to what happened? Thanks for
> all your help!

I think it is related to that RStmp file that isn't uncompressing
correctly so rsync can merge the changes - I'm not sure what happens after
that error, though, or how to find the compressed file that is probably
causing it.

--
Les Mikesell
les...@gm...
From: <bac...@ko...> - 2013-10-29 19:56:12
Craig O'Brien wrote at about 13:53:31 -0400 on Tuesday, October 29, 2013:
> On the General Server Information page, it says "Pool is 2922.42GB
> comprising 6061942 files and 4369 directories," but our pool file system
> which contains nothing but backuppc and is 11 TB in size is 100% full.
>
> I'm confused how this happened and even ran the BackupPC_nightly script
> by hand, which didn't seem to clear up any space. Judging by the reported
> pool size it should be less than 30% full. I could really use some help.
> Thanks in advance for any ideas on how to go about troubleshooting this.
>
> Regards,
> Craig

I suspect that you have (multiple) backups in the pc tree that are not
linked to the pool...

The following is a quick-and-dirty hack to find non-zero length *files* in
the pc tree that have fewer than 2 hard links (note this will miss files
that are linked to each other but not to the pool):

cd /var/lib/BackupPC/pc/
find */*/* -type f -links -2 -size +0 | grep -v "^[^/]*/[0-9]*/backupInfo"

(The grep pattern has no leading slash because find prints paths relative
to the pc directory; backupInfo is the one non-zero file per backup that is
legitimately unlinked.)
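[Editorial note: to turn that listing into a figure comparable with the
pool-vs-filesystem discrepancy, a sketch that totals the size of the
unlinked files (GNU find assumed, as on the CentOS box in this thread):

  cd /var/lib/BackupPC/pc/
  # sum the bytes held by single-link (i.e. unpooled) files,
  # excluding the legitimately unlinked backupInfo files
  find */*/* -type f -links -2 -size +0 ! -name backupInfo -printf '%s\n' \
      | awk '{s += $1} END {printf "%.1f GB not linked to the pool\n", s/1024/1024/1024}'
]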
From: Tyler J. W. <ty...@to...> - 2013-10-30 14:39:49
On 2013-10-29 17:53, Craig O'Brien wrote:
> On the General Server Information page, it says "Pool is 2922.42GB
> comprising 6061942 files and 4369 directories," but our pool file system
> which contains nothing but backuppc and is 11 TB in size is 100% full.

Did you forget to exclude the path to TopDir (usually /var/lib/backuppc)
from the backup of the BackupPC server itself? I've seen that before.
Heck, I've DONE that before.

Regards,
Tyler

--
"The intellectual is constantly betrayed by his vanity. Godlike he blandly
assumes that he can express everything in words; whereas the things one
loves, lives, and dies for are not, in the last analysis completely
expressible in words."
    -- Anne Morrow Lindbergh
From: Craig O'B. <co...@fi...> - 2013-10-30 15:07:23
> I'm fairly sure:
> du -sm /backup/pool /backup/cpool /backup/pc/*
> It should count all the data under pool and cpool, and there should be
> minimal space used for the pc folders (because it counts the space for
> the first time the inode is seen)

I'm trying that now. I'll report back when it finishes.

> To delete a host, hit the Delete button. For Add, Delete, and
> configuration copy, changes don't take effect until you select Save.
> None of the deleted host's backups will be removed, so if you accidently
> delete a host, simply re-add it. To completely remove a host's backups,
> you need to manually remove the files below /var/lib/backuppc/pc/HOST

This is how I've done it when I've removed a host. I would delete the
/backup/pc/host directory and remove the entry from the /etc/BackupPC/hosts
file.

> I would not stake my life on this, but I would bet a pretty substantial
> amount of money: you did something to break the pooling. Most likely by
> copying backups around. This undid the hardlinks and left you with
> individual copies of the files.

I don't doubt the pooling is probably broken, but I haven't moved any
backups around. For what it's worth, before I switched all the PCs to use
rsync instead of smb a couple months ago, my pool file system was sitting
at 30%. I don't know if that's relevant, but it does seem odd that my
problems seem to have started with that.

> Or punt completely: rebuild the BackupPC server and start over. You could
> do almost as well by confirming that your latest backups *are*
> hardlinking properly and then deleting all of the old backups except
> maybe a copy or two. I would not delete the copies by hand, but rather
> change the configuration to only keep 1 full and 1 incremental. It might
> be a good idea to make some archives to make sure you have a good copy
> somewhere. In any case, once BackupPC has deleted all of the old backups,
> go into your pc directories and make sure that there are indeed only the
> backups listed in the GUI in the folder structure. Then, change the
> incremental and full keep counts back to what they should be and allow it
> to rebuild.

I'll probably have to do that. At this point I'm just trying to add to the
knowledge base and figure out how it went wrong so it doesn't just happen
again.

> My thought was to parse the output of "df /path/to/drive" and confirm
> that it was mounted correctly.

Just in case it helps at all:

bash-4.1$ df -h /backup
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda1              11T  9.7T   28G 100% /backup
bash-4.1$

> Did you forget to exclude the path to TopDir (usually /var/lib/backuppc)
> from the backup of the BackupPC server itself? I've seen that before.
> Heck, I've DONE that before.

I don't have the server backing itself up. Here's my config file (with
#comment lines removed) just in case that helps at all.
-----------------------------------------
$Conf{ServerHost} = 'localhost';
$Conf{ServerPort} = -1;
$Conf{ServerMesgSecret} = '';
$Conf{MyPath} = '/bin';
$Conf{UmaskMode} = 23;
$Conf{WakeupSchedule} = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23];
$Conf{MaxBackups} = 4;
$Conf{MaxUserBackups} = 4;
$Conf{MaxPendingCmds} = 15;
$Conf{CmdQueueNice} = 10;
$Conf{MaxBackupPCNightlyJobs} = 4;
$Conf{BackupPCNightlyPeriod} = 1;
$Conf{MaxOldLogFiles} = 14;
$Conf{DfPath} = '/bin/df';
$Conf{DfCmd} = '$dfPath $topDir';
$Conf{SplitPath} = '/usr/bin/split';
$Conf{ParPath} = undef;
$Conf{CatPath} = '/bin/cat';
$Conf{GzipPath} = '/bin/gzip';
$Conf{Bzip2Path} = '/usr/bin/bzip2';
$Conf{DfMaxUsagePct} = 95;
$Conf{TrashCleanSleepSec} = 300;
$Conf{DHCPAddressRanges} = [];
$Conf{BackupPCUser} = 'backuppc';
$Conf{TopDir} = '/var/lib/BackupPC/';
$Conf{ConfDir} = '/etc/BackupPC/';
$Conf{LogDir} = '/var/log/BackupPC';
$Conf{InstallDir} = '/usr/share/BackupPC';
$Conf{CgiDir} = '/usr/share/BackupPC/sbin/';
$Conf{BackupPCUserVerify} = '1';
$Conf{HardLinkMax} = 31999;
$Conf{PerlModuleLoad} = undef;
$Conf{ServerInitdPath} = undef;
$Conf{ServerInitdStartCmd} = '';
$Conf{FullPeriod} = 90;
$Conf{IncrPeriod} = 7;
$Conf{FullKeepCnt} = [ 4 ];
$Conf{FullKeepCntMin} = 1;
$Conf{FullAgeMax} = 3000;
$Conf{IncrKeepCnt} = 13;
$Conf{IncrKeepCntMin} = 1;
$Conf{IncrAgeMax} = 45;
$Conf{IncrLevels} = [ 1 ];
$Conf{BackupsDisable} = 0;
$Conf{PartialAgeMax} = 3;
$Conf{IncrFill} = '0';
$Conf{RestoreInfoKeepCnt} = 5;
$Conf{ArchiveInfoKeepCnt} = 5;
$Conf{BackupFilesOnly} = {};
$Conf{BackupFilesExclude} = {};
$Conf{BlackoutBadPingLimit} = 3;
$Conf{BlackoutGoodCnt} = 7;
$Conf{BlackoutPeriods} = [ { 'hourEnd' => '19.5', 'weekDays' => [ 1, 2, 3, 4, 5 ], 'hourBegin' => 7 } ];
$Conf{BackupZeroFilesIsFatal} = '1';
$Conf{XferMethod} = 'rsyncd';
$Conf{XferLogLevel} = 1;
$Conf{ClientCharset} = '';
$Conf{ClientCharsetLegacy} = 'iso-8859-1';
$Conf{SmbShareName} = [ 'C$' ];
$Conf{SmbShareUserName} = '';
$Conf{SmbSharePasswd} = '';
$Conf{SmbClientPath} = '/usr/bin/smbclient';
$Conf{SmbClientFullCmd} = '$smbClientPath \\\\$host\\$shareName $I_option -U $userName -E -d 1 -c tarmode\\ full -Tc$X_option - $fileList';
$Conf{SmbClientIncrCmd} = '$smbClientPath \\\\$host\\$shareName $I_option -U $userName -E -d 1 -c tarmode\\ full -TcN$X_option $timeStampFile - $fileList';
$Conf{SmbClientRestoreCmd} = '$smbClientPath \\\\$host\\$shareName $I_option -U $userName -E -N -d 1 -c tarmode\\ full -Tx -';
$Conf{TarShareName} = [ '/' ];
$Conf{TarClientCmd} = 'sudo $tarPath -c -v -f -C $sharename -totals';
$Conf{TarFullArgs} = '$fileList+';
$Conf{TarIncrArgs} = '--newer=$incrDate+ $fileList+';
$Conf{TarClientRestoreCmd} = 'sudo $tarPath -x -v -f -C $sharename -totals';
$Conf{TarClientPath} = '/bin/gtar';
$Conf{RsyncClientPath} = '/usr/bin/rsync';
$Conf{RsyncClientCmd} = '$sshPath -q -x -l root $host $rsyncPath $argList+';
$Conf{RsyncClientRestoreCmd} = '$sshPath -q -x -l root $host $rsyncPath $argList+';
$Conf{RsyncShareName} = [ 'netbackup' ];
$Conf{RsyncdClientPort} = 873;
$Conf{RsyncdUserName} = ''; #Edited to remove detail
$Conf{RsyncdPasswd} = ''; #Edited to remove detail
$Conf{RsyncdAuthRequired} = '0';
$Conf{RsyncCsumCacheVerifyProb} = '0.01';
$Conf{RsyncArgs} = [ '--numeric-ids', '--perms', '--owner', '--group', '-D', '--links', '--hard-links', '--times', '--block-size=2048', '--recursive' ];
$Conf{RsyncArgsExtra} = [];
$Conf{RsyncRestoreArgs} = [ '--numeric-ids', '--perms', '--owner', '--group', '-D', '--links', '--hard-links', '--times', '--block-size=2048', '--relative', '--ignore-times', '--recursive' ];
$Conf{FtpShareName} = [ '' ];
$Conf{FtpUserName} = '';
$Conf{FtpPasswd} = '';
$Conf{FtpPassive} = '1';
$Conf{FtpBlockSize} = 10240;
$Conf{FtpPort} = 21;
$Conf{FtpTimeout} = 120;
$Conf{FtpFollowSymlinks} = '0';
$Conf{ArchiveDest} = '/tmp';
$Conf{ArchiveComp} = 'gzip';
$Conf{ArchivePar} = '0';
$Conf{ArchiveSplit} = 0;
$Conf{ArchiveClientCmd} = '$Installdir/bin/BackupPC_archiveHost $tarCreatePath $splitpath $parpath $host $backupnumber $compression $compext $splitsize $archiveloc $parfile *';
$Conf{SshPath} = '/usr/bin/ssh';
$Conf{NmbLookupPath} = '/usr/bin/nmblookup';
$Conf{NmbLookupCmd} = '$nmbLookupPath -A $host';
$Conf{NmbLookupFindHostCmd} = '$nmbLookupPath $host';
$Conf{FixedIPNetBiosNameCheck} = '0';
$Conf{PingPath} = '/bin/ping';
$Conf{PingCmd} = '$pingPath -c 1 -w 3 $host';
$Conf{PingMaxMsec} = 80;
$Conf{CompressLevel} = 3;
$Conf{ClientTimeout} = 172000;
$Conf{MaxOldPerPCLogFiles} = 12;
$Conf{DumpPreUserCmd} = undef;
$Conf{DumpPostUserCmd} = undef;
$Conf{DumpPreShareCmd} = undef;
$Conf{DumpPostShareCmd} = undef;
$Conf{RestorePreUserCmd} = undef;
$Conf{RestorePostUserCmd} = undef;
$Conf{ArchivePreUserCmd} = undef;
$Conf{ArchivePostUserCmd} = undef;
$Conf{UserCmdCheckStatus} = '0';
$Conf{ClientNameAlias} = undef;
$Conf{SendmailPath} = '/usr/sbin/sendmail';
$Conf{EMailNotifyMinDays} = '2.5';
$Conf{EMailFromUserName} = 'backuppc';
$Conf{EMailAdminUserName} = ''; #Edited to remove detail
$Conf{EMailUserDestDomain} = ''; #Edited to remove detail
$Conf{EMailNoBackupEverSubj} = undef;
$Conf{EMailNoBackupEverMesg} = undef;
$Conf{EMailNotifyOldBackupDays} = 7;
$Conf{EMailNoBackupRecentSubj} = undef;
$Conf{EMailNoBackupRecentMesg} = undef;
$Conf{EMailNotifyOldOutlookDays} = 5;
$Conf{EMailOutlookBackupSubj} = undef;
$Conf{EMailOutlookBackupMesg} = undef;
$Conf{EMailHeaders} = 'MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
';
$Conf{CgiAdminUserGroup} = 'admin';
$Conf{CgiAdminUsers} = ''; #Edited to remove detail
$Conf{CgiURL} = ''; #Edited to remove detail
$Conf{Language} = 'en';
$Conf{CgiUserHomePageCheck} = '';
$Conf{CgiUserUrlCreate} = 'mailto:%s';
$Conf{CgiDateFormatMMDD} = 1;
$Conf{CgiNavBarAdminAllHosts} = '1';
$Conf{CgiSearchBoxEnable} = '1';
$Conf{CgiNavBarLinks} = [
  { 'link' => '?action=view&type=docs', 'lname' => 'Documentation', 'name' => undef },
  { 'link' => 'http://backuppc.wiki.sourceforge.net', 'lname' => undef, 'name' => 'Wiki' },
  { 'link' => 'http://backuppc.sourceforge.net', 'lname' => undef, 'name' => 'SourceForge' }
];
$Conf{CgiStatusHilightColor} = {
  'Reason_backup_failed' => '#ffcccc',
  'Reason_backup_done' => '#ccffcc',
  'Reason_backup_canceled_by_user' => '#ff9900',
  'Reason_no_ping' => '#ffff99',
  'Disabled_OnlyManualBackups' => '#d1d1d1',
  'Status_backup_in_progress' => '#66cc99',
  'Disabled_AllBackupsDisabled' => '#d1d1d1'
};
$Conf{CgiHeaders} = '<meta http-equiv="pragma" content="no-cache">';
$Conf{CgiImageDir} = '/usr/share/BackupPC/html/';
$Conf{CgiExt2ContentType} = {};
$Conf{CgiImageDirURL} = '/BackupPC/images';
$Conf{CgiCSSFile} = 'BackupPC_stnd.css';
$Conf{CgiUserConfigEditEnable} = '1';
$Conf{CgiUserConfigEdit} = {
  'EMailOutlookBackupSubj' => '1', 'ClientCharset' => '1', 'TarFullArgs' => '1', 'RsyncdPasswd' => '1',
  'FtpBlockSize' => '1', 'IncrKeepCnt' => '1', 'PartialAgeMax' => '1', 'FixedIPNetBiosNameCheck' => '1',
  'SmbShareUserName' => '1', 'EMailFromUserName' => '1', 'ArchivePreUserCmd' => '0', 'PingCmd' => '0',
  'FullAgeMax' => '1', 'FtpUserName' => '1', 'PingMaxMsec' => '1', 'CompressLevel' => '1',
  'DumpPreShareCmd' => '0', 'BackupFilesOnly' => '1', 'EMailNotifyOldBackupDays' => '1', 'EMailAdminUserName' => '1',
  'RsyncCsumCacheVerifyProb' => '1', 'BlackoutPeriods' => '1', 'NmbLookupFindHostCmd' => '0', 'MaxOldPerPCLogFiles' => '1',
  'TarClientCmd' => '0', 'EMailNotifyOldOutlookDays' => '1', 'SmbSharePasswd' => '1', 'SmbClientIncrCmd' => '0',
  'FullKeepCntMin' => '1', 'RsyncArgs' => '1', 'FtpFollowSymlinks' => '1', 'ArchiveComp' => '1',
  'TarIncrArgs' => '1', 'EMailUserDestDomain' => '1', 'TarClientPath' => '0', 'RsyncClientCmd' => '0',
  'IncrFill' => '1', 'RestoreInfoKeepCnt' => '1', 'UserCmdCheckStatus' => '0', 'RsyncdClientPort' => '1',
  'IncrAgeMax' => '1', 'RsyncdUserName' => '1', 'RsyncRestoreArgs' => '1', 'ClientCharsetLegacy' => '1',
  'SmbClientFullCmd' => '0', 'ArchiveInfoKeepCnt' => '1', 'FtpShareName' => '1', 'BackupZeroFilesIsFatal' => '1',
  'EMailNoBackupRecentMesg' => '1', 'FtpPort' => '1', 'FullKeepCnt' => '1', 'TarShareName' => '1',
  'EMailNoBackupEverSubj' => '1', 'TarClientRestoreCmd' => '0', 'EMailNoBackupRecentSubj' => '1', 'ArchivePar' => '1',
  'XferLogLevel' => '1', 'ArchiveDest' => '1', 'RsyncdAuthRequired' => '1', 'ClientTimeout' => '1',
  'EMailNotifyMinDays' => '1', 'SmbClientRestoreCmd' => '0', 'ClientNameAlias' => '1', 'DumpPostShareCmd' => '0',
  'IncrLevels' => '1', 'EMailOutlookBackupMesg' => '1', 'BlackoutBadPingLimit' => '1', 'BackupFilesExclude' => '1',
  'FullPeriod' => '1', 'RsyncClientRestoreCmd' => '0', 'ArchivePostUserCmd' => '0', 'IncrPeriod' => '1',
  'RsyncShareName' => '1', 'FtpTimeout' => '1', 'RestorePostUserCmd' => '0', 'BlackoutGoodCnt' => '1',
  'ArchiveClientCmd' => '0', 'ArchiveSplit' => '1', 'FtpRestoreEnabled' => '1', 'XferMethod' => '1',
  'NmbLookupCmd' => '0', 'BackupsDisable' => '1', 'SmbShareName' => '1', 'FtpPasswd' => '1',
  'RestorePreUserCmd' => '0', 'RsyncArgsExtra' => '1', 'IncrKeepCntMin' => '1', 'EMailNoBackupEverMesg' => '1',
  'EMailHeaders' => '1', 'DumpPreUserCmd' => '0', 'RsyncClientPath' => '0', 'DumpPostUserCmd' => '0'
};
------------------------------------------------

Each of the config files in /etc/BackupPC/pc looks like this:

bash-4.1$ cat mypc.pl
$Conf{XferMethod} = 'rsyncd';
$Conf{RsyncdPasswd} = ''; #Edited to remove detail
$Conf{RsyncShareName} = [ 'fileshare' ];
bash-4.1$

Regards,
Craig
From: <bac...@ko...> - 2013-10-31 01:01:06
Holger Parplies wrote at about 16:48:11 +0100 on Wednesday, October 30, 2013:
> Jeffrey, I think we need a script to check pooling? My (still unfinished)
> BackupPC_copyPool can generate a (huge) list of files, which can be
> sort(1)ed by inode number. Parsing that should easily reveal anything not
> correctly linked in an acceptable time frame (of course *generating* the
> list takes one traversal of all pool and pc directories, but the rest
> would be fast enough). Does that help, or have you already got something
> more suited? Are you interested or should I be? ;-)

I have code that will do this in 2 related ways:

1. Run my routine "BackupPC_copyPcPool.pl" with the "--fixlinks|-f" option,
   which will fix missing (or invalid) pc-to-pool links on the fly as the
   routine crawls the pc tree (after creating an in-memory hash cache of
   the pool inodes).

2. Run my routine "BackupPC_copyPcPool.pl" to generate a list of the
   non-zero length, non-linked files in the pc tree (other than backupInfo,
   which is the only non-zero length file in the pc tree that should not be
   linked to the pool). The routine always creates this list, since these
   files would need to be transferred manually if not linked to the pool.
   Then pipe this list of unlinked files to my routine
   "BackupPc_relinkPcFiles.pl" to fix each of the non-linked files.

Note that BackupPC_copyPcPool works by caching the inode numbers of the
pool/cpool entries in a hash, which allows quick lookup and checking of
whether a pc tree file is linked to a valid pool file.

The above methods take care of the cases where pc tree files are:
1. Unlinked to anything else (nlinks = 1)
2. Linked to other pc files (but not to the pool)

Both methods properly either make a new link in the pool or delete the
existing pc file and link it to a pre-existing pool entry, depending on
whether or not a pool entry already exists.
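[Editorial note: the inode-cache idea can be approximated with standard
tools for a report-only check. A sketch under the default TopDir from this
thread (GNU find, sort, and join assumed; it lists but does not fix
anything):

  # inode numbers of every pool/cpool file, deduplicated
  find /var/lib/BackupPC/pool /var/lib/BackupPC/cpool -type f -printf '%i\n' \
      | sort -u > /tmp/pool.inodes
  # pc-tree files whose inode is not in that set, i.e. not linked to the
  # pool, whether nlinks is 1 or they are only linked to other pc files
  find /var/lib/BackupPC/pc -type f -size +0 ! -name backupInfo \
      -printf '%i %p\n' | sort -k1,1 | join -v 1 - /tmp/pool.inodes
]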
From: Les M. <les...@gm...> - 2013-10-31 14:38:04
On Thu, Oct 31, 2013 at 7:49 AM, Craig O'Brien <co...@fi...> wrote:
>
> Unable to read 8388608 bytes from
> /var/lib/BackupPC//pc/myfileserver/new//ffileshare/RStmp got=0,

What is the underlying storage here - nfs?

--
Les Mikesell
les...@gm...
From: Craig O'B. <co...@fi...> - 2013-10-31 14:47:27
> What is the underlying storage here - nfs?

Local SATA disks in a RAID 5 (5 disks, 3 TB each in capacity).

Regards,
Craig

On Thu, Oct 31, 2013 at 10:37 AM, Les Mikesell <les...@gm...> wrote:
> On Thu, Oct 31, 2013 at 7:49 AM, Craig O'Brien <co...@fi...> wrote:
> >
> > Unable to read 8388608 bytes from
> > /var/lib/BackupPC//pc/myfileserver/new//ffileshare/RStmp got=0,
>
> What is the underlying storage here - nfs?
> [...]
From: Les M. <les...@gm...> - 2013-10-31 15:15:50
On Thu, Oct 31, 2013 at 9:47 AM, Craig O'Brien <co...@fi...> wrote:
>> What is the underlying storage here - nfs?
>
> Local SATA disks in a RAID 5 (5 disks, 3TB each in capacity)

I think I'd force an fsck just on general principles, even though it will
take a long time to complete. Google turns up a few hits on similar
problems, but I don't see a definitive answer. RStmp is supposed to be
used to hold an uncompressed copy of the previous version of a large file
with changes so rsync can seek to match up the changed block positions, so
this error probably has something to do with your compressed copy being
corrupted and not uncompressing properly.

--
Les Mikesell
les...@gm...
From: Timothy J M. <tm...@ob...> - 2013-10-31 16:07:32
Holger Parplies <wb...@pa...> wrote on 10/30/2013 10:24:05 PM:

> as I understand it, the backups from before the change from smb to rsyncd
> are linked into the pool. Since the change, some or all are not. Whether
> the change of XferMethod has anything to do with the problem or whether
> it coincidentally happened at about the same point in time remains to be
> seen. I still suspect the link to $topDir as cause, and BackupPC_link is
> independent of the XferMethod used (so a change in XferMethod shouldn't
> have any influence).

To add my anecdote, I use a symbolic link for all of my BackupPC hosts (a
couple dozen?), and they all work fine. It's been my standard procedure
for almost as long as I've been using BackupPC. Example:

ls -l /var/lib
lrwxrwxrwx. 1 root root 22 Apr 22  2013 BackupPC -> /data/BackupPC/TopDir/

mount
/dev/sda1 on /data type ext4 (rw)

I understand phobias from earlier problems (see my earlier e-mail about my
thoughts on NFS and backups...), but I do not think this one is an issue.

> If the log files show nothing, we're back to finding the problem, but I
> doubt that. You can't "break pooling" by copying, as was suggested. Yes,
> you get independent copies of files, and they might stay independent, but
> changed files should get pooled again, and your file system usage
> wouldn't continue growing in such a way as it seems to be. If pooling is
> currently "broken", there's a reason for that, and there should be log
> messages indicating problems.

You are 100% correct, but it depends on how you define "break". Making a
copy of a backup will absolutely break pooling--for the copy you just
made! :) It won't prevent *future* copies from pooling, certainly. But it
sure can fill up a drive: even if pooling *is* working correctly for new
copies, the unpooled copies can still fill up the drive *and*
BackupPC_nightly won't do a thing about it.

Tim Massey
Out of the Box Solutions, Inc.
Creative IT Solutions Made Simple!
http://www.OutOfTheBoxSolutions.com
tm...@ob...
22108 Harper Ave., St. Clair Shores, MI 48080
Office: (800)750-4OBS (4627)  Cell: (586)945-8796
From: Marcel M. <mai...@fo...> - 2013-10-31 16:36:10
Hi,

> Example:
> ls -l /var/lib
> lrwxrwxrwx. 1 root root 22 Apr 22  2013 BackupPC -> /data/BackupPC/TopDir/
>
> mount
> /dev/sda1 on /data type ext4 (rw)

Out of curiosity - why don't you just configure /data/BackupPC/TopDir in
config.pl as the TopDir?

Regards
Marcel

--
Registered Linux User #307343
From: Les M. <les...@gm...> - 2013-10-31 16:59:33
On Thu, Oct 31, 2013 at 11:36 AM, Marcel Meckel <mai...@fo...> wrote:
>> Example:
>> ls -l /var/lib
>> lrwxrwxrwx. 1 root root 22 Apr 22  2013 BackupPC -> /data/BackupPC/TopDir/
>>
>> mount
>> /dev/sda1 on /data type ext4 (rw)
>
> Out of curiosity - why don't you just configure /data/BackupPC/TopDir
> in config.pl as the TopDir?

Versions earlier than 3.2 didn't allow that after the initial install -
and in distribution-packaged versions (rpm/deb) the location decision had
already been made by the packagers.

--
Les Mikesell
les...@gm...
From: <bac...@ko...> - 2013-11-01 16:18:30
Craig O'Brien wrote at about 10:11:07 -0400 on Friday, November 1, 2013:
> > And this would explain why the elements are not being linked properly
> > to the pool -- though I would have thought the more likely result would
> > be a duplicate pool entry than an unlinked pool entry...
> >
> > It might be interesting to look for pool chains with the same
> > (uncompressed) content and with links < HardLinkMax (typically 31999)
> > to see if pool entries are being unnecessarily duplicated.
> >
> > Try: (cd /var/lib/BackupPC/cpool; find . -type f -links -3198 -name
> > "*_*" -exec md5sum {} \;) | sort | uniq -d -w32
> >
> > Note this will find if there are any unnecessarily duplicated pool
> > chains (beyond the base one). Note to keep it fast and simple I am
> > skipping the elements without a suffix... with the assumption being
> > that if there are duplicated elements then there will probably be
> > whole chains of them...

I added some more bash-foo so that the following should find *any* and
*all* unnecessary pool dups...

(cd /var/lib/BackupPC/cpool; find . -name "*_0" | sed "s/_0$//" | (IFS=$'\n'; while read FILE; do find "${FILE}"* -links -3199 -exec md5sum {} \; | sort | uniq -D -w32 ; done))

Then do an 'ls -ial' to find the size and number of links each has. The
"-i" will also tell you the inode for later reference.
From: Holger P. <wb...@pa...> - 2013-11-01 17:59:05
Hi,

I get some diagnostics when reading this with 'use warnings
"wrong_numbers"' ...

bac...@ko... wrote on 2013-11-01 12:18:17 -0400 [Re: [BackupPC-users]
Disk space used far higher than reported pool size]:
> Craig O'Brien wrote at about 10:11:07 -0400 on Friday, November 1, 2013:
> > And this would explain why the elements are not being linked properly
> > to the pool -- though I would have thought the more likely result
> > would be a duplicate pool entry than an unlinked pool entry...
> >
> > It might be interesting to look for pool chains with the same
> > (uncompressed) content and with links < HardLinkMax (typically 31999)
> > to see if pool

this one looks correct. 31999. Unless of course you've changed it in
config.pl because your FS requirements differ.

> > entries are being unnecessarily duplicated.
> >
> > Try: (cd /var/lib/BackupPC/cpool; find . -type f -links -3198 -name
> > "*_*" -exec

This one doesn't.

> > md5sum {} \;) | sort | uniq -d -w32
> > [...]
>
> I added some more bash-foo so that the following should find *any* and
> *all* unnecessary pool dups...
>
> (cd /var/lib/BackupPC/cpool; find . -name "*_0" | sed "s/_0$//" |
> (IFS=$'\n'; while read FILE; do find "${FILE}"* -links -3199 -exec
> md5sum {} \; | sort | uniq -D -w32 ; done))

Nor does this one (the 3199 again). While it will find chain members with
fewer links than apparently necessary, it won't find all of them - only
those with a *far* too small link count. That might be sufficient,
depending on what we're looking for. You probably wouldn't have chosen the
(arbitrary) value "3199", though, if you hadn't in fact meant "31999" ;-).
And you wouldn't be saying "*any* and *all*" if you were meaning "some".

I'd like to point out three things:

1.) unnecessary duplication *within* the pool is not the problem we are
    looking for,
2.) if it were a problem, then because a duplicate was created way ahead
    of time and repeatedly, not because the overflow happens at 31950
    instead of 31999,
3.) finding "unnecessary duplicates" can have a normal explanation: if at
    some point you had more than 31999 copies of one file (content) in
    your backups, BackupPC would have created a pool duplicate. Some of
    the backups linking to the first copy would have expired over time,
    leaving behind a link count < 31999. Further rsync backups would tend
    to link to the second copy, at least for unchanging existing files (in
    full backups). In other cases, the first copy might be reused, but
    there's no guarantee the link count would be exactly 31999 (though it
    would probably tend to be).

    Having so many copies of identical file content in your backups would
    tend to happen for small files rather than huge ones, I would expect,
    and it doesn't seem to be very common anyway (in my pools, I find
    exactly one file with a link count of 60673 (XFS) and a total of five
    with more than 10000 links, the largest having 103 bytes (compressed)).

Regards,
Holger
From: <bac...@ko...> - 2013-11-01 18:47:05
Holger Parplies wrote at about 18:57:05 +0100 on Friday, November 1, 2013:
> I get some diagnostics when reading this with 'use warnings
> "wrong_numbers"' ...
>
> > > It might be interesting to look for pool chains with the same
> > > (uncompressed) content and with links < HardLinkMax (typically 31999)
> > > to see if pool
>
> this one looks correct. 31999. Unless of course you've changed it in
> config.pl because your FS requirements differ.
>
> > > Try: (cd /var/lib/BackupPC/cpool; find . -type f -links -3198 -name
> > > "*_*" -exec
>
> This one doesn't.

Oops, typo... dropped a 9.

> > (cd /var/lib/BackupPC/cpool; find . -name "*_0" | sed "s/_0$//" |
> > (IFS=$'\n'; while read FILE; do find "${FILE}"* -links -3199 -exec
> > md5sum {} \; | sort | uniq -D -w32 ; done))
>
> Nor does this one (the 3199 again).

Typo again... dropped a 9.

> I'd like to point out three things:
> 1.) unnecessary duplication *within* the pool is not the problem we are
>     looking for,

This is probably not his *primary* issue, since the pool is (only) ~3T.
But when he started talking about file read errors, I was concerned that
if the pool file reads were being truncated, then there would likely be
pool duplicates, since the byte-by-byte comparisons would fail for a given
partial file md5sum, leading to extra chain creation...

> 2.) if it were a problem, then because a duplicate was created way ahead
>     of time and repeatedly, not because the overflow happens at 31950
>     instead of 31999,
> 3.) finding "unnecessary duplicates" can have a normal explanation: if at
>     some point you had more than 31999 copies of one file (content) in
>     your backups, BackupPC would have created a pool duplicate. [...]
>     In other cases, the first copy might be reused, but there's no
>     guarantee the link count would be exactly 31999 (though it would
>     probably tend to be).

You are absolutely right that there are valid reasons for the link count
overflowing 31999 and then later dropping below it as links expire. To
tell you the truth, my use of "-links 31999" (corrected) was really more
pedantic -- in reality, I have never seen a case of link counts getting
that high... and if it does happen, it's probably extremely rare to have a
single non-zero file repeated that many times unless you have a huge set
of clients or a huge set of past full backups... (or some special
situation where users keep large numbers of copies of certain files).

So, basically, while there may be an exceptional case or two, anything
spewed back by my shell one-liner is worth looking at from the perspective
of potential issues with pool duplication.

> Having so many copies of identical file content in your backups would
> tend to happen for small files rather than huge ones, I would expect,
> and it doesn't seem to be very common anyway (in my pools, I find
> exactly one file with a link count of 60673 (XFS) and a total of five
> with more than 10000 links, the largest having 103 bytes (compressed)).

Exactly -- that's my point. So other than your one case of 60673 links,
any other case of a duplicate pool chain would be due to an error
somewhere... You may remain correct that adding "-nlinks 31999" or
"-nlinks 31500" or any similar number is not going to limit the search in
reality... and therefore won't make much of a practical difference...
From: Les M. <les...@gm...> - 2013-11-01 19:24:25
On Fri, Nov 1, 2013 at 1:46 PM, <bac...@ko...> wrote:
>
> This is probably not his *primary* issue, since the pool is (only) ~3T.
> But when he started talking about file read errors, I was concerned that
> if the pool file reads were being truncated, then there would likely be
> pool duplicates, since the byte-by-byte comparisons would fail for a
> given partial file md5sum, leading to extra chain creation...

The read errors were in the RStmp file, which is supposed to be the
uncompressed copy of a large compressed file so rsync can seek around
looking for a match. I wonder if there could be a file (huge database,
mailbox, etc.) that compresses to the point that, even with the safety
factor of backups not starting at 95% full, the uncompressed copy won't
fit. Or maybe a sparse dbm-type file where the original doesn't allocate
the space the length would indicate.

--
Les Mikesell
les...@gm...
From: <bac...@ko...> - 2013-11-01 19:20:21
Holger Parplies wrote at about 18:57:05 +0100 on Friday, November 1, 2013:
> 3.) finding "unnecessary duplicates" can have a normal explanation: if at
>     some point you had more than 31999 copies of one file (content) in
>     your backups, BackupPC would have created a pool duplicate. Some of
>     the backups linking to the first copy would have expired over time,
>     leaving behind a link count < 31999. Further rsync backups would tend
>     to link to the second copy, at least for unchanging existing files
>     (in full backups). In other cases, the first copy might be reused,
>     but there's no guarantee the link count would be exactly 31999
>     (though it would probably tend to be).

Interesting... I think this depends on the transfer method. The rsync
method looks to the immediately prior full for comparison, so new hard
links will be made to the same chain element as the last full. Thus, if
earlier elements in the chain have a reduced link count, they will tend
not to be filled back in. It seems like the other transfer methods
directly reference the PoolWrite package, which always crawls up the chain
looking for matches...

If true, it does seem that one could in general speed up fulls for the
other algorithms by putting a matching candidate from the previous full
(if any) first in the candidate match list, rather than matching in chain
order (or several simultaneously). In any case, if my quick reading of the
code is correct, then the other methods will tend to fill in earlier chain
elements first, so that the link count will march back up to 31999.