From: Pieter W. <si...@us...> - 2010-01-08 13:59:47
On Thu, Jan 07, 2010 at 05:38:14PM -0700, Kyle Anderson wrote:
> Pieter,
> Thank you for kindly providing this script. I also have an accounting need to
> get some sort of reasonable estimate for how much space they are occupying, and
> I don't want to do a du.

Since files that occur in multiple backups are stored only once, yet linked multiple times, I'm not entirely sure what you mean by "space they are occupying". The alloc= number will give you an estimate of the occupied disk space, distributing the required storage among the backups that share it.

E.g., if you have two backups that share 2.22GiB of data, and each also contains one unique 1GiB file, running diffsize.pl on either should give you 2.11GiB, running du on either will give you 3.22GiB, and running du on both at once will give you 4.22GiB.

> Forgive my ignorance, but can you give human readable explanations for the
> abbreviations?:
> total: alloc=x dalloc=x dentries=x dsize=x falloc=x fcount=x fsize=x

alloc is the number that, when summed over all backups on the filesystem (assuming there is only a backuppc directory on it), should approximately add up to the allocated space in the filesystem. So it counts a small file as the size of the filesystem block it fits in, for example, and includes an approximate storage requirement for the directories themselves. This is probably the most reasonable number for measuring how much storage a given backup or user costs you.

dalloc is the approximate storage requirement for directories only, excluding possibly-shared files.

dentries is the total number of directory entries.

dsize is the approximate size of directories (a very arbitrary number; different filesystems count the size of directories in different ways).

falloc is the (distributed) storage requirement for files only, excluding directories. dalloc+falloc should equal alloc.

fcount is the (distributed) number of files (e.g. a file occurring in two backups will be counted as 0.5 in each).

fsize is the (distributed) size of files, using their exact size, not including the overhead caused by file sizes that are not a multiple of the filesystem block size. This is probably the most reasonable number to use as a measure of how much data a user backs up or a backup contains. It is independent of the type of filesystem the backuppc pool is stored on.

PS: apologies for the not-very-human-readable output - it was never really meant to be published.

-- 
Pieter
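
The distributed accounting described above can be illustrated with a small Python sketch. This is not diffsize.pl and makes no claim about how that script is implemented: the function names, the 4096-byte block size, and the rule of splitting a file evenly among the backups that contain it are all assumptions chosen to match the description (a file in two backups counts as 0.5 in each). Directories (dalloc, dentries, dsize) are left out.

```python
from collections import defaultdict

BLOCK = 4096        # assumed filesystem block size in bytes
GIB = 2 ** 30


def allocated(size, block=BLOCK):
    """Round a file size up to a whole number of filesystem blocks."""
    return -(-size // block) * block


def distribute(backups, block=BLOCK):
    """Split each file's cost evenly among the backups that contain it.

    backups maps a backup name to {file_id: size_in_bytes}; file_id is a
    hypothetical stand-in for "the same stored file" (e.g. an inode).
    """
    # Count how many backups reference each stored file.
    refs = defaultdict(int)
    for files in backups.values():
        for file_id in files:
            refs[file_id] += 1

    totals = {}
    for name, files in backups.items():
        t = {"fcount": 0.0, "fsize": 0.0, "falloc": 0.0}
        for file_id, size in files.items():
            share = 1.0 / refs[file_id]
            t["fcount"] += share                          # fractional file count
            t["fsize"] += size * share                    # exact bytes
            t["falloc"] += allocated(size, block) * share  # block-rounded bytes
        totals[name] = t
    return totals


# The two-backup example from the message: 2.22GiB shared, 1GiB unique each.
shared = 222 * GIB // 100
backups = {
    "backup0": {"shared": shared, "unique0": GIB},
    "backup1": {"shared": shared, "unique1": GIB},
}
totals = distribute(backups)
for name, t in sorted(totals.items()):
    print(name, f"fcount={t['fcount']:.1f}", f"fsize={t['fsize'] / GIB:.2f}GiB")
```

Each backup comes out at about 2.11GiB (half of the shared data plus its own unique file), and the per-backup fsize values sum to the 4.22GiB that actually has to be stored.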