From: Tyler J. W. <ty...@to...> - 2013-03-05 10:21:14
|
Jeffrey, As always, thanks. Did you forget to increment jLib.pm to 0.4.2? There's a comment in the library but "$VERSION = '0.4.1';" in the code. Would you mind including these as attachments, rather than inline? It would save us some cut-and-paste. Regards, Tyler On 2013-03-01 23:30, bac...@ko... wrote: > Here is the latest (updated) version of BackupPC_copyPcPool along with > the jLib.pm library which it uses. > > It is much, much better than my initial version(s) released in 2011 as > pointed out in my email from yesterday. However, it does not yet > include the "incremental" backup/transfer option that I referred to > yesterday. Also, it currently requires ~190 bytes/pool entry (for > o(million) pool elements) to cache the pool inodes. If the -P (pack) > flage, this is reduced to about ~140 bytes/pool entry. I intend to implement > a tighter packing of about 30 bytes/pool entry at the expense of > lookup speed in a future version. > > If anyone has trouble with line wrapping errors caused by my emailing > the plain text, I can send it as an attached file.. > > I am very interested in hearing people's feedback and experiences with > this program... > > ------------------------------------------------------------------------ > BackupPC_copyPcPool.pl > > #!/usr/bin/perl > #============================================================= -*-perl-*- > # > # BackupPC_copyPcPool.pl: Reliably & efficiently copy pool and pc tree > # > # DESCRIPTION > # See below for detailed description of what it does and how it works > # > # AUTHOR > # Jeff Kosowsky > # > # COPYRIGHT > # Copyright (C) 2011-2013 Jeff Kosowsky > # > # This program is free software; you can redistribute it and/or modify > # it under the terms of the GNU General Public License as published by > # the Free Software Foundation; either version 2 of the License, or > # (at your option) any later version. > # > # This program is distributed in the hope that it will be useful, > # but WITHOUT ANY WARRANTY; without even the implied warranty of > # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > # GNU General Public License for more details. > # > # You should have received a copy of the GNU General Public License > # along with this program; if not, write to the Free Software > # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA > # > #======================================================================== > # > # Version 0.2.1, February 2013 > # > # CHANGELOG: > # 0.1.1 (Feb 2011) > # 0.1.2 (May 2011) > # 0.1.3 (Sep 2011) > # 0.1.4 (Jan 2013) Added options & error checking to set backuppc uid & guid > # when not available from passwd & group files, respectively > # 0.2.0 (Feb 2013) Initial release of new version using full in-memory I-node > # (hash) caching rather than I-node pooling via file system > # tree > # 0.2.1 (Feb 2013) Added packed option for I-node cache hash index & values > # Added md5sum output option > #======================================================================== > # > # DESCRIPTION > # > # 1. The program first recurses through the pool/cpool to create a > # perl hash cache (%InodeCache) indexed by inode of all the pool > # entries. This subsequently allow for rapid lookup and association of > # each linked pc file with the corresponding stored pool > # file. Optionally, the cache can be written out to a file as a list > # of "<inode> <pool mdsum>" pairs. This inode chache file can then be > # read in rapidly to populate the chache on subsequent runs, assuming > # the pool hasn't changed in the interim. > # > # 2. The program then runs through the pc tree (or a designated > # subtree thereof) to create a single long list of all of the > # directories, hard links, and zero-length files > # For hard links, the entry is: > # <path-to-pool-file> <path-to-pc-file> > # For directories, the entry is: > # D <owner> <group> <mode> <mtime> <path-to-directory> > # For zero-length files, the entry is: > # Z <owner> <group> <mode> <mtime> <path-to-file> > # For 'errors' (e.g., non-linked, non-zero length files), the entry is: > # X <owner> <group> <mode> <mtime> <path-to-file> > # Comments can be indicated by: > # # > > # Note: all paths are relative to TopDir > > # Any (non-zero length) pc files that are not linked (i.e. only one > # hard link) or that are not found in the InodeCache may optionally be > # corrected and linked to the pool and if this results in a new pool > # entry then that entry is added to the InoceCache too. Files that are > # not corrected are listed for separate examination and manual > # correction or backup. > > # The output consists of 3 files: > > # <outfile>.links is the file consisting of the list of links, > # directories and zero-length files described above > > # <outfile>.nopool is a listing of the normal top-level log and info > # files that are not linked and need to be copied > # over directly > > # <outfile>.nolinks is a listing of the non-linked, non-zero length > # files that either couldn't be linked or that you > # decided not to link > # > > # If the --Writecache|-W option is set, a 4th output file is generated: > > # <outfile>.icache is a 2 column listing consisting of > # <inode number> <pool file name> pairs. > # If the --writecache|-W option is repeated, then a > # 3rd column is added consisting of the md5sum of > # the (uncompressed) contents of the pool file. > # One can then check for pool dups by running: > # sort -k 3 <outfile>.icache | uniq -f 2 -D > > # NOTE: Backuppc on the source machine should be disabled (or a > # static snapshot used) during the entire backup process > > # 3. Restoring then consists of the following steps: > > # A. Copy over the pool/cpool any way you like, including > # rsync but without having to worry about hard links > > # B. Copy over the non-pooled files (<outfile>.nopool) any way you want, > # e.g., rsync, tar, etc. > > # C. Run this program again using the -r|--restore flag and the > # <outfile>.nolinks as the input > > # D. Optionally copy over the non-linked files in <outfile>.nolinks > > # NOTE: Backuppc on the target machine should be disabled (or a > # snapshot used) during the entire restore process > > # Selected features: > # --gzip|-g flag creates compressed output/input > # --stdio writes/reads to stdio allowing you to pipe the backup > # stage directly into the restore stage > # --fixlinks|-f fixes any missing/broken pc links > # --topdir|-t allows setting of alternative topdir > # --dryrun|-d doesn't make any changes to pool or pc tree > # --verbose|-v verbosity (repeat for higher verbosity) > # > #======================================================================== > > use strict; > use warnings; > use File::Path; > use File::Glob ':glob'; > use Getopt::Long qw(:config no_ignore_case bundling); > use Fcntl; #Required for RW I/O masks > use Switch; > > use lib "/usr/share/BackupPC/lib"; > use BackupPC::FileZIO; > use BackupPC::Lib; > use BackupPC::jLib v0.4.2; # Requires version >= 0.4.2 > use BackupPC::Attrib qw(:all); > > no utf8; > > #use Data::Dumper; #JJK-DEBUG > #use Devel::Size qw(size total_size); #JJK-DEBUG > > select(STDERR); #Auto-flush (i.e., unbuffer) STDERR > $| = 1; > select(STDOUT); > > #Variables > my $bpc = BackupPC::Lib->new or die "BackupPC::Lib->new failed\n"; > my $md5 = Digest::MD5->new; > my $attr; #Re-bless for each backup since may have different compress level > my $MaxLinks = $bpc->{Conf}{HardLinkMax}; > my $CREATMASK = O_WRONLY|O_CREAT|O_TRUNC; > my %InodeCache=(); #Cache for saving pool/cpool inodes > my @nonpooled = (); > my @backups = (); > my @compresslvls=(); #Pool to use for each backup; > my ($compresslvl, $ptype); > > my $backuppcuser = $bpc->{Conf}{BackupPCUser}; #User name > my $backuppcgroup = $bpc->{Conf}{BackupPCUser}; #Assumed to be same as user > > my $directories = 0; > my $totfiles = 0; #Total of next 6 variables > my $zerolengthfiles = 0; > my $existinglink_pcfiles = 0; > my $fixedlink_pcfiles = 0; > my $unlinkable_pcfiles = 0; > my $unlinked_pcfiles = 0; > my $unlinked_nonpcfiles = 0; > my $unlinked_files = 0; #Total of above three > my $missing_attribs = 0; > > #GetOptions defaults: > my $cleanpool; > $dryrun=0; #Global variable defined in jLib.pm (do not use 'my') #JJK-DEBUG > #$dryrun=1; #Set to 1 to make dry run only > my $fixlinks=0; > my $Force; > my $gzip; > my $giddef = my $giddef_def = getgrnam($backuppcuser); #GID > $giddef_def = "<unavailable>" unless defined $giddef_def; > my $outfile; > my $Overwrite; > my $Pack; > my $pool = my $pool_def = 'both'; > my $restore; > my $Readcache; > my $Skip; > my $stdio; > my $TopDir = my $TopDir_def = $bpc->{TopDir}; > my $uiddef = my $uiddef_def = getpwnam($backuppcgroup); #UID > $uiddef_def = "<unavailable>" unless defined $uiddef_def; > my $verbose = my $verbose_def = 2; > my $noverbose; > my $Writecache; > > usage() unless( > GetOptions( > "cleanpool|c" => \$cleanpool, #Remove orphan pool entries > "dryrun|d!" => \$dryrun, #1=dry run > "fixlinks|f" => \$fixlinks, #Fix unlinked/broken pc files > "Force|F" => \$Force, #Override stuff... > "gid|G=i" => \$giddef, #default gid if not in group file > "gzip|g" => \$gzip, #Compress files > "outfile|o=s" => \$outfile, #Output file (required) > "Overwrite|O" => \$Overwrite, #OVERWRITES existing files/dirs (restore) > "pool|p=s" => \$pool, #Pool (pool|cpool|both) > "Pack|P" => \$Pack, #Pack the InodeCache hash > "restore|r" => \$restore, #Restore rather than backup > "Readcache|R=s" => \$Readcache, #Read InodeCache from file > "Skip|S" => \$Skip, #Skip existing backups (restore) > "stdio|s" => \$stdio, #Print/read to/from stdout/stdin > "topdir|t=s" => \$TopDir, #Location of TopDir > "uid|U=i" => \$uiddef, #default gid if not in group file > "verbose|v+" => \$verbose, #Verbosity (repeats allowed) > "noverbose" => \$noverbose, #Shut off all verbosity > "Writecache|W+" => \$Writecache,#Write InodeCache out to file > "help|h" => \&usage, > )); > > if($restore) { > usage() if $cleanpool || $fixlinks || $outfile || > $Readcache || $Writecache || > (!$stdio && @ARGV < 1); > } else { > usage() if ! defined($outfile) || ($pool !~ /^(pool|cpool|both)$/) || > ($Readcache && $Writecache) || $Overwrite; > } > > $verbose = 0 if $noverbose; > my $DRYRUN = $dryrun ? "**DRY-RUN** " : ""; > > die "Error: 'backuppc' not found in password file, so must enter default UID on commandline (--uid|-U option)...\n" unless defined $uiddef; > die "Error: 'backuppc' not found in group file, so must enter default GID on commandline (--gid|-G option)...\n" unless defined $giddef; > my (%USERcache, %GROUPcache, %UIDcache, %GIDcache); > $UIDcache{$uiddef} = $backuppcuser; > $GIDcache{$uiddef} = $backuppcgroup; > $USERcache{$backuppcuser} = $uiddef; > $GROUPcache{$backuppcgroup} = $giddef; > > ############################################################################ > if($TopDir ne $TopDir_def) { > #NOTE: if we are not using the TopDir in the config file, then we > # need to manually override the settings of BackupPC::Lib->new > # which *doesn't* allow you to set TopDir (even though it seems so > # from the function definition, it gets overwritten later when the > # config file is read) > $TopDir =~ s|//*|/|g; #Remove any lurking double slashes > $TopDir =~ s|/*$||g; #Remove trailing slash > $bpc->{TopDir} = $TopDir; > $bpc->{Conf}{TopDir} = $TopDir; > > $bpc->{storage}->setPaths({TopDir => $TopDir}); > $bpc->{PoolDir} = "$bpc->{TopDir}/pool"; > $bpc->{CPoolDir} = "$bpc->{TopDir}/cpool"; > } > > %Conf = $bpc->Conf(); #Global variable defined in jLib.pm (do not use 'my') > ############################################################################# > #By convention for the rest of this program, we will assume that all > #directory variables have a trailing "/". This should save us some > #efficiency in terms of not having to always add back the slash. > $TopDir .= "/"; $TopDir =~ s|//*|/|g; > my $pcdir = 'pc/'; > > die "TopDir = '$TopDir' doesn't exist!\n" unless -d $TopDir; > die "TopDir = '$TopDir' doesn't look like a valid TopDir!\n" > unless -d "$TopDir/pool" && -d "$TopDir/cpool" && -d "$TopDir/${pcdir}"; > > system("$bpc->{InstallDir}/bin/BackupPC_serverMesg status jobs >/dev/null 2>&1"); > unless(($? >>8) == 1) { > die "Dangerous to run when BackupPC is running!!! (use '--Force' to override)\n" > unless $Force || > (stat("$TopDir/"))[0] . "." . (stat(_))[1] != > (stat("$TopDir_def/"))[0] . "." . (stat(_))[1]; #different fs & inode > warn "WARNING: May be dangerous to run when BackupPC is running!!!\n"; > #Warn but don't die if *appear* to be in different TopDir > } > > ############################################################################ > if(defined $restore) { > do_restore(@ARGV); > exit; #DONE > } > ############################################################################ > #Open files for reading/writing > #NOTE: do before chdir to TopDir > my $sfx = $gzip ? ".gz" : ""; > my $linksfile = "${outfile}.links${sfx}"; > my $nopoolfile = "${outfile}.nopool${sfx}"; > my $nolinksfile = "${outfile}.nolinks${sfx}"; > > die "ERROR: '$linksfile' already exists!\n" if !$stdio && -e $linksfile; > die "ERROR: '$nopoolfile' already exists!\n" if -e $nopoolfile; > die "ERROR: '$nolinksfile' already exists!\n" if -e $nolinksfile; > > my $outpipe = $gzip ? "| gzip > " : "> "; > > if($stdio) { > open(LINKS, $gzip ? "| gzip -f" : ">&STDOUT"); #Redirect LINKS to STDOUT > } else { > open(LINKS, $outpipe . $linksfile) > or die "ERROR: Can't open '$linksfile' for writing!($!)\n"; > } > > open(NOPOOL, $outpipe . $nopoolfile) > or die "ERROR: Can't open '$nopoolfile' for writing!($!)\n"; > open(NOLINKS, $outpipe . $nolinksfile) > or die "ERROR: Can't open '$nolinksfile' for writing!($!)\n"; > > my ($read_Icache, $write_Icache); > if($Pack) { #Pack InodeCache entries > $read_Icache = \&read_Icache_pack; > $write_Icache = \&write_Icache_pack; > } else { #Don't pack InodeCache entries > $read_Icache = \&read_Icache_nopack; > $write_Icache = \&write_Icache_nopack; > } > > if($Readcache) { #Read in existing InodeCache from file > open(RCACHE, $gzip ? "/bin/zcat $Readcache |" : "< $Readcache") or > die "ERROR: Can't open '$Readcache' for reading!($!)\n"; > } elsif($Writecache) { > my $wcachefile = "${outfile}.icache${sfx}"; > die "ERROR: '$wcachefile' already exists!\n" if -e $wcachefile; > open(WCACHE, $outpipe . $wcachefile) > or die "ERROR: Can't open '$wcachefile' for writing!($!)\n"; > } > > ############################################################################ > chdir($TopDir); #Do this so we don't need to worry about distinguishing > #between absolute and relative (to TopDir) pathnames > #Everything following opening the files occurs in TopDir > ############################################################################ > initialize_backups(); > > warn "**Populating Icache...\n" if $verbose>=1; > my $toticache=0; > if($Readcache) { #Read in existing InodeCache from file > $ptype = $Conf{CompressLevel} ? "cpool" : "pool"; > while(<RCACHE>) { > if(/^(\d+)\s+(\S+)/) { > &$write_Icache($ptype, $1, $2); > $toticache++; > } elsif(/^#(c?pool)\s*$/) { > $ptype = $1; > } else { > warn "'$_' is not a valid cache entry\n" unless /^#/; > } > } > close RCACHE; > } else { # Create InodeCache > if($pool eq "both") { > $toticache += populate_icache("pool"); > $toticache += populate_icache("cpool"); > } else { > $toticache += populate_icache($pool); > } > close WCACHE if $Writecache; > } > warn "*Icache entries created=$toticache\n" if $verbose >=2; > if(defined(&total_size)){ #JJK-DEBUG: Print out sizes used by InodeCache > warn sprintf("InodeCache total size: %d\n",total_size(\%InodeCache)); > warn sprintf("Cpool cache entries: %d\tSize: %d\tTotal_size: %d\n", > (scalar keys $InodeCache{"pool"}), size($InodeCache{"pool"}), > total_size($InodeCache{"pool"})) > if defined $InodeCache{"pool"}; > warn sprintf("Cpool cache entries: %d\tSize: %d\tTotal_size: %d\n", > (scalar keys $InodeCache{"cpool"}),size($InodeCache{"cpool"}), > total_size($InodeCache{"cpool"})) > if defined $InodeCache{"cpool"}; > } > > exit unless @backups; > > warn "**Recording nonpooled top level files...\n" if $verbose>=1; > foreach (@nonpooled) { #Print out the nonpooled files > printf NOPOOL "%s\n", $_; > } > close(NOPOOL); > > warn "**Recording linked & non-linked pc files...\n" if $verbose>=1; > my $lastmachine = ''; > foreach (@backups) { > m|$pcdir([^/]*)|; > $lastmachine = $1 if $1 ne $lastmachine; > > warn "* Recursing through backup: $_\n" if $verbose>=1; > $compresslvl = shift(@compresslvls); > $attr = BackupPC::Attrib->new({ compress => $compresslvl }); > #Reinitialize this jLib global variable in case new compress level > $ptype = ($compresslvl > 0 ? "cpool" : "pool"); > m|(.*/)(.*)|; > find_pool_links($1, $2); > } > close(LINKS); > close(NOLINKS); > > ############################################################################## > #Print summary & concluding message: > printf STDERR "\nDirectories=%d\tTotal Files=%d\n", > $directories, ($totfiles + $#nonpooled); > printf STDERR "Link files=%d\t [Zero-length=%d, Hardlinks=%d (Fixed=%d)]\n", > ($zerolengthfiles+$existinglink_pcfiles+$fixedlink_pcfiles), > $zerolengthfiles, ($existinglink_pcfiles+$fixedlink_pcfiles), > $fixedlink_pcfiles; > printf STDERR "Non-pooled Toplevel=%d\n", $#nonpooled; > printf STDERR "Non-pooled Other=%d [Valid-pc=%d (Failed-fixes=%d), Invalid-pc=%d]\n", > $unlinked_files, ($unlinked_pcfiles+$unlinkable_pcfiles), > $unlinkable_pcfiles, $unlinked_nonpcfiles; > printf STDERR "PC files missing attrib entry=%d\n", $missing_attribs; > > my ($rsyncnopool, $rsyncnolinks); > if($gzip) { > $gzip = "-g "; > $rsyncnopool = "zcat $nopoolfile | rsync -aOH --files-from=- $TopDir <newTopDir>"; > $rsyncnolinks = "zcat $nolinksfile | rsync -aOH --files-from=- $TopDir <newTopDir>"; > } else { > $gzip = ""; > $rsyncnopool = "rsync -aOH --files-from=$nopoolfile $TopDir <newTopDir>"; > $rsyncnolinks = "rsync -aOH --files-from=$nolinksfile $TopDir <newTopDir>"; > } > > print STDERR <<EOF; > ------------------------------------------------------------------------------ > To complete copy, do the following as user 'root' or 'backuppc': > 1. Copy/rsync over the pool & cpool directories to the new TopDir > (note this must be done *before* restoring '$linksfile') > For rsync: > rsync -aO $TopDir\{pool,cpool\} <newTopDir> > > 2. Copy/rsync over the non-pooled top level files ($nopoolfile) > For rsync: > $rsyncnopool > > 3. Restore the pc directory tree, hard links and zero-sized files: > $0 -r ${gzip}[-t <newTopDir>] $linksfile > > 4. Optionally, copy/rsync over the non-linked pc files ($nolinksfile) > For rsync: > $rsyncnolinks > > ------------------------------------------------------------------------------ > > EOF > exit; > > ############################################################################## > ############################################################################## > #SUBROUTINES: > > sub usage > { > print STDERR <<EOF; > > usage: $0 [options] -o <outfile> [<relpath-to-pcdir> <relpath-to-pcdir>...] > $0 [options] -r <restorefile> [<relpath-to-pcdir> <relpath-to-pcdir>...] > > where the optional <relpath-to-pcdir> arguments are paths relative to the pc > tree of form: 'host' or 'host/share' or 'host/share/n' or 'host/share/n/.../' > If no optional <relpath-to-pcdir> arguments are given then the entire pc tree > is backed up or restored. > > Options: [Common to copy & restore] > --dryrun|-d Dry-run - doesn\'t change pool or pc trees > Negate with: --nodryrun > --Force|-F Overrides various checks (e.g., if BackupPC running > or if directories present) > --gid|-G <gid> Set group id for backuppc user > [Default = $giddef_def] > --gzip|-g Pipe files to/from gzip compression > --topdir|-t Location of TopDir. > [Default = $TopDir_def] > Note you may want to change from default for example > if you are working on a shadow copy. > --uid|-U <uid> Set user id for backuppc user > [Default = $uiddef_def] > --verbose|-v Verbose (repeat for more verbosity) > [Default level = $verbose_def] > Use --noverbose to turn off all verbosity > --help|-h This help message > > Options: [Copy only] > --cleanpool|-c If orphaned files (nlinks=1) found when populating > Icache, remove them (and renumber chain as needed) > NOTE: Orphans shouldn\'t exist if you have just run > BackupPC_nightly > --fixlinks|-f Attempt to link valid pc files back to the pool > if not already hard-linked > NOTE: this changes files in the pc and\/or pool > of your source too! > --outfile|-o [outfile] Required stem name for the 3 output files > <outfile>.nopools > <outfile>.links > <outfile>.nolinks > --pool|-p [pool|cpool|both] Pools to include in Icache tree > [Default = $pool_def] > --Pack|-P Pack both the indices and values of the InodeCache. > This saves approximately 52 bytes per pool entry. > For o(1 million) entries, this reduces usage from > ~190 bytes/entry to ~140 bytes/entry > --Readcache|-R [file] Read in previously written inode pool cache from > file. This allows resuming previously started backups > Only use this option if the pool has not changed. > Note: can\'t be used with --Writecache|-W > --stdio|-s Print the directory tree, links and zero-sized files > to stdout so it can be piped directly to another copy > of the program running --restore > NOTE: Status, Warnings, and Errors are printed to > stdout. > --WriteCache|-W Write inode pool cache to file <outfile>.icache > as a 2 column list consisting of: > <inode number> <pool entry> > If repeated, include md5sums of the (uncompressed) > pool file contents as a 3rd column (this is slow) > If argument path is '-', then quit after writing > cache. > Note: can\'t be used with --Readcache|-R > Options: [Restore only] > --Overwrite|-O OVERWRITE existing files & directories (Dangerous) > --Skip|-S SKIP existing backups > --stdio Read the directory tree, links and zero-sized files > from stdin so it can be piped directly from another > copy of the program running in the create mode. > Note <restorefile> should not be used in this case > For example, the following pipe works: > $0 -t <source TopDir> [-g] --stdio | $0 -t <dest TopDir> [-g] -r --stdio > > OVERVIEW: > First create an inode-indexed cache (hash) of all pool entries. > Note the cache may optionaly be written out to a file for later re-use > with the --Readcache|-R flag. > > Then, recurse through the paths specified relative to the pc tree or if no > paths specified then the entire pc tree. > - If the file is a (non-pooled) top-level log file, then write its path > relative to pcdir out to <outfile>.nopool > Note this includes all non-directories above the share level plus the > backInfo files that are covered by the input paths > - If the file is a directory, zero length file, or an existing hard link to > the tree, then write it out to <outfile>.links > - If the file is not hard-linked but is a valid non-zero length pc file > (f-mangled and present in the attrib file) and --fixlinks is selected, > then try to link it properly to the appropriate pool. > - Otherwise, add it to <outfile>.nolinks > > NOTE: TO ENSURE INTEGRITY OF RESULTS IT IS IMPORTANT THAT BACKUPPC IS NOT > RUNNING (use --Force to override) > > NOTE: you should run BackupPC_nightly before running this program so that > no unnecessary links are backed up; alternatively, set the --cleanpool > option which will remove orphan pool entries. > > EOF > exit(1); > } > > #Glob to determine what is being backed up. > #Make sure each backup seems valid and determine it's compress level > #Collect top-level non-pooled files > sub initialize_backups > { > if (!@ARGV) { # All backups at backup number level > # TopDir/pc/<host>/<nn> > @backups = bsd_glob("${pcdir}*/*"); #2 levels down; > @nonpooled = grep(! -d , bsd_glob("${pcdir}*")); #All non-directories > } else { # Subset of backups > return if($Writecache && $ARGV[0] eq '-'); #Don't copy pc tree > foreach(@ARGV) { > my $backupdir = $pcdir . $_ . '/'; > $backupdir =~ s|//*|/|g; #Remove redundant slashes > $backupdir =~ s|/\.(?=/)||g; #Replace /./ with / > die "ERROR: '$backupdir' is not a valid pc tree subdirectory\n" if $pcdir eq $backupdir || !-d $backupdir; > if($backupdir =~ m|^\Q${pcdir}\E[^/]+/$|) { #Hostname only > push(@backups, bsd_glob("${backupdir}*")); > } else { # At share level or below #Backup number or below > push(@backups, ${backupdir}); > } > } > @backups = keys %{{map {$_ => 1} @backups}}; #Eliminate dups > @nonpooled = keys %{{map {$_ => 1} @nonpooled}}; #Eliminate dups > } > > push(@nonpooled, grep(! -d, @backups)); #Non-directories > @backups = grep(-d , @backups); #Directories > push(@nonpooled, grep(m|/backupInfo$|, @backups)); # backupInfo not pooled > #Note everything else *should* be pooled - if not, we will catch it later as an error > @backups = grep(!m|/backupInfo$|, @backups); > > foreach(@backups) { > s|/*$||; #Remove trailing slash > m|^(\Q${pcdir}\E[^/]+/[^/]+)(/)?|; > die "Backup '$1' does not contain a 'backupInfo' file\n" > unless -f "$1/backupInfo"; > push(@nonpooled, "$1/backupInfo") unless $2; #Don't include if share level or below > my $compress = get_bakinfo("$1","compress"); > push(@compresslvls, $compress); > my $thepool = $compress > 0 ? 'cpool' : 'pool'; > die "Backup '$1' links to non-included '$thepool'\n" > if $pool ne 'both' && $pool ne $thepool; > } > } > > #Iterate through the 'pooldir' tree and populate the InodeCache hash > sub populate_icache > { > my ($fpool) = @_; > my (@fstat, $dh, @dirlist); > my $icachesize = 0; > > return 0 unless bsd_glob("$fpool/[0-9a-f]"); #No entries in pool > print WCACHE "#$fpool\n" if $Writecache; > my @hexlist = ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'a', 'b', 'c', 'd', 'e', 'f'); > my ($idir,$jdir,$kdir); > print STDERR "\n* Populating Icache with '$fpool' branch:[0-f] " if $verbose >=2; > foreach my $i (@hexlist) { > print STDERR "$i " if $verbose >=2; > $idir = $fpool . "/" . $i . "/"; > foreach my $j (@hexlist) { > $jdir = $idir . $j . "/"; > foreach my $k (@hexlist) { > $kdir = $jdir . $k . "/"; > unless(opendir($dh, $kdir)) { > warn "Can't open pool directory: $kdir\n" if $verbose>=4; > next; > } > my @entries = grep(!/^\.\.?$/, readdir ($dh)); #Exclude dot files > closedir($dh); > warn "POOLDIR: $kdir (" . ($#entries+1) ." files)\n" > if $verbose >=3; > > if($cleanpool) { #Remove orphans & renumber first and then > #create the Icache after orphan deletion & chain renumbering > #NOTE: we need to do this before populating the InodeCache > #since pool deletions & renumbering would change the cache > my @poolorphans = > grep(-f $kdir . $_ && (stat(_))[3] < 2, @entries); > foreach (sort {poolname2number($b) cmp poolname2number($a)} @poolorphans) { #Reverse sort to minimize moves > my $pfile = $kdir . $_; > my $result =delete_pool_file($pfile); > if($verbose >=1) { > $result == 1 ? > warn "WARN: Deleted orphan pool entry: $pfile\n": > warn "ERROR: Couldn't properly delete orphan pool entry: $pfile\n" > } > } > @entries = grep(!/^\.\.?$/, readdir ($dh)) > if(@poolorphans); #Reload if orphans renumbered/removed > } > foreach (@entries) { > # Go through all entries in terminal cpool branch > @fstat = stat($kdir . $_); > #Note: 0=dev, 1=ino, 2=mode, 3=nlink, 4=uid, 5=gid, 6=rdev > #7=size, 8=atime, 9=mtime, 10=ctime, 11=blksize, 12=blocks > if($fstat[3] >= 2) { # Valid (non-orphan) pool entry > #Note: No sense in creating icache entries for orphans > &$write_Icache($fpool, $fstat[1], $_); > if($Writecache >= 2) { > printf WCACHE "%d %s %s\n", $fstat[1], $_, > zFile2FullMD5($bpc, $md5, $kdir . $_); > }elsif($Writecache) { > printf WCACHE "%d %s\n", $fstat[1], $_; > } > $icachesize++; > } else { #Orphan pool entry > warn "WARN: Orphan pool entry: $_\n" > if $verbose>=1; > } > warn "WARN: Zero-size pool entry: $_\n" > if $fstat[7] == 0 && $verbose>=1; > } > } # kdir > } # jdir > } # idir > print STDERR "\n" if $verbose >=2; > return $icachesize; > } > > #Recursively go through pc tree > sub find_pool_links > { > my ($dir, $filename) = @_; > > my $dh; > my $file = $dir . $filename; > if(-d $file) { > my @fstat = stat(_); #Note last stat was -d $file > #Print info to signal & re-create directory: > #D <user> <group> <mod> <atime> <mtime> <file name> > print_file_reference("D", $file, \@fstat); > $directories++; > opendir($dh,$file); > my @contents = readdir($dh); > closedir($dh); > foreach (@contents) { > next if /^\.\.?$/; # skip dot files (. and ..) > find_pool_links($file . '/', $_); #Recurse > } > } > else { #Not a directory > my @fstat = stat(_); #Note last stat was -d $file > $totfiles++; > if($fstat[7] == 0) { #Zero size file > print_file_reference("Z", $file, \@fstat); > $zerolengthfiles++; > return; > } elsif($fstat[3] > 1) { #More than one link > if(defined(my $pooltarget = &$read_Icache($ptype, $fstat[1]))) { > #Valid hit: hard link found in InodeCache > $existinglink_pcfiles++; > $pooltarget =~ /(.)(.)(.)/; > print_hardlink_reference("$ptype/$1/$2/$3/$pooltarget", $file); > return; > } > } > > if($file =~ m|^\Q${pcdir}\E[^/]+/[^/]+/backupInfo$|) { > $totfiles--; > return; #BackupInfo already taken care of in 'nonpooled' > } > > #Non-zero sized file not in InodeCache that is not 'backupInfo' > if($filename =~ /^f|^attrib$/ && -f $file) { > #VALID pc file if f-mangled or attrib file > if( $filename =~ /^f/ && > !( -f "$dir/attrib" && > $attr->read($dir,"attrib") == 1 && > defined($attr->get($bpc->fileNameUnmangle($filename))))) { > #Attrib file error if attrib file missing or > #unmangled name not an element of the attrib file > warn "ERROR: $file (inode=$fstat[1], nlinks=$fstat[3]) VALID pc file with MISSING attrib entry\n"; > $missing_attribs++; > } > if($fixlinks) { > if(fix_link($file, \@fstat) == 1) { > $fixedlink_pcfiles++; > return; > } else {$unlinkable_pcfiles++;} > } else{ > warn "ERROR: $file (inode=$fstat[1], nlinks=$fstat[3]) VALID pc file NOT LINKED to pool\n"; > $unlinked_pcfiles++; > } > } else { > warn "ERROR: $file (inode=$fstat[1], nlinks=$fstat[3]) INVALID pc file and UNLINKED to pool\n"; > $unlinked_nonpcfiles++ > } > #ERROR: Not linked/linkable to pool (and not zero length file or directory > print_file_reference("X", $file, \@fstat); > printf NOLINKS "%s\n", $file; > $unlinked_files++ > } > } > > #Try to fix link by finding/creating new pool-link > sub fix_link > { > my ($filename, $fstatptr) = @_; > > my ($plink, @pstat); > my ($md5sum, $result) = zFile2MD5($bpc, $md5, $filename, 0, $compresslvl); > $result = jMakeFileLink($bpc, $filename, $md5sum, 2, $compresslvl, \$plink) > if $result > 0; > #Note we choose NewFile=2 since if we are fixing, we always want to make the link > if($result > 0) { #(true if both above calls succeed) > $plink =~ m|.*(\Q${ptype}\E.*/(.*))|; > $plink = $1; #Path relative to TopDir > if($dryrun) { > @pstat = @$fstatptr; > } else { > @pstat = stat($plink); > &$write_Icache($ptype, $pstat[1], $2); > } > print_hardlink_reference($plink, $filename); > if($verbose >=2) { > warn "${DRYRUN}NOTICE: pool entry '$plink' (inode=$fstatptr->[1]) missing from Icache and added back\n" if $result == 3; > warn sprintf("${DRYRUN}NOTICE: %s (inode=%d, nlinks=%d) was fixed by linking to *%s* pool entry: %s (inode=%d, nlinks=%d)\n", > $filename, $fstatptr->[1], $fstatptr->[3], > ($result == 1 ? "existing" : "new"), > $plink, $pstat[1], $pstat[3]) > if $result !=3; > } > return 1; > } > warn "ERROR: $filename (inode $fstatptr->[1]); fstatptr->[3] links; md5sum=$md5sum) VALID pc file FAILED FIXING by linking to pool\n"; > return 0; > } > > sub print_hardlink_reference > { > my ($poolname, $filename) = @_; > > #<poolname> <filename> > printf LINKS "%s %s\n", $poolname, $filename; #Print hard link > } > > sub print_file_reference > { > my ($firstcol, $filename, $fstatptr) = @_; > > #<firstcol> <user> <group> <mod> <mtime> <filename> > printf(LINKS "%s %s %s %04o %u %s\n", > $firstcol, UID($fstatptr->[4]), GID($fstatptr->[5]), > $fstatptr->[2] & 07777, $fstatptr->[9], $filename); > } > > ############################################################################### > ##Read/write InodeCache > > #Note that without packing, InodeCache hash requireas about 186 > #bytes/entry ('total_size') to store each entry with about 74 bytes > #required to store the hash structure ('size') alone and an additional > #exactly 112 byte to store the 32-char hex string (this assumes chains > #are sparse enough not to affect the average storage). The 74 bytes > #for the hash structure consists of about 8 bytes for the inode data > #plus 48 bytes for the overhead of the scalar variable storing the > #index plus ~22 bytes for the magic of hash storage and the pointer to > #the hash value. The additional 112 bytes consists of 64 bytes to > #store the unpacked 32-char hex string plus 48 bytes of overhead for > #the scalar variable storing the hash value. > > #If we pack the value, we can reduce the hex string storage from 64 > #bytes to 16 bytes by storing 2 hex chars per byte. This reduces the > #total storage by 48 bytes/entry from 186 to 138 bytes/entry. > > #If we pack the index as an unsigned 32-bit long (V), then we can > #save an additional 4 bytes, reducing the average storage to about 134 > #bytes/entry. But, we need to make sure that the highest inode number > #is less than 2^32 (~4.3 billion). If we pack the index as an unsigned > #64-bit quad (Q), then we use 8 bytes for the index data and we are > #back to a total 138 bytes/entry. > > #Note these averages are based on a hash size (i.e. pool size) of 1 > #million entries. The 74 bytes/entry for the hash structure alone will > #increase (slowly) as the size of the hash increases but the storage > #size for the hash vaue (112 bytes unpacked, 64 packed) does not > #increase. > > sub read_Icache_nopack > { > $InodeCache{$_[0]}{$_[1]}; #No packing of values & indices > } > > sub read_Icache_pack > { > # join("_", unpack("H32V", $InodeCache{$_[0]}{$_[1]})); #Pack values only > > #Also pack indices: > join("_", unpack("H32V", $InodeCache{$_[0]}{pack("V",$_[1])}));#32-bit indices > # join("_", unpack("H32V", $InodeCache{$_[0]}{pack("Q",$_[1])}));#64-bit indices > #Note: shouldn't need 64 bits for inodes since they are ino_t > #which are typdef'd as longs > } > > sub write_Icache_nopack > { > $InodeCache{$_[0]}{$_[1]} = $_[2]; #No packing of values & indices > } > > sub write_Icache_pack > { > my ($pool, $index, $value) = @_; > > $value =~ m|(.{32})(.(.*))?|; > #Pack values only > # $InodeCache{$pool}{$index}= defined($3) ? pack("H32V", $1, $3) : pack("H32", $1); > > #Also pack indices: > $InodeCache{$pool}{pack("V",$index)} = defined($3) ? pack("H32V", $1, $3) : pack("H32", $1); #32-bit indices > > # $InodeCache{$pool}{pack("Q",$index)} = defined($3) ? pack("H32V", $1, $3) : pack("H32", $1); #64-bit indices > #Note shouldn't need 64 bits for inodes since they are ino_t which > #are typdef'd as longs > } > > ############################################################################### > # Return user name corresponding to numerical UID with caching > sub UID > { > unless(exists($UIDcache{$_[0]})) { > unless(defined($UIDcache{$_[0]} = getpwuid($_[0]))) { > printf STDERR "Warning: '%d' not found in password file, writing numerical UID instead...\n", $_[0]; > $UIDcache{$_[0]} = $_[0]; > } > } > return $UIDcache{$_[0]}; > } > > # Return group name corresponding to numerical GID with caching > sub GID > { > unless(exists($GIDcache{$_[0]})) { > unless(defined($GIDcache{$_[0]} = getgrgid($_[0]))) { > printf STDERR "Warning: '%d' not found in group file, writing numerical GID instead...\n", $_[0]; > $GIDcache{$_[0]} = $_[0]; > } > } > return $GIDcache{$_[0]}; > } > > # Return numerical UID corresponding to user name with caching > sub USER > { > unless(exists($USERcache{$_[0]})) { > unless(defined($USERcache{$_[0]} = getpwnam($_[0]))) { > printf STDERR "Warning: '%s' not found in password file, writing default backuppc UID (%d) instead...\n", $_[0], $uiddef; > $USERcache{$_[0]} = $uiddef; > } > } > return $USERcache{$_[0]}; > } > > # Return numerical GUID coresponding to group name with caching > sub GROUP > { > unless(exists($GROUPcache{$_[0]})) { > unless(defined($GROUPcache{$_[0]} = getgrnam($_[0]))) { > printf STDERR "Warning: '%s' not found in group file, writing default backuppc GID (%d) instead...\n", $_[0], $giddef; > $GROUPcache{$_[0]} = $giddef; > } > } > return $GROUPcache{$_[0]}; > } > ############################################################################### > ############################################################################### > sub do_restore > { > my $currbackup = ""; > my $formaterr = 0; > my $ownererr = 0; > my $permserr = 0; > my $mkdirerr = 0; > my $mkzeroerr = 0; > my $mklinkerr = 0; > my $filexsterr = 0; > my $utimerr = 0; > my $newdir = 0; > my $newzero = 0; > my $newlink = 0; > my $skipped = 0; > > ############################################################################### > if($stdio) { > open(LINKS, $gzip ? "/bin/zcat - |" : "<& STDIN"); > } else { > my $restorefile = shift @_; > open(LINKS, $gzip ? "/bin/zcat $restorefile |" : "< $restorefile") or > die "ERROR: Can't open '$restorefile' for reading!($!)\n"; > } > my @backuplist = @_; > ############################################################################### > chdir($TopDir); #Do this so we don't need to worry about distinguishing > #between absolute and relative (to TopDir) pathnames > ############################################################################### > die "ERROR: pool directories empty!\n" > unless $Force || bsd_glob("{cpool,pool}/*"); > > my(@skiplist, $skiplist); #Backups to skip... > if($Skip) { > @skiplist = grep(@{[bsd_glob("$_/f*")]}, bsd_glob("${pcdir}*/[0-9]*")); > #NOTE: glob must be evaulated in list context because stateful... > } elsif(!$Overwrite && grep(-d, bsd_glob("${pcdir}*/[0-9]*/f*"))) { > die "ERROR: pc directory contains existing backups!\n(use --Skip to skip or --Overwrite to OVERWRITE)\n" > } > $skiplist = "\Q" . join("\E|\Q", @skiplist) . "\E" if @skiplist; > > my $backuplist; #Backups to selectively backup... > $backuplist = "\Q" . join("\E|\Q", @backuplist) . "\E" if @backuplist; > > umask 0000; #No permission bits disabled > my $time = time; #We will use this for setting atimes. > my @dirmtimes =(); #Stack consisting of paired "dir, mtime" entries > my $matchback = 1; > my ($line); > > LINE: while($line = <LINKS>) { > chomp $line; > > unless($line =~ m|^[a-f0-9DZX#]|) { #First character test > print STDOUT "ERR_CHAR1: $line\n"; > warn sprintf("ERROR: Illegal first line character: %s\n", > substr($line,0,1)) > if $verbose >=1; > next LINE; #NOTE: next without line would go to next switch case > } > > switch ($&) { #Go through cases... > case 'D' { > unless($line =~ m|^D +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +(\Q${pcdir}\E.*)|) { > print STDOUT "ERR_DFRMT $line\n"; > $formaterr++; > next LINE; > } > #NOTE: 1=uname 2=group 3=mode 4=mtime 5=dirpath > print STDERR "$1|$2|$3|$4|$5|\n" if $verbose >=4; > my $user = USER($1); > my $group = GROUP($2); > my $mode = oct($3); > my $mtime = $4; > my $dir = $5; $dir =~ s|/*$||; #Remove trailing slash > $dir =~ m|(.*)/|; > my $pdir = $1; #parent dir > > #Don't restore if on skiplist or if not on backuplist > if(($skiplist && $dir =~ m|^($skiplist)|) || > ($backuplist && $dir !~ m|^pc/($backuplist)|)) { > $matchback = 0; > next LINE; > } else { > $matchback = 1; > } > > if($verbose >= 1) { > $dir =~ m|pc/([^/]*/[^/]*)|; > if($1 ne $currbackup) { > $currbackup = $1; > warn "RESTORING: $currbackup\n"; > } > } > > #Look at @dirmtimes stack to see if we have backed out of > #top directory(ies) on stack and can now set mtime > my $lastdir; #If in new part of tree, set mtimes for past dirs > while(defined($lastdir = shift(@dirmtimes)) && > $dir !~ m|^\Q$lastdir\E|) { > jutime($time, shift(@dirmtimes), $lastdir); > } > unshift(@dirmtimes, $lastdir) if $lastdir; #Put back last one > > if(! -e $dir) { #Make directory (nothing in the way) > unless( -d $pdir || jmake_path $pdir){ > #Make parent directory if doesn't exist > #NOTE: typically shouldn't be necesary since list is > #created with directories along the way > print STDOUT "ERR_MKDIR $line\n"; > $mkdirerr++; > next LINE; > } > #Make directory & set perms > unless(mkdir $dir, $mode) { > print STDOUT "ERR_MKDIR $line\n"; > $mkdirerr++; > next LINE; > } > } elsif(-d $dir) { #Directory already exists > #Set perms > unless(jchmod $mode, $dir) { > print STDOUT "ERR_PERMS $line\n"; > $permserr++; > next LINE; > } > } else { #Non-directory in the way, abort line > print STDOUT "ERR_DEXST $line\n"; > $filexsterr++; > next LINE; > } > > #Set ownership > unless(jchown $user, $group, $dir){ > print STDOUT "ERR_OWNER $line\n"; > $ownererr++; > next LINE; > } > #Put dir mtime on stack > unshift(@dirmtimes, $mtime); #We need to set dir mtime > unshift(@dirmtimes, $dir); #when done adding files to dir > > $newdir++; > } > > case 'Z' { > next LINE unless $matchback; > unless($line =~ m|^Z +([^ ]+) +([^ ]+) +([^ ]+) +([^ ]+) +((\Q${pcdir}\E.*/)(.*))|) { > print STDOUT "ERR_ZFRMT $line\n"; > $formaterr++; > next LINE; > } > #NOTE: 1=uname 2=group 3=mode 4=mtime 5=fullpath 6=dir 7=file > print STDERR "$1|$2|$3|$4|$5|$6|$7\n" if $verbose >=4; > my $user = USER($1); > my $group = GROUP($2); > my $mode = oct($3); > my $mtime = $4; > my $file = $5; > my $dir = $6; > my $name = $7; > > #Check if file exists and not overwritable > if(-e $file && !($Overwrite && -f $file && unlink($file))) { > print STDOUT "ERR_ZEXST $line\n"; > $filexsterr++; > next LINE; > } > > unless( -d $dir || jmake_path $dir){ #Make dir if doesn't exist > #NOTE: typically shouldn't be necesary since list is > #created with directories along the way > print STDOUT "ERR_MKDIR $line\n"; > $mkdirerr++; > next LINE; > } > #Create zero-length file with desired perms > unless(sysopen(ZERO, $file, $CREATMASK, $mode) && close(ZERO)){ > print STDOUT "ERR_MKZER $line\n"; > $mkzeroerr++; > next LINE; > } > unless(jchown $user, $group, $file){ #Set ownership > print STDOUT "ERR_OWNER $line\n"; > $ownererr++; > next LINE; > } > unless(jutime $time, $mtime, $file) { #Set timestamps > $utimerr++; > next LINE; > } > $newzero++; > } > > case 'X' {$skipped++; next LINE; } #Error line > > case '#' {next LINE; } #Comment line > > else { #Hard link is default switch case > next LINE unless $matchback; > unless($line =~ m|(c?pool/[0-9a-f]/[0-9a-f]/[0-9a-f]/[^/]+) +((\Q${pcdir}\E.*/).*)|) { > print STDOUT "ERR_LFRMT $line\n"; > $formaterr++; > next LINE; > } > print STDERR "$1|$2|$3\n" if $verbose >=4; # 1=source 2=target 3=targetdir > my $source = $1; > my $target = $2; > my $targetdir = $3; > > if(-e $target && > !($Overwrite && -f $target && unlink($target))){ > print STDOUT "ERR_LEXST $line\n"; > $filexsterr++; #Target exists and not overwritable > next LINE; > } > > unless(-d $targetdir || jmake_path $targetdir){ > #Make targetdir if doesn't exist > #NOTE: typically shouldn't be necesary since list is > #created with directories along the way > print STDOUT "ERR_MKDIR $line\n"; > $mkdirerr++; > next LINE; > } > > unless(jlink $source, $target) { #Make link > print STDOUT "ERR_MKLNK $line\n"; > $mklinkerr++; > next LINE; > } > $newlink++ > } > } #SWITCH end > } # WHILE reading lines > > #Set mtimes for any remaining directories in @dirmtimes stack > while(my $dir = shift(@dirmtimes)){ > jutime($time, shift(@dirmtimes), $dir); > } > > ############################################################################### > #Print results: > > > printf("\nSUCCESSES: Restores=%d [Dirs=%d Zeros=%d Links=%d] Skipped=%d\n", > ($newdir+$newzero+$newlink), $newdir, $newzero, $newlink, $skipped); > printf("ERRORS: TOTAL=%d Format=%d Mkdir=%d Mkzero=%d Mklink=%d\n", > ($formaterr + $mkdirerr + $mkzeroerr + $mklinkerr + > $filexsterr + $ownererr + $permserr + $utimerr), > $formaterr, $mkdirerr, $mkzeroerr, $mklinkerr); > printf(" File_exists=%d Owner=%d Perms=%d Mtime=%d\n\n", > $filexsterr, $ownererr, $permserr, $utimerr); > > exit; #RESTORE done > } > ############################################################################### > > ------------------------------------------------------------------------ > jLib.pm > > #============================================================= -*-perl-*- > # > # BackupPC::jLib package > # > # DESCRIPTION > # > # This library includes various supporting subroutines for use with BackupPC > # functions used by BackupPC. > # Some of the routines are variants/extensions of routines originally written > # by Craig Barratt as part of the main BackupPC release. > # > # AUTHOR > # Jeff Kosowsky > # > # COPYRIGHT > # Copyright (C) 2008-2013 Jeff Kosowsky > # > # This program is free software; you can redistribute it and/or modify > # it under the terms of the GNU General Public License as published by > # the Free Software Foundation; either version 2 of the License, or > # (at your option) any later version. > # > # This program is distributed in the hope that it will be useful, > # but WITHOUT ANY WARRANTY; without even the implied warranty of > # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > # GNU General Public License for more details. > # > # You should have received a copy of the GNU General Public License > # along with this program; if not, write to the Free Software > # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA > # > #======================================================================== > # > # Version 0.4.1, released January 2013 > > # CHANGELOG > # 0.4.0 (Jan 2011) > # 0.4.1 (Jan 2013) Added MD52RPath, is_poollink > # Added jchmod, jchown > # 0.4.2 (Feb 2013) Made write_attrib more robust > # > #======================================================================== > > package BackupPC::jLib; > > use strict; > use vars qw($VERSION); > $VERSION = '0.4.1'; > > use warnings; > use File::Copy; > use File::Glob ':glob'; > use File::Path; > use File::Temp; > use Fcntl; #Required for RW I/O masks > > use BackupPC::Lib; > use BackupPC::Attrib; > use BackupPC::FileZIO; > use Data::Dumper; #Just used for debugging... > > no utf8; > > use constant _128KB => 131072; > use constant _1MB => 1048576; > use constant LINUX_BLOCK_SIZE => 4096; > use constant TOO_BIG => 2097152; # 1024*1024*2 (2MB) > > require Exporter; > our @ISA = qw(Exporter); > our @EXPORT = qw( > %Conf $dryrun $errorcount > LINUX_BLOCK_SIZE TOO_BIG > printerr warnerr firstbyte > zFile2MD5 zFile2FullMD5 > link_recursively_topool > jfileNameUnmangle > getattr read_attrib count_file_attribs > get_bakinfo get_file_attrib get_attrib_value get_attrib_type > write_attrib write_file_attrib > attrib > MD52RPath > is_poollink GetPoolLink jMakeFileLink > poolname2number renumber_pool_chain delete_pool_file > run_nightly > jcompare zcompare zcompare2 > delete_files jtouch > jchmod jchown jcopy jlink junlink jmkdir jmkpath jmake_path > jrename jrmtree jutime > ); > > #Global variables > our %Conf; > our $errorcount=0; > our $dryrun=1; #global variable - set to 1 to be safe -- should be set to > #0 in actual program files if you don't want a dry-run > #None of the routines below will change/delete/write actual > #file data if this flag is set. The goal is to do everything but > #the final write to assist with debugging or cold feet :) > > sub printerr > { > print "ERROR: " . $_[0]; > $errorcount++; > } > > sub warnerr > { > $|++; # flush printbuffer first > warn "ERROR: " . $_[0]; > $errorcount++; > } > > # Returns the firstbyte of a file. > # If coding $coding undefined or 0, return as unpacked 2 char hexadecimal > # string. Otherwise, return as binary byte. > # Return -1 on error. > # Useful for checking the type of compressed file/checksum coding. > sub firstbyte { > my ($file, $coding) = @_; > my $fbyte=''; > sysopen(my $fh, $file, O_RDONLY) || return -1; > $fbyte = -1 unless sysread($fh, $fbyte, 1) == 1; > close($fh); > if (! defined($coding) || $coding == 0) { > $fbyte = unpack('H*',$fbyte); # Unpack as 2 char hexadecimal string > } > else { > $fbyte = vec($fbyte, 0, 8); # Return as binary byte > } > return $fbyte; > } > > # Compute the MD5 digest of a compressed file. This is the compressed > # file version of the Lib.pm function File2MD5. > # For efficiency we don't use the whole file for big files > # - for files <= 256K we use the file size and the whole file. > # - for files <= 1M we use the file size, the first 128K and > # the last 128K. > # - for files > 1M, we use the file size, the first 128K and > # the 8th 128K (ie: the 128K up to 1MB). > # See the documentation for a discussion of the tradeoffs in > # how much data we use and how many collisions we get. > # > # Returns the MD5 digest (a hex string) and the file size if suceeeds. > # (or "" and error code if fails). > # Note return for a zero size file is ("", 0). > # > # If $size < 0 then calculate size of file by fully decompressing > # If $size = 0 then first try to read corresponding attrib file > # (if it exists), if doesn't work then calculate by fully decompressing > # IF $size >0 then use that as the size of the file > # > # If compreslvl is undefined then use the default compress level from > # the config file > > sub zFile2MD5 > { > my($bpc, $md5, $name, $size, $compresslvl) = @_; > > my ($fh, $rsize, $filesize, $md5size); > > return ("", -1) unless -f $name; > return ("", 0) if (stat(_))[7] == 0; #Zero size file > $compresslvl = $Conf{CompressLevel} unless defined $compresslvl; > unless (defined ($fh = BackupPC::FileZIO->open($name, 0, $compresslvl))) { > printerr "Can't open $name\n"; > return ("", -1); > } > > my ($datafirst, $datalast); > my @data; > #First try to read up to the first 128K (131072 bytes) > if ( ($md5size = $fh->read(\$datafirst, _128KB)) < 0 ) { #Fist 128K > printerr "Can't read & decompress $name\n"; > return ("", -1); > } > > if ($md5size == _128KB) { # If full read, continue reading up to 1st MB > my $i=0; > #Read in up to 1MB (_1MB), 128K at a time and alternate between 2 data buffers > while ( ($rsize = $fh->read(\$data[(++$i)%2], _128KB)) == _128KB > && ($md5size += $rsize) < _1MB ) {} > $md5size +=$rsize if $rsize < _128KB; # Add back in partial read > $datalast = ($i > 1 ? > substr($data[($i-1)%2], $rsize, _128KB-$rsize) : '') > . substr($data[$i%2], 0 ,$rsize); #Last 128KB (up to 1MB) > } > > $size = 0 unless defined $size; > if($size > 0) { #Use given size > $filesize = $size; > }elsif($md5size < _1MB) { #Already know the size because read it all (note don't do <=) > $filesize = $md5size; > } elsif($compresslvl == 0) { #Not compressed, so: size = actual size > $filesize = -s $name; > }elsif($size == 0) { # Try to find size from attrib file > $filesize = get_attrib_value($name, "size"); > if(!defined($filesize)) { > warn "Can't read size of $name from attrib file so calculating manually\n"; > } > } > if(!defined($filesize)) { #No choice but continue reading to find size > $filesize = $md5size; > while (($rsize = $fh->read(\($data[0]), _128KB)) > 0) { > $filesize +=$rsize; > } > } > $fh->close(); > > $md5->reset(); > $md5->add($filesize); > $md5->add($datafirst); > $md5->add($datalast) if defined($datalast); > return ($md5->hexdigest, $filesize); > } > > #Compute md5sum of the full data contents of a file. > #If the file is compressed, calculate the md5sum of the inflated > #version (using the zlib routines to uncompress the stream). Note this > #gives the md5sum of the FULL file -- not just the partial file md5sum > #above. > sub zFile2FullMD5 > { > my($bpc, $md5, $name, $compresslvl) = @_; > > my $fh; > my $data; > > $compresslvl = $Conf{CompressLevel} unless defined $compresslvl; > unless (defined ($fh = BackupPC::FileZIO->open($name, 0, $compresslvl))) { > printerr "Can't open $name\n"; > return -1; > } > > $md5->reset(); > while ($fh->read(\$data, 65536) > 0) { > $md5->add($data); > } > > return $md5->hexdigest; > } > > # Like MakeFileLink but for existing files where we don't have the > # digest available. So compute digest and call MakeFileLink after > # For each file, check if the file exists in $bpc->{TopDir}/pool. > # If so, remove the file and make a hardlink to the file in > # the pool. Otherwise, if the newFile flag is set, make a > # hardlink in the pool to the new file. > # > # Returns 0 if a link should be made to a new file (ie: when the file > # is a new file but the newFile flag is 0). > # Returns 1 if a link to an existing file is made, > # Returns 2 if a link to a new file is made (only if $newFile is set) > # Returns negative on error. > sub zMakeFileLink > { > my($bpc, $md5, $name, $newFile, $compress) = @_; > > $compress = $Conf{CompressLevel} unless defined $compress; > > my ($d,$ret) = zFile2MD5($bpc, $md5, $name, 0, $compress); > return -5 if $ret < 0; > $bpc->MakeFileLink($name, $d, $newFile, $compress); > } > > > # Use the following to create a new file link ($copy) that links to > # the same pool file as the original ($orig) file does (or > # should). i.e. this doesn't assume that $orig is properly linked to > # the pool. This is computationally more costly than just making the > # link, but will avoid further corruption whereby you get archive > # files with multiple links to each other but no link to pool. > > # First, check if $orig links to the pool and if not create a link > # via MakeFileLink. Don't create a new pool entry if newFile is zero. > # If a link either already exists from the original to the pool or if > # one was successfully created, then simply link $copy to the same > # pool entry. Otherwise, just copy (don't link) $orig to $copy > # and leave it unlinked to the pool. > > # Note that checking whether $orig is linked to the pool is > # cheaper than running MakeFileLink since we only need the md5sum > # checksum. > # Note we assume that the attrib entry for $orig (if it exists) is > # correct, since we use that as a shortcut to determine the filesize > # (via zFile2MD5) > # Returns 1 if a link to an existing file is made, > # Returns 2 if a link to a new file is made (only if $newFile is set) > # Returns 0 if file was copied (either because MakeFileLink failed or > # because newFile=0 and no existing pool match > # Returns negative on error. > sub zCopyFileLink > { > my ($bpc, $orig, $copy, $newFile, $compress) = @_; > my $ret=1; > $compress = $Conf{CompressLevel} unless defined $compress; > my $md5 = Digest::MD5->new; > my ($md5sum, $md5ret) = zFile2MD5($bpc, $md5, $orig, 0, $compress); > > #If $orig is already properly linked to the pool (or linkable to pool after running > #MakeFileLink on $orig) and HardLinkMax is not exceeded, then just link to $orig. > if($md5ret > 0) { #If valid md5sum and non-zero length file (so should be linked to pool)... > if((GetPoolLinkMD5($bpc, $orig, $md5sum, $compress, 0) == 1 || #If pool link already exists > ($ret = $bpc->MakeFileLink($orig, $md5sum, $newFile, $compress))> 0) #or creatable by MakeFileLink > && (stat($orig))[3] < $bpc->{Conf}{HardLinkMax}) { # AND (still) less than max links > return $ret if link($orig, $copy); # Then link from copy to orig > } > } > if(copy($orig, $copy) == 1) { #Otherwise first copy file then try to link the copy to pool > if($md5ret > 0 && ($ret = $bpc->MakeFileLink($copy, $md5sum, $newFile, $compress))> 0) { > return 2; #Link is to a new copy > } > printerr "Failed to link copied file to pool: $copy\n"; > return 0; #Copy but not linked > } > die "Failed to link or copy file: $copy\n"; > return -1; > } > > # Copy $source to $target, recursing down directory trees as > # needed. The 4th argument if non-zero, means (for files) use > # zCopyFileLink to make sure that everything is linked properly to the > # pool; otherwise, just do a simple link. The 5th argument $force, > # will erase a target directory if already present, otherwise an error > # is signalled. The final 6th argument is the CompressionLevel which > # can be left out and will the be calculated from > # bpc->Conf{CompressLevel} > > # Note the recursion is set up so that the return is negative if > # error, positive if consistent with a valid pool link (i.e. ... [truncated message content] |