From: Craig B. <cr...@at...> - 2002-07-17 05:44:59
> Maybe you could do something to allow creating a backup of everything that
> matches a pattern (date based or size based). This way you could get a
> computer backed up no matter how much time you have. You could have several
> backups, either:
>
> ** multiple backups - one for files before Jan 2000 - one for files in 2000
> - one for files in 2001 - etc.
> ** multiple backups - one for files less than 200K - .. files between 200K
> and 1M - etc.
> ** Or you could get a list of folders on the drive and allow the admin to
> edit the list of folders and paste in the folders they want backed up. Then
> they could cut and paste into multiple backups to get a full backup done
> over a few days.
>
> If you were really adventurous, the program could ask the admin how much to
> back up per day - per pc max. Then backuppc could do a listing of all the
> files and add up the file sizes to make multiple lists of files and create
> jobs that would run each day until the drive is backed up. Then you could
> do an incremental at the end to get all the files that changed during the
> time it took to back up the whole computer. Maybe you could call this type
> of backup a "Staged Full" or something along those lines.
>
> One other thing, it would probably be good to have this backup performed
> over stages instead of days. That way you could have a list of jobs created
> that would run in order and mark themselves off of the list when completed.
> This would allow a stage to complete if it fails one stage and has to re-run
> that stage (as when a user is in the office for a couple of hours and then
> leaves before a stage finishes).

This sounds like a lot of effort.

The basic problem is that smbclient pulling the data over SMB is painfully
slow. It's ridiculous that, for example, a 1.4GHz Linux box connected to a
1.7GHz WinXP box over an idle 100Mbps connection only averages 1MB/sec. And
unless the client has a lot of memory the impact on the client machine is
large.
The native client disk performance is probably 10MB/sec or more.

Adding a feature to resume a partially complete backup would help a lot.
It's not that difficult:

  - When BackupPC_dump fails (often because the user powered off their
    machine or canceled the backup) it currently cleans up by trashing
    the pc/$host/new directory and the log file. Instead, it would leave
    them intact and write the last status to the backups file. This
    could be displayed as "partial" or "new" in the CGI interface, and
    browsing etc. would work.

  - Next time it starts it would check if there is a partial dump. If it
    is older than $Conf{PartialDumpAgeMax} (which might be set to, say,
    2 days) it deletes the partial and starts with a clean dump.
    Otherwise, it looks through the log file to identify completed
    directories (I can elaborate on how this is done). It then starts a
    new dump, appending the completed directories to the exclude list.
    Conveniently, the compressed log file can simply be appended to.

Since smbclient doesn't give very fine control over which files to include
or exclude, this feature would be a little crude, but it would help with
backing up big systems, especially laptops that aren't connected for long
periods.

I think the correct solution to the mediocre SMB performance is using
rsync. I would expect a 5-10x speedup in backup performance. Currently
rsync support is planned for the next version, but I don't know when I'll
get started on it. Moreover, we would need to develop a WinXX client based
on rsync, since rsync doesn't really have a WinXX port (although you can
run it using cygwin, for example).

Craig
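
P.S. The resume decision described above could be sketched roughly as follows. This is only an illustration in Python, not BackupPC code: the names plan_dump and DumpPlan are hypothetical, the age limit stands in for $Conf{PartialDumpAgeMax}, and the list of completed directories is assumed to have already been extracted from the log file (how that is done is not covered here).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DumpPlan:
    """Hypothetical result type: whether to resume, and which paths to exclude."""
    resume: bool
    excludes: List[str] = field(default_factory=list)


def plan_dump(partial_age_days: Optional[float],
              partial_age_max_days: float,
              completed_dirs: List[str],
              base_excludes: List[str]) -> DumpPlan:
    """Decide between a clean dump and resuming a partial one.

    partial_age_days is None when no partial dump exists.
    partial_age_max_days plays the role of $Conf{PartialDumpAgeMax}.
    completed_dirs is assumed to come from scanning the previous log.
    """
    # No partial dump, or one that is too old: discard it and start clean.
    if partial_age_days is None or partial_age_days > partial_age_max_days:
        return DumpPlan(resume=False, excludes=list(base_excludes))

    # Otherwise resume: skip directories that already completed by
    # appending them to the configured exclude list (without duplicates).
    excludes = list(base_excludes)
    for directory in completed_dirs:
        if directory not in excludes:
            excludes.append(directory)
    return DumpPlan(resume=True, excludes=excludes)
```

For example, with a 1-day-old partial and a 2-day limit, plan_dump(1, 2, ["/docs"], ["/tmp"]) would resume with "/docs" added to the excludes, while a 3-day-old partial would be thrown away and the dump restarted with only the original excludes.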