From: Kern S. <ke...@si...> - 2003-04-09 07:13:33
Hello,

On Wed, 2003-04-09 at 01:45, Matthias Wamser wrote:
> On Tue, Apr 08, 2003 at 05:54:29PM +0200, Kern Sibbald wrote:
> > Hello,
> >
> > Bacula's algorithm for deciding if a file should be
> > backed up during an Incremental or Differential job
> > is to first find the start time of the relevant preceding
> > backup job (for an Incremental, it is the start time of the
> > last Full, Incremental, or Differential job, and for a
> > Differential, it is the time of the last Full backup).
> >
> > Then this "since" time is compared to the st_mtime of each file.
> > If st_mtime is greater than the "since" time, the file
> > will be backed up; otherwise the file is skipped.
>
> OK, but there's the problem (you mentioned it in
> another place on the mailing list) that the clocks on
> each backup client have to be synchronized, and on a
> Windows client (you also mentioned that) that's not normal.

Follow Phil's advice.

> > st_mtime is the last time the file was modified.
> >
> > So in your case, you are moving files, but their
> > modification times (st_mtime) are not changing, and
> > so Bacula is not backing them up.
>
> That is my experience :(
>
> > For the moment, the only solution is to "touch" the
> > files; then they will be backed up.
>
> That cannot be the solution, not even as a workaround.
> We have 10 workstations (some Windows) and about 20
> servers (mostly Linux, some Solaris, and some Windows)
> that have to be backed up, and there are not many people
> I could tell to touch their files if they want them
> to be backed up.
> Another reason is that I would not like to modify the
> modification time of a PDF or Word document, for example,
> since I like to know the original timestamp.
> And the last reason is that I want the backup program to
> be so "intelligent" that I don't have to think :-)
>
> > I have always planned to allow the user to choose
> > to back up files based on the st_atime field, which is the
> > last time the file was accessed.
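[For clarity, a minimal sketch of the since-time selection described at the top of this message. This is not Bacula's actual code; the function name and the way the file list is obtained are hypothetical, but the comparison itself is exactly what is described: back up the file only if st_mtime is newer than the "since" time.]

```python
import os


def should_back_up(path: str, since: float) -> bool:
    """Return True if the file was modified after the 'since' time,
    i.e. the start time of the relevant preceding backup job.

    Files that are merely moved or copied without content changes
    keep their st_mtime, so this test will skip them -- which is
    exactly the behavior discussed in this thread.
    """
    return os.stat(path).st_mtime > since
```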
> > In your case, this
> > would cause the files to be backed up (I think).
>
> I think so too.
>
> > However, it would also cause all unchanged files that have
> > simply been read to be backed up.
>
> That does not sound so good; as you say, files that have
> simply been read would be backed up.
>
> > It is my understanding that all backup software works
> > this way. If not, I certainly would like to know which
> > products detect new files.
>
> Hm? My understanding was the opposite, but I don't know
> for sure what is "standard". I just know that Windows clients
> often make use of the "archive" flag; that is, the Windows
> backup client takes care of setting the archive flag.

Looking at the archive flag is an interesting idea that could efficiently
solve the problem on Windows. I'll look into it.

> My main experience is with AMANDA: I thought it
> would detect new files even in incremental mode, but
> now I am not sure ...
>
> The "dump" man page says "[...] incremental backup, tells
> dump to copy all files new or modified since the last
> dump of a lower level [...]"
> and dump itself maintains a file called "dumpdates"
> which "[...] is readable by people, consisting
> of one free format record per line:
> filesystem name, increment level and ctime(3) format
> dump date. [...]"
>
> AMANDA uses dump (if available)
>
> ... I have to investigate it.

Yes, please do investigate it. I'll be VERY surprised if AMANDA or
dump work differently. Networker, on the other hand, probably does
this by comparing all files, since from what I have heard, Networker
spends a great deal of time figuring out what it is going to do
before doing it ...

> > It *is* possible for Bacula to behave as you want it
> > to (assuming some programming), because Bacula has information
> > on all files that have been previously backed up, so
> > it is possible to know that a file is new.
> > This is in fact done in Bacula Verify jobs. The downside
> > of doing this is that there is MUCH more overhead.
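[A quick illustration of the archive-flag idea mentioned above. Windows sets the FILE_ATTRIBUTE_ARCHIVE bit whenever a file is created or modified, and a backup program can clear it after saving the file. The sketch below is hypothetical, not anything Bacula currently does; note that st_file_attributes only exists in stat results on Windows, so on other systems this check simply reports the flag as unset.]

```python
import os
import stat


def has_archive_flag(path: str) -> bool:
    """Return True if Windows has marked the file as changed since a
    backup program last cleared the FILE_ATTRIBUTE_ARCHIVE bit.

    On non-Windows systems os.stat() results have no
    st_file_attributes member, so we fall back to 0 (flag unset).
    """
    st = os.stat(path)
    attrs = getattr(st, "st_file_attributes", 0)
    return bool(attrs & stat.FILE_ATTRIBUTE_ARCHIVE)
```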
> Yeah, that's what I saw in the documentation. My first thought was:
> why isn't this an option for Backup jobs?

It is a question of priorities and my human limitations.

> How much more overhead is this?

On a typical Linux installation (a 1GHz machine with 256MB of memory
and 120,000 files), I would estimate 5-10 minutes.

> I think you don't have to compare each file in the manner
> done by the Verify job. It is sufficient to get the list
> of files since (and including) the last backup and incrementals
> and to compare it with the actual file list from the client.
>
> Oh no, it's not necessary to compare against the whole file
> list. Just the last backup list would be enough, or am
> I wrong? (It's a little bit late at night.) If there's
> a file that wasn't in the last incremental run, it must
> be new, or not?
>
> And in the case of a new directory you can back up all files
> in there, because they must be "new" if the directory
> wasn't there before.
>
> > There is a similar problem during restore, only in
> > reverse: if you delete files after a full save
> > but before the last incremental save, then do a
> > full restore, those deleted files will also be
> > restored.
>
> OK, but that's not so tragic, I think.
>
> > I'd be interested to hear comments from you
> > and others on this subject.
>
> No problem, but now I will go to sleep and dream of it ;-)

Yes, you need some sleep. It isn't as easy as you think. Bacula would
need to reconstruct the current backup list with all currently backed
up files since the last Full save, then compare it to the full list on
the Client. Or conversely, if you wish, Bacula would have to examine
every file on the Client and see if there is a current backup of it by
looking up the file in the catalog.

Best regards,
Kern
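[The first of the two approaches in the closing paragraph -- reconstruct the list of files backed up since the last Full save, then compare it against the Client's current file list -- reduces, at its core, to a set difference. A minimal sketch, with hypothetical names and plain Python lists standing in for the catalog query and the Client's file tree:]

```python
def find_new_files(client_files, catalog_files):
    """Return files present on the client but absent from the list of
    files recorded in the catalog since the last Full backup.

    Anything in this difference was never backed up, so it is "new"
    regardless of its st_mtime -- exactly the case (moved files) that
    the mtime comparison misses.
    """
    return sorted(set(client_files) - set(catalog_files))
```

The overhead Kern estimates comes not from this comparison itself but from reconstructing `catalog_files` (a catalog query over the Full save plus every subsequent Incremental) and holding both lists in memory.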