Re: [mbackup-devel] virtual locations
From: James O'K. <jo...@mi...> - 2000-07-06 17:42:27
On Thu, 6 Jul 2000 ra...@Te... wrote:

> Let's think of users home directories in a larger networked environment.
> Home dirs may be placed on different hosts and on different disks within
> a single host. As requirements change $HOMEs may be moved between disks
> or hosts.

A feature I was planning was the use of md5 checksums, which get indexed along with the other metadata. My idea was to use this for backing up things like /usr, which would be largely the same across machines. This idea could be extended to your problem: if the md5 sums are the same, then we don't back up the data again, we just make a reference to the previously backed-up copy. Some of the details are still fuzzy in my mind, but I hope to clear them up as things move closer to that point. I'm open to hearing other ideas that might work.

> Another point that directly affects the above mentioned idea (in case of
> a file system backup -- opposed to a data base backup) is:
>
> What is the file? Is it the data behind the inode or the file name? If
> it's the inode, what should be done in case the backup set has been
> move to a new location (or restored on a new disk)? And should the data
> be backuped when the file name or permissions/ownership have changed?

I've been treating it as a file, but I've been keeping the inode with it for later use. One way to index this would be to have two tables like so:

Metadata                 data_location
--------                 -------------
metadatarecordnum        datarecordnum
filename                 tapenum
stat() info              position_on_tape
data_location_pointer    checksum

This is just off the top of my head, but if you have two files that happen to be hardlinks to each other, then in the metadata table you would have the parts that are different: the filename, the info from a stat() call, and a reference to the data_location table. The second file that shares that inode would have different information in the metadata table, but a pointer to the same data_location.
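To make the two-table idea a bit more concrete, here is a minimal sketch in Python using the standard sqlite3 and hashlib modules. It is only an illustration under stated assumptions, not mbackup code: the function names (open_index, record_file) are hypothetical, a single inode column stands in for the full stat() info, and equal md5 sums are assumed to mean equal data.

```python
import hashlib
import sqlite3


def open_index():
    """Create an in-memory index holding the two tables from the message."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE data_location (
            datarecordnum    INTEGER PRIMARY KEY,
            tapenum          INTEGER,
            position_on_tape INTEGER,
            checksum         TEXT UNIQUE
        );
        CREATE TABLE metadata (
            metadatarecordnum     INTEGER PRIMARY KEY,
            filename              TEXT,
            inode                 INTEGER,  -- assumption: stands in for stat() info
            data_location_pointer INTEGER REFERENCES data_location(datarecordnum)
        );
    """)
    return db


def record_file(db, filename, inode, content, tapenum, position):
    """Index one file. Return True if its data is new and must go to tape,
    False if an earlier record with the same md5 sum was referenced instead."""
    checksum = hashlib.md5(content).hexdigest()
    row = db.execute(
        "SELECT datarecordnum FROM data_location WHERE checksum = ?",
        (checksum,),
    ).fetchone()
    if row is None:
        # First time this data is seen: remember where it lands on tape.
        cur = db.execute(
            "INSERT INTO data_location (tapenum, position_on_tape, checksum) "
            "VALUES (?, ?, ?)",
            (tapenum, position, checksum),
        )
        datarecordnum = cur.lastrowid
        is_new = True
    else:
        # Same checksum already indexed: only add a reference to it.
        datarecordnum = row[0]
        is_new = False
    db.execute(
        "INSERT INTO metadata (filename, inode, data_location_pointer) "
        "VALUES (?, ?, ?)",
        (filename, inode, datarecordnum),
    )
    return is_new
```

Recording two names with the same content, for example two hardlinked files, produces two metadata rows that both point at a single data_location row, so the data would only be written once.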
One tricky part, however, is getting them to be hardlinks on the restore.

Perhaps we could also add a previous_version pointer that referred to the version that came before. With some semi-complex SQL queries we could follow the chain of pointers through all the versions of the file. It's not very simple, but the possibility to do it is there.

Volunteers to help code are welcome. :)

-james
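P.S. A rough sketch of how the previous_version chain might be followed, again in Python with sqlite3. Everything here is an assumption for illustration: the helper names (open_version_index, version_history) are hypothetical, the table is cut down to three columns, and the recursive query relies on SQLite's WITH RECURSIVE, which is just one way to express the "semi-complex SQL". It also assumes the chain never loops back on itself.

```python
import sqlite3


def open_version_index():
    """Create a cut-down metadata table with a previous_version pointer."""
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE metadata (
            metadatarecordnum INTEGER PRIMARY KEY,
            filename          TEXT,
            previous_version  INTEGER REFERENCES metadata(metadatarecordnum)
        )
    """)
    return db


def version_history(db, metadatarecordnum):
    """Return record numbers from the given version back to the oldest one."""
    rows = db.execute(
        """
        WITH RECURSIVE chain(recnum, prev, depth) AS (
            -- Start at the requested version.
            SELECT metadatarecordnum, previous_version, 0
              FROM metadata
             WHERE metadatarecordnum = ?
            UNION ALL
            -- Step to whatever the current link points back to;
            -- a NULL pointer matches nothing and ends the walk.
            SELECT m.metadatarecordnum, m.previous_version, chain.depth + 1
              FROM metadata AS m
              JOIN chain ON m.metadatarecordnum = chain.prev
        )
        SELECT recnum FROM chain ORDER BY depth
        """,
        (metadatarecordnum,),
    ).fetchall()
    return [r[0] for r in rows]
```

With three versions of one file stored as records 1, 2 and 3, each pointing at the one before, version_history(db, 3) walks 3, then 2, then 1.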