[Dump-users] Re: Unable to handle removed files via incremental dumps

Hi,

[...]
> > So, after all it should work if you initially create the mirror using
> > dump -0 | restore.
> 
> You probably mean restore -r here.

Yep, something like that ...

> I've tried that. You're right. It works. However, the documentation states
> that -r should be used only on a fresh filesystem. Isn't it dangerous
> if it's used on an already populated one?

Dunno, UTSL - or maybe some developer will be able to help you out with
that one?!

But can't you use some separate filesystem for that mirror?

> > As an alternative to dump I'd recommend rsync which is made exactly for
> > this kind of task. However, take care to disable checksumming for
> > local copying - maybe that rsync does that automatically, though ...
> >
> 
> I'm currently using rsync. However, even
> find . | wc -l
> takes a huge amount of time when we're talking about, let's say, one million
> files. find/rsync call stat for each file. I'm not sure whether the kernel
> keeps a copy of the FAT (I don't know what it's called for ext2/3) in memory,
> but even in that case the context switches between the kernel and userspace
> for each stat seem expensive and unnecessary when we could read that
> information directly from the FAT. That's what dump does, right?

The information needed to decide whether to dump a particular file is stored
in the inode on ext2/3 and in the directory entry/entries on FAT filesystems.
The FAT itself is needed only when actually dumping a file (and for finding
directory blocks): it only indicates which blocks ("clusters") belong
together, not any other properties of that group of blocks, not even its exact
used size or whether it is a file or a directory, both of which are stored in
the corresponding directory entry. On ext2/3, the block mapping is stored with
the inode (at least for smaller files), thus eliminating any need for an
additional read. However, Linux caches inodes in main memory just like any
other filesystem information, so for caching purposes it shouldn't matter
where the needed information is stored.

BTW, how do you think dump should read blocks from the filesystem
without context switches? The only thing dump does not use is the kernel's
filesystem code. The block device drivers and the block buffering
mechanisms are just the same as with the kernel's filesystem driver.
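
As a small illustration (again a Python sketch, not how dump is implemented):
reading filesystem structures straight from a device or image file still means
ordinary seek/read system calls. The helper name has_ext2_magic is made up for
this example; the offsets are the standard ext2/3 ones (superblock at byte
1024, 16-bit little-endian magic 0xEF53 at offset 56 within it).

```python
import struct

EXT2_SUPERBLOCK_OFFSET = 1024  # superblock starts 1024 bytes into the device
EXT2_MAGIC_OFFSET = 56         # position of s_magic inside the superblock
EXT2_MAGIC = 0xEF53


def has_ext2_magic(device_path):
    """Return True if device_path (a block device or an image file)
    carries the ext2/3 superblock magic. Hypothetical helper."""
    # buffering=0: every seek/read below maps directly to a system call,
    # so bypassing the filesystem driver does not bypass the kernel.
    with open(device_path, "rb", buffering=0) as dev:
        dev.seek(EXT2_SUPERBLOCK_OFFSET + EXT2_MAGIC_OFFSET)
        raw = dev.read(2)
    if len(raw) < 2:
        return False
    (magic,) = struct.unpack("<H", raw)
    return magic == EXT2_MAGIC
```

Run against a real device this needs read permission on the device node; an
image file works just as well for trying it out.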

However, I don't know whether dump optimizes block device access by
sorting requests or something.

Maybe you could provide some more detailed information on the problem
you want to solve/the kind of data you have to back up?

Cyas, Florian

PS: Could you please limit your quoting to the parts needed?