Re: [Dar-libdar_api] Re: last archive slice
From: Johnathan B. <jk...@sh...> - 2004-07-25 07:36:28
On July 24, 2004 11:22 pm, Denis Corbin wrote:
> Johnathan Burchill wrote:
> | Hi Denis,
>
> Hello Johnathan,
>
> | In dar version 2.1.3, libdar mistakes the last slice number sometimes.
> |
> | For example, create an archive with slice sizes of 3 MB. Say this
> | results in 6 slices. Now create the archive, same basename, same
> | directory, and allow the library to overwrite the existing archive,
> | but use 5 MB slices. Suppose it results in 3 slices.
> |
> | Now try to list the archive. I get the following user interaction
> | question:
> | "/opt/user/backups/dar_backups/test6.6.dar is a file from another set
> | of backup file, please provide the correct file."
> |
> | Then the user has to delete slices, one by one starting with the
> | highest number, and retry the list command until it works.
> |
> | Is this the expected behaviour?
>
> yes.
>
> | Shouldn't libdar see that the third slice,
> | in this example, would be the last archive slice? I have to admit,
> | it's confusing for a user who does this to know that the archive
> | consists of only three slices when six are on disk.
>
> It is a user mistake to keep slices of different archives under the
> same basename in the same directory.
>
> There is no use in keeping the last slices of an old archive, it only
> wastes your disk space: if you miss the first slice, you won't be able to
> restore anything from your old archive. Else, if you definitely want
> to keep several archives with the same name, it is not a very good idea
> to mix their respective slices together in the same directory... ;-) no?
That is exactly the problem I was trying to address. The slices _are_
absolutely useless, and just waste disk space, so why not have libdar
remove them?
>
> | You wouldn't have to change the slice size to get this problem.
> | Suppose you did a full backup and got 12 CD-R-sized slices. Then during
> | the course of the following week, you delete a bunch of directories that
> | you don't need anymore, and do a full backup which has 11 CD-R-sized
> | slices. Dar will miss the fact that the last slice is #11, and will
> | still ask the user for the 12th slice.
>
> What would the 12th slice be useful for? Why would it have to stay on
> disk?
>
> | I see the following solutions:
> |
> | 1) libdar deletes all slices on a disk if they have the same storage
> | directory and basename as the new one being created. It might do this
> | before creating the archive, or after creating the archive. Perhaps
> | doing this after will be more efficient, since you only delete slices
> | that haven't been overwritten, if there are any.
>
> I do not agree. libdar deletes nothing. 'rm' command does. ('rm' or
> unlink() system call)
Libdar does delete files on disk if the appropriate conditions are met
during a restoration, but I quibble :).
But libdar does overwrite archive slices! Is this not a form of deletion?
Consider the echo command:
echo "hello" > file1
echo "bye" > file1
What is file1?
cat file1
bye
It is not "byelo", yet that is analogous to what dar does when creating a
new archive with slices while replacing one that already exists. As you
intend this to be the behaviour, I suggest that it is not the obvious or
intuitive one, and that it should at least be documented. Apologies in
advance if it is already in the manpage.
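To make the analogy concrete, here is a minimal C++ sketch of the two behaviours, using only the standard library and nothing from libdar. The helper names (write_replacing, write_in_place, read_all) are made up for illustration: the first truncates the file as the shell's > does, the second overwrites only the leading bytes, which is roughly what happens to a multi-slice archive when only some of its slices are rewritten.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <string>

// Replace the whole file: std::ofstream truncates by default,
// like `>` in the shell. (Hypothetical helper, not libdar.)
void write_replacing(const std::string& path, const std::string& text) {
    std::ofstream out(path, std::ios::binary);
    out << text;
}

// Overwrite only the leading bytes and keep whatever follows.
// Opening with in|out does not truncate; the file must already exist.
void write_in_place(const std::string& path, const std::string& text) {
    std::fstream io(path, std::ios::in | std::ios::out | std::ios::binary);
    assert(io.is_open());
    io.seekp(0);
    io << text;
}

// Read the whole file back as a string.
std::string read_all(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

Writing "hello" and then "bye" with write_replacing leaves "bye"; doing the second write with write_in_place instead leaves "byelo".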
I conceptualize disk archives as objects, whether they are made up of
slices or not. Making a new backup in a directory that contains an archive
of the same name should replace the archive as a whole, not on a
slice-by-slice basis. I do not see anything "clever" or assuming about
that for libdar.

An archive with one slice, made with a slice size of 0, would be completely
replaced by the new archive. By your logic you should overwrite the
initial portion of the old archive with the new one, not completely
replace it.

Same thing for when someone saves a file in a document processor. You don't
expect to have to first remove the one on disk before saving what is in
memory to the same filename.

In fact, in the API, libdar::archive is a class for which you can
instantiate archive objects. The objects aren't individual slices, they
are entire archives. Overwriting an old archive object with a new archive
object with the same name, same directory, should completely replace the
old archive object.
>
> | 2) The search algorithm gets fixed to stop on the first slice with a
> | "T" type terminating character in the header. Am I correct in
> | interpreting the "T" to mean that the slice is the last one in the
> | archive? I assume that libdar figures out the last slice by relying on
> | the filename, not the header, although I haven't checked the sourcecode.
>
> Assuming in libdar the mistakes of the users is not a good idea. I can
> already hear some users complaining that libdar does not remove useless
> slices of older archives, while it no longer complains that extra
> slices exist and only waste disk space.
I understand your point now.
I am not suggesting though that libdar just stop reporting that there are
extra useless slices, I am suggesting that it do something to stop that
situation from happening. I will make a weaker request then, that at least
libdar warn the user of this situation during the creation process.
For instance, would it not make sense for libdar to report during the
creation process that there are extra useless slices, report which ones
they are, and ask the user whether they should be removed, or warn the
user that they should be removed manually? You eliminate the assumption by
asking the question.
After all, you do ask if your user intends to get into the "endless loop"
by backing up a directory that contains the archive, with no exclusion
file filters. You don't let it happen without the question.
>
> I don't like MS-office like programs that tend to have the pretension
> to be more clever than their users. They either get too restrictive and
> blindly forbid operations needed by users more clever than the program
> developers, or automatically do stupid things or ask the user stupid
> questions that even less clever users find boring.
I wholeheartedly agree! :)
>
> | 3) A combination of 1) and 2).
>
> no again. If a user wants to clean its previous archive it is simple:
>
> ~ rm old_archive.*.dar
> ~ dar -c old_archive ...
>
> nothing more nothing less. Simple, everybody understands, no hidden
> features from dar or libdar, no hidden surprises. (lib)dar is only a
> backup software.
>
> | I see a similar problem when creating archives with KDar. I have a
> | progress bar that periodically updates which slice is currently being
> | written, and what that slice's size is. When the "pause" between slices
> | option is chosen, and libdar asks if it is okay to continue writing the
> | next slice, first it writes the "non-terminating" character to the slice
> | header, and then waits for the user to answer the question. In the case
> | where the user is overwriting an older archive, the statusbar code runs
> | through the files on disk until it finds the one with the "T" in the
> | header. This will be the last one in the older archive, and the statusbar
> | will jump to that slice instead of the current one.
>
> From my point of view, this is here again a user mistake. Once you
> overwrite the first slice of the old archive, all the other slices of
> this archive are useless. Why not first remove all the slices of the
> old archive you have planned to overwrite?
I agree: if the user removes the old archive before creating the new one,
this situation will never arise in KDar.
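The status bar problem can be reproduced with a toy model. The sketch below represents each slice on disk by its termination flag only ('N' or 'T', as discussed above); the real slice header layout is not modelled, and first_terminal_slice and overwrite_slice are hypothetical names, not libdar or KDar functions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One flag per slice file on disk, in slice order:
// 'N' = non-terminal slice, 'T' = terminal (last) slice.
// Returns the number of the first slice flagged 'T', or 0 if none.
std::size_t first_terminal_slice(const std::vector<char>& flags) {
    for (std::size_t i = 0; i < flags.size(); ++i)
        if (flags[i] == 'T') return i + 1;  // slice numbers start at 1
    return 0;
}

// Simulates rewriting slice number n of an archive that already exists.
void overwrite_slice(std::vector<char>& flags, std::size_t n, char flag) {
    assert(n >= 1 && n <= flags.size());
    flags[n - 1] = flag;
}
```

With an old six-slice archive (N N N N N T), rewriting slices 1 and 2 with 'N' leaves the scan pointing at slice 6, not slice 2, which is the jump seen in the status bar. Once the new archive ends at slice 3, a scan that stops at the first 'T' finds slice 3, although six files remain on disk.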
What I do not understand is why, given that you admit the remaining slices
become useless once the first one is overwritten, libdar does not remove
those useless extra slices automatically.
Can you imagine some situation where, for a more clever user, those extra
slices are not useless?
Perhaps this is a stupid question! :)
>
> | Perhaps libdar should write the "N" to the termination type only if
> | the user says it's okay to continue, and write an "I" to the
> | termination type if the user cancels. That way we know that the "last"
> | archive slice is actually invalid, i.e. the creation process was
> | aborted, or failed in some way.
>
> Perhaps KDar could propose a "remove old archive slices" option? ;-)
> An option that can simply rely on the rm command or the unlink system
> call...
>
Agreed. Except I still think if you allow the user to overwrite an archive,
the entire archive should be replaced, not just individual slices.
> | For the status indicator, the best solution would be for libdar to
> | have a "currentSlice" method, which could be called at any time to
> | determine the slice number of the current one being written or read.
>
> a new callback function ? ;-)
There's no need for a callback function. The method just returns the slice
number, a libdar::infinint. I envision usage as such:
//(start a new creation thread):
libdar::archive *newArchive;
createArchiveThread *createThread = new createArchiveThread( newArchive, ...);
createThread->start();
//(occasionally check the current slice number):
while ( createThread->running() )
{
    //ask the archive being created which slice it is on
    libdar::infinint currentSlice = newArchive->currentSlice();
    //report the current slice to the user
    updateStatusBar( currentSlice );
    //Sleep and check again
    sleep( DURATION );
}
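The same polling pattern can be tried without libdar or KDar. The following is only a self-contained sketch under stated assumptions: FakeArchive is a made-up stand-in that pretends to write slices, its currentSlice() mirrors the method proposed above (which libdar does not provide as far as this thread shows), and poll_progress plays the role of the status bar loop.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <thread>
#include <vector>

// Hypothetical stand-in for an archive being created; not libdar's API.
class FakeArchive {
public:
    explicit FakeArchive(unsigned total) : total_(total) { assert(total > 0); }

    // Worker side: "writes" each slice in turn and publishes its number.
    void create() {
        for (unsigned n = 1; n <= total_; ++n) {
            current_.store(n);
            std::this_thread::sleep_for(std::chrono::milliseconds(20));
        }
        done_.store(true);
    }

    // Safe to call from another thread while create() is running.
    unsigned currentSlice() const { return current_.load(); }
    bool finished() const { return done_.load(); }

private:
    const unsigned total_;
    std::atomic<unsigned> current_{0};
    std::atomic<bool> done_{false};
};

// Runs create() on a worker thread and polls the slice number from the
// calling thread, recording each value a status bar would have shown.
std::vector<unsigned> poll_progress(FakeArchive& archive) {
    std::vector<unsigned> seen;
    std::thread worker([&archive] { archive.create(); });
    while (!archive.finished()) {
        seen.push_back(archive.currentSlice());
        std::this_thread::sleep_for(std::chrono::milliseconds(5));
    }
    worker.join();
    seen.push_back(archive.currentSlice());  // final value after completion
    return seen;
}
```

The slice number is kept in a std::atomic so the reader needs no lock and no callback; the cost is that the status bar only sees the value at the moments it polls.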
>
> | Should I file a bug report for any of this?
>
> I don't think so, do you still ?
>
I do see now that the issue is whether the user or the library should be
removing slices. Fair enough, your philosophy of design is to keep it simple
and unassuming. Just because I cannot imagine any situation in which the
user will ever need those "useless" extra slices doesn't mean that
someone else won't.
I will implement a "do not overwrite entire archive" option that defaults
to being off.
[...]
>
> Cheers,
> Denis.
Cheers,
JB
--
Johnathan K. Burchill, Ph.D.
Department of Physics and Astronomy
University of Calgary
2500 University Drive N.W.
Calgary, AB T2N 1N4
Canada
(403) 217-4286
jk...@sh...