Re: [Dar-support] Multiple slices on LTO6 tape

On 18/05/2022 at 13:20, Petr Skoda wrote:
> Dear Denis,

Hi Petr,

> 
> I must say that I am still confused about the slices, the
> information you have written below and everything I have read in the dar doc.
> 
> I understand that a slice created by dar_split has a special format
> which allows it to be treated differently from slices made by dar_xform.

dar_split does not provide any format. It is just used to cut a
single-sliced dar backup over several tapes, and to stick these
fragments back together so that dar gets back a single (big) slice.

> 
> But then I still do not know how to restore the content when I have
> individual slices on separate tapes. Suppose I have all data for my
> subdirectory only on slice 3 (the last one). You say I need only the
> last slice.

In your case you created a three-slice backup and dropped, or expect to
drop, each slice to a different tape. This is another way of doing things
than using dar_split, and it has some drawbacks and advantages:

drawbacks:
- as this is a multi-sliced backup, dar expects a different file name for
each slice. Thus you have to play with symlinks pointing to the tape, and
this requires forcing dar to pause between slices (-p option) in order
to have the time to:
  - rewind the tape
  - change the tape
  - remove / add a symlink pointing to the tape and having the name of
  the next slice
- the backup process does not write directly to tape: you need to store at
least one slice at a time on local disk, copy it to tape, remove it
from disk (or keep it if you have enough disk storage), then continue
with the next slice. Using dar_split, you can send the backup directly
to tape.

advantages:
- you do not need dar_split
- if for some reason an I/O error occurs while writing a slice to tape, you
don't have to restart the whole backup: slices already on tape are fine,
you just need to retry writing the failed slice (possibly to a
different tape).

For the rest, this is the same as what's in the FAQ: you still need to read
with the --sequential-read option, and you can use the -E option to automate
what's possible, instead of or in addition to the -p option.
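The write-side workflow described above (stage one slice on disk, copy it
to tape, free the disk space, and retry only the slice that failed) can be
sketched as follows. This is only an illustration: slice_to_tape is a
hypothetical helper, not part of dar, /dev/nst0 is an assumed device name,
and the 256k block size is simply the one used with dd elsewhere in this
thread.

```shell
#!/bin/sh
# Hypothetical helper, not part of dar: copy one staged slice to tape and
# delete the staged copy only if the copy succeeded.
slice_to_tape() {
    slice=$1     # staged slice file, e.g. backup.2.dar
    device=$2    # tape device, e.g. /dev/nst0 (assumption)

    if dd if="$slice" of="$device" bs=256k; then
        # The slice is on tape, the disk space can be reclaimed.
        rm -- "$slice"
    else
        # Keep the slice so that only this one has to be written again.
        echo "writing $slice failed; slice kept on disk for a retry" >&2
        return 1
    fi
}
```

Such a helper could be run by hand while dar is paused by -p, for example
as: slice_to_tape backup.2.dar /dev/nst0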

> So I insert the tape with slice 3 and want to extract the catalogue for
> further use.
> How do I run dar -C? Something like
> dd if=/dev/nst0 bs=256k | dar -C  mycat -A - seems not to work.

assuming a three sliced backup, this should work (I have no tape drive
to test it):

ln -s /dev/nst0 backup.3.dar
dar -C backup_isolated_cat -A backup -z --sequential-read

If you get a message about fadvise not available on the device, you need
to compile dar with:
	./configure --disable-fadvise
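The symlink and isolation commands above can be wrapped so that the
temporary symlink is always removed afterwards. This is a minimal sketch,
untested on a real tape drive: isolate_catalogue_from_tape is a
hypothetical helper name, and the dar invocation is exactly the one given
above.

```shell
#!/bin/sh
# Hypothetical wrapper, not part of dar: expose the tape holding the last
# slice under the file name dar expects, isolate the catalogue, clean up.
isolate_catalogue_from_tape() {
    base=$1      # backup basename, e.g. "backup"
    last=$2      # number of the last slice, e.g. 3
    device=$3    # tape device, e.g. /dev/nst0 (assumption)
    out=$4       # basename of the isolated catalogue to create

    # dar looks for <base>.<last>.dar, so give the tape device that name.
    ln -s -- "$device" "$base.$last.dar" || return 1

    dar -C "$out" -A "$base" -z --sequential-read
    status=$?

    # Remove the symlink whatever dar returned.
    rm -- "$base.$last.dar"
    return "$status"
}
```

Example call, with the tape holding slice 3 loaded and rewound:
isolate_catalogue_from_tape backup 3 /dev/nst0 backup_isolated_cat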

> 
> I know that I must use the generic name of the slices (without the
> .3.dar) to work on archive. But if it is read from a pipe after dd - how
> to do it ?

Using a symlink, as shown above. As you see, it is not very comfortable
to work that way (playing with symlinks in addition to playing with tapes).

It would be better to convert this three-slice backup to a single-sliced
backup (using dar_xform). During the conversion, you can use dar_split to
send the resulting single-sliced backup directly to tape, and thus avoid
storing the data twice on local disk (single-sliced backup and
three-slice backup). This is what I proposed in my previous email.

> 
> Furthermore you have written in the doc that special control sequences
> are interspersed across the tape to allow the reconstruction, but the -0
> (sequential) mode must be used...  So why can I not do this to get the
> content of a particular tape holding a slice?

First, because dar expects the backup to start with a backup header, which
is only contained at the beginning of the first slice. More precisely,
this information is also duplicated at the end of the backup, where it is
used when reading the backup without --sequential-read (so-called direct
access mode).

However, dar has a feature that lets you recover as much as possible of a
corrupted backup (the -alax mode), but it is painful as you have to provide
by hand the few pieces of information contained in that header (which is
missing here), based on the dar version you were using for the backup. The
archive format has evolved over time and the current dar version is still
able to read backups created with version 1.0.0 more than twenty years
ago. However, it must know which version of the format the backup was
created with in order to read it properly.

So you can read the content of a slice taken from a multi-sliced backup
using both the --sequential-read and --alter=lax options. But that's painful
and does not handle files that span two consecutive slices (it is
quite improbable that the end of a slice matches the end of a saved file
and that the next slice starts with a new file).

> 
> I suppose that the FAQ should still explain this question in detail:
> 
> "I have split the data with dar_xform (e.g. after netcat) into several
> slices. Each of them was written to a separate tape. How do I extract
> the backup?"

I could add this, but I will probably not, because from my point of view
this is not the best way of using dar with tape. What you did is pretty
logical and you could not have guessed the other way (dar_split) due to the
lack of a FAQ entry about it (my bad). Don't you agree that using dar_split
instead of a multi-sliced backup only brings advantages? If you can explain
to me some drawback of using dar_split in that context, I will reconsider
my position on that :)

From my point of view, in your case, if you don't want to remake the whole
backup, convert it to a single-sliced backup with dar_xform and output
it to tapes using dar_split.

Assuming you have backup.1.dar, backup.2.dar and backup.3.dar available,
this is done this way:

dar_xform backup - | dar_split split_output /dev/nst0
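For reference, here is a sketch of both directions. The write side is the
command given above. The read side is an assumption based on dar_split's
split_input mode, which concatenates the tapes back into the single slice
that dar then reads sequentially from its standard input. Neither has been
tested on a real tape drive here, and the function names are hypothetical.

```shell
#!/bin/sh
# Write side (command from the message above): merge the slices of the
# multi-sliced backup into one stream and spread it over as many tapes
# as needed.
multi_to_tapes() {
    base=$1      # basename of the existing multi-sliced backup
    device=$2    # tape device, e.g. /dev/nst0 (assumption)
    dar_xform "$base" - | dar_split split_output "$device"
}

# Read side (assumed usage): dar_split glues the tapes back together and
# dar tests the resulting single slice in sequential-read mode.
tapes_test() {
    device=$1
    dar_split split_input "$device" | dar -t - --sequential-read
}
```

Example calls: multi_to_tapes backup /dev/nst0 to write the tapes, then
tapes_test /dev/nst0 with the first tape loaded to check them.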

> 
> All of your answers are about dar_split, but you say it must be used to
> create the (special) slices. What should I do if I did not do that and just
> used dar_xform, or simply dar with the -s and -S options at creation?

I am not sure I understand your question.

> How to extract catalog from such tapes ? 

we have answered this question above

> How to use sequential mode.

You can use --sequential-read even with a multi-sliced backup. But as
already said, you need to mimic the file names using symlinks so that dar
finds the slices it expects.

> 
> Sorry for my ignorance, but I really have spent  a lot of time reading
> various docs on dar web and mail list but still have very weak
> understanding what to do ...

No worries, documentation can always be improved, and your input is valuable
(see this new FAQ entry that was missing; I guess if you had had it available
from the beginning, things would have been simpler for you, right?)

> 
> So far it looks that I cannot do anything with multiple slices on tape
> except of feeding the drive with all tapes in sequential mode .

No, you can: just play with a symlink to the tape device, as seen previously.

> I am not even able to verify whether a particular slice is correct (as -l
> does not work on anything other than the first slice)

--sequential-read is also available with a sliced backup, and is thus
suitable for tapes written as you did, at the cost of symlink manipulation
to emulate the file names.

> Extracting the catalogue from the
> last one is also problematic.
> Is the only answer to use dar_split (so I have to start the backups
> from scratch)? I have already written ten or more tapes with slices
> from multiple backups (filesystems).
> 
> 
> 
> Best regards, Petr Skoda
> 
> 
> ---------- Original e-mail ----------
> From: Denis Corbin <dar...@fr...>
> To: dar...@li...
> Date: 13. 5. 2022 18:38:45
> Subject: Re: [Dar-support] Multiple slices on LTO6 tape
> 
> 
> 
>     >
>     > I have saved each slice to one tape using  dd if=datas.1.dar
>     > of=/dev/nst0 bs=256k etc...
>     > I still have the 3 slices on my staging disk (taking almost 7TB) .
>     But I
>     > am not sure how to list the content , check the consistency and
>     finally
>     > extract without needing such large space in the future.
> 
>     in the mode you have been using, if you want to list the backup
>     content,
>     dar will only need the last slice. This will lead you to read the whole
>     3rd slice from tape.
> 
> 
> how the command should look like (with dd and pipe - I need dd because
> of larger block) ?
> 
> 
>  
> 
>     Better, you can use instead an isolated catalogue of
>     the backup on tape (something you can do on the fly or afterward). An
>     isolated catalogue is usually small and does not need to be backed up
>     to tape, as you can recreate one from a backup at any time.
> 
>     example to create a isolated catalogue:
>     dar -C isolated -A backup -z ... 
> 
> 
> Yes but then I need still to have all slices on the disk (called
> backup.1.dar , backup.2.dar etc.... - 10TB or more)
> 
> 
>  
> 
> 
> 
>     If you want to extract a single file from backup, dar will ask the last
>     slice, then the slice where the file/tape to restore resided. But if
>     you
>     restore with the help of an isolated catalogue, dar will only need the
>     slice where is located the file to restore.
> 
> 
> How will it get the right info if reading from tape with the other than
> first slice ?- the extract will not accept the data from tape - (except
> the first)?
> 
> I.e. when using dd if=/dev/nst0 bs=256k | dar -0 -x -g something  it worked
> on the tape with backup.1.dar
> 
> 
> 
> 
> 
>  
> 
> 
>     example to restore with the help of a catalogue
>     dar -x backup -A isolated ... 
> 
> it is written many times in docs - but here I need the name of backup
> (so it must be on the  disk and in fact all slices )
> 
> 
>  
> 
> 
> 
> 
>     > I would like
>     > simply to restore only some files in the future getting instruction
>     > which tape to use and than extract  Ideally using some pipes combined
>     > with dd if=/dev/nst0 bs=256k |  dar something -x -
>     > However dar requires all slices together - I am not able to use only
>     > e.g. datas.2.dar to list only its contents.
> 
>     You could better use dar+dar_split instead, if this suits your need.
>     The
>     advantage is that it will not need you any extra storage to restore or
>     list the backup, but will act a bit like tar, reading the tapes from
>     the
>     first toward the last up to the point the file's metadata and data to
>     restore are reached. 
> 
> 
> ok - so dar_split is something different which creates some header and
> catalogues for each slices/tape ?
> 
> 
>  
> 
> 
> 
>     Note that you cannot use dar_split at reading time if you have not used
>     it at creation time. when using dar+dar_split, dar creates a single
>     slice, slice that dar_split (as you can guess from its name) splits
>     over
>     different tapes. Thus, at reading time, dar expects a single slice and
>     not the concatenation of two or more slices, which is what dar_splits
>     does at reading time (concatenating the content of several tapes).
> 
> 
> so can I read from tape with dar_split (using -0 as well )or not)
> 
>  
> 
> 
>     However you can convert a multi-sliced backup to a single-sliced
>     backup using dar_xform and then use dar_split:
> 
>     dar_xform backup - | dar_split split_output /dev/tape
> 
>     note that if you had an isolated catalogue from 'backup' it stays a
>     valid isolated catalog for the single sliced backup generated by
>     dar_xform, that dar_split has written to several tapes.
> 
> 
> Would you mind to write a concrete examples of using /dev/nst0 or
> /dev/tape)
> 
> instead of name of the backup ?
> 
> Combination with pipe on dd ?
> 
>  
> 
> 
>     >
>     > When starting with dar I expected that in sequential mode with some
>     > flags (even the -al does not work here) I will be able to scan
>     what is
>     > on given slice and get something from it.
> 
>     you can get the file content per slice using the -Tslice option while
>     listing a multi-sliced backup (or an isolated catalogue from this
>     backup).
> 
> 
> In fact it did not help in reading a single slice. It only tells me
> (while having all slices on disk) on which slices the subdirectory pointed
> to by -g is written
> 
> 
>  
> 
>     But you lose this ability when using dar with dar_split as from
>     dar stand point, there is only one slice.
> 
>     > But now having more tapes I am even not able to check what is on the
>     > tape (if it is slice 2 I am not able simply to use
>     > dd if=/dev/nst0 bs=256k |dar -0 -l  -   to see just for
>     orientation part
>     > of the listing of content.
> 
>     there is not really a table of content per slice, a sliced backup is
>     still a coherent backup with an table of content at the end of the
>     slice
>     set. 
> 
> 
> I understand - but it seems also not to have header recognized by dar
> (except of the first)
> 
>  
> 
> 
>     Dar is not expecting to have all slices available at any time, you can
>     use -p option to pause after each (n) created slice(s) and do what is
>     needed, like move the produced slices to tape then remove it, before
>     having dar continuing its work. At reading time, dar will ask for the
>     missing slice and pause, you can then obtain it from tape and let dar 
> 
> 
> Ok - but it means I need to feed all slices (tape by tape) until finding
> the right one.
> 
> Even if I know that my subdir is on a slice number n>1
> 
>  
> 
> 
> 
> 
>     with dar_split you do not need mbuffer if you just want to rate limit
>     the throughput (see its -r option)
> 
> 
> yes, but mbuffer prints the progress, the data rates etc. And the
> buffer filling is IMHO important for preventing tape shoe-shining.
> 
> I do not want to slow down feeding the tape (which is probably what
> -r does); when writing, that is useful if the tape drive does not cope
> with the speed of the incoming data. But with LTO on a SAS controller the
> write speed is high and the tape waits for data to be fed (and e.g.
> compressed), so a large buffer (several GB) is needed so as not to let
> the tape stop.
> 
> In fact  in my setup the hard disk is slower than tape speed (as it is
> not on dedicated SAS)
> 
> But I will try to make some tests to prove this my subjective feeling.
> 
> 
> 
> 
> 
