#16 possible corruption with multi-stream

open
nobody
None
5
2012-12-07
2002-04-18
Jamie Clark
No

I've had problems restoring from multi-stream
backup sets. Some large files (over 100 MB) appear to
be corrupt in the restored data.

This report is more of a "heads up" than anything. I
will continue to investigate (time-consuming) but
thought it worthwhile to mention in case someone else
has seen it.

Attributes of the problem so far:

1. Only large files seem to be affected.

2. Corruption was detected by the inability to gunzip
affected files (i.e. the gzip CRC check fails).

3. If I arrange the backups so that no time overlap
occurs across clients then the problem goes away.

Note that if (3) were a practical permanent solution,
I would not be using the multi-stream server.
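
The gzip check from point (2) can be automated over a restored tree. The following is only a minimal sketch in Python, not part of afbackup: the function names `gzip_is_intact` and `find_corrupt_gzip` and the `min_size` parameter are made up for illustration. It decompresses each `.gz` file to the end, which is what forces the CRC in the gzip trailer to be verified.

```python
import gzip
import os
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the file decompresses fully and its CRC matches."""
    try:
        with gzip.open(path, "rb") as fh:
            # Reading to EOF makes the gzip module check the trailer CRC.
            while fh.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        # Bad header or CRC mismatch, truncated stream, corrupt deflate data.
        return False


def find_corrupt_gzip(root, min_size=0):
    """List .gz files under root (at least min_size bytes) that fail the check."""
    corrupt = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not name.endswith(".gz"):
                continue
            if os.path.getsize(path) < min_size:
                continue
            if not gzip_is_intact(path):
                corrupt.append(path)
    return sorted(corrupt)
```

Calling `find_corrupt_gzip("/some/restore/dir", min_size=100 * 1024 * 1024)` (the path is a placeholder) would limit the check to the large files that seem to be affected.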

I have checked: all the clients have unique client
identifiers in their backup.conf files.

System details are:

Server:

Redhat 6.2
scsi0 : ncr53c8xx - version 3.2a-2
scsi : 1 host.
Vendor: ECRIX Model: VXA-1 Rev: 2424
Type: Sequential-Access ANSI SCSI revision: 02
Vendor: SPECTRA Model: 215 Rev: 1008
Type: Medium Changer ANSI SCSI revision: 02

running afbackup-3.3.5.1pl2
only running multi-stream server
using auto tape changer

Clients are all RH 6.2 running afbackup-3.3.5.1pl2,
all connecting to multi-stream server.

I'd upgrade everything to 3.3.6 right away, but I
haven't seen anything in the changelog indicating that
this problem has been addressed.

I am testing 3.3.6 at the moment.

-jamie

Discussion

  • Albert Fluegel
    2002-05-25

    Logged In: YES
    user_id=251048

    I've been thinking about this for a while now but
    haven't found any real clues. To narrow the problem
    down, I'd like to ask you to first turn off compression,
    run a parallel multi-stream backup (this takes more tape
    space, I know), then run afverify, check for corrupt
    files and try to restore one or more of them. Then
    compare the original and restored versions in more
    detail: are there only single wrong bits, blocks that
    are somehow shifted, or something else? This should give
    a first idea of what is going wrong (the problem has
    never been reported to me).
    BTW, did the problem occur on all clients or only on
    some of them?
    Thanks for putting effort into this!
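
The comparison of an original file with its restored copy can be scripted. The following is a rough sketch in Python, not an afbackup tool; the function name `diff_regions` is made up for illustration. It reports each run of differing bytes with its offset, length and number of flipped bits, which helps to tell isolated bit errors from whole damaged blocks. It only compares the common prefix of the two files, so a size difference should be checked separately, and it does not try to detect shifted data on its own.

```python
def diff_regions(path_a, path_b, chunk_size=1 << 16):
    """Return [(start_offset, length, bits_flipped), ...] for differing runs."""
    regions = []
    start = None   # offset where the current differing run began
    bits = 0       # number of differing bits in the current run
    offset = 0     # file offset of the chunk being compared
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if not a and not b:
                break
            n = min(len(a), len(b))
            if a[:n] == b[:n]:
                # Identical chunk: close a run that ended at the boundary.
                if start is not None:
                    regions.append((start, offset - start, bits))
                    start = None
            else:
                for i in range(n):
                    x = a[i] ^ b[i]
                    if x:
                        if start is None:
                            start = offset + i
                            bits = 0
                        bits += bin(x).count("1")
                    elif start is not None:
                        regions.append((start, offset + i - start, bits))
                        start = None
            offset += n
            if len(a) != len(b):
                break  # one file is shorter; stop at the common prefix
    if start is not None:
        regions.append((start, offset - start, bits))
    return regions
```

A result such as one region of length 1 with a single flipped bit would point towards bit errors, while long runs starting at regular offsets would rather suggest misplaced or mixed-up blocks.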

     
  • Albert Fluegel
    2002-05-25

    Logged In: YES
    user_id=251048

    It would be great to have the 3.3.7beta6 server
    installed for the tests, because it is more verbose
    about errors occurring while demultiplexing the data
    stream.