From: <bac...@li...> - 2005-02-15 08:26:56
|
A BUGNOTE has been added to this bug. ====================================================================== http://bugs.bacula.org/bug_view_advanced_page.php?bug_id=0000227 ====================================================================== Reported By: ddavenport Assigned To: ====================================================================== Project: bacula Bug ID: 227 Category: Director Reproducibility: always Severity: major Priority: normal Status: feedback ====================================================================== Date Submitted: 02-08-2005 22:42 PST Last Modified: 02-15-2005 00:26 PST ====================================================================== Summary: Tapes get truncated after restore. Description: Tapes are truncated after a restore. I've run into a problem twice now and I think that I must have something wrong in my config, but I'm not sure what is up. I'm using bacula version 1.36.0 on FreeBSD 5.2.1-RELEASE-p9 and have attached a Quantum DLT-8000 (single-tape) SCSI drive. What happens is this. If I run a restore job that uses the tape that's currently in the drive, the restore runs fine, but when it's finished it doesn't seem to advance the tape back to the double EOF. Here's the messages I received this morning when it happened again: 04-Feb-2005 02:05 chii-dir: Start Restore Job RestoreFiles.2005-02-04_02.05.31 04-Feb-2005 02:07 chii-sd: Ready to read from volume "CBN-0006" on device /dev/sa0. 04-Feb-2005 02:07 chii-sd: Forward spacing to file:block 26:0. CBN-0006 was the tape that was already in the drive since I just needed to pull a copy of a file that had been saved the night before. The tape only had about 29Gb written on it. When it finished, it gave me this: 04-Feb-2005 02:17 chii-sd: End of Volume at file 26 on device /dev/sa0, Volume "CBN-0006" 04-Feb-2005 02:17 chii-sd: End of all volumes. 04-Feb-2005 02:17 chii-fd: -rw------- 1 www www 18434 2005-02-01 12:46:41 (file and dir chopped) 04-Feb-2005 02:17 chii-dir: Bacula 1.36.0 (20Oct04): 04-Feb-2005 02:17 JobId: 861 Job: RestoreFiles.2005-02-04_02.05.31 Client: chii-fd Start time: 04-Feb-2005 02:05 End time: 04-Feb-2005 02:17 Files Expected: 1 Files Restored: 1 Bytes Restored: 18,434 Rate: 0.0 KB/s FD Errors: 0 FD termination status: OK SD termination status: OK Termination: Restore OK 04-Feb-2005 02:17 chii-dir: Begin pruning Jobs. 04-Feb-2005 02:17 chii-dir: No Jobs found to prune. 04-Feb-2005 02:17 chii-dir: Begin pruning Files. 04-Feb-2005 02:17 chii-dir: No Files found to prune. 04-Feb-2005 02:17 chii-dir: End auto prune. What's worrying me is the 'End of Volume at File 26' which is where the recovery took place. It looks like it's not going back to the end of the last write. Then when the next backup tried to run: 04-Feb-2005 03:00 chii-dir: Start Backup JobId 862, Job=OpenView_Backup.2005-02-04_03.00.00 04-Feb-2005 03:00 nnm-fd: Since time adjusted by -1 seconds. 04-Feb-2005 03:00 chii-sd: Volume "CBN-0006" previously written, moving to end of data. 04-Feb-2005 03:00 chii-sd: OpenView_Backup.2005-02-04_03.00.00 Error: I canot write on Volume "CBN-0006" because: The number of files mismatch! Volume=26 Catalog=62 04-Feb-2005 03:00 chii-sd: Marking Volume "CBN-0006" in Error in Catalog. 04-Feb-2005 03:00 chii-dir: Created new Volume "CBN-0007" in catalog. 04-Feb-2005 03:00 chii-sd: Please mount Volume "CBN-0007" on Storage Device "CBNQuantum" for Job OpenView_Backup.2005-02-04_03.00.00 And now it's marked the tape in Error state instead of Append. Here's the relevent section of my bacula-sd.conf file. The btape test didn't kick back any errors with it so I assumed that everything was fine. I just tried adding the Fast Forward Space File recently, but this has happened once before when I had it set to 'No'. Device { Name = CBNQuantum Archive Device = /dev/sa0 Media Type = DLT8000 AlwaysOpen = yes Removable media = yes Random access = no Hardware End of Medium = no Fast Forward Space File = yes BSF at EOM = yes TWO EOF = yes Backward Space Record = no AutomaticMount = yes Offline on Unmount = no } Is there something I'm missing? I don't think unmounting the tape would do any good as it seems to be setting the EOV as soon as the restoration completes for some reason. Any help would be greatly appreciated. We don't do restores often, but this is twice I've had a tape error out with only about 1/5 of the space used and I don't have _that_ many spare tapes. ====================================================================== ---------------------------------------------------------------------- kern - 02-09-2005 02:28 PST ---------------------------------------------------------------------- I consider this a very serious bug and would like to get to the bottom of it. The question is whether it is a Bacula bug or an OS bug -- one or the other seems to be truncating your tape. I suggest you try the following (my email might have been a bit more detailed). Reproduce the problem with a very small FileSet of one file by: 1. Define a small FileSet of 1 file. 2. Backup the file 2 or three times on a fresh tape using a Full backup each time. Save the output to attach to this bug report as a .txt file. 3. Use the console to set a debug level of 100 on the SD by: setdebug level=100 (select the Storage daemon) 4. Restore the file from the first backup by selecting the JobId of the first job that wrote to the tape. Capture the debug output from this. 5. Try another backup to see if the tape was truncated. If so, attach the output to this bug report as a .txt file. ---------------------------------------------------------------------- chemical - 02-14-2005 03:24 PST ---------------------------------------------------------------------- This is exact the same bug as I entered in http://bugs.bacula.org/bug_view_advanced_page.php?bug_id=0000231 5 Minutes ago. Saw this post too late. But, however, I reproduced it without knowing this reported bug - so we have 2 completely different systems having the same bug. Addition: If a job in the meanwhile changes the tape after the restore, no problem here! Its just the case if one writes again directly after a restore to that tape. ---------------------------------------------------------------------- kern - 02-14-2005 07:14 PST ---------------------------------------------------------------------- OK, thanks to chemical's notes, I have been able to reproduce the problem. I have not figured out the cause, but as far as I can tell, it occurs only if "fast forward space file" is not set. ddaventport and gvan, please confirm by putting your Device resource in a bugnote. The problem is a Bacula bug, and most imporatantly you have not lost your files (at least not in my tests). The problem is that Bacula loses track of where it is on the tape after the restore if fast forward space file is not set. Despite some reports to the contrary, you should be able to restore files from your tape (at least I could). If you cannot restore from the "truncated" tapes, then there is probably another problem that I will consider later. I DO NOT recommend writing to these tapes yet. The manual describes how to correct the error status and the number of files on the tapes. ---------------------------------------------------------------------- chemical - 02-15-2005 00:04 PST ---------------------------------------------------------------------- So you mean that by explicitly defining "Fast Space Forward File = yes" (even if it is the default) does not change my configuration but fixes this bug? Or do we have to wait for the 1.36.2-release? ---------------------------------------------------------------------- chemical - 02-15-2005 00:07 PST ---------------------------------------------------------------------- I meant "Fast Forward Space File = yes" (typo) ---------------------------------------------------------------------- kern - 02-15-2005 00:14 PST ---------------------------------------------------------------------- After a bit more investigation, it looks like Fast Space Forward File is not involved with the bug. A temporary solution *should* be to unmount and remount the drive after every restore. I have not tested this though. I'll attach a patch for 1.36.1 in the next few hours, which will also be in 1.36.2. ---------------------------------------------------------------------- chemical - 02-15-2005 00:26 PST ---------------------------------------------------------------------- Added "fast forward space file = yes" to my device config, restarted storage daemon, tried again: same error. Bug History Date Modified Username Field Change ====================================================================== 02-08-05 22:42 ddavenport New Bug 02-09-05 02:28 kern Bugnote Added: 0000637 02-09-05 02:28 kern Status new => feedback 02-09-05 02:37 kern Description Updated 02-09-05 02:37 kern summary Tapes go into Error status after restore. => Tapes get truncated after restore. 02-09-05 02:44 ddavenport File Added: bug.txt 02-10-05 03:39 ddavenport Bug Monitored: ddavenport 02-10-05 05:39 gvan Bug Monitored: gvan 02-14-05 03:24 chemical Bugnote Added: 0000658 02-14-05 04:20 chemical Bug Monitored: chemical 02-14-05 07:14 kern Bugnote Added: 0000664 02-15-05 00:04 chemical Bugnote Added: 0000665 02-15-05 00:07 chemical Bugnote Added: 0000666 02-15-05 00:14 kern Bugnote Added: 0000667 02-15-05 00:26 chemical Bugnote Added: 0000668 ====================================================================== |