From: <no...@bu...> - 2004-06-02 15:50:14
A BUGNOTE has been added to this bug.
======================================================================
http://bugs.bacula.org/bug_view_advanced_page.php?bug_id=0000004
======================================================================
Reported By:                scoopex
Assigned To:
======================================================================
Project:                    bacula
Bug ID:                     4
Category:                   File Daemon
Reproducibility:            sometimes
Severity:                   feature
Priority:                   normal
Status:                     feedback
======================================================================
Date Submitted:             02-06-2004 02:48 PDT
Last Modified:              02-06-2004 08:50 PDT
======================================================================
Summary:                    Director and Storage daemon passwords or names not the same
Description:
The passwords of the SD and DIR are definitely correct, but bacula-sd stops
working after a short period of time.
Message-ID: <Pin...@mo...> on the bacula-users list describes the problem
very well; it is similar to my situation.

Restarting all daemons has fixed the problem for a few days in the past.

I'm keeping gdb attached to the crashed process. If you need additional
information, please let me know what I should do or what additional tests
you need.
This occurs on my test setup, so we can run almost any test :-)
======================================================================

----------------------------------------------------------------------
 scoopex - 02-06-2004 02:55 PDT
----------------------------------------------------------------------
Additional information about the job that was running when the
storage daemon stopped working.

--
removed.lf.net-dir: No prior Full backup Job record found.
removed.lf.net-dir: No prior or suitable Full backup found. Doing FULL backup.
removed.lf.net-dir: Start Backup JobId 1573, Job=removed.lf.net.2004-06-02_10.45.11
uml.lf.net-sd: Volume "LFO-018" previously written, moving to end of data.
removed.lf.net-fd: removed.lf.net.2004-06-02_10.45.11 Fatal error: job.c:1151 Comm error with SD. bad response to Append Data. ERR=Resource temporarily unavailable
removed.lf.net-dir: removed.lf.net.2004-06-02_10.45.11 Error: Bacula 1.34.2 (24Apr04): 02-Jun-2004 11:10
JobId:                  1573
Job:                    removed.lf.net.2004-06-02_10.45.11
Backup Level:           Full (upgraded from Incremental)
Client:                 removed.lf.net-fd
FileSet:                "removed.lf.net" 2004-05-27 10:09:31
Start time:             02-Jun-2004 10:45
End time:               02-Jun-2004 11:10
FD Files Written:       0
SD Files Written:       0
FD Bytes Written:       0
SD Bytes Written:       0
Rate:                   0.0 KB/s
Software Compression:   None
Volume name(s):
Volume Session Id:      1
Volume Session Time:    1086165655
Last Volume Bytes:      35,432,201,328
Non-fatal FD errors:    0
SD Errors:              0
FD termination status:  Error
SD termination status:  Error
Termination:            *** Backup Error ***
--

Maybe the tape contains an unfinished backup from a test run some weeks
ago...

----------------------------------------------------------------------
 scoopex - 02-06-2004 05:02 PDT
----------------------------------------------------------------------
Sorry, I selected the wrong "Severity" for this bug; it should be "block".

Best regards
Marc Schoechlin

----------------------------------------------------------------------
 kern - 02-06-2004 07:19 PDT
----------------------------------------------------------------------
I don't understand what is going on here, so we may need to debug some
more. Thanks for figuring out how to generate it.

Your gdb output is OK (it is missing the debugging symbols, but that is not
important at this point). It shows that the Storage daemon is busy
attempting to get to the end of the tape. I cannot understand why it would
be doing that.

If you have a tape loaded, Bacula might try to read the tape, but it
shouldn't attempt to position it.

Do you by any chance have a blank tape loaded that is running away causing
the Storage daemon to appear to hang?
Have you run a successful btape "test" command as well as the tapetest
program recommended in the manual?

----------------------------------------------------------------------
 scoopex - 02-06-2004 08:50 PDT
----------------------------------------------------------------------
>If you have a tape loaded, Bacula might try to read the tape, but it shouldn't attempt to
>position it.
>Do you by any chance have a blank tape loaded that is running away causing the
>Storage daemon to appear to hang?

Sure. I have now inserted a "purged" tape and started a new backup run, and
the storage daemon does not seem to have any problems.
(The Incremental job finished successfully.)

I suppose that this problem could be the result of an interrupted backup.
(I stopped bacula-dir by running the init script with the parameter "stop";
after starting it again, I saw with mytop that bacula was purging the
incomplete entries in the database.)

Maybe this is a problem because bacula is searching for the end of the
stream on the tape. I suppose there are some markers, and bacula misses
them...?
(I don't know much about tape drives, so forgive my wild guesses :-))

>Have you run a successful btape "test" command as well as the tapetest program
>recommended in the manual?

Yes, of course. I ran the described test and tapetest.c, which detects
the pthreads problems on FreeBSD machines.
Both were successful!

Bug History
Date Modified  Username       Field                    Change
======================================================================
02-06-04 02:48 scoopex        New Bug
02-06-04 02:48 scoopex        File Added: trace1
02-06-04 02:55 scoopex        Bugnote Added: 0000005
02-06-04 05:02 scoopex        Bugnote Added: 0000007
02-06-04 07:19 kern           Bugnote Added: 0000009
02-06-04 07:51 kern           Description Updated
02-06-04 07:51 kern           Steps to Reproduce Updated
02-06-04 07:51 kern           Additional Information Updated
02-06-04 07:51 kern           Status                   new => feedback
02-06-04 08:50 scoopex        Bugnote Added: 0000010
======================================================================