From: Adam C. <ada...@li...> - 2008-05-19 16:19:47
Okay, Bacula-sd and the director are currently running with -d 100, and we plan NOT to change the tape tonight.
I hope we'll get a useful traceback. If not, I'll rebuild the Debian package with NOSTRIP and see if the SD sends us a backtrace.

Regards, Adam.

Arno Lehmann wrote:
> Hi,
>
> 19.05.2008 08:52, Adam Cécile wrote:
>> Hi,
>>
>> Could you please tell me more about how to get a useful traceback?
>
> You need gdb installed, and the binaries you run should not be
> stripped. The former is usually ensured by your package manager, the
> latter is typically done by compiling from source. I believe you need
> the -g switch to gcc, and during install, you skip the strip process.
>
> Bacula itself has a script that is automatically called when a program
> crashes, which will create a backtrace and mail it to the configured
> operator.
>
> You had best verify that all this works, for example by sending a signal
> to the debug-version SD.
>
> Also, have a look at the mail I just sent regarding Maria's problem...
> the information might be helpful for you, too. And I suspect the
> problems might be related.
>
> Arno
>
>> Thanks in advance,
>>
>> Regards, Adam.
>>
>> Kern Sibbald wrote:
>>> On Friday 16 May 2008 03:09:26 Adam Cécile wrote:
>>>> Reported as #1087:
>>>> http://bugs.bacula.org/view.php?id=1087
>>>
>>> OK, thanks. If you haven't already done so, please attach a traceback when it
>>> crashes, as well as your bacula-dir.conf and bacula-sd.conf files.
>>>
>>> Thanks,
>>>
>>> Kern
>>>
>>>> Best regards, Adam.
>>>>
>>>> Kern Sibbald wrote:
>>>>> Hello Adam,
>>>>>
>>>>> If the SD is crashing, then there is definitely a bug and you should open
>>>>> a bug report. It would be preferable if you move up to version 2.2.8 as
>>>>> it simplifies things for me in debugging and finding the problems.
>>>>>
>>>>> Best regards,
>>>>>
>>>>> Kern
>>>>>
>>>>> On Thursday 15 May 2008 09:05:02 Adam Cécile wrote:
>>>>>> Hello,
>>>>>>
>>>>>> I use Max Wait Time to cancel jobs that are left in the queue because no
>>>>>> tapes are available.
>>>>>> This is useful when our customers forget to load a new set of tapes into
>>>>>> the changer.
>>>>>>
>>>>>> The problem is that the SD crashes in this case. Here is a sample of the logs:
>>>>>> 03-May 12:01 pdc1.it-lyon-sd JobId 1580: Please mount Volume "Daily-005"
>>>>>> or label a new one for:
>>>>>>     Job: pdc1.it-lyon.2008-05-02_21.00.26
>>>>>>     Storage: "Dell-LTO2" (/dev/nst0)
>>>>>>     Pool: Friday
>>>>>>     Media type: LTO2
>>>>>>
>>>>>> Then:
>>>>>> 02-May 21:00 pdc1.it-lyon-dir JobId 1582: Start Backup JobId 1582,
>>>>>> Job=intox1.it-lyon.2008-05-02_21.00.28
>>>>>> 02-May 21:01 pdc1.it-lyon-dir JobId 1582: Using Device "Dell-LTO2"
>>>>>> 06-May 11:10 intox1.it-lyon-fd: intox1.it-lyon.2008-05-02_21.00.28 Fatal
>>>>>> error: job.c:1808 Comm error with SD. bad response to Append Data.
>>>>>> ERR=Aucune donnée disponible
>>>>>> 06-May 11:11 pdc1.it-lyon-dir JobId 1582: Error: Bacula pdc1.it-lyon-dir
>>>>>> 2.2.5 (09Oct07): 06-May-2008 11:11:01
>>>>>>
>>>>>> The bacula-sd process sometimes dies; sometimes it keeps running but
>>>>>> no longer works until we restart it.
>>>>>>
>>>>>> Another log example:
>>>>>>
>>>>>> 06-mai 12:23 localhost-sd JobId 235: Please mount Volume "000027L3" or
>>>>>> label a new one for:
>>>>>>     Job: atp-data.2008-05-02_22.00.44
>>>>>>     Storage: "Drive-1" (/dev/nst0)
>>>>>>     Pool: Weekly
>>>>>>     Media type: LTO3
>>>>>> 07-mai 12:23 localhost-sd JobId 235: Please mount Volume "000027L3" or
>>>>>> label a new one for:
>>>>>>     Job: atp-data.2008-05-02_22.00.44
>>>>>>     Storage: "Drive-1" (/dev/nst0)
>>>>>>     Pool: Weekly
>>>>>>     Media type: LTO3
>>>>>> 08-mai 12:23 localhost-sd JobId 235: Fatal error: Max time exceeded
>>>>>> waiting to mount Storage Device "Drive-1" (/dev/nst0) for Job
>>>>>> atp-data.2008-05-02_22.00.44
>>>>>> 08-mai 12:23 localhost-sd JobId 235: Job write elapsed time = 134:15:41,
>>>>>> Transfer rate = 3.350 M bytes/second
>>>>>> 08-mai 12:23 localhost-fd JobId 235: Fatal error: backup.c:892 Network
>>>>>> send error to SD. ERR=Broken pipe
>>>>>> 08-mai 12:23 localhost-dir JobId 235: Error: Bacula localhost-dir 2.2.8
>>>>>> (26Jan08): 08-mai-2008 12:23:41
>>>>>>
>>>>>> This is a serious issue, as Max Wait Time can't be used (it always crashes).
>>>>>>
>>>>>> Could you please tell me whether this is a known issue? If not, a
>>>>>> customer has agreed to "forget to change the tape" so that I can provide
>>>>>> you with some debugging backtraces if needed.
>>>>>>
>>>>>> Thanks in advance,
>>>>>>
>>>>>> Best regards, Adam.