From: Peter S. <pet...@te...> - 2006-05-31 04:00:24
On Tue, 2006-05-30 at 10:33 +0200, Kern Sibbald wrote:
> > On Sat, 2006-05-27 at 09:24 +0200, Kern Sibbald wrote:
> >> > On Thu, 2006-05-25 at 18:46 +0200, Kern Sibbald wrote:
> >> >> I don't think this has anything to do with the seg fault we see on
> >> >> 64 bit machines, and with some compilers on 32 bit systems as that
> >> >> segfault happens at Bacula startup.
> >> >>
> >> >> I suspect that you have a problem with a network switch/hub or OS
> >> >> that is timing you out.
> >> >>
> >> >> Try using the Heartbeat Interval directive in both the FD and the SD.
> >> > I turned on debugging and tested across the network and suddenly it
> >> > worked. The changes I done was that I was running debug and had
> >> > dismounted some filesystem so everything fit on one tape.
> >> > To figure out what caused it I remounted the filesystems and did
> >> > another debug run and this time it failed.
> >> >
> >> > One more round, this time on the same system (excluding everything
> >> > network related) and it still failed.
> >> > It seems like it fails only when spanning tapes but I think I have had
> >> > other x86 system doing that without any problems and I would expect it
> >> > to be more widespread if that was the case.
> >> >
> >> > Complete logs are at http://www.techwiz.ca/~peters/bacula/DebugLogs/
> >> > info about the build is at http://www.techwiz.ca/~peters/bacula/
> >> >
> >> > What else can I try ?
> >>
> >> I already answered that.
> >>
> > Didn't think the network would be a possible problem (why does it only
> > die when it's done, not in the middle?) but added heartbeat as a test.
> >
> > Heartbeat Interval = 600
> > SDConnectTimeout = 600
> >
> > I ran it twice an both failed in the end just like before
>
> Did you add the Heartbeat Interval to both the SD and FD as I suggested?

Only to the FD. I have now added it to the SD as well, but I don't see
what it will do unless the doc is not telling the whole truth (it states
that the heartbeat is sent while waiting for a tape change; the failure
comes after the last tape is finished, never during a tape change) and
the SUSE internal loopback IP stack drops connections in the middle.

>
> Have you run the btape "test" and "fill" commands with your current setup?
>

I did before, and ran them again now just to be sure: test, fill
(multitape) and autochanger all worked.

The factors I see in the problem are:

 - it fails when one server backs up to more than one tape
 - it didn't fail when the same server backed up to one tape (over
   network cables)
 - it fails even when everything is done on the same local server
 - it seems to always fail when it's done, maybe around the "despooling
   attributes" part (I'm always spooling attributes)
 - it fails both when spooling data (which I do over the network) and
   when not spooling data (the case when backing up the backup server)

To me it doesn't look like a network issue, but what kind of issue it is
I can't guess.

/ps

> >
> >
> > .
> > .
> > .
> > 28-May 15:15 sisko-sd: 3302 Autochanger "loaded drive 0", result is Slot 12.
> > 28-May 15:16 sisko-sd: Recycled volume "FUL082" on device "DL0" (/dev/scsi/nsth6-2c0i1l0), all previous data lost.
> > 28-May 15:16 sisko-sd: New volume "FUL082" mounted on device "DL0" (/dev/scsi/nsth6-2c0i1l0) at 28-May-2006 15:16.
> > sisko-fd: Filesystem change prohibited. Will not descend into /var/ftp
> > sisko-fd: Filesystem change prohibited. Will not descend into /var/lib/mysql
> > sisko-fd: Filesystem change prohibited. Will not descend into /var/www
> > sisko-fd: Filesystem change prohibited. Will not descend into /var/lib/mysql/bacula
> > 28-May 17:04 sisko-fd: Backup_sisko.2006-05-28_10.58.18 Fatal error: backup.c:500 Network send error to SD. ERR=Broken pipe
> > 28-May 17:04 sisko-dir: Backup_sisko.2006-05-28_10.58.18 Error: Bacula 1.38.9 (02May06): 28-May-2006 17:04:35
> >   JobId:                  8549
> >   Job:                    Backup_sisko.2006-05-28_10.58.18
> >   Backup Level:           Full
> >   Client:                 "sisko-fd" x86_64-unknown-linux-gnu,suse,10.0
> >   FileSet:                "AllLocalSisko" 2005-11-11 10:28:41
> >   Pool:                   "Full"
> >   Storage:                "ADIC"
> >   Scheduled time:         28-May-2006 10:58:18
> >   Start time:             28-May-2006 10:58:21
> >   End time:               28-May-2006 17:04:35
> >   Elapsed time:           6 hours 6 mins 14 secs
> >   Priority:               11
> >   FD Files Written:       647,246
> >   SD Files Written:       0
> >   FD Bytes Written:       89,351,545,585 (89.35 GB)
> >   SD Bytes Written:       0 (0 B)
> >   Rate:                   4066.2 KB/s
> >   Software Compression:   None
> >   Volume name(s):         FUL072|FUL075|FUL082
> >   Volume Session Id:      15
> >   Volume Session Time:    1148775652
> >   Last Volume Bytes:      17,998,782,780 (17.99 GB)
> >   Non-fatal FD errors:    1
> >   SD Errors:              0
> >   FD termination status:  Error
> >   SD termination status:  Error
> >   Termination:            *** Backup Error ***
> >
> > .
> > .
> > .
> > 29-May 08:24 sisko-sd: Recycled volume "FUL079" on device "DL0" (/dev/scsi/nsth6-2c0i1l0), all previous data lost.
> > 29-May 08:24 sisko-sd: New volume "FUL079" mounted on device "DL0" (/dev/scsi/nsth6-2c0i1l0) at 29-May-2006 08:24.
> > sisko-fd: Filesystem change prohibited. Will not descend into /var/ftp
> > sisko-fd: Filesystem change prohibited. Will not descend into /var/lib/mysql
> > sisko-fd: Filesystem change prohibited. Will not descend into /var/www
> > sisko-fd: Filesystem change prohibited. Will not descend into /var/lib/mysql/bacula
> > 29-May 09:29 sisko-fd: Backup_sisko.2006-05-29_03.05.12 Fatal error: backup.c:500 Network send error to SD. ERR=Broken pipe
> > 29-May 09:29 sisko-dir: Backup_sisko.2006-05-29_03.05.12 Error: Bacula 1.38.9 (02May06): 29-May-2006 09:29:37
> >   JobId:                  8562
> >   Job:                    Backup_sisko.2006-05-29_03.05.12
> >   Backup Level:           Full (upgraded from Incremental)
> >   Client:                 "sisko-fd" x86_64-unknown-linux-gnu,suse,10.0
> >   FileSet:                "AllLocalSisko" 2005-11-11 10:28:41
> >   Pool:                   "Full"
> >   Storage:                "ADIC"
> >   Scheduled time:         29-May-2006 03:05:11
> >   Start time:             29-May-2006 04:03:00
> >   End time:               29-May-2006 09:29:37
> >   Elapsed time:           5 hours 26 mins 37 secs
> >   Priority:               11
> >   FD Files Written:       646,975
> >   SD Files Written:       0
> >   FD Bytes Written:       89,379,981,675 (89.37 GB)
> >   SD Bytes Written:       0 (0 B)
> >   Rate:                   4560.9 KB/s
> >   Software Compression:   None
> >   Volume name(s):         FUL076|FUL077|FUL079
> >   Volume Session Id:      28
> >   Volume Session Time:    1148775652
> >   Last Volume Bytes:      17,998,782,792 (17.99 GB)
> >   Non-fatal FD errors:    0
> >   SD Errors:              0
> >   FD termination status:  Error
> >   SD termination status:  Error
> >   Termination:            *** Backup Error ***
> >
> > <SNIP>
> >
> >
> > _______________________________________________
> > Bacula-devel mailing list
> > Bac...@li...
> > https://lists.sourceforge.net/lists/listinfo/bacula-devel
> >
>
>
> Best regards, Kern