From: Mantis B. T. <no...@bu...> - 2014-06-15 17:50:07
|
The following issue has been SUBMITTED. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: new ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-06-15 18:49 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-05 16:01:30
|
The following issue requires your FEEDBACK. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: feedback ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-05 17:01 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-07 01:19:10
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: new ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-07 02:19 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-07 08:02:59
|
The following issue requires your FEEDBACK. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: feedback ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-07 09:02 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-07 11:56:17
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: new ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-07 12:56 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-07 11:58:46
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: new ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-07 12:58 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-07 14:33:14
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: new ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-07 15:33 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-07 16:56:42
|
The following issue requires your FEEDBACK. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: feedback ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-07 17:56 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-07 17:40:51
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: new ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-07 18:40 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-08 11:50:50
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: new ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-08 12:50 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-08 11:58:58
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: new ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-08 12:58 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-08 11:59:33
|
The following issue has been ASSIGNED. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: kern ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: acknowledged ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-08 12:59 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 2014-07-08 12:59 kern Assigned To => kern 2014-07-08 12:59 kern Status new => acknowledged ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-10 00:31:32
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: kern ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: acknowledged ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-10 01:31 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. ---------------------------------------------------------------------- (0006957) jpschewe (reporter) - 2014-07-10 01:31 http://bugs.bacula.org/view.php?id=2071#c6957 ---------------------------------------------------------------------- I just uploaded a tar file with the bsr file and the disk volume containing the backup. When I run a restore with this I get the files created and the director reports that the restore succeeded, but the file daemon crashed. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 2014-07-08 12:59 kern Assigned To => kern 2014-07-08 12:59 kern Status new => acknowledged 2014-07-10 01:30 jpschewe File Added: bacula-crash.tar.gz 2014-07-10 01:31 jpschewe Note Added: 0006957 ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-12 10:18:02
|
The following issue requires your FEEDBACK. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: kern ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: feedback ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-12 11:17 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. ---------------------------------------------------------------------- (0006957) jpschewe (reporter) - 2014-07-10 01:31 http://bugs.bacula.org/view.php?id=2071#c6957 ---------------------------------------------------------------------- I just uploaded a tar file with the bsr file and the disk volume containing the backup. When I run a restore with this I get the files created and the director reports that the restore succeeded, but the file daemon crashed. ---------------------------------------------------------------------- (0006966) kern (administrator) - 2014-07-12 11:17 http://bugs.bacula.org/view.php?id=2071#c6966 ---------------------------------------------------------------------- I have tried reading the Volume you sent me, and as far as I can tell, it does read it, but the problem is that the only Bacula "file" on the volume is C:/Users/admin/AppData/Local/Application Data/, which is a Windows Junction Point, and Bacula does not usually deal with Junction Points. Consequently, it skips that "file", and there is nothing more on the Volume. So I cannot reproduce any crash or even extract anything other than that Bacula says that it cannot deal with C:/Users/admin/AppData/Local/Application Data/ Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 2014-07-08 12:59 kern Assigned To => kern 2014-07-08 12:59 kern Status new => acknowledged 2014-07-10 01:30 jpschewe File Added: bacula-crash.tar.gz 2014-07-10 01:31 jpschewe Note Added: 0006957 2014-07-12 11:17 kern Note Added: 0006966 2014-07-12 11:17 kern Status acknowledged => feedback ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-12 10:40:54
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: kern ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: feedback ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-12 11:40 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. ---------------------------------------------------------------------- (0006957) jpschewe (reporter) - 2014-07-10 01:31 http://bugs.bacula.org/view.php?id=2071#c6957 ---------------------------------------------------------------------- I just uploaded a tar file with the bsr file and the disk volume containing the backup. When I run a restore with this I get the files created and the director reports that the restore succeeded, but the file daemon crashed. ---------------------------------------------------------------------- (0006966) kern (administrator) - 2014-07-12 11:17 http://bugs.bacula.org/view.php?id=2071#c6966 ---------------------------------------------------------------------- I have tried reading the Volume you sent me, and as far as I can tell, it does read it, but the problem is that the only Bacula "file" on the volume is C:/Users/admin/AppData/Local/Application Data/, which is a Windows Junction Point, and Bacula does not usually deal with Junction Points. Consequently, it skips that "file", and there is nothing more on the Volume. So I cannot reproduce any crash or even extract anything other than that Bacula says that it cannot deal with C:/Users/admin/AppData/Local/Application Data/ ---------------------------------------------------------------------- (0006967) kern (administrator) - 2014-07-12 11:40 http://bugs.bacula.org/view.php?id=2071#c6967 ---------------------------------------------------------------------- I've modified the code so that it should treat the Junction point more like a directory. The bottom line is that a directory will be created. However, in the case of the Volume, there were no files under that directory so nothing more to restore. There may still be a bug the in the code that parses the BackupWrite format to extract only the data, but I will need a file that breaks it. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 2014-07-08 12:59 kern Assigned To => kern 2014-07-08 12:59 kern Status new => acknowledged 2014-07-10 01:30 jpschewe File Added: bacula-crash.tar.gz 2014-07-10 01:31 jpschewe Note Added: 0006957 2014-07-12 11:17 kern Note Added: 0006966 2014-07-12 11:17 kern Status acknowledged => feedback 2014-07-12 11:40 kern Note Added: 0006967 ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-12 12:03:01
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: kern ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: assigned ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-12 13:02 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. ---------------------------------------------------------------------- (0006957) jpschewe (reporter) - 2014-07-10 01:31 http://bugs.bacula.org/view.php?id=2071#c6957 ---------------------------------------------------------------------- I just uploaded a tar file with the bsr file and the disk volume containing the backup. When I run a restore with this I get the files created and the director reports that the restore succeeded, but the file daemon crashed. ---------------------------------------------------------------------- (0006966) kern (administrator) - 2014-07-12 11:17 http://bugs.bacula.org/view.php?id=2071#c6966 ---------------------------------------------------------------------- I have tried reading the Volume you sent me, and as far as I can tell, it does read it, but the problem is that the only Bacula "file" on the volume is C:/Users/admin/AppData/Local/Application Data/, which is a Windows Junction Point, and Bacula does not usually deal with Junction Points. Consequently, it skips that "file", and there is nothing more on the Volume. So I cannot reproduce any crash or even extract anything other than that Bacula says that it cannot deal with C:/Users/admin/AppData/Local/Application Data/ ---------------------------------------------------------------------- (0006967) kern (administrator) - 2014-07-12 11:40 http://bugs.bacula.org/view.php?id=2071#c6967 ---------------------------------------------------------------------- I've modified the code so that it should treat the Junction point more like a directory. The bottom line is that a directory will be created. However, in the case of the Volume, there were no files under that directory so nothing more to restore. There may still be a bug the in the code that parses the BackupWrite format to extract only the data, but I will need a file that breaks it. ---------------------------------------------------------------------- (0006968) jpschewe (reporter) - 2014-07-12 13:02 http://bugs.bacula.org/view.php?id=2071#c6968 ---------------------------------------------------------------------- So when you did the restore your file deamon kept running? I ask because my director said that the restore finished successfully, but my file daemon crashed. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 2014-07-08 12:59 kern Assigned To => kern 2014-07-08 12:59 kern Status new => acknowledged 2014-07-10 01:30 jpschewe File Added: bacula-crash.tar.gz 2014-07-10 01:31 jpschewe Note Added: 0006957 2014-07-12 11:17 kern Note Added: 0006966 2014-07-12 11:17 kern Status acknowledged => feedback 2014-07-12 11:40 kern Note Added: 0006967 2014-07-12 13:02 jpschewe Note Added: 0006968 2014-07-12 13:02 jpschewe Status feedback => assigned ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-12 12:48:40
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: kern ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: assigned ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-12 13:48 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. ---------------------------------------------------------------------- (0006957) jpschewe (reporter) - 2014-07-10 01:31 http://bugs.bacula.org/view.php?id=2071#c6957 ---------------------------------------------------------------------- I just uploaded a tar file with the bsr file and the disk volume containing the backup. When I run a restore with this I get the files created and the director reports that the restore succeeded, but the file daemon crashed. ---------------------------------------------------------------------- (0006966) kern (administrator) - 2014-07-12 11:17 http://bugs.bacula.org/view.php?id=2071#c6966 ---------------------------------------------------------------------- I have tried reading the Volume you sent me, and as far as I can tell, it does read it, but the problem is that the only Bacula "file" on the volume is C:/Users/admin/AppData/Local/Application Data/, which is a Windows Junction Point, and Bacula does not usually deal with Junction Points. Consequently, it skips that "file", and there is nothing more on the Volume. So I cannot reproduce any crash or even extract anything other than that Bacula says that it cannot deal with C:/Users/admin/AppData/Local/Application Data/ ---------------------------------------------------------------------- (0006967) kern (administrator) - 2014-07-12 11:40 http://bugs.bacula.org/view.php?id=2071#c6967 ---------------------------------------------------------------------- I've modified the code so that it should treat the Junction point more like a directory. The bottom line is that a directory will be created. However, in the case of the Volume, there were no files under that directory so nothing more to restore. There may still be a bug the in the code that parses the BackupWrite format to extract only the data, but I will need a file that breaks it. ---------------------------------------------------------------------- (0006968) jpschewe (reporter) - 2014-07-12 13:02 http://bugs.bacula.org/view.php?id=2071#c6968 ---------------------------------------------------------------------- So when you did the restore your file deamon kept running? I ask because my director said that the restore finished successfully, but my file daemon crashed. ---------------------------------------------------------------------- (0006969) kern (administrator) - 2014-07-12 13:48 http://bugs.bacula.org/view.php?id=2071#c6969 ---------------------------------------------------------------------- Yes, the FD continued running. I was never able to make the FD crash, and I just tested it one more time to be 100% sure, because I did most of my testing with bextract. However as I said, the Volume seems to have only one junction point and no files with data on it. I am running on 7.0.4+ = 7.0.5 while you are running on a reasonably old 5.2.6, which could possibly explain the crash vs no crash. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 2014-07-08 12:59 kern Assigned To => kern 2014-07-08 12:59 kern Status new => acknowledged 2014-07-10 01:30 jpschewe File Added: bacula-crash.tar.gz 2014-07-10 01:31 jpschewe Note Added: 0006957 2014-07-12 11:17 kern Note Added: 0006966 2014-07-12 11:17 kern Status acknowledged => feedback 2014-07-12 11:40 kern Note Added: 0006967 2014-07-12 13:02 jpschewe Note Added: 0006968 2014-07-12 13:02 jpschewe Status feedback => assigned 2014-07-12 13:48 kern Note Added: 0006969 ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-12 12:54:37
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: kern ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: assigned ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-12 13:54 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. ---------------------------------------------------------------------- (0006957) jpschewe (reporter) - 2014-07-10 01:31 http://bugs.bacula.org/view.php?id=2071#c6957 ---------------------------------------------------------------------- I just uploaded a tar file with the bsr file and the disk volume containing the backup. When I run a restore with this I get the files created and the director reports that the restore succeeded, but the file daemon crashed. ---------------------------------------------------------------------- (0006966) kern (administrator) - 2014-07-12 11:17 http://bugs.bacula.org/view.php?id=2071#c6966 ---------------------------------------------------------------------- I have tried reading the Volume you sent me, and as far as I can tell, it does read it, but the problem is that the only Bacula "file" on the volume is C:/Users/admin/AppData/Local/Application Data/, which is a Windows Junction Point, and Bacula does not usually deal with Junction Points. Consequently, it skips that "file", and there is nothing more on the Volume. So I cannot reproduce any crash or even extract anything other than that Bacula says that it cannot deal with C:/Users/admin/AppData/Local/Application Data/ ---------------------------------------------------------------------- (0006967) kern (administrator) - 2014-07-12 11:40 http://bugs.bacula.org/view.php?id=2071#c6967 ---------------------------------------------------------------------- I've modified the code so that it should treat the Junction point more like a directory. The bottom line is that a directory will be created. However, in the case of the Volume, there were no files under that directory so nothing more to restore. There may still be a bug the in the code that parses the BackupWrite format to extract only the data, but I will need a file that breaks it. ---------------------------------------------------------------------- (0006968) jpschewe (reporter) - 2014-07-12 13:02 http://bugs.bacula.org/view.php?id=2071#c6968 ---------------------------------------------------------------------- So when you did the restore your file deamon kept running? I ask because my director said that the restore finished successfully, but my file daemon crashed. ---------------------------------------------------------------------- (0006969) kern (administrator) - 2014-07-12 13:48 http://bugs.bacula.org/view.php?id=2071#c6969 ---------------------------------------------------------------------- Yes, the FD continued running. I was never able to make the FD crash, and I just tested it one more time to be 100% sure, because I did most of my testing with bextract. However as I said, the Volume seems to have only one junction point and no files with data on it. I am running on 7.0.4+ = 7.0.5 while you are running on a reasonably old 5.2.6, which could possibly explain the crash vs no crash. ---------------------------------------------------------------------- (0006970) kern (administrator) - 2014-07-12 13:54 http://bugs.bacula.org/view.php?id=2071#c6970 ---------------------------------------------------------------------- If you can build Bacula from source, you might pull the public git repo (I just pushed everything) and build it -- it is version 7.0.4 plus a number of fixes. If you can then reproduce a crash with that version, I will be happy to look at it further. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 2014-07-08 12:59 kern Assigned To => kern 2014-07-08 12:59 kern Status new => acknowledged 2014-07-10 01:30 jpschewe File Added: bacula-crash.tar.gz 2014-07-10 01:31 jpschewe Note Added: 0006957 2014-07-12 11:17 kern Note Added: 0006966 2014-07-12 11:17 kern Status acknowledged => feedback 2014-07-12 11:40 kern Note Added: 0006967 2014-07-12 13:02 jpschewe Note Added: 0006968 2014-07-12 13:02 jpschewe Status feedback => assigned 2014-07-12 13:48 kern Note Added: 0006969 2014-07-12 13:54 kern Note Added: 0006970 ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-12 12:55:52
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: kern ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: assigned ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-12 13:55 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. ---------------------------------------------------------------------- (0006957) jpschewe (reporter) - 2014-07-10 01:31 http://bugs.bacula.org/view.php?id=2071#c6957 ---------------------------------------------------------------------- I just uploaded a tar file with the bsr file and the disk volume containing the backup. When I run a restore with this I get the files created and the director reports that the restore succeeded, but the file daemon crashed. ---------------------------------------------------------------------- (0006966) kern (administrator) - 2014-07-12 11:17 http://bugs.bacula.org/view.php?id=2071#c6966 ---------------------------------------------------------------------- I have tried reading the Volume you sent me, and as far as I can tell, it does read it, but the problem is that the only Bacula "file" on the volume is C:/Users/admin/AppData/Local/Application Data/, which is a Windows Junction Point, and Bacula does not usually deal with Junction Points. Consequently, it skips that "file", and there is nothing more on the Volume. So I cannot reproduce any crash or even extract anything other than that Bacula says that it cannot deal with C:/Users/admin/AppData/Local/Application Data/ ---------------------------------------------------------------------- (0006967) kern (administrator) - 2014-07-12 11:40 http://bugs.bacula.org/view.php?id=2071#c6967 ---------------------------------------------------------------------- I've modified the code so that it should treat the Junction point more like a directory. The bottom line is that a directory will be created. However, in the case of the Volume, there were no files under that directory so nothing more to restore. There may still be a bug the in the code that parses the BackupWrite format to extract only the data, but I will need a file that breaks it. ---------------------------------------------------------------------- (0006968) jpschewe (reporter) - 2014-07-12 13:02 http://bugs.bacula.org/view.php?id=2071#c6968 ---------------------------------------------------------------------- So when you did the restore your file deamon kept running? I ask because my director said that the restore finished successfully, but my file daemon crashed. ---------------------------------------------------------------------- (0006969) kern (administrator) - 2014-07-12 13:48 http://bugs.bacula.org/view.php?id=2071#c6969 ---------------------------------------------------------------------- Yes, the FD continued running. I was never able to make the FD crash, and I just tested it one more time to be 100% sure, because I did most of my testing with bextract. However as I said, the Volume seems to have only one junction point and no files with data on it. I am running on 7.0.4+ = 7.0.5 while you are running on a reasonably old 5.2.6, which could possibly explain the crash vs no crash. ---------------------------------------------------------------------- (0006970) kern (administrator) - 2014-07-12 13:54 http://bugs.bacula.org/view.php?id=2071#c6970 ---------------------------------------------------------------------- If you can build Bacula from source, you might pull the public git repo (I just pushed everything) and build it -- it is version 7.0.4 plus a number of fixes. If you can then reproduce a crash with that version, I will be happy to look at it further. ---------------------------------------------------------------------- (0006971) jpschewe (reporter) - 2014-07-12 13:55 http://bugs.bacula.org/view.php?id=2071#c6971 ---------------------------------------------------------------------- Thanks for checking again. This may in fact be fixed in 7.0.4+. Unfortunately the Ubuntu and Debian package repositories haven't been updated to the 7.0 series yet. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 2014-07-08 12:59 kern Assigned To => kern 2014-07-08 12:59 kern Status new => acknowledged 2014-07-10 01:30 jpschewe File Added: bacula-crash.tar.gz 2014-07-10 01:31 jpschewe Note Added: 0006957 2014-07-12 11:17 kern Note Added: 0006966 2014-07-12 11:17 kern Status acknowledged => feedback 2014-07-12 11:40 kern Note Added: 0006967 2014-07-12 13:02 jpschewe Note Added: 0006968 2014-07-12 13:02 jpschewe Status feedback => assigned 2014-07-12 13:48 kern Note Added: 0006969 2014-07-12 13:54 kern Note Added: 0006970 2014-07-12 13:55 jpschewe Note Added: 0006971 ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-12 12:56:49
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: kern ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: assigned ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-12 13:56 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. ---------------------------------------------------------------------- (0006957) jpschewe (reporter) - 2014-07-10 01:31 http://bugs.bacula.org/view.php?id=2071#c6957 ---------------------------------------------------------------------- I just uploaded a tar file with the bsr file and the disk volume containing the backup. When I run a restore with this I get the files created and the director reports that the restore succeeded, but the file daemon crashed. ---------------------------------------------------------------------- (0006966) kern (administrator) - 2014-07-12 11:17 http://bugs.bacula.org/view.php?id=2071#c6966 ---------------------------------------------------------------------- I have tried reading the Volume you sent me, and as far as I can tell, it does read it, but the problem is that the only Bacula "file" on the volume is C:/Users/admin/AppData/Local/Application Data/, which is a Windows Junction Point, and Bacula does not usually deal with Junction Points. Consequently, it skips that "file", and there is nothing more on the Volume. So I cannot reproduce any crash or even extract anything other than that Bacula says that it cannot deal with C:/Users/admin/AppData/Local/Application Data/ ---------------------------------------------------------------------- (0006967) kern (administrator) - 2014-07-12 11:40 http://bugs.bacula.org/view.php?id=2071#c6967 ---------------------------------------------------------------------- I've modified the code so that it should treat the Junction point more like a directory. The bottom line is that a directory will be created. However, in the case of the Volume, there were no files under that directory so nothing more to restore. There may still be a bug the in the code that parses the BackupWrite format to extract only the data, but I will need a file that breaks it. ---------------------------------------------------------------------- (0006968) jpschewe (reporter) - 2014-07-12 13:02 http://bugs.bacula.org/view.php?id=2071#c6968 ---------------------------------------------------------------------- So when you did the restore your file deamon kept running? I ask because my director said that the restore finished successfully, but my file daemon crashed. ---------------------------------------------------------------------- (0006969) kern (administrator) - 2014-07-12 13:48 http://bugs.bacula.org/view.php?id=2071#c6969 ---------------------------------------------------------------------- Yes, the FD continued running. I was never able to make the FD crash, and I just tested it one more time to be 100% sure, because I did most of my testing with bextract. However as I said, the Volume seems to have only one junction point and no files with data on it. I am running on 7.0.4+ = 7.0.5 while you are running on a reasonably old 5.2.6, which could possibly explain the crash vs no crash. ---------------------------------------------------------------------- (0006970) kern (administrator) - 2014-07-12 13:54 http://bugs.bacula.org/view.php?id=2071#c6970 ---------------------------------------------------------------------- If you can build Bacula from source, you might pull the public git repo (I just pushed everything) and build it -- it is version 7.0.4 plus a number of fixes. If you can then reproduce a crash with that version, I will be happy to look at it further. ---------------------------------------------------------------------- (0006971) jpschewe (reporter) - 2014-07-12 13:55 http://bugs.bacula.org/view.php?id=2071#c6971 ---------------------------------------------------------------------- Thanks for checking again. This may in fact be fixed in 7.0.4+. Unfortunately the Ubuntu and Debian package repositories haven't been updated to the 7.0 series yet. ---------------------------------------------------------------------- (0006972) jpschewe (reporter) - 2014-07-12 13:56 http://bugs.bacula.org/view.php?id=2071#c6972 ---------------------------------------------------------------------- Can I use my 5.2.x clients with the new 7.0.4 director and storage daemon? Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 2014-07-08 12:59 kern Assigned To => kern 2014-07-08 12:59 kern Status new => acknowledged 2014-07-10 01:30 jpschewe File Added: bacula-crash.tar.gz 2014-07-10 01:31 jpschewe Note Added: 0006957 2014-07-12 11:17 kern Note Added: 0006966 2014-07-12 11:17 kern Status acknowledged => feedback 2014-07-12 11:40 kern Note Added: 0006967 2014-07-12 13:02 jpschewe Note Added: 0006968 2014-07-12 13:02 jpschewe Status feedback => assigned 2014-07-12 13:48 kern Note Added: 0006969 2014-07-12 13:54 kern Note Added: 0006970 2014-07-12 13:55 jpschewe Note Added: 0006971 2014-07-12 13:56 jpschewe Note Added: 0006972 ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-12 13:03:41
|
A NOTE has been added to this issue. ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: kern ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: assigned ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-12 14:03 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. ---------------------------------------------------------------------- (0006957) jpschewe (reporter) - 2014-07-10 01:31 http://bugs.bacula.org/view.php?id=2071#c6957 ---------------------------------------------------------------------- I just uploaded a tar file with the bsr file and the disk volume containing the backup. When I run a restore with this I get the files created and the director reports that the restore succeeded, but the file daemon crashed. ---------------------------------------------------------------------- (0006966) kern (administrator) - 2014-07-12 11:17 http://bugs.bacula.org/view.php?id=2071#c6966 ---------------------------------------------------------------------- I have tried reading the Volume you sent me, and as far as I can tell, it does read it, but the problem is that the only Bacula "file" on the volume is C:/Users/admin/AppData/Local/Application Data/, which is a Windows Junction Point, and Bacula does not usually deal with Junction Points. Consequently, it skips that "file", and there is nothing more on the Volume. So I cannot reproduce any crash or even extract anything other than that Bacula says that it cannot deal with C:/Users/admin/AppData/Local/Application Data/ ---------------------------------------------------------------------- (0006967) kern (administrator) - 2014-07-12 11:40 http://bugs.bacula.org/view.php?id=2071#c6967 ---------------------------------------------------------------------- I've modified the code so that it should treat the Junction point more like a directory. The bottom line is that a directory will be created. However, in the case of the Volume, there were no files under that directory so nothing more to restore. There may still be a bug the in the code that parses the BackupWrite format to extract only the data, but I will need a file that breaks it. ---------------------------------------------------------------------- (0006968) jpschewe (reporter) - 2014-07-12 13:02 http://bugs.bacula.org/view.php?id=2071#c6968 ---------------------------------------------------------------------- So when you did the restore your file deamon kept running? I ask because my director said that the restore finished successfully, but my file daemon crashed. ---------------------------------------------------------------------- (0006969) kern (administrator) - 2014-07-12 13:48 http://bugs.bacula.org/view.php?id=2071#c6969 ---------------------------------------------------------------------- Yes, the FD continued running. I was never able to make the FD crash, and I just tested it one more time to be 100% sure, because I did most of my testing with bextract. However as I said, the Volume seems to have only one junction point and no files with data on it. I am running on 7.0.4+ = 7.0.5 while you are running on a reasonably old 5.2.6, which could possibly explain the crash vs no crash. ---------------------------------------------------------------------- (0006970) kern (administrator) - 2014-07-12 13:54 http://bugs.bacula.org/view.php?id=2071#c6970 ---------------------------------------------------------------------- If you can build Bacula from source, you might pull the public git repo (I just pushed everything) and build it -- it is version 7.0.4 plus a number of fixes. If you can then reproduce a crash with that version, I will be happy to look at it further. ---------------------------------------------------------------------- (0006971) jpschewe (reporter) - 2014-07-12 13:55 http://bugs.bacula.org/view.php?id=2071#c6971 ---------------------------------------------------------------------- Thanks for checking again. This may in fact be fixed in 7.0.4+. Unfortunately the Ubuntu and Debian package repositories haven't been updated to the 7.0 series yet. ---------------------------------------------------------------------- (0006972) jpschewe (reporter) - 2014-07-12 13:56 http://bugs.bacula.org/view.php?id=2071#c6972 ---------------------------------------------------------------------- Can I use my 5.2.x clients with the new 7.0.4 director and storage daemon? ---------------------------------------------------------------------- (0006973) kern (administrator) - 2014-07-12 14:03 http://bugs.bacula.org/view.php?id=2071#c6973 ---------------------------------------------------------------------- Yes, you do not need to upgrade your FDs. You must however run any SDs on the same version as the Dir, and on any given machine you must run all the same versions of all components. However, you could for the moment just consider setting up the Dir and SD in a separate set of directories for testing. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 2014-07-08 12:59 kern Assigned To => kern 2014-07-08 12:59 kern Status new => acknowledged 2014-07-10 01:30 jpschewe File Added: bacula-crash.tar.gz 2014-07-10 01:31 jpschewe Note Added: 0006957 2014-07-12 11:17 kern Note Added: 0006966 2014-07-12 11:17 kern Status acknowledged => feedback 2014-07-12 11:40 kern Note Added: 0006967 2014-07-12 13:02 jpschewe Note Added: 0006968 2014-07-12 13:02 jpschewe Status feedback => assigned 2014-07-12 13:48 kern Note Added: 0006969 2014-07-12 13:54 kern Note Added: 0006970 2014-07-12 13:55 jpschewe Note Added: 0006971 2014-07-12 13:56 jpschewe Note Added: 0006972 2014-07-12 14:03 kern Note Added: 0006973 ====================================================================== |
From: Mantis B. T. <no...@bu...> - 2014-07-17 14:33:28
|
The following issue has been CLOSED ====================================================================== http://bugs.bacula.org/view.php?id=2071 ====================================================================== Reported By: jpschewe Assigned To: kern ====================================================================== Project: Bacula Bug Reports Issue ID: 2071 Category: File Daemon Reproducibility: always Severity: crash Priority: normal Status: closed Resolution: fixed Fixed in Version: ====================================================================== Date Submitted: 2014-06-15 18:49 BST Last Modified: 2014-07-17 15:33 BST ====================================================================== Summary: Bacula crashed trying to do a restore Description: I'm running version 5.2.6 of bacula on Linux. This is the latest available for Ubuntu. I backed up a windows laptop and then tried to restore to a Linux desktop. The Linux desktop is also running the storage daemon and the director. I tried the restore again and got the same error. This is the end of the restore log. 15-Jun 10:46 jon-fd JobId 13614: drwxrwxrwx 1 root root 0 2014-06-13 13:49:55 /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/18/ 15-Jun 10:46 jon-fd JobId 13614: -rwxrwxrwx 1 root root 4456 2014-06-13 13:50:21 ???? /mnt/vhs/jen-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/2rmh2bnm.default-1400546053153/Cache/1/19/2173115-Jun 10:53 jon-dir JobId 13614: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 15-Jun 10:53 jon-dir JobId 13614: Fatal error: Socket error on Store end command: ERR=No data available 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: OK Termination: *** Restore Error *** 15-Jun 10:53 jon-dir JobId 13614: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13614 Job: RestoreFiles.2014-06-15_10.41.57_04 Restore Client: jon-fd Start time: 15-Jun-2014 10:41:59 End time: 15-Jun-2014 10:53:25 Files Expected: 27,962 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: OK Termination: *** Restore Error *** Here is the gdb traceback email I received. [New LWP 17020] [New LWP 3705] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f3d4eec0c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 25469064 $3 = 25469128 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. ====================================================================== ---------------------------------------------------------------------- (0006942) kern (administrator) - 2014-07-05 17:01 http://bugs.bacula.org/view.php?id=2071#c6942 ---------------------------------------------------------------------- Clearly Bacula got a fatal error, but I see no evidence of a crash (daemon stops running). The gdb output is not valid. It looks like you are trying to restore Windows data to a Linux machine. I principal this is possible, but from the information you have provided, there isn't much I can do. To get more information, first, proved that Bacula actually crashed by showing that it is running before the failure and not running after the failure. If it crashes we will need a valid gdb output, which if you have configured Bacula correctly will be emailed to you. If it is not a crash then possibly by setting debug level 200 and posting the trace file, we might be able to see something. ---------------------------------------------------------------------- (0006948) jpschewe (reporter) - 2014-07-07 02:19 http://bugs.bacula.org/view.php?id=2071#c6948 ---------------------------------------------------------------------- I'm not sure why the gdb output isn't valid. That's what was emailed to me when bacula-fd crashed. I just uplaoded a file named "typescript". This file is a trace of bacula-fd run under gdb with -f -d 200. I snipped the middle out of the file so that it's small enough for upload. The restore failure from the director still gives the same packet too large error. I saw the error message in gdb and asked it for a stack trace. ---------------------------------------------------------------------- (0006949) kern (administrator) - 2014-07-07 09:02 http://bugs.bacula.org/view.php?id=2071#c6949 ---------------------------------------------------------------------- I've deleted your upload since it did not contain useful data. When you run Bacula under a debugger you must include the -s option on the command line otherwise the debugger will stop on every signal. The -s option causes Bacula to ignore non-fatal signals. Are you running the same version of Bacula for the Dir, SD, and FD? ---------------------------------------------------------------------- (0006950) jpschewe (reporter) - 2014-07-07 12:56 http://bugs.bacula.org/view.php?id=2071#c6950 ---------------------------------------------------------------------- I've attached 2 files. One is running under gdb and one is not. The one running under gdb doesn't get the segfault, but I do see a thread exiting when the restore failed. The one not running under gdb clearly gets a seg fault. When not running under gdb I get this email: [New LWP 15657] [New LWP 15288] [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 0x00007f67c8654c33 in select () at ../sysdeps/unix/syscall-template.S:81 81 ../sysdeps/unix/syscall-template.S: No such file or directory. $1 = 762212202 $2 = 19411032 $3 = 19411096 /etc/bacula/scripts/btraceback.gdb:4: Error in sourced command file: No symbol "catalog_db" in current context. In both cases I get an email from the director that the restore failed with packet size too big. 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 4396 2014-07-01 10:23:05 /mnt/external/jpschewe/test-restore/C:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/1E6B4m01 07-Jul 06:16 jon-fd JobId 13732: -rwxrwxrwx 1 root root 5161 2014-06-30 16:09:44 /mnt/external/jpschewe/test-restore/ÿÿÿúC:/Users/Jennifer Schewe/AppData/Local/Mozilla/Firefox/Profiles/hjfcjzwn.default/Cache/6/08/2110E07-Jul 06:18 jon-dir JobId 13732: Fatal error: bsock.c:503 Packet size too big from "Client: jon-fd:jon:9102. Terminating connection. 07-Jul 06:18 jon-dir JobId 13732: Fatal error: Socket error on Store end command: ERR=No data available 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 1 FD termination status: SD termination status: Running Termination: *** Restore Error *** 07-Jul 06:18 jon-dir JobId 13732: Error: Bacula jon-dir 5.2.6 (21Feb12): Build OS: x86_64-pc-linux-gnu ubuntu 14.04 JobId: 13732 Job: RestoreFiles.2014-07-07_06.15.33_09 Restore Client: jon-fd Start time: 07-Jul-2014 06:15:35 End time: 07-Jul-2014 06:18:41 Files Expected: 36,914 Files Restored: 0 Bytes Restored: 0 Rate: 0.0 KB/s FD Errors: 2 FD termination status: SD termination status: Running Termination: *** Restore Error *** ---------------------------------------------------------------------- (0006951) jpschewe (reporter) - 2014-07-07 12:58 http://bugs.bacula.org/view.php?id=2071#c6951 ---------------------------------------------------------------------- And yes I'm running the same version of bacula for the storage, director and file deamons when I'm restoring. They are all at vesion 5.2.6 and on the same host. The windows client used to create the backups that I'm trying to restore is 5.2.10. ---------------------------------------------------------------------- (0006952) jpschewe (reporter) - 2014-07-07 15:33 http://bugs.bacula.org/view.php?id=2071#c6952 ---------------------------------------------------------------------- I just added gdb-ignore-sigusr.trace.snipped. It seems that the -s option to bacula-sd ignores all signals including SIGSEGV. This means that gdb doesn't get a chance to be notified of the segmentation violation. So I ran bacula-sd without -s in gdb and then told gdb to ignore SIGUSR1 and SIGUSR2. This provided me with a clean trace showing that there is an error in cleaning up memory. I'm not sure how this relates to the server message of packet too large, but hopefully provides you enough information to tell me where I might debug next. ---------------------------------------------------------------------- (0006953) kern (administrator) - 2014-07-07 17:56 http://bugs.bacula.org/view.php?id=2071#c6953 ---------------------------------------------------------------------- OK, nice job of getting gdb to work. As far as I can see, somehow Bacula's memory has become corrupted. This probably also caused the previous error that you saw concerning the packet size. The problem is that the detection of the problem is a long time after the damage is done. It is essentially impossible to track down and fix if I cannot reproduce it, because tracking it down involves enabling memory checking code in lots of places then once it triggers (currently only at the end of the job), you add more and more until you isolate the problem. This can take several days. So, to resolve it, I will need to have the bsr file that is generated for the restore, and the volume which contains the data. Since the Volume is probably very large (judging from the size of the restore) this can be a problem. Of course, you also may not appreciate sending me your Volume data. Another option would be to try to reduce it to the minimal restore size (perhaps one file) that causes the problem. A final option is to see if the restore works without restoring the Windows data or to restore it on a Windows machine. Restoring Windows data on a Linux machine uses code that picks through Windows meta data to find the real data and restore it. This code is very likely to be the source of the bug. But tracking down the exact place can really be problematic. ---------------------------------------------------------------------- (0006954) jpschewe (reporter) - 2014-07-07 18:40 http://bugs.bacula.org/view.php?id=2071#c6954 ---------------------------------------------------------------------- I can restore Linux data to a Linux computer. I've also restored a smaller portion of this same data to a LInux computer. So I suspect it's some particular file that's actually the issue. I'll see if I can narrow this down and post more details. ---------------------------------------------------------------------- (0006955) jpschewe (reporter) - 2014-07-08 12:50 http://bugs.bacula.org/view.php?id=2071#c6955 ---------------------------------------------------------------------- OK, I've managed to make it crash when only restoring 1 file in 1 directory and I'm willing to share that data. So I would like to make sure I have a reasonable test plan here before I put this together. Here's what I'm thinking, please correct any issues you see. Create a disk file pool called "test" Create a job that only backs up this bad directory on Windows using the pool "test" Run the job Test restoring to my Linux client and verify that it fails Gather up the bsr and the disk volume and transfer to you Does this sound reasonable? ---------------------------------------------------------------------- (0006956) kern (administrator) - 2014-07-08 12:58 http://bugs.bacula.org/view.php?id=2071#c6956 ---------------------------------------------------------------------- Yes, that sounds like a good way to be able to resolve the problem, and I am interested in fixing it. Thanks for taking all the trouble to simplify the failure case. Please do send me the bsr and the disk volume. If it is small enough, you may post it to this bug report. Otherwise, please feel free to send it to my personal email address: kern at sibbald dot com. That address should allow receiving a rather large email. Another option is that you can post it on some site where I can download it. ---------------------------------------------------------------------- (0006957) jpschewe (reporter) - 2014-07-10 01:31 http://bugs.bacula.org/view.php?id=2071#c6957 ---------------------------------------------------------------------- I just uploaded a tar file with the bsr file and the disk volume containing the backup. When I run a restore with this I get the files created and the director reports that the restore succeeded, but the file daemon crashed. ---------------------------------------------------------------------- (0006966) kern (administrator) - 2014-07-12 11:17 http://bugs.bacula.org/view.php?id=2071#c6966 ---------------------------------------------------------------------- I have tried reading the Volume you sent me, and as far as I can tell, it does read it, but the problem is that the only Bacula "file" on the volume is C:/Users/admin/AppData/Local/Application Data/, which is a Windows Junction Point, and Bacula does not usually deal with Junction Points. Consequently, it skips that "file", and there is nothing more on the Volume. So I cannot reproduce any crash or even extract anything other than that Bacula says that it cannot deal with C:/Users/admin/AppData/Local/Application Data/ ---------------------------------------------------------------------- (0006967) kern (administrator) - 2014-07-12 11:40 http://bugs.bacula.org/view.php?id=2071#c6967 ---------------------------------------------------------------------- I've modified the code so that it should treat the Junction point more like a directory. The bottom line is that a directory will be created. However, in the case of the Volume, there were no files under that directory so nothing more to restore. There may still be a bug the in the code that parses the BackupWrite format to extract only the data, but I will need a file that breaks it. ---------------------------------------------------------------------- (0006968) jpschewe (reporter) - 2014-07-12 13:02 http://bugs.bacula.org/view.php?id=2071#c6968 ---------------------------------------------------------------------- So when you did the restore your file deamon kept running? I ask because my director said that the restore finished successfully, but my file daemon crashed. ---------------------------------------------------------------------- (0006969) kern (administrator) - 2014-07-12 13:48 http://bugs.bacula.org/view.php?id=2071#c6969 ---------------------------------------------------------------------- Yes, the FD continued running. I was never able to make the FD crash, and I just tested it one more time to be 100% sure, because I did most of my testing with bextract. However as I said, the Volume seems to have only one junction point and no files with data on it. I am running on 7.0.4+ = 7.0.5 while you are running on a reasonably old 5.2.6, which could possibly explain the crash vs no crash. ---------------------------------------------------------------------- (0006970) kern (administrator) - 2014-07-12 13:54 http://bugs.bacula.org/view.php?id=2071#c6970 ---------------------------------------------------------------------- If you can build Bacula from source, you might pull the public git repo (I just pushed everything) and build it -- it is version 7.0.4 plus a number of fixes. If you can then reproduce a crash with that version, I will be happy to look at it further. ---------------------------------------------------------------------- (0006971) jpschewe (reporter) - 2014-07-12 13:55 http://bugs.bacula.org/view.php?id=2071#c6971 ---------------------------------------------------------------------- Thanks for checking again. This may in fact be fixed in 7.0.4+. Unfortunately the Ubuntu and Debian package repositories haven't been updated to the 7.0 series yet. ---------------------------------------------------------------------- (0006972) jpschewe (reporter) - 2014-07-12 13:56 http://bugs.bacula.org/view.php?id=2071#c6972 ---------------------------------------------------------------------- Can I use my 5.2.x clients with the new 7.0.4 director and storage daemon? ---------------------------------------------------------------------- (0006973) kern (administrator) - 2014-07-12 14:03 http://bugs.bacula.org/view.php?id=2071#c6973 ---------------------------------------------------------------------- Yes, you do not need to upgrade your FDs. You must however run any SDs on the same version as the Dir, and on any given machine you must run all the same versions of all components. However, you could for the moment just consider setting up the Dir and SD in a separate set of directories for testing. ---------------------------------------------------------------------- (0006980) kern (administrator) - 2014-07-17 15:33 http://bugs.bacula.org/view.php?id=2071#c6980 ---------------------------------------------------------------------- I am closing this as fixed because the new code to be released in 7.0.5 handles junction point restores better. However, there may be some problem with restoring some files, so if you have a test case with real file data in it, please post it. Issue History Date Modified Username Field Change ====================================================================== 2014-06-15 18:49 jpschewe New Issue 2014-07-05 17:01 kern Note Added: 0006942 2014-07-05 17:01 kern Status new => feedback 2014-07-07 02:16 jpschewe File Added: typescript 2014-07-07 02:19 jpschewe Note Added: 0006948 2014-07-07 02:19 jpschewe Status feedback => new 2014-07-07 09:00 kern File Deleted: typescript 2014-07-07 09:02 kern Note Added: 0006949 2014-07-07 09:02 kern Status new => feedback 2014-07-07 12:52 jpschewe File Added: no-gdb.trace.snipped 2014-07-07 12:52 jpschewe File Added: gdb.trace.snipped 2014-07-07 12:56 jpschewe Note Added: 0006950 2014-07-07 12:56 jpschewe Status feedback => new 2014-07-07 12:58 jpschewe Note Added: 0006951 2014-07-07 15:30 jpschewe File Added: gdb-ignore-sigusr.trace.snipped 2014-07-07 15:33 jpschewe Note Added: 0006952 2014-07-07 17:56 kern Note Added: 0006953 2014-07-07 17:56 kern Status new => feedback 2014-07-07 18:40 jpschewe Note Added: 0006954 2014-07-07 18:40 jpschewe Status feedback => new 2014-07-08 12:50 jpschewe Note Added: 0006955 2014-07-08 12:58 kern Note Added: 0006956 2014-07-08 12:59 kern Assigned To => kern 2014-07-08 12:59 kern Status new => acknowledged 2014-07-10 01:30 jpschewe File Added: bacula-crash.tar.gz 2014-07-10 01:31 jpschewe Note Added: 0006957 2014-07-12 11:17 kern Note Added: 0006966 2014-07-12 11:17 kern Status acknowledged => feedback 2014-07-12 11:40 kern Note Added: 0006967 2014-07-12 13:02 jpschewe Note Added: 0006968 2014-07-12 13:02 jpschewe Status feedback => assigned 2014-07-12 13:48 kern Note Added: 0006969 2014-07-12 13:54 kern Note Added: 0006970 2014-07-12 13:55 jpschewe Note Added: 0006971 2014-07-12 13:56 jpschewe Note Added: 0006972 2014-07-12 14:03 kern Note Added: 0006973 2014-07-17 15:33 kern Note Added: 0006980 2014-07-17 15:33 kern Status assigned => closed 2014-07-17 15:33 kern Resolution open => fixed ====================================================================== |