From: David N. <dn...@ne...> - 2013-11-07 02:05:26
On 11/5/13 5:53 PM, Dan Langille wrote:
> You are on 9.2-release.
>
> Have you run freebsd-update to get the latest security patches?

Yes

>
> Did you see the post by Dean E. Weimer today?

The most recent post from Dean to this list (at least that I have) is
from 4 November at 2030 UTC, saying essentially that a complete rebuild
of the OS solved his problem. I'm hoping not to have to boil the ocean...

>
> Second: read below.

With your help (thanks), I got the debug version built and running. Two
things:

1. Just like before, the binary in /usr/local/sbin/bacula-fd runs fine
when launched on its own. By "runs" I mean the director successfully
completes a backup job.

2. Just like before, the binary in /usr/local/sbin/bacula-fd crashed
when called from the startup script in /usr/local/etc/rc.d/bacula-fd. By
"crashed" I mean the client machine's fd daemon dies during a backup job.

I've pasted below the crash output to STDERR.

Thanks in advance for more troubleshooting clues.

dn

root@o:/usr/ports/sysutils/bacula-client # /usr/local/etc/rc.d/bacula-fd start
Starting bacula_fd.
root@o:/usr/ports/sysutils/bacula-client # Bacula interrupted by signal 0: UNKNOWN SIGNAL
Kaboom! bacula-fd, o-fd got signal 0 - UNKNOWN SIGNAL. Attempting traceback.
Kaboom! exepath=/usr/local/sbin/
Calling: /usr/local/sbin/btraceback /usr/local/sbin/bacula-fd 21541 /var/db/bacula
execv: /usr/local/sbin/btraceback failed: ERR=No such file or directory
It looks like the traceback worked ...
Dumping: /var/db/bacula/o-fd.21541.bactrace
o-fd: smartall.c:404 Orphaned buffer: o-fd 528 bytes at 28804618 from bnet.c:774
o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2885c2d8 from jcr.c:358
o-fd: smartall.c:404 Orphaned buffer: o-fd 148 bytes at 28832a98 from bnet.c:767
o-fd: smartall.c:404 Orphaned buffer: o-fd 4112 bytes at 28a69018 from bnet.c:773
o-fd: smartall.c:404 Orphaned buffer: o-fd 7 bytes at 2880dcd8 from bnet.c:775
o-fd: smartall.c:404 Orphaned buffer: o-fd 15 bytes at 28831508 from bnet.c:776
o-fd: smartall.c:404 Orphaned buffer: o-fd 8 bytes at 28831538 from workq.c:162
o-fd: smartall.c:404 Orphaned buffer: o-fd 16 bytes at 28831568 from jcr.c:347
o-fd: smartall.c:404 Orphaned buffer: o-fd 528 bytes at 29008218 from jcr.c:360
o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905c158 from jcr.c:362
o-fd: smartall.c:404 Orphaned buffer: o-fd 536 bytes at 29008518 from find.c:63
o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905c298 from find.c:66
o-fd: smartall.c:404 Orphaned buffer: o-fd 24 bytes at 29030198 from job.c:248
o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905c3d8 from job.c:249
o-fd: smartall.c:404 Orphaned buffer: o-fd 21 bytes at 29066058 from job.c:251
o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069038 from tls.c:422
o-fd: smartall.c:404 Orphaned buffer: o-fd 40 bytes at 29067078 from job.c:1736
o-fd: smartall.c:404 Orphaned buffer: o-fd 56 bytes at 2906a1d8 from job.c:803
o-fd: smartall.c:404 Orphaned buffer: o-fd 64 bytes at 2906a238 from job.c:933
o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069098 from alist.c:51
o-fd: smartall.c:404 Orphaned buffer: o-fd 352 bytes at 29640318 from job.c:968
o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069df8 from alist.c:51
o-fd: smartall.c:404 Orphaned buffer: o-fd 23 bytes at 29066088 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 17 bytes at 290660e8 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 30 bytes at 290301d8 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 23 bytes at 29066118 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 13 bytes at 29066148 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 18 bytes at 29066178 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 14 bytes at 290661a8 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 36 bytes at 29030258 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 64 bytes at 2906a298 from job.c:916
o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069db8 from alist.c:51
o-fd: smartall.c:404 Orphaned buffer: o-fd 14 bytes at 290661d8 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 13 bytes at 29066208 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 18 bytes at 29066238 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 15 bytes at 29066268 from dlist.c:356
o-fd: smartall.c:404 Orphaned buffer: o-fd 148 bytes at 29004318 from bsock.c:64
o-fd: smartall.c:404 Orphaned buffer: o-fd 4112 bytes at 2978d018 from bsock.c:73
o-fd: smartall.c:404 Orphaned buffer: o-fd 528 bytes at 29009d18 from bsock.c:74
o-fd: smartall.c:404 Orphaned buffer: o-fd 15 bytes at 290662c8 from bsock.c:159
o-fd: smartall.c:404 Orphaned buffer: o-fd 24 bytes at 29030398 from bsock.c:160
o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069d38 from tls.c:422
o-fd: smartall.c:404 Orphaned buffer: o-fd 77 bytes at 2906d1e8 from job.c:572
o-fd: smartall.c:404 Orphaned buffer: o-fd 32 bytes at 29030318 from runscript.c:51
o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905d418 from runscript.c:203
o-fd: smartall.c:404 Orphaned buffer: o-fd 24 bytes at 290303d8 from bpipe.c:76

>
> On Nov 4, 2013, at 5:44 PM, David Newman <dn...@ne...> wrote:
>
>> On 10/29/13 12:42 PM, Dan Langille wrote:
>>> On 2013-10-27 19:33, David Newman wrote:
>>>> On 10/27/13 11:31 AM, Dan Langille wrote:
>>>>
>>>> On Oct 22, 2013, at 3:00 PM, David Newman wrote:
>>>>
>>>> On 10/19/13 11:40 PM, Kern Sibbald wrote:
>>>> Hello,
>>>>
>>>> From what I can see -- first "signal 0", and second this
>>>> traceback, this looks a lot like a FreeBSD pthreads bug.
>>>>
>>>> First because there is no such thing, at least in userland,
>>>> as a signal number 0, which I saw in an earlier
>>>> email. Second, as the traceback below
>>>> shows, Bacula is waiting on a pthread_cond_timedwait() and
>>>> while in the pthread_cond_timedwait, which is a "system"
>>>> subroutine, it emits a pthread_cond_signal(), probably no
>>>> problem, followed by a pthread_kill(). That seems odd to
>>>> me, but perhaps it is how FreeBSD does it, but the net
>>>> result is that it is killing Bacula.
>>>>
>>>> Obviously, this could be a Bacula bug, but it is not occurring
>>>> elsewhere, and it looks very suspicious to me.
>>>>
>>>> You can get more information by compiling with
>>>> #define DEVELOPER 1
>>>> in <bacula>/src/version.h and ensuring that the -g
>>>> option is on the compile and that the binaries are not
>>>> stripped (default for Bacula Makefiles, but not for the
>>>> FreeBSD ports system).
>>>>
>>>> Then if you get another traceback, it may be clearer what
>>>> is going on. Since this is relatively serious, I would recommend
>>>> running Bacula under the debugger directly, see the manual on
>>>> the details of how, then when the debugger gets control after
>>>> the signal, manually do the "thread apply all bt" command.
>>>>
>>>> FreeBSD gurus, a little help?
>>>>
>>>> That's not me.
>>>>
>>>> I don't see version.h under the bacula-client port directory.
>>>>
>>>> try this:
>>>>
>>>> make clean
>>>> make patch
>>>> find . -name version.h
>>>> ./bacula-5.2.12/src/version.h
>>>>
>>>> OK, thanks. That works.
>>>>
>>>> Kern's email gave three steps. Sorry for the baby questions, but I don't
>>>> know how to do steps 2 or 3, either.
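
The `find` step quoted above can be wrapped in a small check. This is only a sketch under assumptions: `report_developer` is a hypothetical helper name (it is not part of Bacula or the ports tree), and the directory passed to it is whatever `make patch` extracted (the thread shows `./bacula-5.2.12/src/version.h`). It reports whether a line starting with `#define DEVELOPER 1` is present; it does not edit the header.

```shell
#!/bin/sh
# Hypothetical helper (not part of Bacula or the ports tree): locate
# version.h below a directory and report whether DEVELOPER is defined.
report_developer() {
    root="$1"
    # Take the first version.h found, as in "find . -name version.h".
    header=$(find "$root" -type f -name version.h | head -n 1)
    if [ -z "$header" ]; then
        echo "no version.h under $root"
        return 1
    fi
    # Match a line that starts with the define; a one-line
    # "/* #define DEVELOPER 1 */" comment does not match.
    if grep -Eq '^[[:space:]]*#[[:space:]]*define[[:space:]]+DEVELOPER[[:space:]]+1' "$header"; then
        echo "DEVELOPER enabled in $header"
    else
        echo "DEVELOPER not enabled in $header"
    fi
}
```

Run from the port directory after `make patch`, the call would be `report_developer .`; a non-zero exit status means no header was found.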
>>>>
>>>> On 10/19/13 11:40 PM, Kern Sibbald wrote:
>>>>
>>>> You can get more information by compiling with
>>>> #define DEVELOPER 1
>>>> in <bacula>/src/version.h
>>>>
>>>> That's step 1, which you've helped me find.
>>>>
>>>> and ensuring that the -g
>>>> option is on the compile
>>>>
>>>> That's step 2.
>>>>
>>>> I don't see a place for that option in the Makefile.
>>>
>>> I think that goes on:
>>>
>>> CPPFLAGS+=
>>>
>>> to become:
>>>
>>> -I/usr/include/readline -I${LOCALBASE}/include -g
>>>
>>> I think. I have not tested that.
>>
>> Which file would this go into?
>>
>> After 'make patch', running 'grep -R LOCALBASE *' from the root of the
>> port returns nothing.
>
> Make that change in /usr/ports/sysutils/bacula-server/Makefile
>
> Yes, bacula-server, not a typo. bacula-client is a slave port of bacula-server.
>
>>
>>>
>>>>
>>>> and that the binaries are not
>>>> stripped (default for Bacula Makefiles, but not for the
>>>> FreeBSD ports system).
>>>
>>> Looking in /usr/ports/Mk/bsd.port.mk, I think you want WITH_DEBUG which
>>> I think you can add to the OPTIONS_DEFINE line.
>>
>> What's the procedure here?
>>
>> Is it (1) to uncomment WITH_DEBUG in /usr/ports/Mk/bsd.port.mk; and
>>
>> (2) to change the Makefile to OPTIONS_DEFINE= NLS OPENSSL PYTHON WITH_DEBUG
>
> Make those changes to OPTIONS_DEFINE /usr/ports/sysutils/bacula-server/Makefile as well.
>
> I suggest deleting all bacula packages on this client. Then make clean, and make install in the bacula-client dir.
>
>>
>> ??
>>
>> thanks
>>
>> dn
>>
>>>
>>>>
>>>> That's step 3. Sorry, don't know how to do that either.
>>>>
>>>> Also, I do have bacula-fd running fine on other FreeBSD 9.2 systems.
>>>> The only delta AFAIK is that this is an i386 system and the others
>>>> are amd64.
>>>>
>>>> To review:
>>>>
>>>> 1. Backup jobs complete when manually starting bacula-fd.
>>>>
>>>> What command are you entering?
>>>>
>>>> /usr/local/sbin/bacula-fd
>>>>
>>>> 2. Backup jobs do not complete when launching bacula-fd via the startup
>>>> script in /usr/local/etc/rc.d/bacula-fd.
>>>>
>>>> For example: usr/local/etc/rc.d/bacula-fd start ?
>>>>
>>>> Yes:
>>>>
>>>> /usr/local/etc/rc.d/bacula-fd start # note leading stroke
>>>>
>>>> Thanks
>>>>
>>>> dn
>>>>
>>>> Thanks in advance for further debugging clues.
>>>>
>>>> dn
>>>>
>>>> If any of you are FreeBSD system gurus you might compare the
>>>> last known working version of the OS with 9.2, particularly the
>>>> pthreads routines. Perhaps they are using a signal 0 internally,
>>>> and somehow that leaked back to Bacula.
>>>>
>>>> Best regards,
>>>> Kern
>>>>
>>>> On 10/18/2013 01:29 AM, David Newman wrote:
>>>> On 10/17/13 5:33 AM, Martin Simmons wrote:
>>>> On Wed, 16 Oct 2013 12:13:26 -0700, David Newman said:
>>>> On 10/14/13 2:44 AM, Martin Simmons wrote:
>>>> On Sun, 13 Oct 2013 18:25:07 -0700, David Newman said:
>>>> On 10/9/13 4:41 PM, David Newman wrote:
>>>> FreeBSD 9.2-RELEASE, bacula-client-5.2.12_3 installed from ports
>>>>
>>>> Ever since upgrading this host to FreeBSD 9.2, bacula-fd crashes as soon
>>>> as bacula-dir starts a backup job. The entry in /var/log/messages is:
>>>>
>>>> Oct 9 16:25:50 o bacula-fd: Bacula interrupted by signal 0: UNKNOWN SIGNAL
>>>>
>>>> Backups worked fine on this host running FreeBSD 9.1 and other hosts
>>>> upgraded to FreeBSD 9.2 run backups OK.
>>>>
>>>> I've done the uninstall/reinstall thing with the bacula-client port, but
>>>> that made no difference.
>>>>
>>>> Thanks in advance for troubleshooting clues.
>>>>
>>>> dn
>>>> Is there a Wireshark decode for Bacula?
>>>>
>>>> I'm still stuck on this problem, and need more info on what's causing
>>>> that UNKNOWN SIGNAL error. Wireshark 1.8.6 just shows strings of bytes
>>>> for the Bacula stuff.
>>>>
>>>> Thanks.
>>>>
>>>> dn
>>>> A wireshark decode won't help much here because problems like this
>>>> must be in the fd itself.
>>>>
>>>> Try attaching gdb to the bacula-fd process and see if it catches the
>>>> mysterious signal (see
>>>> http://www.bacula.org/5.2.x-manuals/en/problems/problems/What_Do_When_Bacula.html#SECTION00640000000000000000).
>>>>
>>>> No luck with this. Per that URL, I've put the btraceback.gdb file in the
>>>> same directory as the bacula-fd executable on the client (in this case,
>>>> /usr/local/sbin) and made the .gdb file executable.
>>>>
>>>> At run time it produces this error:
>>>>
>>>> /usr/local/sbin/btraceback.gdb:1: Error in sourced command file:
>>>> No symbol table is loaded. Use the "file" command.
>>>>
>>>> That's problem 1. Problem 2 is that the syntax given for capturing
>>>> STDERR and STDOUT -- 2>\&1 -- doesn't work on either csh (root's default
>>>> on FreeBSD) or bash.
>>>>
>>>> Any ideas on remedying either issue?
>>>> It looks like you missed the part after the # in the URL -- you don't
>>>> need the btraceback.gdb file.
>>>>
>>>> The section I meant is called "Manually Running Bacula Under The
>>>> Debugger" on that page (you'll have to adapt it for the bacula-fd).
>>>> Sorry for missing that.
>>>>
>>>> The backup runs fine under the debugger, including the backup job
>>>> beforehand, but not with the FreeBSD startup script in
>>>> /usr/local/etc/rc.d.
>>>>
>>>> I've pasted below the debugger output and the startup script.
>>>>
>>>> Thanks in advance for further troubleshooting clues.
>>>> >>>> dn >>>> >>>> >>>> ========== >>>> >>>> Successful run, via /usr/local/sbin/bacula-fd run via gdb: >>>> >>>> (gdb) thread apply all bt >>>> Thread 5 (Thread 28c08b00 (LWP 100213/bacula-fd)): >>>> #0 0x282302b3 in pthread_kill () from /lib/libthr.so.3 >>>> #1 0x2822f9b2 in pthread_kill () from /lib/libthr.so.3 >>>> #2 0x282328f9 in pthread_cond_signal () from /lib/libthr.so.3 >>>> #3 0x281f5d20 in bthread_cond_timedwait_p () from >>>> /usr/local/lib/libbac.so.5 >>>> #4 0x281ef9b0 in watchdog_thread () from /usr/local/lib/libbac.so.5 >>>> #5 0x281f7167 in lmgr_thread_launcher () from >>>> /usr/local/lib/libbac.so.5 >>>> #6 0x28227f3a in pthread_getprio () from /lib/libthr.so.3 >>>> #7 0x00000000 in ?? () >>>> >>>> Thread 3 (Thread 28805e00 (LWP 100211/bacula-fd)): >>>> #0 0x28624323 in nanosleep () from /lib/libc.so.7 >>>> #1 0x2822ad8b in nanosleep () from /lib/libthr.so.3 >>>> #2 0x281c1a90 in bmicrosleep () from /usr/local/lib/libbac.so.5 >>>> #3 0x281f7349 in check_deadlock () from /usr/local/lib/libbac.so.5 >>>> #4 0x28227f3a in pthread_getprio () from /lib/libthr.so.3 >>>> #5 0x00000000 in ?? () >>>> >>>> Thread 2 (Thread 28804300 (LWP 100133/bacula-fd)): >>>> #0 0x28646103 in select () from /lib/libc.so.7 >>>> #1 0x2822a960 in select () from /lib/libthr.so.3 >>>> #2 0x281c45a8 in bnet_thread_server () from /usr/local/lib/libbac.so.5 >>>> #3 0x0804f5c6 in main () >>>> #0 0x282302b3 in pthread_kill () from /lib/libthr.so.3 >>>> >>>> ========== >>>> >>>> FreeBSD startup script: >>>> >>>> #!/bin/sh >>>> # >>>> # $FreeBSD: sysutils/bacula-server/files/bacula-fd.in 323275 2013-07-19 >>>> 09:44:58Z rm $ >>>> # >>>> # PROVIDE: bacula_fd >>>> # REQUIRE: DAEMON >>>> # KEYWORD: shutdown >>>> # >>>> # Add the following lines to /etc/rc.conf.local or /etc/rc.conf >>>> # to enable this service: >>>> # >>>> # bacula_fd_enable (bool): Set to NO by default. >>>> # Set it to YES to enable bacula_fd. >>>> # bacula_fd_flags (params): Set params used to start bacula_fd. 
>>>> #
>>>>
>>>> . /etc/rc.subr
>>>>
>>>> name="bacula_fd"
>>>> rcvar=${name}_enable
>>>> command=/usr/local/sbin/bacula-fd
>>>>
>>>> load_rc_config $name
>>>>
>>>> : ${bacula_fd_enable="NO"}
>>>> : ${bacula_fd_flags=" -u root -g wheel -v -c /usr/local/etc/bacula/bacula-fd.conf"}
>>>> : ${bacula_fd_pidfile="/var/run/bacula-fd.9102.pid"}
>>>>
>>>> pidfile="${bacula_fd_pidfile}"
>>>>
>>>> run_rc_command "$1"
>>>>
>>>> ==========
>>>>
>>>> Thanks.
>>>>
>>>> dn
>>>>
>>>> If that doesn't catch it, then try the gdb command
>>>>
>>>> break signal_handler
>>>>
>>>> (signal_handler prints the "Bacula interrupted by signal" message).
>>>>
>>>> __Martin
>>>>
>>>> __Martin
>>>>
>>>> ------------------------------------------------------------------------------
>>>> October Webinars: Code for Performance
>>>> Free Intel webinars can help you accelerate application performance.
>>>> Explore tips for MPI, OpenMP, advanced profiling, and more. Get the most from
>>>> the latest Intel processors and coprocessors. See abstracts and register >
>>>> http://pubads.g.doubleclick.net/gampad/clk?id=60135031&iu=/4140/ostg.clktrk
>>>> _______________________________________________
>>>> Bacula-users mailing list
>>>> Bac...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/bacula-users
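
One difference visible in the thread: the manual launch used a bare `/usr/local/sbin/bacula-fd`, while the rc script adds `-u root -g wheel -v -c ...`. A minimal sketch, assuming the defaults from the quoted script are in effect (rc.conf may override them, and rc.subr may wrap the command in ways not shown here), that prints an approximation of the command line the startup script would use, so the same flags can be tried by hand or under gdb. `rc_command_line` is a hypothetical helper name.

```shell
#!/bin/sh
# Values copied from the rc script quoted in this thread; rc.conf may
# override bacula_fd_flags on a real system.
command=/usr/local/sbin/bacula-fd
bacula_fd_flags="-u root -g wheel -v -c /usr/local/etc/bacula/bacula-fd.conf"

# Hypothetical helper: print the command line only; nothing is executed.
rc_command_line() {
    printf '%s %s\n' "$command" "$bacula_fd_flags"
}

rc_command_line
```

If the daemon also fails when started by hand with these flags, the flags rather than the rc.d environment would be the next thing to look at; that is a guess, not something the thread establishes.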