From: Angus J. <aj...@dr...> - 2004-09-26 03:41:57
I sent this from the wrong email address a few minutes ago. Not sure if it will double on the list, but if it does...SORRY!!

Angus

--------

Hi all,

I just received this email. This is on a production server running Bacula. I was just setting up Nagios to do service checks on this server's director. I was also doing more testing on my test server, and the same thing happened. It seems like after it checks about 20 times or so it hangs the director (you cannot connect to it), then finally the director crashes with this error.

Is there something that I am doing wrong? I've increased the check time from 5 minutes to 8 hours for now...this seems to stop the crash.

Thanks for any help!

Angus

>[Switching to Process 512, Thread 1]
>0x2823c883 in poll () from /usr/lib/libc.so.5
>$1 = "libart-dir", '\0' <repeats 19 times>
>$2 = 0x80db0d8 "bacula-dir"
>$3 = 0x80db158 "/sbin/bacula-dir"
>$4 = "MySQL"
>$5 = 0x809bfac "1.34.6 (28 July 2004)"
>$6 = 0x8094457 "i386-unknown-freebsd5.1"
>$7 = 0x809444f "freebsd"
>$8 = 0x8094443 "5.1-RELEASE"
>#0 0x2823c883 in poll () from /usr/lib/libc.so.5
>#1 0x28106651 in _thread_kern_sched_state_unlock () from /usr/lib/libc_r.so.5
>#2 0x28106050 in _thread_kern_scheduler () from /usr/lib/libc_r.so.5
>
>Thread 11 (Process 512, Thread 11):
>#0 0x28105aac in _thread_kern_sched () from /usr/lib/libc_r.so.5
>#1 0x28106218 in _thread_kern_sched_state () from /usr/lib/libc_r.so.5
>#2 0x281021a3 in _select () from /usr/lib/libc_r.so.5
>#3 0x281023a3 in select () from /usr/lib/libc_r.so.5
>#4 0x080755d2 in bnet_thread_server(char*, int, int, workq_tag*, void* (*)(void*)) (bind_addr=0x48c <Error reading address 0x48c: Bad address>, port=9101,
>    max_clients=10, client_wq=0x80b0560,
>    handle_client_request=0x80687fc <handle_UA_client_request>)
>    at bnet_server.c:139
>#5 0x080686cb in connect_thread (arg=0x80b058c) at ua_server.c:90
>#6 0x280fd3fe in _thread_start () from /usr/lib/libc_r.so.5
>
>Thread 10 (Process 512, Thread 10):
>#0 0x28105aac in _thread_kern_sched () from /usr/lib/libc_r.so.5
>#1 0x28106283 in _thread_kern_sched_state_unlock () from /usr/lib/libc_r.so.5
>#2 0x28109da9 in _pthread_cond_timedwait () from /usr/lib/libc_r.so.5
>#3 0x280fe1b9 in _thread_gc () from /usr/lib/libc_r.so.5
>#4 0x280fd3fe in _thread_start () from /usr/lib/libc_r.so.5
>
>Thread 9 (Process 512, Thread 9):
>#0 0x28105aac in _thread_kern_sched () from /usr/lib/libc_r.so.5
>#1 0x28106218 in _thread_kern_sched_state () from /usr/lib/libc_r.so.5
>#2 0x28103cc7 in _nanosleep () from /usr/lib/libc_r.so.5
>#3 0x28103dce in nanosleep () from /usr/lib/libc_r.so.5
>#4 0x08073abb in bmicrosleep(int, long) (sec=1, usec=209) at bsys.c:504
>#5 0x08086286 in watchdog_thread (arg=0x0) at watchdog.c:262
>#6 0x280fd3fe in _thread_start () from /usr/lib/libc_r.so.5
>
>Thread 8 (Process 512, Thread 8):
>#0 0x28105aac in _thread_kern_sched () from /usr/lib/libc_r.so.5
>#1 0x28106218 in _thread_kern_sched_state () from /usr/lib/libc_r.so.5
>#2 0x28102edb in _read () from /usr/lib/libc_r.so.5
>#3 0x28102f75 in read () from /usr/lib/libc_r.so.5
>#4 0x08073d00 in read_nbytes (bsock=0xffffffff,
>    ptr=0xbfa8766c "\230\035(\b\030l\016\b\230\035(\b\230w=A8=BF\017(\005\b\230\035(\bN",
>    nbytes=4) at bnet.c:71
>#5 0x08073f31 in bnet_recv(BSOCK*) (bsock=0x8281d98) at bnet.c:169
>#6 0x0805280f in bget_dirmsg(BSOCK*) (bs=0x8281d98) at getmsg.c:77
>#7 0x0804c8a5 in wait_for_job_termination(JCR*) (jcr=0x80e6c18)
>    at backup.c:266
>#8 0x0804c7bc in do_backup(JCR*) (jcr=0x80e6c18) at backup.c:230
>#9 0x08054480 in job_thread (arg=0x80e6c18) at job.c:204
>#10 0x080563aa in jobq_server (arg=0x80b04a0) at jobq.c:428
>#11 0x280fd3fe in _thread_start () from /usr/lib/libc_r.so.5
>#12 0xbfa78000 in ?? ()
>#13 0x2810d544 in _dead_list () from /usr/lib/libc_r.so.5
>#14 0x28103dce in nanosleep () from /usr/lib/libc_r.so.5
>#15 0x08073abb in bmicrosleep(int, long) (sec=1, usec=209) at bsys.c:504
>#16 0x08059df4 in wait_for_next_job(char*) (one_shot_job_to_run=0x0)
>    at scheduler.c:101
>#17 0x0804ada5 in main (argc=0, argv=0x80db4d8) at dird.c:240
>#18 0x0804a955 in _start ()
>
>Thread 7 (Process 512, Thread 7):
>#0 0x28105aac in _thread_kern_sched () from /usr/lib/libc_r.so.5
>#1 0x28106218 in _thread_kern_sched_state () from /usr/lib/libc_r.so.5
>#2 0x28102edb in _read () from /usr/lib/libc_r.so.5
>#3 0x28102f75 in read () from /usr/lib/libc_r.so.5
>#4 0x08073d00 in read_nbytes (bsock=0xffffffff,
>    ptr=0xbfacb66c "\230\037(\b\030\f\016\b\230\037(\b\230=B7=AC=BF\017(\005\b\230\037(\bN",
>    nbytes=4) at bnet.c:71
>#5 0x08073f31 in bnet_recv(BSOCK*) (bsock=0x8281f98) at bnet.c:169
>#6 0x0805280f in bget_dirmsg(BSOCK*) (bs=0x8281f98) at getmsg.c:77
>#7 0x0804c8a5 in wait_for_job_termination(JCR*) (jcr=0x80e0c18)
>    at backup.c:266
>#8 0x0804c7bc in do_backup(JCR*) (jcr=0x80e0c18) at backup.c:230
>#9 0x08054480 in job_thread (arg=0x80e0c18) at job.c:204
>#10 0x080563aa in jobq_server (arg=0x80b04a0) at jobq.c:428
>#11 0x280fd3fe in _thread_start () from /usr/lib/libc_r.so.5
>#12 0xbfabc000 in ?? ()
>#13 0x2810d544 in _dead_list () from /usr/lib/libc_r.so.5
>#14 0x28103dce in nanosleep () from /usr/lib/libc_r.so.5
>#15 0x08073abb in bmicrosleep(int, long) (sec=1, usec=209) at bsys.c:504
>#16 0x08059df4 in wait_for_next_job(char*) (one_shot_job_to_run=0x0)
>    at scheduler.c:101
>#17 0x0804ada5 in main (argc=0, argv=0x80db4d8) at dird.c:240
>#18 0x0804a955 in _start ()
>
>Thread 6 (Process 512, Thread 6):
>#0 0x28105aac in _thread_kern_sched () from /usr/lib/libc_r.so.5
>#1 0x28106218 in _thread_kern_sched_state () from /usr/lib/libc_r.so.5
>#2 0x28108acc in _connect () from /usr/lib/libc_r.so.5
>#3 0x28108bb5 in connect () from /usr/lib/libc_r.so.5
>#4 0x08074a9c in bnet_open (jcr=0x80e9418, name=0x808a840 "File daemon",
>    host=0x80e4398 "sales-10.libart.org", service=0x0, port=9102,
>    fatal=0x2810d544) at bnet.c:630
>#5 0x08074b81 in bnet_connect(JCR*, int, int, char const*, char*, char*, int, int) (jcr=0x80e9418, retry_interval=10, max_retry_time=12959330,
>    name=0x808a840 "File daemon", host=0x80e4398 "sales-10.libart.org",
>    service=0x0, port=9102, verbose=1) at bnet.c:654
>#6 0x080510b9 in connect_to_file_daemon(JCR*, int, int, int) (jcr=0x80e9418,
>    retry_interval=10, max_retry_time=12960000, verbose=1) at fd_cmds.c:79
>#7 0x0804c6b2 in do_backup(JCR*) (jcr=0x80e9418) at backup.c:187
>#8 0x08054480 in job_thread (arg=0x80e9418) at job.c:204
>#9 0x080563aa in jobq_server (arg=0x80b04a0) at jobq.c:428
>#10 0x280fd3fe in _thread_start () from /usr/lib/libc_r.so.5
>#11 0xbfa56000 in ?? ()
>#12 0x281051ed in _mutex_cv_lock () from /usr/lib/libc_r.so.5
>#13 0x080e6c18 in ?? ()
>#14 0x280fd3fe in _thread_start () from /usr/lib/libc_r.so.5
>#15 0xbfa78000 in ?? ()
>#16 0x2810d544 in _dead_list () from /usr/lib/libc_r.so.5
>#17 0x28103dce in nanosleep () from /usr/lib/libc_r.so.5
>#18 0x08073abb in bmicrosleep(int, long) (sec=1, usec=209) at bsys.c:504
>#19 0x08059df4 in wait_for_next_job(char*) (one_shot_job_to_run=0x0)
>    at scheduler.c:101
>#20 0x0804ada5 in main (argc=0, argv=0x80db4d8) at dird.c:240
>#21 0x0804a955 in _start ()
>
>Thread 5 (Process 512, Thread 5):
>#0 0x28105aac in _thread_kern_sched () from /usr/lib/libc_r.so.5
>#1 0x28106218 in _thread_kern_sched_state () from /usr/lib/libc_r.so.5
>#2 0x28102edb in _read () from /usr/lib/libc_r.so.5
>#3 0x28102f75 in read () from /usr/lib/libc_r.so.5
>#4 0x08073d00 in read_nbytes (bsock=0xffffffff,
>    ptr=0xbfa43dec "\210\212\a\b\024>=A4=BF\030l\016\b\030?=A4=BF\017(\005\b\030\031(\b=A8", nbytes=4)
>    at bnet.c:71
>#5 0x08073f31 in bnet_recv(BSOCK*) (bsock=0x8281918) at bnet.c:169
>#6 0x0805280f in bget_dirmsg(BSOCK*) (bs=0x8281918) at getmsg.c:77
>#7 0x08057065 in msg_thread (arg=0x80e6c18) at msgchan.c:228
>#8 0x280fd3fe in _thread_start () from /usr/lib/libc_r.so.5
>#9 0xbfa34000 in ?? ()
>#10 0x080e9c28 in ?? ()
>/share/bacula/btraceback.gdb:10: Error in sourced command file:
>Error accessing memory address 0x6e: Bad address.
>#0 0x2823c883 in poll () from /usr/lib/libc.so.5
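
For anyone trying to reproduce the hang, a bare TCP connect check against the director port can be sketched as below. This is only an assumption about what a simple service check does (open a connection, then drop it without speaking the Bacula protocol); it is not the actual Nagios plugin. The function name `check_tcp` and the host `127.0.0.1` are hypothetical; port 9101 is the one shown in the `bnet_thread_server` frame above.

```python
import socket


def check_tcp(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    Hypothetical helper: it connects and immediately closes, sending
    no data, which is an assumed model of a simple port check.
    """
    try:
        # create_connection resolves the host and applies the timeout;
        # the "with" block closes the socket as soon as it is left.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False


if __name__ == "__main__":
    # 9101 is the director port from the traceback; the host is assumed.
    ok = check_tcp("127.0.0.1", 9101)
    print("director port reachable" if ok else "director port NOT reachable")
```

Running something like this in a loop against a test director would show whether repeated connect-and-drop cycles alone are enough to trigger the hang.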