From: Jo R. <jr...@ne...> - 2009-10-10 07:28:26
I've just upgraded a machine from 6.3 to 7.2. I replaced all the ports with new versions compiled on 7.2, and everything is working normally (just like everything else running these builds) except for bacula-dir. It is hanging right after starting the UA server, and before it starts accepting network connections. No core, no log message, nothing -- except that you have to kill -9 the process.

I found one other report about this in the archives but they said updating gettext fixed it. I recompiled those to be sure. I even recompiled bacula with NLS disabled so that gettext wasn't linked in, and the problem doesn't change. I'm making no headway on this, and would appreciate some assistance:

/usr/local/sbin/bacula-dir -d300 -f -v
bacula-dir: dird.c:184-0 Debug level = 300
bacula-dir: runscript.c:296-0 runscript: debug
bacula-dir: runscript.c:297-0 --> RunScript
bacula-dir: runscript.c:298-0 --> Command=/usr/local/share/bacula/make_catalog_backup bacula bacula *snip* localhost
bacula-dir: runscript.c:299-0 --> Target=
bacula-dir: runscript.c:300-0 --> RunOnSuccess=1
bacula-dir: runscript.c:301-0 --> RunOnFailure=0
bacula-dir: runscript.c:302-0 --> FailJobOnError=1
bacula-dir: runscript.c:303-0 --> RunWhen=2
bacula-dir: runscript.c:296-0 runscript: debug
bacula-dir: runscript.c:297-0 --> RunScript
bacula-dir: runscript.c:298-0 --> Command=/usr/local/share/bacula/delete_catalog_backup
bacula-dir: runscript.c:299-0 --> Target=
bacula-dir: runscript.c:300-0 --> RunOnSuccess=1
bacula-dir: runscript.c:301-0 --> RunOnFailure=0
bacula-dir: runscript.c:302-0 --> FailJobOnError=1
bacula-dir: runscript.c:303-0 --> RunWhen=1
bacula-dir: message.c:263-0 Copy message resource 2870f1b8 to 28714698
bacula-dir: bsys.c:503-0 Could not open state file. sfd=-1 size=188: ERR=No such file or directory
bacula-dir: mysql.c:101-0 db_open first time
bacula-dir: mysql.c:130-0 initdb ref=1 connected=0 db=0
bacula-dir: mysql.c:166-0 mysql_init done
bacula-dir: mysql.c:187-0 mysql_real_connect done
bacula-dir: mysql.c:189-0 db_user=bacula db_name=bacula db_password=*snip*
bacula-dir: mysql.c:215-0 opendb ref=1 connected=1 db=28708044
bacula-dir: sql_create.c:341-0 In create mediatype
bacula-dir: sql_create.c:344-0 selectmediatype: SELECT MediaTypeId,MediaType FROM MediaType WHERE MediaType='File_SVcolo'
bacula-dir: mysql.c:236-0 closedb ref=0 connected=1 db=28708044
bacula-dir: mysql.c:240-0 close db=28708044
backup0-dir: dird.c:317-0 Start UA server

FWIW, it's not the state file error. I didn't used to get that error until I removed all files trying to see if something in the environment was confusing it. Exact same process, same hang in the same place whether the state file was there or not.

Machine: Rackable 3U with single-core Athlon
CPU: AMD Opteron(tm) Processor 244 (1804.10-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0xf5a  Stepping = 10
  Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
  AMD Features=0xe0500800<SYSCALL,NX,MMX+,LM,3DNow!+,3DNow!>
real memory = 2146828288 (2047 MB)

This exact machine and hardware have been running FreeBSD 6.x and Bacula for >2 years now, zero problems.

-- 
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness

From: Kern S. <ke...@si...> - 2009-10-10 08:55:27
Hello,

I am sorry you are having problems.

I assume 6.3 to 7.2 refers to some system OS that you did not specify (except possibly at the end of your email), and you did not specify what version of Bacula you are using.

If this is a FreeBSD machine, someone reported problems with networking that was due to the fact that the /etc/hosts file specified localhost as an IPv6 address, yet IPv6 was not configured, which of course causes problems -- OS distribution bug IMO.

Otherwise, I have no idea what is going wrong. Unfortunately we no longer give support on this list. We do answer development questions, but this seems to be a support question. Please see www.bacula.org -> Support for all the possible options.

Regards,

Kern

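The /etc/hosts hint in this reply can be checked mechanically. What follows is only a minimal sketch and not part of Bacula or FreeBSD: the function name `ipv6_localhost_entries` and the sample file contents are made up for illustration. It lists the lines of a hosts file that map localhost to an IPv6 address.

```python
import ipaddress


def ipv6_localhost_entries(hosts_text, name="localhost"):
    """Return (address, names) pairs that map `name` to an IPv6 address."""
    matches = []
    for raw_line in hosts_text.splitlines():
        # Drop comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        address, names = fields[0], fields[1:]
        try:
            # Strip a zone suffix such as "%lo0" before parsing.
            parsed = ipaddress.ip_address(address.split("%", 1)[0])
        except ValueError:
            continue  # Not a valid address; skip the line.
        if parsed.version == 6 and name in names:
            matches.append((address, names))
    return matches


# Hypothetical hosts file contents, for illustration only.
SAMPLE_HOSTS = """\
::1             localhost localhost.example.org
127.0.0.1       localhost localhost.example.org
fe80::1%lo0     localhost
"""

if __name__ == "__main__":
    for address, names in ipv6_localhost_entries(SAMPLE_HOSTS):
        print(address, " ".join(names))
```

On a real system one would pass in the contents of /etc/hosts instead of the sample string. An empty result would rule out this particular cause, though it says nothing about other name sources such as DNS.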
From: Jo R. <jr...@ne...> - 2009-10-10 17:58:25
On Oct 10, 2009, at 1:55 AM, Kern Sibbald wrote:
> I assume 6.3 to 7.2 refers to some system OS that you did not specify
> (except possibly at the end of your email), and you did not specify
> what version of Bacula you are using.

3.0.2 -- the latest stable.

> If this is a FreeBSD machine, someone reported problems with
> networking that was due to the fact that the /etc/hosts file specified
> localhost as an IPv6 address, yet IPv6 was not configured, which of
> course causes problems -- OS distribution bug IMO.

Not in the hosts file on this system.

> Otherwise, I have no idea what is going wrong. Unfortunately we no
> longer give support on this list. We do answer development questions,
> but this seems to be a support question. Please see www.bacula.org ->
> Support for all the possible options.

Kern, it's really hard not to read that as a blowoff. In the many years that I've been using bacula, asking questions involving debug output on the -users list hasn't been very productive, and I have been consistently referred to the -devel list. I'm not looking for commercial support -- those options only allow you to pay someone at the same skill level to ask the same question back to this list. You and I both know there's no commercial support option that won't involve the -devel list for a fix.

I'm more than happy to repost this to -users if that's what you prefer. But the next step is that you're going to have to tell me how to get more information out of this. I've looked at dird.c which contains that message and there doesn't appear to be more debug around that. What's going to help? A gdb trace?

Here is the ktrace (similar to strace on linux) output near the failure. From my reading it is hanging in the _umtx_op() call.

91887 bacula-dir GIO fd 1 wrote 44 bytes
      "bacula-dir: mysql.c:240-0 close db=28708044
      "
91887 bacula-dir RET write 44/0x2c
91887 bacula-dir CALL write(0x4,0x28763000,0x5)
91887 bacula-dir GIO fd 4 wrote 5 bytes
      0x0000 0100 0000 01    |.....|
91887 bacula-dir RET write 5
91887 bacula-dir CALL shutdown(0x4,<invalid=2>)
91887 bacula-dir RET shutdown 0
91887 bacula-dir CALL close(0x4)
91887 bacula-dir RET close 0
91887 bacula-dir CALL __sysctl(0xbfbfe88c,0x2,0x2815eea0,0xbfbfe8a4,0,0)
91887 bacula-dir RET __sysctl 0
91887 bacula-dir CALL sigaction(SIGHUP,0xbfbfecb4,0xbfbfec9c)
91887 bacula-dir RET sigaction 0
91887 bacula-dir CALL open(0x2815eee0,O_RDWR|O_CREAT,S_IRUSR|S_IWUSR)
91887 bacula-dir NAMI "/var/db/bacula/backup0-dir.conmsg"
91887 bacula-dir RET open 4
91887 bacula-dir CALL lseek(0x4,0,SEEK_SET,0x2)
91887 bacula-dir RET lseek 0
91887 bacula-dir CALL close(0x4)
91887 bacula-dir RET close 0
91887 bacula-dir CALL open(0x2815eee0,O_RDWR|O_APPEND|O_CREAT,S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH)
91887 bacula-dir NAMI "/var/db/bacula/backup0-dir.conmsg"
91887 bacula-dir RET open 4
91887 bacula-dir CALL lseek(0x4,0,SEEK_SET,0x2)
91887 bacula-dir RET lseek 0
91887 bacula-dir CALL write(0x1,0x28711000,0x2a)
91887 bacula-dir GIO fd 1 wrote 42 bytes
      "backup0-dir: dird.c:317-0 Start UA server
      "
91887 bacula-dir RET write 42/0x2a
91887 bacula-dir CALL _umtx_op(0xbfbfebd0,0x3,0x1,0,0)
91887 bacula-dir RET _umtx_op 0
91887 bacula-dir CALL sigprocmask(SIG_BLOCK,0xbfbfeb74,0x287010d8)
91887 bacula-dir RET sigprocmask 0
91887 bacula-dir CALL sigprocmask(SIG_SETMASK,0x287010d8,0)
91887 bacula-dir RET sigprocmask 0
91887 bacula-dir CALL _umtx_op(0x281daa80,0x11,0,0,0)
91887 bacula-dir RET _umtx_op -1 errno 4 Interrupted system call
91887 bacula-dir PSIG SIGINT SIG_DFL

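The reading of the trace above can be sketched in code. This is a rough illustration only, assuming kdump-style records of the form `PID command CALL name(args)` and `PID command RET name result`; the helper name `last_syscall` is invented here and is not part of ktrace or kdump. It reports the final system call in a trace and what, if anything, it returned.

```python
import re

# Matches kdump-style records such as
#   "91887 bacula-dir CALL _umtx_op(0x281daa80,0x11,0,0,0)"
#   "91887 bacula-dir RET _umtx_op -1 errno 4 Interrupted system call"
RECORD = re.compile(r"^\s*(\d+)\s+(\S+)\s+(CALL|RET)\s+(\w+)(.*)$")


def last_syscall(trace_text):
    """Return (name, args, result) for the final CALL in the trace.

    `result` is the text of the matching RET record, or None when the
    trace ends before the call returns. Returns None if no CALL is seen.
    """
    pending = None
    for line in trace_text.splitlines():
        match = RECORD.match(line)
        if not match:
            continue  # GIO, NAMI, PSIG and data lines are ignored here.
        _pid, _command, kind, name, rest = match.groups()
        if kind == "CALL":
            pending = (name, rest.strip(), None)
        elif pending and pending[0] == name and pending[2] is None:
            pending = (name, pending[1], rest.strip())
    return pending


# The last few records from the trace in this message.
TRACE_TAIL = """\
91887 bacula-dir CALL sigprocmask(SIG_SETMASK,0x287010d8,0)
91887 bacula-dir RET sigprocmask 0
91887 bacula-dir CALL _umtx_op(0x281daa80,0x11,0,0,0)
91887 bacula-dir RET _umtx_op -1 errno 4 Interrupted system call
91887 bacula-dir PSIG SIGINT SIG_DFL
"""

if __name__ == "__main__":
    print(last_syscall(TRACE_TAIL))
```

For this trace the final call is `_umtx_op`, and its only recorded return is errno 4 (EINTR) alongside the SIGINT delivery. That is consistent with the process having been blocked in that call until it was interrupted, but the trace alone does not show which lock was being waited on.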
From: Kern S. <ke...@si...> - 2009-10-10 18:25:01
On Saturday 10 October 2009 19:58:05 Jo Rhett wrote:
> Kern, it's really hard not to read that as a blowoff.

I am not sure what you mean by a "blowoff". I am definitely telling you that unless it is a bug, we don't deal with it on this list.

> In the many years that I've been using bacula, asking questions
> involving debug output on the -users list hasn't been very productive,
> and I have been consistently referred to the -devel list. I'm not
> looking for commercial support -- those options only allow you to pay
> someone at the same skill level to ask the same question back to this
> list. You and I both know there's no commercial support option that
> won't involve the -devel list for a fix.

I don't believe any of the support options on the web site under Professional support involve the bacula-devel list, unless possibly for a bug.

> I'm more than happy to repost this to -users if that's what you
> prefer. But the next step is that you're going to have to tell me how
> to get more information out of this. I've looked at dird.c which
> contains that message and there doesn't appear to be more debug around
> that.
> What's going to help?

Sorry, I have no idea without digging into the problem.

> A gdb trace?

Perhaps, without seeing it I cannot say.

However it looks to me more like a network configuration problem or possibly an OS bug.

Bacula *is* known to work properly on FreeBSD 7.2-STABLE.

> Here is the ktrace (similar to strace on linux) output near the
> failure. From my reading it is hanging in the _umtx_op() call.

Sorry, I have no idea what umtx_op does. I suggest you ask about this on the FreeBSD support list.

Regards,

Kern

From: Jo R. <jr...@ne...> - 2009-10-10 19:02:53
On Oct 10, 2009, at 11:25 AM, Kern Sibbald wrote:
> I am not sure what you mean by a "blowoff". I am definitely telling
> you that unless it is a bug, we don't deal with it on this list.

It involves a hung process. No debug output. Exact same configuration works on previous OS. Exact same problem on a completely different system running the same code.

Smells like a bug. Sounds like a bug. Looks like a bug. It's probably a bug. (it may be a bug that only affects FreeBSD, but it's a bug)

I did repost this to -users. If you respond there and assist then I'll stop feeling like you're blowing it off.

> Sorry, I have no idea what umtx_op does.
> I suggest you ask about this on the FreeBSD support list.

Right. Because they know the bacula source code. Kern, everything else compiles and runs just fine on this system, only bacula hangs. Gee, where do you think the problem is? Where do you think they will refer me to?

I guess it's been a long time since I was active here, but I'm shocked that the support quality has descended so far. You can't be bothered to identify what is happening at this point in the code, and what I might look at for the problem? Really?

Yes, I'll go read the source and figure it out from scratch myself if I have to. But if that's true, then I'll consider bacula "unsupported" going forward.

From: Dan L. <da...@la...> - 2009-10-10 23:32:28
Jo Rhett wrote:
> I guess it's been a long time since I was active here, but I'm shocked
> that the support quality has descended so far. You can't be bothered
> to identify what is happening at this point in the code, and what I
> might look at for the problem? Really?

We differ in our view regarding the project. Each project handles things in its own way. This is ours.

> Yes, I'll go read the source and figure it out from scratch myself if
> I have to. But if that's true, then I'll consider bacula "unsupported"
> going forward.

You are free to do as you wish. Best wishes if you don't like how we run things here.

From: Kern S. <ke...@si...> - 2009-10-10 23:52:58
On Saturday 10 October 2009 21:02:32 Jo Rhett wrote:
> On Oct 10, 2009, at 11:25 AM, Kern Sibbald wrote:
> > I am not sure what you mean by a "blowoff". I am definitely telling
> > you that unless it is a bug, we don't deal with it on this list.
>
> It involves a hung process. No debug output. Exact same
> configuration works on previous OS.

Then it is most likely an OS bug.

> Exact same problem on a completely different system running the same
> code.
>
> Smells like a bug. Sounds like a bug. Looks like a bug. It's
> probably a bug.
> (it may be a bug that only affects FreeBSD, but it's a bug)

There is definitely something wrong, but for the moment, to me it looks like a FreeBSD bug or a problem configuring your network.

> I did repost this to -users. If you respond there and assist then
> I'll stop feeling like you're blowing it off.

I put in 10-14 hours of work on Bacula for free every day, and unfortunately no longer have time to give support or read the Bacula users list, so I guess you will just need to continue to think I am "blowing it off" even if I don't know what it means, and I have done my best to respond to you.

> > Sorry, I have no idea what umtx_op does.
> > I suggest you ask about this on the FreeBSD support list.
>
> Right. Because they know the bacula source code. Kern, everything
> else compiles and runs just fine on this system, only bacula hangs.

You apparently haven't bothered to check your facts -- just surf the Internet a bit looking for _umtx_op.

> Gee, where do you think the problem is?

I have told you at least two times, I think it is either a network configuration problem or an OS bug.

> Where do you think they will refer me to?

Instead of just ignoring my advice, why don't you ask them? You might be surprised to see how many problems there are (or were) with _umtx on FreeBSD.

> I guess it's been a long time since I was active here, but I'm shocked
> that the support quality has descended so far.

From what I understand, you never asked on the bacula-users list. If that is the case, you are complaining about quality of support, but you never asked in the right place where some FreeBSD users might have already seen and resolved the problem.

> You can't be bothered to identify what is happening at this point in
> the code, and what I might look at for the problem? Really?

I am not sure what more I can do since I have already told you my best guess what the problem is, given the information at hand. I also told you that a gdb traceback could possibly be useful.

> Yes, I'll go read the source and figure it out from scratch myself if
> I have to.

Great. That is what Open Source is supposed to be about -- users helping resolve problems.

> But if that's true, then I'll consider bacula "unsupported" going
> forward.

From: Jo R. <jr...@ne...> - 2009-10-11 06:31:26
>> Yes, I'll go read the source and figure it out from scratch myself if
>> I have to.

On Oct 10, 2009, at 4:53 PM, Kern Sibbald wrote:
> Great. That is what Open Source is supposed to be about -- users
> helping resolve problems.

Yes. And I often do that for many projects. And yes, often the main developer says "I don't have time to look at this, but what's going on here is this... and take a look here..." and thus gives me the ability to start somewhere useful instead of starting at the top and trying to learn the code base in toto.

"It's an OS problem" tells me nothing. "It's a networking problem" also tells me nothing. A comment about exactly what the code is going to be doing at this point would be really useful. "It starts a subprocess to listen on 9101 for incoming connections" or .... what?

Dan Langille said:
> We differ in our view regarding the project. Each project handles
> things in its own way. This is ours.

Nice way. Don't even toss a bone to the people who you are expecting to find and fix the problems for you.

From: Kern S. <ke...@si...> - 2009-10-11 08:35:00
On Sunday 11 October 2009 08:31:03 Jo Rhett wrote:
> >> Yes, I'll go read the source and figure it out from scratch myself
> >> if I have to.
>
> On Oct 10, 2009, at 4:53 PM, Kern Sibbald wrote:
> > Great. That is what Open Source is supposed to be about -- users
> > helping resolve problems.
>
> Yes. And I often do that for many projects. And yes, it is often
> the main developer says "I don't have time to look at this, but what's
> going on here is this... and take a look here..." and thus gives me
> the ability to start somewhere useful instead of starting at the top
> and trying to learn the code base en total.
>
> "It's an OS problem" tells me nothing. "It's a networking problem"
> also tells me nothing. A comment about exactly what the code is
> going to be doing at this point would be really useful. "It starts a
> subprocess to listen on 9101 for incoming connections" or .... what?

The problem here is that you apparently don't read what I write to you.

I told you *exactly* what kind of network problem I think it is (IPv6 including the details). I told you it might be an OS problem and that you should ask FreeBSD, and that you could search the Internet for _umtx_op. After you suggested gdb I said that could "possibly" provide some information.

There is no reason for me to describe how Bacula works because at this point I don't think this is a Bacula problem.

You have consumed about 45 minutes of my time; you have not read or have rejected every idea I have given you; you criticize the bacula-users list, which I find gives very good advice, as justification for bypassing them; then you complain we haven't even thrown you a bone. That is not going to help getting to a solution to your problem.

From now on, you are on your own on this one unless another developer wants to try to help ...

> Dan Langille said:
> > We differ in our view regarding the project. Each project handles
> > things in its own way. This is ours.
>
> Nice way. Don't even toss a bone to the people who you are expecting
> to find and fix the problems for you.

From: Jo R. <jr...@ne...> - 2009-10-12 07:50:55
On Oct 11, 2009, at 1:35 AM, Kern Sibbald wrote:
> The problem here is that you apparently don't read what I write to
> you.

You aren't reading my replies. I read and replied to each one of the following. I'll clarify again.

> I told you *exactly* what kind of network problem I think it is (IPv6
> including the details).

Yes, and I confirmed that this isn't the problem. There's no IPv6 address in /etc/hosts, nor does this host use DNS at all. No way for it to derive "::1" from its name anywhere. (and I don't use localhost anywhere, but likewise localhost has no ::1 entry)

> I told you it might be an OS problem and that you should ask FreeBSD,

For which I pointed out that a problem which affects only Bacula will be referred back to here. Why would it not be? If I say "Bacula is crashing..." they will immediately (and rightly so) send me right back to these lists.

> and that you could search the Internet for _umtx_op.

I did. I found 6 references to it, most of which don't appear related, and none of which identified either a bug or a fix. Any actual problem tends to have numerous reports -- 6 unrelated entries spanning 10 years that mention this function don't make much of a case for anything.

If I overlooked a clear problem related to this, or a patch related to this, I'm happy to re-read. But I read them all carefully, and all related messages in those mailing lists, and I saw nothing that indicated a clear bug or even a similar problem. Not a single one of them relates to freebsd 7.2 and only one relates to freebsd 7 at all. Most of them seriously predate FreeBSD 6.3. And guess what -- I've run bacula on every version from 4.4 up to 6.3 without any drama.

So yes, there very well could be a FreeBSD related bug here. But without any more information, short of reading the code myself from scratch and building test cases, I have no place to start.

> After you suggested gdb I said that could "possibly" provide some
> information.

I'm sorry, I haven't had time to get a gdb dump. It didn't appear likely to be received well. If you'll accept it, I'll prioritize it.

> There is no reason for me to describe how Bacula works because at
> this point I don't think this is a Bacula problem.

Well if you were to describe what it's doing, it might help me identify things I could investigate which could further isolate, identify or resolve the bug. Give me somewhere useful to investigate.

> You have consumed about 45 minutes of my time; you have not read or
> have rejected every idea I have given you

No, I have read everything you've said. I responded to each in turn, and have done everything you suggested other than walk over to the freebsd mailing lists and say "Bacula is crashing -- help" because they will immediately (and rightly so) suggest I contact you.

From: Dan L. <da...@la...> - 2009-10-12 17:28:04
Jo Rhett wrote:
> On Oct 11, 2009, at 1:35 AM, Kern Sibbald wrote:
>> The problem here is that you apparently don't read what I write to
>> you.
>
> You aren't reading my replies. I read and replied to each one of the
> following. I'll clarify again.

Time to stop this contest here. Things are progressing on the users mailing list AFAIK.

From: Dan L. <da...@la...> - 2009-10-11 12:23:26
> Dan Langille said:
>> We differ in our view regarding the project. Each project handles
>> things in its own way. This is ours.
>
> Nice way. Don't even toss a bone to the people who you are expecting
> to find and fix the problems for you.

Eh? I guess you posted this before reading my post on -users.... I wish you well.

From: Jo R. <jr...@ne...> - 2009-10-21 07:57:56
|
The gdb backtrace: (gdb) bt #0 0x281a4709 in _umtx_op_err () from /usr/local/lib/mysql/ libmysqlclient_r.so.16 #1 0x281a454b in __thr_umutex_lock () from /usr/local/lib/mysql/ libmysqlclient_r.so.16 #2 0x2819ef08 in init_static () from /usr/local/lib/mysql/ libmysqlclient_r.so.16 #3 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/ libmysqlclient_r.so.16 #4 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7 #5 0x28615080 in calloc () from /lib/libc.so.7 #6 0x2819ecca in mutex_init () from /usr/local/lib/mysql/ libmysqlclient_r.so.16 #7 0x2819ef8d in init_static () from /usr/local/lib/mysql/ libmysqlclient_r.so.16 #8 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/ libmysqlclient_r.so.16 #9 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7 #10 0x286155dc in malloc () from /lib/libc.so.7 #11 0x281a0afb in _thr_alloc () from /usr/local/lib/mysql/ libmysqlclient_r.so.16 #12 0x281a1aef in pthread_create () from /usr/local/lib/mysql/ libmysqlclient_r.so.16 #13 0x0808b608 in start_UA_server () #14 0x08052a9c in main () On Oct 10, 2009, at 11:25 AM, Kern Sibbald wrote: > On Saturday 10 October 2009 19:58:05 Jo Rhett wrote: >> On Oct 10, 2009, at 1:55 AM, Kern Sibbald wrote: >>> I assume 6.3 to 7.2 refers to some system OS that you did not >>> specify (except >>> possibly at the end of your email), and you did not specify what >>> version of >>> Bacula you are using. >> >> 3.0.2 -- the latest stable. >> >>> If this is a FreeBSD machine, someone reported problems with >>> networking that >>> was due to the fact that the /etc/hosts file specified localhost as >>> an IPv6 >>> address, yet IPv6 was not configured, which of course causes >>> problems -- OS >>> distribution bug IMO. >> >> Not in the hosts file on this system. >> >>> Otherwise, I have no idea what is going wrong. Unfortunately we no >>> longer give >>> support on this list. We do answer development questions, but this >>> seems to >>> be a support question. 
Please see www.bacula.org -> Support for all >>> the >>> possible options. >> >> Kern, it's really hard not to read that as a blowoff. > > I am not sure what you mean by a "blowoff". I am definitely telling > you that > unless it is a bug, we don't deal with it on this list. > >> In the many >> years that I've been using bacula, asking questions involving debug >> output on the -users list hasn't been very productive, and I have >> been >> consistently referred to the -devel list. I'm not looking for >> commercial support -- those options only allow you to pay someone at >> the same skill level to ask the same question back to this list. You >> and I both know there's no commercial support option that won't >> involve the -devel list for a fix. > > I don't believe any of the support options on the web site under > Professional > support involve the bacula-devel list, unless possibly for a bug. > >> >> I'm more than happy to repost this to -users if that's what you >> prefer. But the next step is that you're going to have to tell me >> how to get more information out of this. I've looked at dird.c which >> contains that message and there doesn't appear to be more debug >> around >> that. > >> What's going to help? > > Sorry, I have no idea without digging into the problem. > >> A gdb trace? > > Perhaps, without seeing it I cannot say. > > However it looks to me more like a network configuration problem or > possibly > an OS bug. > > Bacula *is* known to work properly on FreeBSD 7.2-STABLE. > > > >> >> Here is the ktrace (similar to strace on linux) output near the >> failure. From my reading it is hanging in the _umtx_op() call. 
>> >> 91887 bacula-dir GIO fd 1 wrote 44 bytes >> "bacula-dir: mysql.c:240-0 close db=28708044 >> " >> 91887 bacula-dir RET write 44/0x2c >> 91887 bacula-dir CALL write(0x4,0x28763000,0x5) >> 91887 bacula-dir GIO fd 4 wrote 5 bytes >> 0x0000 0100 0000 >> 01 >> >> |.....| >> >> 91887 bacula-dir RET write 5 >> 91887 bacula-dir CALL shutdown(0x4,<invalid=2>) >> 91887 bacula-dir RET shutdown 0 >> 91887 bacula-dir CALL close(0x4) >> 91887 bacula-dir RET close 0 >> 91887 bacula-dir CALL __sysctl(0xbfbfe88c, >> 0x2,0x2815eea0,0xbfbfe8a4,0,0) >> 91887 bacula-dir RET __sysctl 0 >> 91887 bacula-dir CALL sigaction(SIGHUP,0xbfbfecb4,0xbfbfec9c) >> 91887 bacula-dir RET sigaction 0 >> 91887 bacula-dir CALL open(0x2815eee0,O_RDWR|O_CREAT,S_IRUSR| >> S_IWUSR) >> 91887 bacula-dir NAMI "/var/db/bacula/backup0-dir.conmsg" >> 91887 bacula-dir RET open 4 >> 91887 bacula-dir CALL lseek(0x4,0,SEEK_SET,0x2) >> 91887 bacula-dir RET lseek 0 >> 91887 bacula-dir CALL close(0x4) >> 91887 bacula-dir RET close 0 >> 91887 bacula-dir CALL open(0x2815eee0,O_RDWR|O_APPEND| >> O_CREAT,S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH) >> 91887 bacula-dir NAMI "/var/db/bacula/backup0-dir.conmsg" >> 91887 bacula-dir RET open 4 >> 91887 bacula-dir CALL lseek(0x4,0,SEEK_SET,0x2) >> 91887 bacula-dir RET lseek 0 >> 91887 bacula-dir CALL write(0x1,0x28711000,0x2a) >> 91887 bacula-dir GIO fd 1 wrote 42 bytes >> "backup0-dir: dird.c:317-0 Start UA server >> " >> 91887 bacula-dir RET write 42/0x2a >> 91887 bacula-dir CALL _umtx_op(0xbfbfebd0,0x3,0x1,0,0) >> 91887 bacula-dir RET _umtx_op 0 >> 91887 bacula-dir CALL sigprocmask(SIG_BLOCK,0xbfbfeb74,0x287010d8) >> 91887 bacula-dir RET sigprocmask 0 >> 91887 bacula-dir CALL sigprocmask(SIG_SETMASK,0x287010d8,0) >> 91887 bacula-dir RET sigprocmask 0 >> 91887 bacula-dir CALL _umtx_op(0x281daa80,0x11,0,0,0) >> 91887 bacula-dir RET _umtx_op -1 errno 4 Interrupted system call >> 91887 bacula-dir PSIG SIGINT SIG_DFL > > > Sorry, I have no idea what umtx_op does. 
> I suggest you ask about this on the FreeBSD support list. > > Regards, > > Kern |
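The reading of the ktrace output above (the process was sitting in `_umtx_op()` when it was interrupted) can be cross-checked by pulling the failed `RET` records out of a kdump listing. What follows is only a minimal sketch: the function name `failed_syscalls` and the regular expression are assumptions made for illustration, not part of ktrace, kdump, or Bacula, and the sample text is copied from the trace quoted in this thread.

```python
import re

# Matches kdump "RET" records that report a failure, for example:
#   91887 bacula-dir RET _umtx_op -1 errno 4 Interrupted system call
# (Hypothetical helper pattern; kdump itself offers no such parser.)
FAILED_RET = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<cmd>\S+)\s+RET\s+(?P<syscall>\S+)\s+-1\s+errno\s+"
    r"(?P<errno>\d+)\s+(?P<msg>.*)$",
    re.MULTILINE,
)


def failed_syscalls(kdump_text):
    """Return (syscall, errno, message) for every RET record that failed."""
    return [
        (m.group("syscall"), int(m.group("errno")), m.group("msg").strip())
        for m in FAILED_RET.finditer(kdump_text)
    ]


# Tail of the trace quoted in this thread.
SAMPLE = """\
91887 bacula-dir CALL sigprocmask(SIG_SETMASK,0x287010d8,0)
91887 bacula-dir RET sigprocmask 0
91887 bacula-dir CALL _umtx_op(0x281daa80,0x11,0,0,0)
91887 bacula-dir RET _umtx_op -1 errno 4 Interrupted system call
91887 bacula-dir PSIG SIGINT SIG_DFL
"""

print(failed_syscalls(SAMPLE))
```

The only failing record is the final `_umtx_op` call, which returns with errno 4 (interrupted) immediately before the SIGINT is delivered. That is consistent with the process having been blocked in that call until it was interrupted, but it does not say which lock it was waiting on.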
From: Kern S. <ke...@si...> - 2009-10-21 08:09:52
|
What you show below is unfortunately only part of the story. To understand
better what is going on we need a "thread apply all bt" as documented in
the manual. Also, since it is in the MySQL library, knowing what version
you are using can be useful.

This thread seems to be blocked in the MySQL client libraries. This
could be due to quite a number of things. First is that MySQL is not
running, after that, your client library may not correspond to the server,
some other MySQL or Bacula thread may be blocking it, ...

> The gdb backtrace:
>
> (gdb) bt
> #0 0x281a4709 in _umtx_op_err () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #1 0x281a454b in __thr_umutex_lock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #2 0x2819ef08 in init_static () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #3 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #4 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7
> #5 0x28615080 in calloc () from /lib/libc.so.7
> #6 0x2819ecca in mutex_init () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #7 0x2819ef8d in init_static () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #8 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #9 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7
> #10 0x286155dc in malloc () from /lib/libc.so.7
> #11 0x281a0afb in _thr_alloc () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #12 0x281a1aef in pthread_create () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #13 0x0808b608 in start_UA_server ()
> #14 0x08052a9c in main ()
>
> On Oct 10, 2009, at 11:25 AM, Kern Sibbald wrote:
>> On Saturday 10 October 2009 19:58:05 Jo Rhett wrote:
>>> On Oct 10, 2009, at 1:55 AM, Kern Sibbald wrote:
>>>> I assume 6.3 to 7.2 refers to some system OS that you did not
>>>> specify (except
>>>> possibly at the end of your email), and you did not specify what
>>>> version of
>>>>
Bacula you are using. >>> >>> 3.0.2 -- the latest stable. >>> >>>> If this is a FreeBSD machine, someone reported problems with >>>> networking that >>>> was due to the fact that the /etc/hosts file specified localhost as >>>> an IPv6 >>>> address, yet IPv6 was not configured, which of course causes >>>> problems -- OS >>>> distribution bug IMO. >>> >>> Not in the hosts file on this system. >>> >>>> Otherwise, I have no idea what is going wrong. Unfortunately we no >>>> longer give >>>> support on this list. We do answer development questions, but this >>>> seems to >>>> be a support question. Please see www.bacula.org -> Support for all >>>> the >>>> possible options. >>> >>> Kern, it's really hard not to read that as a blowoff. >> >> I am not sure what you mean by a "blowoff". I am definitely telling >> you that >> unless it is a bug, we don't deal with it on this list. >> >>> In the many >>> years that I've been using bacula, asking questions involving debug >>> output on the -users list hasn't been very productive, and I have >>> been >>> consistently referred to the -devel list. I'm not looking for >>> commercial support -- those options only allow you to pay someone at >>> the same skill level to ask the same question back to this list. You >>> and I both know there's no commercial support option that won't >>> involve the -devel list for a fix. >> >> I don't believe any of the support options on the web site under >> Professional >> support involve the bacula-devel list, unless possibly for a bug. >> >>> >>> I'm more than happy to repost this to -users if that's what you >>> prefer. But the next step is that you're going to have to tell me >>> how to get more information out of this. I've looked at dird.c which >>> contains that message and there doesn't appear to be more debug >>> around >>> that. >> >>> What's going to help? >> >> Sorry, I have no idea without digging into the problem. >> >>> A gdb trace? >> >> Perhaps, without seeing it I cannot say. 
>> >> However it looks to me more like a network configuration problem or >> possibly >> an OS bug. >> >> Bacula *is* known to work properly on FreeBSD 7.2-STABLE. >> >> >> >>> >>> Here is the ktrace (similar to strace on linux) output near the >>> failure. From my reading it is hanging in the _umtx_op() call. >>> >>> 91887 bacula-dir GIO fd 1 wrote 44 bytes >>> "bacula-dir: mysql.c:240-0 close db=28708044 >>> " >>> 91887 bacula-dir RET write 44/0x2c >>> 91887 bacula-dir CALL write(0x4,0x28763000,0x5) >>> 91887 bacula-dir GIO fd 4 wrote 5 bytes >>> 0x0000 0100 0000 >>> 01 >>> >>> |.....| >>> >>> 91887 bacula-dir RET write 5 >>> 91887 bacula-dir CALL shutdown(0x4,<invalid=2>) >>> 91887 bacula-dir RET shutdown 0 >>> 91887 bacula-dir CALL close(0x4) >>> 91887 bacula-dir RET close 0 >>> 91887 bacula-dir CALL __sysctl(0xbfbfe88c, >>> 0x2,0x2815eea0,0xbfbfe8a4,0,0) >>> 91887 bacula-dir RET __sysctl 0 >>> 91887 bacula-dir CALL sigaction(SIGHUP,0xbfbfecb4,0xbfbfec9c) >>> 91887 bacula-dir RET sigaction 0 >>> 91887 bacula-dir CALL open(0x2815eee0,O_RDWR|O_CREAT,S_IRUSR| >>> S_IWUSR) >>> 91887 bacula-dir NAMI "/var/db/bacula/backup0-dir.conmsg" >>> 91887 bacula-dir RET open 4 >>> 91887 bacula-dir CALL lseek(0x4,0,SEEK_SET,0x2) >>> 91887 bacula-dir RET lseek 0 >>> 91887 bacula-dir CALL close(0x4) >>> 91887 bacula-dir RET close 0 >>> 91887 bacula-dir CALL open(0x2815eee0,O_RDWR|O_APPEND| >>> O_CREAT,S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH) >>> 91887 bacula-dir NAMI "/var/db/bacula/backup0-dir.conmsg" >>> 91887 bacula-dir RET open 4 >>> 91887 bacula-dir CALL lseek(0x4,0,SEEK_SET,0x2) >>> 91887 bacula-dir RET lseek 0 >>> 91887 bacula-dir CALL write(0x1,0x28711000,0x2a) >>> 91887 bacula-dir GIO fd 1 wrote 42 bytes >>> "backup0-dir: dird.c:317-0 Start UA server >>> " >>> 91887 bacula-dir RET write 42/0x2a >>> 91887 bacula-dir CALL _umtx_op(0xbfbfebd0,0x3,0x1,0,0) >>> 91887 bacula-dir RET _umtx_op 0 >>> 91887 bacula-dir CALL sigprocmask(SIG_BLOCK,0xbfbfeb74,0x287010d8) >>> 
91887 bacula-dir RET sigprocmask 0 >>> 91887 bacula-dir CALL sigprocmask(SIG_SETMASK,0x287010d8,0) >>> 91887 bacula-dir RET sigprocmask 0 >>> 91887 bacula-dir CALL _umtx_op(0x281daa80,0x11,0,0,0) >>> 91887 bacula-dir RET _umtx_op -1 errno 4 Interrupted system call >>> 91887 bacula-dir PSIG SIGINT SIG_DFL >> >> >> Sorry, I have no idea what umtx_op does. >> I suggest you ask about this on the FreeBSD support list. >> >> Regards, >> >> Kern > > > ------------------------------------------------------------------------------ > Come build with us! The BlackBerry(R) Developer Conference in SF, CA > is the only developer event you need to attend this year. Jumpstart your > developing skills, take BlackBerry mobile applications to market and stay > ahead of the curve. Join us from November 9 - 12, 2009. Register now! > http://p.sf.net/sfu/devconference > _______________________________________________ > Bacula-devel mailing list > Bac...@li... > https://lists.sourceforge.net/lists/listinfo/bacula-devel > Best regards, Kern |
From: Martin S. <ma...@li...> - 2009-10-21 16:09:22
|
>>>>> On Wed, 21 Oct 2009 00:57:40 -0700, Jo Rhett said:
>
> The gdb backtrace:
>
> (gdb) bt
> #0 0x281a4709 in _umtx_op_err () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #1 0x281a454b in __thr_umutex_lock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #2 0x2819ef08 in init_static () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #3 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #4 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7
> #5 0x28615080 in calloc () from /lib/libc.so.7
> #6 0x2819ecca in mutex_init () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #7 0x2819ef8d in init_static () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #8 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #9 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7
> #10 0x286155dc in malloc () from /lib/libc.so.7
> #11 0x281a0afb in _thr_alloc () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #12 0x281a1aef in pthread_create () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> #13 0x0808b608 in start_UA_server ()
> #14 0x08052a9c in main ()

It looks suspicious that the pthread implementation comes from
/usr/local/lib/mysql/libmysqlclient_r.so.16 in this backtrace. Either gdb is
broken or your libmysqlclient is very strange.

__Martin |
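The observation about where the pthread symbols appear to come from can be restated mechanically. The sketch below scans gdb `bt` output and lists frames whose function name looks like a threading primitive but which gdb attributes to a library other than libc or a thread library. Everything in it is an assumption made for illustration: the helper name `misplaced_thread_frames`, the symbol prefixes, and the list of "expected" library name prefixes. It only summarizes what gdb printed and cannot tell whether gdb's symbol attribution is itself correct.

```python
import os
import re

# Matches gdb "bt" frames such as:
#   #12 0x281a1aef in pthread_create () from /usr/local/lib/mysql/libmysqlclient_r.so.16
# The "from <library>" part is absent for frames in the main executable.
FRAME_RE = re.compile(
    r"^#(?P<num>\d+)\s+0x[0-9a-fA-F]+\s+in\s+(?P<func>\S+)\s+\(\)"
    r"(?:[ \t]+from[ \t]+(?P<lib>\S+))?",
    re.MULTILINE,
)

# Assumption: library name prefixes that would normally provide these symbols.
EXPECTED_LIB_PREFIXES = ("libthr.", "libpthread.", "libc.")
# Assumption: function name prefixes (after leading underscores) that mark
# threading primitives.
THREAD_SYMBOL_PREFIXES = ("pthread_", "thr_")


def misplaced_thread_frames(backtrace):
    """Return (frame number, function, library) for thread-related symbols
    that gdb attributes to an unexpected shared object."""
    hits = []
    for m in FRAME_RE.finditer(backtrace):
        lib = m.group("lib")
        if lib is None:
            continue  # frame belongs to the executable itself
        lib_name = os.path.basename(lib)
        bare_func = m.group("func").lstrip("_")
        if bare_func.startswith(THREAD_SYMBOL_PREFIXES) and not lib_name.startswith(
            EXPECTED_LIB_PREFIXES
        ):
            hits.append((int(m.group("num")), m.group("func"), lib_name))
    return hits


# A few frames copied from the backtrace in this thread (paths unwrapped).
SAMPLE = """\
#3 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
#4 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7
#11 0x281a0afb in _thr_alloc () from /usr/local/lib/mysql/libmysqlclient_r.so.16
#12 0x281a1aef in pthread_create () from /usr/local/lib/mysql/libmysqlclient_r.so.16
#13 0x0808b608 in start_UA_server ()
"""

for frame in misplaced_thread_frames(SAMPLE):
    print(frame)
```

On these frames the sketch reports frames 3, 11 and 12, all attributed to libmysqlclient_r.so.16, while the `_pthread_mutex_trylock` frame in libc.so.7 is treated as expected.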
From: Kern S. <ke...@si...> - 2009-10-21 17:03:58
|
On Wednesday 21 October 2009 18:09:09 Martin Simmons wrote:
> >>>>> On Wed, 21 Oct 2009 00:57:40 -0700, Jo Rhett said:
> >
> > The gdb backtrace:
> >
> > (gdb) bt
> > #0 0x281a4709 in _umtx_op_err () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> > #1 0x281a454b in __thr_umutex_lock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> > #2 0x2819ef08 in init_static () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> > #3 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> > #4 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7
> > #5 0x28615080 in calloc () from /lib/libc.so.7
> > #6 0x2819ecca in mutex_init () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> > #7 0x2819ef8d in init_static () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> > #8 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> > #9 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7
> > #10 0x286155dc in malloc () from /lib/libc.so.7
> > #11 0x281a0afb in _thr_alloc () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> > #12 0x281a1aef in pthread_create () from /usr/local/lib/mysql/libmysqlclient_r.so.16
> > #13 0x0808b608 in start_UA_server ()
> > #14 0x08052a9c in main ()
>
> It looks suspicious that the pthread implementation comes from
> /usr/local/lib/mysql/libmysqlclient_r.so.16 in this backtrace. Either gdb
> is broken or your libmysqlclient is very strange.

Good point, Martin! In start_UA_server() Bacula does create the thread
that is going to listen for console connections, but that should have nothing
to do with libmysqlclient_r. Something is definitely very broken -- as you
say, probably gdb or the kernel interface that gdb needs to run.
Kern

> __Martin |
From: Jo R. <jr...@ne...> - 2009-10-22 17:46:28
|
Apologies, I reorganized your response into top-posting order to keep the
information in the message.

What do you mean by "the pthread implementation comes from
/usr/local/lib/mysql/libmysqlclient_r.so.16 in this backtrace"? How would I
determine this? How can I prevent this, etc.?

Note: this seems likely to relate to the problem, since all other
google-able references to hangs in this routine are tied to thread issues,
so this might be the right road to investigate. I deeply appreciate your
assistance debugging this.

FWIW, I see no reference to threading libraries in any of the relevant
modules.

# ldd /usr/local/sbin/bacula-dir
/usr/local/sbin/bacula-dir:
	libbacfind.so.1 => /usr/local/lib/libbacfind.so.1 (0x280de000)
	libbacsql.so.1 => /usr/local/lib/libbacsql.so.1 (0x280ea000)
	libbacpy.so.1 => /usr/local/lib/libbacpy.so.1 (0x28107000)
	libbaccfg.so.1 => /usr/local/lib/libbaccfg.so.1 (0x28109000)
	libbac.so.1 => /usr/local/lib/libbac.so.1 (0x28110000)
	libmysqlclient_r.so.16 => /usr/local/lib/mysql/libmysqlclient_r.so.16 (0x28160000)
	libcrypt.so.4 => /lib/libcrypt.so.4 (0x281dd000)
	libz.so.4 => /lib/libz.so.4 (0x281f6000)
	libintl.so.8 => /usr/local/lib/libintl.so.8 (0x28208000)
	libiconv.so.3 => /usr/local/lib/libiconv.so.3 (0x28211000)
	libwrap.so.5 => /usr/lib/libwrap.so.5 (0x282ef000)
	libssl.so.5 => /usr/lib/libssl.so.5 (0x282f6000)
	libcrypto.so.5 => /lib/libcrypto.so.5 (0x28337000)
	libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x28490000)
	libm.so.5 => /lib/libm.so.5 (0x28585000)
	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x2859a000)
	libc.so.7 => /lib/libc.so.7 (0x285ae000)

# ldd /usr/local/bin/mysql
/usr/local/bin/mysql:
	libreadline.so.7 => /lib/libreadline.so.7 (0x28099000)
	libncursesw.so.7 => /lib/libncursesw.so.7 (0x280cb000)
	libmysqlclient.so.16 => /usr/local/lib/mysql/libmysqlclient.so.16 (0x28117000)
	libcrypt.so.4 => /lib/libcrypt.so.4 (0x28181000)
	libz.so.4 => /lib/libz.so.4 (0x2819a000)
	libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x281ac000)
	libm.so.5 => /lib/libm.so.5 (0x282a1000)
	libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x282b6000)
	libc.so.7 => /lib/libc.so.7 (0x282c1000)
	libncurses.so.7 => /lib/libncurses.so.7 (0x283c3000)

# ldd /usr/local/lib/mysql/libmysqlclient_r.so.16
/usr/local/lib/mysql/libmysqlclient_r.so.16:
	libcrypt.so.4 => /lib/libcrypt.so.4 (0x28300000)
	libm.so.5 => /lib/libm.so.5 (0x28319000)
	libz.so.4 => /lib/libz.so.4 (0x2832e000)
	libc.so.7 => /lib/libc.so.7 (0x28080000)

On Oct 21, 2009, at 9:09 AM, Martin Simmons wrote:
> It looks suspicious that the pthread implementation comes from
> /usr/local/lib/mysql/libmysqlclient_r.so.16 in this backtrace.
> Either gdb is broken or your libmysqlclient is very strange.

>>>>>> On Wed, 21 Oct 2009 00:57:40 -0700, Jo Rhett said:
>>
>> The gdb backtrace:
>>
>> (gdb) bt
>> #0 0x281a4709 in _umtx_op_err () from /usr/local/lib/mysql/libmysqlclient_r.so.16
>> #1 0x281a454b in __thr_umutex_lock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
>> #2 0x2819ef08 in init_static () from /usr/local/lib/mysql/libmysqlclient_r.so.16
>> #3 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
>> #4 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7
>> #5 0x28615080 in calloc () from /lib/libc.so.7
>> #6 0x2819ecca in mutex_init () from /usr/local/lib/mysql/libmysqlclient_r.so.16
>> #7 0x2819ef8d in init_static () from /usr/local/lib/mysql/libmysqlclient_r.so.16
>> #8 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/libmysqlclient_r.so.16
>> #9 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7
>> #10 0x286155dc in malloc () from /lib/libc.so.7
>> #11 0x281a0afb in _thr_alloc () from /usr/local/lib/mysql/libmysqlclient_r.so.16
>> #12 0x281a1aef in pthread_create () from /usr/local/lib/mysql/libmysqlclient_r.so.16
>> #13 0x0808b608 in start_UA_server ()
>> #14 0x08052a9c in main ()
>
> __Martin

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness |
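The "no reference to threading libraries" check done by eye on the ldd listings can also be scripted, which helps when comparing several binaries. This is a minimal sketch under stated assumptions: the helper names `linked_libraries` and `thread_libraries` are made up for illustration, the prefixes `libthr.` and `libpthread.` are assumed to be the names a thread library would carry, and the `libthr.so.3` line in the second sample is a hypothetical entry added only to show a positive result; it does not appear in any output posted in this thread.

```python
import re

# Matches ldd dependency lines such as:
#   libc.so.7 => /lib/libc.so.7 (0x285ae000)
LDD_LINE = re.compile(
    r"^\s*(?P<name>\S+)\s+=>\s+(?P<path>\S+)\s+\(0x[0-9a-fA-F]+\)",
    re.MULTILINE,
)

# Assumption: name prefixes under which a thread library would be listed.
THREAD_LIB_PREFIXES = ("libthr.", "libpthread.")


def linked_libraries(ldd_output):
    """Map each library name in ldd output to the path it resolved to."""
    return {m.group("name"): m.group("path") for m in LDD_LINE.finditer(ldd_output)}


def thread_libraries(ldd_output):
    """Return the sorted names of any thread libraries listed by ldd."""
    return sorted(
        name
        for name in linked_libraries(ldd_output)
        if name.startswith(THREAD_LIB_PREFIXES)
    )


# ldd output for libmysqlclient_r.so.16 as posted in this thread.
POSTED = """\
/usr/local/lib/mysql/libmysqlclient_r.so.16:
	libcrypt.so.4 => /lib/libcrypt.so.4 (0x28300000)
	libm.so.5 => /lib/libm.so.5 (0x28319000)
	libz.so.4 => /lib/libz.so.4 (0x2832e000)
	libc.so.7 => /lib/libc.so.7 (0x28080000)
"""

# Hypothetical variant with a thread library entry, for contrast only.
HYPOTHETICAL = POSTED + "\tlibthr.so.3 => /lib/libthr.so.3 (0x28100000)\n"

print(thread_libraries(POSTED))
print(thread_libraries(HYPOTHETICAL))
```

For the posted listing the result is empty, matching the observation that no threading library is listed as a dependency. Whether that absence is actually the cause of the hang is not something this check can establish.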
From: Jo R. <jr...@ne...> - 2009-10-22 17:40:55
|
On Oct 21, 2009, at 1:09 AM, Kern Sibbald wrote:
> What you show below is unfortunately only part of the story. To
> understand better what is going on we need a "thread apply all bt" as
> documented in the manual.

(gdb) thread apply all bt
Cannot get thread info: invalid key

> Also, since it is in the MySQL library, knowing what version
> you are using can be useful.

5.1.39

> This thread seems to be blocked in the MySQL client libraries. This
> could be due to quite a number of things. First is that MySQL is not
> running,

It is running. If you look at the debug output (included here again for
reference) you'll see that the process opens a session with mysql and does a
query successfully, then closes the session before hanging:

bacula-dir: mysql.c:101-0 db_open first time
bacula-dir: mysql.c:130-0 initdb ref=1 connected=0 db=0
bacula-dir: mysql.c:166-0 mysql_init done
bacula-dir: mysql.c:187-0 mysql_real_connect done
bacula-dir: mysql.c:189-0 db_user=bacula db_name=bacula db_password=<snip>
bacula-dir: mysql.c:215-0 opendb ref=1 connected=1 db=28708044
bacula-dir: sql_create.c:341-0 In create mediatype
bacula-dir: sql_create.c:344-0 selectmediatype: SELECT MediaTypeId,MediaType FROM MediaType WHERE MediaType='File_SVcolo'
bacula-dir: mysql.c:236-0 closedb ref=0 connected=1 db=28708044
bacula-dir: mysql.c:240-0 close db=28708044
backup0-dir: dird.c:317-0 Start UA server

> after that, your client library may not correspond to the server,
> some other MySQL or Bacula thread may be blocking it, ...

I'm not sure what you mean by this. If you mean the mysql client, they are
compiled together at the same time on the same machine.
>> The gdb backtrace: >> >> (gdb) bt >> #0 0x281a4709 in _umtx_op_err () from /usr/local/lib/mysql/ >> libmysqlclient_r.so.16 >> #1 0x281a454b in __thr_umutex_lock () from /usr/local/lib/mysql/ >> libmysqlclient_r.so.16 >> #2 0x2819ef08 in init_static () from /usr/local/lib/mysql/ >> libmysqlclient_r.so.16 >> #3 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/ >> libmysqlclient_r.so.16 >> #4 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7 >> #5 0x28615080 in calloc () from /lib/libc.so.7 >> #6 0x2819ecca in mutex_init () from /usr/local/lib/mysql/ >> libmysqlclient_r.so.16 >> #7 0x2819ef8d in init_static () from /usr/local/lib/mysql/ >> libmysqlclient_r.so.16 >> #8 0x2819f7b3 in pthread_mutex_trylock () from /usr/local/lib/mysql/ >> libmysqlclient_r.so.16 >> #9 0x28688434 in _pthread_mutex_trylock () from /lib/libc.so.7 >> #10 0x286155dc in malloc () from /lib/libc.so.7 >> #11 0x281a0afb in _thr_alloc () from /usr/local/lib/mysql/ >> libmysqlclient_r.so.16 >> #12 0x281a1aef in pthread_create () from /usr/local/lib/mysql/ >> libmysqlclient_r.so.16 >> #13 0x0808b608 in start_UA_server () >> #14 0x08052a9c in main () >> >> On Oct 10, 2009, at 11:25 AM, Kern Sibbald wrote: >>> On Saturday 10 October 2009 19:58:05 Jo Rhett wrote: >>>> On Oct 10, 2009, at 1:55 AM, Kern Sibbald wrote: >>>>> I assume 6.3 to 7.2 refers to some system OS that you did not >>>>> specify (except >>>>> possibly at the end of your email), and you did not specify what >>>>> version of >>>>> Bacula you are using. >>>> >>>> 3.0.2 -- the latest stable. >>>> >>>>> If this is a FreeBSD machine, someone reported problems with >>>>> networking that >>>>> was due to the fact that the /etc/hosts file specified localhost >>>>> as >>>>> an IPv6 >>>>> address, yet IPv6 was not configured, which of course causes >>>>> problems -- OS >>>>> distribution bug IMO. >>>> >>>> Not in the hosts file on this system. >>>> >>>>> Otherwise, I have no idea what is going wrong. 
Unfortunately we no >>>>> longer give >>>>> support on this list. We do answer development questions, but >>>>> this >>>>> seems to >>>>> be a support question. Please see www.bacula.org -> Support for >>>>> all >>>>> the >>>>> possible options. >>>> >>>> Kern, it's really hard not to read that as a blowoff. >>> >>> I am not sure what you mean by a "blowoff". I am definitely telling >>> you that >>> unless it is a bug, we don't deal with it on this list. >>> >>>> In the many >>>> years that I've been using bacula, asking questions involving debug >>>> output on the -users list hasn't been very productive, and I have >>>> been >>>> consistently referred to the -devel list. I'm not looking for >>>> commercial support -- those options only allow you to pay someone >>>> at >>>> the same skill level to ask the same question back to this list. >>>> You >>>> and I both know there's no commercial support option that won't >>>> involve the -devel list for a fix. >>> >>> I don't believe any of the support options on the web site under >>> Professional >>> support involve the bacula-devel list, unless possibly for a bug. >>> >>>> >>>> I'm more than happy to repost this to -users if that's what you >>>> prefer. But the next step is that you're going to have to tell >>>> me >>>> how to get more information out of this. I've looked at dird.c >>>> which >>>> contains that message and there doesn't appear to be more debug >>>> around >>>> that. >>> >>>> What's going to help? >>> >>> Sorry, I have no idea without digging into the problem. >>> >>>> A gdb trace? >>> >>> Perhaps, without seeing it I cannot say. >>> >>> However it looks to me more like a network configuration problem or >>> possibly >>> an OS bug. >>> >>> Bacula *is* known to work properly on FreeBSD 7.2-STABLE. >>> >>> >>> >>>> >>>> Here is the ktrace (similar to strace on linux) output near the >>>> failure. From my reading it is hanging in the _umtx_op() call. 
>>>> >>>> 91887 bacula-dir GIO fd 1 wrote 44 bytes >>>> "bacula-dir: mysql.c:240-0 close db=28708044 >>>> " >>>> 91887 bacula-dir RET write 44/0x2c >>>> 91887 bacula-dir CALL write(0x4,0x28763000,0x5) >>>> 91887 bacula-dir GIO fd 4 wrote 5 bytes >>>> 0x0000 0100 0000 >>>> 01 >>>> >>>> |.....| >>>> >>>> 91887 bacula-dir RET write 5 >>>> 91887 bacula-dir CALL shutdown(0x4,<invalid=2>) >>>> 91887 bacula-dir RET shutdown 0 >>>> 91887 bacula-dir CALL close(0x4) >>>> 91887 bacula-dir RET close 0 >>>> 91887 bacula-dir CALL __sysctl(0xbfbfe88c, >>>> 0x2,0x2815eea0,0xbfbfe8a4,0,0) >>>> 91887 bacula-dir RET __sysctl 0 >>>> 91887 bacula-dir CALL sigaction(SIGHUP,0xbfbfecb4,0xbfbfec9c) >>>> 91887 bacula-dir RET sigaction 0 >>>> 91887 bacula-dir CALL open(0x2815eee0,O_RDWR|O_CREAT,S_IRUSR| >>>> S_IWUSR) >>>> 91887 bacula-dir NAMI "/var/db/bacula/backup0-dir.conmsg" >>>> 91887 bacula-dir RET open 4 >>>> 91887 bacula-dir CALL lseek(0x4,0,SEEK_SET,0x2) >>>> 91887 bacula-dir RET lseek 0 >>>> 91887 bacula-dir CALL close(0x4) >>>> 91887 bacula-dir RET close 0 >>>> 91887 bacula-dir CALL open(0x2815eee0,O_RDWR|O_APPEND| >>>> O_CREAT,S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH) >>>> 91887 bacula-dir NAMI "/var/db/bacula/backup0-dir.conmsg" >>>> 91887 bacula-dir RET open 4 >>>> 91887 bacula-dir CALL lseek(0x4,0,SEEK_SET,0x2) >>>> 91887 bacula-dir RET lseek 0 >>>> 91887 bacula-dir CALL write(0x1,0x28711000,0x2a) >>>> 91887 bacula-dir GIO fd 1 wrote 42 bytes >>>> "backup0-dir: dird.c:317-0 Start UA server >>>> " >>>> 91887 bacula-dir RET write 42/0x2a >>>> 91887 bacula-dir CALL _umtx_op(0xbfbfebd0,0x3,0x1,0,0) >>>> 91887 bacula-dir RET _umtx_op 0 >>>> 91887 bacula-dir CALL sigprocmask(SIG_BLOCK,0xbfbfeb74,0x287010d8) >>>> 91887 bacula-dir RET sigprocmask 0 >>>> 91887 bacula-dir CALL sigprocmask(SIG_SETMASK,0x287010d8,0) >>>> 91887 bacula-dir RET sigprocmask 0 >>>> 91887 bacula-dir CALL _umtx_op(0x281daa80,0x11,0,0,0) >>>> 91887 bacula-dir RET _umtx_op -1 errno 4 Interrupted system call 
>>>> 91887 bacula-dir PSIG SIGINT SIG_DFL >>> >>> >>> Sorry, I have no idea what umtx_op does. >>> I suggest you ask about this on the FreeBSD support list. >>> >>> Regards, >>> >>> Kern >> >> > > > Best regards, Kern -- Jo Rhett Net Consonance : consonant endings by net philanthropy, open source and other randomness |
From: Martin S. <ma...@li...> - 2009-10-22 18:10:45
|
The lines in the gdb backtrace like this:

#12 0x281a1aef in pthread_create () from /usr/local/lib/mysql/libmysqlclient_r.so.16

mean that gdb found the address 0x281a1aef to be inside the function pthread_create and that address is within the library libmysqlclient_r.so.16.

Normally pthread_create comes from libthr.so on FreeBSD 7.

Your ldd output for /usr/local/lib/mysql/libmysqlclient_r.so.16 shows no dependency on libthr.so. My guess is that your libmysqlclient_r.so.16 has been miscompiled, using the static version of libthr rather than the dynamic one.

I recommend getting rid of that MySQL client installation and using the precompiled package for the mysql-client port, which doesn't have this problem. Version 5.0 is on the media and other versions are on the FreeBSD ftp site. The same applies to the MySQL server if you have that on the same machine.

__Martin

>>>>> On Thu, 22 Oct 2009 10:46:11 -0700, Jo Rhett said:
>
> Apologies, I re-orged your response to top-posting to keep the
> information in the message.
>
> What do you mean by "the pthread implementation comes from
> /usr/local/lib/mysql/libmysqlclient_r.so.16 in this backtrace"? How would I
> determine this? How can I prevent this, etc?
>
> Note: this seems likely to relate to the problem, since all other
> google-able references to hangs in this routine are tied to thread
> issues, so this might be the right road to investigate. I deeply
> appreciate your assistance debugging this.
>
> FWIW, I see no reference to threading libraries in any of the
> relevant modules.
> > ]# ldd /usr/local/sbin/bacula-dir > /usr/local/sbin/bacula-dir: > libbacfind.so.1 => /usr/local/lib/libbacfind.so.1 (0x280de000) > libbacsql.so.1 => /usr/local/lib/libbacsql.so.1 (0x280ea000) > libbacpy.so.1 => /usr/local/lib/libbacpy.so.1 (0x28107000) > libbaccfg.so.1 => /usr/local/lib/libbaccfg.so.1 (0x28109000) > libbac.so.1 => /usr/local/lib/libbac.so.1 (0x28110000) > libmysqlclient_r.so.16 => /usr/local/lib/mysql/ > libmysqlclient_r.so.16 (0x28160000) > libcrypt.so.4 => /lib/libcrypt.so.4 (0x281dd000) > libz.so.4 => /lib/libz.so.4 (0x281f6000) > libintl.so.8 => /usr/local/lib/libintl.so.8 (0x28208000) > libiconv.so.3 => /usr/local/lib/libiconv.so.3 (0x28211000) > libwrap.so.5 => /usr/lib/libwrap.so.5 (0x282ef000) > libssl.so.5 => /usr/lib/libssl.so.5 (0x282f6000) > libcrypto.so.5 => /lib/libcrypto.so.5 (0x28337000) > libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x28490000) > libm.so.5 => /lib/libm.so.5 (0x28585000) > libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x2859a000) > libc.so.7 => /lib/libc.so.7 (0x285ae000) > > # ldd /usr/local/bin/mysql > /usr/local/bin/mysql: > libreadline.so.7 => /lib/libreadline.so.7 (0x28099000) > libncursesw.so.7 => /lib/libncursesw.so.7 (0x280cb000) > libmysqlclient.so.16 => /usr/local/lib/mysql/ > libmysqlclient.so.16 (0x28117000) > libcrypt.so.4 => /lib/libcrypt.so.4 (0x28181000) > libz.so.4 => /lib/libz.so.4 (0x2819a000) > libstdc++.so.6 => /usr/lib/libstdc++.so.6 (0x281ac000) > libm.so.5 => /lib/libm.so.5 (0x282a1000) > libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x282b6000) > libc.so.7 => /lib/libc.so.7 (0x282c1000) > libncurses.so.7 => /lib/libncurses.so.7 (0x283c3000) > > # ldd /usr/local/lib/mysql/libmysqlclient_r.so.16 > /usr/local/lib/mysql/libmysqlclient_r.so.16: > libcrypt.so.4 => /lib/libcrypt.so.4 (0x28300000) > libm.so.5 => /lib/libm.so.5 (0x28319000) > libz.so.4 => /lib/libz.so.4 (0x2832e000) > libc.so.7 => /lib/libc.so.7 (0x28080000) > > On Oct 21, 2009, at 9:09 AM, Martin Simmons wrote: > > It looks suspicious that the 
pthread implementation comes from > > /usr/local/lib/mysql/libmysqlclient_r.so.16 in this backtrace. > > Either gdb is > > broken or your libmysqlclient is very strange. |
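Martin's check (no libthr.so in the ldd output for libmysqlclient_r.so.16) is easy to script. What follows is only a minimal sketch and was not posted by anyone in the thread: the function name links_libthr is made up for illustration, and the sample text is the ldd output quoted in this message.

```sh
#!/bin/sh
# Sketch only: report whether ldd output read from stdin lists libthr.
# "links_libthr" is a hypothetical helper name, not part of any tool.
links_libthr() {
    if grep -q 'libthr\.so'; then
        echo yes
    else
        echo no
    fi
}

# ldd output for libmysqlclient_r.so.16, copied from the thread.
ldd_output='/usr/local/lib/mysql/libmysqlclient_r.so.16:
    libcrypt.so.4 => /lib/libcrypt.so.4 (0x28300000)
    libm.so.5 => /lib/libm.so.5 (0x28319000)
    libz.so.4 => /lib/libz.so.4 (0x2832e000)
    libc.so.7 => /lib/libc.so.7 (0x28080000)'

# Feed the captured text to the check.
printf '%s\n' "$ldd_output" | links_libthr
```

On a live system the same function could presumably be fed directly, e.g. `ldd /usr/local/lib/mysql/libmysqlclient_r.so.16 | links_libthr`. A "no" here, combined with pthread frames attributed to that library in the backtrace, would be consistent with Martin's guess that libthr was linked in statically, though it does not prove it.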
From: Jo R. <jr...@ne...> - 2009-10-22 21:05:50
|
On Oct 22, 2009, at 11:10 AM, Martin Simmons wrote:
> The lines in the gdb backtrace like this:
>
> #12 0x281a1aef in pthread_create () from /usr/local/lib/mysql/libmysqlclient_r.so.16
>
> mean that gdb found the address 0x281a1aef to be inside the function
> pthread_create and that address is within the library
> libmysqlclient_r.so.16.

Sorry, yeah I know how to read the trace. I just meant that if this was a misreport by gdb I'm way too far out of compiler development stuff to fix gdb ;-)

> Normally pthread_create comes from libthr.so on FreeBSD 7.

I knew that, which is why I was confused that I didn't see it linked to either bacula-dir or libmysql.

> Your ldd output for /usr/local/lib/mysql/libmysqlclient_r.so.16 shows no
> dependency on libthr.so. My guess is that your libmysqlclient_r.so.16 has been
> miscompiled, using the static version of libthr rather than the dynamic one.

I'll investigate.

> I recommend getting rid of that MySQL client installation and using the
> precompiled package for the mysql-client port, which doesn't have this
> problem.

Every time we mix up the pre-compiled ports we get random dependency clashes which end up causing us to refresh half the ports tree installed on each machine. (For instance, libiconv just updated, which is depended on by nearly everything.) That's a massive headache. But I'll see about creating a fresh compile from a different machine, since I've recompiled

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness |
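One rough way to sanity-check whether gdb is misreporting frame #12 is to compare its address with the load addresses that ldd printed for bacula-dir earlier in the thread. The sketch below is illustrative only. It assumes the addresses shown by ldd match those in the gdb session, and that libmysqlclient_r.so.16 occupies the range up to the next library listed (libcrypt.so.4); neither assumption was confirmed in the thread. The helper name in_range is made up.

```sh
#!/bin/sh
# Sketch only: is an address inside the half-open range [base, next_base)?
# "in_range" is a hypothetical helper name used for illustration.
in_range() {
    # $1 = address, $2 = library base, $3 = base of the next library listed
    if [ $(( $1 >= $2 && $1 < $3 )) -eq 1 ]; then
        echo inside
    else
        echo outside
    fi
}

# Values copied from the thread (gdb frame #12 and ldd of bacula-dir).
frame12=0x281a1aef      # pthread_create frame in the backtrace
mysql_base=0x28160000   # libmysqlclient_r.so.16
crypt_base=0x281dd000   # libcrypt.so.4, next entry in the ldd listing

in_range "$frame12" "$mysql_base" "$crypt_base"
```

With the numbers from the thread this prints "inside", which agrees with gdb's attribution of the frame to libmysqlclient_r.so.16 rather than suggesting a gdb misreport.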
From: Kern S. <ke...@si...> - 2009-10-22 18:50:56
|
On Thursday 22 October 2009 19:40:37 Jo Rhett wrote:
> On Oct 21, 2009, at 1:09 AM, Kern Sibbald wrote:
> > What you show below is unfortunately only part of the story. To
> > understand better what is going on we need a "thread apply all bt"
> > as documented in the manual.
>
> (gdb) thread apply all bt
> Cannot get thread info: invalid key
>
> > Also, since it is in the MySQL library, knowing what version
> > you are using can be useful.
>
> 5.1.39

If I am not mistaken, that is the MySQL version that was terribly broken when doing regression on Solaris.

> > This thread seems to be blocked in the MySQL client libraries.
> > This could be due to quite a number of things. First is that MySQL
> > is not running,

I should have said -- MySQL is not running in that particular thread.

> It is running. If you look at the debug output (included here again
> for reference) you'll see that the process opens a session with mysql
> and does a query successfully, then closes the session before hanging:
>
> bacula-dir: mysql.c:101-0 db_open first time
> bacula-dir: mysql.c:130-0 initdb ref=1 connected=0 db=0
> bacula-dir: mysql.c:166-0 mysql_init done
> bacula-dir: mysql.c:187-0 mysql_real_connect done
> bacula-dir: mysql.c:189-0 db_user=bacula db_name=bacula db_password=<snip>
> bacula-dir: mysql.c:215-0 opendb ref=1 connected=1 db=28708044
> bacula-dir: sql_create.c:341-0 In create mediatype
> bacula-dir: sql_create.c:344-0 selectmediatype: SELECT MediaTypeId,MediaType FROM MediaType WHERE MediaType='File_SVcolo'
> bacula-dir: mysql.c:236-0 closedb ref=0 connected=1 db=28708044
> bacula-dir: mysql.c:240-0 close db=28708044
> backup0-dir: dird.c:317-0 Start UA server
>
> > after that, your client library may not correspond to the server,
> > some other MySQL or Bacula thread may be blocking it, ...
>
> I'm not sure what you mean by this. If you mean the mysql client,
> they are compiled together at the same time on the same machine.
So you built your own MySQL? |
From: Jo R. <jr...@ne...> - 2009-10-22 19:01:14
|
On Oct 22, 2009, at 11:51 AM, Kern Sibbald wrote:
>> 5.1.39
>
> If I am not mistaken, that is the MySQL version that was terribly
> broken when doing regression on Solaris.

Can you point me at where to find which versions are known good for bacula?

> So you built your own MySQL?

Yes. It's a fact of life on FreeBSD. Unless you use the precompiled versions available, but then you get whatever random dependencies the latest builder uses. For example, bacula as available in binary form depends on sqlite3, which is not production quality for bacula. So we have to compile it ourselves, etc.

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness |
From: Kern S. <ke...@si...> - 2009-10-22 19:41:06
|
On Thursday 22 October 2009 21:01:00 Jo Rhett wrote:
> On Oct 22, 2009, at 11:51 AM, Kern Sibbald wrote:
> >> 5.1.39
> >
> > If I am not mistaken, that is the MySQL version that was terribly
> > broken when doing regression on Solaris.
>
> Can you point me at where to find which versions are known good for
> bacula?

We don't have any specific listing. In general, all recent versions of MySQL have been very good, but the most recent one put out by Sun was broken at least on a Solaris platform. I am currently using MySQL 5.0.51a.

> > So you built your own MySQL?
>
> Yes. It's a fact of life on FreeBSD.

IMO building yourself is always dangerous unless you are an expert with the particular program (maybe you are with MySQL).

> Unless you use the precompiled
> versions available, but then you get whatever random dependencies the
> latest builder uses. For example, bacula as available in binary form
> depends on sqlite3, which is not production quality for bacula. So
> we have to compile it ourselves, etc.

I suspect your problem is with the MySQL build. Martin's detailed analysis is very good, and he has probably pinpointed the cause of your problems. I suggest you follow his advice. |
From: Jo R. <jr...@ne...> - 2009-10-22 23:19:31
|
On Oct 22, 2009, at 12:41 PM, Kern Sibbald wrote:
>> Yes. It's a fact of life on FreeBSD.
>
> IMO building yourself is always dangerous unless you are an expert
> with the particular program (maybe you are with MySQL).

Really? Obviously I'm an old fart who came from "everything has to be compiled locally" years, but I really haven't seen many issues with that. And even less so by carefully controlling the build platform.

(yah yah yah chat chat chat, you can ignore this tangent)

> I suspect your problem is with the MySQL build. Martin's detailed
> analysis is very good, and he has probably pinpointed the cause of your
> problems. I suggest you follow his advice.

I agree completely. I'm just trying to find time to confirm.

--
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source and other randomness |