I'm running Dancer 4.15b6 on a FreeBSD box:
FreeBSD claudio.csita.unige.it 4.0-RELEASE FreeBSD 4.0-RELEASE #0: Tue Jun 13 15:51:23 CEST 2000 claudio@claudio.csita.unige.it:/usr/src/sys/compile/CDiMa i386
My bot generally runs fine, with uptimes of several weeks, but
sometimes it crashes for no apparent reason.
The bug is hard to debug because I can't reproduce
the conditions under which it appears.
Here is the stacktrace from the logfile:
21.43.34 Join TALOS [0] (Taloxanto@h213-48-215.PD1.albacom.net)
Output from gdb
Attaching to program: /home/claudio/bot/dancer, process 38128
Reading symbols from /usr/lib/libcrypt.so.2...done.
Reading symbols from /usr/lib/libm.so.2...done.
Reading symbols from /usr/lib/compat/libc.so.3...done.
Reading symbols from /usr/libexec/ld-elf.so.1...done.
0x2813ae08 in wait4 () from /usr/lib/compat/libc.so.3
(gdb) #0 0x2813ae08 in wait4 () from /usr/lib/compat/libc.so.3
#1 0x28140c4c in waitpid () from /usr/lib/compat/libc.so.3
#2 0x804a151 in DumpStack (
format=0x8084d89 "gdb -q %s %d 2>/dev/null <<EOF\nset prompt\nwhere\ndetach\
nshell kill -CONT %d\nquit\nEOF\n") at stacktrace.c:181
#3 0x804a30b in StackTrace () at stacktrace.c:248
#4 0x804a385 in SignalCrashHandler (sig=11) at dancer.c:378
#5 0xbfbfffcc in ?? ()
#6 0x807fae8 in StrEqual (
first=0xe000312 <Error reading address 0xe000312: Bad address>,
second=0x8338400 "Taloxanto@*.PD1.albacom.net") at strio.c:193
#7 0x807877b in MultiCheck (domain=0x8338400 "Taloxanto@*.PD1.albacom.net")
at flood.c:331
#8 0x804e528 in OnJoin (
from=0xbfbff565 "TALOS!Taloxanto@h213-48-215.PD1.albacom.net",
line=0xbfbff596 ":#consoli") at server.c:1054
#9 0x804c3e7 in ParseServer (
line=0xbfbff564 ":TALOS!Taloxanto@h213-48-215.PD1.albacom.net")
at server.c:178
#10 0x805d2ff in HandleSockets (readset=0xbfbff804, writeset=0xbfbff784)
at netstuff.c:661
#11 0x805dd54 in EventLoop () at netstuff.c:908
#12 0x8049eb8 in Start (s=0x8303cc0) at dancer.c:299
#13 0x804a889 in main (argc=1, argv=0xbfbff964) at dancer.c:605
Detaching from program: /home/claudio/bot/dancer, process 38128
Output from gdb
Attaching to program: /home/claudio/bot/dancer, process 38128
Reading symbols from /usr/lib/libcrypt.so.2...done.
Reading symbols from /usr/lib/libm.so.2...done.
Reading symbols from /usr/lib/compat/libc.so.3...done.
Reading symbols from /usr/libexec/ld-elf.so.1...done.
0x2813ae08 in wait4 () from /usr/lib/compat/libc.so.3
(gdb) #0 0x2813ae08 in wait4 () from /usr/lib/compat/libc.so.3
#1 0x28140c4c in waitpid () from /usr/lib/compat/libc.so.3
#2 0x804a151 in DumpStack (
format=0x8084d89 "gdb -q %s %d 2>/dev/null <<EOF\nset prompt\nwhere\ndetach\
nshell kill -CONT %d\nquit\nEOF\n") at stacktrace.c:181
#3 0x804a30b in StackTrace () at stacktrace.c:248
#4 0x804a385 in SignalCrashHandler (sig=11) at dancer.c:378
#5 0xbfbfffcc in ?? ()
#6 0x804a39c in SignalCrashHandler (sig=11) at dancer.c:384
#7 0xbfbfffcc in ?? ()
#8 0x807fae8 in StrEqual (
first=0xe000312 <Error reading address 0xe000312: Bad address>,
second=0x8338400 "Taloxanto@*.PD1.albacom.net") at strio.c:193
[...]
Detaching from program: /home/claudio/bot/dancer, process 38128
Output from gdb
Attaching to program: /home/claudio/bot/dancer, process 38128
Reading symbols from /usr/lib/libcrypt.so.2...done.
Reading symbols from /usr/lib/libm.so.2...done.
Reading symbols from /usr/lib/compat/libc.so.3...done.
Reading symbols from /usr/libexec/ld-elf.so.1...done.
0x2813ae08 in wait4 () from /usr/lib/compat/libc.so.3
(gdb) #0 0x2813ae08 in wait4 () from /usr/lib/compat/libc.so.3
#1 0x28140c4c in waitpid () from /usr/lib/compat/libc.so.3
#2 0x804a151 in DumpStack (
format=0x8084d89 "gdb -q %s %d 2>/dev/null <<EOF\nset prompt\nwhere\ndetach\
nshell kill -CONT %d\nquit\nEOF\n") at stacktrace.c:181
#3 0x804a30b in StackTrace () at stacktrace.c:248
#4 0x804a385 in SignalCrashHandler (sig=11) at dancer.c:378
#5 0xbfbfffcc in ?? ()
#6 0x804a39c in SignalCrashHandler (sig=11) at dancer.c:384
#7 0xbfbfffcc in ?? ()
#8 0x804a39c in SignalCrashHandler (sig=11) at dancer.c:384
#9 0xbfbfffcc in ?? ()
#10 0x807fae8 in StrEqual (
From then on it kept looping, adding another SignalCrashHandler (sig=11) frame on each iteration.
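For illustration, a common way to stop a crash handler from being entered again and again is to restore the default signal action as soon as the handler starts, and to let the stack dump run only once. The sketch below is an assumption about how that could look. It is not Dancer's actual SignalCrashHandler, and I don't know what dancer.c:384 does. The names FirstCrash and GuardedCrashHandler are made up for this example.

```c
#include <signal.h>

/* Set once the first crash has been seen. sig_atomic_t is the only
 * integer type guaranteed safe to write from a signal handler. */
static volatile sig_atomic_t crash_seen = 0;

/* Hypothetical helper: returns 1 on the first call, 0 on every later call. */
static int FirstCrash(void)
{
    if (crash_seen)
        return 0;
    crash_seen = 1;
    return 1;
}

/* Hypothetical handler, not the one in dancer.c. */
static void GuardedCrashHandler(int sig)
{
    /* Restore the default action first: if anything below faults with
     * the same signal, the process is terminated instead of re-entering
     * this handler. */
    signal(sig, SIG_DFL);

    if (FirstCrash()) {
        /* Placeholder: the one-shot stack dump would go here. The flag
         * also covers the case where the same handler is installed for
         * several signals (for example SIGSEGV and SIGBUS). */
    }

    /* Re-raise so the process dies with the original signal. */
    raise(sig);
}
```

It would be installed with something like signal(SIGSEGV, GuardedCrashHandler); at startup.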
Logged In: YES
user_id=32285
When is it most likely to happen? When few users are in the
channel, or when many are?
(You see, I am wondering if (g->ident->userdomain) could get
a NULL value for some reason).
If this is the case, maybe you can fix it temporarily by
inserting a check before line 193 in strio.c, like
if(first && second)
or something like that.
It would be helpful if you found a way to reproduce the error,
I guess.
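A minimal sketch of the if(first && second) check suggested above, written as a standalone wrapper so it can be tried outside the bot. SafeStrEqual is a hypothetical name, and strcmp() is only a stand-in for whatever comparison StrEqual() in strio.c really performs. Note that in the backtrace first is 0xe000312, which is not NULL but is unreadable, so a NULL test alone would probably not catch that particular crash. It only helps if the pointer really is NULL in other cases.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical wrapper, not part of Dancer: returns 1 when both strings
 * are non-NULL and equal, 0 otherwise. strcmp() stands in for the real
 * comparison done by StrEqual(). */
static int SafeStrEqual(const char *first, const char *second)
{
    if (first == NULL || second == NULL)
        return 0;   /* treat a missing string as "no match" */
    return strcmp(first, second) == 0;
}
```

Returning 0 for a NULL argument means a missing userdomain simply never matches, which seems the least surprising choice for a flood check.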