Logged In: YES
user_id=21885
(Moving this from Support Requests to Bugs tracker.)
Zoran also discovered a similar problem:
http://article.gmane.org/gmane.comp.web.aolserver/9849
The upside is that Nate indicates he may have introduced
this bug in AOLserver 3.5.8:
http://article.gmane.org/gmane.comp.web.aolserver/8892
Considering that the error was happening in AOLserver 4.0,
either the bug didn't get fixed (unlikely) or what we're seeing
is a different bug.
Vlad seems to have a useful looking gdb backtrace showing
the problem:
http://article.gmane.org/gmane.comp.web.aolserver/9705
Samer -- can you still reproduce this problem with the latest
AOLserver 4.0.5 release?
Logged In: YES
user_id=424745
I remember that I got past this problem by setting a fully
qualified hostname:
hostname server.domain.com
instead of a simple hostname like "server".
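To make the workaround above concrete, here is a minimal C sketch of a check for whether a name looks fully qualified. The helper `looks_fully_qualified` is hypothetical and not part of AOLserver; it only inspects the string and does not perform any lookup.

```c
#include <assert.h>
#include <string.h>

/*
 * looks_fully_qualified: hypothetical helper, not an AOLserver API.
 * Returns 1 if `name` contains a dot that is neither its first
 * character nor directly followed by the end of the string,
 * and 0 otherwise. This is a string check only; it says nothing
 * about whether the name actually resolves.
 */
static int
looks_fully_qualified(const char *name)
{
    const char *dot;

    assert(name != NULL);
    dot = strchr(name, '.');
    return dot != NULL && dot != name && dot[1] != '\0';
}
```

With this check, "server.domain.com" counts as fully qualified and a bare "server" does not.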
Logged In: YES
user_id=21885
That seems to corroborate Zoran and Vlad's findings that this has
something to do with hostname resolution at startup.
I'm going to leave this bug open until it's fixed: the startup
error should be more enlightening than the "Ns_Tls: invalid
key" error. Whatever is supposed to be setting that key
should check for name resolution failure and generate a
sensible error message.
Thanks for responding.
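As a rough illustration of the kind of check meant here, the following C sketch resolves a hostname with the standard `getaddrinfo` call and produces a readable message on failure. The function name `check_resolves` and the message wording are assumptions made for this example; this is not AOLserver's actual code.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * check_resolves: hypothetical helper, not part of AOLserver.
 * Returns 0 if `host` resolves to at least one address. Otherwise
 * returns -1 and writes a human-readable explanation into `errbuf`.
 */
static int
check_resolves(const char *host, char *errbuf, size_t errlen)
{
    struct addrinfo hints, *res = NULL;
    int rc;

    assert(errbuf != NULL && errlen > 0);
    if (host == NULL || *host == '\0') {
        snprintf(errbuf, errlen, "no hostname given");
        return -1;
    }

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;

    rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc != 0) {
        /* Name the host and the resolver's reason in the message. */
        snprintf(errbuf, errlen,
                 "cannot resolve hostname \"%s\": %s; "
                 "check DNS or /etc/hosts", host, gai_strerror(rc));
        return -1;
    }
    freeaddrinfo(res);
    errbuf[0] = '\0';
    return 0;
}
```

A startup routine could call something like this before trying to listen, and log the contents of `errbuf` instead of failing later with an unrelated error.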
Logged In: YES
user_id=21885
I just tested with a hostname that has no DNS and is not
in /etc/hosts, along with the sample-config.tcl -- [ns_info
address] returned an empty string (as it couldn't resolve the
hostname, obviously) and nssock complained that it couldn't
listen on ":8000".
nssock: failed to listen on :8000: No such file or directory
At no point did I get the "Ns_Tls: invalid key: 0" error,
though. This is on Debian stable/testing with glibc 2.3 and
NPTL.
Need to find a RH AS 2.1 machine that I can experiment on ...
Logged In: YES
user_id=21885
Samer, I was able to get my hands on a RH AS 2.1 box to
test, and still cannot reproduce your bug.
# uname -a
Linux ... 2.4.9-e.27smp #1 SMP Tue Aug 5 15:49:54 EDT 2003 i686 unknown
# nsd -ft sample-config.tcl -u admin
...
[01/Jul/2004:16:07:44][18658.1024][-main-] Notice: nsmain: AOLserver/3.5.5 starting
...
[01/Jul/2004:16:07:44][18658.1024][-main-] Error: dns: gethostbyname failed: no valid IP address
[01/Jul/2004:16:07:44][18658.1024][-main-] Error: nssock: failed to listen on :8000: No such file or directory
[01/Jul/2004:16:07:44][18658.3076][-nssock-] Notice: nssock: starting
[01/Jul/2004:16:07:44][18658.3076][-nssock-] Notice: nssock: accepting connections
# ldd nsd
...
libpthread.so.0 => /lib/i686/libpthread.so.0 (0x40025000)
libm.so.6 => /lib/i686/libm.so.6 (0x4003b000)
libc.so.6 => /lib/i686/libc.so.6 (0x4005e000)
# ls -l /lib/i686/libpthread.so.0 /lib/i686/libc.so.6
lrwxrwxrwx 1 root root 13 May 8 2003 /lib/i686/libc.so.6 -> libc-2.2.4.so
lrwxrwxrwx 1 root root 17 May 8 2003 /lib/i686/libpthread.so.0 -> libpthread-0.9.so
I just realized this is 3.5.5. Let me try building 4.0.5 on this
box ...
Logged In: YES
user_id=21885
OK, I just built a clean AOLserver 4.0.5 on the RH AS 2.1 box,
and with the unmodified sample-config.tcl:
# bin/nsd -ft sample-config.tcl -u admin
...
[01/Jul/2004:16:38:52][22093.1024][-main-] Notice: nsmain: AOLserver/4.0.5 starting
...
[01/Jul/2004:16:38:52][22093.1024][-main-] Error: dns: gethostbyname failed: no valid IP address
[01/Jul/2004:16:38:52][22093.1024][-main-] Error: nssock: failed to listen on :8000: No such file or directory
[01/Jul/2004:16:38:52][22093.3076][-driver-] Notice: starting
[01/Jul/2004:16:38:52][22093.3076][-driver-] Notice: driver: accepting connections
I cannot reproduce this problem. What config .tcl file did you
use to start the server that gave you this error? If it wasn't
the stock, unmodified sample-config.tcl, could you attach the
file to this bug? Thanks.
Logged In: YES
user_id=21885
Ian Harding sent me an email that may shed some light on
this issue:
Date: Mon, 28 Jun 2004 10:40:43 -0700
From: "Ian Harding" <ianh@tpchd.org>
To: <aolserver@listserv.aol.com>
Subject: nsd Dumps Core if Interface Down at Startup
Ns_Tls: invalid key: 0: should be between 1 and 100
[1] Abort trap (core dumped) /usr/pkg/bin/nsd...
I get the above error when starting aolserver whenever
the network interface is down. Obviously, a server needs a
valid interface, but this seems like a cryptic error and
ungraceful exit.
Of course, this could just be me.
NetBSD workstation.tpchd.org 2.0_BETA NetBSD 2.0_BETA (iharding) #0: Thu May 13 14:25:35 UTC 2004 iharding@:/usr/src/sys/arch/i386/compile/iharding i386
aolserver-4.01 America Online's open source web server
aols-postgres-4.0 Postgres database access module for aolserver
#0 0x48192e1b in kill () from /usr/lib/libc.so.12
(gdb) bt
#0 0x48192e1b in kill () from /usr/lib/libc.so.12
#1 0x482073db in abort () from /usr/lib/libc.so.12
#2 0x48114c76 in Tcl_PanicVA () from /usr/pkg/lib/libtcl84.so.1
#3 0x48114c95 in Tcl_Panic () from /usr/pkg/lib/libtcl84.so.1
#4 0x480acc3c in Ns_TlsGet () from /usr/pkg/lib/libnsthread.so
#5 0x4807cf8e in NsTclLogObjCmd () from /usr/pkg/lib/libnsd.so
#6 0x4807cab5 in NsTclLogObjCmd () from /usr/pkg/lib/libnsd.so
#7 0x4807c621 in Ns_Log () from /usr/pkg/lib/libnsd.so
#8 0x48075c09 in NsEnableDNSCache () from /usr/pkg/lib/libnsd.so
#9 0x48075b1e in NsEnableDNSCache () from /usr/pkg/lib/libnsd.so
#10 0x48075850 in Ns_GetAddrByHost () from /usr/pkg/lib/libnsd.so
#11 0x480757f5 in Ns_GetAddrByHost () from /usr/pkg/lib/libnsd.so
#12 0x4807d8eb in NsInitConf () from /usr/pkg/lib/libnsd.so
#13 0x4807bc49 in _init () from /usr/pkg/lib/libnsd.so
#14 0x4804bcc0 in _rtld () from /usr/libexec/ld.elf_so
Logged In: YES
user_id=21885
Personal notes -- HEAD, nsd/nsconf.c:92.
Logged In: YES
user_id=21885
The fix has been committed to CVS HEAD and will be
backported for 4.0.7.
Logged In: YES
user_id=21885
The problem comes from not being able to resolve the
hostname of the server via DNS. At start-up, when the DNS
API tries to Ns_Log the error, the call fails because logging
isn't initialized at that point yet, resulting in the Ns_Tls
error.
The fix committed to 4.0.7 ensures that the logging
subsystem is initialized before the DNS attempt is made.
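The ordering problem described above can be modelled with a small, self-contained C sketch. It is only a model under stated assumptions: every name here (`key_alloc`, `key_valid`, `log_init`, `log_msg`) is hypothetical, the 1..100 key range is taken from the error message, and none of this is the actual nsd or nsthread code. A key that was never allocated stays 0, which is the value the "invalid key: 0" panic complains about.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

#define MAX_KEYS 100        /* upper bound taken from the error message */

static int next_key = 1;    /* key 0 is reserved: "never allocated" */
static int log_key = 0;     /* stays 0 until log_init() has run */

/* Hand out the next free key, or 0 if all keys are used up. */
static int
key_alloc(void)
{
    return (next_key <= MAX_KEYS) ? next_key++ : 0;
}

/* A key is usable only if it lies in the range 1..MAX_KEYS. */
static int
key_valid(int key)
{
    return key >= 1 && key <= MAX_KEYS;
}

/* Initialize the (modelled) logging subsystem; returns 1 on success. */
static int
log_init(void)
{
    if (!key_valid(log_key)) {
        log_key = key_alloc();
    }
    return key_valid(log_key);
}

/*
 * Write a message to stderr. Returns 1 if logging was initialized,
 * or 0 if the call came too early and the fallback prefix was used.
 * Checking the key first avoids aborting on an unallocated key.
 */
static int
log_msg(const char *fmt, ...)
{
    va_list ap;
    int ready = key_valid(log_key);

    assert(fmt != NULL);
    va_start(ap, fmt);
    fputs(ready ? "log: " : "early-startup: ", stderr);
    vfprintf(stderr, fmt, ap);
    fputc('\n', stderr);
    va_end(ap);
    return ready;
}
```

In this model, calling `log_msg` before `log_init` takes the fallback path, while calling `log_init` first (the ordering the fix establishes) makes the later DNS error go through the normal logging path.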