From: SourceForge.net <no...@so...> - 2007-11-30 23:13:18
Bugs item #1831901, was opened at 2007-11-14 18:03
Message generated for change (Comment added) made by tanders
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=112694&aid=1831901&group_id=12694

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: agent
Group: linux
>Status: Pending
Resolution: None
Priority: 5
Private: No
Submitted By: Nobody/Anonymous (nobody)
Assigned to: Nobody/Anonymous (nobody)
Summary: segfault snmpd 5.4.1 on Linux 64bits

Initial Comment:
The snmpd agent has crashed several times on two 64-bit Linux machines (OpenSuse 10.3 - Linux 2.6.22.12-0.1-default #1 SMP x86_64 GNU/Linux).

I have configured some external objects with the exec directive, which runs a shell script to get some data from a running application.

I can query the first four objects; when I get to the fifth, the snmpd agent crashes with a segfault error.

Here is the run call with the error:

host0:~ # /usr/sbin/snmpd -r -A -f -Le -D -p /var/run/snmpd.pid &> snmpd.out
Segmentation fault
host0:~ #

Attached is the debug log file of the snmpd agent (gzipped).

----------------------------------------------------------------------

>Comment By: Thomas Anders (tanders)
Date: 2007-12-01 00:13

Message:
Logged In: YES
user_id=848638
Originator: NO

Shouldn't the 4th entry start with "exec jmxmemheapused" instead? It looks like you have duplicate NAMEs for exec statements, which is a bad idea, even though it should be handled more gracefully.

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2007-11-30 13:08

Message:
Logged In: NO

syslocation *******************
syscontact *****************
rocommunity *********************
rocommunity *********************
master no
agentuser nobody
agentgroup nobody
sysservices 78
exec jmxthread /bin/sh /home/hoplon/bin/getJMXData.sh Thread:Global:ThreadCount
exec jmxdaemonthread /bin/sh /home/hoplon/bin/getJMXData.sh Thread:Global:DaemonThreadCount
exec jmxmemheapcom /bin/sh /home/hoplon/bin/getJMXData.sh Memory:HeapMemoryUsage:Committed
exec jmxmemheapcom /bin/sh /home/hoplon/bin/getJMXData.sh Memory:HeapMemoryUsage:Used_MB
exec jmxmemheapmax /bin/sh /home/hoplon/bin/getJMXData.sh Memory:HeapMemoryUsage:Max_MB
exec jmxmemnoheapcom /bin/sh /home/hoplon/bin/getJMXData.sh Memory:NonHeapMemoryUsage:Committed
exec jmxmemnonheapused /bin/sh /home/hoplon/bin/getJMXData.sh Memory:NonHeapMemoryUsage:Used_MB
exec jmxmemnonheapused /bin/sh /home/hoplon/bin/getJMXData.sh Memory:NonHeapMemoryUsage:Max_MB

----------------------------------------------------------------------

Comment By: Thomas Anders (tanders)
Date: 2007-11-27 13:47

Message:
Logged In: YES
user_id=848638
Originator: NO

Please post the content of your snmpd.conf (obfuscating passwords/communities is fine).

----------------------------------------------------------------------

Comment By: Rafael Henchen (rhenchen)
Date: 2007-11-27 13:22

Message:
Logged In: YES
user_id=1937359
Originator: NO

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2afef242d270 (LWP 24731)]
0x00002afeef49219d in var_extensible_old () from /usr/lib64/libnetsnmpmibs.so.15
(gdb) bt
#0 0x00002afeef49219d in var_extensible_old () from /usr/lib64/libnetsnmpmibs.so.15
#1 0x00002afeef231428 in netsnmp_old_api_helper () from /usr/lib64/libnetsnmphelpers.so.15
#2 0x00002afeeeff516c in netsnmp_call_handlers () from /usr/lib64/libnetsnmpagent.so.15
#3 0x00002afeeefe5345 in handle_var_requests () from /usr/lib64/libnetsnmpagent.so.15
#4 0x00002afeeefe6fce in handle_pdu () from /usr/lib64/libnetsnmpagent.so.15
#5 0x00002afeeefe93b9 in netsnmp_handle_request () from /usr/lib64/libnetsnmpagent.so.15
#6 0x00002afeeefe985e in handle_snmp_packet () from /usr/lib64/libnetsnmpagent.so.15
#7 0x00002afeeff80def in ?? () from /usr/lib64/libnetsnmp.so.15
#8 0x00002afeeff81784 in _sess_read () from /usr/lib64/libnetsnmp.so.15
#9 0x00002afeeff8228d in snmp_sess_read () from /usr/lib64/libnetsnmp.so.15
#10 0x00002afeeff82303 in snmp_read () from /usr/lib64/libnetsnmp.so.15
#11 0x00005555555590d0 in main () from /usr/sbin/snmpd
(gdb)

----------------------------------------------------------------------

Comment By: Thomas Anders (tanders)
Date: 2007-11-18 22:02

Message:
Logged In: YES
user_id=848638
Originator: NO

Can you please type "bt" at the gdb prompt to get us a full backtrace?

----------------------------------------------------------------------

Comment By: Rafael Henchen (rhenchen)
Date: 2007-11-16 13:38

Message:
Logged In: YES
user_id=1937359
Originator: NO

I noticed the "duplicate table data attempted to be entered. row exists" warning message only under gdb.

Looking closely at my snmpd.conf, I discovered that the fourth and fifth declared objects have the same name.

It looks like bug #1020597 was not fixed properly in version 5.1.2-p02, as stated in the ERRATA.
It looks like before the fix the segfault happened during the "registration" of objects, and now the agent crashes because the "get" is iterating over this accepted duplicate item!

I think this could easily be fixed, as a first step, by printing an adequate error message and not letting the agent start in that case.

----------------------------------------------------------------------

Comment By: Rafael Henchen (rhenchen)
Date: 2007-11-16 13:03

Message:
Logged In: YES
user_id=1937359
Originator: NO

This was the output that I got under gdb:

(gdb) run -f -Lo
The program being debugged has been started already.
Start it from the beginning? (y or n) y

Starting program: /usr/sbin/snmpd -f -Lo
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
[Thread debugging using libthread_db enabled]
[New Thread 0x2ba26b94e270 (LWP 25638)]
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)
netsnmp_assert !"registration != duplicate" failed agent_registry.c:535 netsnmp_subtree_load()
netsnmp_assert !"registration != duplicate" failed agent_registry.c:535 netsnmp_subtree_load()
netsnmp_assert !"registration != duplicate" failed agent_registry.c:535 netsnmp_subtree_load()
duplicate table data attempted to be entered. row exists
NET-SNMP version 5.4.1
Connection from UDP: [127.0.0.1]:32818
Connection from UDP: [127.0.0.1]:32818

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x2ba26b94e270 (LWP 25638)]
0x00002ba2689b319d in var_extensible_old () from /usr/lib64/libnetsnmpmibs.so.15
(gdb)

----------------------------------------------------------------------

Comment By: Thomas Anders (tanders)
Date: 2007-11-14 18:10

Message:
Logged In: YES
user_id=848638
Originator: NO

Can you get us a gdb backtrace? See the Net-SNMP Wiki section "Debugging" for instructions.

----------------------------------------------------------------------
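
The thread traces the crash to two exec directives sharing the same NAME in snmpd.conf. Until the agent rejects such a configuration itself, a check along these lines could be run before starting snmpd. This is only a minimal sketch: the function name find_duplicate_exec_names is made up for illustration, and the parsing assumes the plain "exec NAME PROG ARGS" form used in the configuration posted above (no leading MIB OID, "#" starting a comment).

```python
import sys
from collections import defaultdict


def find_duplicate_exec_names(conf_text):
    """Return {name: [line numbers]} for exec NAMEs used more than once.

    Hypothetical helper, not part of Net-SNMP. Assumes each directive
    has the form "exec NAME PROG [ARGS...]" with no leading MIB OID.
    """
    seen = defaultdict(list)
    for lineno, raw in enumerate(conf_text.splitlines(), start=1):
        # Drop anything after "#" (treated as a comment in this sketch).
        fields = raw.split("#", 1)[0].split()
        if len(fields) >= 3 and fields[0] == "exec":
            seen[fields[1]].append(lineno)
    return {name: lines for name, lines in seen.items() if len(lines) > 1}


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: check_exec_names.py /path/to/snmpd.conf")
    with open(sys.argv[1]) as handle:
        duplicates = find_duplicate_exec_names(handle.read())
    for name, lines in sorted(duplicates.items()):
        joined = ", ".join(str(n) for n in lines)
        print(f"duplicate exec NAME '{name}' on lines {joined}")
    # A non-zero exit status lets a start script refuse to launch snmpd.
    sys.exit(1 if duplicates else 0)
```

Run against the configuration in this report, such a check would flag both jmxmemheapcom and jmxmemnonheapused, each of which appears on two exec lines.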