From: SourceForge.net <no...@so...> - 2008-03-21 00:20:35
Bugs item #1794532, was opened at 2007-09-14 10:37
Message generated for change (Comment added) made by tanders
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=112694&aid=1794532&group_id=12694

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.

Category: agent
Group: None
Status: Open
Resolution: None
Priority: 6
Private: No
Submitted By: Matthias Saou (thias)
Assigned to: Robert Story (rstory)
Summary: 5.4.1 segfault in ipAddressTable_container_load

Initial Comment:
Hi,

I've just updated RHEL4 and RHEL5 servers to net-snmp 5.4.1, and have the exact same problem as described here on a few of them:
http://sourceforge.net/mailarchive/forum.php?thread_name=46C1A8D9.7040607%40softax.pl&forum_name=net-snmp-coders

I was previously running 5.4 fine, but it was spitting many useless messages to syslog, which 5.4.1 fixes on the servers where I tested it.

What the problematic servers have in common are "duplicate addresses": they either have VPN connections established (which are P-t-P with the same address on the local side), or are Xen hosts with many vif and veth interfaces.

The other thing that strikes me is that no 64bit servers seem affected, only some 32bit ones. The segfault happens almost immediately after boot.
All I get in /var/log/messages is:

Sep 14 10:24:20 polar snmpd[31612]: netsnmp_assert !"registration != duplicate" failed agent_registry.c:535 netsnmp_subtree_load()
Sep 14 10:24:20 polar last message repeated 2 times
Sep 14 10:24:20 polar snmpd[31612]: Duplicate IP address detected, some interfaces may not be visible in IP-MIB
Sep 14 10:24:20 polar snmpd[31612]: Duplicate IP address detected, some interfaces may not be visible in IP-MIB

Backtrace:

(gdb) bt
#0 ipAddressTable_container_load (container=0x9907708) at ip-mib/ipAddressTable/ipAddressTable_data_access.c:347
#1 0x0016779d in ipAddressTable_container_init (container_ptr_ptr=0x24c5e0, cache=0x9906ec8) at ip-mib/ipAddressTable/ipAddressTable_data_access.c:137
#2 0x00162e9e in _ipAddressTable_initialize_interface (reg_ptr=0x0, flags=0) at ip-mib/ipAddressTable/ipAddressTable_interface.c:2003
#3 0x0014aceb in initialize_table_ipAddressTable () at ip-mib/ipAddressTable/ipAddressTable.c:104
#4 0x0014ad7a in init_ipAddressTable () at ip-mib/ipAddressTable/ipAddressTable.c:56
#5 0x001fd09a in init_mib_modules () at ../agent/mibgroup/mib_module_inits.h:20
#6 0x004290ef in main (argc=8, argv=0xbf969454) at snmpd.c:901

Please let me know if more information could be useful.

----------------------------------------------------------------------

>Comment By: Thomas Anders (tanders)
Date: 2008-03-21 01:20

Message:
Logged In: YES
user_id=848638
Originator: NO

Latest 5.4.x SVN is expected to contain a fix. Could you give it a try?

----------------------------------------------------------------------

Comment By: John (jforsyth)
Date: 2008-03-20 17:01

Message:
Logged In: YES
user_id=2041593
Originator: NO

I have also been suffering from the same issue of segfault due to duplicate IP addresses. I have applied diff.zones-541.pat and found that it resolves the issue. But, as stated, all interfaces are set to ipv4z InetAddressType. Is there any update on a patch which sets the correct type? Thank you.
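
For context on the ipv4z type mentioned above: RFC 4001 defines InetAddressIPv4z as the 4 address octets followed by a 4-octet zone index, both in network byte order. The sketch below is only an illustration of why adding a zone makes otherwise identical addresses distinct as index values. It is not taken from diff.zones-541.pat. The helper name make_ipv4z, the address 10.8.0.1, and the use of the interface index as the zone index are assumptions made for the example; 100 and 101 are the ifr_index values that the strace output in this thread reports for tun44 and tun6.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define IPV4Z_LEN 8 /* 4 address octets + 4 zone-index octets (RFC 4001) */

/*
 * Hypothetical helper: build an InetAddressIPv4z value from a dotted-quad
 * string and a zone index. Using the interface index as the zone index is
 * an assumption of this sketch, not something confirmed by the patch.
 * Returns 0 on success, -1 if the address string cannot be parsed.
 */
static int make_ipv4z(const char *dotted, uint32_t zone,
                      unsigned char out[IPV4Z_LEN])
{
    struct in_addr addr;
    uint32_t zone_be = htonl(zone); /* zone index in network byte order */

    assert(dotted != NULL && out != NULL);
    if (inet_pton(AF_INET, dotted, &addr) != 1)
        return -1;
    memcpy(out, &addr.s_addr, 4);   /* s_addr is already network order */
    memcpy(out + 4, &zone_be, 4);
    return 0;
}

#ifdef IPV4Z_DEMO_MAIN
/* Build with -DIPV4Z_DEMO_MAIN to get a small standalone demo. */
int main(void)
{
    unsigned char a[IPV4Z_LEN], b[IPV4Z_LEN];

    /* Same (made-up) local address on two point-to-point tunnels. */
    if (make_ipv4z("10.8.0.1", 100, a) != 0 ||
        make_ipv4z("10.8.0.1", 101, b) != 0)
        return 1;

    printf("plain IPv4 keys equal: %s\n",
           memcmp(a, b, 4) == 0 ? "yes" : "no");
    printf("ipv4z keys equal:      %s\n",
           memcmp(a, b, IPV4Z_LEN) == 0 ? "yes" : "no");
    return 0;
}
#endif
```

With a plain 4-octet ipv4 index the two tunnel addresses collide; with the 8-octet ipv4z form they differ in the zone octets. That also explains the side effect noted above: every row then reports the ipv4z type, whether or not its address is actually duplicated.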
----------------------------------------------------------------------

Comment By: Thomas Anders (tanders)
Date: 2007-10-02 13:09

Message:
Logged In: YES
user_id=848638
Originator: NO

For the registration warnings on startup, apply official patch 1805971:
http://sf.net/support/tracker.php?aid=1805971

----------------------------------------------------------------------

Comment By: Matthias Saou (thias)
Date: 2007-10-02 12:50

Message:
Logged In: YES
user_id=34811
Originator: YES

With the zones patch applied, snmpd 5.4.1 is now running, thanks!

When I restart it, I do get this message 3 times, but it's working fine for what I'm graphing:

[...]
Oct 2 12:44:04 polar snmpd[10261]: Received TERM or STOP signal... shutting down...
Oct 2 12:44:04 polar snmpd[10299]: netsnmp_assert !"registration != duplicate" failed agent_registry.c:535 netsnmp_subtree_load()
Oct 2 12:44:04 polar last message repeated 2 times
Oct 2 12:44:05 polar snmpd[10299]: NET-SNMP version 5.4.1

----------------------------------------------------------------------

Comment By: Robert Story (rstory)
Date: 2007-10-01 19:59

Message:
Logged In: YES
user_id=76148
Originator: NO

ok, diff.zones-541.pat is an experimental patch which will use 'zones' for all addresses, duplicate or not. If this stops the crashes, then I can look at only using zones when there are duplicates...
File Added: diff.zones-541.pat

----------------------------------------------------------------------

Comment By: Matthias Saou (thias)
Date: 2007-10-01 18:38

Message:
Logged In: YES
user_id=34811
Originator: YES

File Added: dev.txt

----------------------------------------------------------------------

Comment By: Matthias Saou (thias)
Date: 2007-10-01 18:37

Message:
Logged In: YES
user_id=34811
Originator: YES

File Added: if_inet6.txt

----------------------------------------------------------------------

Comment By: Nobody/Anonymous (nobody)
Date: 2007-10-01 18:32

Message:
Logged In: NO

Can you attach copies of your /proc/net/if_inet6 and /proc/net/dev? Or email them to rstory at users dot sourceforge dot net.

----------------------------------------------------------------------

Comment By: Matthias Saou (thias)
Date: 2007-10-01 17:34

Message:
Logged In: YES
user_id=34811
Originator: YES

I just tested with the patch against 5.4.1, which now applies fine, but I still get a segfault. Here is the backtrace:

(gdb) bt
#0 0x0068756f in ipAddressTable_container_load (container=0x84cd1c8) at ip-mib/ipAddressTable/ipAddressTable_data_access.c:359
#1 0x0068779d in ipAddressTable_container_init (container_ptr_ptr=0x76c5e0, cache=0x84cc988) at ip-mib/ipAddressTable/ipAddressTable_data_access.c:137
#2 0x00682e9e in _ipAddressTable_initialize_interface (reg_ptr=0x0, flags=0) at ip-mib/ipAddressTable/ipAddressTable_interface.c:2003
#3 0x0066aceb in initialize_table_ipAddressTable () at ip-mib/ipAddressTable/ipAddressTable.c:104
#4 0x0066ad7a in init_ipAddressTable () at ip-mib/ipAddressTable/ipAddressTable.c:56
#5 0x0071d09a in init_mib_modules () at ../agent/mibgroup/mib_module_inits.h:20
#6 0x00a470ef in main (argc=2, argv=0xbfed9a14) at snmpd.c:901

It looks like it's pretty much the same :-(

Note that this segfault happens very early when I start snmpd, unlike other reports for which the segfault happens after snmpd has been running for a while.
Here's what strace shows just before the segfault:

[...]
ioctl(7, SIOCGIFINDEX, {ifr_name="lo", ifr_index=1}) = 0
ioctl(7, SIOCGIFNETMASK, {ifr_name="lo", ifr_netmask={AF_INET, inet_addr("255.0.0.0")}}) = 0
ioctl(7, SIOCGIFFLAGS, {ifr_name="lo", ifr_flags=IFF_UP|IFF_LOOPBACK|IFF_RUNNING}) = 0
ioctl(7, SIOCGIFINDEX, {ifr_name="eth0", ifr_index=2}) = 0
ioctl(7, SIOCGIFNETMASK, {ifr_name="eth0", ifr_netmask={AF_INET, inet_addr("255.255.255.0")}}) = 0
ioctl(7, SIOCGIFFLAGS, {ifr_name="eth0", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0
ioctl(7, SIOCGIFINDEX, {ifr_name="lanbr0", ifr_index=5}) = 0
ioctl(7, SIOCGIFNETMASK, {ifr_name="lanbr0", ifr_netmask={AF_INET, inet_addr("255.255.255.0")}}) = 0
ioctl(7, SIOCGIFFLAGS, {ifr_name="lanbr0", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0
ioctl(7, SIOCGIFINDEX, {ifr_name="lanbr0", ifr_index=5}) = 0
ioctl(7, SIOCGIFNETMASK, {ifr_name="lanbr0", ifr_netmask={AF_INET, inet_addr("255.255.255.0")}}) = 0
ioctl(7, SIOCGIFFLAGS, {ifr_name="lanbr0", ifr_flags=IFF_UP|IFF_BROADCAST|IFF_RUNNING|IFF_MULTICAST}) = 0
ioctl(7, SIOCGIFINDEX, {ifr_name="tun44", ifr_index=100}) = 0
ioctl(7, SIOCGIFNETMASK, {ifr_name="tun44", ifr_netmask={AF_INET, inet_addr("255.255.255.255")}}) = 0
ioctl(7, SIOCGIFFLAGS, {ifr_name="tun44", ifr_flags=IFF_UP|IFF_POINTOPOINT|IFF_RUNNING|IFF_NOARP|IFF_MULTICAST}) = 0
write(3, "Duplicate IP address detected, s"..., 76) = 76
ioctl(7, SIOCGIFINDEX, {ifr_name="tun6", ifr_index=101}) = 0
ioctl(7, SIOCGIFNETMASK, {ifr_name="tun6", ifr_netmask={AF_INET, inet_addr("255.255.255.255")}}) = 0
ioctl(7, SIOCGIFFLAGS, {ifr_name="tun6", ifr_flags=IFF_UP|IFF_POINTOPOINT|IFF_RUNNING|IFF_NOARP|IFF_MULTICAST}) = 0
ioctl(7, SIOCGIFINDEX, {ifr_name="tun39", ifr_index=102}) = 0
ioctl(7, SIOCGIFNETMASK, {ifr_name="tun39", ifr_netmask={AF_INET, inet_addr("255.255.255.255")}}) = 0
ioctl(7, SIOCGIFFLAGS, {ifr_name="tun39", ifr_flags=IFF_UP|IFF_POINTOPOINT|IFF_RUNNING|IFF_NOARP|IFF_MULTICAST}) = 0
ioctl(7, SIOCGIFINDEX, {ifr_name="tun2", ifr_index=103}) = 0
ioctl(7, SIOCGIFNETMASK, {ifr_name="tun2", ifr_netmask={AF_INET, inet_addr("255.255.255.255")}}) = 0
ioctl(7, SIOCGIFFLAGS, {ifr_name="tun2", ifr_flags=IFF_UP|IFF_POINTOPOINT|IFF_RUNNING|IFF_NOARP|IFF_MULTICAST}) = 0
ioctl(7, SIOCGIFINDEX, {ifr_name="tun7", ifr_index=104}) = 0
ioctl(7, SIOCGIFNETMASK, {ifr_name="tun7", ifr_netmask={AF_INET, inet_addr("255.255.255.255")}}) = 0
ioctl(7, SIOCGIFFLAGS, {ifr_name="tun7", ifr_flags=IFF_UP|IFF_POINTOPOINT|IFF_RUNNING|IFF_NOARP|IFF_MULTICAST}) = 0
ioctl(7, SIOCGIFINDEX, {ifr_name="tun45", ifr_index=105}) = 0
ioctl(7, SIOCGIFNETMASK, {ifr_name="tun45", ifr_netmask={AF_INET, inet_addr("255.255.255.255")}}) = 0
ioctl(7, SIOCGIFFLAGS, {ifr_name="tun45", ifr_flags=IFF_UP|IFF_POINTOPOINT|IFF_RUNNING|IFF_NOARP|IFF_MULTICAST}) = 0
close(7) = 0
open("/proc/net/if_inet6", O_RDONLY|O_LARGEFILE) = 7
fstat64(7, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f49000
read(7, "00000000000000000000000000000001"..., 4096) = 216
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 8
ioctl(8, SIOCGIFINDEX, {ifr_name="lo", ifr_index=1}) = 0
close(8) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 8
ioctl(8, SIOCGIFINDEX, {ifr_name="eth0", ifr_index=2}) = 0
close(8) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 8
ioctl(8, SIOCGIFINDEX, {ifr_name="lanbr0", ifr_index=5}) = 0
close(8) = 0
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 8
ioctl(8, SIOCGIFINDEX, {ifr_name="eth1", ifr_index=3}) = 0
close(8) = 0
write(3, "Duplicate IP address detected, s"..., 76) = 76
read(7, "", 4096) = 0
close(7) = 0
munmap(0xb7f49000, 4096) = 0
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
+++ killed by SIGSEGV (core dumped) +++
Process 30120 detached

----------------------------------------------------------------------

Comment By: Thomas Anders (tanders)
Date: 2007-09-28 15:41

Message:
Logged In: YES
user_id=848638
Originator: NO

The original patch is
against SVN trunk. I've attached another version of the patch that is against 5.4.1.

----------------------------------------------------------------------

Comment By: Thomas Anders (tanders)
Date: 2007-09-28 15:40

Message:
Logged In: YES
user_id=848638
Originator: NO

File Added: diff.ipaddress-patch-541

----------------------------------------------------------------------

Comment By: Matthias Saou (thias)
Date: 2007-09-26 18:10

Message:
Logged In: YES
user_id=34811
Originator: YES

Against what version is this patch supposed to apply? I've tried 5.4.1 and it seems quite far off...

[dude@python3 net-snmp-5.4.1]$ patch -p0 --dry-run < ../diff.ipaddress-patch
patching file agent/mibgroup/ip-mib/ipAddressTable/ipAddressTable_data_access.c
Reversed (or previously applied) patch detected! Assume -R? [n] n
Apply anyway? [n] n
Skipping patch.
1 out of 1 hunk ignored -- saving rejects to file agent/mibgroup/ip-mib/ipAddressTable/ipAddressTable_data_access.c.rej
patching file agent/mibgroup/ip-mib/data_access/ipaddress_ioctl.c
Hunk #1 FAILED at 318.
1 out of 1 hunk FAILED -- saving rejects to file agent/mibgroup/ip-mib/data_access/ipaddress_ioctl.c.rej
patching file agent/mibgroup/ip-mib/data_access/ipaddress_common.c
patching file agent/mibgroup/ip-mib/data_access/ipaddress_linux.c
Hunk #1 FAILED at 430.
1 out of 1 hunk FAILED -- saving rejects to file agent/mibgroup/ip-mib/data_access/ipaddress_linux.c.rej

----------------------------------------------------------------------

Comment By: Robert Story (rstory)
Date: 2007-09-25 22:23

Message:
Logged In: YES
user_id=76148
Originator: NO

can you try the attached patch?
File Added: diff.ipaddress-patch

----------------------------------------------------------------------

Comment By: Matthias Saou (thias)
Date: 2007-09-18 15:19

Message:
Logged In: YES
user_id=34811
Originator: YES

Possibly.
Of the 4 servers on which it happens for me, 3 are Xen hosts, and thus have some bridge interfaces automatically configured. The other one is an office gateway with a VPN setup requiring a bridge between a physical eth interface and a virtual tap interface.

But in my case, these are all RHEL5 i386 (32bit); I haven't seen the problem on any RHEL4 i386 nor RHEL5 x86_64. And net-snmp doesn't even start: the segfault happens very early when starting it. And as I already wrote, 5.4 was running fine on those same servers.

----------------------------------------------------------------------

Comment By: Thomas Anders (tanders)
Date: 2007-09-17 21:30

Message:
Logged In: YES
user_id=848638
Originator: NO

Probably related to / a dup of 1792716.

----------------------------------------------------------------------