The snmpwalk command hits a connection timeout after polling the hrStorage MIB, and after that snmpd cannot answer any query for a while.
SNMPv2-MIB::sysName.0 = STRING: ae1dwnaugdbs
HOST-RESOURCES-MIB::hrMemorySize.0 = INTEGER: 12139840 KBytes
HOST-RESOURCES-MIB::hrStorageIndex.1 = INTEGER: 1
HOST-RESOURCES-MIB::hrStorageIndex.3 = INTEGER: 3
HOST-RESOURCES-MIB::hrStorageIndex.6 = INTEGER: 6
HOST-RESOURCES-MIB::hrStorageIndex.7 = INTEGER: 7
HOST-RESOURCES-MIB::hrStorageIndex.8 = INTEGER: 8
HOST-RESOURCES-MIB::hrStorageIndex.10 = INTEGER: 10
Timeout: No Response from test.example.com
Timeout: No Response from test.example.com
==> After a query of hrStorage, every query fails with a timeout for a while.
While the problem was happening, I saw the following statfs() errors for many autofs mount points on the snmpd server.
1621 11:28:59.875371 write(3, "Cannot statfs /informatics/data/sdd\n: Interrupted system call\n", 62) = 62 <0.000006>
1621 11:29:00.775544 write(3, "Cannot statfs /informatics/data/pubchem\n: No such file or directory\n", 68) = 68 <0.000005>
1621 11:29:01.507282 write(3, "Cannot statfs /.bixdbs/genbank\n: No such file or directory\n", 59) = 59 <0.000005>
1621 11:29:02.279816 write(3, "Cannot statfs /informatics/tools\n: No such file or directory\n", 61) = 61 <0.000005>
1621 11:29:03.655527 write(3, "Cannot statfs /sg/win_lsb_groups\n: No such file or directory\n", 61) = 61 <0.000005>
1621 11:29:04.394058 write(3, "Cannot statfs /informatics/data/apps\n: No such file or directory\n", 65) = 65 <0.000005>
1621 11:29:04.613188 write(3, "Cannot statfs /software/prod\n: No such file or directory\n", 57) = 57 <0.000060>
This machine has many autofs entries in /etc/mtab, and it took a long time to finish statfs() for them. While snmpd was doing statfs(), it could not respond to any query from clients.
I think the best option is to skip statfs() for autofs by default.
skipNFSInHostResources didn't resolve this problem because autofs is not recognized as a remote file system in _fsys_remote().
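To make the idea of skipping autofs entries concrete, here is a minimal sketch, assuming a Linux/glibc system where `setmntent()` and `getmntent()` from `<mntent.h>` are available. It only classifies mount-table entries by type and never calls statfs(). The function name `count_autofs_entries` is hypothetical and this is not the net-snmp code or the attached patch.

```c
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of net-snmp): count the entries of type
 * "autofs" in a mount table file such as /etc/mtab.
 * Returns -1 if the file cannot be opened.
 */
int count_autofs_entries(const char *mtab_path)
{
    FILE *fp;
    struct mntent *ent;
    int count = 0;

    assert(mtab_path != NULL);

    fp = setmntent(mtab_path, "r");
    if (fp == NULL)
        return -1;

    while ((ent = getmntent(fp)) != NULL) {
        /* An agent that wants to avoid blocking would skip statfs() here. */
        if (strcmp(ent->mnt_type, "autofs") == 0)
            count++;
    }

    endmntent(fp);
    return count;
}
```

Calling `count_autofs_entries("/etc/mtab")` on an affected host would show how many mount points a type-based check could skip without touching the automounter.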
The attached patch should skip autofs entries by default. (Functionality confirmed by several testers.)
Do you think this behaviour should be part of net-snmp?
Thanks for your opinions.
Thanks for the patch! A modified version of this patch has been applied. Please retest.
As mentioned in https://sourceforge.net/p/net-snmp/bugs/2968/ we're facing a similar issue in the Ubuntu packages.
Could you please point out the commit ID of the modified version of this patch so I can retest?
Thanks!
The commit ID is as follows: cf41e6e91015 ("HOST-MIB: Skip autofs entries").
Thanks very much for this. I retested and confirmed that the modified patch resolves the issue.
I think it introduced a regression.
snmpd 5.7.3+dfsg-1ubuntu4.2 on Ubuntu 16.04.06 LTS (before this patch) worked like this:
snmp 5.7.3+dfsg-1ubuntu4.3 on Ubuntu 16.04.06 LTS (with this patch) works like this:
I miss / and /boot entries ;) Both are ext filesystems, mounted via /etc/fstab, no autofs mounts.
Last edit: Lukasz Wasikowski 2019-09-06
We see the same regression, where all file systems are missing from the above snmpwalk, with:
snmp 5.7.3+dfsg-1ubuntu4.3 on Ubuntu 16.04.06 LTS
snmp 5.7.3+dfsg-1.8ubuntu3.2 on Ubuntu 18.04.3 LTS
I filed a regression bug in launchpad:
https://bugs.launchpad.net/ubuntu/+source/net-snmp/+bug/1843036
$ git show a0df31c18
commit a0df31c18c513a0d79f4d526b1af7fad48748e57
Author: Bart Van Assche <bvanassche@acm.org>
Date: Fri Jul 26 21:40:12 2019 -0700
diff --git a/agent/mibgroup/host/hrh_storage.c b/agent/mibgroup/host/hrh_storage.c
index 6f8ff6c53..c7c53922a 100644
--- a/agent/mibgroup/host/hrh_storage.c
+++ b/agent/mibgroup/host/hrh_storage.c
@@ -371,7 +371,7 @@ really_try_next:
NETSNMP_DS_AGENT_SKIPNFSINHOSTRESOURCES) &&
Check_HR_FileSys_NFS())
return NULL;
- if (Check_HR_FileSys_AutoFs())
+ if (HRFS_entry && Check_HR_FileSys_AutoFs())
return NULL;
if (store_idx <= NETSNMP_MEM_TYPE_MAX ) {
mem = (netsnmp_memory_info*)ptr;
Disregard, the fix above is already found in Ubuntu net-snmp:
agent/mibgroup/host/hrh_storage.c
375 if (HRFS_entry && Check_HR_FileSys_AutoFs())
376 return NULL;
I'll investigate further and do some code inspection. I will keep you posted if I find anything.
getfsstat doesn't seem to exist on Linux:
https://www.freebsd.org/cgi/man.cgi?query=getfsstat&sektion=2
https://ubuntuforums.org/showthread.php?t=1227025
It looks like another method of checking the fs needs to be added for Linux.
And shouldn't return 1; be inside the if block?
buildlog:
checking for library containing getfsstat... no
checking for getfsstat... no
https://launchpadlibrarian.net/440100268/buildlog_ubuntu-bionic-amd64.net-snmp_5.7.3+dfsg-1.8ubuntu3.2_BUILDING.txt.gz
So it seems the whole HAVE_GETFSSTAT preprocessor block is ignored, and the result is just this return 1.
I haven't tested it (other than with gcc -E), but this should do the trick:
Otherwise the #else branch never gets reached:
IMHO the #endif is wrongly placed, so the #else isn't doing what it is supposed to do.
Instead of being an else for HAVE_GETFSSTAT, it is an else for MNTTYPE_AUTOFS.
So the case where getfsstat is not supported never gets triggered on Linux, where getfsstat doesn't exist.
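The failure mode described above can be sketched in isolation. This is an illustration only, not the actual net-snmp source: the macro `DEMO_HAVE_GETFSSTAT` and both function names are hypothetical stand-ins. It shows how an `if` that lives entirely inside a preprocessor guard disappears when the guard macro is undefined, leaving an unconditional `return 1;`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for HAVE_GETFSSTAT. It is deliberately left
 * undefined, mirroring a platform where getfsstat was not detected. */
/* #define DEMO_HAVE_GETFSSTAT 1 */

/* Broken shape: the whole condition sits inside the guard, so without the
 * macro only the unconditional "return 1;" is compiled. */
int demo_is_autofs_broken(const char *fs_type)
{
    (void)fs_type;              /* unused when the guard is compiled out */
#ifdef DEMO_HAVE_GETFSSTAT
    if (fs_type != NULL && strcmp(fs_type, "autofs") == 0)
#endif
        return 1;               /* reported as autofs, even for ext4 */
    return 0;                   /* unreachable when the guard is compiled out */
}

/* Guard-free shape: the NULL check and comparison are always compiled. */
int demo_is_autofs_fixed(const char *fs_type)
{
    return fs_type != NULL && strcmp(fs_type, "autofs") == 0;
}
```

With the macro undefined, `demo_is_autofs_broken("ext4")` returns 1, which corresponds to every file system being treated as autofs and skipped. The guard-free variant distinguishes the two cases regardless of how the build was configured.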
Last edit: Eric 2019-09-06
If one is willing to try that test package:
sudo add-apt-repository ppa:slashd/lp1843036
sudo apt-get update
and let me know the outcome, it will be appreciated.
After reconsideration, even better:
We simply drop the HAVE_GETFSSTAT condition. It is good for both scenarios (systems with and without getfsstat) to check that the pointer is not NULL anyway.
package to test: 5.7.3+dfsg-1.8ubuntu3.2+testpkg20190906b2
The current Ubuntu build now doesn't show ANY physical mounts.
https://bugs.launchpad.net/ubuntu/+source/net-snmp/+bug/1842924
Due to a comparison:
~~~
/* Skip AUTOFS entries */
if (entry->type == NETSNMP_FS_TYPE_AUTOFS)
    continue;
~~~
Once you look at how the #define is written, you realize the | is going to make this if TRUE all the time.
The attached patch puts () around the #defines.
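The precedence problem behind this can be reproduced in a few lines. The sketch below uses made-up macro names and values (`DEMO_FLAG_LOCAL`, `DEMO_TYPE_AUTOFS_*`), not the real net-snmp definitions. Because `==` binds tighter than `|` in C, an unparenthesized macro turns `type == A | B` into `(type == A) | B`, which is non-zero whenever `B` is non-zero.

```c
#include <assert.h>

/* Hypothetical values for illustration only. */
#define DEMO_FLAG_LOCAL          0x1000
#define DEMO_TYPE_AUTOFS_UNSAFE  18 | DEMO_FLAG_LOCAL      /* no parentheses */
#define DEMO_TYPE_AUTOFS_SAFE    (18 | DEMO_FLAG_LOCAL)    /* parenthesized  */

/* Expands to (type == 18) | 0x1000, which is always non-zero.
 * A compiler may warn about this line; that is the point of the demo. */
int demo_matches_unsafe(int type)
{
    return (type == DEMO_TYPE_AUTOFS_UNSAFE) ? 1 : 0;
}

/* Expands to type == (18 | 0x1000), the intended comparison. */
int demo_matches_safe(int type)
{
    return (type == DEMO_TYPE_AUTOFS_SAFE) ? 1 : 0;
}
```

With the unparenthesized macro every entry "matches" autofs, so a loop that does `continue` on a match skips every file system. Wrapping the macro body in parentheses restores the intended comparison.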
When I tested the above, I had also independently made the same change to Check_HR_FileSys_AutoFs - so that tested out fine on Ubuntu.
I can test a PPA for you if you want to roll the above patch in too.
It's already there upstream:
The only thing missing upstream would then be to remove the HAVE_GETFSSTAT preprocessor directive.
Last edit: Eric 2019-09-06
A candidate fix has been applied on the v5.8 and master branches ([bcb1a6b8afc4]). Please retest.
Thanks Bart Van Assche !
The Ubuntu source package of 'net-snmp' is being updated to include the following commit, which fixes the problem: 71e487212bd65839e7454df9701524d08cf0d74f.
To follow the progress, please see: https://bugs.launchpad.net/ubuntu/+source/net-snmp/+bug/1843036
Thanks everyone !