From: SourceForge.net <no...@so...> - 2007-10-18 19:20:21
Bugs item #1197183, was opened at 2005-05-07 06:26
Message generated for change (Comment added) made by gsaray101
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=112694&aid=1197183&group_id=12694

Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update.

Category: agent
Group: linux
Status: Closed
Resolution: Fixed
Priority: 5
Private: No
Submitted By: Roland Rosenfeld (roro)
Assigned to: Robert Story (rstory)
Summary: ssRawCpu* stays on 2^32-1 instead of rolling over

Initial Comment:
Hi!

In UCD-SNMP-MIB there are some ssCpuRaw* objects defined under systemStats. I use these for RRD graphs and noticed on one machine, after several months without a reboot, that ssCpuRawIdle reached 4294967295 (= 2^32-1) and stayed there instead of rolling over and starting from 0 again, as Counter32 (the type of ssCpuRawIdle) led me to expect.

I searched the code and found the problem in agent/mibgroup/ucd-snmp/vmstat.c, in the function getstat(), where /proc/stat is read and the line beginning with "cpu " is parsed using

    sscanf(b, "cpu %lu %lu %lu %lu %lu %lu %lu", cuse, cice, csys, cide, ciow, cirq, csoft)

This causes trouble as soon as one of the numbers reaches 2^32 or more: cide is then set to 4294967295 instead of being reduced modulo 2^32 (keeping only the remainder).

I created the attached patch, which uses sscanf with %llu instead of %lu and scans into unsigned long long variables that are cast to unsigned long at the end, which implements the 32-bit rollover.

My system is Debian sarge with net-snmp 5.1.2-6.1 on i386, Linux 2.6.

For the record, here is an example from plain 5.1.2-6.1:

$ snmpwalk -v2c -cxxxxx xxx.xxx.xx.xx 1.3.6.1.4.1.2021.11
[...]
UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 103338508
UCD-SNMP-MIB::ssCpuRawNice.0 = Counter32: 712456
UCD-SNMP-MIB::ssCpuRawSystem.0 = Counter32: 21972438
UCD-SNMP-MIB::ssCpuRawIdle.0 = Counter32: 4294967295 <---
UCD-SNMP-MIB::ssCpuRawWait.0 = Counter32: 1736944
UCD-SNMP-MIB::ssCpuRawKernel.0 = Counter32: 6566013
UCD-SNMP-MIB::ssCpuRawInterrupt.0 = Counter32: 4044025
[...]

And after applying my patch:

UCD-SNMP-MIB::ssCpuRawUser.0 = Counter32: 103339140
UCD-SNMP-MIB::ssCpuRawNice.0 = Counter32: 712456
UCD-SNMP-MIB::ssCpuRawSystem.0 = Counter32: 21973413
UCD-SNMP-MIB::ssCpuRawIdle.0 = Counter32: 30284027 <---
UCD-SNMP-MIB::ssCpuRawWait.0 = Counter32: 1737308
UCD-SNMP-MIB::ssCpuRawKernel.0 = Counter32: 6566223
UCD-SNMP-MIB::ssCpuRawInterrupt.0 = Counter32: 4044197

Tschoeeee

Roland

----------------------------------------------------------------------

Comment By: mike (gsaray101)
Date: 2007-10-18 14:20

Message:
Logged In: YES
user_id=1916458
Originator: NO

How do I apply this patch? Do I just replace the existing vmstat.c file with vmstat.32bit-overflow.patch?

----------------------------------------------------------------------

Comment By: Robert Story (rstory)
Date: 2005-07-13 13:43

Message:
Logged In: YES
user_id=76148

similar fix for contexts/intr..

----------------------------------------------------------------------

Comment By: Robert Story (rstory)
Date: 2005-06-14 11:01

Message:
Logged In: YES
user_id=76148

OK, sounds reasonable. Thanks for the patch. Applied for future releases 5.1.3, 5.2.2, 5.3 and later.

----------------------------------------------------------------------

Comment By: Roland Rosenfeld (roro)
Date: 2005-06-09 17:27

Message:
Logged In: YES
user_id=43129

> is this a 64bit machine?

No, I noticed this on several i686 (Intel Dual-Xeon) machines. It usually happens after approximately half a year of uptime.
Just look at the fourth number of the "cpu" line of /proc/stat on a long-running machine and you will notice that this number gets bigger than 2^32:

$ cat /proc/stat | grep ^cpu
cpu 267786244 1372531 123062669 5029648046 147238197 12805429 28215747
cpu0 173394626 723196 70883219 1022608556 95121311 12805429 26995857
cpu1 27993246 176534 16765769 1339833360 17366789 0 396513
cpu2 26236288 180268 15623795 1344113091 16010802 0 367982
cpu3 40162082 292531 19789884 1323093037 18739293 0 455393

(The above example is from a 2.6.10 kernel; on 2.4 you will have fewer columns.)

I don't like to reboot the machines every half year, but my attached patch has now solved the problem on half a dozen machines without rebooting them.

----------------------------------------------------------------------

Comment By: Robert Story (rstory)
Date: 2005-06-09 16:54

Message:
Logged In: YES
user_id=76148

is this a 64bit machine?

----------------------------------------------------------------------
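
For readers following the thread, the approach described in the initial comment (scan with %llu into 64-bit variables, then keep only the low 32 bits) can be sketched in isolation as below. This is only an illustration, not the attached patch or the actual vmstat.c code: parse_cpu_line is a hypothetical helper, and it stores the results in uint32_t rather than unsigned long (an assumption made here so that the wraparound is the same on 32-bit and 64-bit machines).

```c
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of net-snmp): parse the aggregate "cpu "
 * line of /proc/stat. Each field is first read as a 64-bit value, then
 * converted to uint32_t, which reduces it modulo 2^32 so that it rolls
 * over the way a Counter32 is expected to. Returns the number of fields
 * that sscanf converted (7 for a 2.6-style line).
 */
int parse_cpu_line(const char *line, uint32_t out[7])
{
    unsigned long long v[7] = {0};
    int n = sscanf(line, "cpu %llu %llu %llu %llu %llu %llu %llu",
                   &v[0], &v[1], &v[2], &v[3], &v[4], &v[5], &v[6]);

    for (int i = 0; i < 7; i++)
        out[i] = (uint32_t)v[i];   /* keeps only the low 32 bits */

    return n;
}
```

Fed the "cpu" line quoted in the 2005-06-09 comment, the fourth field 5029648046 comes out as 5029648046 - 4294967296 = 734680750, while the fields below 2^32 pass through unchanged.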