From: Javier S. <ja...@jf...> - 2005-02-16 16:02:36
Reading your poller output, the code seems OK. The problem is with your RRDs.
The only way to get NaN is if there is no data (or no valid data) in the
period of time.

Javier

Kurta László wrote:
> Hi Javier!
> The RRDs are OK now, and the poller ran a few times with the constant 12
> return value. After 5 runs with the constant value I modified the poller to
> return the collected values, but it seems that the earlier mentioned
> interfaces give the collected values, and the earlier 'nan' ones give 12!
> I don't understand it at all :( How can it get the value 12?
>
> This is the poller code (sorry about the Hungarian variable names...):
>
> function poller_sssh ($options) {
>     $ip=$options['host_ip'];
>     $param=$options['poller_parameters'];
>     //the format is avail.<index>, or used.<index>
>     $pont=strpos($param,'.');
>     $melyik=substr($param,0,$pont);
>     $interface=substr($param,$pont+1);
>     $parancs="ssh lkurta@".$ip." 'df -k'";
>     $y=`$parancs`;
>     $sorok=explode("\n",$y);
>     $misor=$sorok[$interface];
>     $rekord=split('[ ]+',$misor);
>     if ($melyik=='avail')
>         $oszlop=3;
>     else
>         $oszlop=2;
>     $vissza=($rekord[$oszlop]+1)-1; //+- for numeric value
>     return $vissza;
>     //return 12;
> }
>
> Javier Szyszlican wrote:
>
>> If you are getting an RRD error, it is probable that you don't have
>> Interface Type FIELDS corresponding to the name of the RRD DS you are
>> using.
>>
>> Javier
>>
>> Kurta László wrote:
>>
>>> Hi Javier!
>>>
>>> I set the GAUGE type for the 'avail' and 'used' values.
>>> I deleted the rrds, but that does not help :( The permissions are
>>> correct, I have upgraded to 0.8.0 just now, and the last step was
>>> changing the ownership (chown), as in previous upgrades. I use Debian
>>> Woody, with the www-data Apache user. The other, originally shipped
>>> pollers and the grapher are working correctly.
>>>
>>> My poller is able to access the Solaris boxes only via ssh, with
>>> pre-generated null password DSA keys. (SNMP, etc. are not playing :()
>>> The poller runs a 'df -k' on the remote machine, and takes the
>>> corresponding column from the 'interface's' line of the output:
>>> host 10:
>>> www-data@rhein:/opt/jffnms/engine$ ssh lkurta@host10 df -k
>>> Filesystem            kbytes    used   avail capacity  Mounted on
>>> /dev/md/dsk/d10       494235   68496  376316    16%    /
>>> /dev/md/dsk/d40      2056211 1014897  979628    51%    /usr
>>> /proc                      0       0       0     0%    /proc
>>> fd                         0       0       0     0%    /dev/fd
>>> mnttab                     0       0       0     0%    /etc/mnttab
>>> /dev/md/dsk/d30       494235  190336  254476    43%    /var
>>> swap                 3263208      24 3263184     1%    /var/run
>>> swap                 3263528     344 3263184     1%    /tmp
>>> /dev/md/dsk/d50     11984956 7863085 4002022    67%    /opt
>>> /dev/md/dsk/d60       231795  139717   68899    67%    /export/home
>>> host 11:
>>> www-data@rhein:/opt/jffnms/engine$ ssh lkurta@host11 df -k
>>> Filesystem            kbytes    used   avail capacity  Mounted on
>>> /dev/md/dsk/d10      2053605 1903936   88061    96%    /
>>> /proc                      0       0       0     0%    /proc
>>> fd                         0       0       0     0%    /dev/fd
>>> mnttab                     0       0       0     0%    /etc/mnttab
>>> /dev/md/dsk/d30      4129290 2962450 1125548    73%    /var
>>> swap                 5409400      24 5409376     1%    /var/run
>>> swap                 5409392      16 5409376     1%    /tmp
>>> /dev/md/dsk/d40     10323610 2279287 7941087    23%    /opt
>>> /dev/md/dsk/d50     10323610 8651994 1568380    85%    /opt/oracle/data/oradata
>>> /dev/md/dsk/d60      4016614 2752870 1223578    70%    /opt/oracle/admin/reporter/arch
>>>
>>> As I mentioned earlier, from host10 the used:68496 value
>>> (/dev/md/dsk/d10), and from host11 the avail:88061 (/dev/md/dsk/d10) and
>>> avail:1568380 (/dev/md/dsk/d50) values go to the rrds correctly;
>>> the others show up as 'nan'. But the poller's log shows that the other
>>> values are fetched correctly, too!?
>>>
>>> Now I changed the poller's code to return a static integer (12).
>>> The poller's log shows this:
>>> rhein:/opt/jffnms/engine$ php -q poller.php 10
>>> 11:19:48 : H 10 : Poller Start : 20 Items.
>>> 11:19:48 : H 10 : I 107 : P 10 : sssh:avail(avail.1): 12 -> buffer(): 1 (time P:0.52 | 0.64)
>>> 11:19:48 : H 10 : I 108 : P 10 : sssh:avail(avail.2): 12 -> buffer(): 2 (time P:0.2 | 0.28)
>>> 11:19:48 : H 10 : I 109 : P 10 : sssh:avail(avail.6): 12 -> buffer(): 3 (time P:0.19 | 0.33)
>>> 11:19:48 : H 10 : I 110 : P 10 : sssh:avail(avail.9): 12 -> buffer(): 4 (time P:0.18 | 0.28)
>>> 11:19:48 : H 10 : I 111 : P 10 : sssh:avail(avail.10): 12 -> buffer(): 5 (time P:0.18 | 0.33)
>>> 11:19:48 : H 10 : I 107 : P 20 : sssh:used(used.1): 12 -> buffer(): 6 (time P:0.18 | 0.33)
>>> 11:19:48 : H 10 : I 108 : P 20 : sssh:used(used.2): 12 -> buffer(): 7 (time P:0.18 | 0.28)
>>> 11:19:48 : H 10 : I 109 : P 20 : sssh:used(used.6): 12 -> buffer(): 8 (time P:0.18 | 0.33)
>>> 11:19:48 : H 10 : I 110 : P 20 : sssh:used(used.9): 12 -> buffer(): 9 (time P:0.18 | 0.34)
>>> 11:19:48 : H 10 : I 111 : P 20 : sssh:used(used.10): 12 -> buffer(): 10 (time P:0.18 | 0.41)
>>> RRD error: 11:19:48 : H 10 : I 107 : P 30 : no_poller(): 0 -> rrd(*): 0 (time P:0.4 | 149.02)
>>> RRD error: 11:19:49 : H 10 : I 108 : P 30 : no_poller(): 0 -> rrd(*): 0 (time P:0.23 | 128.78)
>>> RRD error: 11:19:49 : H 10 : I 109 : P 30 : no_poller(): 0 -> rrd(*): 0 (time P:0.22 | 129.01)
>>> RRD error: 11:19:49 : H 10 : I 110 : P 30 : no_poller(): 0 -> rrd(*): 0 (time P:0.27 | 128.74)
>>> RRD error: 11:19:49 : H 10 : I 111 : P 30 : no_poller(): 0 -> rrd(*): 0 (time P:0.22 | 128.93)
>>> 11:19:49 : H 10 : I 107 : P LPD : last_poll_date(): 1108549188 -> db(last_poll_date): 1 (time P:0.53 | 7.17)
>>> 11:19:49 : H 10 : I 108 : P LPD : last_poll_date(): 1108549188 -> db(last_poll_date): 1 (time P:0.32 | 6.05)
>>> 11:19:49 : H 10 : I 109 : P LPD : last_poll_date(): 1108549189 -> db(last_poll_date): 1 (time P:0.27 | 5.99)
>>> 11:19:49 : H 10 : I 110 : P LPD : last_poll_date(): 1108549189 -> db(last_poll_date): 1 (time P:0.23 | 6)
>>> 11:19:49 : H 10 : I 111 : P LPD : last_poll_date(): 1108549189 -> db(last_poll_date): 1 (time P:0.22 | 6.11)
>>> 11:19:49 : H 10 : Poller End, Total Time: 761.63 msec.
>>>
>>> RRD error!? There are no rrd files (because I have deleted them).
>>> The other problem is that the setup page is very slow (after the
>>> upgrade)!
>>>
>>> Please help!
>>> Thank You!
>>> Qrta
>>>
>>> Javier Szyszlican wrote:
>>>
>>>> Hi Kurta,
>>>>
>>>> Are you sure about the type of RRD you are using? This seems like a
>>>> GAUGE value, and you may have set COUNTER.
>>>>
>>>> Also, try deleting all your rrds for this interface type, and let
>>>> the poller create them again; you may be dealing with a permission
>>>> issue.
>>>>
>>>> Hope that helps, and I would like to see you contribute these changes
>>>> back to jffnms once they work correctly.
>>>>
>>>> Javier
>>>>
>>>> Kurta László wrote:
>>>>
>>>>> Hi!
>>>>> I wrote a new poller, grapher, etc. to get the free disk space of
>>>>> Solaris boxes via remotely initiated ssh scripts (df -k...).
>>>>> As I see in the poller's log, the 'used' and 'avail' values are
>>>>> collected successfully:
>>>>> www-data@rhein:/opt/jffnms/engine$ php -q poller.php 10
>>>>> : H 10 : Poller Start : 15 Items.
>>>>> : H 10 : I 97 : P 10 : sssh:avail(avail.1): 376316 -> buffer(): 1 (time P:1712.34 | 0.76)
>>>>> : H 10 : I 98 : P 10 : sssh:avail(avail.2): 979628 -> buffer(): 2 (time P:1768.78 | 0.37)
>>>>> : H 10 : I 99 : P 10 : sssh:avail(avail.6): 254476 -> buffer(): 3 (time P:1713.03 | 0.47)
>>>>> : H 10 : I 100 : P 10 : sssh:avail(avail.9): 4025566 -> buffer(): 4 (time P:1793.9 | 0.38)
>>>>> : H 10 : I 101 : P 10 : sssh:avail(avail.10): 68899 -> buffer(): 5 (time P:1791.65 | 0.38)
>>>>> : H 10 : I 97 : P 20 : sssh:used(used.1): 68496 -> buffer(): 6 (time P:2165.84 | 0.38)
>>>>> : H 10 : I 98 : P 20 : sssh:used(used.2): 1014897 -> buffer(): 7 (time P:1739.41 | 0.38)
>>>>> : H 10 : I 99 : P 20 : sssh:used(used.6): 190336 -> buffer(): 8 (time P:6831.9 | 0.38)
>>>>> : H 10 : I 100 : P 20 : sssh:used(used.9): 7839551 -> buffer(): 9 (time P:1866.62 | 0.38)
>>>>> : H 10 : I 101 : P 20 : sssh:used(used.10): 139717 -> buffer(): 10 (time P:2209.4 | 0.39)
>>>>> : H 10 : I 97 : P 60 : no_poller(): 0 -> rrd(*): used:68496 - avail:376316 (time P:0.37 | 22.97)
>>>>> : H 10 : I 98 : P 60 : no_poller(): 0 -> rrd(*): used:1014897 - avail:979628 (time P:0.2 | 20.73)
>>>>> : H 10 : I 99 : P 60 : no_poller(): 0 -> rrd(*): used:190336 - avail:254476 (time P:0.26 | 20.64)
>>>>> : H 10 : I 100 : P 60 : no_poller(): 0 -> rrd(*): used:7839551 - avail:4025566 (time P:0.27 | 20.7)
>>>>> : H 10 : I 101 : P 60 : no_poller(): 0 -> rrd(*): used:139717 - avail:68899 (time P:0.27 | 20.56)
>>>>> : H 10 : I 97 : P LPD : last_poll_date(): 1108481687 -> db(last_poll_date): 1 (time P:0.5 | 7.17)
>>>>> : H 10 : I 98 : P LPD : last_poll_date(): 1108481687 -> db(last_poll_date): 1 (time P:0.26 | 6.01)
>>>>> : H 10 : I 99 : P LPD : last_poll_date(): 1108481687 -> db(last_poll_date): 1 (time P:0.27 | 6.02)
>>>>> : H 10 : I 100 : P LPD : last_poll_date(): 1108481687 -> db(last_poll_date): 1 (time P:0.27 | 6.05)
>>>>> : H 10 : I 101 : P LPD : last_poll_date(): 1108481687 -> db(last_poll_date): 1 (time P:0.26 | 6.08)
>>>>> : H 10 : Poller End, Total Time: 23804.93 msec.
>>>>>
>>>>> But when I fetch the RRDs of the 10 different interfaces (you see
>>>>> five above), 8 don't collect data at all, one collects 'avail', and
>>>>> another one collects 'used' values!? But the poller log shows that
>>>>> all data was pushed to the RRDs successfully.
>>>>>
>>>>> Here is the 'used' collector's RRD fetch output:
>>>>> rhein:/opt/jffnms/rrd# rrdtool fetch interface-97-0.rrd AVERAGE -r 900 -s -1h
>>>>> data
>>>>>
>>>>> 1108478400: 6.8496000000e+04
>>>>> 1108478700: 6.8496000000e+04
>>>>> 1108479000: 6.8496000000e+04
>>>>> 1108479300: 6.8496000000e+04
>>>>> 1108479600: 6.8496000000e+04
>>>>> 1108479900: 6.8496000000e+04
>>>>> 1108480200: 6.8496000000e+04
>>>>> 1108480500: 6.8496000000e+04
>>>>> 1108480800: 6.8496000000e+04
>>>>> 1108481100: 6.8496000000e+04
>>>>> 1108481400: 6.8496000000e+04
>>>>> 1108481700: 6.8496000000e+04
>>>>> 1108482000: 6.8496000000e+04
>>>>> 1108482300: nan
>>>>> rhein:/opt/jffnms/rrd# rrdtool fetch interface-97-1.rrd AVERAGE -r 900 -s -1h
>>>>> data
>>>>>
>>>>> 1108478400: nan
>>>>> 1108478700: nan
>>>>> 1108479000: nan
>>>>> 1108479300: nan
>>>>> 1108479600: nan
>>>>> 1108479900: nan
>>>>> 1108480200: nan
>>>>> 1108480500: nan
>>>>> 1108480800: nan
>>>>> 1108481100: nan
>>>>> 1108481400: nan
>>>>> 1108481700: nan
>>>>> 1108482000: nan
>>>>> 1108482300: nan
>>>>>
>>>>> The other 8 show nan at all times.
>>>>>
>>>>> What did I do wrong?
>>>>> Thanx!
>>>>> Qrta
>>>>>
>>>>> _______________________________________________
>>>>> jffnms-users mailing list
>>>>> jff...@li...
>>>>> https://lists.sourceforge.net/lists/listinfo/jffnms-users
>

-- 
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Javier Szyszlican, Project Leader, JFFNMS
ja...@jf...
I hope JFFNMS or I were helpful to you, if you can, please donate at
http://jffnms.org/donate
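For anyone who wants to check outside JFFNMS what a given avail.<index> or used.<index> parameter should return, the row and column selection described in the quoted poller can be reproduced on the command line. This is only a minimal sketch: the function name df_field is hypothetical and not part of JFFNMS, and it assumes each filesystem occupies exactly one line of the 'df -k' output, with the header as line 0.

```shell
#!/bin/sh
# Hypothetical helper, not part of JFFNMS: pick one field from `df -k` output.
#   $1    = parameter in the thread's format, e.g. "avail.1" or "used.6"
#   stdin = raw `df -k` output (header line counts as index 0)
df_field() {
    param=$1
    which=${param%%.*}   # text before the first dot: "avail" or "used"
    index=${param#*.}    # text after the first dot: 0-based line index
    case $which in
        avail) col=4 ;;  # awk fields are 1-based, so "avail" is field 4
        *)     col=3 ;;  # anything else falls back to "used", like the PHP else branch
    esac
    # awk line numbers are 1-based, hence index + 1; "+ 0" forces a numeric result
    awk -v row="$((index + 1))" -v col="$col" 'NR == row { print $col + 0 }'
}
```

Run against the same host as the poller, for example `ssh lkurta@host10 'df -k' | df_field avail.6`, it should print the value the poller log reports for avail.6. If the two disagree, the parsing is the place to look; if they agree, the data is being lost later, on the way into the RRD.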