I was trying to use the vmstat module on RHEL 3 Update 4
(kernel 2.4.21-4.EL). However, on every Linux server the
agents were running on, the cpu status showed up as purple
because of "cpu test failing". After further debugging of
uxmon, it turned out that the module runs vmstat as
"vmstat -20 2", which is invalid: the delay argument is
negative, so vmstat prints its usage message instead of
statistics. This seems to be why the vmstat module doesn't
work on Red Hat Enterprise Linux 3 and higher.
The output from uxmon is below:
[sanket ]$ uxmon -D 5
Mon May 22 11:03:52 2006: uxmon: starting up
loading rule file uxmon-rules.pl
uxmon-rules.pl: executing community=public for host DEFAULT
uxmon-rules.pl: executing frequency=0.16 for host DEFAULT
uxmon-rules.pl: executing perf=5 for host DEFAULT
uxmon-rules.pl: executing ALL for host DEFAULT
uxmon-rules.pl: executing version=1 for host DEFAULT
uxmon-rules.pl: executing proto=udp for host DEFAULT
uxmon-rules.pl: executing rpc for host DEFAULT
uxmon-rules.pl: executing proto=udp for host DEFAULT
uxmon-rules.pl: executing ping for host DEFAULT
uxmon-rules.pl: executing features=unix,linux for host DESCR
uxmon-rules.pl: executing localhost for host DESCR
uxmon-rules.pl: executing features=unix,linux for host DESCR
uxmon-rules.pl: executing 10.0.0.129 for host DESCR
uxmon-rules.pl: executing load for host 10.0.0.129
Tester known tests: vmstat_cpuload CPUperf_load
fileaccess_base procs_macProcs tcp_http true64_diskload
procs_PROCS sysvsar_cpuload fileaccess_writebandwidth
win32perf_disk solswap_virtualmemory snmpnetwork_check
ldap single_value tcp_ica tcp_TCP
fileaccess_readbandwidth win32perf_network aix_dfdisk
unix_snmpdisk snmp_anyvar tcp_imap snmp_virtualmemory
snmphost_numusers linux_virtualmemory
sysvsar_virtualmemory win32perf_diskload
win32perf_virtualmemory win32perf_cpuload realhttp
bsd_dfdisk ping_ping etherport_check true64_dfdisk
true64_virtualmemory tcp_pop3 nut tcp_nntp
lxnetwork_check storage win32perf_procs procs_BSDProcs
expedap_base sysvsar_diskload expedap_myexpedap
solaris_dfdisk tcp_telnet hpux_dfdisk procs_sysvProcs
mailq snmpprocs_procs_alternate procs_hpuxProcs
procs_SolarisProcs generic_snmpdisk tcp_smtp tcp_ftp
unix_dfdisk snmpprocs_procs who_numusers
procs_true64Procs single_value_persecond tcp_ssh
fileaccess_accesstime netware_numusers
btethernetbox_sensor tcp_ident win32service_service
oracle ntp_NTP mysql true64_cpuload procs_LinuxProcs
using test vmstat_cpuload
instantiating vmstat_cpuload
Tester executing method pernode.precheck
Tester executing method instance.precheck
Tester executing method pernode.init
Tester executing method instance.init
uxmon-rules.pl: executing bsdisplay for host qalin2.appiancorp.com
Monitor::bb new()
next pass ...
uxmon: next pass ...
Tester executing method pernode.discover
Tester executing method instance.discover
Tester executing method pernode.postdiscover
Tester executing method instance.postdiscover
Tester executing method pernode.monitor
Tester executing method instance.monitor
start_sensor() starting vmstat -20 2
usage: vmstat [-V] [-n] [-a] [delay [count]]
-V prints version.
-n causes the headers not to be reprinted regularly.
-a print active/inactive page stats.
delay is the delay between updates in seconds.
count is the number of updates.
Tester executing method pernode.postmonitor
Tester executing method instance.postmonitor
Tester executing method pernode.report
Tester executing method instance.report
Tester executing method pernode.perf
Tester executing method instance.perf
uxmon-net
Logged In: YES
user_id=1526740
Actually, it seems to be related to the frequency setting.
When I set the frequency to the fractional value 0.16 (i.e.
check approx. every 10 seconds), the %d argument for vmstat
in vmstat.pm was calculated as -20 (instead of 10?). When I
changed the frequency to 5, it was calculated as 240 (vmstat
240 2). So it seems to be a bug in how the frequency is
handled.
I am using the latest stable release, 1.02.
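The failure mode can be illustrated with a small sketch. This is Python rather than uxmon's Perl, and build_vmstat_command is a hypothetical helper, not a function from vmstat.pm. It only assumes the "vmstat <delay> <count>" form seen in the debug output, and it rejects a non-positive delay before it can reach the command line:

```python
def build_vmstat_command(delay_seconds, count=2):
    """Return the argv for 'vmstat <delay> <count>' (hypothetical helper)."""
    # A computed delay such as -20 is not a valid delay; in the log above
    # vmstat answers "vmstat -20 2" with its usage text, not statistics.
    if delay_seconds < 1:
        raise ValueError(
            "vmstat delay must be at least 1 second, got %r" % (delay_seconds,)
        )
    if count < 1:
        raise ValueError("vmstat count must be at least 1, got %r" % (count,))
    return ["vmstat", str(int(delay_seconds)), str(int(count))]
```

With this check, a delay of 240 yields the argument list for "vmstat 240 2", while a delay of -20 raises an error instead of silently producing an invalid command.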
Logged In: YES
user_id=77961
Indeed, that's a bug in Requester::WatchRequester: its
request() method (Requester::WatchRequester::request())
subtracts 30s from the frequency value, which makes it fail
if the frequency is around 30s or lower. That's because
before 1.02 the frequency couldn't be less than 1 minute ...
Thanks for reporting,
Tom
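A rough sketch of the arithmetic described above, in Python rather than the project's Perl. The function names are hypothetical, and the sketch assumes the frequency is given in minutes and rounded to whole seconds before the 30s are subtracted. That assumption reproduces the reported -20 for a frequency of 0.16; it does not reproduce the reported 240 for a frequency of 5, so the real code presumably does more than this. The second function shows one possible guard, not the fix that went into CVS:

```python
LEAD_SECONDS = 30  # the 30s subtraction mentioned in the comment above


def naive_duration(frequency_minutes, lead=LEAD_SECONDS):
    """Sampling duration as described: period in seconds minus a fixed lead."""
    period = round(frequency_minutes * 60)  # assumed unit: minutes
    return period - lead


def guarded_duration(frequency_minutes, lead=LEAD_SECONDS):
    """Same idea, but never returns a zero or negative duration."""
    period = round(frequency_minutes * 60)
    if period < 1:
        raise ValueError("frequency too small: %r" % (frequency_minutes,))
    # Only subtract the lead when the period is long enough to absorb it;
    # otherwise fall back to sampling for the whole period.
    return period - lead if period > lead else period
```

Under these assumptions, naive_duration(0.16) evaluates to -20, matching the bad vmstat argument, while guarded_duration(0.16) evaluates to 10.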
Logged In: YES
user_id=1526740
Hmm... is there a workaround that would let me use a
frequency of less than 30 sec.? I would like to be able to
monitor the systems every 5 seconds (for performance tests
on the systems). From your description, it seems that the
failure for frequencies below 30s is forced for backward
compatibility with versions earlier than 1.02.
Logged In: YES
user_id=77961
Fixed in CVS ...
(However, I hope to get the new test parallelizer done for
1.03, which will make this duration guessing obsolete.)