From: Michael P. <mi...@pr...> - 2009-05-27 09:10:18
Hi!

I've seen a strange behavior of nagios with a very simple check script.

the relevant part of the script:

#########################################################################
MAINTCNT="`/usr/sbin/metastat |grep -i maint |wc -l`"
RESYNCNT="`/usr/sbin/metastat |grep -i resync |wc -l`"
NOTOK=0
status=$STATE_UNKNOWN

if [ $RESYNCNT -gt 0 ]; then
    NOTOK=1
    TEXT="WARNING - One or more disks are in resync state. "
    status=$STATE_WARNING
fi

if [ $MAINTCNT -gt 0 ]; then
    NOTOK=1
    TEXT="CRITICAL - One or more disks are in maintenance state."
    status=$STATE_CRITICAL
fi

if [ $NOTOK -eq 1 ]; then
    echo $TEXT
    datum=`date`
    echo $datum $status >> /tmp/svm.debug
    exit $status
fi

echo "OK - There is no maintenance necessary!"
exit $STATE_OK
#########################################################################

when executing the script from the command line, the return code is always 2 and the output is always "CRITICAL - One or more disks are in maintenance state." (because there is one dead disk)
=> that's ok

when nagios executes the script, the output is always "CRITICAL - One or more disks are in maintenance state." but the return code is sometimes 0 and sometimes 2
=> that's not good

snippet from nagios.log:

[1243410051] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL - One or more disks are in maintenance state.
[1243410063] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243410061
[1243410071] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or more disks are in maintenance state.
[1243410083] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243410081
[1243410091] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL - One or more disks are in maintenance state.
[1243410124] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243410122
[1243410131] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or more disks are in maintenance state.
[1243411031] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL - One or more disks are in maintenance state.
[1243411316] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or more disks are in maintenance state.
[1243411323] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411320
[1243411326] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL - One or more disks are in maintenance state.
[1243411363] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411361
[1243411366] SERVICE ALERT: acgweb1;BASIC_SVM;OK;SOFT;2;CRITICAL - One or more disks are in maintenance state.
[1243411370] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411368
[1243411376] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;1;CRITICAL - One or more disks are in maintenance state.
[1243411391] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411389
[1243411396] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;2;CRITICAL - One or more disks are in maintenance state.
[1243411398] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411396
[1243411406] SERVICE ALERT: acgweb1;BASIC_SVM;CRITICAL;SOFT;3;CRITICAL - One or more disks are in maintenance state.
[1243411407] EXTERNAL COMMAND: SCHEDULE_SVC_CHECK;acgweb1;BASIC_SVM;1243411405

/tmp/svm.debug confirms the command line result:

> cat /tmp/svm.debug
Wed May 27 08:21:33 GMT 2009 2
Wed May 27 08:22:28 GMT 2009 2
Wed May 27 08:22:39 GMT 2009 2
Wed May 27 08:22:46 GMT 2009 2
Wed May 27 08:23:00 GMT 2009 2
Wed May 27 08:23:11 GMT 2009 2
Wed May 27 08:23:46 GMT 2009 2
Wed May 27 08:24:01 GMT 2009 2
Wed May 27 08:27:09 GMT 2009 2
Wed May 27 08:27:19 GMT 2009 2
Wed May 27 08:27:35 GMT 2009 2
Wed May 27 08:27:50 GMT 2009 2
Wed May 27 08:27:56 GMT 2009 2
Wed May 27 08:29:01 GMT 2009 2
Wed May 27 08:32:55 GMT 2009 2
Wed May 27 08:34:01 GMT 2009 2
Wed May 27 08:37:55 GMT 2009 2
Wed May 27 08:39:01 GMT 2009 2
Wed May 27 08:39:55 GMT 2009 2
Wed May 27 08:44:01 GMT 2009 2
Wed May 27 08:44:55 GMT 2009 2
and so on.....

any ideas what's going wrong here?

best regards,
michael
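
PS: for anyone who wants to test the decision logic without a Solaris box, here is a minimal sketch of the same check as a function that reads metastat output from stdin. It is not the script that is running in production, and it does not explain the differing return codes under nagios. The function name classify_metastat is made up for this example, and the STATE_* values are written out using the standard nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) instead of coming from the part of the script not shown above.

```shell
#!/bin/sh
# Standard nagios plugin exit codes, written out here so the sketch is
# self-contained (assumption: the real script gets them from elsewhere).
STATE_OK=0
STATE_WARNING=1
STATE_CRITICAL=2
STATE_UNKNOWN=3

# classify_metastat (hypothetical helper): reads metastat output on stdin,
# prints one status line and returns the matching state code.
classify_metastat() {
    out=`cat`
    # grep -c counts matching lines, like "grep ... | wc -l" in the original
    maintcnt=`printf '%s\n' "$out" | grep -ci maint`
    resynccnt=`printf '%s\n' "$out" | grep -ci resync`

    # maintenance wins over resync, as in the original script
    if [ "$maintcnt" -gt 0 ]; then
        echo "CRITICAL - One or more disks are in maintenance state."
        return $STATE_CRITICAL
    fi
    if [ "$resynccnt" -gt 0 ]; then
        echo "WARNING - One or more disks are in resync state."
        return $STATE_WARNING
    fi
    echo "OK - There is no maintenance necessary!"
    return $STATE_OK
}
```

With this, something like `echo "State: Needs maintenance" | classify_metastat; echo $?` can be run on any machine (the input line is only sample text). In the real plugin the last two lines would be `/usr/sbin/metastat | classify_metastat` followed by `exit $?`, so metastat is called once instead of twice.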