$Id: README,v 1.1 1999/08/18 11:40:33 clay Exp $
Monitors, alarms, and alert processing programs for INFORMIX Dynamic Server
Table of contents
-----------------
1. Introduction
2. Disclaimer
3. alarm.pl
   a. Event-severity codes
   b. Sending alerts
   c. host name and INFORMIXSERVER
4. informix.monitor
5. Monitoring i.Sell Application Server
6. SNPP Servers
7. References
8. Acknowledgements
1. Introduction
---------------
This distribution provides a set of tools to monitor INFORMIX-Online Dynamic
Server and the INFORMIX i.Sell Application Server. These tools were developed
to ensure high availability of INFORMIX databases by constantly monitoring
warnings, errors, and online status of the databases.
Announcements of new releases of this software are posted to Usenet
(comp.databases.informix).
2. Disclaimer
-------------
The programs work fine for my environment. Some of the monitors and alarm/
alert processing programs have been running for many months and have proved
to be highly reliable monitors for INFORMIX-Online Dynamic Server. Your
mileage may vary, though -- Please test before you deploy to a production
environment.
3. alarm.pl
-----------
alarm.pl is a Perl program to monitor alarms generated by INFORMIX IDS.
You must set the ALARMPROGRAM configuration parameter to process event-alarms
with this program (See the INFORMIX-Online Dynamic Server Administrator's
Guide for additional information about ALARMPROGRAM).
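As a minimal illustration, the ONCONFIG entry might look like the fragment
below. The installation path is an assumption; substitute wherever alarm.pl
actually lives, and make sure the file is executable by the user that runs
the database server.

```
ALARMPROGRAM    /usr/informix/etc/alarm.pl    # hypothetical path to alarm.pl
```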
a. Event-severity codes
-----------------------
The INFORMIX Online engine reports an event-severity code to the ALARMPROGRAM.
Severity codes are listed below (borrowed from the INFORMIX-Online Dynamic
Server Administrator's Guide).
Severity  Description
--------  ------------------------------------------------------------
1         Not noteworthy
          Will not be reported to the alarm program.
2         Information
          No error has occurred, but some routine event completed
          successfully (for example, checkpoint or log backup completes).
3         Attention
          This event does not compromise data or prevent the use of
          the system; however, it warrants attention (for example, one
          chunk of a mirrored pair goes down).
4         Emergency
          Something unexpected occurred that might compromise data or
          access to data (assertion failure, or oncheck reports data
          corrupt).
5         Fatal
          Something unexpected occurred and caused the database server
          to fail.
b. Sending alerts
-----------------
If the severity of the alarm is greater than 2, the alarm.pl program uses
SNPP (Simple Network Paging Protocol -- See RFC-1861) to send the alarm to
a pager.
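
A minimal sketch of the paging step, assuming the Net::SNPP module from the
libnet bundle (see References). The helper names page_text and send_page, the
120-character cap, and the 30-second timeout are illustrative choices and are
not taken from alarm.pl.

```perl
use strict;
use warnings;
use Net::SNPP;

# Build a one-line page. Pagers often truncate long text, so cap the length.
# The default cap of 120 characters is an assumption; check your provider.
sub page_text {
    my ($host, $server, $severity, $msg, $max) = @_;
    $max ||= 120;
    my $text = "$server\@$host severity $severity: $msg";
    return length($text) > $max ? substr($text, 0, $max) : $text;
}

# Send the page through an SNPP server. Returns 1 on success, 0 on failure.
sub send_page {
    my ($snpp_host, $snpp_port, $pager_id, $text) = @_;
    my $snpp = Net::SNPP->new($snpp_host, Port => $snpp_port, Timeout => 30)
        or return 0;
    my $ok = $snpp->send(Pager => $pager_id, Message => $text);
    $snpp->quit;
    return $ok ? 1 : 0;
}
```

A call might look like
send_page('pecos.nextel.com', 444, '8005551212', page_text('raft', 'shoe', 4, 'Chunk down')),
using a server from the SNPP Servers section, a placeholder pager number, and
a made-up message.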
The program could easily be modified to send an email message instead. Use
the Mail::Send module to build something like this (Warning: untested!):
    use strict;
    use warnings;
    use Config;
    use Mail::Send;

    my $backup_cmd  = "onbar -l";
    my $exit_status = 0;

    # Arguments passed to the ALARMPROGRAM by the engine
    my $event_severity = $ARGV[0];
    my $event_class    = $ARGV[1];
    my $event_msg      = $ARGV[2];
    my $event_add_text = $ARGV[3];
    my $event_file     = $ARGV[4];

    my $onconfig = $ENV{'ONCONFIG'} || '';
    my $hostname = $Config{'myhostname'};

    my %severity = (
        "3" => "Attention",
        "4" => "Emergency",
        "5" => "Fatal",
    );

    # Take the part of ONCONFIG after the first dot (onconfig.shoe -> shoe)
    chomp $onconfig;
    my $database = (split /\./, $onconfig)[1];
    $database = '' unless defined $database;

    # Event class 23: run the log backup, discarding all of its output
    if ($event_class == 23) {
        system("$backup_cmd > /dev/null 2>&1");
        $exit_status = $? >> 8;    # exit code of the command, not the raw wait status
    }

    if ($event_severity > 2) {
        my $msg = Mail::Send->new;
        $msg->to('clay@panix.com');
        $msg->subject('INFORMIX Alarm');
        my $fh = $msg->open;
        print $fh "System: $hostname\n";
        print $fh "Database: $database\n";
        print $fh "Alarm Severity: $severity{$event_severity}\n";
        print $fh "Message: $event_msg\n";
        $exit_status = 1;
        $fh->close;
    }

    exit $exit_status;
c. host name and INFORMIXSERVER
-------------------------------
The host name and the INFORMIXSERVER name are included in the alert
message -- This is extremely helpful information in a multi-host,
multi-database server environment. The program obtains the host name
from the Config module, and the INFORMIXSERVER name from the value of the
environment variable:
    my $informix_server = $ENV{'INFORMIXSERVER'};
    my $hostname        = $Config{'myhostname'};
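
One caveat: $Config{'myhostname'} records the host on which Perl was
configured and built, so it matches the running host only when Perl was built
locally. The sketch below swaps in the core Sys::Hostname module, which asks
the running system instead. The function name alert_source and the 'unknown'
fallback are hypothetical and not part of alarm.pl.

```perl
use strict;
use warnings;
use Sys::Hostname qw(hostname);

# Return "server@host" for use in an alert message.
#   $env  - hash reference to read INFORMIXSERVER from (normally \%ENV)
#   $host - optional host name; defaults to the running system's name
sub alert_source {
    my ($env, $host) = @_;
    my $server = $env->{INFORMIXSERVER};
    $server = 'unknown' unless defined $server && length $server;
    $host = hostname() unless defined $host;
    return "$server\@$host";
}

# Typical use inside the alarm program:
#   my $source = alert_source(\%ENV);
```

Passing the environment as an argument keeps the function easy to exercise
without changing the real INFORMIXSERVER setting.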
4. informix.monitor
-------------------
There is a major problem with monitoring event-alarms with ALARMPROGRAM -- If
the database crashes, the engine will not be able to send an alarm. To ensure
the database is online and available, "mon", a service monitoring daemon (See
References), can monitor online status with the help of the Perl DBD::Informix
module.
informix.monitor is an INFORMIX-Online Dynamic Server monitor for mon. It uses
DBD::Informix to connect to the database and retrieve the database name. If
it can't connect and retrieve the name, an alert is sent to a pager using
SNPP.
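
The following is a rough sketch of how such a monitor could be structured,
not the code of informix.monitor itself. It assumes the usual mon convention
that hostgroup members arrive as command-line arguments, that failed members
are printed on the first line of output, and that a non-zero exit status
signals failure. Where the real monitor retrieves the database name, this
sketch substitutes a trivial query against systables; the function names are
hypothetical.

```perl
use strict;
use warnings;

# Check one "database@server" target. Returns 1 if it answers a query, else 0.
sub check_target {
    my ($target) = @_;
    require DBI;    # loaded on first use; DBD::Informix must be installed
    my $dbh = DBI->connect("dbi:Informix:$target", undef, undef,
                           { RaiseError => 0, PrintError => 0 })
        or return 0;
    # Stand-in for "retrieve the database name": any cheap query will do.
    my ($count) = $dbh->selectrow_array('SELECT COUNT(*) FROM systables');
    $dbh->disconnect;
    return defined $count ? 1 : 0;
}

# Turn (target => 1|0, ...) into a summary line and an exit status.
sub summarize {
    my (%up) = @_;
    my @down = sort grep { !$up{$_} } keys %up;
    return (join(' ', @down), @down ? 1 : 0);
}

# Check every target, print the failed ones, and return the exit status.
sub run_monitor {
    my @targets = @_;
    my %up;
    $up{$_} = check_target($_) for @targets;
    my ($summary, $status) = summarize(%up);
    print "$summary\n" if $status;
    return $status;
}

# mon appends the hostgroup members to the monitor's command line.
exit run_monitor(@ARGV) if @ARGV;
```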
A typical database alarm message looks like this:
    Subject: ALERT informix/database: shoe@raft is down (Wed Aug 4 15:05:24)
    Date: Wed, 4 Aug 1999 15:05:26 -0700 (PDT)
    From: Super-User <root>
    To: clay@skechers.com

    Summary output        : shoe@raft is down
    Group                 : informix
    Service               : database
    Time noticed          : Wed Aug 4 15:05:24 1999
    Secs until next alert : 3600
    Members               : shoe@raft shoe@motto spock@sparks shoe@groovy skechers_main@boss

    Detailed text (if any) follows:
    -------------------------------
A typical mon.cf configuration file using the informix.monitor is:
    hostgroup informix database@server

    watch informix
        service database
            interval 5m
            monitor informix.monitor
            period wd {Sun-Sat}
                alert snpp.alert 8005551212
                alert mail.alert clay@panix.com
                alertevery 1h
It works beautifully.
5. Monitoring i.Sell Application Server
---------------------------------------
The mon service monitoring daemon may be used to monitor the INFORMIX
i.Sell E-Commerce/Application Server to ensure the server is alive and
responding to requests. The monitor checks the HTTP response of one of
the i.Sell administration ports with the http monitor included with the
"mon" distribution. This is a typical mon.cf configuration:
    watch app_servers
        service http
            interval 5m
            monitor http.monitor -p 8840
            period wd {Sun-Sat}
                alert snpp.alert 8005551212
                alert mail.alert clay@panix.com
                alertevery 1h
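
To show roughly what an HTTP liveness check involves, here is a small sketch
using the core IO::Socket::INET module. It is not the code of http.monitor;
the function names, the HEAD request on "/", the 10-second timeout, and the
rule that any status below 400 counts as alive are all assumptions made for
illustration.

```perl
use strict;
use warnings;
use IO::Socket::INET;

# Extract the numeric code from a status line such as "HTTP/1.0 200 OK".
# Returns undef if the line is missing or is not an HTTP status line.
sub status_code {
    my ($line) = @_;
    return undef unless defined $line;
    return $line =~ m{^HTTP/\d+\.\d+\s+(\d{3})\b} ? $1 : undef;
}

# Connect, send a HEAD request, and judge the first line of the reply.
# Returns 1 if the server answered with a status below 400, otherwise 0.
sub http_alive {
    my ($host, $port) = @_;
    my $sock = IO::Socket::INET->new(
        PeerAddr => $host,
        PeerPort => $port,
        Proto    => 'tcp',
        Timeout  => 10,
    ) or return 0;
    print $sock "HEAD / HTTP/1.0\r\n\r\n";
    my $line = <$sock>;
    close $sock;
    my $code = status_code($line);
    return (defined $code && $code < 400) ? 1 : 0;
}
```

For the configuration above, the check would be called as
http_alive($host, 8840) for each member of the app_servers hostgroup.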
6. SNPP Servers
---------------
- Nextel
    Server: pecos.nextel.com
    Port:   444
- SkyTel
    Server: snpp.skytel.com
    Port:   7777
- PageMart US
    Server: pagemart.net
    Port:   444
- PageMart Canada
    Server: pmcl.net
    Port:   444
7. References
-------------
- Config module
    On a command line, type: perldoc Config
- RFC-1861
    http://info.internet.isi.edu:70/in-notes/rfc/files/rfc1861.txt
- mon, the service monitoring daemon
    http://www.kernel.org/software/mon/
- Net::SNPP Perl module, part of the libnet bundle
    http://www.perl.com/CPAN/modules/by-module/Net
- Mail::Send, part of the MailTools distribution
    http://www.perl.com/CPAN/modules/by-module/Mail
- DBI, the Perl Database Interface module
    http://www.perl.com/CPAN/modules/by-module/DBI/
- DBD::Informix, the Database Driver for Informix
    http://www.perl.com/CPAN/modules/by-module/DBD/
- SkyTel SNPP Specification
    http://www.mtel.com/develop/snpp.html
- INFORMIX-Online Dynamic Server Administrator's Guide
8. Acknowledgements
-------------------
SKECHERS USA, Inc., my employer
Informix Software, Inc.
Tim Garritty, our Account Executive
Don Jennings and James Rollins, our i.Sell application development team
Please feel free to contact me via email with questions, suggestions, or
comments: Clay Irving <clay@panix.com>.