On 05/03/2011 09:04 AM, nap wrote:
> What do you mean about retention module? Is this something, I can
> Yes, it's a module that you link with your scheduler in the
> shinken-specific.cfg file.
Great, I have it:)
> Anyway, this was me. Before, I removed retention.dat and then restarted
> shinken. This time: stop, rm retention.dat, start, and success, I
> see pendings:)
> And now here is the surprise, I got service notifications!
> Then I changed the address in the host definition, so it went DOWN
> and a notification was sent. Still OK.
> I changed the host address back, it got UP with no notification:
> [2011-05-02 17:52:51] HOST ALERT: cindy;UP;HARD;10;OK -
> 192.168.232.55: rta 0.040ms, lost 0%
> [2011-05-02 17:52:04] SERVICE ALERT: cindy;ssh;OK;HARD;3;SSH OK -
> OpenSSH_4.3 (protocol 2.0)
> [2011-05-02 17:43:38] HOST ALERT: cindy;DOWN;HARD;10;CRITICAL -
> 192.168.232.56: Host unreachable @
> 192.168.232.89. rta nan, lost 100%
> [2011-05-02 17:43:38] HOST NOTIFICATION:
> tompos;cindy;DOWN;notify-host-by-email;CRITICAL - 192.168.232.56:
> Host unreachable @ 192.168.232.89. rta
> nan, lost 100%
> I also have some question again.
> 1. Why was the SERVICE ALERT before the HOST ALERT? My guess, and the
> documentation, says HOST STATE is checked before SERVICE STATE. Or am I
> misunderstanding something?
> Yes, when the service gets a problem, the host is checked too. But the
> ALERT in the log is still there; after all, the service did have a problem.
Yes, reverse order, my fault.
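The log excerpts in this thread list the newest entry first, which makes the order of events easy to misread. A minimal sketch for reordering such lines oldest first, assuming the "[timestamp] TYPE: field;field;..." layout seen above (the names parse_line and chronological are made up for this example, they are not part of Shinken):

```python
import re
from datetime import datetime

# Matches lines such as
# "[2011-05-02 17:52:04] SERVICE ALERT: cindy;ssh;OK;HARD;3;SSH OK - ..."
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\] ([A-Z ]+): (.*)$")


def parse_line(line):
    """Return (timestamp, event_type, fields), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    timestamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
    return timestamp, match.group(2), match.group(3).split(";")


def chronological(lines):
    """Parse the lines and return the recognised events, oldest first."""
    events = [event for event in map(parse_line, lines) if event is not None]
    return sorted(events, key=lambda event: event[0])


# Entries from the first test in this thread, joined onto single lines.
log = [
    "[2011-05-02 17:52:51] HOST ALERT: cindy;UP;HARD;10;OK - 192.168.232.55: rta 0.040ms, lost 0%",
    "[2011-05-02 17:52:04] SERVICE ALERT: cindy;ssh;OK;HARD;3;SSH OK - OpenSSH_4.3 (protocol 2.0)",
    "[2011-05-02 17:43:38] HOST ALERT: cindy;DOWN;HARD;10;CRITICAL - 192.168.232.56: "
    "Host unreachable @ 192.168.232.89. rta nan, lost 100%",
]

for timestamp, event_type, fields in chronological(log):
    print(timestamp, event_type, fields[0])
```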
> 2. Why were they not in SOFT UP before HARD UP?
> When you are HARD DOWN, the next UP is a HARD one, there is no soft
> state for this recovery part. Like for Nagios.
Again something I got wrong:)
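The SOFT/HARD rule from the answer to question 2 can be sketched as a small state function. This is only an illustration under simplified, assumed rules (a host that is either UP or DOWN, a fixed max_check_attempts, and an attempt counter that is left unchanged on recovery); it is not Shinken's or Nagios's actual code, and next_state is a made-up name:

```python
def next_state(state, state_type, attempt, result, max_check_attempts=10):
    """Return (state, state_type, attempt) after one check result.

    Simplified, assumed rules for illustration only.
    """
    if result == "DOWN":
        if state == "DOWN":
            # Another failed check on an existing problem.
            attempt = min(attempt + 1, max_check_attempts)
        else:
            # First failed check of a new problem.
            attempt = 1
        new_type = "HARD" if attempt >= max_check_attempts else "SOFT"
        return "DOWN", new_type, attempt

    # result == "UP"
    if state == "DOWN":
        # A recovery keeps the type of the problem it ends:
        # HARD DOWN -> HARD UP, SOFT DOWN -> SOFT UP.
        return "UP", state_type, attempt
    # Already UP: a further good check is a plain HARD UP.
    return "UP", "HARD", 1


current = ("UP", "HARD", 1)
for _ in range(10):
    current = next_state(*current, "DOWN")
print(current)  # ('DOWN', 'HARD', 10)
current = next_state(*current, "UP")
print(current)  # ('UP', 'HARD', 10)
```

With these rules a host that recovers before reaching max_check_attempts goes through SOFT UP first, while a host that was already HARD DOWN goes straight to HARD UP, as in the log above.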
> 3. Why was no notification sent?;)
> For the host? It can be your host notification configuration (it must have
> the r option) or your contact one (same option :) ).
I made a new test:
[2011-05-03 13:22:34] SERVICE ALERT: bubo;SSH;OK;HARD;3;SSH OK -
OpenSSH_5.3p1 Debian-3ubuntu6 (protocol 2.0)
[2011-05-03 13:20:48] SERVICE ALERT: bubo;HTTP;OK;HARD;3;HTTP OK:
HTTP/1.1 200 OK - 453 bytes in 0.001 second response time
[2011-05-03 13:20:05] HOST ALERT: bubo;UP;HARD;10;OK - 192.168.232.242:
rta 1.797ms, lost 0%
[2011-05-03 13:14:17] HOST ALERT: bubo;DOWN;HARD;10;CRITICAL -
192.168.232.242: Host unreachable @ 192.168.232.89. rta nan, lost 100%
[2011-05-03 13:14:17] HOST NOTIFICATION:
tompos;bubo;DOWN;notify-host-by-email;CRITICAL - 192.168.232.242: Host
unreachable @ 192.168.232.89. rta nan, lost 100%
So there was no notification about HOST RECOVERY.
Then I tried again, and now it was there:
[2011-05-03 13:33:54] HOST NOTIFICATION:
tompos;bubo;UP;notify-host-by-email;OK - 192.168.232.242: rta 0.035ms,
I don't understand:/
Anyway, now it's OK after a few tests.
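The "r option" check mentioned in the answer to question 3 can be sketched as follows. It assumes the Nagios-style option letters (d = down, u = unreachable, r = recovery, n = none) given as comma-separated strings on both the host (notification_options) and the contact (host_notification_options); the function names are made up, and the real filtering involves more conditions (notification periods, downtime, and so on):

```python
# Which option letter each host state needs (UP after a problem is a recovery).
STATE_TO_OPTION = {"DOWN": "d", "UNREACHABLE": "u", "UP": "r"}


def parse_options(value):
    """Turn a string such as "d,u,r" into the set {"d", "u", "r"}."""
    return {opt.strip() for opt in value.split(",") if opt.strip()}


def wants_notification(state, host_options, contact_options):
    """True only if both the host and the contact allow this kind of notification."""
    host_opts = parse_options(host_options)
    contact_opts = parse_options(contact_options)
    if "n" in host_opts or "n" in contact_opts:
        # "n" means no notifications at all.
        return False
    option = STATE_TO_OPTION[state]
    return option in host_opts and option in contact_opts


print(wants_notification("UP", "d,u,r", "d,u,r"))  # True
print(wants_notification("UP", "d,u", "d,u,r"))    # False
```

If either side lacks r, a DOWN notification still goes out but the recovery does not, which matches the symptom in the first test.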
> 4. If I understand the documentation and your suggestion well,
> SERVICE ALERTS depend on HOST STATE? In other words if HOST STATE
> IS DOWN, there is no SERVICE ALERT (CHECK?).
> Services are checked, but service notifications are not sent. It only
> sends notifications about the root problem, not about impacts :)
It's different from Nagios, am I right?
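The "root problem, not impacts" behaviour described in the answer to question 4 can be illustrated with a very small sketch. It assumes only two host states and treats every service problem on a DOWN host as an impact; notifications_to_send is a made-up name and the real dependency logic is richer than this:

```python
def notifications_to_send(host_states, service_problems):
    """Return the notifications that would go out under the assumed rule.

    host_states:      dict mapping host name to "UP" or "DOWN"
    service_problems: list of (host, service) pairs currently in a problem state
    """
    # A DOWN host is the root problem and is always reported.
    notes = [("host", host, None)
             for host, state in sorted(host_states.items())
             if state == "DOWN"]
    # A service problem is reported only when its host is UP;
    # otherwise it is an impact of the host problem and stays silent.
    for host, service in service_problems:
        if host_states.get(host) == "UP":
            notes.append(("service", host, service))
    return notes


states = {"bubo": "DOWN", "cindy": "UP"}
problems = [("bubo", "HTTP"), ("bubo", "SSH"), ("cindy", "ssh")]
print(notifications_to_send(states, problems))
```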
P.S.: More questions:)
Why are notification command definitions not included in the default
install?
Also, sendmailhost.pl and sendmailservices.pl are in the source tree, but
they are not installed by default. Why?