Re: [Nagios-db-devel] neb startup revisited
From: Matthew K. <mk...@ma...> - 2005-01-27 21:19:47
On Thu, 2005-27-01 at 09:55 -0800, Ben wrote:
> On Wed, 26 Jan 2005, Matthew Kent wrote:
>
> > In restart.sql (during nagios startup) I'm not quite clear on why an
> > existing host and service should be reset back to a pending state by
> > setting has_been_checked = FALSE. If you look at nagios and the standard
> > cgis during startup, the data is read in from the retention file and the
> > previous states for hosts/services is assumed to be correct.
>
> Well, technically, the check *is* pending. I understand that the retention
> file says the host or service has a certain state, but that hasn't been
> verified. If nagios was simply restarted, then the retention data is
> likely accurate. But after a restart, it's entirely possible that nagios
> might have been down for days, and in that case the retention data is much
> more questionable.
>
> Frankly, I think it makes a lot more sense to label everything as pending
> until it's been checked. Weren't you the one that convinced me of that? :)
>

My apologies, I wasn't thinking clearly as to what the effect would be
on the tac display. The current implementation is the most accurate.

> > Also in restart.sql, I'm not sure about inserting the empty host/service
> > when one isn't found in the database. For example if you clear out the
> > database and start nagios up the tac display will show X hosts with flap
> > detection/notifications etc disabled which will slowly count backwards
> > as all the checks complete. Kinda funky :)
>
> Yeah, that's a serious hack. However, I'm not sure how else to record
> services for a new host that has yet to be checked, because if there isn't
> a placeholder record, then the service cannot be entered into the
> database. Perhaps I should set the host options to null, or some other
> "unknown" state.
>
> > As I see it the solution for both issues would be to
> >
> > - set configured = false for all hosts/services
> > - do the 'select into thisHostID id FROM host WHERE name = hostName;'
> >
> > if
> > the host/service is NOT found, send the object to processStatus like
> >
> > /* update this host */
> > nebstruct_host_status_data ds;
> > ds.object_ptr=(void *)hl;
> >
> > processStatus(NEBCALLBACK_HOST_STATUS_DATA, (void*)&ds);
> >
> > which will set configured = TRUE, update the has_been_checked field etc.
> > else
> > we set host/service configured = true and assume the rest of the data in
> > the db is correct and leave it alone (save the extra resources of
> > running the stored proc)
>
> I think a better idea would be to change configure_host() and
> configure_service() to take in all the data we have on the host/service
> before it gets checked, so that we can make our placeholder records more
> accurate.
>

Sounds good. Is passing everything to configure_host/configure_service
instead of just throwing it at processStatus to save processing time or
just a logical separation?

> > Come to think of it at this point you could actually
> > delete from host,service where configured = false
> > to prune any hosts that have been removed from the config.
>
> I can't support deleting unconfigured hosts, because one of the
> requirements my company has is to be able to report on historical
> availability, even if the host isn't used anymore.
>

Was thinking about that too, if you removed a host (and maybe went to
add it back later) you might be annoyed to find all the history had
disappeared. I'll remove this from the mysql module and put a note about
adding a db_cleanup.php down the line so users can do it themselves.

> > This should give a more immediate overview of nagios's status right
> > after startup.
>
> Like I said, I think showing most things in a pending state shows the most
> accurate status.
> Well, actually, I suppose marking things as "Pending
> (assumed up)" or "Pending (assumed down)" and such would be the most
> accurate, but that could get messy.
>

Yeah, hardly worth the effort.

Oh and did you get that other email about use of
current_notification_number (it being defined in the schema but not
referenced by the stored procs)? I'm not getting anything from the
mailing list today.

Thanks,
-- 
Matthew Kent \ SA \ bravenet.com
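
For reference, the startup behaviour the thread settles on can be sketched roughly as follows: clear the configured flag everywhere, mark every host named in the config as configured and pending, insert a placeholder for hosts the database has not seen, and keep (rather than delete) rows that are no longer configured. This is only an illustration under stated assumptions, not the module's code: the in-memory host_table, host_row, find_host(), reconcile_on_startup() and count_unconfigured() are all hypothetical stand-ins for the database table and stored procedures discussed above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_HOSTS 64
#define NAME_LEN  64

/* Hypothetical in-memory stand-in for one row of the "host" table. */
typedef struct {
    char name[NAME_LEN];
    bool configured;        /* host appears in the current nagios config */
    bool has_been_checked;  /* false means the host shows as pending */
} host_row;

/* Hypothetical stand-in for the table itself. */
typedef struct {
    host_row rows[MAX_HOSTS];
    size_t   count;
} host_table;

/* Return the row for `name`, or NULL if the table has no such host. */
host_row *find_host(host_table *t, const char *name)
{
    assert(t != NULL && name != NULL);
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->rows[i].name, name) == 0)
            return &t->rows[i];
    return NULL;
}

/*
 * Reconcile the table with the host names found in the config.
 * Returns the number of placeholder rows inserted, or -1 if a new host
 * does not fit (table full or name too long); rows processed before the
 * failure stay updated.
 */
int reconcile_on_startup(host_table *t, const char *const *config_names,
                         size_t n)
{
    int inserted = 0;

    /* Step 1: assume nothing is configured until the config says so. */
    for (size_t i = 0; i < t->count; i++)
        t->rows[i].configured = false;

    for (size_t i = 0; i < n; i++) {
        host_row *row = find_host(t, config_names[i]);
        if (row == NULL) {
            /* Step 2a: unknown host, insert a placeholder row. */
            if (t->count == MAX_HOSTS || strlen(config_names[i]) >= NAME_LEN)
                return -1;
            row = &t->rows[t->count++];
            strcpy(row->name, config_names[i]);
            inserted++;
        }
        /* Step 2b: known or new, the host is configured and pending. */
        row->configured = true;
        row->has_been_checked = false;
    }

    /* Step 3: rows left unconfigured are kept so history can be reported. */
    return inserted;
}

/* Count rows that remain only for historical reporting. */
size_t count_unconfigured(const host_table *t)
{
    size_t n = 0;
    for (size_t i = 0; i < t->count; i++)
        if (!t->rows[i].configured)
            n++;
    return n;
}
```

A caller would build the list of configured host names, call reconcile_on_startup() once at startup, and leave any pruning of unconfigured rows to a separate, user-run cleanup step such as the db_cleanup.php mentioned above.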