Re: [Nagios-db-devel] neb startup revisited
From: Ben <be...@si...> - 2005-01-27 17:55:25
On Wed, 26 Jan 2005, Matthew Kent wrote:

> In restart.sql (during nagios startup) I'm not quite clear on why an
> existing host and service should be reset back to a pending state by
> setting has_been_checked = FALSE. If you look at nagios and the standard
> cgis during startup, the data is read in from the retention file and the
> previous states for hosts/services is assumed to be correct.

Well, technically, the check *is* pending. I understand that the retention file says the host or service has a certain state, but that hasn't been verified. If nagios was simply restarted, then the retention data is likely accurate. But it's entirely possible that nagios was down for days before it was started again, and in that case the retention data is much more questionable. Frankly, I think it makes a lot more sense to label everything as pending until it's been checked. Weren't you the one that convinced me of that? :)

> Also in restart.sql, I'm not sure about inserting the empty host/service
> when one isn't found in the database. For example if you clear out the
> database and start nagios up the tac display will show X hosts with flap
> detection/notifications etc disabled which will slowly count backwards
> as all the checks complete. Kinda funky :)

Yeah, that's a serious hack. However, I'm not sure how else to record services for a new host that has yet to be checked, because if there isn't a placeholder record, then the service cannot be entered into the database. Perhaps I should set the host options to null, or some other "unknown" state.
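
A rough sketch of the "null placeholder" idea, for illustration only. It uses Python's sqlite3 module as a stand-in for the real database. The table layout, the column names flap_detection_enabled and notifications_enabled, and the helper ensure_host() are all assumptions, not the actual schema or anything in restart.sql.

```python
import sqlite3

# Hypothetical, cut-down schema; the real tables will differ.
SCHEMA = """
CREATE TABLE host (
    id INTEGER PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    has_been_checked INTEGER NOT NULL DEFAULT 0,
    flap_detection_enabled INTEGER,   -- NULL means "not known yet"
    notifications_enabled INTEGER     -- NULL means "not known yet"
);
CREATE TABLE service (
    id INTEGER PRIMARY KEY,
    host_id INTEGER NOT NULL REFERENCES host(id),
    description TEXT NOT NULL,
    has_been_checked INTEGER NOT NULL DEFAULT 0
);
"""

def ensure_host(conn, host_name):
    """Return the id of host_name, inserting a pending placeholder if it is missing."""
    row = conn.execute(
        "SELECT id FROM host WHERE name = ?", (host_name,)
    ).fetchone()
    if row is not None:
        return row[0]
    # Only the name is known: options stay NULL, has_been_checked defaults to 0.
    cur = conn.execute("INSERT INTO host (name) VALUES (?)", (host_name,))
    return cur.lastrowid

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# A service can be recorded before its host has ever been checked.
host_id = ensure_host(conn, "web01")
conn.execute(
    "INSERT INTO service (host_id, description) VALUES (?, ?)", (host_id, "HTTP")
)

# NULL does not compare equal to 0, so the placeholder is not counted as "disabled".
disabled = conn.execute(
    "SELECT COUNT(*) FROM host WHERE flap_detection_enabled = 0"
).fetchone()[0]
pending = conn.execute(
    "SELECT COUNT(*) FROM host WHERE has_been_checked = 0"
).fetchone()[0]
```

With NULL options, a query that counts hosts with flap detection disabled skips the placeholder, while a query for pending hosts still finds it.
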

> As I see it the solution for both issues would be to
>
> - set configured = false for all hosts/services
> - do the 'select into thisHostID id FROM host WHERE name = hostName;'
>
> if
> the host/service is NOT found, send the object to processStatus like
>
> /* update this host */
> nebstruct_host_status_data ds;
> ds.object_ptr=(void *)hl;
>
> processStatus(NEBCALLBACK_HOST_STATUS_DATA, (void*)&ds);
>
> which will set configured = TRUE, update the has_been_checked field etc.
> else
> we set host/service configured = true and assume the rest of the data in
> the db is correct and leave it alone (save the extra resources of
> running the stored proc)

I think a better idea would be to change configure_host() and configure_service() to take in all the data we have on the host/service before it gets checked, so that we can make our placeholder records more accurate.

> Come to think of it at this point you could actually
> delete from host,service where configured = false
> to prune any hosts that have been removed from the config.

I can't support deleting unconfigured hosts, because one of the requirements my company has is to be able to report on historical availability, even if the host isn't used anymore.

> This should give a more immediate overview of nagios's status right
> after startup.

Like I said, I think showing most things in a pending state shows the most accurate status. Well, actually, I suppose marking things as "Pending (assumed up)" or "Pending (assumed down)" and such would be the most accurate, but that could get messy.
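
For illustration, the configured flag can be kept current at startup without deleting anything, along the lines of the sketch below. It again uses Python's sqlite3 module as a stand-in for the real database. The host table layout and the helper sync_configured() are assumptions, and the sketch deliberately leaves the question of resetting has_been_checked aside.

```python
import sqlite3

def sync_configured(conn, configured_names):
    """Flag hosts named in the current config; keep all other rows for history."""
    # Start by assuming nothing is configured.
    conn.execute("UPDATE host SET configured = 0")
    for name in configured_names:
        cur = conn.execute(
            "UPDATE host SET configured = 1 WHERE name = ?", (name,)
        )
        if cur.rowcount == 0:
            # Host is new to the database: add a pending placeholder row.
            conn.execute(
                "INSERT INTO host (name, configured, has_been_checked) "
                "VALUES (?, 1, 0)",
                (name,),
            )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE host ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT UNIQUE NOT NULL,"
    " configured INTEGER NOT NULL DEFAULT 0,"
    " has_been_checked INTEGER NOT NULL DEFAULT 0)"
)
# Two hosts known from an earlier run; "old-db" has since left the config.
conn.executemany(
    "INSERT INTO host (name, configured, has_been_checked) VALUES (?, 1, 1)",
    [("web01",), ("old-db",)],
)

sync_configured(conn, ["web01", "mail01"])
rows = conn.execute(
    "SELECT name, configured, has_been_checked FROM host ORDER BY name"
).fetchall()
```

After the call, "old-db" is still in the table with configured = 0, so availability reports can keep using its history, and a status display can filter on configured = 1 instead of relying on a delete.
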