Re: [opennms-devel] Surveillance view categories


On Nov 14, 2007, at 10:44 PM, Mike Huot wrote:

> Any thoughts?

Well, since you ask ;-)  I've been struggling with this dilemma that  
we introduced (OMG!) almost 2 years ago now.  Where does the time go?

I believe that, on the surface, the problem starts with the poor
use of the noun "category".  (BTW: I didn't jump in on this discussion
earlier because I just didn't have the bandwidth to help and wouldn't
have been anything more than a pest ;-)  Below the
surface, the problem is the model itself.

I think we all know that the current use of "Node Categories",
now known as "Surveillance Categories", is to provide a quick way
to present the "current status" of a group of nodes, as
opposed to the "availability" of a group of services over "X amount of
time".

The node categories were introduced during the development of the
Model Importer and had nothing to do with Surveillance Views (this
concept had not yet been conceived).  The original design of the model
importer began with a use case and an XSD that supported the importing
of nodes.  The <category> element was defined in the XSD to
automatically populate the traditional OpenNMS SLA category in  
categories.xml.  During development, this quickly became infeasible  
and obviously not very well thought out, so we moved to a fall back  
position, due to time constraints, and introduced node categories to  
be persisted to the DB.  This eased synchronization, aided  
performance, and shortened development time of the Importer.  The  
"catinc" grammar was then added for use with filter rules.

So, the term category was:

   a) A poor choice of naming.  At the very least, another name should  
have been chosen: class, group, set, etc.  Anything but category.
   b) Weakly implemented... it should be entity based categorization  
and not just limited to nodes (node entities).

This model is as obviously confusing as it is, not so obviously,  
broke.  It is so broke, in fact, that determining the status of a node  
in the surveillance view doesn't ask the Poller, where state is  
actually maintained, but it asks the node, which has no state for its  
own status, so it has to be re-calculated, deterministically, each  
time it is asked.  It does this by plowing through current outages for  
each interface, every time it is asked.  It can be asked possibly  
hundreds of times in a complex surveillance view... Yikes!
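
To make the cost concrete, here is a rough Python sketch.  It is not OpenNMS code; the outage tuples, the function names, and the UP/DOWN values are all made up for illustration.  It contrasts re-scanning the current outages on every request with walking them once and keeping the answer around:

```python
# Hypothetical data: each open outage is (node_id, interface, service).
OUTAGES = [
    (1, "10.0.0.1", "ICMP"),
    (1, "10.0.0.1", "HTTP"),
    (3, "10.0.0.3", "SSH"),
]


def status_by_scanning(node_id, outages):
    """Recompute a node's status by walking the current outages on every call."""
    for outage_node, _interface, _service in outages:
        if outage_node == node_id:
            return "DOWN"
    return "UP"


def build_status_index(outages):
    """Walk the outages once and remember which nodes have at least one."""
    return {outage_node for outage_node, _interface, _service in outages}


def status_from_index(node_id, down_nodes):
    """Answer from the remembered state instead of re-scanning."""
    return "DOWN" if node_id in down_nodes else "UP"
```

With N nodes in a view, the first approach does N full passes over the outages per refresh, while the second does one pass and N set lookups.  Asking whoever already maintains the state (the Poller, in our case) is the same idea.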

I propose that rather than throwing the baby out with the bath water
(getting rid of one category vs. the other), we really need to decide
what to call these categories with respect to these two different
functionalities and perhaps find a way to integrate the concepts,
in the underlying model, into something a lot more unified:

Currently we have:

#   Noun      Source            Use
1   Category  categories.xml    SLA reporting
2   Category  categories table  Node (entity) status reporting

We've now changed the semantics to present #2 as "Surveillance View  
Categories" and I believe that does direct one more towards the notion  
of what node categories are and how they are currently being used  
(btw, thanks Bill!), but we're still left with a model that needs  
fixing.

What can we do before 1.8.0-1 to make all this more comprehensible or  
easier to use?  Here are a couple of my thoughts:

1) Automate node categorizing.
2) Provide a tool to create node categories from the SLA categories.xml.  One
would define categories in categories.xml and generate the same
categories in the categories table and assign nodes based on the
rule defined in categories.xml.
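
A minimal sketch of what idea 2 might look like, in Python and under heavy assumptions: the category labels, node fields, and function name below are hypothetical, and plain predicates stand in for the real categories.xml filter rules, which would need the OpenNMS filter engine to evaluate.

```python
# Stand-ins for categories.xml entries: label -> rule.
# Assumption: real rules are filter expressions; predicates are used here instead.
SLA_CATEGORIES = {
    "Web Servers": lambda node: "HTTP" in node["services"],
    "Routers": lambda node: node["label"].startswith("rtr-"),
}

# Hypothetical nodes; the field names are illustrative only.
NODES = [
    {"id": 1, "label": "www-1", "services": {"ICMP", "HTTP"}},
    {"id": 2, "label": "rtr-core", "services": {"ICMP", "SNMP"}},
    {"id": 3, "label": "db-1", "services": {"ICMP"}},
]


def generate_node_categories(sla_categories, nodes):
    """Return (category labels, (node_id, label) assignments) derived from the rules."""
    category_rows = sorted(sla_categories)
    assignments = sorted(
        (node["id"], label)
        for label, rule in sla_categories.items()
        for node in nodes
        if rule(node)
    )
    return category_rows, assignments
```

The first list is what would go into the categories table and the second is the node-to-category mapping; a node matching no rule simply gets no assignment.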

Perhaps we can be more aggressive than that prior to the 1.8 release  
but I don't think we have time.  Thoughts? back at ya mhuot ;)

David Hustace
The OpenNMS Group, Inc.
