Re: [Nodebrain-users] Nagios integration and counting objects
Rule Engine for State and Event Monitoring
From: <mar...@km...> - 2013-12-10 09:15:32
On 02/Dec/2013, at 21:51, Trettevik, Ed A <ed....@bo...> wrote:

> Hi Dr Marco,

Hi Ed :)

> However, this requires that you only make the assertion when changing state. If the assertion is performed as a rule action, you can add a test to the rule condition to only make the assertion on a state change. Here’s an example using this approach.

Thanks for pointing out this solution, in fact I used it! I adapted the solution for our environment, creating some "nodes" in the tree to reflect the business service the servers are delivering, along with every KPI that we must check. (In fact I spent the last few days writing the servants and adding a SQL backend to store some meaningful data; I read somewhere about a JournalKit that maybe does that.)

> You might also consider using a Cache node instead of a Tree node. A Cache node is like a Tree node in some respects, but entries don’t have values. Instead, the nodes of a Cache have counters upon which you can set thresholds. Without knowing your requirements better, I can’t say if this is a good approach for your use case.

I definitely will, because I don’t want a too-reactive NodeBrain instance: the concept is indeed similar to the SOFT/HARD states in Nagios, but of course the cache is way more powerful… I hope to get to that soon.

> You may find that you need different rules for each resource/application. This is where the notion of rule compilers can come in handy. A rule compiler is a script you write, if necessary, and is based on

This is another good hint...

> The Caboodle NodeBrain Kit provides a framework for managing rules as XML documents from which compilers generate the actual NodeBrain rules. However, you may find you can get along just fine with a simpler approach of your own design, or you may develop something much more effective.
>
> Seems I’ve strayed a bit from your question, to which the answer might have been simply “no”. :) But hopefully this helps in some way.

Always helpful, thanks!

bye

From: Marco Musso [mailto:ma...@mu...]
Sent: Monday, November 25, 2013 3:29 AM
To: nod...@li...
Subject: [Nodebrain-users] Nagios integration and counting objects

Hi fellow nodebrain users!

I'd like to submit a solution for my problem that you can probably improve...

Suppose we define a tree node to store the status of a resource (let's say apache) on some servers (the total number of servers is dynamic and unknown):

    define servers node tree;

and then populate the tree (via some servant scripts):

    servers. assert ("srv1","apache")=0; # or alert servers("srv1","apache")=0
    servers. assert ("srv2","apache")=0;
    servers. assert ("srv3","apache")=0;
    servers. assert ("srv4","apache")=0;

we'll get:

    show servers
    "srv1" "apache"=0
    "srv2" "apache"=0
    "srv3" "apache"=0
    "srv4" "apache"=0

The goal is to count the number of servers with apache != 0 (i.e. resource not available).

The first thing I tried was:

    define broken node tree; # a tree that contains servers without running apache
    define r1 on(!servers(x,"apache")=0): broken(x); # very much like the tutorial (paragraph 6.3)

which should trigger on assert x="srv1", check the status, and eventually define "srv1"=1, like this:

    servers. assert ("srv1","apache")=1;
    Rule local.r1 fired (@.local.broken(@.local.x)=1)
    show broken
    broken = ! == node tree
    "srv1"=1

To clear the state when apache is available again I can define another rule:

    define r2 on(servers(x,"apache")=0) ?broken(x); # or broken(x)=0
    servers. assert ("srv1","apache")=0;
    Rule local.r2 fired (@.local.broken(@.local.x)=?)
    show broken
    broken = ! == node tree

This works as long as x has the value of a server (i.e. to trigger those rules I have to assert x=). To me this doesn't sound like an elegant solution (and I should probably have used IF/ALERT instead of ON/ASSERT).

Then there is the problem that I want to know how many servers are broken and call an adapter. How can I count the cardinality of a tree (or the number of elements with a given property/value directly on the servers tree)?
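As an illustration of the counting question above and of the "only make the assertion when changing state" advice in the reply at the top of this thread, here is a minimal sketch in Python rather than in NodeBrain rules. It does not use any NodeBrain API; the class `BrokenCounter` and its method `assert_status` are hypothetical names made up for this example. The idea is simply that a running count stays correct if it is adjusted only when an entry flips between ok (0) and broken (non-zero).

```python
from typing import Dict, Tuple


class BrokenCounter:
    """Hypothetical helper: track per-server status and count broken entries."""

    def __init__(self) -> None:
        self.status: Dict[Tuple[str, str], int] = {}
        self.broken = 0  # number of entries whose status is non-zero

    def assert_status(self, server: str, resource: str, value: int) -> bool:
        """Record a status; return True only if the ok/broken state changed."""
        key = (server, resource)
        was_broken = self.status.get(key, 0) != 0
        is_broken = value != 0
        self.status[key] = value
        if was_broken == is_broken:
            return False  # no state change, so the counter is left untouched
        self.broken += 1 if is_broken else -1
        return True


counter = BrokenCounter()
for name in ("srv1", "srv2", "srv3", "srv4"):
    counter.assert_status(name, "apache", 0)
counter.assert_status("srv1", "apache", 1)  # srv1 goes down
counter.assert_status("srv1", "apache", 1)  # repeated report, no state change
counter.assert_status("srv3", "apache", 2)  # srv3 goes down
counter.assert_status("srv1", "apache", 0)  # srv1 recovers
print(counter.broken)  # only srv3 is still broken
```

The boolean returned by `assert_status` marks the point where an adapter call could be made, since it is true only on an actual state change.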
With those questions in mind I started to think that the method I'm following is probably not the best (also because it too closely resembles standard programming logic): is there a better way?

TIA

-- 
Dr. Marco Musso
SIP: +39 011 2178981
Mob: +39 348 2303085
Fax: +39 02 700410445 | +39 011 83031108
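A note on the SOFT/HARD comparison made in the reply: the sketch below shows, in Python, the general idea of a counter with a threshold, so that a single failed check does not trigger a reaction. It is an analogy to the Nagios notion only and is not how the NodeBrain Cache node is implemented; `FailureThreshold`, `check`, and the `limit` of 3 are assumptions chosen for illustration.

```python
from collections import defaultdict
from typing import DefaultDict, Tuple


class FailureThreshold:
    """Hypothetical helper: report HARD only after `limit` consecutive failures."""

    def __init__(self, limit: int = 3) -> None:
        self.limit = limit
        self.failures: DefaultDict[Tuple[str, str], int] = defaultdict(int)

    def check(self, server: str, resource: str, ok: bool) -> str:
        key = (server, resource)
        if ok:
            self.failures[key] = 0  # any successful check resets the counter
            return "OK"
        self.failures[key] += 1
        return "HARD" if self.failures[key] >= self.limit else "SOFT"


monitor = FailureThreshold(limit=3)
results = (False, False, True, False, False, False)
states = [monitor.check("srv1", "apache", ok) for ok in results]
print(states)
```

With these inputs the first two failures stay SOFT, the success resets the counter, and only the third consecutive failure afterwards is reported as HARD.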