[Nodebrain-users] Re: [Fwd: [Nodebrain-announce] NodeBrain 0.5.2 Released - Numskull Patch]
Rule Engine for State and Event Monitoring
From: <li...@ho...> - 2003-03-20 23:33:20
ok.. trying to move this to the list ;-)

Trettevik, Ed A wrote:
> Hi Ian,
>
> 1) Loadable module interface: In general I like the idea of providing a loadable module interface, but I may be getting twisted around a bit in understanding your idea. Do you want to load NodeBrain or have NodeBrain load your code? Do you want to interact with a cell or be a cell? Somehow I have a feeling you're going to say "both to both". :) And that's probably the right answer.
>
> You may want to take a look at nbmain.c in the patch release I just put out there. It provides a simple C API. It isn't what you want, but may give you a chance to experiment some.
>

I was after a way to "be" the cell. I never thought of nbmain as being a rules engine.. would be nice. The other thing would be to have an agent API to communicate with a running brain, so I wouldn't have to spawn a process every time.

> Hint:
> define r1 on(~=a):$ +hey ian a=$${a}
> assert a=1;
> assert a=2;
> assert a="abc";
>
> Oops, now I need to check to see if the prefix operator ~= is documented. It responds to any change.
>
> 2) Anomaly detection: I have actually been giving this one a bit of thought. I just scanned the link you provided, so I'm not sure, but it looks like something I have been planning on including. An exponentially weighted moving average is extremely efficient to compute in a real-time monitor, and a similar technique can be applied to calculate a moving deviation. The control limits can then be expressed in deviations from the moving average independent of magnitude, and bias between past and recent history can be adjusted with a selectable factor. This is standard stuff I've studied a bit at http://www.itl.nist.gov/div898/handbook/index.htm and want to apply to NodeBrain. But I need to give more thought to the actual implementation. It would not be difficult to generate time series by hour or minute of day within day of week.
> This would incorporate the typical human-schedule based variations, except holidays. However, the real problem in many environments is more complex than that. I'm thinking we might need to split out separate time series based on other variables that are not scheduled. For example, in a load balanced environment, the number of servers running may have a major impact on the expected load on the other servers. If the number of servers was a factor in splitting out separate time series, the software could learn "normal" even under a "two server down" condition. Provided we alarm on server downs, we could avoid generating alarms on every other server taking on a larger load. The software could recognize it as a normal load with two servers down. I need to play with it a bit to see if this can be done without making life overly complex for the rule coder. Do you think this is something you would apply?
>

Thanks for the link.. I'll have a look at it. we're just a 'simple' web site in a lot of respects, I'm sure a lot of other data centers would require pretty complex setups.

> Ed
>
>

regards
Ian

>
> -----Original Message-----
> From: Ian Holsman [mailto:Ian...@cn...]
> Sent: Thursday, March 20, 2003 12:23 PM
> To: Trettevik, Ed A
> Subject: [Fwd: [Nodebrain-announce] NodeBrain 0.5.2 Released - Numskull Patch]
>
>
> hey ed,
> two more things...
>
>
> 1. I was thinking maybe a loadable module interface.
> the module would define some cells, so that instead of
> firing off a perl script I could just query the cell (which
> would know how to figure out the value, and could fire event
> when the cell changes)... eg.. a cpu.sys% or dbms.num_users
>
> 2. One of the major problems I have is that we get very seasonal
> traffic. ..so at 3AM 'normal' might be 100, but at 12pm that would
> be an error condition. now.. I understand the nodebrain can be setup
> to handle this.
> but I was wondering if there could be some forecasting
> built in (see http://cricket.sourceforge.net/aberrant/) so that nodebrain
> could 'know' what normal was.
>
>
> oh.. I found another tool you might be interested in PCP (http://oss.sgi.com/projects/pcp/)
> it is more of a monitoring tool, but it contains a rule-based mechanism for alerting
>
> eg..
> some_host (
> ($SWAP.free $HOSTS / $SWAP.length $HOSTS) * 100 < 50 &&
> ($SWAP.free $HOSTS / $SWAP.length $HOSTS) * 100 >= 25
> ) -> print 10 min "swap more than half-full: " "%h: %v% free " &
> shell 10 min "rsh -n guest@%h /sbin/ps -eo
> 'ruser=UID,pid=PID,ppid=PPID,pcpu=%CPU,sz=9999999SZ,rss=RSS,stime=STIME,time=TIME,args=CMD' | sort
> +4 -nr | sed -e 's/9999999SZ / SZ:/' | /usr/sbin/Mail -s '%h swap more than half-full (%v%
> free)' $MINDER &";
>
>
> but it is pretty complex to setup properly
>
> regards
> Ian.
>
>
> -------- Original Message --------
> Subject: [Nodebrain-announce] NodeBrain 0.5.2 Released - Numskull Patch
> Date: Thu, 20 Mar 2003 09:55:59 -0800
> From: Trettevik, Ed A <ed....@bo...>
> Organization: Holsman.NET
> Newsgroups: other
>
> Patch release 0.5.2 for version 0.5 (Numskull) is now available for download. This release corrects
> some minor differences between the code and documentation. In addition, the source now includes a
> makefile and supports compilation on Mac OS X (Darwin). A simple prototype C API was included as
> part of the minor restructuring needed to support the makefile.
>
> _______________________________________________
> Nodebrain-announce mailing list
> Nod...@li...
> https://lists.sourceforge.net/lists/listinfo/nodebrain-announce
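
For anyone following the anomaly-detection discussion (point 2 in Ed's reply), here is a rough sketch of the idea in Python. It is not NodeBrain code and nothing in it comes from the NodeBrain source: the class names, the parameters (alpha, limit, warmup) and the choice of an exponentially weighted mean absolute deviation as the "moving deviation" are all assumptions made for illustration. Each combination of day of week, hour of day and number of servers down gets its own baseline, so "normal" at 3AM is learned separately from "normal" at noon or with two servers down.

```python
from collections import defaultdict


class EwmaBaseline:
    """Moving average and moving deviation for one time series (hypothetical helper)."""

    def __init__(self, alpha=0.2, limit=3.0, warmup=5):
        self.alpha = alpha    # weight of the newest sample: the "selectable factor"
        self.limit = limit    # control limit, in moving deviations from the average
        self.warmup = warmup  # samples to learn from before any alarm is raised
        self.count = 0
        self.mean = 0.0
        self.dev = 0.0

    def update(self, value):
        """Return True if value is outside the control limits, then learn from it."""
        if self.count == 0:
            # First sample just seeds the average.
            self.mean = value
            self.count = 1
            return False
        error = abs(value - self.mean)
        anomalous = self.count >= self.warmup and error > self.limit * self.dev
        # Exponentially weighted updates: old history decays by (1 - alpha).
        self.mean += self.alpha * (value - self.mean)
        self.dev += self.alpha * (error - self.dev)
        self.count += 1
        return anomalous


class SeasonalMonitor:
    """One baseline per (day of week, hour of day, servers down) combination."""

    def __init__(self, **baseline_options):
        self.baselines = defaultdict(lambda: EwmaBaseline(**baseline_options))

    def observe(self, when, servers_down, value):
        """when is a datetime; returns True if value is abnormal for its context."""
        key = (when.weekday(), when.hour, servers_down)
        return self.baselines[key].update(value)
```

Calling monitor.observe(timestamp, servers_down, value) once per sample returns True when the value falls outside the learned limits for that slot. Two caveats with this sketch: an anomalous sample is still folded into the baseline, and a perfectly constant series has a deviation of zero, so any change at all would be flagged once the warmup is over.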