RE: [Nodebrain-users] Help! Examples wanted.
From: Trettevik, Ed A <ed....@bo...> - 2003-07-07 17:59:45
Hi Gilbert,

Sorry my response is not timely; I've been away on vacation.

I probably don't understand your application properly yet, but I'll try
to provide a couple of examples. You can correct me if my examples don't
fit your problem.

Suppose you have three services that you want to monitor: s1, s2, and
s3. Let's say s3 depends on s1 and s2, while s2 depends on s1. We'll
say s1 is independent. If I understand correctly, you only want to
report s3 down if s1 and s2 are up. Similarly, you only want to report
s2 down if s1 is up.

Let's assume you have some simple method of representing your
requirements. For example, you might represent your requirements as
follows in a file you name service.cfg.

    s1="Service 1"
    s2="Service 2":s1
    s3="Service 3":s1,s2

Now you want to convert this into NodeBrain, preferably with a script or
program you construct.

Example 1:

In this example, we keep the NodeBrain code simple by assuming you have
a command called "checkService" that can check the status of several
monitored services and report the status to your NodeBrain agent with a
single assertion.

Suppose you use this command as follows. (Alternatively you could
reference service.cfg.)

    checkService s1 s2 s3

If your NodeBrain agent is defined as "mynb" in your private.nb file,
you could assert the current states as follows, where a value of 1
represents UP and a value of 0 represents DOWN.

    nb ":>mynb service assert s1=1,s2=0,s3=0"

Based on these states, I'm thinking you would like to be notified that
s2 is down, but not that s3 is down, because its dependencies are not
satisfied.

This could be accomplished by the following NodeBrain rules to be
included in your agent configuration file, perhaps sourced from a
separate file called service.nb.

    define service context;

    # Check status of services every 5 minutes
    service define schedule on(~(5m)):=checkService s1 s2 s3

    # Alarm when service is down with dependencies up.
    service define r1 on(s1=0):=alarm "Service 1 is down"
    service define r2 on(s2=0 and s1):=alarm "Service 2 is down"
    service define r3 on(s3=0 and s1 and s2):=alarm "Service 3 is down"

Notice the relationship of the conditions for rules r1, r2, and r3 to
the hypothetical configuration file above. You must provide a host
command "alarm" to perform the required notification, changing the name
and syntax as desired.

Example 2:

In this example, we'll complicate the NodeBrain code a bit; no big deal
if you are generating it with a script or program. This time we assume
you have a script or program called checkService that checks and reports
the status of a single service, and that you only want to execute it for
a given service when the dependencies are satisfied.

To check the status of s1 the following command might be used.

    checkService s1

To report the status of s1 as UP the following nb command would be used
by checkService.

    nb ":>mynb service assert s1.up=1"

For this example we'll use a context containing a separate context for
each monitored service. The context for a given service contains a set
of rules and defined cells.
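
Since the per-service contexts in this example all follow one repeating pattern, they are a natural candidate for generation from service.cfg. The following is only a minimal sketch in Python, assuming the service.cfg layout shown earlier. The function names (parse_cfg, context_rules, generate) are hypothetical, and the emitted rule text simply copies the pattern used in this message; it has not been checked against any particular NodeBrain release.

```python
import re

# Assumed service.cfg line format, taken from the message:
#   name="Label"            (no dependencies)
#   name="Label":dep1,dep2  (depends on dep1 and dep2)
CFG_LINE = re.compile(r'^(\w+)="([^"]*)"(?::(.*))?$')


def parse_cfg(text):
    """Return a list of (name, label, [dependency names]) tuples."""
    services = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = CFG_LINE.match(line)
        if match is None:
            raise ValueError(f"unrecognized service.cfg line: {line!r}")
        name, label, deps = match.groups()
        dep_names = [d.strip() for d in deps.split(",") if d.strip()] if deps else []
        services.append((name, label, dep_names))
    return services


def context_rules(name, label, deps, interval="5m"):
    """Emit one per-service context, copying the rule pattern from the message."""
    # A service with no dependencies gets the constant 1 (always satisfied).
    dep_expr = " and ".join(f"{d}.up" for d in deps) if deps else "1"
    return [
        f"service define {name} context;",
        f"service.{name} define dep cell {dep_expr};",
        f"service.{name} define sched cell (dep and ~({interval}));",
        f"service.{name} define rExpire on(sched):assert up=?;",
        f"service.{name} define rCheck on(sched):=checkService {name}",
        f"service.{name} define down cell (dep and not up);",
        f'service.{name} define rAlarm on(down ^ not down):=alarm "{label} is down";',
    ]


def generate(cfg_text):
    """Build the full rule file text for every service in cfg_text."""
    lines = ["define service context;"]
    for name, label, deps in parse_cfg(cfg_text):
        lines.extend(context_rules(name, label, deps))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = 's1="Service 1"\ns2="Service 2":s1\ns3="Service 3":s1,s2\n'
    print(generate(sample), end="")
```

Redirecting the script's output to service.nb would give a file the agent configuration can source. Generating the rules this way also avoids copy-and-paste slips between the per-service blocks, since the service name is substituted in one place.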

    # context containing a set of contexts, one for each monitored service
    define service context;

    # Service 1 context
    service define s1 context;
    # dependencies for s1 - always satisfied - see dependencies for s2 and s3 below
    service.s1 define dep cell 1;
    # schedule to check service status
    service.s1 define sched cell (dep and ~(5m));
    # status is no longer known (see note [1] below)
    service.s1 define rExpire on(sched):assert up=?;
    # check service status
    service.s1 define rCheck on(sched):=checkService s1
    # note when service is down
    service.s1 define down cell (dep and not up);
    # respond when service goes down (see note [2] below)
    service.s1 define rAlarm on(down ^ not down):=alarm "Service 1 is down";

    service define s2 context;
    service.s2 define dep cell s1.up;
    service.s2 define sched cell (dep and ~(5m));
    service.s2 define rExpire on(sched):assert up=?;
    service.s2 define rCheck on(sched):=checkService s2
    service.s2 define down cell (dep and not up);
    service.s2 define rAlarm on(down ^ not down):=alarm "Service 2 is down";

    service define s3 context;
    service.s3 define dep cell s1.up and s2.up;
    service.s3 define sched cell (dep and ~(5m));
    service.s3 define rExpire on(sched):assert up=?;
    service.s3 define rCheck on(sched):=checkService s3
    service.s3 define down cell (dep and not up);
    service.s3 define rAlarm on(down ^ not down):=alarm "Service 3 is down";

[1] The way we've scheduled the checkService commands, they can all run
concurrently and we don't know in what order they will report status
back to NodeBrain. By setting the status to ? ("Unknown") at the same
time we spawn the checkService command, we ensure that no action is
taken until the status of a service and all its dependencies is current.

[2] We introduced a variable called "down" to use as a basis for
alarming. This variable is true only when the status of a service is
known to be DOWN and the status of all dependencies is known to be UP.
In our alarm rule we use the "down" variable with the "flip flop"
operator ("^") to make sure it only toggles on known conditions. Since
the value of "up" for a given service and all dependencies will take on
the value of ? ("Unknown"), the value of "down" will also take on the
unknown value. But only known conditions will toggle the "flip flop"
condition. So the rule will only fire when the known status transitions
to DOWN (with dependencies known to be UP) and the previous known
condition was UP.

Agent Configuration Example:

In either case above, let's say your rules are stored as service.nb.
Your agent configuration file might look like this.

    #!/usr/bin/nb
    set log="/home/myuser/log/myagent.log";
    set out="/home/myuser/out";
    define l1 listener protocol="NBP",port=12345;  # you pick the port
    source /home/myuser/service.nb;

Your private.nb file might look like this. (The identity string will be
different.)

    declare myid identity 3.3575658473647a8b34.3434578934738473.0;  # generate with the identity command
    portray myid;
    declare mynb brain myid@localhost:12345;

Hopefully this will get you started. Let me know if I've misunderstood
your application.

Ed Trettevik

-----Original Message-----
From: gh...@rl... [mailto:gh...@rl...]
Sent: Thursday, June 26, 2003 4:14 PM
To: nod...@li...
Subject: [Nodebrain-users] Help! Examples wanted.

Hello,

I just finished reading the User Manual and NodeBrain looks great. Could
you help me with a few examples?

I would like to monitor items with a dependency tree, where I would not
report an item when one of the things it depends on is down.

NodeBrain is very different from the programming I am used to, and does
not come naturally.

The group I work for would like me to put together the PROS/CONS of
using it for a meeting tomorrow, and I do not feel like I have enough
knowledge yet to do it justice. An example or three would really help.
I want to schedule monitors if the dependencies are not down, and report
back status on the items.

Thanks for all your work,
Gilbert.