I understand now, thank you. I added a rule between the two to cover all =
syslog implementations we have that solved the problem and changed the =
<host>_SYSLOG_LINE context to infinite lifetime. The new rule basically =
covers the case of starting and having "last message repeated" as the =
first event since the first rule would be skipped as there is no =
<host>_SYSLOG_LINE.
type=3DSingle
continue=3DDontCont
ptype=3DRegExp
pattern=3D^\S+\s+\d+ \S+ (\S+) last message repeated (\d+) times
desc=3DStop "last message repeated" loop if first event for host
action=3Dnone
-----Original Message-----
From: Risto Vaarandi [mailto:ristov@...]
Sent: Thursday, December 18, 2003 3:46 PM
To: ANDERSON RUSSELL D (ANDY); simple-evcorr-users@...
Subject: RE: [Simple-evcorr-users] Memory and CPU usage
hi Andy,
there is one caveat that could cause things go wrong. The =
<host>_SYSLOG_LINE context has a lifetime of 1 minute, but suppose the =
host logs a message, which is then followed by a "last message repeated" =
from the same host after _more_ than 60 seconds. Since the context =
<host>_SYSLOG_LINE does not exist any more, the 2nd rule does not match =
and the "last message repeated" line will be passed to the 3rd rule, =
which saves the "last message repeated" message to the =
<host>_SYSLOG_LINE context.
As far as "last message repeated" messages are concerned, I have seen =
two different implementations:
1) syslogd always logs the original message at least once before "last =
message repeated", which means that no consecutive "last message =
repeated" lines can be received from the same host.
2) after syslogd has logged "last message repeated" line, it can log it =
again without logging the original message before, e.g.,
May 27 02:10:03 myhost kernel: VFS: Disk change detected on device =
ide1(22,0)=20
May 27 02:10:35 myhost last message repeated 8 times=20
May 27 02:11:35 myhost last message repeated 15 times=20
May 27 02:12:35 myhost last message repeated 15 times=20
If the latter is the case for your syslogd, then it is possible that =
once "last message repeated" line was erroneously saved to =
<host>_SYSLOG_LINE, then the immediately following "last message =
repeated" line will trigger an endless loop. Note that after the endless =
loop has run for 1 minute, the <host>_SYSLOG_LINE context will be =
deleted, but since there are already many "last message repeated" =
messages generated by the loop, one of those messages will be matched by =
the 3rd rule, which will reinforce the loop.
Try to change the <host>_SYSLOG_LINE contexts to have an infinite =
lifetime and check whether this makes a difference.
hth,
risto
> The dump shows that although sec is running it appears to have hung on =
> the syslog repeated N times rules(2,3) as seen by the old times in the =
> input buffer compared to the dump time. There was only one original > =
entry of "last message repeated". Looking at the context > =
ems0171e_SYSLOG_LINE, the Creation Time: Thu Dec 18 14:08:10 2003 there =
> appears to be a stuck loop condition. Could someone look at these =
rules > for help?
> ----------------------------------
> type=3DSingle
> continue=3DDontCont
> ptype=3DRegExp
> pattern=3D^S+s+d+ S+ (S+) last message repeated (d+) times
> context=3D$1_SYSLOG_LINE
> desc=3D'Decompress' $1 repeated messages compressed by syslogd $2 =
times
> action=3Dcopy $1_SYSLOG_LINE %L;=20
> eval %O ( $line =3D q(%L);=20
> @ret =3D ();=20
> for ($i =3D 0; $i < $2; ++$i) { push @ret, $line; }=20
> return @ret; );=20
> event %O;
>=20
> type=3DSingle
> continue=3DTakeNext
> ptype=3DRegExp
> pattern=3D^S+s+d+ S+ (S+)
> desc=3DKeep the last $1 syslog line in case of "last message repeated"
> action=3Dcreate $1_SYSLOG_LINE 60; add $1_SYSLOG_LINE $0
> ----------------------------------
> Date of dump: Thu Dec 18 14:09:09 2003
> Program version: 2.2.0
>=20
> Performance statistics:
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> Run time: 15859 seconds
> User time: 5387.83 seconds
> System time: 50.45 seconds
> Child user time: 7.91 seconds
> Child system time: 6.07 seconds
> Processed input lines: 2836320
>=20
> Rule usage statistics:
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
>=20
> Statistics for the rules from /etc/sec.conf
> ------------------------------------------------------------
> Rule 1 at line 20 (Ignore these sshd: Failed password for root) has > =
matched 0 events
> Rule 2 at line 34 ('Decompress' $1 repeated messages compressed by > =
syslogd $2 times) has matched 1900010 events
> Rule 3 at line 51 (Keep the last $1 syslog line in case of "last =
message > repeated") has matched 61 events
> Rule 4 at line 62 (PAGER) has matched 0 events
> Rule 5 at line 69 (PAGER) has matched 0 events
> Rule 6 at line 78 (Repeated failed system(s) access) has matched 0 > =
events
> Rule 7 at line 88 (Repeated failed system(s) access) has matched 0 > =
events
> Rule 8 at line 101 (Repeated failed system(s) access) has matched 0 > =
events
> Rule 9 at line 126 ([FAILED ACCESS]) has matched 0 events
> Rule 10 at line 135 ([FAILED ACCESS]) has matched 0 events
> Rule 11 at line 144 ([FAILED ACCESS]) has matched 0 events
> Rule 12 at line 156 ([FAILED ACCESS]) has matched 0 events
> Rule 13 at line 166 ([FAILED ACCESS]) has matched 0 events
> Rule 14 at line 176 ([FAILED ACCESS]) has matched 0 events
> Rule 15 at line 196 (Suspect su user access) has matched 0 events
> Rule 16 at line 209 (Suspect su user access) has matched 0 events
> Rule 17 at line 222 (Suspect su user access) has matched 0 events
> Rule 18 at line 244 (ulp.pl (update local password) SUCCESS list) has =
> matched 0 events
> Rule 19 at line 266 (Suppress lines that match: (?-xism:^S+s+d+ S+ > =
(S+) .*ulp:.* [FAIL])) has matched 0 events
> Rule 20 at line 275 (ulp.pl (update local password) FAIL list) has > =
matched 0 events
> Rule 21 at line 295 ($2 VPN tunnel established) has matched 0 events
> Rule 22 at line 319 (Delete $2 VPN_ context(s)) has matched 0 events
> Rule 23 at line 343 (Suspect (host|user) or VPN access) has matched 0 =
> events
> Rule 24 at line 359 (Linux system console access by $2) has matched 0 =
> events
> Rule 25 at line 367 (Linux system console access by $2) has matched 0 =
> events
> Rule 26 at line 381 (Compaq system console access by $2) has matched 0 =
> events
> Rule 27 at line 389 (Compaq system console access by $2) has matched 0 =
> events
> Rule 28 at line 404 (Solaris system console access by $2) has matched =
0 > events
> Rule 29 at line 412 (Solaris system console access by $2) has matched =
0 > events
>=20
> Input sources:
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> /etc/syslog_info (status: Open, received data: 0 lines, no context =
set)
>=20
> Content of input buffer:
> ------------------------------------------------------------
> Dec 18 12:37:02 ems0171e last message repeated 2 times
> Dec 18 12:37:02 ems0171e last message repeated 2 times
> Dec 18 12:37:02 ems0171e last message repeated 2 times
> Dec 18 12:37:02 ems0171e last message repeated 2 times
> Dec 18 12:37:02 ems0171e last message repeated 2 times
> Dec 18 12:37:02 ems0171e last message repeated 2 times
> Dec 18 12:37:02 ems0171e last message repeated 2 times
> Dec 18 12:37:02 ems0171e last message repeated 2 times
> Dec 18 12:37:02 ems0171e last message repeated 2 times
> Dec 18 12:37:02 ems0171e last message repeated 2 times
> ------------------------------------------------------------
>=20
> Pending events that are generated by rules:
> ------------------------------------------------------------
> ------------------------------------------------------------
>=20
> List of active event correlation operations:
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> Total: 0 elements
>=20
> List of active contexts:
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> Context Name: VPN_USER_srestrad_IP_10.1.60.4
> Context Name: VPN_IP_10.1.60.4
> Context Name: VPN_PID_11787
> Creation Time: Thu Dec 18 11:01:24 2003
> Lifetime: infinite
> 1 events associated with context:
> srestrad VPN tunnel IP is 10.1.60.4
> ------------------------------------------------------------
> Context Name: VPN_USER_prsutton_IP_10.1.60.46
> Context Name: VPN_IP_10.1.60.46
> Context Name: VPN_PID_9529
> Creation Time: Thu Dec 18 12:08:09 2003
> Lifetime: infinite
> 1 events associated with context:
> prsutton VPN tunnel IP is 10.1.60.46
> ------------------------------------------------------------
> Context Name: VPN_USER_cagonzal_IP_10.1.60.45
> Context Name: VPN_IP_10.1.60.45
> Context Name: VPN_PID_8494
> Creation Time: Thu Dec 18 10:16:09 2003
> Lifetime: infinite
> 1 events associated with context:
> cagonzal VPN tunnel IP is 10.1.60.45
> ------------------------------------------------------------
> Context Name: ems0171e_SYSLOG_LINE
> Creation Time: Thu Dec 18 14:08:10 2003
> Lifetime: 60 seconds
> 1 events associated with context:
> Dec 18 12:37:02 ems0171e last message repeated 2 times
> ------------------------------------------------------------
> Total: 4 elements
>=20
> Running children:
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> Total: 0 elements
>=20
> User-defined variables:
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=
> =3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
> %L =3D 'Dec 18 12:37:02 ems0171e last message repeated 2 times'
>=20
> %O =3D 'Dec 18 12:37:02 ems0171e last message repeated 2 times
> Dec 18 12:37:02 ems0171e last message repeated 2 times'
>=20
> Total: 2 elements
>=20
>=20
> -----Original Message-----
> From: Risto Vaarandi [mailto:ristov@...]
> Sent: Thursday, December 18, 2003 1:58 PM
> To: ANDERSON RUSSELL D (ANDY); =
simple-evcorr-users@...
> Subject: Re: [Simple-evcorr-users] Memory and CPU usage
>=20
>=20
> > We are seeing this system behavior on our netloghost running SEC.
> > `uname -a` Linux <hostname> 2.4.18-3
> >=20
> > 9:32am up 18:37, 1 user, load average: 1.77, 1.28, 1.10
> > 55 processes: 52 sleeping, 3 running, 0 zombie, 0 stopped
> > CPU states: 97.4% user, 2.5% system, 0.0% nice, 0.0% idle
> > Mem: 772984K av, 761464K used, 11520K free, 0K shrd, > =
10544K > buff
> > Swap: 264560K av, 144836K used, 119724K free > =
20396K > cached
> >=20
> > PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME COMMAND
> > 29999 root 25 0 669M 547M 16428 R 98.6 72.5 =
> > 230:10 sec.pl
> >=20
> > We just added another 512 memory to this machine because of =
swapping.
> > Has anyone seen this or have recommendations?
>=20
> hi,
>=20
> you can check the status of SEC by sending the SIGUSR1 signal to it > =
which creates a dump file /tmp/sec.dump. This file contains a detailed > =
information how many correlation operations are active, how many > =
contexts have been created, how big the contexts are, etc.
>=20
> I would suspect that although you are seeing both the high memory and =
> CPU usage, the former is causing the latter, e.g., if you have tens of =
> thousands of contexts and event correlation operations in the memory, =
it > will also require a lot of CPU time to process the correlation and =
> context lists.
> So as a starting point, I would check which entities are actually > =
consuming most memory, and try to reduce their number somehow.
>=20
> One common mistake that I've made in the past is to store *all* > =
interesting event sequences to contexts and keeping their in working > =
memory. For example, if you are using that strategy in an IDS for > =
memorizing individual attacks, you could easily run out of memory, since =
> the number of external IP addresses that incidents can originate from =
> can be extremely large. It could be worth of contemplating other > =
strategies, e.g., to store some information in disk files (or database) =
> instead of keeping them in the working memory. But first please check =
> the SEC dump and see what is actually eating your memory.
>=20
> hth,
> risto
>=20
> >=20
> > Russell Anderson (Andy)
> > Computer Applications
> > Phone: 602 -236-6434
> > Fax: 602-236-4327
> >=20
> >=20
> >=20
> >=20
> > -------------------------------------------------------
> > This SF.net email is sponsored by: IBM Linux Tutorials.
> > Become an expert in LINUX or just sharpen your skills. Sign up for =
> IBM's
> > Free Linux Tutorials. Learn everything from the bash shell to sys > =
admin.
> > Click now! http://ads.osdn.com/?ad_id=1278&alloc_id371&op=BFick
> > _______________________________________________
> > Simple-evcorr-users mailing list
> > Simple-evcorr-users@...
> > https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users
> >=20
>=20
> -----------------------------------------
> Hot Mobiil - helinad, logod ja pilts=F5numid!
> http://portal.hot.ee
>=20
>=20
>=20
>=20
>=20
> -------------------------------------------------------
> This SF.net email is sponsored by: IBM Linux Tutorials.
> Become an expert in LINUX or just sharpen your skills. Sign up for =
IBM's
> Free Linux Tutorials. Learn everything from the bash shell to sys =
admin.
> Click now! http://ads.osdn.com/?ad_id=1278&alloc_id371&op=BFick
> _______________________________________________
> Simple-evcorr-users mailing list
> Simple-evcorr-users@...
> https://lists.sourceforge.net/lists/listinfo/simple-evcorr-users
>=20
-----------------------------------------
Hot Mobiil - helinad, logod ja pilts=F5numid!
http://portal.hot.ee
|