[Nfsen-discuss] profile not properly unlocking?
From: Ivan A. B. <iv...@li...> - 2006-07-27 22:28:38
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Hi Peter,
I couldn't think of a suitably descriptive subject :(
I am having a problem whereby a profile locks and, even after unlocking
it, the data does not get parsed by nfsen.
The relevant log entries are:
==============================
Jul 27 04:55:32 cyan nfsen[10456]: Signal launcher: live:200607270450
Jul 27 04:55:32 cyan nfsen[12828]: Launcher Cycle: received: live,
200607270450
Jul 27 04:55:32 cyan nfsen[12828]: Launcher Cycle: Time: 200607270450,
Profile: live, Module: PortTracker,
Jul 27 04:55:32 cyan nfsen[12828]: PortTracker run: Profile: live, Time:
200607270450
Jul 27 04:55:32 cyan nfsen[12828]: /usr/local/bin/nftrack -M
/srv/data/nfsen/profiles/live/switch01:switch02:switch03:switch08:switch10:switch17:switch19:switch20:switch26:switch28 -r
nfcapd.200607270450 -d /srv/data/nfsen/ports-db -A -t 200607270450 -s -p -w
/srv/data/nfsen/ports-db/portstat.txt
Jul 27 04:55:32 cyan nfsen[10456]: ERROR Update RRD time:
'200607270450', db: 'switch01', profile: 'live': ERROR while updating
RRD DB switch01.rrd: error mmapping file
/srv/nfsen/profiles/live/switch01.rrd
Jul 27 04:55:32 cyan nfsen[10456]: ERROR Update RRD time:
'200607270450', db: 'switch02', profile: 'live': ERROR while updating
RRD DB switch02.rrd: error mmapping file
/srv/nfsen/profiles/live/switch02.rrd
<SNIP>
Jul 27 04:55:32 cyan nfsen[10456]: ERROR Update RRD time:
'200607270450', db: 'switch28', profile: 'live': ERROR while updating
RRD DB switch28.rrd: error mmapping file
/srv/nfsen/profiles/live/switch28.rrd
Jul 27 04:55:32 cyan nfsen[10456]: Error GenGraph: Profile: live,
traffic-day: malloc fetch data area at /srv/nfsen/libexec/NfSenRRD.pm
line 239.
Jul 27 04:55:37 cyan nfsen[12828]: nftrack exited with value 0
Jul 27 04:55:37 cyan nfsen[12828]: /usr/local/bin/nftrack -d
/srv/data/nfsen/ports-db -S -p -w /srv/data/nfsen/ports-db/portstat24.txt
Jul 27 04:55:37 cyan nfsen[12828]: nftrack exited with value 0
Jul 27 04:55:37 cyan nfsen[12828]: PortTracker run: Done.
Jul 27 05:26:03 cyan nfsen[17957]: connection on UNIX socket
Jul 27 05:26:03 cyan nfsen[17957]: comm server started: 32317
Jul 27 05:26:03 cyan nfsen[17957]: comm child 32317 terminated
Jul 27 05:26:15 cyan nfsen[17957]: connection on UNIX socket
Jul 27 05:26:15 cyan nfsen[17957]: comm server started: 20489
Jul 27 05:26:15 cyan nfsen[17957]: comm child 20489 terminated
==============================
The final few lines repeat. The other thing to note is the "missing
time" between the port-tracker run and the subsequent log entries.
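The "missing time" can be found mechanically rather than by eye. A minimal sketch, assuming classic syslog timestamps with no year (so the year is passed in) and a hypothetical helper name `find_gaps`; the sample lines are copied from the log above:

```python
from datetime import datetime

def find_gaps(lines, year=2006, min_gap_seconds=600):
    """Return (previous, current, seconds) for consecutive timestamped
    entries that are more than min_gap_seconds apart."""
    stamps = []
    for line in lines:
        try:
            # "Jul 27 04:55:37" is 15 characters; the year is an assumption.
            stamps.append(datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S"))
        except ValueError:
            continue  # wrapped continuation lines carry no timestamp
    gaps = []
    for prev, cur in zip(stamps, stamps[1:]):
        delta = (cur - prev).total_seconds()
        if delta > min_gap_seconds:
            gaps.append((prev, cur, delta))
    return gaps

sample = [
    "Jul 27 04:55:37 cyan nfsen[12828]: PortTracker run: Done.",
    "Jul 27 05:26:03 cyan nfsen[17957]: connection on UNIX socket",
    "Jul 27 05:26:03 cyan nfsen[17957]: comm server started: 32317",
]
for prev, cur, delta in find_gaps(sample):
    print(f"{prev:%H:%M:%S} -> {cur:%H:%M:%S}: {delta:.0f}s without log entries")
```

The 600-second threshold is arbitrary; anything longer than one 5-minute cycle would flag the same gap here.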
I unlocked the profile (it was only the live profile that was locked
this time), however the nfcapd files didn't get parsed. Reloading nfsen
(nfsen reload) does not clear the problem .. but doing an "nfsen stop &&
nfsen start" fixes the problem .. the scheduler notices the unparsed
logfiles and schedules the parsing:
==============================
Jul 27 10:16:51 cyan nfsen[11914]: Starting /srv/nfsen/bin/nfsen.
Jul 27 10:16:51 cyan nfsen[2268]: Startup. Version: snapshot-20060412
$Id: nfsend 55 2006-04-12 08:35:59Z peter $
Jul 27 10:16:51 cyan nfsen[5741]: Launcher started: [26741]
Jul 27 10:16:51 cyan nfsen[11914]: Terminating /srv/nfsen/bin/nfsen.
Jul 27 10:16:51 cyan nfsen[27053]: Comm server started: [27053]
Jul 27 10:16:51 cyan nfsen[5741]: nfsend: [5741]
Jul 27 10:16:51 cyan nfsen[5741]: Run periodic at Thu Jul 27 10:15:00 2006
Jul 27 10:16:51 cyan nfsen[26741]: Frontend module 'PortTracker.php' found
Jul 27 10:16:51 cyan nfsen[5741]: Update profile live
Jul 27 10:16:51 cyan nfsen[26741]: PortTracker BEGIN
Jul 27 10:16:51 cyan nfsen[26741]: Loading plugin 'PortTracker': Success
Jul 27 10:16:51 cyan nfsen[26741]: PortTracker: Init
Jul 27 10:16:51 cyan nfsen[26741]: Initializing plugin 'PortTracker':
Success
Jul 27 10:16:51 cyan nfsen[26741]: ModList: live - PortTracker
Jul 27 10:16:52 cyan nfsen[5741]: nfsend: exit child[19941]
Jul 27 10:16:52 cyan nfsen[5741]: nfsend: exit child[31434]
Jul 27 10:16:52 cyan nfsen[5741]: nfsend: exit child[6586]
Jul 27 10:16:52 cyan nfsen[5741]: nfsend: exit child[26908]
Jul 27 10:16:53 cyan nfsen[5741]: nfsend: exit child[17002]
Jul 27 10:16:53 cyan nfsen[5741]: nfsend: exit child[19267]
Jul 27 10:16:53 cyan nfsen[5741]: nfsend: exit child[29482]
Jul 27 10:16:53 cyan nfsen[5741]: nfsend: exit child[4429]
Jul 27 10:16:54 cyan nfsen[5741]: nfsend: exit child[5155]
Jul 27 10:16:54 cyan nfsen[5741]: nfsend: exit child[27034]
Jul 27 10:16:54 cyan nfsen[26741]: Launcher Cycle: received: live,
200607270450
Jul 27 10:16:54 cyan nfsen[26741]: Launcher Cycle: Time: 200607270450,
Profile: live, Module: PortTracker,
Jul 27 10:16:54 cyan nfsen[26741]: PortTracker run: Profile: live, Time:
200607270450
Jul 27 10:16:54 cyan nfsen[26741]: /usr/local/bin/nftrack -M
/srv/data/nfsen/profiles/live/switch01:switch02:switch03:switch08:switch10:switch17:switch19:switch20:switch26:switch28 -r
nfcapd.200607270450 -d /srv/data/nfsen/ports-db -A -t 200607270450 -s -p -w
/srv/data/nfsen/ports-db/portstat.txt
Jul 27 10:16:54 cyan nfsen[5741]: Signal launcher: live:200607270450
Jul 27 10:16:56 cyan nfsen[5741]: nfsend: exit child[31871]
Jul 27 10:16:57 cyan nfsen[5741]: nfsend: exit child[31870]
Jul 27 10:16:57 cyan nfsen[5741]: nfsend: exit child[19788]
Jul 27 10:16:57 cyan nfsen[5741]: nfsend: exit child[10781]
Jul 27 10:16:57 cyan nfsen[5741]: nfsend: exit child[16382]
Jul 27 10:16:58 cyan nfsen[5741]: nfsend: exit child[31518]
Jul 27 10:16:58 cyan nfsen[5741]: nfsend: exit child[3112]
Jul 27 10:16:58 cyan nfsen[5741]: nfsend: exit child[13208]
Jul 27 10:16:58 cyan nfsen[5741]: nfsend: exit child[14701]
Jul 27 10:16:58 cyan nfsen[5741]: nfsend: exit child[31109]
Jul 27 10:16:58 cyan nfsen[5741]: Signal launcher: live:200607270455
Jul 27 10:16:58 cyan nfsen[5741]: nfsend: exit child[1296]
<SNIP>
==============================
It continues with the above pattern of lines (signal launcher, then a
number of "nfsend: exit child") until it has caught up with the backlog.
I'm not sure why neither the nfsend scheduler nor a "reload" picks up the
problem. I prefer not to do a stop/start because it interrupts the data
collection (it stops all sfcapd processes and then starts them all again).
I believe this may be related to RAM (the only "new" thing I've done in
the past few weeks is create a tmpfs partition which is using ~2GB of
the 4GB RAM), but I'd have thought the kernel would free up some buffer
space if required:
==============================
cyan log # free
             total       used       free     shared    buffers     cached
Mem:       3363480    3248756     114724          0      47376    3018772
-/+ buffers/cache:     182608    3180872
Swap:      3145720     604268    2541452
==============================
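For reference, the "-/+ buffers/cache" row is simple arithmetic on the "Mem:" row. A small worked check using only the numbers from the output above (variable names are mine):

```python
# Values in KiB, copied from the `free` output above.
total, used, free_kib = 3363480, 3248756, 114724
buffers, cached = 47376, 3018772

# "used" column of the "-/+ buffers/cache" row: memory held by processes.
used_by_apps = used - buffers - cached

# "free" column of the same row: what would be free if all buffers and
# cache could be dropped.
free_if_cache_dropped = free_kib + buffers + cached

print(used_by_apps)           # 182608
print(free_if_cache_dropped)  # 3180872
```

One caveat, stated as an assumption rather than something measured on this box: tmpfs pages are normally accounted under "cached" and can only be swapped out, not discarded, so part of that 3018772 KiB may not be freeable on demand.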
Ah ... I've just found this in kernel.log (19/07/2006 @ 11:00 was last
problem) [having written all the rest of the email]:
==============================
Jul 19 11:00:37 cyan grsec: From 195.66.232.38: signal 11 sent to
/srv/nfsen/bin/nfsend[nfsend:430] uid/euid:0/210 gid/egid:81/81, parent
/sbin/init[init:1] uid/euid:0/0 gid/egid:0/0
Jul 20 19:55:32 cyan grsec: From 195.66.232.38: signal 11 sent to
/srv/nfsen/bin/nfsend[nfsend:3018] uid/euid:0/210 gid/egid:81/81, parent
/sbin/init[init:1] uid/euid:0/0 gid/egid:0/0
Jul 27 04:55:33 cyan grsec: From 195.66.232.38: signal 11 sent to
/srv/nfsen/bin/nfsend[nfsend:10456] uid/euid:0/210 gid/egid:81/81,
parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0
==============================
As this segfault could be due to duff RAM, I'll try to schedule downtime
for a RAM check .. but can you think of anything else offhand?
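To tie the kernel log to the nfsen log, the PIDs can be matched up. A rough sketch, assuming the wrapped log lines have been rejoined into single lines; the regular expressions and function names are my own and only cover the grsec line format shown above:

```python
import re

# Matches grsec lines of the form
# "<stamp> <host> grsec: ... signal N sent to <path>[<comm>:<pid>] ..."
GRSEC_SIGNAL = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) \S+ grsec: .*?"
    r"signal (?P<signal>\d+) sent to (?P<path>\S+)\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)
# Matches the "nfsen[<pid>]" tag in the nfsen syslog lines.
SYSLOG_PID = re.compile(r"\bnfsen\[(\d+)\]")

def parse_grsec_signals(lines):
    """Return (stamp, signal, comm, pid) for each grsec signal line."""
    events = []
    for line in lines:
        m = GRSEC_SIGNAL.search(line)
        if m:
            events.append((m["stamp"], int(m["signal"]), m["comm"], int(m["pid"])))
    return events

def pids_seen(lines):
    """Return the set of PIDs that logged under the nfsen tag."""
    return {int(m.group(1)) for line in lines for m in SYSLOG_PID.finditer(line)}

kernel_log = [
    "Jul 19 11:00:37 cyan grsec: From 195.66.232.38: signal 11 sent to /srv/nfsen/bin/nfsend[nfsend:430] uid/euid:0/210 gid/egid:81/81, parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0",
    "Jul 20 19:55:32 cyan grsec: From 195.66.232.38: signal 11 sent to /srv/nfsen/bin/nfsend[nfsend:3018] uid/euid:0/210 gid/egid:81/81, parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0",
    "Jul 27 04:55:33 cyan grsec: From 195.66.232.38: signal 11 sent to /srv/nfsen/bin/nfsend[nfsend:10456] uid/euid:0/210 gid/egid:81/81, parent /sbin/init[init:1] uid/euid:0/0 gid/egid:0/0",
]
nfsen_log = [
    "Jul 27 04:55:32 cyan nfsen[10456]: Signal launcher: live:200607270450",
    "Jul 27 04:55:32 cyan nfsen[12828]: Launcher Cycle: received: live, 200607270450",
]

events = parse_grsec_signals(kernel_log)
known = pids_seen(nfsen_log)
matches = [e for e in events if e[3] in known]
for stamp, signal, comm, pid in matches:
    print(f"{stamp}: signal {signal} to {comm} (pid {pid}), also seen as nfsen[{pid}]")
```

With the excerpts above, this picks out pid 10456: the same process that logged the RRD mmap errors at 04:55:32 received signal 11 one second later.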
Cheers
Ivan
--
Ivan Beveridge
<iv...@li...> http://www.linx.net/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFEyIxvQQZN5jq7vncRAvGMAJ4kuDInJ1eTmqBIJjDPBCE+yGXJxgCfRPFi
1yl8DMFwRgMkCra5N+aW+b4=
=WGOQ
-----END PGP SIGNATURE-----