Re: [Nfsen-discuss] FW: Live profile filling up the drive (profile.dat doesn't get updated)
From: Peter H. <pet...@sw...> - 2008-04-16 13:01:44
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

- --On April 15, 2008 10:27:58 -0400 Bogdan Dumitriu <Bog...@co...> wrote:

| I fixed the nfslock issues. For some reason nfslock has to run on the
| client. So I did "service nfslock start" and that fixed the kernel
| errors. However, nfcapd still complains when I stop it (no errors while
| it's running):
|
| Apr 14 23:03:11 pandora4 /usr/local/bin/nfcapd[6337]: ioctl(F_WRLCK) error in nfstatfile.c line 339: Input/output error
| Apr 14 23:03:11 pandora4 /usr/local/bin/nfcapd[6337]: Terminating nfcapd.
|
| Also the files are ok and nfsen is able to read them and generate the
| graphs. So I guess I can ignore those.
|
| Now I'm back to my original problem: nfsend is not updating the size for
| the live profile. I see the graphs, I can query the flows, get
| statistics, etc., it's just the size of the live profile that is not
| growing. The other profiles are fine. I also don't have this problem on
| the devel machine where both nfsen and nfcapd are on the same machine
| and nfcapd writes locally.
|
| So could this be NFS related?

Yes - most likely it is. nfcapd/nfexpire/nfsend use file locks to arbitrate
concurrent file access to .nfstat. So you have to verify that proper locking
for NFS works, which is a kind of headache most of the time. Check your
rpc.lockd ( lockd ) and friends on the NFS host.

	- Peter

| This is how I mount the share:
|
| artemis:/opt/data/netflow /data nfs rw 0 0
|
| I also tried:
| artemis:/opt/data/netflow /data nfs rw,hard,intr,tcp,lock 0 0
|
| Still no luck!
|
| All is ok if I run nfexpire manually: nfexpire -p -r /profile_data &&
| nfexpire -p -s 900G -w 90 -e /profile_data
|
| I also ran the command with sudo -u apache and it was ok.
|
| selinux is completely disabled on both the analyzer and the collector.
|
| There are no errors in the logs:
|
| Apr 15 04:10:15 artemis nfsen[4305]: Run periodic at Tue Apr 15 04:10:00 2008
| Apr 15 04:10:15 artemis nfsen[4305]: Prepare profiling './live'
| Apr 15 04:10:15 artemis nfsen[4305]: 0 channels/alerts to profile
| Apr 15 04:10:15 artemis nfsen[4305]: No continous profiles - nothing to profile
| Apr 15 04:10:15 artemis nfsen[4305]: Update profile live in group .
| Apr 15 04:10:15 artemis nfsen[4305]: Add channel size 12099584
| Apr 15 04:10:15 artemis nfsen[4305]: Set new profile size: 12099584
| Apr 15 04:10:15 artemis nfsen[4305]: Add .:live:200804150405 for plugin processing
| Apr 15 04:10:15 artemis nfsen[7087]: Run periodic at Tue Apr 15 04:10:00 2008
| Apr 15 04:10:15 artemis nfsen[7087]: Prepare profiling './live'
| Apr 15 04:10:15 artemis nfsen[7087]: 0 channels/alerts to profile
| Apr 15 04:10:15 artemis nfsen[7087]: No continous profiles - nothing to profile
| Apr 15 04:10:15 artemis nfsen[7087]: Run plugins for 200804150405
| Apr 15 04:10:15 artemis nfsen[7087]: Run plugins done.
| Apr 15 04:10:15 artemis nfsen[7087]: Check alerts for Tue Apr 15 04:05:00 2008
| Apr 15 04:10:15 artemis nfsen[7087]: Check alerts done.
| Apr 15 04:10:15 artemis nfsen[7087]: Run expire at Tue Apr 15 04:10:00 2008
| Apr 15 04:10:15 artemis nfsen[7087]: End expire at Tue Apr 15 04:10:00 2008
|
| The following lines are always the same (even though there are new files
| in the data folder and I can see they've been processed and the graphs
| updated):
|
| Apr 15 04:15:15 artemis nfsen[7087]: Add channel size 12099584
| Apr 15 04:15:15 artemis nfsen[7087]: Set new profile size: 12099584
|
| It seems that nfsend is not able to update the .nfstat. It only gets
| updated when I run nfexpire manually.
|
| [root@pandora4 hala1]# cat .nfstat
| first=1208228400
| last=1208265300
| size=6599692288
| maxsize=0
| numfiles=124
| lifetime=0
| watermark=95
| status=0
|
| Could this be related to Fedora or the nfs version?
|
| Linux pandora4 2.6.18-1.2798.fc6 #1 SMP Mon Oct 16 14:54:20 EDT 2006 i686 i686 i386 GNU/Linux
| nfs-utils-1.0.9-8.fc6
| nfs-utils-lib-1.0.8-7.2
|
| Thanks,
| Bogdan.
|
| -----Original Message-----
| From: Bogdan Dumitriu
| Sent: April 14, 2008 5:52 PM
| To: Peter Haag; nfs...@li...
| Subject: RE: [Nfsen-discuss] Live profile filling up the drive (profile.dat doesn't get updated)
|
| Hi Peter,
|
| Sorry for replying so late. I've been busy rebuilding everything from
| scratch. :-)
|
| I'm thinking it's NFS related. I assume that for some reason nfcapd is
| not able to lock the files or something like that. We have a distributed
| setup: 3 collectors writing to a network share using NFS v3. I only get
| errors when I stop the nfcapd (no errors when I start it):
|
| Apr 14 17:40:25 pandora4 kernel: lockd: cannot monitor 10.0.0.194
| Apr 14 17:40:25 pandora4 kernel: lockd: failed to monitor 10.0.0.194
| Apr 14 17:40:25 pandora4 /usr/local/bin/nfcapd[2449]: ioctl(F_WRLCK) error in nfstatfile.c line 339: No locks available
| Apr 14 17:40:25 pandora4 /usr/local/bin/nfcapd[2449]: Terminating nfcapd.
| Apr 14 17:40:25 pandora4 /usr/local/bin/nfcapd[2446]: Ident: 'hala1' Flows: 124710, Packets: 2259605, Bytes: 1526942249, Sequence Errors: 1, Bad Packets: 0
| Apr 14 17:40:25 pandora4 kernel: lockd: cannot monitor 10.0.0.194
| Apr 14 17:40:25 pandora4 kernel: lockd: failed to monitor 10.0.0.194
| Apr 14 17:40:25 pandora4 /usr/local/bin/nfcapd[2446]: ioctl(F_WRLCK) error in nfstatfile.c line 339: No locks available
| Apr 14 17:40:25 pandora4 /usr/local/bin/nfcapd[2446]: Terminating nfcapd.
|
| Is there a better way than writing to the share in real-time? Maybe
| write locally and rsync hourly or something like that?
|
| Thanks,
| Bogdan.
|
| -----Original Message-----
| From: Peter Haag [mailto:pet...@sw...]
| Sent: April 2, 2008 3:27 AM
| To: Bogdan Dumitriu; nfs...@li...
| Subject: Re: [Nfsen-discuss] Live profile filling up the drive (profile.dat doesn't get updated)
|
| -----BEGIN PGP SIGNED MESSAGE-----
| Hash: SHA1
|
| Hi Bogdan,
| It looks like your nfcapd collector processes cannot update the
| stat files.
| Make sure the UID for nfcapd can write and update the files. Also check
| the syslog daemon message file, as problems are reported there. Also make
| sure that any SELinux policies are set correctly if you have them in
| place.
| Let me know about the results
|
| - Peter
|
| - --On March 28, 2008 11:31:34 -0400 Bogdan Dumitriu <Bog...@co...> wrote:
|
| | Hello everybody,
| |
| | First a bit about our system: Linux 2.6.18-1.2849.fc6 #1 SMP
| |
| | We tried both the latest stable and beta:
| | nfsen: 1.3b-20070824 $Id: nfsen 18 2007-07-20 12:33:25Z phaag $
| |
| | We have recently started to use nfsen/nfdump and realized it's not
| | updating the size of the live profile and filled the whole drive. It's
| | strange that all the other profiles are fine. Both the gui and "nfsen
| | -l live" show "Size: 0" for the live profile:
| |
| | [root@brawn bin]# ./nfsen -l live
| | name live
| | group (nogroup)
| | tcreate Fri Mar 28 10:20:00 2008
| | tstart Fri Mar 28 10:23:54 2008
| | tend Fri Mar 28 11:00:00 2008
| | updated Fri Mar 28 11:00:00 2008
| | expire 0 hours
| | size 0
| | maxsize 0
| | type live
| | locked 0
| | status OK
| | version 130
| | channel pego10k sign: + colour: #0000ff order: 1 sourcelist:
| | pego10k ERR Channel info file missing for channel 'pego10k' in 'live'
| | Files: 0 Size: 0
| |
| | even though the live profile is ~800MB:
| |
| | [root@brawn bin]# du -bs /data/nfsen/profiles-data/live/
| | 904764050 /data/nfsen/profiles-data/live/
| |
| | By default ".nfstat" (channel info in
| | $DATADIR/profile-data/live/channel/.nfstat) is empty and it doesn't
| | get updated:
| |
| | Mar 28 10:50:15 brawn nfsen[12577]: Error reading channel stat information.
| | Missing key 'first'
| |
| | "nfsen -r live" will regenerate ".nfstat" and "profile.dat" with the
| | right info (including the size)
| |
| | [root@brawn bin]# ./nfsen -r live
| | name live
| | group (nogroup)
| | tcreate Fri Mar 28 10:20:00 2008
| | tstart Fri Mar 28 10:20:00 2008
| | tend Fri Mar 28 11:10:00 2008
| | updated Fri Mar 28 11:10:00 2008
| | expire 0 hours
| | size 801.9 MB
| | maxsize 0
| | type live
| | locked 0
| | status OK
| | version 130
| | channel pego10k sign: + colour: #0000ff order: 1 sourcelist:
| | pego10k Files: 11 Size: 840855552
| |
| | [root@brawn bin]#
| | [root@brawn bin]# cat /data/nfsen/profiles-data/live/pego10k/.nfstat
| | first=1206714000
| | last=1206717000
| | size=840855552
| | maxsize=0
| | numfiles=11
| | lifetime=0
| | watermark=0
| | status=0
| |
| | But unfortunately they stay that way and it will no longer get updated
| | automatically.
| |
| | Mar 28 11:20:15 brawn nfsen[12981]: Update profile live in group .
| | Mar 28 11:20:15 brawn nfsen[12981]: Add channel size 840855552
| | Mar 28 11:20:15 brawn nfsen[12981]: Set new profile size: 840855552
| |
| | Mar 28 11:25:15 brawn nfsen[12981]: Update profile live in group .
| | Mar 28 11:25:15 brawn nfsen[12981]: Add channel size 840855552
| | Mar 28 11:25:15 brawn nfsen[12981]: Set new profile size: 840855552
| |
| | ------------------- and so on ----------------------------
| |
| | At the beginning we thought we did something wrong so we tried to
| | recompile the whole thing, remove all the channels, re-add the
| | channels, expire all the files, add a maxsize to the live profile,
| | remove the max size, rebuild the profile, etc. We've tried everything
| | we could have thought of! This morning we actually did a new clean
| | install of nfsen/nfdump on a different machine and, as you can see, the
| | size of the live profile still doesn't get updated automatically!
| |
| | Has anybody else run into this problem? Is this a known bug? Is there
| | a fix?
| | Are we doing something wrong?
| |
| | Thanks,
| | Bogdan.
| |
| | Do you really need to print this email? Help preserve our environment!
| | __________________________________________________________
| |
| | The information in this message, including in all attachments, is
| | confidential or privileged. In the event you have received this
| | message in error and are not the intended recipient, you are hereby
| | advised that any use, copying or reproduction of this document is
| | strictly forbidden. Please notify immediately the sender of this error
| | and destroy this message, including its attachments, as the case may be.
| | __________________________________________________________
|
| - --
| _______ SWITCH - The Swiss Education and Research Network ______
| Peter Haag, Security Engineer, Member of SWITCH CERT
| PGP fingerprint: D9 31 D5 83 03 95 68 BA FB 84 CA 94 AB FC 5D D7
| SWITCH, Werdstrasse 2, P.O. Box, CH-8021 Zurich, Switzerland
| E-mail: pet...@sw...
| Web: http://www.switch.ch/
| -----BEGIN PGP SIGNATURE-----
| Version: GnuPG v1.4.3 (Darwin)
|
| iQCVAwUBR/MnPf5AbZRALNr/AQL4xwQAiJkq2hwWVcyLbB9XuVwoJV0DTT/wHyS/
| NDmOxKoAjxPnUt79MoceZydwGsyuezsTva0mOudBN904i/3h3L9oH5C+pS70RmFN
| PcLLz9IuIVimNw/hp65jzLKvwUvdZt4jAM+TjEpZbvESIRreZ7eSrQ0gmnSyLPrW
| cCLZxlBGCkc=
| =LhNE
| -----END PGP SIGNATURE-----
- --
_______ SWITCH - The Swiss Education and Research Network ______
Peter Haag, Security Engineer, Member of SWITCH CERT
PGP fingerprint: D9 31 D5 83 03 95 68 BA FB 84 CA 94 AB FC 5D D7
SWITCH, Werdstrasse 2, P.O. Box, CH-8021 Zurich, Switzerland
E-mail: pet...@sw... Web: http://www.switch.ch/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (Darwin)

iQCVAwUBSAX3+P5AbZRALNr/AQJWZwP9Gp/wN0iO8MWCQzUgwMhWURFwTUDUiIcY
1qQOfck09no1nkSE4h+61jAvNy0byR3RnRDjEul7xxURiMvMWEygqtKbO3EMMfb/
Ax/RiM8i0NBOCFog88WGnzpCE5N2PxqtK6ddDC4/5TbOv2MFd4Zliw5Jy5aO7p+9
fuaHafKmTiQ=
=mqfe
-----END PGP SIGNATURE-----
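
Editorial note: the advice in this message is to verify that file locking works on the NFS mount. One rough way to check that from a collector is sketched below in Python. It is only an illustration, not part of nfdump or nfsen: the function name can_lock is made up for this note, and it assumes that nfcapd takes fcntl()-style write locks, which is inferred from the "F_WRLCK" text in the log lines quoted in the thread rather than confirmed from the source. The script creates a scratch file in the given directory, tries to take and release an exclusive lock on it, and reports the error if the lock is refused.

```python
import errno
import fcntl
import os
import tempfile


def can_lock(directory):
    """Try to take and release an exclusive fcntl() lock in `directory`.

    Returns (True, "") if the lock was granted, otherwise (False, reason).
    The scratch file is removed again in either case.
    """
    # Hypothetical scratch file name; any writable file on the mount would do.
    fd, path = tempfile.mkstemp(prefix=".locktest.", dir=directory)
    try:
        try:
            # Non-blocking exclusive lock; fcntl.lockf() uses fcntl() record locks.
            fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            name = errno.errorcode.get(exc.errno, "?")
            return False, "{}: {}".format(name, exc.strerror)
        fcntl.lockf(fd, fcntl.LOCK_UN)
        return True, ""
    finally:
        os.close(fd)
        os.unlink(path)
```

Calling can_lock("/data") on a collector (with /data being the mount point from the fstab lines quoted in the thread) as the user that runs nfcapd should return (True, "") when locking works. A result such as (False, "ENOLCK: No locks available") would match the nfcapd error quoted in the thread and point at lockd/statd on the client or the server. A successful result only shows that a single lock can be obtained; it does not prove that locks are honoured between different hosts.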