From: Yves <yme...@pe...> - 2005-01-14 10:03:19
The bug Tim mentioned was fixed in 0.104.1 or 0.104.2 (it appears twice in
the ChangeLog).

This is something new, and I found one bug (I think there are 2 there). I
will post a patch soon. More in my next mail.

Yves

> This problem does not seem related to the one I was experiencing a
> while back (that one definitely got fixed, and the strace output looks
> totally different).
>
> However, I have the same problem on my production machine with 0.104.8.
> It did not occur on my test platform. (aargh not good!)
>
> To Yves:
> If I understand correctly, at one point (strace line 13516) perfparse
> WRITES to the pipe 'Command OK unknown. Type help ....'.
> Of course it picks this back up at the input, and we're off for an
> infinite loop where it keeps saying it doesn't understand what it has
> written itself. Could it be that it loses track of where messages end
> and a new message begins?
>
> Tim
>
> nagios@plato:/EUnet/nagios/bin$ ./perfparsed --show_config
> Perfparsed [options]
>
> # File where Perfparse logs messages
> # Error_Log = "string"
> Error_Log = "/EUnet/nagios/var/log/perfparse_error"
>
> # Rotate Perfparse log files
> # Error_Log_Rotate = "Y/N"
> Error_Log_Rotate = "Yes"
>
> # Keep N days of error log. Compress recent logs and remove too old ones
> # Error_Log_Keep_N_Days = "value"
> Error_Log_Keep_N_Days = "7"
>
> # When perfparse cannot parse a line, it drops it to that file
> # Drop_File = "string"
> Drop_File = "/tmp/perfparse.drop"
>
> #
> # Drop_File_Rotate = "Y/N"
> Drop_File_Rotate = "Yes"
>
> # Keep N days of drop file log. Compress recent logs and remove too old ones
> # Drop_File_Keep_N_Days = "value"
> Drop_File_Keep_N_Days = "7"
>
> # Port for perfparsed server
> Put 0 or "" to disable the server
> # Server_Port = "value"
> Server_Port = "0"
>
> # Log source from nagios (or other tools) that perfparse will scan
> Authorized values: a file name, '-' for stdin, '|' for a fifo and '>'
> for a host:port socket
> For sockets, a command 'history' will be sent before retreiving the data
> # Service_Log = "string"
> Service_Log = "|/EUnet/nagios/var/perfparse.pipe"
>
> # Save the read position in the nagios log file ? If yes, perfparse
> will start from that position instead of from the beginning
> # Service_Log_Save_Position = "Y/N"
> Service_Log_Save_Position = "yes"
>
> # Path for files containing the read position for nagios log files
> # Service_Log_Position_Mark_Path = "string"
> Service_Log_Position_Mark_Path = ""
>
> # Lock file for perfparsed
> # Daemon_Lock = "string"
> Daemon_Lock = "/EUnet/nagios/var/perfparsed.lock"
>
> # Run perfparsed as a daemon
> # Daemonize = "Y/N"
> Daemonize = "no"
>
> # Perform some periodic cleanup every day
> # Periodic_Cleanup = "Y/N"
> Periodic_Cleanup = "yes"
>
> # Lock file for perfparsed periodic cleanup process
> # Periodic_Cleanup_Lock = "string"
> Periodic_Cleanup_Lock = "/EUnet/nagios/var/perfparsed_periodic_cleanup.lock"
>
> # Perform some periodic cleanup every day at HHMM
> # Periodic_Cleanup_Hour = "value"
> Periodic_Cleanup_Hour = "0230"
>
> # Dummy hostname if gethostname() does not work
> # Dummy_Hostname = "string"
> Dummy_Hostname = "dummy"
>
> # Don't store raw data
> # No_Raw_Data = "Y/N"
> No_Raw_Data = "no"
>
> # Don't store bin data
> # No_Bin_Data = "Y/N"
> No_Bin_Data = "no"
>
> # Path where storage modules are
> # Storage_Modules_Dir = "string"
> Storage_Modules_Dir = "/EUnet/nagios/lib"
>
> # Modules to load (Coma separated values)
> # Storage_Modules_Load = "string"
> Storage_Modules_Load = "mysql"
>
> # File to contain Storage Modules Status
> # Storage_Modules_Status_File = "string"
> Storage_Modules_Status_File = "/EUnet/nagios/var/storage_modules.status"
>
>
>
> # Storage Module : mysql
> # ==============================
>
> # Database user
> # DB_User = "string"
> DB_User = "nagios"
>
> # Database password
> # DB_Pass = "string"
> DB_Pass = "password"
>
> # Database name
> # DB_Name = "string"
> DB_Name = "nagios"
>
> # Database hostname
> # DB_Host = "string"
> DB_Host = "127.0.0.1"
>
>
> ---------- Forwarded message ----------
> From: Thomas Eriksson <tho...@sl...>
> Date: Thu, 13 Jan 2005 11:59:51 -0800
> Subject: [Perfparse-users] CPU usage again
> To: per...@li...
>
>
> Hi
>
> An issue with perfparsed using all the CPU when configured for
> "Method 4" was mentioned in a post by Tim Wuyts at the end of November.
> No resolution was ever posted as far as I can tell.
>
> Now, I am coming across the same problem. I did try Yves' suggestion
> to replace the last argument in the select() call with NULL, but it
> had no effect.
>
> I have far fewer service checks than Tim, only about 120.
> The host is a 3GHz P4 with 1GB of RAM running RHEL3u4.
> Nagios is version 2.0b1, plugins v1.4-beta1 and perfparse is 0.104.8
>
> I ran perfparsed with strace and it appears to be running fine for
> a few rounds of events. Then some type of error message starts
> to appear in the message buffer: "Command xxx unknown. Type 'help' for
> help." It then quickly gets into a self-degenerating loop trying to deal
> with the error message by putting out more of them.
>
> When writing into a file instead of a pipe, I never see these messages.
>
> Below is a short extract of the 'strace' once it got into trouble; it
> gets very large very quickly. If more of the strace can be of any help
> I can post a larger section.
>
> thanks,
>
> Thomas
>
> --- strace ../bin/perfparsed ---
> ...
> select(5, [4], NULL, NULL, NULL) = 1 (in [4])
> rt_sigprocmask(SIG_BLOCK, [INT PIPE TERM], [], 8) = 0
> read(4, "and \'Comma", 10) = 10
> read(4, "nd\' unknow", 10) = 10
> read(4, "n.\nType \'h", 10) = 10
> write(4, "Command \'", 9) = 9
> write(4, "Command", 7) = 7
> write(4, "\' unknown.\nType \'help\' for help."..., 33) = -1 EAGAIN (Resource temporarily unavailable)
> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> utime("/usr/local/nagios/var/storage_modules.status", NULL) = 0
> time(NULL) = 1105566327
> time(NULL) = 1105566327
> waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes)
> select(5, [4], NULL, NULL, NULL) = 1 (in [4])
> rt_sigprocmask(SIG_BLOCK, [INT PIPE TERM], [], 8) = 0
> read(4, "elp\' for h", 10) = 10
> read(4, "elp.\nComma", 10) = 10
> write(4, "Command \'", 9) = 9
> write(4, "Type", 4) = 4
> write(4, "\' unknown.\nType \'help\' for help."..., 33) = -1 EAGAIN (Resource temporarily unavailable)
> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> utime("/usr/local/nagios/var/storage_modules.status", NULL) = 0
> time(NULL) = 1105566327
> time(NULL) = 1105566327
> waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes)
> select(5, [4], NULL, NULL, NULL) = 1 (in [4])
> rt_sigprocmask(SIG_BLOCK, [INT PIPE TERM], [], 8) = 0
> read(4, "nd \'TypeCo", 10) = 10
> read(4, "mmand \'Com", 10) = 10
> read(4, "mand\' unkn", 10) = 10
> read(4, "own.\nType ", 10) = 10
> write(4, "Command \'", 9) = 9
> write(4, "Command", 7) = 7
> write(4, "\' unknown.\nType \'help\' for help."..., 33) = 33
> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> utime("/usr/local/nagios/var/storage_modules.status", NULL) = 0
> time(NULL) = 1105566327
> time(NULL) = 1105566327
> waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes)
> select(5, [4], NULL, NULL, NULL) = 1 (in [4])
> rt_sigprocmask(SIG_BLOCK, [INT PIPE TERM], [], 8) = 0
> read(4, "\'help\' for", 10) = 10
> read(4, " help.\nCom", 10) = 10
> write(4, "Command \'", 9) = 9
> write(4, "Type", 4) = 4
> write(4, "\' unknown.\nType \'help\' for help."..., 33) = -1 EAGAIN (Resource temporarily unavailable)
> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> utime("/usr/local/nagios/var/storage_modules.status", NULL) = 0
> time(NULL) = 1105566327
> time(NULL) = 1105566327
> waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes)
> select(5, [4], NULL, NULL, NULL) = 1 (in [4])
> ...
>
> -------------------------------------------------------
> The SF.Net email is sponsored by: Beat the post-holiday blues
> Get a FREE limited edition SourceForge.net t-shirt from ThinkGeek.
> It's fun and FREE -- well, almost....http://www.thinkgeek.com/sfshirt
> _______________________________________________
> Perfparse-users mailing list
> Per...@li...
> https://lists.sourceforge.net/lists/listinfo/perfparse-users
>

-- 
- Homepage - http://ymettier.free.fr - http://www.logicacmg.com -
- GPG key - http://ymettier.free.fr/gpg.txt -
- Maitretarot - http://www.nongnu.org/maitretarot/ -
- Perfparse - http://perfparse.sf.net/ -
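
P.S. To make the loop in the strace easier to picture: the trace shows read() and write() both going to descriptor 4, so every "unknown command" reply lands back in the input. Below is a minimal Python sketch of that feedback effect. It is a toy model, not perfparse's code: the command set, the function names and the "first word is the command" rule are assumptions made for illustration, and the real cause may well be different.

```python
from collections import deque
from typing import Deque, List, Optional

# Hypothetical command set; the real server's commands are not shown in the thread.
KNOWN_COMMANDS = {"help", "history"}


def reply_for(line: str) -> Optional[str]:
    """Return the error reply for an unrecognised line, or None if no reply is needed."""
    words = line.split()
    if not words or words[0] in KNOWN_COMMANDS:
        return None
    # Same wording as the messages visible in the strace output.
    return f"Command '{words[0]}' unknown.\nType 'help' for help.\n"


def simulate(first_line: str, rounds: int, shared_channel: bool) -> List[int]:
    """Return the number of lines waiting to be read after each round.

    shared_channel=True models replies written to the descriptor that is
    also read from; False models replies sent somewhere else.
    """
    pending: Deque[str] = deque([first_line])
    backlog = []
    for _ in range(rounds):
        # Handle only the lines that were already waiting when the round began.
        for _ in range(len(pending)):
            reply = reply_for(pending.popleft())
            if reply is not None and shared_channel:
                # The reply comes back as new input, one entry per line.
                pending.extend(reply.splitlines())
        backlog.append(len(pending))
    return backlog


if __name__ == "__main__":
    print(simulate("bogus", 4, shared_channel=True))
    print(simulate("bogus", 4, shared_channel=False))
```

In this model one stray line is enough: each reply is two lines and each of those produces another reply, so the backlog doubles every round when the channel is shared and drops to zero when it is not. In the real trace the pipe fills up and some writes fail with EAGAIN, which caps the growth but does not stop the loop.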