From: Yves <yme...@pe...> - 2005-01-14 10:03:19
The problem Tim mentioned was fixed in 0.104.1 or 0.104.2 (it appears twice in the
ChangeLog).
This is something new, and I have found one bug (I think there are two). I will post a
patch soon. More in my next mail.
Yves
> This problem does not seem related to the one I was experiencing a
> while back (that one definitely got fixed, and the strace output looks
> totally different).
>
> However, I have the same problem on my production machine with 0.104.8
> It did not occur on my test platform. (aargh not good!)
>
> To Yves:
> If I understand correctly, at one point (strace line 13516) perfparse
> WRITES to the pipe 'Command OK unknown. Type help ....'.
> Of course it picks this back up at the input, and we're off for an
> infinite loop where it keeps saying it doesn't understand what it has
> written itself. Could it be that it loses track of where messages end
> and a new message begins?
>
> Tim
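
Tim's question about message boundaries can be illustrated with a small
sketch. This is not perfparse's code: the class name LineFramer and the
newline-terminated command format are assumptions made for illustration.
It buffers fixed-size chunks (the strace further down shows 10-byte reads)
and hands only complete lines to the caller, so a command split across two
reads is never interpreted early.

```python
class LineFramer:
    """Accumulate arbitrary chunks and return only complete lines."""

    def __init__(self):
        self._buf = b""

    def feed(self, chunk):
        """Add a chunk and return the list of lines completed by it."""
        self._buf += chunk
        # Everything before the last newline is complete; the rest stays buffered.
        *lines, self._buf = self._buf.split(b"\n")
        return lines

    def pending(self):
        """Bytes received so far that do not yet form a complete line."""
        return self._buf


framer = LineFramer()
stream = b"history\nhelp\nqu"  # hypothetical input, the last command is incomplete
lines = []
for i in range(0, len(stream), 10):  # 10-byte chunks, as in the trace
    lines.extend(framer.feed(stream[i:i + 10]))
```

After the loop, lines holds the two complete commands and the trailing
fragment is still waiting in the buffer for its newline.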
>
> nagios@plato:/EUnet/nagios/bin$ ./perfparsed --show_config
> Perfparsed [options]
>
> # File where Perfparse logs messages
> # Error_Log = "string"
> Error_Log = "/EUnet/nagios/var/log/perfparse_error"
>
> # Rotate Perfparse log files
> # Error_Log_Rotate = "Y/N"
> Error_Log_Rotate = "Yes"
>
> # Keep N days of error log. Compress recent logs and remove too old ones
> # Error_Log_Keep_N_Days = "value"
> Error_Log_Keep_N_Days = "7"
>
> # When perfparse cannot parse a line, it drops it to that file
> # Drop_File = "string"
> Drop_File = "/tmp/perfparse.drop"
>
> #
> # Drop_File_Rotate = "Y/N"
> Drop_File_Rotate = "Yes"
>
> # Keep N days of drop file log. Compress recent logs and remove too old ones
> # Drop_File_Keep_N_Days = "value"
> Drop_File_Keep_N_Days = "7"
>
> # Port for perfparsed server
> Put 0 or "" to disable the server
> # Server_Port = "value"
> Server_Port = "0"
>
> # Log source from nagios (or other tools) that perfparse will scan
> Authorized values: a file name, '-' for stdin, '|' for a fifo and '>'
> for a host:port socket
> For sockets, a command 'history' will be sent before retreiving the data
> # Service_Log = "string"
> Service_Log = "|/EUnet/nagios/var/perfparse.pipe"
>
> # Save the read position in the nagios log file ? If yes, perfparse
> will start from that position instead of from the beginning
> # Service_Log_Save_Position = "Y/N"
> Service_Log_Save_Position = "yes"
>
> # Path for files containing the read position for nagios log files
> # Service_Log_Position_Mark_Path = "string"
> Service_Log_Position_Mark_Path = ""
>
> # Lock file for perfparsed
> # Daemon_Lock = "string"
> Daemon_Lock = "/EUnet/nagios/var/perfparsed.lock"
>
> # Run perfparsed as a daemon
> # Daemonize = "Y/N"
> Daemonize = "no"
>
> # Perform some periodic cleanup every day
> # Periodic_Cleanup = "Y/N"
> Periodic_Cleanup = "yes"
>
> # Lock file for perfparsed periodic cleanup process
> # Periodic_Cleanup_Lock = "string"
> Periodic_Cleanup_Lock = "/EUnet/nagios/var/perfparsed_periodic_cleanup.lock"
>
> # Perform some periodic cleanup every day at HHMM
> # Periodic_Cleanup_Hour = "value"
> Periodic_Cleanup_Hour = "0230"
>
> # Dummy hostname if gethostname() does not work
> # Dummy_Hostname = "string"
> Dummy_Hostname = "dummy"
>
> # Don't store raw data
> # No_Raw_Data = "Y/N"
> No_Raw_Data = "no"
>
> # Don't store bin data
> # No_Bin_Data = "Y/N"
> No_Bin_Data = "no"
>
> # Path where storage modules are
> # Storage_Modules_Dir = "string"
> Storage_Modules_Dir = "/EUnet/nagios/lib"
>
> # Modules to load (Coma separated values)
> # Storage_Modules_Load = "string"
> Storage_Modules_Load = "mysql"
>
> # File to contain Storage Modules Status
> # Storage_Modules_Status_File = "string"
> Storage_Modules_Status_File = "/EUnet/nagios/var/storage_modules.status"
>
>
>
> # Storage Module : mysql
> # ==============================
>
> # Database user
> # DB_User = "string"
> DB_User = "nagios"
>
> # Database password
> # DB_Pass = "string"
> DB_Pass = "password"
>
> # Database name
> # DB_Name = "string"
> DB_Name = "nagios"
>
> # Database hostname
> # DB_Host = "string"
> DB_Host = "127.0.0.1"
>
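
The Service_Log comment in the dump above describes a small prefix
convention: a plain file name, '-' for stdin, '|' for a fifo and '>' for a
host:port socket. A minimal sketch of how such a value could be classified
follows. The function name classify_service_log and the returned tuples
are hypothetical, and this is not how perfparsed itself parses the option.

```python
def classify_service_log(value):
    """Map a Service_Log value to (kind, target), following the config comment."""
    if value == "-":
        return ("stdin", None)
    if value.startswith("|"):
        return ("fifo", value[1:])
    if value.startswith(">"):
        # Split on the last colon so the port is always the final component.
        host, _, port = value[1:].rpartition(":")
        return ("socket", (host, int(port)))  # raises ValueError on a bad port
    return ("file", value)


# The value from the dump above is a fifo.
kind, target = classify_service_log("|/EUnet/nagios/var/perfparse.pipe")
```

With the configuration shown, kind is "fifo", which matches the setup in
which the CPU problem is reported.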
>
> ---------- Forwarded message ----------
> From: Thomas Eriksson <tho...@sl...>
> Date: Thu, 13 Jan 2005 11:59:51 -0800
> Subject: [Perfparse-users] CPU usage again
> To: per...@li...
>
>
> Hi
>
> An issue with perfparsed using all the CPU when configured for
> "Method 4" was mentioned in a post by Tim Wuyts at the end of November.
> No resolution was ever posted as far as I can tell.
>
> Now, I am coming across the same problem. I did try Yves' suggestion
> to replace the last argument in the select() call with NULL, but it
> had no effect.
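
A NULL timeout makes select() block until a descriptor is ready, so it
only reduces CPU usage when the descriptor is actually idle. If unread
data is always pending, select() returns at once regardless of the
timeout. The sketch below shows this with an ordinary pipe in Python,
where a timeout of None plays the role of NULL. It assumes a POSIX system,
and the message written is made-up test data.

```python
import os
import select
import time

r, w = os.pipe()
os.write(w, b"Command 'x' unknown.\n")  # unread data keeps the read end ready

start = time.monotonic()
ready, _, _ = select.select([r], [], [], None)  # None: block with no timeout
elapsed = time.monotonic() - start

os.close(r)
os.close(w)
```

Here select() comes back immediately because the pipe is readable, which
is consistent with the change having no effect in Thomas's case.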
>
> I have far fewer service checks than Tim, only about 120.
> The host is a 3GHz P4 with 1GB of RAM running RHEL3u4.
> Nagios is version 2.0b1, plugins v1.4-beta1 and perfparse is 0.104.8
>
> I ran perfparsed with strace and it appears to run fine for
> a few rounds of events. Then some type of error message starts
> to appear in the message buffer: "Command xxx unknown. Type 'help' for
> help." It then quickly gets into a self-perpetuating loop, trying to deal
> with the error message by putting out more of them.
>
> When writing into a file instead of a pipe, I never see these messages.
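
The loop described above can be reproduced with a toy simulation. It
assumes, as the trace suggests but does not prove, that the first word of
each input line is treated as a command and that the two-line error reply
goes back into the channel being read. The names KNOWN and run, and the
command set itself, are hypothetical.

```python
from collections import deque

KNOWN = {"help", "history"}  # hypothetical command set


def run(inbox, reply_to, max_lines):
    """Handle up to max_lines lines from inbox, sending error replies to reply_to."""
    handled = 0
    while inbox and handled < max_lines:
        words = inbox.popleft().split()
        handled += 1
        if words and words[0] not in KNOWN:
            reply_to.append(f"Command '{words[0]}' unknown.")
            reply_to.append("Type 'help' for help.")
    return handled


# Replies fed back into the input: every line handled produces two more.
shared = deque(["bogus"])
looped = run(shared, shared, max_lines=20)

# Replies sent elsewhere: the single bad line is handled once and that is all.
inbox, outbox = deque(["bogus"]), deque()
clean = run(inbox, outbox, max_lines=20)
```

In the shared case the backlog grows by one line per line handled, and the
unknown commands reported are "Command" and "Type", the same two words
that show up in the write() calls of the trace.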
>
> Below is a short extract of the 'strace' output once it got into trouble;
> it gets very large very quickly. If more of the strace can be of any help
> I can post a larger section.
>
> thanks,
>
> Thomas
>
> --- strace ../bin/perfparsed ---
> ...
> select(5, [4], NULL, NULL, NULL) = 1 (in [4])
> rt_sigprocmask(SIG_BLOCK, [INT PIPE TERM], [], 8) = 0
> read(4, "and \'Comma", 10) = 10
> read(4, "nd\' unknow", 10) = 10
> read(4, "n.\nType \'h", 10) = 10
> write(4, "Command \'", 9) = 9
> write(4, "Command", 7) = 7
> write(4, "\' unknown.\nType \'help\' for help."..., 33) = -1 EAGAIN (Resource temporarily unavailable)
> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> utime("/usr/local/nagios/var/storage_modules.status", NULL) = 0
> time(NULL) = 1105566327
> time(NULL) = 1105566327
> waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes)
> select(5, [4], NULL, NULL, NULL) = 1 (in [4])
> rt_sigprocmask(SIG_BLOCK, [INT PIPE TERM], [], 8) = 0
> read(4, "elp\' for h", 10) = 10
> read(4, "elp.\nComma", 10) = 10
> write(4, "Command \'", 9) = 9
> write(4, "Type", 4) = 4
> write(4, "\' unknown.\nType \'help\' for help."..., 33) = -1 EAGAIN (Resource temporarily unavailable)
> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> utime("/usr/local/nagios/var/storage_modules.status", NULL) = 0
> time(NULL) = 1105566327
> time(NULL) = 1105566327
> waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes)
> select(5, [4], NULL, NULL, NULL) = 1 (in [4])
> rt_sigprocmask(SIG_BLOCK, [INT PIPE TERM], [], 8) = 0
> read(4, "nd \'TypeCo", 10) = 10
> read(4, "mmand \'Com", 10) = 10
> read(4, "mand\' unkn", 10) = 10
> read(4, "own.\nType ", 10) = 10
> write(4, "Command \'", 9) = 9
> write(4, "Command", 7) = 7
> write(4, "\' unknown.\nType \'help\' for help."..., 33) = 33
> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> utime("/usr/local/nagios/var/storage_modules.status", NULL) = 0
> time(NULL) = 1105566327
> time(NULL) = 1105566327
> waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes)
> select(5, [4], NULL, NULL, NULL) = 1 (in [4])
> rt_sigprocmask(SIG_BLOCK, [INT PIPE TERM], [], 8) = 0
> read(4, "\'help\' for", 10) = 10
> read(4, " help.\nCom", 10) = 10
> write(4, "Command \'", 9) = 9
> write(4, "Type", 4) = 4
> write(4, "\' unknown.\nType \'help\' for help."..., 33) = -1 EAGAIN (Resource temporarily unavailable)
> rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> utime("/usr/local/nagios/var/storage_modules.status", NULL) = 0
> time(NULL) = 1105566327
> time(NULL) = 1105566327
> waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes)
> select(5, [4], NULL, NULL, NULL) = 1 (in [4])
> ...
>
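
The EAGAIN results in the trace are what a non-blocking write returns when
the pipe buffer has no room left. A minimal sketch of that behaviour,
assuming a POSIX system and using an anonymous pipe rather than the named
fifo from the configuration:

```python
import errno
import os

r, w = os.pipe()
os.set_blocking(w, False)  # equivalent to setting O_NONBLOCK on the write end

written = 0
err = None
try:
    while True:  # nobody reads from r, so the buffer eventually fills up
        written += os.write(w, b"x" * 4096)
except BlockingIOError as exc:
    err = exc.errno  # EAGAIN: the write would have had to block

os.close(r)
os.close(w)
```

A writer that gets EAGAIN has to keep the unsent bytes and retry later.
If they are dropped instead, the reader sees a message with a missing
tail, which would fit the fused fragments such as "TypeCommand" in the
reads above, although the trace alone does not confirm that this is what
happens.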
-- 
- Homepage - http://ymettier.free.fr - http://www.logicacmg.com -
- GPG key - http://ymettier.free.fr/gpg.txt -
- Maitretarot - http://www.nongnu.org/maitretarot/ -
- Perfparse - http://perfparse.sf.net/ -