RE: RE: [Perfparse-users] Warning/critical values not going into the database... Again

Ben,

Please see my answers below.

Paulo Afonso Graner Fessel
UNIX Environment and Systems Administrator
pau...@pr...
OWT
Phone: +55 (11) 3038-6554
Fax: +55 (11) 3038-6508
http://www.primesys.com.br

> The problem with the range.  Do you use a single value for
> the WARN and CRIT, or do you use the range format?  Can you
> please provide an example log line so that we can ensure
> there is no misunderstanding of the range and how it should
> be interpreted?

No, I use single values for WARN and CRIT:

1098196046      psdes01bill     /usr    DISK OK - free space: /usr 775 MB (31%):        OK       /usr=775MB;2446;2471;0;2496
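
For reference, the last field of that line follows the Nagios plugin performance data layout, label=value[UOM];warn;crit;min;max. A minimal C sketch of splitting such a token when the thresholds are single values, as they are here, might look like the following. The struct and function names are hypothetical and are not taken from the perfparse sources, and quoted labels are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for illustration; not a perfparse structure. */
struct perf_metric {
    char   label[64];
    double value;
    char   uom[16];
    double warn, crit, min, max;
};

/* Parse "label=value[UOM];warn;crit;min;max" where every threshold is a
 * plain number. Returns 0 on success, -1 on malformed input. */
int parse_perfdata(const char *token, struct perf_metric *out)
{
    assert(token != NULL && out != NULL);

    /* The label is everything before the first '='. */
    const char *eq = strchr(token, '=');
    if (eq == NULL || eq == token || (size_t)(eq - token) >= sizeof out->label)
        return -1;
    memcpy(out->label, token, (size_t)(eq - token));
    out->label[eq - token] = '\0';

    /* The numeric value follows the '='. */
    char *end;
    out->value = strtod(eq + 1, &end);
    if (end == eq + 1)
        return -1;

    /* The unit of measure runs up to the first ';' (or end of string). */
    size_t uom_len = strcspn(end, ";");
    if (uom_len >= sizeof out->uom)
        return -1;
    memcpy(out->uom, end, uom_len);
    out->uom[uom_len] = '\0';

    /* The remaining four fields are read as plain numbers. */
    if (sscanf(end + uom_len, ";%lf;%lf;%lf;%lf",
               &out->warn, &out->crit, &out->min, &out->max) != 4)
        return -1;
    return 0;
}
```

Called on "/usr=775MB;2446;2471;0;2496", this sketch fills in the label /usr, the value 775 with unit MB, warn 2446, crit 2471, min 0 and max 2496.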

> However, this looks like a simple bug, thanks for finding this.
>
> The State calculation also looks like a simple bug although I
> have not yet looked at the code.  Again, if you can supply a
> line or two of raw log this would be extremely useful.

Unfortunately this is not possible at the moment, as I'm not running
0.101.1 now. I will see if I can set up a test server to reproduce this
here.
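
As background for the state question: in the Nagios plugin guidelines a threshold is a range written [@]start:end, and a single value N is shorthand for the range 0..N, with an alert raised when the metric falls outside it. The C sketch below parses that notation and derives a state from it. All names are hypothetical and it is not the perfparse implementation. It would be consistent with single-value thresholds landing in the range end rather than the range start, but that is not confirmed here against the perfparse parser.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type for illustration; not taken from perfparse. */
struct threshold_range {
    double start;   /* -INFINITY when written as "~" */
    double end;     /* +INFINITY when omitted */
    bool   inside;  /* true for the "@" form: alert when inside the range */
};

/* Parse "[@]start:end", "start:", "~:end" or a bare "end".
 * Returns 0 on success, -1 on malformed input. */
int parse_range(const char *text, struct threshold_range *r)
{
    assert(text != NULL && r != NULL);
    r->start = 0.0;
    r->end = INFINITY;
    r->inside = false;

    if (*text == '@') { r->inside = true; text++; }
    if (*text == '\0') return -1;

    const char *colon = strchr(text, ':');
    char *stop;
    if (colon == NULL) {
        /* Single value: shorthand for the range 0..value. */
        r->end = strtod(text, &stop);
        return (stop == text || *stop != '\0') ? -1 : 0;
    }
    if (*text == '~') {
        if (text + 1 != colon) return -1;
        r->start = -INFINITY;
    } else if (text != colon) {
        r->start = strtod(text, &stop);
        if (stop != colon) return -1;
    }
    if (colon[1] != '\0') {
        r->end = strtod(colon + 1, &stop);
        if (stop == colon + 1 || *stop != '\0') return -1;
    }
    return (r->start <= r->end) ? 0 : -1;
}

/* True when the value should raise an alert for this range. */
bool range_alerts(const struct threshold_range *r, double v)
{
    bool outside = (v < r->start) || (v > r->end);
    return r->inside ? !outside : outside;
}

/* 0 = OK, 1 = WARNING, 2 = CRITICAL; a NULL range means "no threshold". */
int metric_state(double v, const struct threshold_range *warn,
                 const struct threshold_range *crit)
{
    if (crit != NULL && range_alerts(crit, v)) return 2;
    if (warn != NULL && range_alerts(warn, v)) return 1;
    return 0;
}
```

With the thresholds 2446 and 2471 from the sample line, a value of 775 evaluates to 0 in this sketch. A threshold stored as 0 becomes the range 0..0, so a value such as 99.99 evaluates to 2 (CRITICAL) unless a missing threshold is passed as "no threshold" instead. Whether that is what happens in 0.101.1 is only a guess.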

> Your problem with speed is surprising.  I test on an 800MHz
> low-memory Linux box, which ensures any speed problems slap
> me in the face fast.
> The difference between the old and new was about 10% during
> my testing.
> Can you please give us more information about your setup?
> Is your MySQL local or on a remote server?

It's a local server.

> What method have you chosen to import the log information?

I've just told perfparse-log2db to read the service log, as I did other
times with earlier versions of perfparse. It took more or less 4 hours
to parse a 67 MB file.
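
To put those figures in terms of lines per second, a rough conversion can be sketched as below. The function name is hypothetical, and the average line length is an assumed input rather than something measured on this log.

```c
#include <assert.h>

/* Average import rate in lines per second.
 * avg_line_bytes is an assumed figure, not a measurement. */
double lines_per_second(double file_bytes, double elapsed_seconds,
                        double avg_line_bytes)
{
    assert(elapsed_seconds > 0.0 && avg_line_bytes > 0.0);
    return file_bytes / elapsed_seconds / avg_line_bytes;
}
```

Taking 67 MB as 67 * 1024 * 1024 bytes and 4 hours as 14400 seconds gives a byte rate of about 4.9 KB/s. At an assumed 100 bytes per line that is roughly 49 lines per second, the same order of magnitude as the 70 lines/sec ceiling mentioned in the quoted message below.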

> What commands do you run?

The equivalent of perfparse-log2db -r -s -l <pathtoservice.log>

> Lastly my apologies for the documentation.  This is beta
> code, deliberately not yet on version 1.0.  We obviously have
> some work to do here.

OK. You can count on my continued support of PP.

[]'s
Paulo

> Paulo Afonso Graner Fessel wrote:
>
> > Ben,
> >
> > I've found that perfparse now has the infrastructure to use ranges of
> > warning/critical data, and the problem is that the database hasn't
> > changed in order to use these values (as far as I can see, at least).
> >
> > Part of the problem is in save_bin_data in storage_mysql.c:
> >
> >         g_string_append_printf(s_SQL, ", %s",
> >                 getSafeD(perf->d[PERF_VALUE_WARN_START]));
> >         g_string_append_printf(s_SQL, ", %s",
> >                 getSafeD(perf->d[PERF_VALUE_CRIT_START]));
> >         g_string_append_printf(s_SQL, ", %s)",
> >                 getSafeD(perf->metric_state));
> >
> > I've changed it to:
> >
> >         g_string_append_printf(s_SQL, ", %s",
> >                 getSafeD(perf->d[PERF_VALUE_WARN_END]));
> >         g_string_append_printf(s_SQL, ", %s",
> >                 getSafeD(perf->d[PERF_VALUE_CRIT_END]));
> >         g_string_append_printf(s_SQL, ", %s)",
> >                 getSafeD(perf->metric_state));
> >
> > and the data began to show up in perfdata_service_bin. However,
> > metric_state values were still wrong, and indeed inconsistent with
> > perfdata_service_raw. In the example I sent you, the metrics were 2
> > (CRITICAL) in perfdata_service_bin; OTOH they were 0 (NORMAL) in
> > perfdata_service_raw. And the graphs still did not show the guides
> > for warning/critical thresholds. At this point, I rolled back to
> > release 0.100.7.
> >
> > I've also found performance problems in 0.101.1: it feels much
> > slower than 0.100.7 when reading the serviceperf.log file and putting
> > it into the database. In 0.100.7 I got thousands of lines per second;
> > in 0.101.1 this number is never greater than 70 lines/sec. And I'm
> > running on a machine with 512 MB RAM and 2x Intel Xeon 3.06 GHz! I've
> > tried to optimize MySQL settings to no avail, and this was another
> > reason that led me to roll back to 0.100.7.
> >
> > Also, the documentation is a little confusing in this release. It
> > took me some time to understand that, with --default-perfdata, I
> > wouldn't have to use crontab entries anymore to update the database.
> > There's no mention of this in the documentation, and while it may
> > sound obvious to the development team, it may not be so clear to the
> > users.
> >
> > Don't get me wrong: I find that perfparse is THE solution for
> > gathering performance data for Nagios. However, I feel that this
> > wasn't the right time to release 0.101.1 because it clearly lacks
> > polish and has rough edges in the database architecture and in the
> > parsing of plugin output.
> >
> > Why don't you add a compilation switch to disable these bleeding-edge
> > features? It would make life easier for people who rely on perfparse
> > for data gathering on production systems.
> >
> > []'s and keep up the great work,
> >
> > Paulo Afonso Graner Fessel
> > UNIX Environment and Systems Administrator
> > pau...@pr...
> > OWT
> > Phone: +55 (11) 3038-6554
> > Fax: +55 (11) 3038-6508
> > http://www.primesys.com.br
> >
> >>-----Original message-----
> >>From: Ben Clewett [mailto:Be...@cl...]
> >>Sent: Tuesday, October 19, 2004 04:25
> >>To: Paulo Afonso Graner Fessel
> >>Cc: per...@li...
> >>Subject: Re: [Perfparse-users] Warning/critical values not going into
> >>the database... Again
> >>
> >>Paulo,
> >>
> >>Can you please send me a sample of your data? I want to replicate and
> >>fix this problem.
> >>
> >>Regards,
> >>
> >>Ben Clewett.
> >>
> >>Paulo Afonso Graner Fessel wrote:
> >>
> >>>Hello, folks.
> >>>
> >>>I've just upgraded to 0.101.1 and I'm noticing that warning/critical
> >>>values from plugins are not getting into the database again. But
> >>>unlike last time (and worse), it seems that the state field of
> >>>perfdata_service_bin is also incorrect:
> >>>
> >>>+-----------+---------------------+--------+---------------------+-------+------+----------+-------+
> >>>| host_name | service_description | metric | ctime               | value | warn | critical | state |
> >>>+-----------+---------------------+--------+---------------------+-------+------+----------+-------+
> >>>| D01       | Cache Hits          | lib    | 2004-10-18 17:35:58 | 99.99 |    0 |        0 |     2 |
> >>>| D01       | Cache Hits          | buffer | 2004-10-18 17:35:58 | 99.95 |    0 |        0 |     2 |
> >>>+-----------+---------------------+--------+---------------------+-------+------+----------+-------+
> >>>
> >>>In my definition of this particular service, I want these two metrics
> >>>to be as high as possible, with a maximum value of 100%. However, as
> >>>you can see, I don't have warning and critical values for these
> >>>metrics, and their state as determined by perfparse is 2 (CRITICAL);
> >>>it should actually be 0 (NORMAL).
> >>>
> >>>[]'s
> >>>
> >>>Paulo Afonso Graner Fessel
> >>>UNIX Environment and Systems Administrator
> >>>pau...@pr... <mailto:pau...@pr...>
> >>>OWT
> >>>Phone: +55 (11) 3038-6554
> >>>Fax: +55 (11) 3038-6508
> >>>http://www.primesys.com.br <http://www.primesys.com.br/>