|
From: Yves M. <yme...@li...> - 2004-11-02 11:24:08
|
Comments inline...
Proposal for conversion of PerfParse binary data.
1. Brief.
=3D=3D=3D=3D=3D=3D=3D=3D=3D
Binary data extracted from the performance data from Nagios is stored
>>>>>>>> Say that binary data is in fact the performance data, eg the lis=
t of
"key=3DvalueUOM;ranges..."
binary usually means "$=A3*%F|=E9=E8@*%" data :)
2. Requirements.
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
The functional requirements of the new structure can be listed:
2.1 To hold the metric value as a DOUBLE. Allow NULL to indicate a
missing
value which should be shown as a space on a graph.
>>>>>>>>>>> Not a Number values can be considered as missing values and a=
re coded NULL
too. So there is no problem with NaN values :)
2.3 Store the time in a format easy to program and portable over
different
DBMS.
>>>>>>>> Use some integer to store a timestamp. This should be enough.
However, this is not very simple on sybase (I don't know for other dbms).=
So don't try
to be too portable ! Keep It Simple and Stupid :)
2.4 Store the maximum and minimum range of the metric with every
value. This may be NULL of not recorded.
>>>>>>>>>>>> Thinking...
Infinite value means that there is no mini or no maxi. Nothing recorded a=
lso means no
mini or no maxi. So I agree with using NULL to store not recorded value. =
However, there
will be a choice to do when you read NULL in the database : will you pres=
ent it as not
recorded or as infinite ? Well, this is presentation and has nothing to d=
o here.
2.5 Store the range of warning and critical, as either an inside or
outside range. A value of NULL will indicate infinity. A range type of
NULL will indicate no value.
>>>>>>>>>>>>> Not precise enough.
I suggest NULL for infinite (-inf for start range and +inf for end range,=
because you
cannot have +inf for start range, and you cannot have -inf for end range =
:)
I suggest the default values when not recorded (0 for start range and +in=
f for end
range, as specified on the nagiosplug plugins specification page).
You also need something to say that the range is inverted (the @ characte=
r in the range)
2.6 Offer the correct keys to enable likely queries. Including:
6.1 Extraction of an ordered series of data between two times.
6.2 Extraction of the last entered value for each metric.
6.3 Extraction of data relative to a specific host, service or metric.
2.7 Store the data is an efficient format being small and fast to
extract.
2.8 Use ANSI SQL where ever possible.
>>>>>>>>>>>>>> ...where ever the performance of insertion and queries are=
not reduced
too much.
You will never be able to write SQL that can be ported on mysql, postgres=
ql, oracle,
sybase, odbc, sqlite and other ones, even writing 100% ANSI SQL. Or you w=
ill write some
very poor SQL and program yourself things that SQL dataservers can do for=
you.
On that reflexion (that I made and that is the origin of storage modules =
in perfparse),
you will have to create an abstraction layer (or use an existing one) lik=
e storage
modules for perfparse-db-tools (I will do that one) and CGIs (maybe me to=
o, maybe
not...).
However, when the SQL code is close to ANSI SQL, porting some module to t=
alk to another
database server is easier. So using ANSI SQL is really recommended. Just =
don't drop
performances, and don't reinvent the wheel that SQL servers have already =
invented :)
2.9 Referential integrity will not be important against the data tables.
2.10 Duplicate data should be impossible to add.
>>>>>>>>>>>>> For 2.9 and 2.10, I agree. However, some tools are here to =
purge the
databases : they can also do some additionnal integrity checking and
duplicates removing.
3. Table Schema
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
The two new tables will be defined as:
>>>>>>>>>>>> I'm too bad in SQL... No comments here.
4. Proposed Use Snapshots
=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D
4.1 The extraction of data for a graph between two dates can be
completed as:
SELECT *
FROM perfdata_bin
JOIN perfdata_bin_extra
ON perfdata_bin.bin_extra_id =3D perfdata_bin_extra.id
WHERE host =3D 1 AND service =3D 2 AND metric =3D 3
AND ctime BETWEEN '2004-01-01' AND '2004-12-31'
ORDER BY ctime;
>>>>>>>>>>>>> For information, this is also possible with perfparsed sinc=
e 0.103.1 if
you enable file_output storage : telnet the perfparsed server and run thi=
s
:
history tm_start tm_end '1' '2' '3'
However, the SQL language makes it more flexible. When both are implement=
ed, choose the
one that best fit your needs ! :)
4.2 Addition of data will be completed as follows:
2.1 Calculation of the hash of the extra data. Eg, MD5(";1;2;3;4")
This
value will be supplied by the parser.
2.2 A check against the extra data will be completed to find out if
this
extra data has been entered, against the key 'extra_id0'. If
this data
has not been entered before, this should take place. In either
case,
the 'id' of the extra will be registered.
>>>>>>>>>>>>> I suggest some option to do that check or not. You can also=
remove
duplicates every night with perfparse-db-tool. Depends on how much
performance you need when inserting data in the database.
>>>>>>>>>>>>>>>>> No other comments :)
Yves
--=20
- Homepage - http://ymettier.free.fr - http://www.logicacmg.com -
- GPG key - http://ymettier.free.fr/gpg.txt -
- Maitretarot - http://www.nongnu.org/maitretarot/ -
- Perfparse - http://perfparse.sf.net/ -
|