From: Ben C. <BCl...@pe...> - 2004-11-02 14:15:49
Attachments:
conversion_specification.txt
Hi, let me address your comments.

> binary usually means "$£*%F|éè@*%" data :)

Good point. The name 'Binary' referred to the way the data is stored, as
binary number types not text, and not the way Nagios sends the data.
I'll add a sentence...

> Not a Number values can be considered as missing values and are coded
> NULL too. So there is no problem with NaN values :)

I'll add a comment to clarify this point.

> Use some integer to store a timestamp. This should be enough.
> However, this is not very simple on Sybase (I don't know for other
> DBMSs). So don't try to be too portable! Keep It Simple and Stupid :)

I am moving toward using the ANSI-SQL standard of DATETIME, the same as
we have at the moment. It is likely that each DBMS will have its own
best way of handling date/time, and queries which cannot be made
ANSI-SQL can optimize around this.

One day all DBMSs will be ANSI. So the mountain in this case will come
to Mohammed. In the meantime, we can try to always keep as close as
possible...

> Infinite value means that there is no mini or no maxi. Nothing
> recorded also means no mini or no maxi. So I agree with using NULL
> to store not recorded values. However, there will be a choice to make
> when you read NULL in the database: will you present it as not
> recorded or as infinite? Well, this is presentation and has
> nothing to do here.

Good point. Classic database theory problem: does NULL mean No Data or
Bad Data? Nagios does not support infinite ranges, so NULL in this case
means no data.

One of the advantages of this twin-table format is the ease with which
extra data fields can be added. When PP is used for more than just
Nagios, extra fields to clarify the meaning can be added.

> 2.5 Store the range of warning and critical, as either an inside or
> outside range. A value of NULL will indicate infinity. A range
> type of NULL will indicate no value.
> > Not precise enough.
>
> I suggest NULL for infinite (-inf for start range and +inf for
> end range, because you cannot have +inf for start range, and you
> cannot have -inf for end range :)
>
> I suggest the default values when not recorded (0 for start range
> and +inf for end range, as specified on the nagiosplug plugin
> specification page).
>
> You also need something to say that the range is inverted (the @
> character in the range)

This section is vague. Can you look at section 4.3, where these ideas
are expanded, and see whether this is acceptable?

> 2.8 Use ANSI SQL wherever possible.
>
> > ...wherever the performance of insertion and queries is not
> > reduced too much.
>
> You will never be able to write SQL that can be ported to MySQL,
> PostgreSQL, Oracle, Sybase, ODBC, SQLite and others, even writing
> 100% ANSI SQL. Or you will write some very poor SQL and program
> yourself things that SQL data servers can do for you.

You are right that ANSI SQL cannot be used in all cases, so some form
of re-writing for different platforms will be required, by some
mechanism not yet decided. However, things will be a lot easier in the
future if ANSI-SQL is at least used where possible. I have changed the
wording of this.

> On that reflection (that I made and that is the origin of storage
> modules in perfparse), you will have to create an abstraction layer
> (or use an existing one) like storage modules for perfparse-db-tools
> (I will do that one) and CGIs (maybe me too, maybe not...).
>
> However, when the SQL code is close to ANSI SQL, porting some module
> to talk to another database server is easier. So using ANSI SQL is
> really recommended. Just don't drop performance, and don't reinvent
> the wheel that SQL servers have already invented :)

I totally agree.

> 2.9 Referential integrity will not be important against the
> data tables.
>
> 2.10 Duplicate data should be impossible to add.
> For 2.9 and 2.10, I agree. However, some tools are here to purge the
> databases: they can also do some additional integrity checking and
> duplicate removal.

OK, good: we can get back the RI in program space, and gain the
performance in the database.

> > 4.1 The extraction of data for a graph between two dates can be
> > completed as:
> >
> >   SELECT * ...
>
> For information, this is also possible with perfparsed since 0.103.1
> if you enable file_output storage: telnet to the perfparsed server
> and run this:
>
>   history tm_start tm_end '1' '2' '3'

I haven't tested this myself yet, but I am sure this will lead to some
interesting applications.

> > 2.2 A check against the extra data will be completed to find out if
> > this extra data has been entered, against the key 'extra_id0'. If
> > this data has not been entered before, this should take place.
>
> I suggest some option to do that check or not. You can also remove
> duplicates every night with perfparse-db-tool. Depends on how much
> performance you need when inserting data in the database.

Do you suggest writing *all* the extra data, one entry for each, then
reducing the data in a nightly parse?

I think this one will need to be settled experimentally. Although your
suggestion doesn't require a lookup on every line, it does require
massive insertion and deletion, which are both heavy activities for a
database. When we get there, we can test it.

Another thing you may be suggesting is a flag to ignore *all* the
extra data? Leave the linkage as NULL? This is an interesting idea if
users did not want the warn, crit, max or min.

Another version enclosed reflecting your comments.

Ben
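[Editorial aside: the range encoding debated above (spec item 2.5) follows the nagiosplug-style threshold syntax, where an omitted start defaults to 0, an omitted end means +infinity, `~` stands for -infinity, and a leading `@` inverts the range. A rough Python sketch of such a parser, assuming that simplified grammar; the infinite ends returned here are what the proposal would store as NULL:]

```python
import math

def parse_range(spec):
    """Parse a nagiosplug-style range like '10', '5:', '~:0' or '@5:10'.

    Returns (start, end, inverted). math.inf / -math.inf mark the
    unbounded ends that the proposed schema would record as NULL.
    """
    inverted = spec.startswith("@")  # '@' means: alert when INSIDE the range
    if inverted:
        spec = spec[1:]
    if ":" in spec:
        lo, hi = spec.split(":", 1)
    else:
        lo, hi = "", spec             # bare number: range is 0..number
    start = -math.inf if lo == "~" else (0.0 if lo == "" else float(lo))
    end = math.inf if hi == "" else float(hi)
    return start, end, inverted

print(parse_range("10"))     # (0.0, 10.0, False)
print(parse_range("~:0"))    # (-inf, 0.0, False)
print(parse_range("@5:10"))  # (5.0, 10.0, True)
```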
From: Yves M. <yme...@li...> - 2004-11-02 14:44:56
> One day all DBMSs will be ANSI. So the mountain in this case will
> come to Mohammed. In the meantime, we can try to always keep as close
> as possible...

When all DBMSs are ANSI, and if you use only ANSI SQL, you will not
benefit from their specific features, and will run slower than you
could. Keep close, but don't hesitate to use specific features when it
is worthwhile. The most important thing is to make it easy to port
perfparse to other DBMSs.

> > 2.5 Store the range of warning and critical, as either an inside or
> > outside range. A value of NULL will indicate infinity. A range
> > type of NULL will indicate no value.
> >
> > > Not precise enough.
> >
> > I suggest NULL for infinite (-inf for start range and +inf for
> > end range, because you cannot have +inf for start range, and you
> > cannot have -inf for end range :)
> >
> > I suggest the default values when not recorded (0 for start range
> > and +inf for end range, as specified on the nagiosplug plugin
> > specification page).
> >
> > You also need something to say that the range is inverted (the @
> > character in the range)
>
> This section is vague. Can you look at section 4.3, where these ideas
> are expanded, and see whether this is acceptable?

Looks good.

> > 4.1 The extraction of data for a graph between two dates can be
> > completed as:
> >
> >   SELECT * ...
> >
> > For information, this is also possible with perfparsed since
> > 0.103.1 if you enable file_output storage: telnet to the perfparsed
> > server and run this:
> >
> >   history tm_start tm_end '1' '2' '3'
>
> I haven't tested this myself yet, but I am sure this will lead to
> some interesting applications.

If this is the start of a new thread, we'll split.

I just see one application: the CGI that draws the graphs. It can get
the data directly from the server, without the support of the database.
Tests of speed will be necessary to see if it is better or worse than
asking the database (like now).
If speed is correct, maybe those who only need to graph performance
data can do it without the support of any database? This is not the
death of database storage, because statistics need it. And asking the
database can be easier when the request is more complex than getting
the data between two dates.

> > 2.2 A check against the extra data will be completed to find out if
> > this extra data has been entered, against the key 'extra_id0'. If
> > this data has not been entered before, this should take place.
> >
> > I suggest some option to do that check or not. You can also remove
> > duplicates every night with perfparse-db-tool. Depends on how much
> > performance you need when inserting data in the database.
>
> Do you suggest writing *all* the extra data, one entry for each, then
> reducing the data in a nightly parse?

Oops, forget that point :) I was thinking of parsing the same line
twice and entering it twice into the database :)

> I think this one will need to be settled experimentally. Although
> your suggestion doesn't require a lookup on every line, it does
> require massive insertion and deletion, which are both heavy
> activities for a database. When we get there, we can test it.

With InnoDB, forget it. InnoDB databases cannot reduce their size.

> Another thing you may be suggesting is a flag to ignore *all* the
> extra data? Leave the linkage as NULL? This is an interesting idea if
> users did not want the warn, crit, max or min.

Bad remarks because of misunderstanding (me) can give very good ideas
(you) :)

Yves

--
- Homepage - http://ymettier.free.fr - http://www.logicacmg.com -
- GPG key - http://ymettier.free.fr/gpg.txt -
- Maitretarot - http://www.nongnu.org/maitretarot/ -
- Perfparse - http://perfparse.sf.net/ -
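[Editorial aside: the per-line check from spec item 2.2 discussed in this thread amounts to "look up the extra data by its key; insert it only if absent". A toy version of that logic in Python with the stdlib sqlite3 module; the table and column names here are invented for illustration and are not the actual PerfParse schema:]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE extra (extra_id0 INTEGER PRIMARY KEY, warn REAL, crit REAL)"
)

def ensure_extra(key, warn, crit):
    """Insert the extra data only if this key has not been seen before."""
    seen = conn.execute(
        "SELECT 1 FROM extra WHERE extra_id0 = ?", (key,)
    ).fetchone()
    if seen is None:
        conn.execute("INSERT INTO extra VALUES (?, ?, ?)", (key, warn, crit))

# The same perfdata line arriving twice produces one row, not a duplicate.
ensure_extra(1, 80.0, 95.0)
ensure_extra(1, 80.0, 95.0)
print(conn.execute("SELECT COUNT(*) FROM extra").fetchone()[0])  # prints 1
```

This is the "check on every insert" side of the trade-off; the alternative raised in the thread (insert everything, purge duplicates nightly with perfparse-db-tool) avoids the per-line SELECT at the cost of bulk deletion later.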