From: Tim W. <tim...@pi...> - 2004-10-11 20:44:06
Ben,

Thanks for your reply. A few remarks:

- My first step is to make sure I get something into the db that I can use
  for reporting (in my case some sort of summary table). As for the report
  itself, I'm actually considering using a report generator such as Crystal
  Reports (it comes with Visual Studio .NET, more on that below).
- The idea of using epoch_start & epoch_length for a summary table is a
  good one; I'll try to incorporate it.
- You are right about the UDFs: it's a pain to get them compiled. You need
  the source tree from MySQL, and I spent half a day trying to figure out
  the compile options on my Solaris machine.
- Referential integrity: never leave home without it! (I was planning to
  add it, honestly.)

But my most important question: what do you guys use as a development
environment? Currently I do all my development work in Visual Studio .NET.
It has everything (code editor, debugger, CVS integration) in one single
powerful tool. Of course it is Windows based, but I also tend to use it for
all things Perl and for (small) C programs that end up on Unix/Linux
machines. I have no idea whether there is something like an IDE (Integrated
Development Environment) for native Unix/Linux C development, what a good
editor might be, the best tool for debugging, etc. On a Unix box I fall
back to good old vi and the command line :) So any advice in this area is
more than welcome.

Kind regards,
Tim.

-----Original Message-----
From: Ben Clewett [mailto:Be...@cl...]
Sent: Monday 11 October 2004 10:26
To: Tim Wuyts
Cc: per...@li...
Subject: Re: [Perfparse-devel] Summary tables

Hi Tim,

Tim Wuyts wrote:
> Greetings Programs!
>
> I noticed you have a few tables in the PP db layout that are meant for
> summarizing data. What are the plans here?

These tables are present to satisfy a requirement to extract very large
time frames (epochs) of data, e.g. several years. They will store a summary
representation of the binary data: the average, standard deviation, max,
min etc. for a specific epoch, e.g. the summary of a day's data in one
record. A graph for a year is then an easy 365 points, not the impossible
half a million if every minute's data were plotted, as well as a drastic
cut in the database size.

This has not been completed due to lack of developer time, and discussion
about whether this is the correct method of meeting the requirement. If
there is a real need for this, then I can try to force some development
time.

> I'm currently working on an 'availability' report. It is based on the
> raw plugin data, i.e. it only looks at the nagios_status field in the
> _raw table. Basically, I'm creating a report showing ALL hostgroups,
> hosts and services with their resp. uptime (%) and downtime, similar
> to the summary on the raw history report.

I am excited to hear you are giving us a report! I look forward to getting
it merged into the product :)

> My first approach was to create a MySQL UDF (user-defined function),
> called 'status_time', that can be used in a 'group by' clause.
> The resulting SQL then looks something like this:
>
>   select host_name, service_description,
>          status_time(unix_timestamp(ctime), nagios_status, 0) as UPTIME,
>          status_time(unix_timestamp(ctime), nagios_status, 1) as WARNTIME,
>          status_time(unix_timestamp(ctime), nagios_status, 2) as CRITICALTIME,
>          status_time(unix_timestamp(ctime), nagios_status, 3) as UNDEFTIME
>   from perfdata_service_raw
>   where ctime > '2004-07-31' and ctime < '2004-09-01'
>   group by host_name, service_description
>   order by ctime

My first comment is a worry. I have not used UDFs, but looking at:

http://dev.mysql.com/doc/mysql/en/Adding_UDF.html

I see that this is a complex task involving recompiling MySQL. For those
people using RPMs, this may be hard. But there may be other ways.

> This works, but performance is bad (just think about how many records
> need to be processed for a few hundred services and 30 days of data!),
> and I am relying on MySQL to give me the data in chronological order (so
> far it's always been correct, but I'm not comfortable with it).
>
> Since the data does not change once it has been entered, I thought of
> simplifying things by 'data-warehousing' the raw data. Using the query
> above, I summarize the availability data every x time (x could be daily,
> weekly or monthly) and store the results in a table, e.g.
> perfdata_summary_raw. I'm planning on writing a more elaborate script
> (in Perl; I could try C, but it's been a while) to do this.

Great idea!

> Q1: did you plan on making something similar? If so, is there any code I
> could look at/use/improve?

I know of no plans to create summary data from the raw output. Rather than
writing a cron job to do this, it may be best to add the data to your
summary during the initial insertion into MySQL. Wait until version
0.101.1, or get a pre-version from:

http://ymettier.chez.tiscali.fr/perfparse-devel/index.php

Look at storage_mysql.c and the function 'storage_mysql_store_line'. That
will be the place to add data to your summary. Yves will have to reserve
you a configuration option to use in your code, e.g. --Store_Summary_Raw.
But I am getting ahead of myself here.

> Q2: what should the summary_raw table look like? Here's my current
> proposal:
>
>   CREATE TABLE `perfdata_summary_availability` (
>     `id` int(11) NOT NULL auto_increment,
>     `period` varchar(10) NOT NULL default '',
>     `host_name` varchar(75) NOT NULL default '',
>     `service_description` varchar(75) NOT NULL default '',
>     `sum_uptime` bigint(20) NOT NULL default '0',
>     `sum_warntime` bigint(20) NOT NULL default '0',
>     `sum_criticaltime` bigint(20) NOT NULL default '0',
>     `sum_undeftime` bigint(20) NOT NULL default '0',
>     PRIMARY KEY (`id`),
>     UNIQUE KEY `perfdata_summary_availability_ix1`
>       (`period`,`host_name`,`service_description`),
>     KEY `perfdata_summary_availability_ix0`
>       (`host_name`,`service_description`)
>   ) TYPE=InnoDB;

Looks good. It is very much the same as our own workings on the binary
summary. From the work we did on that, I will suggest a few things, which
you can ignore if you wish!

The 'period VARCHAR(10)': I am not sure what this is for, or what you will
store in this field. I can suggest another way of defining a period, which
works if each summary record always covers a constant number of seconds,
e.g. 1 hour = 3600.
You can store the start time of the epoch (as UNIX time) as part of an
alternate primary key:

  epoch_start BIGINT UNSIGNED,
  PRIMARY KEY (epoch_start, host, service)

You can then work out which epoch a sample belongs to by subtracting from
its time the modulus of the epoch duration (the summary sample period):

  file.c:
    epoch_start = ctime - (ctime % epoch_length);

Or, if the summary record already exists, your update query is then
something like:

  UPDATE perfdata_summary_availability
     SET sum_uptime = sum_uptime + 1, sum_warn...
   WHERE epoch_start = ctime - MOD(ctime, epoch_length)
     AND host = ...

A final idea: if you include the epoch length as part of the primary key,
this enables you to store data for many sample durations in the same table,
e.g. 1 hour, 1 day, 1 week etc. If you were to take these ideas on board,
the table would look something like this (the P.S. below tries to put all
of the pieces together):

  epoch_period        INT UNSIGNED NOT NULL,
  epoch_start         BIGINT UNSIGNED NOT NULL,
  host_name           VARCHAR(75) NOT NULL,
  service_description VARCHAR(75) NOT NULL,
  PRIMARY KEY (epoch_period, epoch_start, host_name, service_description),
  ...

Lastly, a few housekeeping bits. The host and service should really be
foreign key references to the host and service tables; this stops entries
in the parent host and service tables being deleted while this table still
refers to them. Finally, there need to be some deletion policies. See:

http://sourceforge.net/docman/display_doc.php?docid=23729&group_id=109355

and look at the other tables. Both of these would eventually need code in
the deletion policy program as well, at some distant later date.

I hope this has not put you off. Please feel free to ignore any of this and
get yourself the report you want to start with. I would like to see this
report as part of the PP suite, so maybe you can start and we can look at
it after that?

Good luck,

Regards,

Ben.
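
P.S. A possible refinement of the UPDATE above, untested and only a sketch:
MySQL 4.1 and later have INSERT ... ON DUPLICATE KEY UPDATE, which covers
both the "summary record does not exist yet" and the "record already
exists" cases in one statement, provided there is a primary or unique key
on the epoch/host/service columns as suggested above. The host 'www1',
service 'HTTP', timestamp and one-hour epoch below are made-up example
values, and the columns assume the epoch_period/epoch_start layout rather
than your original 'period' column:

  -- One OK sample (nagios_status = 0) for host 'www1', service 'HTTP',
  -- using one-hour epochs (3600 seconds).  The other sum_* columns keep
  -- their default of 0 on the first insert for this epoch.
  INSERT INTO perfdata_summary_availability
         (epoch_period, epoch_start, host_name, service_description,
          sum_uptime)
  VALUES (3600,
          UNIX_TIMESTAMP('2004-08-01 12:34:56')
            - MOD(UNIX_TIMESTAMP('2004-08-01 12:34:56'), 3600),
          'www1', 'HTTP', 1)
  ON DUPLICATE KEY UPDATE sum_uptime = sum_uptime + 1;

For a WARNING, CRITICAL or undefined sample the calling code would pick
sum_warntime, sum_criticaltime or sum_undeftime instead. Note that this
counts samples per state, exactly as the UPDATE above does; turning counts
into seconds or percentages is left to the report.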
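
P.P.S. And just to put the whole suggestion in one place, again untested
and only a sketch which you should feel free to ignore: with the epoch
columns and the housekeeping, the table might end up roughly like this. The
FOREIGN KEY part is left as a comment because I have not checked the exact
names of the parent host and service tables in the current PP schema; fill
in the real ones.

  CREATE TABLE `perfdata_summary_availability` (
    `epoch_period`        INT UNSIGNED NOT NULL,    -- epoch length in seconds (3600, 86400, ...)
    `epoch_start`         BIGINT UNSIGNED NOT NULL, -- UNIX time: ctime - MOD(ctime, epoch_period)
    `host_name`           VARCHAR(75) NOT NULL,
    `service_description` VARCHAR(75) NOT NULL,
    `sum_uptime`          BIGINT NOT NULL DEFAULT 0,
    `sum_warntime`        BIGINT NOT NULL DEFAULT 0,
    `sum_criticaltime`    BIGINT NOT NULL DEFAULT 0,
    `sum_undeftime`       BIGINT NOT NULL DEFAULT 0,
    PRIMARY KEY (`epoch_period`, `epoch_start`, `host_name`, `service_description`),
    KEY `perfdata_summary_availability_ix0` (`host_name`, `service_description`)
    -- , FOREIGN KEY (`host_name`, `service_description`)
    --     REFERENCES <the real PP service table> (`host_name`, `service_description`)
  ) TYPE=InnoDB;

Your monthly availability report then becomes a scan of the summary rows
instead of the raw data, something like this (percentages based on sample
counts):

  SELECT host_name, service_description,
         100 * SUM(sum_uptime)
             / SUM(sum_uptime + sum_warntime + sum_criticaltime
                   + sum_undeftime) AS pct_up
    FROM perfdata_summary_availability
   WHERE epoch_period = 3600
     AND epoch_start >= UNIX_TIMESTAMP('2004-08-01')
     AND epoch_start <  UNIX_TIMESTAMP('2004-09-01')
   GROUP BY host_name, service_description;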