Re: [Perfparse-devel] Summary tables

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Tim Wuyts wrote:

> Thanks for your reply. A few remarks:
> - My first step is to make sure I get something in the db that I can use for
> reporting (in my case some sort of summary table). As for the report itself,
> I'm actually considering of using a report generator, like Crystal Reports
> (comes with Visual Studio .Net, more of that below).
> - The idea of using epoch_start & epoch_length for a summary table is a good
> one, I'll try to incorporate that.
> - you are right about the UDF's, it's a pain to get it compiled. You need
> the source tree from MySQL, and I spent half a day trying to figure out the
> compile options on my Solaris machine. 
> - referential integrity: never leave home without it! (I was planning to add
> it, honestly)

Great!

Maybe a partnership.  If you do what you need to do, the rest of us can 
make the leap to PP.

The option of using Crystal Reports and some .NET is exciting.  We may 
be able to use you to complete a more general reporting tool for PP! 
May I ask which language you use?  May I also ask whether your code will 
be compatible with Mono?  Therefore runnable under Linux and Max OS?

> But my most important question: what do you guys use for development
> environment? Currently I do all my development work in Visual Studio .Net.
> It has everything (code editor, debugger, CVS integration) in one single
> powerful tool. Of course it is Windows based, but I also tend to use for all
> things Perl and (small) C programs, that end up on Unix/Linux machines. I
> have no idea if there is something like an IDE (Integrated Development
> Environment) for native Unix/Linux C development, what a good editor might
> be, the best tool for debugging, etc... On a Unix box I fall back to good
> old vi and the command line :) So any advice in this area is more than
> welcome.

Personally I feel I have done my time with 'vi' :)  I mainly use the 
Windows program EditPlus2.  This not free.  Some readers of this list 
will condemn me for this.  I do recommend it as a very fine editor, if 
your company will pay for it.  It can edit via FTP which is a great 
ability for OS projects, and works well on UNIX using 'Wine'.

There are lots of other good projects out there.  KDeveloper is similar 
to Visual Studio in look and feel, but tailored to OS projects.  A small 
editor called SciED I quite like as it really understands the code, and 
therefore helps you construct bug free structured programs...

I feel this is the start of an interesting thread!

Regards, Ben.

> 
> Kind regards,
> Tim.
> 
> 
> 
> -----Original Message-----
> From: Ben Clewett [mailto:Be...@cl...]
> Sent: maandag 11 oktober 2004 10:26
> To: Tim Wuyts
> Cc: per...@li...
> Subject: Re: [Perfparse-devel] Summary tables
> 
> Hi Tim,
> 
> Tim Wuyts wrote:
> 
> 
>>Greetings Programs!
>>
>>I noticed you have a few tables in the PP db layout that are meant for 
>>summarizing data. What are the plans here?
> 
> 
> These tables are present to satisfy a requirement to extract very large time
> frames (epochs) of data.  Eg, several years.  These will store a summary
> representation of the binary data.  The average, standard deviation, max,
> min etc, for a specific epoch.  Eg, the summary for a days data in one
> record.  Therefore a graph for a year is an easy 364 points, not the
> impossible half a million if every minutes data were to be plotted.  As well
> as a drastic cut in the database size.
> 
> This has not been completed due to lack of developer time, and discussion
> about whether this is the correct method of completing this requirement.  If
> there is a real need for this, then I can try and force some development
> time.
> 
> 
> 
>>I'm currently working on an 'availability' report. It is based on the 
>>raw plugin data, i.e. it only looks at the nagios_status field in the 
>>_raw table. Basically, I'm creating a report showing ALL hostgroups, 
>>hosts and services with their resp. uptime (%) and downtime, similar 
>>to the summary
> 
> on
> 
>>the raw history report.
> 
> 
> I am excited to hear you are giving us a report!  I look forward to getting
> it merged into the product :)
> 
> 
> 
>>My first approach was to create a Mysql UDF (user-defined function),
> 
> called
> 
>>'status_time' that can be used in a 'group by' clause. The resulting SQL
>>then looks something like this:
>>select host_name, service_description,
>>status_time(unix_timestamp(ctime), nagios_status, 0) as UPTIME,
>>status_time(unix_timestamp(ctime), nagios_status, 1) as WARNTIME,
>>status_time(unix_timestamp(ctime), nagios_status, 2) as CRITICALTIME,
>>status_time(unix_timestamp(ctime), nagios_status, 3) as UNDEFTIME
>>from perfdata_service_raw where ctime > '2004-07-31' and ctime <
>>'2004-09-01' 
>>group by host_name, service_description order by ctime
> 
> 
> My first comment is a a worry.  I have not used UDF, but looking at:
> http://dev.mysql.com/doc/mysql/en/Adding_UDF.html I see that this is a 
> complex task involving recompiling MySQL.  For those people using RPM's, 
> this may be hard.  But there may be other ways.
> 
> 
>>This works, but performance is bad (just think about how many records need
>>to be processed for a few 100 services and 30 days of data!), and I am
>>relying on MySQL to give me the data in chronological order (so far it's
>>always been correct, but I'm not comfortable with it)
>>
>>Since the data is not changing once it was entered, I thought of
> 
> simplifying
> 
>>by 'data-warehousing' the raw data. Using the query above, I summarize the
>>availability data every x time (x could be daily, weekly, monthly), and
>>store them in a table, e.g. perfdata_summary_raw. I'm planning on creating
> 
> a
> 
>>more elaborate script (in Perl, I could try C, but it's been a while) to
> 
> do
> 
>>this.
> 
> 
> Great idea!
> 
> 
> 
>>Q1: did you plan on making something similar? If so, is there any code I
>>could look at/use/improve ?
> 
> 
> I know of no plans to create summary data from the raw output.
> 
> Rather than writing a cron to do this, it may be best to add the data to 
> your summary during the initial addition to MySQL.  Wait until version 
> 0.101.1, or get a pre-version from:
> 
> http://ymettier.chez.tiscali.fr/perfparse-devel/index.php
> 
> Look at storage_mysql.c and function 'storage_mysql_store_line'.  This 
> will be the place to add data to your summary.
> 
> Yves will have to reserve you an configuration option to use in your 
> code.  Eg --Store_Summary_Raw.  But getting ahead of my self here.
> 
> 
>>Q2: what should the summary_raw table look like? Here's my current
> 
> proposal:
> 
>>CREATE TABLE `perfdata_summary_availability` (
>>  `id` int(11) NOT NULL auto_increment,
>>  `period` varchar(10) NOT NULL default '',
>>  `host_name` varchar(75) NOT NULL default '',
>>  `service_description` varchar(75) NOT NULL default '',
>>  `sum_uptime` bigint(20) NOT NULL default '0',
>>  `sum_warntime` bigint(20) NOT NULL default '0',
>>  `sum_criticaltime` bigint(20) NOT NULL default '0',
>>  `sum_undeftime` bigint(20) NOT NULL default '0',
>>  PRIMARY KEY  (`id`),
>>  UNIQUE KEY `perfdata_summary_availability_ix1`
>>(`period`,`host_name`,`service_description`),
>>  KEY `perfdata_summary_availability_ix0`
>>(`host_name`,`service_description`)
>>) TYPE=InnoDB;
> 
> 
> Looks good.  Very much the same as our own workings on the binary summary.
> 
>  From the work we did on that, I will suggest a few things, which you 
> can ignore if you wish!
> 
> The 'period VARCHAR(10)'.  I am not sure what this will be for, or what 
> you will store in this field.  I can suggest another way of defining a 
> period:
> 
> If this file will always be for a time span of a constant number of 
> seconds.  Eg, 1 hour = 3600.  You can store the start time of the epoch 
> as part of an alternate PK:  (Storing as UNIX time)
> 
> epoch_start UNSIGNED LONG,
> PRIMARY KEY (epoch_start, host, service)
> 
> You can then work out which time period to use by subtracting the 
> modulus of the epoch duration (summary sample time period) against the time:
> 
> file.c:  epoch_start = ctime - (ctime % epoch_length);
> 
> Or your update query is then something like:
> 
> UPDATE perfdata_summary_availability
> 	SET sum_uptime = sum_uptime + 1, sum_warn ....
> 	WHERE epoch_start = ctime - MOD(ctime,epoch_length)
> 	AND host = ....
> 
> (If the summary record already exists.)
> 
> A finally idea.  If you include the epoch_length as part of the primary 
> key, this enables you to store data for many sample durations in the 
> same table.  Eg, 1 hour, 1 day, 1 week etc...
> 
> If you were to consider these ideas, it would make the table something like:
> 
> 	epoch_period UNSIGNED INT NOT NULL,
> 	epoch_start UNSIGNED LONG NOT NULL,
> 	host_name VARCHAR(75) NOT NULL,
> 	service_description VARCHAR(75) NOT NULL,
> 	PRIMARY KEY (epoch_period,
> 		epoch_start, host_name, service_description),
> 	...
> 
> Lastly, a few house keeping bits:
> 
> The host and service should really be foreign key references to the host 
> and service tables.  This stops the parent host and service tables being 
> deleted without this table being sorted.
> 
> Finally, there needs to be some deletion policies.  See:
> 
> http://sourceforge.net/docman/display_doc.php?docid=23729&group_id=109355
> 
> And look at other tables.
> 
> Both of these would eventually need code in the deletion policy program 
> as well, at some distant later date.
> 
> I hope this has not put you off.  Please feel free to ignore any of this 
> to get your self the report you desire to start with.  I would like to 
> see this report as part of the PP suit, so maybe you can start and we 
> can look at it after this?
> 
> Good luck,
> 
> Regards, Ben.
> 
> 
> 	
>