From: Viljo M. <na...@ma...> - 2004-10-28 13:22:27
|
Hello,

Just a question about the MySQL database schema.

I'm monitoring about 280 services here and perfparse is logging them all. The MySQL database grows at about 1 GB per week, and after two weeks my perfdata_service_bin table had some 6 million rows.

The problem is that searches were very, very slow. Not for graph drawing, but for other things like finding the last 200 results for a host, which require sorting and searching.

As far as I can see, the perfdata_service_bin table has 3 indexes on columns in the same order; one covers 2 fields, the second 3 and the third 4. With these 3 indexes the MySQL data file was 351 MB and the index file was 757 MB, and doing a search put my server into heavy swapping (it has 512 MB RAM), because the index doesn't fit into RAM.

After removing the 2- and 3-field indexes, the MySQL index file was 374 MB and the same query took half the time, with 30k more rows in the table.

We figured here that either this is something really, really smart that we don't get (like speeding up some specific queries), or it was some kind of odd assumption that more indexes equal more speed. If you index host_name, service_description, metric, ctime in one index, you really don't need to index the first 3 fields again; there's no use.

Viljo |
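To make the redundancy concrete, a rough sketch of the cleanup described above; the index names are placeholders, not the actual perfparse index names:

  -- The 4-column index already serves lookups on its prefixes
  -- (host_name), (host_name, service_description) and
  -- (host_name, service_description, metric), so the two shorter
  -- indexes add size without adding capability.
  ALTER TABLE perfdata_service_bin DROP INDEX idx_host_service;
  ALTER TABLE perfdata_service_bin DROP INDEX idx_host_service_metric;

  -- A query like this (example values) can still use the remaining
  -- composite index:
  SELECT *
    FROM perfdata_service_bin
   WHERE host_name = 'somehost'
     AND service_description = 'PING'
     AND metric = 'rta'
   ORDER BY ctime DESC
   LIMIT 200;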
From: Ben C. <Be...@cl...> - 2004-10-28 13:52:12
|
Viljo,

I can see exactly what you mean, and I understand your point. There are in fact four indexes, not three, which is, I will be the first to admit, a bit of an overkill. :)

The keys are there for a reason; I'll explain why I put them in:

Key 1, (id): primary key. As well as being the PK, it is used to reference data from perfparse_service_metric.last_perfdata_bin.
Key 2, (host, service): referential integrity (RI) to perfdata_service.
Key 3, (host, service, metric): RI to perfdata_service_metric.
Key 4, (host, service, metric, time): used when plotting a graph, to get data between a range and in the correct order.

First off, Key 2 serves no purpose: its RI is already enforced by Key 3. That one is redundant and I'll cut it for new users from the next version. Thanks :)

---------------------------------------

There will be a new version of the schema out soon. It has only *one* key, which will be used for the same reasons as Key 1 and Key 4 above.

RI: the new schema will not have RI from the main schema to the data schema. The reason for this is speed and size. The advantages of RI are far outweighed by the need for speed and small size.

The new schema will look *something* like:

  TABLE perfdata_bin (
    mid        INT UNSIGNED NOT NULL,
    ctime      DATETIME NOT NULL,
    PRIMARY KEY (mid, ctime),
    value      DOUBLE,
    state      TINYINT,
    extra_data INT UNSIGNED NOT NULL
  ) TYPE=innodb;

(Field 'extra_data' is a link to a table containing warning, critical, max and min data. Field 'mid' = Metric ID = (host, service, metric) as a unique INT.)

What do you think? Will this be better? I note this has no concept of a HOST as-is; will this bother you?

Regards,

Ben. |
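As an illustration of how the proposed primary key would serve graph plotting, a query along these lines (the mid value and the date range are only placeholders) can be answered entirely from the (mid, ctime) key:

  SELECT ctime, value
    FROM perfdata_bin
   WHERE mid = 42
     AND ctime BETWEEN '2004-10-01 00:00:00' AND '2004-10-08 00:00:00'
   ORDER BY ctime;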
From: Flo G. <fl...@bi...> - 2004-10-28 19:05:46
|
Hi,

I have given the schema some thought too. The mid field is a nice concept, but it makes queries that sort by host name or service description impossible. I think that if indexes in MySQL (or other DBs) are efficient, it makes no difference whether you use a "mid" field or a combined index. The extra-data table is probably unnecessary, if you assume that most of the performance comes from a good index search. So my proposal would be:

  TABLE perfdata_bin (
    host_name VARCHAR(75) NOT NULL,
    service_description VARCHAR(75) NOT NULL,
    metric VARCHAR(75) NOT NULL,
    ctime INTEGER NOT NULL,
    PRIMARY KEY (host_name, service_description, metric, ctime),
    value DOUBLE,
    state TINYINT,
    warn DOUBLE,
    critical DOUBLE,
    max DOUBLE,
    min DOUBLE
  )

What I would do to save storage:

  create table hosts_names (
    host_id INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
    host_name VARCHAR(75) NOT NULL
  )

and replace the host_name column in perfdata_bin with the host_id, doing joins to get the textual host_name in queries. The same could be done for service_description and metric, since there is a limited number of different values.

Another proposal would be to use an integer to store timestamps. I'd use unix timestamps (seconds since 1.1.1970), since, in my experience, the handling of dates in different databases (MySQL, Oracle, ...) is not very portable.

Last, I'd propose to store the raw data in the bin table, since I really don't know why it is stored in an extra table. But there is probably a reason I haven't realized.

Finally I get:

  TABLE perfdata_bin (
    host_id INTEGER NOT NULL,
    service_id INTEGER NOT NULL,
    metric_id INTEGER NOT NULL,
    ctime INTEGER NOT NULL,
    PRIMARY KEY (host_id, service_id, metric_id, ctime),
    value DOUBLE,
    state TINYINT,
    warn DOUBLE,
    critical DOUBLE,
    max DOUBLE,
    min DOUBLE,
    txt_data VARCHAR(255)
  )

And tables for host_name, service_description and metric.

Note that this table has almost only fixed-size, small columns.

What do you think?

Flo Gleixner |
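A join to get the text back might look like the following; the host name is just an example value, and the query assumes the final integer-keyed perfdata_bin above together with the hosts_names table:

  SELECT h.host_name, b.ctime, b.value
    FROM perfdata_bin b
    JOIN hosts_names h ON h.host_id = b.host_id
   WHERE h.host_name = 'somehost'
   ORDER BY b.ctime DESC
   LIMIT 200;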
From: Viljo M. <na...@ma...> - 2004-10-29 06:28:41
|
Hello,

Actually, the first thing I wondered about the perfparse data schema was why it stores host names as VARCHAR for every value Nagios gives us. So I like Flo's idea of putting hosts, metric names and service descriptions into separate tables and simply joining them with the _bin table. Putting the raw data into the same table would also save a lot of space and needless duplicate values. I guess that would speed things up greatly.

I'm not sure about storing times in unix timestamp format. I know MySQL supports this, and I don't know about Oracle, but it's not supported by PostgreSQL, for which I intend to write a module at some point (if somebody doesn't beat me to it). By 'supported' I mean that Postgres doesn't have TO_UNIXTIME and FROM_UNIXTIME functions (or whatever they were called).

So searching would be really awkward, because first I'd have to convert the unix timestamp to a date and then compare dates to find a one-week interval or whatever. But then again, I might be wrong too :)

Rgds,
Viljo |
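For reference, this is the kind of MySQL-only predicate that would not port as-is; it assumes ctime is stored as epoch seconds, as in Flo's sketch:

  -- MySQL-specific: FROM_UNIXTIME()/UNIX_TIMESTAMP() have no
  -- PostgreSQL built-ins with the same names.
  SELECT FROM_UNIXTIME(ctime), value
    FROM perfdata_bin
   WHERE ctime >= UNIX_TIMESTAMP(NOW()) - 60*60*24*7;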
From: Ben C. <Be...@cl...> - 2004-10-29 07:29:59
|
Hi Viljo,

First off, I am really excited about your plans to do a PostgreSQL conversion. I have kicked myself on more than one occasion for not starting with PSQL. Please let me know if you need any help.

On this subject, it has been suggested that we use SQLite as an abstraction layer, I think on the grounds that 'if it's ok for PHP, it's ok for us'. :) Do you know anything about this?

A comment on the integer for date/time: yes, I agree. Another design feature of PerfParse is 'keep to ANSI SQL', which is why I write my schema as:

  value INT

rather than the MySQL syntax:

  `value` INT(11)

Date is another example. DATETIME is an ANSI SQL standard and therefore has a good chance of working with any RDBMS.

Regards,

Ben. |
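A slightly fuller sketch of the same point; the table and columns are invented purely for illustration:

  -- Plain definitions, expected to work on most RDBMSs:
  CREATE TABLE example_bin (
    value INT,
    state SMALLINT
  );

  -- The MySQL-flavoured spelling of the same thing (identifier
  -- backquotes, display widths), which other databases may reject:
  CREATE TABLE `example_bin` (
    `value` INT(11),
    `state` SMALLINT(6)
  );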
From: Flo G. <fl...@bi...> - 2004-10-29 10:03:00
|
Hi again,

To show that handling date datatypes in different RDBMSs is not easily portable, here are some syntax examples (copied from sqlzoo.net) for formatting a date:

MySQL:
  SELECT DATE_FORMAT(wk,'%d/%m/%Y'), song
    FROM totp WHERE singer='Tom Jones'

Postgres (this is ANSI, I think):
  SELECT CAST(EXTRACT(DAY FROM wk) AS VARCHAR(2)) || '/' ||
         CAST(EXTRACT(MONTH FROM wk) AS VARCHAR(2)) || '/' ||
         CAST(EXTRACT(YEAR FROM wk) AS VARCHAR(4)), song
    FROM totp WHERE singer='Tom Jones'

Oracle:
  SELECT TO_CHAR(wk, 'DD/MM/YYYY'), song
    FROM totp WHERE singer='Tom Jones'

MSSQL:
  SELECT CONVERT(VARCHAR(10), wk, 103), song
    FROM totp WHERE singer='Tom Jones'

DB2:
  SELECT CHAR(wk, eur), song
    FROM totp WHERE singer='Tom Jones'

The conversion of unix timestamps to a textual representation can easily be done using the functions in <time.h>, or the equivalent functions in PHP.

Basically you're right: dates work the same in all RDBMSs, only the textual representation differs. As far as I remember, Oracle does not print hours, minutes and seconds by default, and the representation may depend on initialization parameters (NLS_ parameters). So one user may get 12/24/2004 while another sees 24.12.2004 13:45:32 MESZ or even 24 Dezember 2004 13:45:32 MESZ. You can check this at sqlzoo.net: do a

  SELECT wk, song FROM totp WHERE singer='Tom Jones'

on the different RDBMSs and you get the default date representations:

  MySQL and Postgres: 1987-04-30
  Oracle:             30-APR-87

Finding time intervals with timestamps is easy too; one week is 60*60*24*7 seconds. So my idea was to do all textual representation and conversion in C/PHP, which is portable.

To make everything portable to other RDBMSs there is some more work to do. I made tikiwiki (tikiwiki.org) work with MySQL, Postgres, Oracle and Sybase. It was really hard. Some examples:

- Oracle does not understand MySQL's join syntax
- MySQL is case insensitive by default
- In PHP you get the result of a query as an associative array. The array keys are lower case on most systems, upper case on Oracle.
- To work around the above problem, one should quote the table and column names. These quotes are backquotes in MySQL and double quotes in Oracle and others.
- Sybase only supports identifiers (table names, column names) up to a maximum length of (I think) 26 characters (don't ask me why!)
- Some systems have a maximum VARCHAR length of 250 characters.
- Sybase is picky if you try to insert an integer value like this: insert into bla values('12'). MySQL and others do a cast; for them it's ok.
- Many functions (CONCAT ...) are different
- ...

So you see, it's really hard to use ANSI SQL :-) But perfparse does not use too many of the special cases, so I think it will be easier to make it db independent.

Greetings,

Flo Gleixner |
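To illustrate the "intervals are easy with plain integers" point, a one-week window on the integer-keyed table proposed earlier is the same query on any of these databases; the id values and the 1099000000 "now" timestamp are only placeholders:

  SELECT ctime, value
    FROM perfdata_bin
   WHERE host_id = 1 AND service_id = 2 AND metric_id = 3
     AND ctime BETWEEN 1099000000 - 60*60*24*7 AND 1099000000
   ORDER BY ctime;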
From: Yves M. <yme...@li...> - 2004-10-29 10:35:37
|
> Hi again,
>
> To show that handling date datatypes in different RDBMSs is not easily
> portable, here are some syntax examples (copied from sqlzoo.net):

[...]

> So you see, it's really hard to use ANSI SQL :-) But perfparse does not
> use too many of the special cases, so I think it will be easier to make it
> db independent.

[...]

Remember that with modules, you can do something very different from one db module to another. You don't need to follow the same database schema if you don't want to.

Too many different modules is bad (more work, more bugs...). But if you need speed and space, don't try to be 100% compatible. The mysql module will never connect to a postgresql database :)

And again, the main problem that you will have is in the CGIs: they don't use modules. I have ideas for the CGIs and we can talk about that when you want.

Yves
--
- Homepage - http://ymettier.free.fr - http://www.logicacmg.com -
- GPG key - http://ymettier.free.fr/gpg.txt -
- Maitretarot - http://www.nongnu.org/maitretarot/ -
- Perfparse - http://perfparse.sf.net/ - |
From: Ben C. <Be...@cl...> - 2004-10-29 11:09:20
|
So, a question: to make the code slightly more uniform, what do we use for a date/time?

  DATETIME  Suffers from the problems Flo demonstrated, but since each
            module has its own code this may not be significant.
  INT       Limited to 1970 -> ~2038.
  DOUBLE    Unlimited, and the decimal part could carry milliseconds,
            but hard to use with time.h.

Yves was going to propose a method for making the SQL in the 'c' CGIs easier to port. I would be interested.

Regards,

Ben |
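A sketch of how the same one-week range query would look under each choice; the table name, dates and epoch values are placeholders, and each snippet assumes ctime has the corresponding type:

  -- DATETIME: readable literals (MySQL shown; literal syntax may
  -- differ per RDBMS)
  SELECT value FROM perfdata_bin
   WHERE ctime BETWEEN '2004-10-22 00:00:00' AND '2004-10-29 00:00:00';

  -- INT (epoch seconds): pure integer arithmetic, fully portable
  SELECT value FROM perfdata_bin
   WHERE ctime BETWEEN 1099000000 - 60*60*24*7 AND 1099000000;

  -- DOUBLE (epoch seconds plus fraction): same arithmetic,
  -- sub-second capable
  SELECT value FROM perfdata_bin
   WHERE ctime BETWEEN 1099000000.0 - 60*60*24*7 AND 1099000000.0;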
From: Ben C. <Be...@cl...> - 2004-10-29 07:55:38
|
Hi Flo,

I wrote a response to you earlier, but unusually my Thunderbird lost the email, so here goes again :)

I like your idea of replacing the VARCHAR(75) host, service and metric with an INTEGER for each. This will save space over the existing method without losing the informational context. The INTEGER UID for these can be added directly to their existing tables. So we can forget my MID idea :)

I believe MySQL, in common with most DBMSs, stores DATETIME as a number internally. I think MySQL uses an 'unsigned long', MS SQL a 'double'. There is no real advantage in using our own date format, and we would lose the ability to use BETWEEN '2003-10-10' AND '2003-10-11' type syntax. My feeling is that this is ok as it is.

I forgot to mention on this group that the existing warn/crit:

  warn FLOAT,
  crit FLOAT,

will become:

  warn_lower DOUBLE,
  warn_upper DOUBLE,
  warn_range ENUM ('inner','outer') NOT NULL,
  crit_lower DOUBLE,
  crit_upper DOUBLE,
  crit_range ENUM ('inner','outer') NOT NULL,

an increase in size from 6 to 22 bytes. (DOUBLE = 4, ALLOW NULL = 1, ENUM = 1.)

This leads me to observe:

- This will drastically increase the size of the binary data table.
- It will take a long time to alter this table schema.
- It leads to mass duplication. An example may be a graph of a week's data, 10080 points. During this time the warn, crit, max and min are unlikely to change, and can therefore be stored in one tuple.

A JOIN for this data:

  SELECT * FROM _bin JOIN _bin_extra

will not degrade performance. Rather, the speed advantage of reading a smaller table will result in faster data access. So using the 'extra' table will lead to smaller data and faster access.

I am really glad to see some DBAs on this group. What do you think?

Regards,

Ben. |
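Spelled out a little further, the join might look like this; the extra table name, its id column and the example mid are guesses at what the new schema could use, not actual definitions:

  SELECT b.ctime, b.value,
         e.warn_lower, e.warn_upper, e.crit_lower, e.crit_upper
    FROM perfdata_bin b
    JOIN perfdata_bin_extra e ON e.id = b.extra_data
   WHERE b.mid = 42
     AND b.ctime BETWEEN '2004-10-22' AND '2004-10-29'
   ORDER BY b.ctime;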
From: Viljo M. <na...@ma...> - 2004-10-29 11:18:13
|
Hey,

Now, me being quite ignorant about C stuff, but aren't there any libraries we could use to make this database layer more abstract? I'm a Perl programmer and I know there is a plethora of SQL abstraction layers in Perl which make it easy to use the same code for different databases (my latest and favourite find is Class::DBI).

Maybe it would be nice to get rid of (I mean, not write at all) those different db storage modules and have one general module which uses some smart lib to talk SQL to whichever server.

I found this: http://libdbi.sourceforge.net/
and this: http://siag.nu/libsdb/

And ODBC??

Best regards,
Viljo |
From: Yves M. <yme...@li...> - 2004-10-29 12:39:00
|
> Hey,
>
> Now, me being quite ignorant about C stuff, but aren't there any
> libraries we could use to make this database layer more abstract? I'm a
> Perl programmer and I know there is a plethora of SQL abstraction layers
> in Perl which make it easy to use the same code for different databases
> (my latest and favourite find is Class::DBI).
>
> Maybe it would be nice to get rid of (I mean, not write at all) those
> different db storage modules and have one general module which uses some
> smart lib to talk SQL to whichever server.

The main problem is speed. Will a generic db storage module be fast enough? You need to code it to see for yourself. I have no idea. :)

About db storage modules, we only have one: mysql :) All the other storage modules are different, and besides sql there will probably be file, socket, xml and rrd. Our modules are already an abstraction layer :)

> I found this: http://libdbi.sourceforge.net/

Somebody has already talked about libdbi. That was for Oracle. But perfparse used too many MySQL specificities, so he did not start. That was before the modules implementation. Moreover, libdbi does not seem to be very active, or very widely used.

> and this: http://siag.nu/libsdb/

http://freshmeat.net/projects/libsdb/?branch_id=15175
It looks like the author worked on it recently (6 months ago). It is used in siag office, but is siag office still alive? What about the other projects? The most important thing, if you build a module on such a library, is to be sure that if you find a bug it will be fixed soon and a new release will follow. A library can be usable today and stop being maintained one day; that day, you have to recode your software without it. That's why I like modules: if some lib is no longer supported, you only rework the module, not the whole project code!

> And ODBC??

Just do it :) Well, you want to code a module, not to reimplement anything in perfparse. So just do it, and then we will see how it works. One month ago it would have been very different, because if reworking perfparse to support another database had failed, we would have been forced to go back to 0.100.7. Today, whatever the version of perfparse is, when your module is ready we try it, and if it works we include it in perfparse. Any change in perfparse has no impact on your module. So just do it :)

Now, an answer to Ben about some ideas for the CGIs. I think that the code to produce graphs and statistics should be in a library, using some modules too. Modules, because with them you can read the data from anywhere: just plug in the right module! Who would use such libraries?

- the CGIs, of course
- some command-line tools. I would like a tool that gets some data from the database and produces a file that my boss could read without any SQL knowledge.
- some Perl or PHP modules (or other languages, if any)

I'm also asking myself whether C is the best language for getting data out of the database and generating documents with statistics. Aren't there other tools that you can feed the numbers and they give you the statistics you need?

For the graphs, however, I consider C good, because you always do the same thing: get the data between 2 dates, optionally compute the scale and limits, and draw. The result is a file, or some CGI-compatible output.

Well, I know my needs: get the data and generate graphs. What are your needs?

Yves
--
- Homepage - http://ymettier.free.fr - http://www.logicacmg.com -
- GPG key - http://ymettier.free.fr/gpg.txt -
- Maitretarot - http://www.nongnu.org/maitretarot/ -
- Perfparse - http://perfparse.sf.net/ - |