Re: [Firebird-devel] AutoOptimize

SourceForge Headquarters 1320 Columbia Street Suite 310 San Diego, CA 92101 +1 (858) 422-6466

Hi,

>> Why only not unique ones?
>>
> Sorry, Arno, but the unique indexes doesn't have constant selectivity?

No, selectivity for every index is calculated with :

1 / (index-nodes - duplicate-index-nodes)

Also UNIQUE index can contain multiple NULLS.

Assume table A with fields ID and FK_ID and index for every field:

ID    FK_ID
1        NULL
2        NULL

run SET STATISTICS for both indexes IDX_ID and IDX_FK_ID

IDX_ID = 1 / 2 = 0.5
IDX_FK_ID = 0.0

insert more values

ID    FK_ID
1        NULL
2        NULL
3        1
4        2
5        2

run SET STATISTICS only for IDX_FK_ID

IDX_ID will stay 0.5
IDX_FK_ID = 1 / (5 - 2) = 0.3333

Selectivity for IDX_FK_ID is now better as IDX_ID.

>> It's not interesting to track modifications of index.
>> All indexes which belong to the same table should update the statistics simultaneously else the 
>> statistics become
>> "unbalanced" compared to each other.
>>
> Yes, but the "separate" indexes from the same table doesn't update at
> (approximative) same moment in time?

That's ofcourse true, not always all indexes are modified when updating data.

>> Tracking modifications (U/D/I) of a table (or data-pages updates) would be enough.
>> Using that number to calculate the percent that has changed of the table x should prepare the 
>> "update statistics batch
>> for table x" in the scheduler.

> For "D" and "I" - agree :).For "U" how can we track when the selectivity
> should be /really/ updated? (see the Microsoft papers from Laco's
> message and from mine). Let's have a common usage case: In an ERP we
> have an ACC table with the following structure (*all* the fields are
> indexed)

> ACCID INTEGER, /* PK */
> NAME VARCHAR(70), /* The account name  */
> DB01 DOUBLE PRECISION,
> DB02 DOUBLE PRECISION,
> ...
> DB12 DOUBLE PRECISION,
> CR01 DOUBLE PRECISION,
> CR02 DOUBLE PRECISION,
> ...
> CR12 DOUBLE PRECISION,
> etc.

Why not:
ACCID INTEGER NOT NULL PRIMARY KEY,
NAME VARCHAR(70)

Second table:
FK_ACCID INTEGER,
ACC_TYPE INTEGER,
ACC_MONTH DOUBLE PRECISION

but i assume it has a good reason.

> On daily basis, the debit and credit fields (which are 24) are updated
> frequently on different records. So, when the max_thresold will hit
> (let's say) because of DB01 (because we are in January) then all 25
> indexes will be recalculated (the PK not because is unique, CIIW, the
> NAME and the other 24 numeric fields). The main overhead IMHO comes from
> the field NAME (due to fact that is very large compared with others -
> 210 bytes on UNICODE) which is a very static index in day-by-day usage.

For your case above i wonder if it would be really usefull to update statistics after x-% 
modifications. Those double precision values are probably very unique to each other.

> Hence a rather 'small' efficiency, IMHO. But, of course, you know better
> than me the engine internals, so if the things became complex / worse
> feel free to do as you wish. Anyway, my small benchmark shows that
> updating the statistics isn't a performance hit.

Running SET STATISTICS for many indexes in a busy system is certainly going to slow down, because 
all index pages will be fetched, but i agree that it must be happen by the engine automaticly at 
some point.

> Also, some 'reading' for you :)  :

I will travel to your links-collection and see if there usefull information applicable for firebird.

Regards,
Arno Brinkman
ABVisie

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
General database development support:
  http://www.databasedevelopmentforum.com

Firebird open source database (based on IB-OE) with many SQL-99 features :
  http://www.firebirdsql.org
  http://www.firebirdsql.info
  http://www.fingerbird.de/
  http://www.comunidade-firebird.org/

Support list for Firebird and Interbase users :
fir...@ya...

Nederlandse firebird nieuwsgroep :
news://newsgroups.firebirdsql.info 

Re: [Firebird-devel] AutoOptimize

A powerful, cross platform, SQL database system

Re: [Firebird-devel] AutoOptimize