|
From: Arno B. <fbd...@ab...> - 2007-01-02 18:50:01
|
Hi, >> Why only not unique ones? >> > Sorry, Arno, but the unique indexes doesn't have constant selectivity? No, selectivity for every index is calculated with : 1 / (index-nodes - duplicate-index-nodes) Also UNIQUE index can contain multiple NULLS. Assume table A with fields ID and FK_ID and index for every field: ID FK_ID 1 NULL 2 NULL run SET STATISTICS for both indexes IDX_ID and IDX_FK_ID IDX_ID = 1 / 2 = 0.5 IDX_FK_ID = 0.0 insert more values ID FK_ID 1 NULL 2 NULL 3 1 4 2 5 2 run SET STATISTICS only for IDX_FK_ID IDX_ID will stay 0.5 IDX_FK_ID = 1 / (5 - 2) = 0.3333 Selectivity for IDX_FK_ID is now better as IDX_ID. >> It's not interesting to track modifications of index. >> All indexes which belong to the same table should update the statistics simultaneously else the >> statistics become >> "unbalanced" compared to each other. >> > Yes, but the "separate" indexes from the same table doesn't update at > (approximative) same moment in time? That's ofcourse true, not always all indexes are modified when updating data. >> Tracking modifications (U/D/I) of a table (or data-pages updates) would be enough. >> Using that number to calculate the percent that has changed of the table x should prepare the >> "update statistics batch >> for table x" in the scheduler. > For "D" and "I" - agree :).For "U" how can we track when the selectivity > should be /really/ updated? (see the Microsoft papers from Laco's > message and from mine). Let's have a common usage case: In an ERP we > have an ACC table with the following structure (*all* the fields are > indexed) > ACCID INTEGER, /* PK */ > NAME VARCHAR(70), /* The account name */ > DB01 DOUBLE PRECISION, > DB02 DOUBLE PRECISION, > ... > DB12 DOUBLE PRECISION, > CR01 DOUBLE PRECISION, > CR02 DOUBLE PRECISION, > ... > CR12 DOUBLE PRECISION, > etc. Why not: ACCID INTEGER NOT NULL PRIMARY KEY, NAME VARCHAR(70) Second table: FK_ACCID INTEGER, ACC_TYPE INTEGER, ACC_MONTH DOUBLE PRECISION but i assume it has a good reason. > On daily basis, the debit and credit fields (which are 24) are updated > frequently on different records. So, when the max_thresold will hit > (let's say) because of DB01 (because we are in January) then all 25 > indexes will be recalculated (the PK not because is unique, CIIW, the > NAME and the other 24 numeric fields). The main overhead IMHO comes from > the field NAME (due to fact that is very large compared with others - > 210 bytes on UNICODE) which is a very static index in day-by-day usage. For your case above i wonder if it would be really usefull to update statistics after x-% modifications. Those double precision values are probably very unique to each other. > Hence a rather 'small' efficiency, IMHO. But, of course, you know better > than me the engine internals, so if the things became complex / worse > feel free to do as you wish. Anyway, my small benchmark shows that > updating the statistics isn't a performance hit. Running SET STATISTICS for many indexes in a busy system is certainly going to slow down, because all index pages will be fetched, but i agree that it must be happen by the engine automaticly at some point. > Also, some 'reading' for you :) : I will travel to your links-collection and see if there usefull information applicable for firebird. Regards, Arno Brinkman ABVisie -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- General database development support: http://www.databasedevelopmentforum.com Firebird open source database (based on IB-OE) with many SQL-99 features : http://www.firebirdsql.org http://www.firebirdsql.info http://www.fingerbird.de/ http://www.comunidade-firebird.org/ Support list for Firebird and Interbase users : fir...@ya... Nederlandse firebird nieuwsgroep : news://newsgroups.firebirdsql.info |