From: Alex K. <al_...@ya...> - 2005-04-07 10:58:42
Hi!

It would be nice to see the following things in future versions of Firebird:

1) Better support for aliases in queries

For example, the following query produces an error but is perfectly legal
from my point of view:

  select employeeid, sum( amount ) as income from payouts
  group by employeeid having income > 10000;

2) Indexes should be used for inequality search conditions

For example, the index _is_ used during execution of the query below:

  select recid from mytable where status = 0;

but NOT used during execution of the following query (even if a valid plan
was specified):

  select recid from mytable where status < 1;

SY, Alex Kotov
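A minimal workaround for item 1, assuming the payouts table from the example
above, is to repeat the aggregate expression in the HAVING clause instead of
referencing the alias:

  select employeeid, sum( amount ) as income
  from payouts
  group by employeeid
  having sum( amount ) > 10000;

This form should be accepted by existing versions; it only loses the
convenience of naming the expression once.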
From: Dmitry Y. <di...@us...> - 2005-04-12 06:25:27
"Alex Kotov" <al_...@ya...> wrote:
>
> 1) Better support for aliases in queries
>
> For example, the following query produces an error but is perfectly legal
> from my point of view:
>
>   select employeeid, sum( amount ) as income from payouts
>   group by employeeid having income > 10000;

I understand such a wish, although I wouldn't assign it a high priority. We
don't support CSE (common subexpression elimination), so for now aliases just
hide the fact that any complex aliased expression may be evaluated more than
once. This is often a source of hidden performance issues. Personally, I
prefer explicitness in this case: more to type, but much easier to understand
what to expect from the query.

> 2) Indexes should be used for inequality search conditions
>
> For example, the index _is_ used during execution of the query below:
>
>   select recid from mytable where status = 0;
>
> but NOT used during execution of the following query (even if a valid plan
> was specified):
>
>   select recid from mytable where status < 1;

I'm pretty sure the index _is_ used. Feel free to prove me wrong with a real
example.

Dmitry
From: Eyal <ey...@re...> - 2005-04-13 14:29:48
Dmitry Yemanov wrote:
> "Alex Kotov" <al_...@ya...> wrote:
>
>> 1) Better support for aliases in queries
>>
>> For example, the following query produces an error but is perfectly legal
>> from my point of view:
>>
>>   select employeeid, sum( amount ) as income from payouts
>>   group by employeeid having income > 10000;
>
> I understand such a wish, although I wouldn't assign it a high priority. We
> don't support CSE (common subexpression elimination), so for now aliases
> just hide the fact that any complex aliased expression may be evaluated
> more than once. This is often a source of hidden performance issues.
> Personally, I prefer explicitness in this case: more to type, but much
> easier to understand what to expect from the query.

From the Release Notes of FB2 alpha:

  "Improved GROUP BY and ORDER BY clauses   A. Brinkman
   Column aliases are now allowed in both these clauses.
   Examples: ..."

Does this mean that column aliases are only supported for non-expression
columns? Or that they are supported for GROUP BY/ORDER BY but not for HAVING?

Thanks,
Eyal.
From: Dmitry Y. <di...@us...> - 2005-04-13 20:00:44
"Eyal" <ey...@re...> wrote:
>
> From the Release Notes of FB2 alpha:
>
>   "Improved GROUP BY and ORDER BY clauses   A. Brinkman
>    Column aliases are now allowed in both these clauses.
>    Examples: ..."
>
> Does this mean that column aliases are only supported for non-expression
> columns?

For all columns, including expressions, with the possible performance issues
I outlined.

> Or that they are supported for GROUP BY/ORDER BY but not for HAVING?

They are not supported for HAVING, as far as I remember.

Dmitry
From: Alex K. <al_...@uk...> - 2005-06-02 12:10:00
"Dmitry Yemanov" <di...@us...> wrote:
>> 2) Indexes should be used for inequality search conditions
>>
>> For example, the index _is_ used during execution of the query below:
>>
>>   select recid from mytable where status = 0;
>>
>> but NOT used during execution of the following query (even if a valid plan
>> was specified):
>>
>>   select recid from mytable where status < 1;
>
> I'm pretty sure the index _is_ used. Feel free to prove me wrong with a
> real example.

First of all, I need to clarify my point a bit: Firebird (1.5.2) reports that
the index _is_ used (i.e. you're correct), but the query execution time is
unacceptable. Please repeat the following test case as closely as possible:

1) Create the test database and table:

  create database '/var/lib/databases/test.db'
    user 'sysdba' password 'topsecret';
  connect '/var/lib/databases/test.db'
    user 'sysdba' password 'topsecret';
  create generator myrecid;
  create table mytable (
    recid bigint not null primary key,
    status smallint
  );

2) Fill the table with test data:

// --------- please execute the perl script below to fill the table, cut below
#!/usr/bin/perl
use strict;
use DBI;

my $dsn = "DBI:InterBase:dbname=/var/lib/databases/test.db;ib_dialect=3;ib_charset=unicode_fss";
my $dbuser = "sysdba";
my $dbpassword = "topsecret";

my $dbh = DBI->connect( $dsn, $dbuser, $dbpassword, { RaiseError => 0 } )
    or die( "unable to connect to database '$dsn'" );
$dbh->{ AutoCommit } = 0;
$dbh->func( -access_mode     => 'read_write',
            -isolation_level => 'read_committed',
            -lock_resolution => 'wait',
            'ib_set_tx_param' );

# 900000 rows whose status converts to 1 or 2
my $i = 0;
for ( $i = 0; $i < 900000; $i++ ) {
    my $x = rand( 1 ) + 1;
    $dbh->do( "insert into mytable ( recid, status ) values( gen_id( myrecid, 1 ), ? )",
              undef, $x );
}

# 30 rows with status 0, the ones the test queries look for
for ( $i = 0; $i < 30; $i++ ) {
    $dbh->do( "insert into mytable ( recid, status ) values( gen_id( myrecid, 1 ), 0 )" );
}

$dbh->commit;
$dbh->disconnect;
undef( $dbh );
// --------- end of script, stop cutting above

3) Create the index:

  create index myidx0 on mytable( status );

4) Simulate some past table activity (important! do not skip this step):

  update mytable set status = 3 where status = 1;
  update mytable set status = 1 where status = 2;
  update mytable set status = 2 where status = 3;
  commit;

5) Please execute both queries below and compare the execution times:

  a) select recid from mytable where status = 0;
  b) select recid from mytable where status < 1;

You may also try to measure the execution time of:

  select status, count(*) from mytable group by status;

SY, Alex Kotov
From: Dmitry Y. <di...@us...> - 2005-06-03 12:04:11
"Alex Kotov" <al_...@uk...> wrote:
>
> First of all, I need to clarify my point a bit: Firebird (1.5.2) reports
> that the index _is_ used (i.e. you're correct), but the query execution
> time is unacceptable.
>
> [snip]
>
> 5) Please execute both queries below and compare the execution times:
>
>   a) select recid from mytable where status = 0;
>   b) select recid from mytable where status < 1;

(status < 1) performs a range index scan looking for both 0 and 1 values,
even though the upper bound is strict. The values of 1 are eliminated later,
after all the corresponding records have been fetched from cache/disk. With a
non-uniform data distribution this can become a real performance issue, and
that is exactly what you observe. It is going to be improved in future
versions.

Dmitry
From: Vlad H. <hv...@us...> - 2005-06-03 12:18:29
> 3) Create the index:
>
>   create index myidx0 on mytable( status );
>
> 4) Simulate some past table activity (important! do not skip this step):
>
>   update mytable set status = 3 where status = 1;
>   update mytable set status = 1 where status = 2;
>   update mytable set status = 2 where status = 3;
>   commit;

This way you insert all possible key values into the index for each record.

> 5) Please execute both queries below and compare the execution times:
>
>   a) select recid from mytable where status = 0;

The updates create no keys with the value 0.

>   b) select recid from mytable where status < 1;

An index key with the value 1 points to each record in the table, because of
the garbage you created in step 4.

Regards,
Vlad
From: Jim S. <ja...@ne...> - 2005-06-03 14:00:44
Alex Kotov wrote:
> "Dmitry Yemanov" <di...@us...> wrote:
>
>>> 2) Indexes should be used for inequality search conditions
>>>
>>> For example, the index _is_ used during execution of the query below:
>>>
>>>   select recid from mytable where status = 0;
>>>
>>> but NOT used during execution of the following query (even if a valid
>>> plan was specified):
>>>
>>>   select recid from mytable where status < 1;

Excuse me, but I don't understand who said that indexes should not be used
for inequalities, or why he or she would say that. Unless 100% of the records
meet the selection criterion, the indexed version will always be faster, and
even if 100% of the records are selected, the cost of the index scan is small
compared to fetching the actual records. Am I missing something here?

--
Jim Starkey
Netfrastructure, Inc.
978 526-1376
From: Ann W. H. <aha...@ib...> - 2005-06-03 16:34:00
Jim Starkey wrote:
>
> Excuse me, but I don't understand who said that indexes should not be used
> for inequalities, or why he or she would say that. Unless 100% of the
> records meet the selection criterion, the indexed version will always be
> faster, and even if 100% of the records are selected, the cost of the index
> scan is small compared to fetching the actual records. Am I missing
> something here?

Yes... the process of building the data for the query includes this step:

  4) Simulate some past table activity (important! do not skip this step):
    update mytable set status = 3 where status = 1;
    update mytable set status = 1 where status = 2;
    update mytable set status = 2 where status = 3;
    commit;

Those updates create back versions and index entries. The result is that
every record has a version with the value 1. Most of them are old versions,
and the garbage collection may also affect the query timings.

Regards,
Ann
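One way to separate the garbage-collection effect from the range-scan
behaviour itself, sketched here rather than prescribed, is to force a full
pass over the table after the updates have been committed, giving the engine
a chance to clean up the old versions and their index entries, and only then
repeat the two test queries:

  select count(*) from mytable;
  commit;
  select recid from mytable where status = 0;
  select recid from mytable where status < 1;

If the timing gap remains after the cleanup, the cost comes from the keys on
the current records whose status is 1, which the strict upper bound still
forces the scan to visit.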
From: Alex K. <al_...@uk...> - 2005-06-03 20:29:07
|
""Ann W. Harrison"" <aha...@ib...> ???????/???????? ? ???????? >> Excuse me, but I don't understand who said that indexes should not be >> used for inequalities or why he or she would say that. Unless 100% of >> the records meet the selection critereon, the indexed version will always >> be faster, and even if 100% of the records are selected, the cost of the >> index scan is small compared to fetching the actual records. Am I >> missing something here? Problem a bit different and Dmitry already gave answer on it -- existing versions of Firebird not optimally handle "greater than" and "less than" index lookups - actually they performs "greater or equal" and "less or equal" index lookup and eliminate "equal" rule at the end of query execution. As a result with non-uniform key distribution we have performance penalty like in the demonstrated case (i.e. process ~450030 records instead of 30). > Yes... the process of building the data for the query, includes this step: > 4) Simulate some past table activity (important! do not skip this step) > update mytable set status = 3 where status = 1; > update mytable set status = 1 where status = 2; > update mytable set status = 2 where status = 3; > commit; > Those updates create back versions and index entries. The result is that > every record has a version that has the value 1. Please do not repeat mistakes of Vlad Horsun!!! First of all there are 30 records in table which have key value "0" and NEVER change it. And we're interested in those records in my case and not in the records which we did shuffle on. At second, all old versions should be discarded at commit and may stay as garbage but must not be used and should not affect performance so dramatically (it degrades from 1 second to about several minutes on Xeon 2.8GHz with 1Gb of RAM). Indexes also must be updated to reflect latest committed changes. > Most of them are old versions and the garbage collection may also affect > the query timings. I agree that garbage collection may affect but this is not the case. SY, Alex Kotov |
From: Jim S. <ja...@ne...> - 2005-06-04 11:26:45
Alex Kotov wrote:
>
> The problem is a bit different, and Dmitry has already answered it:
> existing versions of Firebird do not handle "greater than" and "less than"
> index lookups optimally. They actually perform a "greater than or equal" or
> "less than or equal" index lookup and eliminate the "equal" matches at the
> end of query execution. As a result, with a non-uniform key distribution we
> get a performance penalty like the one in the demonstrated case (i.e.
> ~450030 records are processed instead of 30).

Since the alternative was reading the table exhaustively, those records would
have been read anyway, so the worst-case hit in your case was scanning more
of the index than was necessary -- not exactly an earth-shattering problem.