From: Bernd P. <b.p...@pc...> - 2008-11-24 11:28:17
|
Hi, in my maddr table are about 400.000 recordsets. Im using mysql 5.0.x and amavisd-new 2.6.1. On a lookup like SELECT msgs.time_num, msgs.mail_id, subject FROM msgs JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id WHERE sid='16' AND rid='2' AND content!='V' AND ds='P' ORDER BY msgs.time_num DESC LIMIT 1; I received an answer within 0.1 second. I realized a different problem: the combination of order by and LIMIT does not always list the last message. According to http://dev.mysql.com/doc/refman/5.0/en/limit-optimization.html MySQL ends sorting after - in this case - one recordset. There could exist newer recordsets in the database. Since the penpal bonus score is exponentially decreasing in time, it might be too low. Greetings Bernd > I didn't notice this until we upgraded to 5.1 in anticipation of using > partition tables, but I have a penpals query suggestion that took a 4 > hour SINGLE EMAIL LOOKUP to 90 seconds. > > Currently using mysql 5.1.x , amavisd 2.5.4, innodb tables. > Having major delays on email with penpals enabled on sites with a large > maddr table. > I never noticed this till we went to 5.1. and I don't have any 5.0 or > 4.x installations available to test this on. > > A query like this will take, sometimes several hours!!! (on a fast dual > quad machine, scsi mirrored drives, lots of ram, swap fine) > if you want to try this inhouse, do a select * from maddr limit 50; and > select a 'sid' from outside your domain, and a 'rid' from inside your > domain. > > SELECT msgs.time_num, msgs.mail_id, subject FROM msgs JOIN msgrcpt ON > msgs.mail_id=msgrcpt.mail_id WHERE sid='16' AND rid='2' AND > content!='V' > AND ds='P' ORDER BY msgs.time_num DESC LIMIT 1 > > explain shows things like this: (320K rows, 'Using where') and it > takes > 4 hours to run! > > explain SELECT msgs.time_num, msgs.mail_id, subject FROM msgs JOIN > msgrcpt ON msgs.mail_id=msgrcpt.mail_id WHERE sid='16' AND rid='2' AND > content!='V' AND ds='P' ORDER BY msgs.time_num DESC LIMIT 1; > +----+-------------+---------+-------+--------------------------------- > ----+---------------------+---------+-------------------+--------+----- > --------+ > | id | select_type | table | type | > possible_keys | key | key_len | > ref | rows | Extra | > +----+-------------+---------+-------+--------------------------------- > ----+---------------------+---------+-------------------+--------+----- > --------+ > | 1 | SIMPLE | msgs | index | > PRIMARY,msgs_idx_sid | msgs_idx_time_num | 4 | > NULL | 319783 | Using where | > | 1 | SIMPLE | msgrcpt | ref | > msgrcpt_idx_mail_id,msgrcpt_idx_rid | msgrcpt_idx_mail_id | 14 | > mail.msgs.mail_id | 1 | Using where | > +----+-------------+---------+-------+--------------------------------- > ----+---------------------+---------+-------------------+--------+----- > --------+ > 2 rows in set (0.01 sec) > > > I previously posted about this, and I suppose I didn't explain it well > enough. > > Three currious things make the query faster:. > 1) drop the ORDER BY in 'ORDER BY msgs.time_num DESC LIMIT 1' > this will probally not select the latest sent message, thereby > breaking penpals. > 2) drop the LIMIT 1. I don't know if this will break penpals, Mark > would need to tell us this > (maybe 2.6.2 could eliminate the limit 1? and penpals could just > grab the first response? > 3) force an index, as in > setting this in amavisd.conf: > > $sql_clause{sel_penpals}="Select msgs.time_num, msgs.mail_id, subject > FROM msgs force index (msgs_idx_sid) JOIN > msgrcpt ON msgs.mail_id=msgrcpt.mail_id WHERE sid=? AND rid=? AND > content!='V' AND ds='P' ORDER BY msgs.time_num > LIMIT 1"; > > both EXPLAIN in mysql, and timed tests confirm that either of the three > options takes a 4 HOUR QUERY into a 90 second query (yes, 90 seconds is > way too long, and yes, the maddr table is too big, and, yes, mysql 5.1, > amavisd 2.6 and partitions will help) > > results: > original query, > original: 319783 rows, might be using msgs_idx_time_num key > tests, > 1) no order by: 2018 rows, using msgs_idx_sid key > either case, if forcing an index helps, should not this be a good > edition to the default? > 2) no limit 1: 2018 rows, using msgs_idx_sid key > 3) force index, 2018 rows. > > documentation on explains: > original: 319783 rows, no key used, 'Using where' > > explain SELECT msgs.time_num, msgs.mail_id, subject FROM msgs JOIN > msgrcpt ON msgs.mail_id=msgrcpt.mail_id WHERE sid='16' AND rid='2' AND > content!='V' AND ds='P' ORDER BY msgs.time_num DESC LIMIT 1; > +----+-------------+---------+-------+--------------------------------- > ----+---------------------+---------+-------------------+--------+----- > --------+ > | id | select_type | table | type | > possible_keys | key | key_len | > ref | rows | Extra | > +----+-------------+---------+-------+--------------------------------- > ----+---------------------+---------+-------------------+--------+----- > --------+ > | 1 | SIMPLE | msgs | index | > PRIMARY,msgs_idx_sid | msgs_idx_time_num | 4 | > NULL | 319783 | Using where | > | 1 | SIMPLE | msgrcpt | ref | > msgrcpt_idx_mail_id,msgrcpt_idx_rid | msgrcpt_idx_mail_id | 14 | > mail.msgs.mail_id | 1 | Using where | > +----+-------------+---------+-------+--------------------------------- > ----+---------------------+---------+-------------------+--------+----- > --------+ > > any of #1, 2 or 3 above: > > explain SELECT msgs.time_num, msgs.mail_id, subject FROM msgs force > index (msgs_idx_sid) JOIN msgrcpt ON msgs.mail_id=msgrcpt.mail_id WHERE > sid='16' AND rid='2' AND content!='V' AND ds='P' ORDER BY msgs.time_num > DESC LIMIT 1; > +----+-------------+---------+------+---------------------------------- > ---+---------------------+---------+-------------------+------+-------- > ---------------------+ > | id | select_type | table | type | > possible_keys | key | key_len | > ref | rows | Extra | > +----+-------------+---------+------+---------------------------------- > ---+---------------------+---------+-------------------+------+-------- > ---------------------+ > | 1 | SIMPLE | msgs | ref | > msgs_idx_sid | msgs_idx_sid | 8 | > const | 2018 | Using where; Using filesort | > | 1 | SIMPLE | msgrcpt | ref | > msgrcpt_idx_mail_id,msgrcpt_idx_rid | msgrcpt_idx_mail_id | 14 | > mail.msgs.mail_id | 1 | Using where | > +----+-------------+---------+------+---------------------------------- > ---+---------------------+---------+-------------------+------+-------- > ---------------------+ > > table creates: > Table: msgs > Create Table: CREATE TABLE `msgs` ( > `mail_id` varbinary(12) NOT NULL, > `secret_id` varbinary(12) DEFAULT '', > `am_id` varchar(20) NOT NULL DEFAULT '', > `time_num` int(10) unsigned NOT NULL DEFAULT '0', > `time_iso` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE > CURRENT_TIMESTAMP, > `sid` bigint(20) unsigned NOT NULL, > `policy` varchar(255) DEFAULT '', > `client_addr` varchar(255) DEFAULT '', > `size` int(10) unsigned NOT NULL DEFAULT '0', > `content` binary(1) DEFAULT NULL, > `quar_type` binary(1) DEFAULT NULL, > `dsn_sent` char(1) DEFAULT NULL, > `spam_level` float DEFAULT NULL, > `message_id` varchar(255) DEFAULT '', > `from_addr` varchar(255) DEFAULT '', > `subject` varchar(255) DEFAULT '', > `host` varchar(255) NOT NULL DEFAULT '', > `quar_loc` varbinary(255) DEFAULT '', > PRIMARY KEY (`mail_id`), > KEY `msgs_idx_sid` (`sid`), > KEY `msgs_idx_mess_id` (`message_id`), > KEY `msgs_idx_time_num` (`time_num`), > KEY `msgs_idx_reports` (`time_num`,`content`,`quar_loc`) > ) ENGINE=InnoDB DEFAULT CHARSET=latin1 > > Table: msgrcpt > Create Table: CREATE TABLE `msgrcpt` ( > `mail_id` varbinary(12) NOT NULL, > `rid` bigint(20) unsigned NOT NULL, > `ds` char(1) NOT NULL DEFAULT '', > `rs` binary(1) NOT NULL, > `bl` char(1) DEFAULT '', > `wl` char(1) DEFAULT '', > `bspam_level` float DEFAULT NULL, > `smtp_resp` varchar(255) DEFAULT '', > KEY `msgrcpt_idx_mail_id` (`mail_id`), > KEY `msgrcpt_idx_rid` (`rid`), > KEY `idx_resp` (`smtp_resp`) > ) ENGINE=InnoDB DEFAULT CHARSET=latin1 > > -- > Michael Scheidell, CTO > Phone: 561-999-5000, x 1259 > > *| *SECNAP Network Security Corporation > > * Certified SNORT Integrator > * King of Spam Filters, SC Magazine 2008 > * Information Security Award 2008, Info Security Products Guide > * CRN Magazine Top 40 Emerging Security Vendors > > _______________________________________________________________________ > __ > This email has been scanned and certified safe by SpammerTrap(r). > For Information please see http://www.spammertrap.com > _______________________________________________________________________ > __ > > ----------------------------------------------------------------------- > -- > This SF.Net email is sponsored by the Moblin Your Move Developer's > challenge > Build the coolest Linux based applications with Moblin SDK & win great > prizes > Grand prize is a trip for two to an Open Source event anywhere in the > world > http://moblin-contest.org/redirect.php?banner_id=100&url=/ > _______________________________________________ > AMaViS-user mailing list > AMa...@li... > https://lists.sourceforge.net/lists/listinfo/amavis-user > AMaViS-FAQ:http://www.amavis.org/amavis-faq.php3 > AMaViS-HowTos:http://www.amavis.org/howto/ |