From: Cedric K. G. <ce...@gn...> - 2007-11-04 17:24:25
|
Hi there We have a particular ActionApps-driven site on its own virtual server, because it has some serious CPU, memory and disk demands. The main slice has 40 000 items, but the problem is the number of slow queries, most of which are RSS feeds, jsview "top 10" lists, tickers and so on. A typical query from such a view looks like: SELECT DISTINCT item.id as itemid , item.publish_date as s0 FROM item LEFT JOIN content as c0 ON (c0.item_id=item.id AND (c0.field_id='place...........' OR c0.field_id is NULL)) LEFT JOIN content as c1 ON (c1.item_id=item.id AND (c1.field_id='category........' OR c1.field_id is NULL)) WHERE item.slice_id = @myslice AND ( ( item.status_code=1 AND (item.publish_date <= '1194184800') AND (item.expiry_date > '1194184800') ) ) AND ( (c0.text LIKE "%Heal%") ) AND ( (c1.text LIKE "Somestring%") ) ORDER BY item.publish_date DESC; # Query_time: 11 Lock_time: 0 Rows_sent: 2364 Rows_examined: 107941 2364 rows are returned to PHP, but in fact the list length specified in the view is only 6. By using the index optimisation described at http://dev.mysql.com/doc/refman/5.0/en/limit-optimization.html much less of the database should need to be scanned. What would be needed would be to drop the DISTINCT, and a LIMIT 6 for this type of view. This optimization works correctly under MySQL 5.0 given the appropriate indexing, although really it should work even if DISTINCT is still present (if MySQL could de-duplicate as results were returned). So, firstly, is DISTINCT really necessary in the query at /include/searchlib.php3 line 1291 in QueryZIDs()? Just dropping 'DISTINCT' in phpMyAdmin, I get the same dataset returned in a fraction of the time, under a second. So these views should be much faster if if an appropriate index of the item table is created if it's not already there, e.g. (slice_id, status_code, last_edit), and we pass the view's "listlen" and "list_from" properties to LIMIT and OFFSET options respectively. With DISTINCT on the query above, I get 2375 rows in set (14.67 sec) Without DISTINCT and with LIMIT 6, I get 6 rows in set (0.01 sec) That is, the same ordering, but just the top 6 results. I also wonder whether this affects scrollers? Would it be possible to use "SELECT SQL_CALC_FOUND_ROWS" when a scroller needs to determine FOUND_ROWS(), but use LIMIT and OFFSET instead when looking at particular sections? The type of view could be passed to QueryZIDs() as well as the listlen. More simply the listlen parameter would only be passed if it were a view like that above which didn't include a scroller. If scrollers are to take advantage of the same optimisation, perhaps there could be a different scroller alias which only links to 'next page'. The site performance is bad enough that I'm tempted to leap in and hack some code, but what might go wrong? Is there a schematic of the AA code anywhere? Thanks Cedric |