#4243 (ok 4.3) Super slow page rendering with tens of thousands of DBs

4.1.5
fixed
None
1
2014-12-05
2014-01-21
No

I recently upgraded phpmyadmin from 3.x to 4.x in a DB setup that has about 25,000 DBs (though only a small number of per-page DBs are visible, due to our only_db setting), and found that page load times were in the realm of 60-90 seconds each -- agonizingly slow (we run MySQL 5.5, btw). I traced the cause down to all the counting and selecting against the information_schema DB being done for DB names: anytime an explicit name isn't mentioned (e.g. LIKE is used) the query took 20, 40, even 60 seconds to complete.

I eventually made enough changes to the code to get page loads down to normal, and will mention some of the things I discovered:

  • The only SELECT of information_schema's DB names that is quick is an exact match against SCHEMA_NAME w/o any charset transformation (i.e. no use of BINARY casts).
  • Even a SELECT using IN('db1', 'db2') is super slow.
  • Doing SELECT queries with a restricted (non-super) user is slower than the super-user (probably due to all the security checks that the super-user can avoid) so don't be fooled by any speed tests you run using a root login.
  • Doing a COUNT() of DB info in the information_schema is just as super slow as a SELECT.
  • The use of SHOW DATABASES LIKE is quite fast.

Thus, for a general fix, I'd recommend that you change the code that does any DB-name matching to do a SHOW DATABASES LIKE call, get the list of DB names, and then loop over it with individual-DB-name SELECTs into information_schema.

FYI, the fixes I did were a simpler set of changes that work for our particular site. If you're curious, here's what I did:

  • I changed all the count operations (e.g. getPresence() calls) into array-length operations of the only_db array (since that is always a full list of the user's tables in our setup).
  • I removed all BINARY casts in the database-name matching code.
  • I removed all use of LIKE in the database-name matching code.
  • I changed the code path in getDatabasesFull() that lists all DBs (when $database isn't specified) into a loop over the only_db list of DB names, adding each individual DB lookup into the $databases[$db_name] array.

Discussion

1 2 > >> (Page 1 of 2)
  • Wayne Davison

    Wayne Davison - 2014-01-21
    • Description has changed:

    Diff:

    --- old
    +++ new
    @@ -1,4 +1,4 @@
    -I recently upgraded phpmyadmin from 3.x to 4.x in a DB setup that has about 25,000 DBs, and found that page load times were in the realm of 60-90 seconds each -- agonizingly slow (we run MySQL 5.5, btw).  I traced the cause down to all the counting and selecting against the information_schema DB being done for DB names: anytime an explicit name isn't mentioned (e.g. LIKE is used) the query took 20, 40, even 60 seconds to complete.
    +I recently upgraded phpmyadmin from 3.x to 4.x in a DB setup that has about 25,000 DBs (though only a small number of per-page DBs are visible, due to our only_db setting), and found that page load times were in the realm of 60-90 seconds each -- agonizingly slow (we run MySQL 5.5, btw).  I traced the cause down to all the counting and selecting against the information_schema DB being done for DB names: anytime an explicit name isn't mentioned (e.g. LIKE is used) the query took 20, 40, even 60 seconds to complete.
    
     I eventually made enough changes to the code to get page loads down to normal, and will mention some of the things I discovered:
    
     
  • Marc Delisle

    Marc Delisle - 2014-01-21

    Hi,
    I assume you are using phpMyAdmin 4.1.5. Which exact MySQL 5.5 version are you using? Which PHP version? Are you using the mysqli extension?

     
  • Wayne Davison

    Wayne Davison - 2014-01-21

    We're using Percona's 5.5.34-rel32.0.591. While we're indeed using mysqli and php 5.3, the interface details don't really impact the root cause -- the queries themselves that are quite slow. e.g. (3 DB names are in only_db array):

    [count] => 2
    [query] => SELECT COUNT(*) FROM `INFORMATION_SCHEMA`.`SCHEMATA` WHERE TRUE
         AND `SCHEMA_NAME` NOT REGEXP '^information_schema$'
         AND ( `SCHEMA_NAME` LIKE 'prefix_foo'
            OR `SCHEMA_NAME` LIKE 'prefix_piwik'
            OR `SCHEMA_NAME` LIKE 'prefix_test' )
    [time] => 25.233343839645
    

    Since information_schema doesn't have any indexing, any time that query isn't SCHEMA_NAME = 'single_name' then it does a scan of the whole list of DBs, which is very expensive when there are a lot of DB names around. Note that removing the "NOT REGEXP" doesn't speed anything up, nor does changing "LIKE" to "=" in this instance (since it's still doing a full scan of the DBs with multiple names).

    As an example, this executes in 0.00 sec:

    SELECT SCHEMA_NAME
    FROM `information_schema`.SCHEMATA WHERE `SCHEMA_NAME` = 'prefix_foo'
    UNION ALL SELECT SCHEMA_NAME
    FROM `information_schema`.SCHEMATA WHERE `SCHEMA_NAME` = 'prefix_piwik'
    UNION ALL SELECT SCHEMA_NAME
    FROM `information_schema`.SCHEMATA WHERE `SCHEMA_NAME` = 'prefix_test';
    

    But putting all 3 of those into a single "IN (...)" or "OR" list takes over 20 seconds to return fron the scan.

     
  • Marc Delisle

    Marc Delisle - 2014-01-22

    Wayne,
    thanks for your report. Could you show us the results of these?

    show variables like 'innodb_stats_on_metadata'
    show variables like 'innodb_stats_auto_update'

     
  • Wayne Davison

    Wayne Davison - 2014-01-22

    We turned off stats on metadata while debuggin the issue. The stats auto-update is on, which I think is what we want:

    | innodb_stats_on_metadata      | OFF         |
    | innodb_stats_auto_update      | 1           |
    

    One other thing that might contribute to the slowness is that we have around 150,000 grants in the mix (dunno if lots of grants slows it down more than just the 25,000 databases or not).

     
  • Wayne Davison

    Wayne Davison - 2014-01-22

    No change. The former cited SELECT...COUNT query still takes 23 seconds for a normal user (root only needs 0.41 seconds -- not sure why the two differ so widely).

    FYI, the comments on the blog post you cited make it sound like innodb_stats_on_metadata is authoritative when both vars exist, but it indeed seems better to turn them both off, so I have done so.

     
  • Sean J.

    Sean J. - 2014-04-25

    Wayne, would it be possible to provide your patch? I too am having the same issues. Using MySQL 5.5.37, and PMA 4.1.13, even with only 5k databases, PMA can take anywhere from 60 seconds to 5 minutes to load. Exporting a database is equally slow as there's numerous queries performed to I_S well before the export begins.

     
  • Sean J.

    Sean J. - 2014-05-03

    Due to the severe performance issues with 4.x and its heavy usage of I_S, I've been forced to downgrade 100% of our 1500+ servers to PMA 3.5.8.2 til such time as PMA devs make appropriate optimizations to the code base to be useful again in shared hosting environments or those with hundreds or thousands of databases.

    PMA devs, please consider implementing Wayne's changes.

     
    Last edit: Sean J. 2014-05-03
  • Wayne Davison

    Wayne Davison - 2014-05-03

    Sorry for the delay. The attached patch includes the changes I made to get 4.1.3 to work acceptably in our environment. It REQUIRES that the config puts all the DB names that the user should have access to inside the 'only_db' as an array (and does not support any wildcarding in that array). Note also that a more recent change that was made in response to my report on wildcards in the sidebar not working would need to be reverted (if that change is in the version you want to use), since it adds more slow scanning to the code.

    Keep in mind that this patch is not a suggestion for how to fix the issue for the general release, though if the general code were modified to populate the only_db list with a static list of DB names (via a table scan, not a schema scan) it could be pretty close to a more general solution. i.e. my kluge does that scan in the config file, but a general solution could allow the list to be empty and/or contain "LIKE" wildcards as long as it created a list of tables in the only_db array right after reading the config.

     
    Last edit: Wayne Davison 2014-05-03
  • Sean J.

    Sean J. - 2014-05-03

    Thanks Wayne, I'll check out the patch. Wouldn't a middle-ground general case solution be where 'only_db' was absent, and even say 'hide_db' to do a single pass of 'SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA WHERE TRUE ORDER BY SCHEMA_NAME ASC LIMIT 0, 250;', store the results of those DBs into possibly the 'only_db' array and foreach off those for subsequent queries? No LIKE, REGEXP, IN calls, just simple ='s.

    Looking through MySQL's general log and slow queries, it seems PMA is just relying on the the users permissions which is the slow down for these I_S queries due to internal error handling. Granted, MariaDB, and I presume Percona, has introduced error suppression fixes to return 0 vs. going through the silent error handling code, which does offer a degree of performance gain over MySQL. However, if we already know which DBs the user has access to from the first query, it seems wise to just loop through those for all the COUNT(*) or any other query against I_S.

     
  • Wayne Davison

    Wayne Davison - 2014-05-03

    No scanning of SCHEMA_NAME can be allowed or performance becomes too slow. Only individual DB name selects should be used (see my UNION ALL comment earlier). Thus, the code needs to use SHOW TABLES LIKE ... to find matching names (if any matching needs to be done), and then those names can be used for individual SCHEMA_NAME lookups.

     
  • Sean J.

    Sean J. - 2014-05-03

    Also, and PMA 3.x and likely older versions suffered from this when 'only_db' was absent, is that after selecting a database, it'll re-trigger the 'SHOW DATABASES' or the I_S query if 4.x. Unless a user clicks the "Reload navigation frame" button, we shouldn't be requerying all their DBs again. Otherwise, we take the performance hit consistently just navigating around through PMA, even if it's more minimalistic on 3.x.

     
  • Marc Delisle

    Marc Delisle - 2014-05-03

    Sean,
    just to be sure, your servers are using the mysqli extension, right?

     
  • Marc Delisle

    Marc Delisle - 2014-05-03

    Also, any chance Wayne and Sean could test MySQL 5.6 in your environment? It was marked on 2013-02-05 for General Availability.

     
  • Sean J.

    Sean J. - 2014-05-03

    Yes, we're using mysqli. Unfortunately I can't use MySQL 5.6 in our environment easily without recompiling a number of RPMs, nor do we intend to use it, but plan on skipping to MariaDB 10.x in the near future. However, I have tested MySQL 5.5.37 vs. MariaDB 5.5.37 and due to the error suppression on permission handling, MariaDB has shown a noticeable performance gain for I_S queries.

    I "possibly" could whip something up on a test server, generate several thousand DBs, but it begs the question Marc. What are you hoping to get out of MySQL 5.6? Even if I_S is generally better, it would still be in PMA's better interest in my opinion to optimize several of your queries and make more precise use of I_S (ie, WHERE SCHEMA_NAME = "$database"). Plus, avoid requerying the I_S when selecting a database for all the users databases when you should already have that stored. That request should only be performed when clicking "Reload navigation frame" or going back to home.

     
    Last edit: Sean J. 2014-05-03
  • Marc Delisle

    Marc Delisle - 2014-05-04

    Sean,
    we have a team meeting next week and this subject will be discussed.

     
  • Marc Delisle

    Marc Delisle - 2014-05-04

    Sean,
    I'm trying to reproduce your problem. I have set up a Linux server with MySQL 5.5.37 compiled from source. I copied support-files/my-medium.cnf to my.cnf. PHP is 5.5.7, with mysqli and mysqlnd. phpMyAdmin is 4.1.13. I have 5000 databases, each with 10 tables. The tables are empty. InnoDB everywhere.

    User root logs in and after 2 seconds, sees the first page of the database names in the navi panel. Going to the next page takes also 2 seconds. A normal user having wildcard privileges on databases starting with "username_" is able to log in in 2 seconds also.

     
  • Sean J.

    Sean J. - 2014-05-04

    Just created a throw away account created no databases for it, just the primary user, extracted the tarballs for PMA 3.5.8.2 and PMA 4.1.14. No modifications made whatsoever to either PMA setups.

    System setup:
    CentOS 6.5
    MySQL 5.5.37
    PHP 5.4.27

    As root:
    SELECT COUNT() FROM INFORMATION_SCHEMA.SCHEMATA WHERE TRUE;
    +----------+
    | COUNT(
    ) |
    +----------+
    | 5021 |
    +----------+
    1 row in set (0.07 sec)

    As regular user:
    mysql> SELECT COUNT() FROM INFORMATION_SCHEMA.SCHEMATA WHERE TRUE;
    +----------+
    | COUNT(
    ) |
    +----------+
    | 1 |
    +----------+
    1 row in set (1 min 14.26 sec)

    mysql> show grants for 'somethro'@'localhost';
    +-----------------------------------------------------------------------------------------------------------------+
    | Grants for somethro@localhost |
    +-----------------------------------------------------------------------------------------------------------------+
    | GRANT USAGE ON . TO 'somethro'@'localhost' IDENTIFIED BY PASSWORD '66AAA7BFA54C8705C7B1B93CD95B4D7E5B7C38FB' |
    | GRANT ALL PRIVILEGES ON somethro\_%.
    TO 'somethro'@'localhost' |
    +-----------------------------------------------------------------------------------------------------------------+

    mysql> select count() from mysql.db;
    +----------+
    | count(
    ) |
    +----------+
    | 96370 |
    +----------+
    1 row in set (0.01 sec)

    mysql> select count() from mysql.user;
    +----------+
    | count(
    ) |
    +----------+
    | 19283 |
    +----------+
    1 row in set (0.00 sec)

    After 2+ minutes of it taking to load on both 3.5.8.2 and 4.1.14, I stopped counting the time to fully load the page. I'll test adjusting the my.cnf to the support-files/my-medium.cnf, just to do a fair comparison to your own test and test as root next for PMA logins.

     
  • Sean J.

    Sean J. - 2014-05-04

    As root, PMA 3.5.8.2 and 4.1.14 load instantly. I'll test the my.cnf adjustment next.

     
  • Sean J.

    Sean J. - 2014-05-04

    No change with my-medium.cnf. In fact, because it's not properly tuned for our environment, the results for non-root users was significantly poorer.

     
  • Sean J.

    Sean J. - 2014-05-04

    Worth mentioning too, the databases are a mix of MyISAM and InnoDB with a sprinkling of ARCHIVE tables. Partition tables are not allowed. On our servers with MariaDB 5.5.37, we have Aria disabled for the time being.

     
  • Sean J.

    Sean J. - 2014-05-04

    Just did one last test of removing several custom patches that tighten down MySQL/MariaDB and tested as root and the same regular user. Same results. Root logs in and shows the left navigation pane instantly, regular user takes several minutes.

     
  • Wayne Davison

    Wayne Davison - 2014-05-04

    Yeah, I had mentioned earlier in this thread that the slowdown doesn't affect the root user. I had also mentioned that we have around 150,000 grants for around 25,000 databases.

     
    Last edit: Wayne Davison 2014-05-04
  • Marc Delisle

    Marc Delisle - 2014-05-08

    Wayne,
    could you show us an example from these 150,000 grants (obfuscating the details)? I assume that these grants all follow the same pattern?

     
1 2 > >> (Page 1 of 2)