Bug #36588 retrieving DDL-information for a database is slow
Submitted: 8 May 2008 8:47 Modified: 27 Dec 2012 19:32
Reporter: Peter Laursen (Basic Quality Contributor) Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S4 (Feature request)
Version:5.0+ OS:Any
Assigned to: Vasil Dimov CPU Architecture:Any
Tags: qc

[8 May 2008 8:47] Peter Laursen
Description:
We want to optimize the query 

1) SHOW FULL FIELDS FROM sometable.  

We parse the output from this and generate an ALTER TABLE script.

SHOW FULL FIELDS does a count(*) to return row_length.  That seems to be slow in particular on InnoDB. I think that is documented or at least a known issue. It is also understandable that 

2) SELECT * FROM I_S.tables WHERE table_name = sometable 

takes approx. the same time as the same row_lenght information will be returned as with the SHOW statement.  But if specific columns - an no row_length info - are returned like in this query:

3) SELECT ENGINE, CHARSET FROM I_S.tables WHERE table_name = sometable 

.. I cannot see the reason to do this count.  But I think it does (or does something comparable) as it is equally slow as query 2)!

We have a 9 GB test case (250 tables with 100.000+ rows in each). On a Core Duo 2*2 Ghz (doing nothing else) these queries take up to 2 minutes.  And consider that we have to do on all 250 tables.

Not sure about the S* level.  I have set to S2 (it is really a 'showstopper for us right now!), but you can change ... also change category as you find appropirate!

How to repeat:

We have a 9 GB test case (250 tables with 100.000+ rows in each). On a Core Duo 2*2 Ghz running Vista 32 bit (doing nothing else than this) the above queries take up to 2 minutes to complete.  And consider that we have to do on all tables.

Please tell what additonal info you need!  We can provide the schema and the C program that we used to populate it.

BTW: any workaround?

Suggested fix:
Only do COUNT when query specifies a column to be returned that require a count!
[8 May 2008 9:21] Peter Laursen
The InnoDB information I am referring to is here:
http://mysqldatabaseadministration.blogspot.com/2006/07/where-did-records-go.html
[8 May 2008 11:19] Peter Laursen
basically I think that the query

SELECT ENGINE, CHARSET FROM I_S.tables WHERE table_name = sometable 

should take the same time whether there are data or not in the table!
[8 May 2008 13:24] Peter Laursen
correction: actually it is (most often)

SHOW TABLE STATUS 

that should be replaced with something more efficient!

SHOW FULL TABLES is no alternative as what information we need is not there.
We also want to avoid parsing SHOW CREATE TABLE for each table.
[12 May 2008 11:11] Peter Laursen
a clarification:

If we

SELECT engine FROM I_S where table_schema = 'something' and table_name = 'something else'

it is very fast.  But if we only use one of the where-conditions or use wildcards like

SELECT engine FROM I_S where table_schema = 'something' and table_name = 'something%'

it is equally slow as SHOW TABLE STATUS.
[12 May 2008 11:34] Peter Laursen
I meant

SELECT engine FROM I_S where table_schema = 'something' and table_name LIKE 'something%'

.. of course!
[12 May 2008 12:07] Peter Laursen
seems that only wildcards in table_name causes SHOW TABLE STATUS to be ececuted internally( or what ever happens here):

show table status; -- 17-18 seconds
select engine from information_schema.TABLES where table_name = 'tab0'; -- 0.1 - 0.2 secs
select engine from information_schema.TABLES where table_name like 'tab0'; -- same
select engine from information_schema.TABLES where table_name like 'tab%'; -- 17-18 seconds
select engine from information_schema.TABLES where table_schema = 'test22'; -- around 20 secons
select engine from information_schema.TABLES where table_schema like 'test22'; -- same
select engine from information_schema.TABLES where table_schema like 'test2%'; -- a little more than 20 seconds
select engine from information_schema.TABLES where table_name = 'tab0' and table_schema = 'test22'; -- 0.1 - 0.2 secs
select engine from information_schema.TABLES where table_name like 'tab%' and table_schema = 'test22'; -- 17-18 seconds
select engine from information_schema.TABLES where table_name = 'tab0' and table_schema like 'test2%'; -- 0.1 - 0.2 secs

populate test case with a SP like this (250 tables wiht 100000 rows):
(will take 20-30 minutes on a reasonable PC)

DELIMITER $$

DROP PROCEDURE IF EXISTS `test22`.`qwerty`$$

CREATE DEFINER=`root`@`%` PROCEDURE `qwerty`()
BEGIN

declare innercount integer;
set @mycount = 0;

while @mycount < 250 do

create table tab (id integer primary key auto_increment, ts timestamp, t varchar(50)) engine = innodb;

start transaction;
set innercount = 0;
while innercount <100000 do
insert into tab (t) values (hex(now()));
set innercount = innercount + 1;
end while;
commit;

set @tablename1 = 'tab';
set @tablename2 = concat('tab',@mycount);
set @stmt = concat('rename table ',@tablename1,' to ',@tablename2);
prepare mystmt from @stmt;
execute mystmt;

set @mycount = @mycount + 1;

end while;

END$$

DELIMITER ;
[12 May 2008 19:55] Peter Laursen
I tried the same with MyISAM tables (also MySQL 5.0.51b).

Here SHOW TABLE STATUS is almost instantaneous (~50 ms)
With queries to I_S it is almost the same as with InnoDB - not specying table_name (or using wildcards) in WHERE clause is equally slow as with InnoDB!

Further with Falcon (on server 6.0.4) . Now it becomes interesting!:
* Here SHOW TABLE STATUS is 'medium slow' (4 secs) first time.  But executed again less than 50 ms!
* "select engine from information_schema.TABLES where table_name like 'tab%'" takes only 125 ms! Is this because of Falcon or because I_S has improved in this repect in 6.04?

So I will summarize like this:

With InnoDB there is no appropriate way to get 'DDL-only' information for all tables from a database.  

1) SHOW TABLE STATUS is slow (because row_count is slow with InnoDB) 
2) and SELECT from I_S is not appropriate because I_S organisation is not appropriate (you will need to query both .tables, .table_constraints and .columns tables) and besides retrieving data for more objects (like for more tables in .tables table) in one query is slow.

InnoDB should solve this, or a 'DDL-only variant' of SHOW TABLE STATUS should be implemented or I_S for tables should be (what I would think is out of question!) completely reorganised (and perform better)!
[14 May 2008 16:16] Valeriy Kravchuk
Thank you for a problem report. Sorry, but I do not see any row_length related columns neither in SHOW FULL FIELDS nor in SHOW FULL COLUMNS in 5.0.x:

mysql> select version();
+--------------------------+
| version()                |
+--------------------------+
| 5.0.60-enterprise-gpl-nt |
+--------------------------+
1 row in set (0.00 sec)

mysql> desc t1;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| foo   | int(11) | YES  | MUL | NULL    |       |
+-------+---------+------+-----+---------+-------+
1 row in set (0.03 sec)

mysql> show full columns from t1\G
*************************** 1. row ***************************
     Field: foo
      Type: int(11)
 Collation: NULL
      Null: YES
       Key: MUL
   Default: NULL
     Extra:
Privileges: select,insert,update,references
   Comment:
1 row in set (0.00 sec)

mysql> show full fields from t1\G
*************************** 1. row ***************************
     Field: foo
      Type: int(11)
 Collation: NULL
      Null: YES
       Key: MUL
   Default: NULL
     Extra:
Privileges: select,insert,update,references
   Comment:
1 row in set (0.00 sec)

See http://dev.mysql.com/doc/refman/5.0/en/show-columns.html also.

Now back to item 2 from your initial description. I see the following results:

mysql> show session status like 'Handler%';
+----------------------------+--------+
| Variable_name              | Value  |
+----------------------------+--------+
| Handler_commit             | 21     |
| Handler_delete             | 0      |
| Handler_discover           | 0      |
| Handler_prepare            | 0      |
| Handler_read_first         | 20     |
| Handler_read_key           | 40     |
| Handler_read_next          | 262143 |
| Handler_read_prev          | 0      |
| Handler_read_rnd           | 0      |
| Handler_read_rnd_next      | 262284 |
| Handler_rollback           | 0      |
| Handler_savepoint          | 0      |
| Handler_savepoint_rollback | 0      |
| Handler_update             | 0      |
| Handler_write              | 524409 |
+----------------------------+--------+
15 rows in set (0.00 sec)

mysql> select avg_row_length from information_schema.tables where table_name = '
t1' and table_schema='test';
+----------------+
| avg_row_length |
+----------------+
|             33 |
+----------------+
1 row in set (0.00 sec)

mysql> show session status like 'Handler%';
+----------------------------+--------+
| Variable_name              | Value  |
+----------------------------+--------+
| Handler_commit             | 21     |
| Handler_delete             | 0      |
| Handler_discover           | 0      |
| Handler_prepare            | 0      |
| Handler_read_first         | 20     |
| Handler_read_key           | 40     |
| Handler_read_next          | 262143 |
| Handler_read_prev          | 0      |
| Handler_read_rnd           | 0      |
| Handler_read_rnd_next      | 262302 |
| Handler_rollback           | 0      |
| Handler_savepoint          | 0      |
| Handler_savepoint_rollback | 0      |
| Handler_update             | 0      |
| Handler_write              | 524425 |
+----------------------------+--------+
15 rows in set (0.00 sec)

So, +18 at Handler_read_rnd_next and +16 on Handler_write (temporary tables are used). Now let's take other columns:

mysql> select engine, table_collation from information_schema.tables where table
_name = 't1' and table_schema='test';
+--------+-------------------+
| engine | table_collation   |
+--------+-------------------+
| InnoDB | latin1_swedish_ci |
+--------+-------------------+
1 row in set (0.00 sec)

mysql> show session status like 'Handler%';
+----------------------------+--------+
| Variable_name              | Value  |
+----------------------------+--------+
| Handler_commit             | 21     |
| Handler_delete             | 0      |
| Handler_discover           | 0      |
| Handler_prepare            | 0      |
| Handler_read_first         | 20     |
| Handler_read_key           | 40     |
| Handler_read_next          | 262143 |
| Handler_read_prev          | 0      |
| Handler_read_rnd           | 0      |
| Handler_read_rnd_next      | 262320 |
| Handler_rollback           | 0      |
| Handler_savepoint          | 0      |
| Handler_savepoint_rollback | 0      |
| Handler_update             | 0      |
| Handler_write              | 524441 |
+----------------------------+--------+
15 rows in set (0.00 sec)

Again +18 at Handler_read_rnd_next and +16 on Handler_write (temporary tables are used). Now compare that to SHOW TABLE STATUS:

mysql> show table status like 't1'\G
*************************** 1. row ***************************
           Name: t1
         Engine: InnoDB
        Version: 10
     Row_format: Compact
           Rows: 262651
 Avg_row_length: 33
    Data_length: 8929280
Max_data_length: 0
   Index_length: 17858560
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2008-04-15 16:19:00
    Update_time: NULL
     Check_time: NULL
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment: InnoDB free: 527360 kB
1 row in set (0.00 sec)

mysql> show session status like 'Handler%';
+----------------------------+--------+
| Variable_name              | Value  |
+----------------------------+--------+
| Handler_commit             | 21     |
| Handler_delete             | 0      |
| Handler_discover           | 0      |
| Handler_prepare            | 0      |
| Handler_read_first         | 20     |
| Handler_read_key           | 40     |
| Handler_read_next          | 262143 |
| Handler_read_prev          | 0      |
| Handler_read_rnd           | 0      |
| Handler_read_rnd_next      | 262338 |
| Handler_rollback           | 0      |
| Handler_savepoint          | 0      |
| Handler_savepoint_rollback | 0      |
| Handler_update             | 0      |
| Handler_write              | 524457 |
+----------------------------+--------+
15 rows in set (0.00 sec)

Same change in 'Handler_*' values, and table has 262651 or so rows. That's why I think your theory is wrong.

The fact that many queries to INFORMATION_SCHEMA are slow with large number of tables is well known (big temporary tables used, no indexes etc) and "To be fixed later", but I doubt this your report explains the real reason for your performance problem.
[14 May 2008 16:40] Peter Laursen
you are probably correct!
row_length issue was my first guess.  I also abandoned that 'theory' myself!

I summarized like this in my last post:

"With InnoDB there is no appropriate way to get 'DDL-only' information for all tables from
a database.  

1) SHOW TABLE STATUS is slow (because row_count is slow with InnoDB) 
2) and SELECT from I_S is not appropriate because I_S organisation is not appropriate (you
will need to query both .tables, .table_constraints and .columns tables) and besides
retrieving data for more objects (like for more tables in .tables table) in one query is
slow.

InnoDB should solve this, or a 'DDL-only variant' of SHOW TABLE STATUS should be
implemented or I_S for tables should be (what I would think is out of question!)
completely reorganised (and perform better)!"

I changed the synopsis and category accordingly.
[23 Jun 2008 17:38] Heikki Tuuri
Assigning this feature request to Vasil.
[31 Jul 2012 20:36] MySQL Verification Team
Hi Peter!

--innodb-stats-on-metadata=0 will help speed up I_S and SHOW statements for innodb tables.
http://dev.mysql.com/doc/refman/5.5/en/innodb-parameters.html#sysvar_innodb_stats_on_metad...
[31 Jul 2012 21:09] Peter Laursen
@Shane .. Holy Cow .. what should I use this information for as it arrives 4 years too late? 

Is the report still 'Open', is it 'Verified' or is it 'Closed' with reference to specified improvements in server versions release since?
[27 Dec 2012 19:32] MySQL Verification Team
Let's close it.  5.5 and 5.6 can hardly be compared to 5.0 anymore, as many improvements have been made.  Also, options to control the stats collection should have helped already.