Bug #36513 | Wildly incorrect cardinality stats causing poor execution plans | |
---|---|---|---
Submitted: | 5 May 2008 16:45 | Modified: | 23 Sep 2009 0:36
Reporter: | Gordon Shannon | Email Updates: |
Status: | Closed | Impact on me: |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious)
Version: | 5.1 | OS: | Any
Assigned to: | | CPU Architecture: | Any
[5 May 2008 16:45]
Gordon Shannon
[5 May 2008 16:51]
Gordon Shannon
select count(distinct dw_topic_dim_id) from dw_hist_message_fact;
+---------------------------------+
| count(distinct dw_topic_dim_id) |
+---------------------------------+
|                           13684 |
+---------------------------------+
[5 May 2008 19:24]
Valeriy Kravchuk
Thank you for the problem report. Can you upload the smallest set of the table's data that demonstrates the behaviour described? Is it possible for you to alter the storage engine to MyISAM and check whether the cardinality estimation is more correct then? I suspect it is NOT an optimizer bug but a problem in InnoDB's implementation of the statistics estimation routines.
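(For reference, a minimal sketch of the check being asked for here, to be run on a copy of the data rather than in production; the table name is taken from the report above:)

ALTER TABLE dw_hist_message_fact ENGINE = MyISAM;  -- rebuilds the whole table; can take hours on large tables
ANALYZE TABLE dw_hist_message_fact;                -- refresh the statistics under the new engine
SHOW INDEX FROM dw_hist_message_fact;              -- compare Cardinality against the InnoDB values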
[5 May 2008 23:01]
Gordon Shannon
Regarding altering the table to MyISAM: that would probably take hours, and we require the table to be transactional. Are you suggesting I do that because it would fix my problem, or just because you want to see what it does?
[5 May 2008 23:17]
Gordon Shannon
It's going to be very tedious and time-consuming, if not impossible, for me to generate a data set that consistently reproduces this behavior. From what I have heard, and from my own observations, I believe this is most likely a result of the "8 random dives". Is it not true that the cardinality is the result of a calculation made from 8 random samples of the index? If so, it would not surprise me to see cardinality numbers that are far too high or far too low, and they would generally get worse the more rows there are in the table. My guess is that in this case the sampling happened to find the same topic id 8 times in a row and assumed the "lowest" cardinality, which I often see as the magic number 18.
[6 May 2008 3:31]
Valeriy Kravchuk
Yes, I wanted to check whether it is related to InnoDB only and its random-dives method. If the problem were repeatable with MyISAM, then it would be in the MySQL server's code, not in the storage engine's. And please do not check this in production, as MyISAM is not an option for you anyway.
[6 May 2008 16:20]
Gordon Shannon
I do not have a database to try this on at the moment, given that it would be unusable for 8 hours or so. I will try to get it done as soon as I can.
[6 May 2008 18:30]
Gordon Shannon
Here are some possibly useful stats on the selectivity of the column in question (dw_topic_dim_id):

- Rows in table: 59 million
- Number of distinct values: 13,684
- Average number of rows per value: 4,350
- Number of values with > 100,000 rows: 98
- Maximum number of rows with a single value: 6,070,442 !

Could the random dive be hitting this value all 8 times?
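(For reference, figures like these can be gathered with queries along the following lines; a sketch using the table and column names from the report. The GROUP BY over 59 million rows will itself be slow:)

SELECT COUNT(*) FROM dw_hist_message_fact;                         -- total rows
SELECT COUNT(DISTINCT dw_topic_dim_id) FROM dw_hist_message_fact;  -- distinct values
SELECT dw_topic_dim_id, COUNT(*) AS cnt                            -- most frequent single value
  FROM dw_hist_message_fact
 GROUP BY dw_topic_dim_id
 ORDER BY cnt DESC
 LIMIT 1;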
[6 May 2008 18:47]
Peter Laursen
A simple statistical calculation of the probability that this would happen: (6,070,442 / 59,000,000)^8 ≈ 0.0000013%. Even if all distinct values were equally probable (which they are not), the total chance that the same value would be hit all 8 times would be at most (6,070,442 / 59,000,000)^8 × 13,684 ≈ 0.017%. To put it another way: the dives would land on that one most-frequent value all 8 times only about once in every 80 million samplings. The probability that your neighboring nuclear power station will blow up tomorrow is much, much higher, I guess! Peter (not a MySQL person)
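(The arithmetic above can be checked directly in the mysql client; a one-off sketch using the figures from the previous comment:)

SELECT POW(6070442/59000000, 8)         AS p_same_value_8_times,   -- ~1.26e-8, i.e. ~0.0000013%
       POW(6070442/59000000, 8) * 13684 AS upper_bound_any_value;  -- ~1.7e-4,  i.e. ~0.017%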
[6 May 2008 19:16]
Gordon Shannon
I suppose it depends on how truly random the algorithm is.
[3 Jul 2008 22:43]
Maggie Pong
I am seeing the same problem on an InnoDB table with 598,982 rows. Analyzing the table multiple times gives vastly different cardinality values, ranging from 17 to ~1,500 to ~3,000. If your table has the same column in different indexes, you also get different cardinalities across the various indexes. Once I converted that table to MyISAM, the cardinality was very consistent and close to the actual distinct-value count. How can MySQL rely on these unreliable InnoDB stats to come up with a query plan? ANALYZE on InnoDB has got to do a better job of sampling. Maggie
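(A minimal way to observe this instability; a sketch where t stands for any large InnoDB table:)

ANALYZE TABLE t;
SELECT index_name, seq_in_index, column_name, cardinality
  FROM information_schema.statistics
 WHERE table_name = 't';
-- Repeating the two statements above on InnoDB can return very different
-- cardinality values from run to run, because each ANALYZE re-samples
-- a handful of random index pages.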
[8 Jul 2008 3:19]
Gordon Shannon
I loaded 84 million rows into an identical MyISAM table and viewed the statistics. The cardinality here seems to be much more accurate.

[root@db2] 16:57:15 6> show index from messages;
+----------+------------+--------------------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+----------+
| Table    | Non_unique | Key_name                 | Seq_in_index | Column_name             | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment  |
+----------+------------+--------------------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+----------+
| messages |          0 | PRIMARY                  |            1 | dw_hist_message_fact_id | A         |    84806491 | NULL     | NULL   |      | BTREE      |          |
| messages |          0 | message_topic_idx        |            1 | message_lookup_id       | A         |    28268830 | NULL     | NULL   |      | BTREE      |          |
| messages |          0 | message_topic_idx        |            2 | dw_topic_dim_id         | A         |    84806491 | NULL     | NULL   | YES  | BTREE      |          |
| messages |          1 | dw_date_idx              |            1 | dw_date_dim_id          | A         |        1323 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | dw_date_idx              |            2 | dw_topic_dim_id         | A         |     1367846 | NULL     | NULL   | YES  | BTREE      | disabled |
| messages |          1 | board_topic_density_idx  |            1 | dw_source_dim_id        | A         |    14134415 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | board_topic_density_idx  |            2 | dw_date_dim_id          | A         |    14134415 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | board_topic_density_idx  |            3 | valid                   | A         |    14134415 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | board_topic_density_idx  |            4 | dw_topic_dim_id         | A         |    84806491 | NULL     | NULL   | YES  | BTREE      | disabled |
| messages |          1 | topic_created_valid_idx2 |            1 | dw_topic_dim_id         | A         |     1367846 | NULL     | NULL   | YES  | BTREE      | disabled |
| messages |          1 | topic_created_valid_idx2 |            2 | ci_time                 | A         |    84806491 | NULL     | NULL   | YES  | BTREE      | disabled |
| messages |          1 | topic_created_valid_idx2 |            3 | valid                   | A         |    84806491 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | topic_created_valid_idx2 |            4 | dw_source_category_id   | A         |    84806491 | NULL     | NULL   | YES  | BTREE      | disabled |
| messages |          1 | topic_created_valid_idx2 |            5 | alg_ci_rank_rollup      | A         |    84806491 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | source_topic_date_idx    |            1 | dw_source_dim_id        | A         |    14134415 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | source_topic_date_idx    |            2 | dw_topic_dim_id         | A         |    84806491 | NULL     | NULL   | YES  | BTREE      | disabled |
| messages |          1 | source_topic_date_idx    |            3 | dw_date_dim_id          | A         |    84806491 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | message_lookup_idx       |            1 | message_lookup_id       | A         |    42403245 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | created_source_idx       |            1 | created_time            | A         |     7709681 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | created_source_idx       |            2 | source_type             | A         |     7709681 | NULL     | NULL   | YES  | BTREE      | disabled |
| messages |          1 | source_time_idx          |            1 | dw_source_dim_id        | A         |    14134415 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | source_time_idx          |            2 | ci_time                 | A         |    28268830 | NULL     | NULL   | YES  | BTREE      | disabled |
| messages |          1 | fk_message_language      |            1 | language_code           | A         |         220 | NULL     | NULL   |      | BTREE      | disabled |
| messages |          1 | nk_parent                |            1 | parent_id               | A         |    84806491 | NULL     | NULL   | YES  | BTREE      | disabled |
| messages |          1 | nk_updated               |            1 | updated_on              | A         |    84806491 | NULL     | NULL   | YES  | BTREE      | disabled |
| messages |          1 | nk_updated               |            2 | source_type             | A         |    84806491 | NULL     | NULL   | YES  | BTREE      | disabled |
| messages |          1 | nk_updated               |            3 | valid                   | A         |    84806491 | NULL     | NULL   |      | BTREE      | disabled |
+----------+------------+--------------------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+----------+
27 rows in set (0.07 sec)
[8 Jul 2008 8:17]
Susanne Ebrecht
ANALYZE recalculates the statistics. That is: when you have an index, statistics values are created for it. When you run lots of DML statements (UPDATE, INSERT, DELETE), these values can get out of date. Running ANALYZE rebuilds the statistics, so you should have proper statistics afterwards. At the moment I can neither reproduce your problem nor find anything abnormal in the source code, but I will have our InnoDB developers check this.
[8 Jul 2008 16:27]
Gordon Shannon
All the statistics I have reported to you were gathered immediately after running ANALYZE TABLE. That's part of the issue: even after analyzing the table, the cardinality is still way off on certain columns.
[30 Jul 2008 3:56]
Gordon Shannon
Just to let you know, this problem is not going away. Today the table in question is 120 million rows, double the size when I first reported this problem. Here's a query where InnoDB does a primary key scan and takes 20 minutes. If I force it to use the correct index, it takes 0.1 seconds.

[root@db3] 21:50:33 2> explain
    -> select distinct message_lookup_id from ci_sentiment_db.dw_hist_message_fact
    -> where dw_date_dim_id=20080423 order by dw_hist_message_fact_id limit 200;
+----+-------------+----------------------+-------+---------------+---------+---------+------+-----------+------------------------------+
| id | select_type | table                | type  | possible_keys | key     | key_len | ref  | rows      | Extra                        |
+----+-------------+----------------------+-------+---------------+---------+---------+------+-----------+------------------------------+
|  1 | SIMPLE      | dw_hist_message_fact | index | dw_date_idx   | PRIMARY | 8       | NULL | 133375624 | Using where; Using temporary |
+----+-------------+----------------------+-------+---------------+---------+---------+------+-----------+------------------------------+
1 row in set (0.06 sec)

[root@db3] 21:50:36 3> explain
    -> select distinct message_lookup_id from ci_sentiment_db.dw_hist_message_fact
    -> use index (dw_date_idx)
    -> where dw_date_dim_id=20080423 order by dw_hist_message_fact_id limit 200;
+----+-------------+----------------------+------+---------------+-------------+---------+-------+------+----------------------------------------------+
| id | select_type | table                | type | possible_keys | key         | key_len | ref   | rows | Extra                                        |
+----+-------------+----------------------+------+---------------+-------------+---------+-------+------+----------------------------------------------+
|  1 | SIMPLE      | dw_hist_message_fact | ref  | dw_date_idx   | dw_date_idx | 4       | const |  196 | Using where; Using temporary; Using filesort |
+----+-------------+----------------------+------+---------------+-------------+---------+-------+------+----------------------------------------------+
1 row in set (0.15 sec)

The index in question:
+----------------------+------------+-------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table                | Non_unique | Key_name    | Seq_in_index | Column_name             | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------------+------------+-------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+
| dw_hist_message_fact |          0 | PRIMARY     |            1 | dw_hist_message_fact_id | A         |   130170129 | NULL     | NULL   |      | BTREE      |         |
| dw_hist_message_fact |          1 | dw_date_idx |            1 | dw_date_dim_id          | A         |          17 | NULL     | NULL   |      | BTREE      |         |
| dw_hist_message_fact |          1 | dw_date_idx |            2 | dw_topic_dim_id         | A         |      110127 | NULL     | NULL   | YES  | BTREE      |         |
+----------------------+------------+-------------+--------------+-------------------------+-----------+-------------+----------+--------+------+------------+---------+

Notice the cardinality is 17 on the date column, even though there are about 140 distinct dates. Even so, I don't understand why it would choose to do a full primary key scan.
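(As an interim workaround, the index choice can be pinned in the query itself; a sketch based on the query above. FORCE INDEX is stronger than USE INDEX in that the optimizer will not fall back to a table scan unless the named index cannot be used at all:)

SELECT DISTINCT message_lookup_id
  FROM ci_sentiment_db.dw_hist_message_fact FORCE INDEX (dw_date_idx)
 WHERE dw_date_dim_id = 20080423
 ORDER BY dw_hist_message_fact_id
 LIMIT 200;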
[16 Oct 2008 7:21]
Vasil Dimov
Gordon, this is a well-known problem, and a workaround will be available in the next InnoDB Plugin release (1.0.2), which will be released soon. Unfortunately, 5.1 is frozen for such changes.

The cardinality of every index (the number of distinct values) is calculated when the table is opened, at SHOW TABLE STATUS, at ANALYZE TABLE, and in other circumstances (such as when the table has changed too much). Note that if the mysql client is running with the auto-rehash setting turned on (the default), this causes all tables to be opened when it starts.

Previously InnoDB sampled 8 random pages from the index to get an estimate of the cardinality. Now the number of sampled pages can be changed via the global parameter innodb_stats_sample_pages, which can be tuned at runtime. The default value for this parameter is 8. If the value of this parameter is changed, there may be serious consequences:

- small values (say, 1) can cause large errors in the table stats;
- values much larger than 8 (say, 100 or 1000) can slow down table opening, SHOW TABLE STATUS, etc.;
- query plans may differ from the old ones.

In short, a large number of dives will lead to better stats but will cause some slowdown (which could well be insignificant), while a small number of dives will lead to worse stats but no slowdown. It is a matter of experimentation to find the best number of dives for a particular workload. One page is 16 KB, so sampling N random pages means the disk will do N seeks, reading 16 KB at each.
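(For example, a sketch of the tuning described above; it requires a server with the InnoDB Plugin's innodb_stats_sample_pages variable, and the value 64 is only an arbitrary illustration:)

SET GLOBAL innodb_stats_sample_pages = 64;  -- default is 8; each sampled page costs a ~16 KB random read
ANALYZE TABLE dw_hist_message_fact;         -- re-samples the indexes with the new page count
SHOW INDEX FROM dw_hist_message_fact;       -- check whether Cardinality is now closer to reality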
[24 Oct 2008 16:46]
Mark Callaghan
Vasil, that workaround will be extremely difficult to use, because stats collection for InnoDB has to be fast, as it is done each time a table is first opened. When can we have optional persistent stats for InnoDB tables? This is something that will eventually be in a Google patch. By optional, I mean that the mysql schema gains a stats table into which statistics can be inserted for some tables, and these override anything the storage engine would provide.
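(A sketch of the shape such an optional override table might take; this is entirely hypothetical, as no such table existed in the server at the time:)

CREATE TABLE mysql.index_stats_override (
  db_name     VARCHAR(64)     NOT NULL,
  table_name  VARCHAR(64)     NOT NULL,
  index_name  VARCHAR(64)     NOT NULL,
  cardinality BIGINT UNSIGNED NOT NULL,  -- value the optimizer would use instead of sampled stats
  PRIMARY KEY (db_name, table_name, index_name)
);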
[1 Apr 2009 19:15]
MySQL Verification Team
Could the root cause of this problem be related to the root cause of bug #43660 as well?
[16 Sep 2009 21:21]
Calvin Sun
Changing this to Documenting, as the implementation is already in the InnoDB Plugin (1.0.4 or earlier), and hence is fixed in 5.4.2.
[23 Sep 2009 0:36]
Paul DuBois
Noted in 5.4.2 changelog. For InnoDB tables, poor cardinality statistics could result in poor query execution plans.