Bug #44059 | Incorrect cardinality of indexes on a partitioned table | ||
---|---|---|---|
Submitted: | 3 Apr 2009 2:26 | Modified: | 19 Dec 2009 8:59 |
Reporter: | Maxim Kharlamov | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Partitions | Severity: | S5 (Performance) |
Version: | 5.1.32, 5.1.35-bzr | OS: | Linux (CentOS 5 kernel 2.6.18-53.el5 SMP x86_64) |
Assigned to: | Mattias Jonsson | CPU Architecture: | Any |
Tags: | cardinality, INDEX, innodb, partitioning |
[3 Apr 2009 2:26]
Maxim Kharlamov
[3 Apr 2009 2:28]
Maxim Kharlamov
The log
Attachment: log.txt (text/plain), 8.63 KiB.
[3 Apr 2009 2:53]
Maxim Kharlamov
I use InnoDB storage, BTW.
[5 Apr 2009 14:02]
Valeriy Kravchuk
Thank you for the problem report. I had noted that the table from your dump uploaded is NOT partitioned. How do you partition it? I've got the following cardinalities: mysql> show index from alert_received\G *************************** 1. row *************************** Table: alert_received Non_unique: 0 Key_name: PRIMARY Seq_in_index: 1 Column_name: alr_guid Collation: A Cardinality: 214191 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 2. row *************************** Table: alert_received Non_unique: 0 Key_name: PRIMARY Seq_in_index: 2 Column_name: created_timestamp Collation: A Cardinality: 214191 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 3. row *************************** Table: alert_received Non_unique: 1 Key_name: cus_guid Seq_in_index: 1 Column_name: cus_guid Collation: A Cardinality: 96 Sub_part: NULL Packed: NULL Null: YES Index_type: BTREE Comment: *************************** 4. row *************************** Table: alert_received Non_unique: 1 Key_name: created_time_bay_num_idx Seq_in_index: 1 Column_name: sct_guid Collation: A Cardinality: 14 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 5. row *************************** Table: alert_received Non_unique: 1 Key_name: created_time_bay_num_idx Seq_in_index: 2 Column_name: bay_number Collation: A Cardinality: 144 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 6. row *************************** Table: alert_received Non_unique: 1 Key_name: created_time_bay_num_idx Seq_in_index: 3 Column_name: created_timestamp Collation: A Cardinality: 214191 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 7. row *************************** Table: alert_received Non_unique: 1 Key_name: basic_report_idx Seq_in_index: 1 Column_name: created_timestamp Collation: A Cardinality: 214191 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 8. row *************************** Table: alert_received Non_unique: 1 Key_name: created_time_idx Seq_in_index: 1 Column_name: sct_guid Collation: A Cardinality: 14 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 9. row *************************** Table: alert_received Non_unique: 1 Key_name: created_time_idx Seq_in_index: 2 Column_name: bay_guid Collation: A Cardinality: 14 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 10. row *************************** Table: alert_received Non_unique: 1 Key_name: created_time_idx Seq_in_index: 3 Column_name: created_timestamp Collation: A Cardinality: 214191 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 11. row *************************** Table: alert_received Non_unique: 1 Key_name: alt_guid Seq_in_index: 1 Column_name: alt_guid Collation: A Cardinality: 10 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: *************************** 12. row *************************** Table: alert_received Non_unique: 1 Key_name: bay_guid Seq_in_index: 1 Column_name: bay_guid Collation: A Cardinality: 214 Sub_part: NULL Packed: NULL Null: Index_type: BTREE Comment: 12 rows in set (0.02 sec) Please, point out what exact cardinality value is wrong in your case, when partitioning is used.
[6 Apr 2009 0:26]
Maxim Kharlamov
I attached log.txt file which contains cardinality before partitioning, the partition command itself and cardinality after partitioning. You probably didn't notice it, it's several comments above.
[6 Apr 2009 6:50]
Valeriy Kravchuk
Verified just as described with recent 5.1.35 from bzr on Linux. After partitioning the table I've got: openxs@suse:/home2/openxs/dbs/5.1> bin/mysql -uroot test Reading table information for completion of table and column names You can turn off this feature to get a quicker startup with -A Welcome to the MySQL monitor. Commands end with ; or \g. Your MySQL connection id is 4 Server version: 5.1.35-debug Source distribution Type 'help;' or '\h' for help. Type '\c' to clear the current input statement. mysql> analyze table alert_received; +---------------------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +---------------------+---------+----------+----------+ | test.alert_received | analyze | status | OK | +---------------------+---------+----------+----------+ 1 row in set (0.46 sec) mysql> show index from alert_received; +----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+ | alert_received | 0 | PRIMARY | 1 | alr_guid | A | 215891 | NULL | NULL | | BTREE | | | alert_received | 0 | PRIMARY | 2 | created_timestamp | A | 215891 | NULL | NULL | | BTREE | | | alert_received | 1 | cus_guid | 1 | cus_guid | A | 215891 | NULL | NULL | YES | BTREE | | | alert_received | 1 | created_time_bay_num_idx | 1 | sct_guid | A | 215891 | NULL | NULL | | BTREE | | | alert_received | 1 | created_time_bay_num_idx | 2 | bay_number | A | 215891 | NULL | NULL | | BTREE | | | alert_received | 1 | created_time_bay_num_idx | 3 | created_timestamp | A | 215891 | NULL | NULL | | BTREE | | | alert_received | 1 | basic_report_idx | 1 | created_timestamp | A | 215891 | NULL | NULL | | BTREE | | | alert_received | 1 | created_time_idx | 1 | sct_guid | A | 215891 | NULL | NULL | | BTREE | | | alert_received | 1 | created_time_idx | 2 | bay_guid | A | 215891 | NULL | NULL | | BTREE | | | alert_received | 1 | created_time_idx | 3 | created_timestamp | A | 215891 | NULL | NULL | | BTREE | | | alert_received | 1 | alt_guid | 1 | alt_guid | A | 215891 | NULL | NULL | | BTREE | | | alert_received | 1 | bay_guid | 1 | bay_guid | A | 215891 | NULL | NULL | | BTREE | | +----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+ 12 rows in set (0.50 sec) With MyISAM the result is different, but not any better: mysql> alter table alert_received engine=MyISAM; Query OK, 213868 rows affected (16.67 sec) Records: 213868 Duplicates: 0 Warnings: 0 mysql> analyze table alert_received; +---------------------+---------+----------+----------+ | Table | Op | Msg_type | Msg_text | +---------------------+---------+----------+----------+ | test.alert_received | analyze | status | OK | +---------------------+---------+----------+----------+ 1 row in set (0.91 sec) mysql> show index from alert_received; +----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+ | Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | +----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+ | alert_received | 0 | PRIMARY | 1 | alr_guid | A | NULL | NULL | NULL | | BTREE | | | alert_received | 0 | PRIMARY | 2 | created_timestamp | A | 213868 | NULL | NULL | | BTREE | | | alert_received | 1 | cus_guid | 1 | cus_guid | A | NULL | NULL | NULL | YES | BTREE | | | alert_received | 1 | created_time_bay_num_idx | 1 | sct_guid | A | NULL | NULL | NULL | | BTREE | | | alert_received | 1 | created_time_bay_num_idx | 2 | bay_number | A | NULL | NULL | NULL | | BTREE | | | alert_received | 1 | created_time_bay_num_idx | 3 | created_timestamp | A | NULL | NULL | NULL | | BTREE | | | alert_received | 1 | basic_report_idx | 1 | created_timestamp | A | NULL | NULL | NULL | | BTREE | | | alert_received | 1 | created_time_idx | 1 | sct_guid | A | NULL | NULL | NULL | | BTREE | | | alert_received | 1 | created_time_idx | 2 | bay_guid | A | NULL | NULL | NULL | | BTREE | | | alert_received | 1 | created_time_idx | 3 | created_timestamp | A | NULL | NULL | NULL | | BTREE | | | alert_received | 1 | alt_guid | 1 | alt_guid | A | NULL | NULL | NULL | | BTREE | | | alert_received | 1 | bay_guid | 1 | bay_guid | A | NULL | NULL | NULL | | BTREE | | +----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+ 12 rows in set (0.06 sec) ANALYZE TABLE..., ALTER TABLE ... ANALYZE PARTITION etc changes nothing.
[10 Apr 2009 11:02]
Mattias Jonsson
The rec_per_key statistics (as seen in the column Cardinality) it taken from the first partition only. And since that partition is empty the statistics are incorrect. (there may also be en error on relying on stats.records since that might include the full table, not only the number of rows in the specific partition) To fix this we need to fix the ha_partition::info(HA_STATUS_CONST) call to at least use a partition with content (preferrable the partition with most rows). This would not affect the performance too much. An even more correct way, but possibly affecting performance on opening tables would be to iterate over all partitions with content, and calculate a sum or mean value for the rec_per_key/cardinality, this could be done like this: allocate temporary duplicate memory for table->key_info (a 2 dimension array key_info[i].rec_per_key[j]) and initialize it to zero. for all partitions call file->info(HA_STATUS_CONST) (if no rows, skip. At least InnoDB seems to sets rec_per_key to stat.records if there is no records for a specific key in an index) add table->key_info statistics to the temporary duplicate memory and then overwrite table->key_info with the temporary duplicate memory and free it. Here is the comment regarding the current behavior in ha_partition::info(HA_STATUS_CONST): Monty comment: This should NOT be changed! It's the handlers responsibility to correct table->s->keys_xxxx information if keys have been disabled. The most important parameters set here is records per key on all indexes. block_size and primar key ref_length. For each index there is an array of rec_per_key. As an example if we have an index with three attributes a,b and c we will have an array of 3 rec_per_key. rec_per_key[0] is an estimate of number of records divided by number of unique values of the field a. rec_per_key[1] is an estimate of the number of records divided by the number of unique combinations of the fields a and b. rec_per_key[2] is an estimate of the number of records divided by the number of unique combinations of the fields a,b and c. Many handlers only set the value of rec_per_key when all fields are bound (rec_per_key[2] in the example above). If the handler doesn't support statistics, it should set all of the above to 0. We will allow the first handler to set the rec_per_key and use this as an estimate on the total table.
[15 Apr 2009 8:35]
MySQL Verification Team
Here's a simpler testcase: ------------- drop table if exists `exec`; create table `exec` (`id` int,key (`id`)) engine=myisam default charset=latin1 partition by list (`id`) ( partition p1 values in (1), partition p2 values in (2) ); insert into `exec` values (2); show indexes from exec; analyze table exec; show indexes from exec; ------------- The above will show NULL cardinality always. If you insert 1 into the table, it'll be correct. So, it seems like only the first partition is indeed used to get this value.
[29 Jul 2009 15:50]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/79540 3478 Mikael Ronstrom 2009-07-29 Bug#44059 result file for added test included modified: mysql-test/r/partition.result
[29 Jul 2009 16:29]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/79548 3477 Mikael Ronstrom 2009-07-29 Bug#44059, Fixed such that info(HA_STATUS_CONST) uses the partition with most records rather than the first partition modified: mysql-test/r/partition.result mysql-test/t/partition.test sql/ha_partition.cc
[5 Aug 2009 14:43]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/80172 3516 Mikael Ronstrom 2009-08-05 [merge] Merge bug#44059 modified: mysql-test/r/partition.result mysql-test/t/partition.test sql/ha_partition.cc
[14 Sep 2009 16:04]
Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (version source revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (merge vers: 5.4.4-alpha) (pib:11)
[15 Sep 2009 10:35]
Jon Stephens
Documented bugfix in the 5.4.4 changelog as follows: The cardinality of indexes on partitioned tables was calculated using the first partition in the table, which could result in suboptimal query execution plans being chosen. Now the partition having the most records is used instead, which should result in better use of indexes and thus improved performance of queries against partitioned tables in many if not most cases. Closed.
[15 Sep 2009 11:13]
Mattias Jonsson
Reopening, asking to do a backport to 5.1.
[18 Sep 2009 9:10]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/83701 3114 Mattias Jonsson 2009-09-18 backport for bug#44059 from mysql-pe to mysql-5.1-bugteam
[8 Oct 2009 13:59]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/86198 3152 Mattias Jonsson 2009-10-08 Bug#44059: Incorrect cardinality of indexes on a partitioned table backport for bug#44059 from mysql-pe to mysql-5.1-bugteam Using the partition with most rows instead of first partition to estimate the cardinality of indexes. @ mysql-test/r/partition.result Bug#44059: Incorrect cardinality of indexes on a partitioned table Added test result @ mysql-test/t/partition.test Bug#44059: Incorrect cardinality of indexes on a partitioned table Added test case @ sql/ha_partition.cc Bug#44059: Incorrect cardinality of indexes on a partitioned table Checking which partition that has the most rows, and using that partition for HA_STATUS_CONST instead of first partition
[9 Oct 2009 8:48]
Mattias Jonsson
Pushed into mysql-5.1-bugteam and mysql-pe
[14 Oct 2009 14:39]
Bugs System
Pushed into 5.1.41 (revid:joro@sun.com-20091014143611-cphb0enjlx6lpat1) (version source revid:satya.bn@sun.com-20091013071829-zc4c3go44j6re592) (merge vers: 5.1.40) (pib:13)
[15 Oct 2009 8:18]
Jon Stephens
Documented in the 5.1.41 changelog. Commented out 5.4.4 version of changelog entry until it's determined where this fix will end up, post-5.1.
[22 Oct 2009 6:33]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091022063126-l0qzirh9xyhp0bpc) (version source revid:alik@sun.com-20091019135554-s1pvptt6i750lfhv) (merge vers: 6.0.14-alpha) (pib:13)
[22 Oct 2009 7:06]
Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091022060553-znkmxm0g0gm6ckvw) (version source revid:alik@sun.com-20091019131708-bc6pv55x6287a0wc) (merge vers: 5.5.0-beta) (pib:13)
[22 Oct 2009 15:03]
Jon Stephens
Also documented in the 5.5.0 and 6.0.14 changelogs. Closed.
[18 Dec 2009 10:31]
Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6) (version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers: 5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 10:47]
Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt) (version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers: 5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 11:02]
Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0) (version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers: 5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 11:17]
Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606) (version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers: 5.1.41-ndb-7.0.11) (pib:15)
[19 Dec 2009 8:59]
Jon Stephens
No new changelog entries required. Closing.