Bug #44059 Incorrect cardinality of indexes on a partitioned table
Submitted: 3 Apr 2009 2:26 Modified: 19 Dec 2009 8:59
Reporter: Maxim Kharlamov Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Partitions Severity:S5 (Performance)
Version:5.1.32, 5.1.35-bzr OS:Linux (CentOS 5 kernel 2.6.18-53.el5 SMP x86_64)
Assigned to: Mattias Jonsson CPU Architecture:Any
Tags: cardinality, INDEX, innodb, partitioning

[3 Apr 2009 2:26] Maxim Kharlamov
Description:
Hello!

After upgrading MySQL from 5 to 5.1 and partitioning some of the big tables I noticed that application performance became 2-2.5 times slower. EXPLAIN shows suboptimal plans for typical queries (indexes either not being used or not the best indexes are used).
I also noticed that the index cardinality isn't properly calculated on partitioned tables (and, I think that's what causes performance problems).

After partitioning mysql thinks that cardinality of each index equals to the cardinality of primary key.
I tried analyze table, analyze partition, optimize table, reloading data - no success.
Interestingly enough, on one of the other partitioned tables the cardinality is not the same for each index, but still has nothing to do with the 'true' cardinality (instead of 18 it shows 24000ish, for instance).

Thanks,
Max

How to repeat:
I'll attach the log and dump files.
[3 Apr 2009 2:28] Maxim Kharlamov
The log

Attachment: log.txt (text/plain), 8.63 KiB.

[3 Apr 2009 2:53] Maxim Kharlamov
I use InnoDB storage, BTW.
[5 Apr 2009 14:02] Valeriy Kravchuk
Thank you for the problem report. I had noted that the table from your dump uploaded is NOT partitioned. How do you partition it?

I've got the following cardinalities:

mysql> show index from alert_received\G
*************************** 1. row ***************************
       Table: alert_received
  Non_unique: 0
    Key_name: PRIMARY
Seq_in_index: 1
 Column_name: alr_guid
   Collation: A
 Cardinality: 214191
    Sub_part: NULL
      Packed: NULL
        Null:
  Index_type: BTREE
     Comment:
*************************** 2. row ***************************
       Table: alert_received
  Non_unique: 0
    Key_name: PRIMARY
Seq_in_index: 2
 Column_name: created_timestamp
   Collation: A
 Cardinality: 214191
    Sub_part: NULL
      Packed: NULL
        Null:
  Index_type: BTREE
     Comment:
*************************** 3. row ***************************
       Table: alert_received
  Non_unique: 1
    Key_name: cus_guid
Seq_in_index: 1
 Column_name: cus_guid
   Collation: A
 Cardinality: 96
    Sub_part: NULL
      Packed: NULL
        Null: YES
  Index_type: BTREE
     Comment:
*************************** 4. row ***************************
       Table: alert_received
  Non_unique: 1
    Key_name: created_time_bay_num_idx
Seq_in_index: 1
 Column_name: sct_guid
   Collation: A
 Cardinality: 14
    Sub_part: NULL
      Packed: NULL
        Null:
  Index_type: BTREE
     Comment:
*************************** 5. row ***************************
       Table: alert_received
  Non_unique: 1
    Key_name: created_time_bay_num_idx
Seq_in_index: 2
 Column_name: bay_number
   Collation: A
 Cardinality: 144
    Sub_part: NULL
      Packed: NULL
        Null:
  Index_type: BTREE
     Comment:
*************************** 6. row ***************************
       Table: alert_received
  Non_unique: 1
    Key_name: created_time_bay_num_idx
Seq_in_index: 3
 Column_name: created_timestamp
   Collation: A
 Cardinality: 214191
    Sub_part: NULL
      Packed: NULL
        Null:
  Index_type: BTREE
     Comment:
*************************** 7. row ***************************
       Table: alert_received
  Non_unique: 1
    Key_name: basic_report_idx
Seq_in_index: 1
 Column_name: created_timestamp
   Collation: A
 Cardinality: 214191
    Sub_part: NULL
      Packed: NULL
        Null:
  Index_type: BTREE
     Comment:
*************************** 8. row ***************************
       Table: alert_received
  Non_unique: 1
    Key_name: created_time_idx
Seq_in_index: 1
 Column_name: sct_guid
   Collation: A
 Cardinality: 14
    Sub_part: NULL
      Packed: NULL
        Null:
  Index_type: BTREE
     Comment:
*************************** 9. row ***************************
       Table: alert_received
  Non_unique: 1
    Key_name: created_time_idx
Seq_in_index: 2
 Column_name: bay_guid
   Collation: A
 Cardinality: 14
    Sub_part: NULL
      Packed: NULL
        Null:
  Index_type: BTREE
     Comment:
*************************** 10. row ***************************
       Table: alert_received
  Non_unique: 1
    Key_name: created_time_idx
Seq_in_index: 3
 Column_name: created_timestamp
   Collation: A
 Cardinality: 214191
    Sub_part: NULL
      Packed: NULL
        Null:
  Index_type: BTREE
     Comment:
*************************** 11. row ***************************
       Table: alert_received
  Non_unique: 1
    Key_name: alt_guid
Seq_in_index: 1
 Column_name: alt_guid
   Collation: A
 Cardinality: 10
    Sub_part: NULL
      Packed: NULL
        Null:
  Index_type: BTREE
     Comment:
*************************** 12. row ***************************
       Table: alert_received
  Non_unique: 1
    Key_name: bay_guid
Seq_in_index: 1
 Column_name: bay_guid
   Collation: A
 Cardinality: 214
    Sub_part: NULL
      Packed: NULL
        Null:
  Index_type: BTREE
     Comment:
12 rows in set (0.02 sec)

Please, point out what exact cardinality value is wrong in your case, when partitioning is used.
[6 Apr 2009 0:26] Maxim Kharlamov
I attached log.txt file which contains cardinality before partitioning, the partition command itself and cardinality after partitioning. You probably didn't notice it, it's several comments above.
[6 Apr 2009 6:50] Valeriy Kravchuk
Verified just as described with recent 5.1.35 from bzr on Linux. After partitioning the table I've got:

openxs@suse:/home2/openxs/dbs/5.1> bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 4
Server version: 5.1.35-debug Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> analyze table alert_received;
+---------------------+---------+----------+----------+
| Table               | Op      | Msg_type | Msg_text |
+---------------------+---------+----------+----------+
| test.alert_received | analyze | status   | OK       |
+---------------------+---------+----------+----------+
1 row in set (0.46 sec)

mysql> show index from alert_received;
+----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table          | Non_unique | Key_name                 | Seq_in_index | Column_name       | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| alert_received |          0 | PRIMARY                  |            1 | alr_guid          | A         |      215891 |     NULL | NULL   |      | BTREE      |         |
| alert_received |          0 | PRIMARY                  |            2 | created_timestamp | A         |      215891 |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | cus_guid                 |            1 | cus_guid          | A         |      215891 |     NULL | NULL   | YES  | BTREE      |         |
| alert_received |          1 | created_time_bay_num_idx |            1 | sct_guid          | A         |      215891 |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | created_time_bay_num_idx |            2 | bay_number        | A         |      215891 |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | created_time_bay_num_idx |            3 | created_timestamp | A         |      215891 |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | basic_report_idx         |            1 | created_timestamp | A         |      215891 |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | created_time_idx         |            1 | sct_guid          | A         |      215891 |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | created_time_idx         |            2 | bay_guid          | A         |      215891 |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | created_time_idx         |            3 | created_timestamp | A         |      215891 |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | alt_guid                 |            1 | alt_guid          | A         |      215891 |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | bay_guid                 |            1 | bay_guid          | A         |      215891 |     NULL | NULL   |      | BTREE      |         |
+----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
12 rows in set (0.50 sec)

With MyISAM the result is different, but not any better:

mysql> alter table alert_received engine=MyISAM;
Query OK, 213868 rows affected (16.67 sec)
Records: 213868  Duplicates: 0  Warnings: 0

mysql> analyze table alert_received;
+---------------------+---------+----------+----------+
| Table               | Op      | Msg_type | Msg_text |
+---------------------+---------+----------+----------+
| test.alert_received | analyze | status   | OK       |
+---------------------+---------+----------+----------+
1 row in set (0.91 sec)

mysql> show index from alert_received;
+----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| Table          | Non_unique | Key_name                 | Seq_in_index | Column_name       | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
| alert_received |          0 | PRIMARY                  |            1 | alr_guid          | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| alert_received |          0 | PRIMARY                  |            2 | created_timestamp | A         |      213868 |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | cus_guid                 |            1 | cus_guid          | A         |        NULL |     NULL | NULL   | YES  | BTREE      |         |
| alert_received |          1 | created_time_bay_num_idx |            1 | sct_guid          | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | created_time_bay_num_idx |            2 | bay_number        | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | created_time_bay_num_idx |            3 | created_timestamp | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | basic_report_idx         |            1 | created_timestamp | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | created_time_idx         |            1 | sct_guid          | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | created_time_idx         |            2 | bay_guid          | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | created_time_idx         |            3 | created_timestamp | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | alt_guid                 |            1 | alt_guid          | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
| alert_received |          1 | bay_guid                 |            1 | bay_guid          | A         |        NULL |     NULL | NULL   |      | BTREE      |         |
+----------------+------------+--------------------------+--------------+-------------------+-----------+-------------+----------+--------+------+------------+---------+
12 rows in set (0.06 sec)

ANALYZE TABLE..., ALTER TABLE ... ANALYZE PARTITION etc changes nothing.
[10 Apr 2009 11:02] Mattias Jonsson
The rec_per_key statistics (as seen in the column Cardinality) it taken from the first partition only. And since that partition is empty the statistics are incorrect. (there may also be en error on relying on stats.records since that might include the full table, not only the number of rows in the specific partition)

To fix this we need to fix the ha_partition::info(HA_STATUS_CONST) call to at least use a partition with content (preferrable the partition with most rows). This would not affect the performance too much.

An even more correct way, but possibly affecting performance on opening tables would be to iterate over all partitions with content, and calculate a sum or mean value for the rec_per_key/cardinality, this could be done like this:
allocate temporary duplicate memory for table->key_info (a 2 dimension array key_info[i].rec_per_key[j]) and initialize it to zero.
for all partitions
  call file->info(HA_STATUS_CONST)
  (if no rows, skip. At least InnoDB seems to sets rec_per_key to stat.records if there is no records for a specific key in an index)
  add table->key_info statistics to the temporary duplicate memory

and then overwrite table->key_info with the temporary duplicate memory and free it.

Here is the comment regarding the current behavior in ha_partition::info(HA_STATUS_CONST):
      Monty comment: This should NOT be changed!  It's the handlers
      responsibility to correct table->s->keys_xxxx information if keys
      have been disabled.

      The most important parameters set here is records per key on
      all indexes. block_size and primar key ref_length.

      For each index there is an array of rec_per_key.
      As an example if we have an index with three attributes a,b and c
      we will have an array of 3 rec_per_key.
      rec_per_key[0] is an estimate of number of records divided by
      number of unique values of the field a.
      rec_per_key[1] is an estimate of the number of records divided
      by the number of unique combinations of the fields a and b.
      rec_per_key[2] is an estimate of the number of records divided
      by the number of unique combinations of the fields a,b and c.

      Many handlers only set the value of rec_per_key when all fields
      are bound (rec_per_key[2] in the example above).

      If the handler doesn't support statistics, it should set all of the
      above to 0.

      We will allow the first handler to set the rec_per_key and use
      this as an estimate on the total table.
[15 Apr 2009 8:35] MySQL Verification Team
Here's a simpler testcase:

-------------
drop table if exists `exec`;
create table `exec` (`id` int,key (`id`)) engine=myisam default charset=latin1
partition by list (`id`)
(
partition p1 values in (1), 
partition p2 values in (2)
 );
insert into `exec` values (2);
show indexes from exec;
analyze table exec;
show indexes from exec;
-------------

The above will show NULL cardinality always.  If you insert 1 into the table, it'll be correct.  So, it seems like only the first partition is indeed used to get this value.
[29 Jul 2009 15:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/79540

3478 Mikael Ronstrom	2009-07-29
      Bug#44059 result file for added test included
      modified:
        mysql-test/r/partition.result
[29 Jul 2009 16:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/79548

3477 Mikael Ronstrom	2009-07-29
      Bug#44059, Fixed such that info(HA_STATUS_CONST) uses the partition with most records rather than the first partition
      modified:
        mysql-test/r/partition.result
        mysql-test/t/partition.test
        sql/ha_partition.cc
[5 Aug 2009 14:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80172

3516 Mikael Ronstrom	2009-08-05 [merge]
      Merge bug#44059
      modified:
        mysql-test/r/partition.result
        mysql-test/t/partition.test
        sql/ha_partition.cc
[14 Sep 2009 16:04] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (version source revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (merge vers: 5.4.4-alpha) (pib:11)
[15 Sep 2009 10:35] Jon Stephens
Documented bugfix in the 5.4.4 changelog as follows:

        The cardinality of indexes on partitioned tables was calculated
        using the first partition in the table, which could result in
        suboptimal query execution plans being chosen. Now the partition
        having the most records is used instead, which should result in
        better use of indexes and thus improved performance of queries
        against partitioned tables in many if not most cases.
      
Closed.
[15 Sep 2009 11:13] Mattias Jonsson
Reopening, asking to do a backport to 5.1.
[18 Sep 2009 9:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/83701

3114 Mattias Jonsson	2009-09-18
      backport for bug#44059 from mysql-pe to mysql-5.1-bugteam
[8 Oct 2009 13:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/86198

3152 Mattias Jonsson	2009-10-08
      Bug#44059: Incorrect cardinality of indexes on a partitioned table
      
      backport for bug#44059 from mysql-pe to mysql-5.1-bugteam
      
      Using the partition with most rows instead of first partition
      to estimate the cardinality of indexes.
     @ mysql-test/r/partition.result
        Bug#44059: Incorrect cardinality of indexes on a partitioned table
        
        Added test result
     @ mysql-test/t/partition.test
        Bug#44059: Incorrect cardinality of indexes on a partitioned table
        
        Added test case
     @ sql/ha_partition.cc
        Bug#44059: Incorrect cardinality of indexes on a partitioned table
        
        Checking which partition that has the most rows, and using that
        partition for HA_STATUS_CONST instead of first partition
[9 Oct 2009 8:48] Mattias Jonsson
Pushed into mysql-5.1-bugteam and mysql-pe
[14 Oct 2009 14:39] Bugs System
Pushed into 5.1.41 (revid:joro@sun.com-20091014143611-cphb0enjlx6lpat1) (version source revid:satya.bn@sun.com-20091013071829-zc4c3go44j6re592) (merge vers: 5.1.40) (pib:13)
[15 Oct 2009 8:18] Jon Stephens
Documented in the 5.1.41 changelog.

Commented out 5.4.4 version of changelog entry until it's determined where this fix will end up, post-5.1.
[22 Oct 2009 6:33] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091022063126-l0qzirh9xyhp0bpc) (version source revid:alik@sun.com-20091019135554-s1pvptt6i750lfhv) (merge vers: 6.0.14-alpha) (pib:13)
[22 Oct 2009 7:06] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091022060553-znkmxm0g0gm6ckvw) (version source revid:alik@sun.com-20091019131708-bc6pv55x6287a0wc) (merge vers: 5.5.0-beta) (pib:13)
[22 Oct 2009 15:03] Jon Stephens
Also documented in the 5.5.0 and 6.0.14 changelogs. Closed.
[18 Dec 2009 10:31] Bugs System
Pushed into 5.1.41-ndb-7.1.0 (revid:jonas@mysql.com-20091218102229-64tk47xonu3dv6r6) (version source revid:jonas@mysql.com-20091218095730-26gwjidfsdw45dto) (merge vers: 5.1.41-ndb-7.1.0) (pib:15)
[18 Dec 2009 10:47] Bugs System
Pushed into 5.1.41-ndb-6.2.19 (revid:jonas@mysql.com-20091218100224-vtzr0fahhsuhjsmt) (version source revid:jonas@mysql.com-20091217101452-qwzyaig50w74xmye) (merge vers: 5.1.41-ndb-6.2.19) (pib:15)
[18 Dec 2009 11:02] Bugs System
Pushed into 5.1.41-ndb-6.3.31 (revid:jonas@mysql.com-20091218100616-75d9tek96o6ob6k0) (version source revid:jonas@mysql.com-20091217154335-290no45qdins5bwo) (merge vers: 5.1.41-ndb-6.3.31) (pib:15)
[18 Dec 2009 11:17] Bugs System
Pushed into 5.1.41-ndb-7.0.11 (revid:jonas@mysql.com-20091218101303-ga32mrnr15jsa606) (version source revid:jonas@mysql.com-20091218064304-ezreonykd9f4kelk) (merge vers: 5.1.41-ndb-7.0.11) (pib:15)
[19 Dec 2009 8:59] Jon Stephens
No new changelog entries required. Closing.