Bug #48229 group by performance issue of partitioned table
Submitted: 22 Oct 2009 13:22 Modified: 18 Jun 2010 12:54
Reporter: Hao Wu Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Partitions Severity:S5 (Performance)
Version:5.1.37,5.1.38 OS:Linux (CentOS 4.4)
Assigned to: Mattias Jonsson CPU Architecture:Any
Tags: partition, performance
Triage: Triaged: D3 (Medium)

[22 Oct 2009 13:22] Hao Wu
Description:
The performance of some sql with "group by" on partitioned table is very bad.

How to repeat:
1.
mysql> show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  KEY `a` (`a`,`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
/*!50100 PARTITION BY HASH (a)
PARTITIONS 20 */
1 row in set (0.00 sec)

2.
mysql> show create table t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  KEY `a` (`a`,`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

3. insert into t1 and t2 with same 2 million random data

4. key_buffer is large enough to hold all indexes in memory

5. 
mysql> select a, max(b) from t1 where a in (10,100) group by a;
+------+--------+
| a    | max(b) |
+------+--------+
|   10 |   2956 |
|  100 |   2982 |
+------+--------+
2 rows in set (0.25 sec)

6.
mysql> select a, max(b) from t2 where a in (10,100) group by a;
+------+--------+
| a    | max(b) |
+------+--------+
|   10 |   2956 |
|  100 |   2982 |
+------+--------+
2 rows in set (0.00 sec)

7.
mysql> select a, max(b) from (select a, b from t1 where a in (10,100)) as t  group by a;
+------+--------+
| a    | max(b) |
+------+--------+
|   10 |   2956 |
|  100 |   2982 |
+------+--------+
2 rows in set (0.00 sec)

8. In our product server, a table like "t1" has 100 million records, and the sql in step 5 may take several seconds even minutes to finish, but the sql in step 7 takes only milliseconds.
[22 Oct 2009 13:36] Valeriy Kravchuk
Thank you for the problem report. Please, send the results of:

explain select a, max(b) from t1 where a in (10,100) group by a\G
explain partitions select a, max(b) from t1 where a in (10,100) group by a\G

explain select a, max(b) from t2 where a in (10,100) group by a\G

for your test data.
[22 Oct 2009 14:01] Hao Wu
mysql> explain select a, max(b) from t1 where a in (10,100) group by a\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
         type: range
possible_keys: a
          key: a
      key_len: 5
          ref: NULL
         rows: 1
        Extra: Using where; Using index for group-by
1 row in set (0.00 sec)

mysql> explain partitions select a, max(b) from t1 where a in (10,100) group by a\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t1
   partitions: p0,p10
         type: range
possible_keys: a
          key: a
      key_len: 5
          ref: NULL
         rows: 1
        Extra: Using where; Using index for group-by
1 row in set (0.00 sec)

mysql> explain select a, max(b) from t2 where a in (10,100) group by a\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: t2
         type: range
possible_keys: a
          key: a
      key_len: 5
          ref: NULL
         rows: 2
        Extra: Using where; Using index for group-by
1 row in set (0.00 sec)
[22 Oct 2009 14:20] Miguel Solorzano
See bug: http://bugs.mysql.com/bug.php?id=37252.
[29 Oct 2009 15:04] Miguel Solorzano
Thank you for the feedback. Verified as described.

c:\dbs>c:\dbs\5.1\bin\mysql -uroot --port=3510 --prompt="mysql 5.1 >"
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 32
Server version: 5.1.41-Win X64-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.1 >use test
Database changed
mysql 5.1 >select count(*) from t1;
+----------+
| count(*) |
+----------+
|  2100000 |
+----------+
1 row in set (0.00 sec)

mysql 5.1 >select count(*) from t2;
+----------+
| count(*) |
+----------+
|  2100000 |
+----------+
1 row in set (0.00 sec)

mysql 5.1 >select a, max(b) from t1 where a in (10,100) group by a;
+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (7.85 sec)

mysql 5.1 >select a, max(b) from t2 where a in (10,100) group by a;
+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

mysql 5.1 >select a, max(b) from (select a, b from t1 where a in (10,100)) as t  group by a;
+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.04 sec)

mysql 5.1 >show create table t1;

| t1    | CREATE TABLE `t1` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  KEY `a` (`a`,`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
/*!50100 PARTITION BY HASH (a)
PARTITIONS 20 */ |

mysql 5.1 >show create table t2;

| t2    | CREATE TABLE `t2` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  KEY `a` (`a`,`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 |
[29 Oct 2009 15:06] Miguel Solorzano
Client to populate tables

Attachment: bug48229.cpp (application/octet-stream, text), 3.74 KiB.

[29 Jan 2010 18:38] Mattias Jonsson
Is this repeatable in 5.1.41+ ?

I could not repeat it in latest mysql-5.1-bugteam or with 5.1.4[12] on mac os x 10.6 running x86_64 binaries.

Running this script on mysql-5.1.41-osx10.5-x86_64 (on ramdisk):
mysql-test$ cat bug48229.sql 
drop view if exists v;
drop view if exists x;
drop table if exists t1;
create view v as select 1 union select 2 union select 3 union select 4;
create view x as select 1 from v,v a,v b,v c,v d,v e;
CREATE TABLE `t1` (
 `a` int(11) DEFAULT NULL,
 `b` int(11) DEFAULT NULL,
 KEY `a` (`a`,`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
PARTITION BY HASH (a)
PARTITIONS 1;

select 'One partition';
set @a := 0;
insert t1 select @a, (@a := @a + 1) from x,x a;

select count(*) from t1;
select count(*) from t1;
select a, max(b) from t1 where a in (10,100) group by a;
select a, max(b) from t1 where a in (10,100) group by a;
select a, max(b) from (select a, b from t1 where a in (10,100)) as t group by a;
select a, max(b) from (select a, b from t1 where a in (10,100)) as t group by a;
DROP TABLE t1;
CREATE TABLE `t1` (
 `a` int(11) DEFAULT NULL,
 `b` int(11) DEFAULT NULL,
 KEY `a` (`a`,`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
PARTITION BY HASH (a)
PARTITIONS 20;

select 'Twenty partitions';
set @a := 0;
insert t1 select @a, (@a := @a + 1) from x,x a;

select count(*) from t1;
select count(*) from t1;
select a, max(b) from t1 where a in (10,100) group by a;
select a, max(b) from t1 where a in (10,100) group by a;
select a, max(b) from (select a, b from t1 where a in (10,100)) as t group by a;
select a, max(b) from (select a, b from t1 where a in (10,100)) as t group by a;
DROP TABLE t1;
CREATE TABLE `t1` (
 `a` int(11) DEFAULT NULL,
 `b` int(11) DEFAULT NULL,
 KEY `a` (`a`,`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8;

select 'Non partitioned table';
set @a := 0;
insert t1 select @a, (@a := @a + 1) from x,x a;

select count(*) from t1;
select count(*) from t1;
select a, max(b) from t1 where a in (10,100) group by a;
select a, max(b) from t1 where a in (10,100) group by a;
select a, max(b) from (select a, b from t1 where a in (10,100)) as t group by a;
select a, max(b) from (select a, b from t1 where a in (10,100)) as t group by a;
drop view x;
drop view v;
drop table t1;

Gives this result:
mysql-test$ ../client/mysql -u root -h 127.0.0.1 -P 13000 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.41-log MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> source bug48229.sql
Query OK, 0 rows affected, 1 warning (0.00 sec)

Query OK, 0 rows affected, 1 warning (0.00 sec)

Query OK, 0 rows affected, 1 warning (0.01 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.01 sec)

+---------------+
| One partition |
+---------------+
| One partition |
+---------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 16777216 rows affected (2 min 33.17 sec)
Records: 16777216  Duplicates: 0  Warnings: 0

+----------+
| count(*) |
+----------+
| 16777216 |
+----------+
1 row in set (0.00 sec)

+----------+
| count(*) |
+----------+
| 16777216 |
+----------+
1 row in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.01 sec)

Query OK, 0 rows affected (0.01 sec)

Query OK, 0 rows affected (0.01 sec)

+-------------------+
| Twenty partitions |
+-------------------+
| Twenty partitions |
+-------------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 16777216 rows affected (2 min 24.20 sec)
Records: 16777216  Duplicates: 0  Warnings: 0

+----------+
| count(*) |
+----------+
| 16777216 |
+----------+
1 row in set (0.00 sec)

+----------+
| count(*) |
+----------+
| 16777216 |
+----------+
1 row in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

Query OK, 0 rows affected (0.07 sec)

Query OK, 0 rows affected (0.01 sec)

+-----------------------+
| Non partitioned table |
+-----------------------+
| Non partitioned table |
+-----------------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 16777216 rows affected (3 min 14.34 sec)
Records: 16777216  Duplicates: 0  Warnings: 0

+----------+
| count(*) |
+----------+
| 16777216 |
+----------+
1 row in set (0.00 sec)

+----------+
| count(*) |
+----------+
| 16777216 |
+----------+
1 row in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

+------+--------+
| a    | max(b) |
+------+--------+
|   10 |     11 |
|  100 |    101 |
+------+--------+
2 rows in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.02 sec)

mysql> select version();
+------------+
| version()  |
+------------+
| 5.1.41-log |
+------------+
1 row in set (0.00 sec)

I have also made runs on linux both on ram and disk without getting worse performance of partitioned tables.

If this bug still exists, could you please include test data or generation script (preferable perl or any other non platform specific language)?
[31 Jan 2010 5:30] Hao Wu
I can repeat it in 5.1.42, but can not with your sql srcipt.
I think maybe because I use random data.
[31 Jan 2010 5:30] Hao Wu
perl script to populate table

Attachment: bug48229.pl (application/octet-stream, text), 211 bytes.

[1 Feb 2010 15:57] Miguel Solorzano
Using Perl script provided by reporter (changing just # of inserts):

C:\bugs>c:\dbs\51

C:\bugs>c:\dbs\5.1\bin\mysql -uroot --port=3306 --prompt="mysql 5.1 >"
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.1.44-Win X64-log Source distribution

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql 5.1 >use test
Database changed
mysql 5.1 >select count(*) from t1;
+----------+
| count(*) |
+----------+
|  2000000 |
+----------+
1 row in set (0.00 sec)

mysql 5.1 >select count(*) from t2;
+----------+
| count(*) |
+----------+
|  2000000 |
+----------+
1 row in set (0.00 sec)

mysql 5.1 >select a, max(b) from t1 where a in (977,10010) group by a;
+-------+---------+
| a     | max(b)  |
+-------+---------+
|   977 | 1960571 |
| 10010 | 1994995 |
+-------+---------+
2 rows in set (0.20 sec)

mysql 5.1 >select a, max(b) from t2 where a in (977,10010) group by a;
+-------+---------+
| a     | max(b)  |
+-------+---------+
|   977 | 1968079 |
| 10010 | 1980286 |
+-------+---------+
2 rows in set (0.00 sec)

mysql 5.1 >show create table t1\G
*************************** 1. row ***************************
       Table: t1
Create Table: CREATE TABLE `t1` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  KEY `a` (`a`,`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
/*!50100 PARTITION BY HASH (a)
PARTITIONS 20 */
1 row in set (0.00 sec)

mysql 5.1 >show create table t2\G
*************************** 1. row ***************************
       Table: t2
Create Table: CREATE TABLE `t2` (
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  KEY `a` (`a`,`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
1 row in set (0.00 sec)

mysql 5.1 >
[2 Feb 2010 11:12] Mattias Jonsson
The problem is in different execution plans.

Created by:
CREATE TABLE `t1` (
 `a` int(11) DEFAULT NULL,
 `b` int(11) DEFAULT NULL,
 KEY `a` (`a`,`b`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8
PARTITION BY HASH (a)
PARTITIONS 1;
<run the perlscript loading 1'000'000 rows into t1>
create table t2 like t1;
alter table t2 remove partitioning;
insert into t2 select * from t1;

mysql> explain select a, max(b) from t2 where a in (10,100) group by a;
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                    |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
|  1 | SIMPLE      | t2    | range | a             | a    | 5       | NULL |    3 | Using where; Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+--------------------------+
1 row in set (0.00 sec)

mysql> explain select a, max(b) from t1 where a in (10,100) group by a;
+----+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                                 |
+----+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
|  1 | SIMPLE      | t1    | range | a             | a    | 5       | NULL |    1 | Using where; Using index for group-by |
+----+-------------+-------+-------+---------------+------+---------+------+------+---------------------------------------+
1 row in set (0.00 sec)

Note the different text in the 'Extra' column.
[4 Feb 2010 15:03] Mattias Jonsson
stats.block_size is not set in ha_partition::info(HA_STATUS_CONSTANT), fixed by:
=== modified file 'sql/ha_partition.cc'
--- sql/ha_partition.cc	2009-11-17 19:02:16 +0000
+++ sql/ha_partition.cc	2010-02-04 14:59:15 +0000
@@ -5054,6 +5054,7 @@
     file= m_file[handler_instance];
     file->info(HA_STATUS_CONST);
     stats.create_time= file->stats.create_time;
+    stats.block_size= file->stats.block_size;
     ref_length= m_ref_length;
   }
   if (flag & HA_STATUS_ERRKEY)
[4 Feb 2010 18:12] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/99334

3319 Mattias Jonsson	2010-02-04
      Bug#48229: group by performance issue of partitioned table
      
      Problem was block_size on partitioned tables was not set,
      resulting in bad optimizer decision.
      
      Fixed by setting stats.block_size correctly.
     @ mysql-test/r/partition_range.result
        Bug#48229: group by performance issue of partitioned table
        
        Added result
     @ mysql-test/t/partition_range.test
        Bug#48229: group by performance issue of partitioned table
        
        Added test
     @ sql/ha_partition.cc
        Bug#48229: group by performance issue of partitioned table
        
        Added missing assignment of stats.block_size.
[4 Mar 2010 11:09] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/102278

3319 Mattias Jonsson	2010-03-04
      Bug#48229: group by performance issue of partitioned table
      
      Problem was block_size on partitioned tables was not set,
      resulting in keys_per_block was not correct which affects
      the cost calculation for read time of indexes (including
      cost for group min/max).Which resulted in a bad optimizer
      decision.
      
      Fixed by setting stats.block_size correctly.
     @ mysql-test/r/partition_range.result
        Bug#48229: group by performance issue of partitioned table
        
        Added result
     @ mysql-test/t/partition_range.test
        Bug#48229: group by performance issue of partitioned table
        
        Added test
     @ sql/ha_partition.cc
        Bug#48229: group by performance issue of partitioned table
        
        Added missing assignment of stats.block_size.
[12 Mar 2010 10:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/103068

3320 Mattias Jonsson	2010-03-12
      Bug#48229: group by performance issue of partitioned table
      
      Additional result file update
[16 Mar 2010 13:14] Mattias Jonsson
Found an issue when merging to mysql-6.0-codebase trees, I'm #ifdef'ing the fix out in mysql-pe since bug#52092.
[16 Mar 2010 16:03] Mattias Jonsson
pushed to mysql-5.1-bugteam and null-merged to mysql-pe (waiting for bug#52092).
[26 Mar 2010 8:22] Bugs System
Pushed into 5.5.4-m3 (revid:alik@sun.com-20100326080914-2pz8ns984e0spu03) (version source revid:alexey.kopytov@sun.com-20100320202342-3oapaq7r0t6qhexq) (merge vers: 5.5.3-m2) (pib:16)
[26 Mar 2010 8:26] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100326081116-m3v4l34yhr43mtsv) (version source revid:alik@sun.com-20100325072612-4sds00ix8ajo1e84) (pib:16)
[26 Mar 2010 8:30] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100326081944-qja07qklw1p2w7jb) (version source revid:alik@sun.com-20100325073410-4t4i9gu2u1pge7xb) (merge vers: 6.0.14-alpha) (pib:16)
[29 Mar 2010 8:44] Jon Stephens
Documented bugfix in the 5.5.4 changelog as follows:

        GROUP BY queries performed poorly for some partitioned tables.
        This was due to block_size not being set for partitioned tables,
        with the result that keys_per_block was not correct, which could
        cause such queries to be optimized incorrectly.

Waiting for 5.1 merge.
[31 Mar 2010 16:02] Paul Dubois
5.5.4 changelog entry was moved to 5.5.5.
[6 Apr 2010 7:59] Bugs System
Pushed into 5.1.46 (revid:sergey.glukhov@sun.com-20100405111026-7kz1p8qlzglqgfmu) (version source revid:martin.hansson@sun.com-20100316162138-u9724fhm54cj3or0) (merge vers: 5.1.46) (pib:16)
[6 Apr 2010 10:21] Jon Stephens
Also documented fix in the 5.1.46 changelog.

Closed.
[6 Apr 2010 10:22] Jon Stephens
Added changelog entry for 6.0.14.
[17 Jun 2010 11:58] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:38] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:martin.skold@mysql.com-20100609211156-tsac5qhw951miwtt) (merge vers: 5.1.46-ndb-6.2.19) (pib:16)
[17 Jun 2010 13:24] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)