Bug #40195 Slowdown queries by the use of groupby in Innodb
Submitted: 21 Oct 2008 8:35 Modified: 21 Oct 2008 17:46
Reporter: Jose Antonio Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:5.1.28 OS:Linux (SLES10)
Assigned to: CPU Architecture:Any
Tags: GROUP BY, order by., Slowdown

[21 Oct 2008 8:35] Jose Antonio
Description:
*MySQL version: 5.1.28 ( the same problem have been found in 5.1.26 and 5.1.24)
*Engine : Innodb
*Table information: 
A)We are using a table that contains 21 fields, only one of them is a blob with the following keys:
B)Keys
PRIMARY=(`FTIME_SECS`,`FTIME_USECS`,`CTIME_SECS`,`CTIME_USECS`,`CREATE_ID`,`DSTREAM`)
KEY= `SPID_KEY` (`DSTREAM`,`SPID`,`FTIME_SECS`,`FTIME_USECS`)
* Number of rows in table = 48063594
*Problem: the use of 'group by' in the query sentence slowdown drastically the time response of mysql.

- SELECT FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, SPID, DSTREAM, VERSION, GSID, COUNTER, IDENT, GROUP_ID, FLAGS, VALIDITY, TPSD, ROUTEID, PUS_APID, PUS_SSC, PUS_ST, PUS_SST, DATA, LENGTH(DATA) FROM TM_DATA FORCE INDEX(SPID_KEY) WHERE  SPID=10015 AND  (FTIME_SECS > 0 OR (FTIME_SECS = 0 AND FTIME_USECS > 1))  AND  (DSTREAM = 65535)  GROUP BY  FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, DSTREAM ORDER BY  FTIME_SECS, FTIME_USECS LIMIT 51;

51 rows in set (5.00 sec)

The explain of the query is
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-----------------------------+
| id | select_type | table   | type  | possible_keys | key      | key_len | ref  | rows    | Extra                       |
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-----------------------------+
|  1 | SIMPLE      | TM_DATA | range | SPID_KEY      | SPID_KEY | 13      | NULL | 2064500 | Using where; Using filesort |
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-----------------------------+

IF we remove the group by of the query the querie time is faster
- SELECT FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, SPID, DSTREAM, VERSION, GSID, COUNTER, IDENT, GROUP_ID, FLAGS, VALIDITY, TPSD, ROUTEID, PUS_APID, PUS_SSC, PUS_ST, PUS_SST, DATA, LENGTH(DATA) FROM TM_DATA FORCE INDEX(SPID_KEY) WHERE  SPID=10015 AND  (FTIME_SECS > 0 OR (FTIME_SECS = 0 AND FTIME_USECS > 1))  AND  (DSTREAM = 65535)  ORDER BY  FTIME_SECS, FTIME_USECS LIMIT 51;
51 rows in set (0.01 sec)

The explain of the query is
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-------------+
| id | select_type | table   | type  | possible_keys | key      | key_len | ref  | rows    | Extra       |
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-------------+
|  1 | SIMPLE      | TM_DATA | range | SPID_KEY      | SPID_KEY | 13      | NULL | 2064500 | Using where |
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-------------+

*Versions: This problem is not present in the version MySQL 5.1.22 ( in this version the results of both queries is fast), but we have found the problem in the version 5.1.24, 5.126 and 5.1.28

How to repeat:
To reproduce the problem with a database with a high number of records.
[21 Oct 2008 8:59] Valeriy Kravchuk
Thank you for a problem report. Why do you think it is a bug, though? Your GROUP BY contains columns not covered by index, so additional filesort step is needed. Queries are different, have different execution plans, and may need different time to run. This is expected.

Also if you are sure 5.1.22 had no problem like this, can you, please, send EXPLAIN results for both queries from 5.1.22?
[21 Oct 2008 10:58] Jose Antonio
Thanks a lot for your quick response.

You are right, we are using a SPID_KEY to do the search but using the Primary key fields in the group by, therefore it is needed to do a filesort.
We have modify the queries in order to do the group by by the fields of the SPID_KEY (forced index) and the query is fast. Thanks.

The explain of the queries in 5.1.22 are the following:
* With group by:
mysql> explain SELECT FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, SPID, DSTREAM, VERSION, GSID, COUNTER, IDENT, GROUP_ID, FLAGS, VALIDITY, TPSD, ROUTEID, PUS_APID, PUS_SSC, PUS_ST, PUS_SST, DATA, LENGTH(DATA) FROM TM_DATA FORCE INDEX(SPID_KEY) WHERE  SPID=10015 AND  (FTIME_SECS > 0 OR (FTIME_SECS = 0 AND FTIME_USECS > 1))  AND  (DSTREAM = 65535)  GROUP BY  FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, DSTREAM ORDER BY  FTIME_SECS, FTIME_USECS LIMIT 51;
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-------------+
| id | select_type | table   | type  | possible_keys | key      | key_len | ref  | rows    | Extra       |
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-------------+
|  1 | SIMPLE      | TM_DATA | range | SPID_KEY      | SPID_KEY | 13      | NULL | 2064500 | Using where |
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-------------+
1 row in set (0.00 sec)

* Without group by:
mysql> explain SELECT FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, SPID, DSTREAM, VERSION, GSID, COUNTER, IDENT, GROUP_ID, FLAGS, VALIDITY, TPSD, ROUTEID, PUS_APID, PUS_SSC, PUS_ST, PUS_SST, DATA, LENGTH(DATA) FROM TM_DATA FORCE INDEX(SPID_KEY) WHERE  SPID=10015 AND  (FTIME_SECS > 0 OR (FTIME_SECS = 0 AND FTIME_USECS > 1))  AND  (DSTREAM = 65535)  ORDER BY  FTIME_SECS, FTIME_USECS LIMIT 51;
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-------------+
| id | select_type | table   | type  | possible_keys | key      | key_len | ref  | rows    | Extra       |
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-------------+
|  1 | SIMPLE      | TM_DATA | range | SPID_KEY      | SPID_KEY | 13      | NULL | 2064500 | Using where |
+----+-------------+---------+-------+---------------+----------+---------+------+---------+-------------+
1 row in set (0.01 sec)

Maybe there is a bug in the Mysql 5.1.22 version?

Thanks in advance
[21 Oct 2008 13:19] Valeriy Kravchuk
Please, send the SHOW CREATE TABLE results. I need a proof that you group by PRIMARY KEY columns ONLY, and in the same order they are in the PRIMARY KEY.
[21 Oct 2008 13:26] Jose Antonio
This are the show create table results
*************************** 1. row ***************************
       Table: TM_DATA
Create Table: CREATE TABLE `TM_DATA` (
  `FTIME_SECS` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'Packet filing time seconds',
  `FTIME_USECS` mediumint(8) unsigned NOT NULL DEFAULT '0' COMMENT 'Packet filing time useconds',
  `CTIME_SECS` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'Packet creation time seconds',
  `CTIME_USECS` mediumint(8) unsigned NOT NULL DEFAULT '0' COMMENT 'Packet creation time useconds',
  `CREATE_ID` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'Packet creation ID',
  `SPID` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'Packet SPID',
  `DSTREAM` smallint(5) unsigned NOT NULL DEFAULT '0' COMMENT 'Data stream',
  `VERSION` tinyint(3) unsigned NOT NULL DEFAULT '0' COMMENT 'Packet TDEV version',
  `GSID` smallint(5) unsigned NOT NULL DEFAULT '0' COMMENT 'Ground station ID',
  `COUNTER` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'Packet counter',
  `IDENT` tinyint(3) unsigned NOT NULL DEFAULT '0' COMMENT 'Identification entry',
  `GROUP_ID` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'SPID group ID',
  `FLAGS` mediumint(8) unsigned NOT NULL DEFAULT '0' COMMENT 'TDEV Flags',
  `VALIDITY` int(10) unsigned NOT NULL DEFAULT '0' COMMENT 'Packet validity flags',
  `TPSD` int(11) NOT NULL DEFAULT '0' COMMENT 'TPSD Variable structure',
  `ROUTEID` smallint(5) unsigned NOT NULL DEFAULT '0' COMMENT 'Route ID',
  `PUS_APID` smallint(5) unsigned NOT NULL DEFAULT '0' COMMENT 'PUS application ID',
  `PUS_SSC` smallint(5) unsigned NOT NULL DEFAULT '0' COMMENT 'PUS source site code',
  `PUS_ST` tinyint(3) unsigned NOT NULL DEFAULT '0' COMMENT 'PUS service type',
  `PUS_SST` tinyint(3) unsigned NOT NULL DEFAULT '0' COMMENT 'PUS service sub-type',
  `DATA` blob COMMENT 'TM header + frame',
  PRIMARY KEY (`FTIME_SECS`,`FTIME_USECS`,`CTIME_SECS`,`CTIME_USECS`,`CREATE_ID`,`DSTREAM`),
  KEY `SPID_KEY` (`DSTREAM`,`SPID`,`FTIME_SECS`,`FTIME_USECS`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='Telemetry packets data STA' /*!50100 PARTITION BY RANGE ( `FTIME_SECS`) (PARTITION p2000 VALUES LESS THAN (946684800) ENGINE = InnoDB, PARTITION p2001 VALUES LESS THAN (978307200) ENGINE = InnoDB, PARTITION p2002 VALUES LESS THAN (1009843200) ENGINE = InnoDB, PARTITION p2003 VALUES LESS THAN (1041379200) ENGINE = InnoDB, PARTITION p2004 VALUES LESS THAN (1072915200) ENGINE = InnoDB, PARTITION p2005 VALUES LESS THAN (1104537600) ENGINE = InnoDB, PARTITION p2006 VALUES LESS THAN (1136073600) ENGINE = InnoDB, PARTITION p2007 VALUES LESS THAN (1167609600) ENGINE = InnoDB, PARTITION p2008 VALUES LESS THAN (1199145600) ENGINE = InnoDB, PARTITION p2009 VALUES LESS THAN (1230768000) ENGINE = InnoDB, PARTITION p2010 VALUES LESS THAN (1262304000) ENGINE = InnoDB, PARTITION p2011 VALUES LESS THAN (1293840000) ENGINE = InnoDB, PARTITION p2012 VALUES LESS THAN (1325376000) ENGINE = InnoDB, PARTITION p2013 VALUES LESS THAN (1356998400) ENGINE = InnoDB, PARTITION p2014 VALUES LESS THAN (1388534400) ENGINE = InnoDB, PARTITION p2015 VALUES LESS THAN (1420070400) ENGINE = InnoDB, PARTITION p2016 VALUES LESS THAN (1451606400) ENGINE = InnoDB, PARTITION p2017 VALUES LESS THAN (1483228800) ENGINE = InnoDB, PARTITION p2018 VALUES LESS THAN (1514764800) ENGINE = InnoDB, PARTITION p2019 VALUES LESS THAN (1546300800) ENGINE = InnoDB, PARTITION p2020 VALUES LESS THAN (1577836800) ENGINE = InnoDB, PARTITION p2021 VALUES LESS THAN MAXVALUE ENGINE = InnoDB) */
1 row in set (0.00 sec)
[21 Oct 2008 13:38] Valeriy Kravchuk
This table is partitioned... Please, send the results of EXPLAIN PARTITIONS ...\G for queries with and without GROUP BY from 5.1.28 and 5.1.22, if possible.

What is the reason to GROUP BY entire PRIMARY KEY? There can be at most one row for each primary key value anyway...
[21 Oct 2008 14:17] Jose Antonio
Mysql 5.1.28 (with group by)

explain partitions SELECT FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, SPID, DSTREAM, VERSION, GSID, COUNTER, IDENT, GROUP_ID, FLAGS, VALIDITY, TPSD, ROUTEID, PUS_APID, PUS_SSC, PUS_ST, PUS_SST, DATA, LENGTH(DATA) FROM TM_DATA FORCE INDEX(SPID_KEY) WHERE  SPID=10015 AND  (FTIME_SECS > 0 OR (FTIME_SECS = 0 AND FTIME_USECS > 1))  AND  (DSTREAM = 65535)  GROUP BY  FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, DSTREAM ORDER BY  FTIME_SECS, FTIME_USECS LIMIT 51\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TM_DATA
   partitions: p2000,p2001,p2002,p2003,p2004,p2005,p2006,p2007,p2008,p2009,p2010,p2011,p2012,p2013,p2014,p2015,p2016,p2017,p2018,p2019,p2020,p2021
         type: range
possible_keys: SPID_KEY
          key: SPID_KEY
      key_len: 13
          ref: NULL
         rows: 2064500
        Extra: Using where; Using filesort
1 row in set (0.45 sec)

Mysql 5.1.28 (without group by)
explain partitions SELECT FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, SPID, DSTREAM, VERSION, GSID, COUNTER, IDENT, GROUP_ID, FLAGS, VALIDITY, TPSD, ROUTEID, PUS_APID, PUS_SSC, PUS_ST, PUS_SST, DATA, LENGTH(DATA) FROM TM_DATA FORCE INDEX(SPID_KEY) WHERE  SPID=10015 AND  (FTIME_SECS > 0 OR (FTIME_SECS = 0 AND FTIME_USECS > 1))  AND  (DSTREAM = 65535)  ORDER BY  FTIME_SECS, FTIME_USECS LIMIT 51\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TM_DATA
   partitions: p2000,p2001,p2002,p2003,p2004,p2005,p2006,p2007,p2008,p2009,p2010,p2011,p2012,p2013,p2014,p2015,p2016,p2017,p2018,p2019,p2020,p2021
         type: range
possible_keys: SPID_KEY
          key: SPID_KEY
      key_len: 13
          ref: NULL
         rows: 2064500
        Extra: Using where
1 row in set (0.00 sec)

Mysql 5.1.24 (with group by)
explain partitions SELECT FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, SPID, DSTREAM, VERSION, GSID, COUNTER, IDENT, GROUP_ID, FLAGS, VALIDITY, TPSD, ROUTEID, PUS_APID, PUS_SSC, PUS_ST, PUS_SST, DATA, LENGTH(DATA) FROM TM_DATA FORCE INDEX(SPID_KEY) WHERE  SPID=10015 AND  (FTIME_SECS > 0 OR (FTIME_SECS = 0 AND FTIME_USECS > 1))  AND  (DSTREAM = 65535)  GROUP BY  FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, DSTREAM ORDER BY  FTIME_SECS, FTIME_USECS LIMIT 51\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TM_DATA
   partitions: p2000,p2001,p2002,p2003,p2004,p2005,p2006,p2007,p2008,p2009,p2010,p2011,p2012,p2013,p2014,p2015,p2016,p2017,p2018,p2019,p2020,p2021
         type: range
possible_keys: SPID_KEY
          key: SPID_KEY
      key_len: 13
          ref: NULL
         rows: 2064500
        Extra: Using where; Using filesort
1 row in set (0.46 sec)

Mysql 5.1.24 (without group by)

explain partitions SELECT FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, SPID, DSTREAM, VERSION, GSID, COUNTER, IDENT, GROUP_ID, FLAGS, VALIDITY, TPSD, ROUTEID, PUS_APID, PUS_SSC, PUS_ST, PUS_SST, DATA, LENGTH(DATA) FROM TM_DATA FORCE INDEX(SPID_KEY) WHERE  SPID=10015 AND  (FTIME_SECS > 0 OR (FTIME_SECS = 0 AND FTIME_USECS > 1))  AND  (DSTREAM = 65535)  ORDER BY  FTIME_SECS, FTIME_USECS LIMIT 51\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TM_DATA
   partitions: p2000,p2001,p2002,p2003,p2004,p2005,p2006,p2007,p2008,p2009,p2010,p2011,p2012,p2013,p2014,p2015,p2016,p2017,p2018,p2019,p2020,p2021
         type: range
possible_keys: SPID_KEY
          key: SPID_KEY
      key_len: 13
          ref: NULL
         rows: 2064500
        Extra: Using where
1 row in set (0.01 sec)

Mysql 5.1.22 (with group by)
explain partitions SELECT FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, SPID, DSTREAM, VERSION, GSID, COUNTER, IDENT, GROUP_ID, FLAGS, VALIDITY, TPSD, ROUTEID, PUS_APID, PUS_SSC, PUS_ST, PUS_SST, DATA, LENGTH(DATA) FROM TM_DATA FORCE INDEX(SPID_KEY) WHERE  SPID=10015 AND  (FTIME_SECS > 0 OR (FTIME_SECS = 0 AND FTIME_USECS > 1))  AND  (DSTREAM = 65535)  GROUP BY  FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, DSTREAM ORDER BY  FTIME_SECS, FTIME_USECS LIMIT 51\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TM_DATA
   partitions: p2000,p2001,p2002,p2003,p2004,p2005,p2006,p2007,p2008,p2009,p2010,p2011,p2012,p2013,p2014,p2015,p2016,p2017,p2018,p2019,p2020,p2021
         type: range
possible_keys: SPID_KEY
          key: SPID_KEY
      key_len: 13
          ref: NULL
         rows: 2064500
        Extra: Using where
1 row in set (0.01 sec)

Mysql 5.1.22 (without group by)
explain partitions SELECT FTIME_SECS, FTIME_USECS, CTIME_SECS, CTIME_USECS, CREATE_ID, SPID, DSTREAM, VERSION, GSID, COUNTER, IDENT, GROUP_ID, FLAGS, VALIDITY, TPSD, ROUTEID, PUS_APID, PUS_SSC, PUS_ST, PUS_SST, DATA, LENGTH(DATA) FROM TM_DATA FORCE INDEX(SPID_KEY) WHERE  SPID=10015 AND  (FTIME_SECS > 0 OR (FTIME_SECS = 0 AND FTIME_USECS > 1))  AND  (DSTREAM = 65535)  ORDER BY  FTIME_SECS, FTIME_USECS LIMIT 51\G;
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: TM_DATA
   partitions: p2000,p2001,p2002,p2003,p2004,p2005,p2006,p2007,p2008,p2009,p2010,p2011,p2012,p2013,p2014,p2015,p2016,p2017,p2018,p2019,p2020,p2021
         type: range
possible_keys: SPID_KEY
          key: SPID_KEY
      key_len: 13
          ref: NULL
         rows: 2064500
        Extra: Using where
1 row in set (0.00 sec)
[21 Oct 2008 14:43] Valeriy Kravchuk
Looks like the behaviour of 5.1.22 was wrong, and it changed to correct one after patch for bug #32202 was applied. You had incorrect results, likely, in 5.1.22.
[21 Oct 2008 17:46] Jose Antonio
Thanks a lot.

Could toy tell us which is the version of mysql that you recommend for a system with a large amount of data (800,000,000) in an operational system?.

Thanks in advance