Bug #70444 MyISAM engine doesnt use index for count(distinct field) query
Submitted: 26 Sep 2013 21:57 Modified: 2 Dec 2013 13:56
Reporter: Oleksandr Voytsekhovskyy Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.5.31 OS:Any
Assigned to: CPU Architecture:Any

[26 Sep 2013 21:57] Oleksandr Voytsekhovskyy
Description:
I have a simple table:

 CREATE TEMPORARY TABLE `norepl_sqldata_20587014` (
  `v_1` int(11) NOT NULL DEFAULT '0',
  `axis_x1` int(11) NOT NULL DEFAULT '0',
  `axis_y1` varchar(255) CHARACTER SET cp1251 COLLATE cp1251_bin NOT NULL DEFAULT '',
  `axis_y2` varchar(128) CHARACTER SET cp1251 COLLATE cp1251_bin NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=cp1251

mysql> select count(*) from norepl_sqldata_20587014;
+----------+
| count(*) |
+----------+
| 20132984 |
+----------+

Real size is:
-rw-rw----   1 mysql mysql 443M Sep 27 00:38 #sql2837_c54c_1.MYD

This simple query:
mysql> SELECT count(distinct v_1) as v_1, axis_x1, axis_y1, axis_y2 FROM norepl_sqldata_20587014 GROUP BY axis_x1, axis_y1, axis_y2;                                                                 ERROR 3 (HY000): Error writing file '/.ram/MYrRd3wm' (Errcode: 28)

Use 30Gb available disk space and crash. Related table size only 400mb.

Db version:
mysql> select version();
+------------+
| version()  |
+------------+
| 5.5.31-log |
+------------+

How to repeat:
I can provide dump of norepl_sqldata_20587014 table
[26 Sep 2013 22:39] MySQL Verification Team
Please provide the dump file. Thanks.
[27 Sep 2013 7:40] Oleksandr Voytsekhovskyy
SELECT count(v_1) as v_1, axis_x1, axis_y1, axis_y2 FROM norepl_sqldata_20587014 GROUP BY axis_x1, axis_y1, axis_y2;

works without disk usage at all and very fast

SELECT count(DISTINCT v_1) as v_1, axis_x1, axis_y1, axis_y2 FROM norepl_sqldata_20587014 GROUP BY axis_x1, axis_y1, axis_y2;

use all available disk space and then crash
[27 Sep 2013 9:03] Oleksandr Voytsekhovskyy
I have workaround:
- convert table to innodb
- alter table norepl_sqldata_20587014 add index t (axis_x1, axis_y1, axis_y2);

if i do both this steps query works.

MyISAM with index doesn't work.
Innodb without index doesn't work.

Only innodb with index
[27 Sep 2013 13:41] MySQL Verification Team
I couldn't repeat with 5.6.14. I will test 5.5

mysql 5.6 > SHOW VARIABLES LIKE "%version%";
+-------------------------+---------------------+
| Variable_name           | Value               |
+-------------------------+---------------------+
| innodb_version          | 5.6.14              |
| protocol_version        | 10                  |
| slave_type_conversions  |                     |
| version                 | 5.6.14              |
| version_comment         | Source distribution |
| version_compile_machine | x86_64              |
| version_compile_os      | Win64               |
+-------------------------+---------------------+
7 rows in set (0.00 sec)

mysql 5.6 > SELECT count(distinct v_1) as v_1, axis_x1, axis_y1, axis_y2 FROM norepl_sqldata_20587014 GROUP BY axis_x1, axis_y1, axis_y2;
+------+---------+---------+-------------+
| v_1  | axis_x1 | axis_y1 | axis_y2     |
+------+---------+---------+-------------+
|  211 |       1 |         | Glade       |
|    5 |       1 |         | KIWI        |
|  236 |       1 |         | Mr Muscle   |
|  122 |       1 |         | Off         |
<CUT>
| 1446 |       3 | E       | KIWI        |
| 4449 |       3 | E       | Mr Muscle   |
| 4054 |       3 | E       | Off         |
|  670 |       3 | E       | Oust        |
| 3203 |       3 | E       | Pronto      |
| 4623 |       3 | E       | Raid        |
|    1 |       3 | E       | Shout       |
| 4782 |       3 | E       | Toilet Duck |
+------+---------+---------+-------------+
170 rows in set (1 hour 2 min 8.39 sec)
[27 Sep 2013 13:45] Oleksandr Voytsekhovskyy
Maybe, you have more than 50Gb available for temporary table?

And query time is 1h, it looks like strange! On innodb query time around 3-5 min
[30 Sep 2013 13:40] MySQL Verification Team
Thank you for the feedback. I think the big temporary file is expected with Distinct + varchars column however there is a bug with the slow performance of MyISAM engine when using an index in opposite with InnoDB engine.
[30 Sep 2013 18:26] Oleksandr Voytsekhovskyy
MyISAM, No index, MyISAM, No index, MyISAM, Indexed - same result:

mysql> explain extended SELECT count(distinct v_1) as v_1, axis_x1, 0 as axis_x2, axis_y1, axis_y2 FROM norepl_sqldata_20587014 GROUP BY axis_x1, axis_y1, axis_y2;
+----+-------------+-------------------------+------+---------------+------+---------+------+----------+----------+----------------+
| id | select_type | table                   | type | possible_keys | key  | key_len | ref  | rows     | filtered | Extra          |
+----+-------------+-------------------------+------+---------------+------+---------+------+----------+----------+----------------+
|  1 | SIMPLE      | norepl_sqldata_20587014 | ALL  | NULL          | NULL | NULL    | NULL | 20265125 |   100.00 | Using filesort |
+----+-------------+-------------------------+------+---------------+------+---------+------+----------+----------+----------------+
1 row in set, 1 warning (0.00 sec)

mysql> explain extended SELECT count(distinct v_1) as v_1, axis_x1, 0 as axis_x2, axis_y1, axis_y2 FROM norepl_sqldata_20587014 GROUP BY axis_x1, axis_y1, axis_y2;
+----+-------------+-------------------------+-------+---------------+------+---------+------+----------+----------+-------+
| id | select_type | table                   | type  | possible_keys | key  | key_len | ref  | rows     | filtered | Extra |
+----+-------------+-------------------------+-------+---------------+------+---------+------+----------+----------+-------+
|  1 | SIMPLE      | norepl_sqldata_20587014 | index | NULL          | t    | 391     | NULL | 20231675 |   100.00 |       |
+----+-------------+-------------------------+-------+---------------+------+---------+------+----------+----------+-------+
1 row in set, 1 warning (0.00 sec)
[30 Sep 2013 18:29] Oleksandr Voytsekhovskyy
MyISAM not indexed, MyISAM indexed, InnoDB not indexed give same result:

mysql> explain extended SELECT count(distinct v_1) as v_1, axis_x1, 0 as axis_x2, axis_y1, axis_y2 FROM norepl_sqldata_20587014 GROUP BY axis_x1, axis_y1, axis_y2;
+----+-------------+-------------------------+------+---------------+------+---------+------+----------+----------+----------------+
| id | select_type | table                   | type | possible_keys | key  | key_len | ref  | rows     | filtered | Extra          |
+----+-------------+-------------------------+------+---------------+------+---------+------+----------+----------+----------------+
|  1 | SIMPLE      | norepl_sqldata_20587014 | ALL  | NULL          | NULL | NULL    | NULL | 20265125 |   100.00 | Using filesort |
+----+-------------+-------------------------+------+---------------+------+---------+------+----------+----------+----------------+
1 row in set, 1 warning (0.00 sec)

InnoDB indexed doing much better:

mysql> explain extended SELECT count(distinct v_1) as v_1, axis_x1, 0 as axis_x2, axis_y1, axis_y2 FROM norepl_sqldata_20587014 GROUP BY axis_x1, axis_y1, axis_y2;
+----+-------------+-------------------------+-------+---------------+------+---------+------+----------+----------+-------+
| id | select_type | table                   | type  | possible_keys | key  | key_len | ref  | rows     | filtered | Extra |
+----+-------------+-------------------------+-------+---------------+------+---------+------+----------+----------+-------+
|  1 | SIMPLE      | norepl_sqldata_20587014 | index | NULL          | t    | 391     | NULL | 20231675 |   100.00 |       |
+----+-------------+-------------------------+-------+---------------+------+---------+------+----------+----------+-------+
1 row in set, 1 warning (0.00 sec)
[1 Oct 2013 12:16] MySQL Verification Team
I agreed that the index doesn't works with MyISAM then I suggest you to change
the Synopsis. Thanks.
[2 Nov 2013 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[2 Nov 2013 7:04] Oleksandr Voytsekhovskyy
Synopsis changed
[2 Dec 2013 13:56] MySQL Verification Team
\thank you for the feedback. Only MyISAM engine affected.