Bug #53137 Optimizer may be confused by an index
Submitted: 24 Apr 2010 0:14 Modified: 10 Jan 2013 12:43
Reporter: Mikhail T Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S2 (Serious)
Version:5.0.90, 4.1, 5.0, 5.1, 5.6.99 OS:Any (FreeBSD 7.3/amd64, Linux)
Assigned to: CPU Architecture:Any

[24 Apr 2010 0:14] Mikhail T
Description:
While working on a "hairy" query, I noticed, that I can get vastly different results by simply using the LIMIT clause:

select o.topic_id, count(o.post_id) count, min(o.post_id) first, t.post_id from pc3_posts o join pc3_posts t on o.post_id=t.post_id group by o.topic_id having t.post_id=first and count>1 order by topic_id desc;
....
4267 rows in set (0.39 sec)   <-- normal, as expected

   versus:

select o.topic_id, count(o.post_id) count, min(o.post_id) first, t.post_id from pc3_posts o join pc3_posts t on o.post_id=t.post_id group by o.topic_id having t.post_id=first and count>1 order by topic_id desc limit 4267;
Empty set (0.80 sec)

How to repeat:
The full dump of the table can be provided upon request. It is declared as:

CREATE TABLE `pc3_posts` (
  `post_id` mediumint(8) unsigned NOT NULL auto_increment,
  `topic_id` mediumint(8) unsigned NOT NULL default '0',
  `forum_id` mediumint(8) unsigned NOT NULL default '0',
  `poster_id` mediumint(8) unsigned NOT NULL default '0',
  `icon_id` mediumint(8) unsigned NOT NULL default '0',
  `poster_ip` varchar(40) collate utf8_bin NOT NULL default '',
  `post_time` int(11) unsigned NOT NULL default '0',
  `post_approved` tinyint(1) unsigned NOT NULL default '1',
  `post_reported` tinyint(1) unsigned NOT NULL default '0',
  `enable_bbcode` tinyint(1) unsigned NOT NULL default '1',
  `enable_smilies` tinyint(1) unsigned NOT NULL default '1',
  `enable_magic_url` tinyint(1) unsigned NOT NULL default '1',
  `enable_sig` tinyint(1) unsigned NOT NULL default '1',
  `post_username` varchar(255) collate utf8_bin NOT NULL default '',
  `post_subject` varchar(255) character set utf8 collate utf8_unicode_ci NOT NULL default '',
  `post_text` mediumtext collate utf8_bin NOT NULL,
  `post_checksum` varchar(32) collate utf8_bin NOT NULL default '',
  `post_attachment` tinyint(1) unsigned NOT NULL default '0',
  `bbcode_bitfield` varchar(255) collate utf8_bin NOT NULL default '',
  `bbcode_uid` varchar(8) collate utf8_bin NOT NULL default '',
  `post_postcount` tinyint(1) unsigned NOT NULL default '1',
  `post_edit_time` int(11) unsigned NOT NULL default '0',
  `post_edit_reason` varchar(255) collate utf8_bin NOT NULL default '',
  `post_edit_user` mediumint(8) unsigned NOT NULL default '0',
  `post_edit_count` smallint(4) unsigned NOT NULL default '0',
  `post_edit_locked` tinyint(1) unsigned NOT NULL default '0',
  PRIMARY KEY  (`post_id`),
  UNIQUE KEY `topicids` (`topic_id`,`post_id`),
  KEY `forum_id` (`forum_id`),
  KEY `topic_id` (`topic_id`),
  KEY `poster_ip` (`poster_ip`),
  KEY `poster_id` (`poster_id`),
  KEY `post_approved` (`post_approved`),
  KEY `post_username` (`post_username`),
  KEY `tid_post_time` (`topic_id`,`post_time`)
) ENGINE=MyISAM AUTO_INCREMENT=90230 DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
[24 Apr 2010 6:36] Sveta Smirnova
Thank you for the report.

> The full dump of the table can be provided upon request.

Please provide full dump.
[25 Apr 2010 18:13] Mikhail T
Reproduce the problem described in bug 53137

Attachment: t.sql (application/octet-stream, text), 474 bytes.

[25 Apr 2010 18:14] Mikhail T
The subset of rows and columns reproducing the bug 53137

Attachment: pc3_posts.sql (application/octet-stream, text), 3.27 KiB.

[25 Apr 2010 18:16] Mikhail T
Ok, I was able to reduce the test case to a much smaller column- and row-counts. See the already attached t.sql and pc3_posts.sql.

The two queries inside t.sql are supposed to provide the same rows in different order. Instead, the second one of them returns an empty set...

In the process of doing that, I also found, that the results seem to depend on what else I was doing in my mysql-client session. There is definitely a problem, but it is not really with the LIMIT-clause itself... There are various strange discrepancies in the output, and they all seem triggered by the following index:

UNIQUE KEY `topicids` (`topic_id`,`post_id`)

If this index is dropped, the bug is no more...
[26 Apr 2010 5:54] Sveta Smirnova
Thank you for the feedback.

Verified as described.
[26 Apr 2010 11:40] Mikhail T
Thank you, Светлана.
[10 Jan 2013 12:43] Erlend Dahl
The query stated in the bugpage is not legal if we enable 'ONLY_FULL_GROUP_BY' sql mode. It is expected that the result set can vary in cases where "ONLY_FULL_GROUP_BY" is not enabled and the query is executed.

This is what we have in manual.

"In standard SQL, a query that includes a GROUP BY clause cannot refer to nonaggregated columns in the HAVING clause that are not named in the GROUP BY clause. A MySQL extension permits references to such columns to simplify calculations. This extension assumes that the nongrouped columns will have the same group-wise values. Otherwise, the result is indeterminate."

Hence closing it as not a bug