Bug #46508 MySQL 5.1.36 where / order by seems to sort first
Submitted: 1 Aug 2009 13:27 Modified: 5 Aug 2009 11:57
Reporter: Harald Reindl Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1.36 OS:Any
Assigned to: CPU Architecture:Any

[1 Aug 2009 13:27] Harald Reindl
Description:
We have a performance gain in "dbmail" using innodb-Backend while running imapsync. Well, i know that the query is very bad "order by unindexed-blobfield" and possible there is no need for any order-by

SELECT messageblk, is_header FROM dbmail_messageblks WHERE physmessage_id=89842
ORDER BY messageblk;

This seems FIRST order by and THEN look on the where which could use an index and would return few records: So if you have millions of records the difference is between minutes and milliseconds.

My dbmail-bugreport on fedora-bugzilla:
https://bugzilla.redhat.com/show_bug.cgi?id=515056

How to repeat:
CREATE TABLE `dbmail_messageblks` (
  `messageblk_idnr` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `physmessage_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `messageblk` longblob NOT NULL,
  `blocksize` bigint(20) unsigned NOT NULL DEFAULT '0',
  `is_header` tinyint(3) unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`messageblk_idnr`),
  KEY `physmessage_id_index` (`physmessage_id`),
  KEY `physmessage_id_is_header_index` (`physmessage_id`,`is_header`),
  CONSTRAINT `dbmail_messageblks_ibfk_1` FOREIGN KEY (`physmessage_id`) REFERENCES `dbmail_physmessage` (`id`) ON DELETE CASCADE ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=207708 DEFAULT CHARSET=utf8;

Fill in much data and make a query described above
We have 207708 records with 16 GB data in this table now

Suggested fix:
I think mysql could simple fix the behavier if a query is written in such a bad way because records which does not match where are not needed to touch here.

* Filter records based on the where-statement
* order this result which are few records most time
[1 Aug 2009 14:12] Valeriy Kravchuk
Thank you for the problem report. Please, send the results of:

EXPLAIN SELECT messageblk, is_header FROM dbmail_messageblks WHERE physmessage_id=89842
ORDER BY messageblk\G

from your system.
[1 Aug 2009 14:26] Harald Reindl
I found the real problem and had one error in my bugreport
The order goes to "messageblk_idnr" but the query uses this primary key which will affect millions of records instead of the where-clause

The query is located in dbmail-message.c line 741
Seems mysql uses the wrong index!
Look at my third query with "FORCE INDEX" that will solve the problem
So i need to know how to make a patch for rpmbuild/specfile

mysql> explain SELECT SQL_NO_CACHE messageblk, is_header FROM dbmail_messageblks WHERE physmessage_id=89842 ORDER BY messageblk_idnr;
+----+-------------+--------------------+-------+-----------------------------------------------------+---------+---------+------+--------+-------------+
| id | select_type | table              | type  | possible_keys                                       | key     | key_len | ref  | rows   | Extra       |
+----+-------------+--------------------+-------+-----------------------------------------------------+---------+---------+------+--------+-------------+
|  1 | SIMPLE      | dbmail_messageblks | index | physmessage_id_index,physmessage_id_is_header_index | PRIMARY | 8       | NULL | 642706 | Using where |
+----+-------------+--------------------+-------+-----------------------------------------------------+---------+---------+------+--------+-------------+
1 row in set (0.18 sec)

SELECT SQL_NO_CACHE messageblk, is_header FROM dbmail_messageblks WHERE physmessage_id=89842;
2 rows in set (0.00 sec)

SELECT SQL_NO_CACHE messageblk, is_header FROM dbmail_messageblks WHERE physmessage_id=89842 ORDER BY messageblk_idnr;
runs since 727 seconds, so i stopped it
mysql produces and i/o up to 25 MB per second!

SELECT SQL_NO_CACHE messageblk, is_header FROM dbmail_messageblks FORCE INDEX (physmessage_id_index) WHERE physmessage_id=89842 ORDER BY messageblk_idnr;
2 rows in set (0.20 sec)

/*
 *
 * retrieve the full message
 *
*/
static struct DbmailMessage * _fetch_full(struct DbmailMessage *self)
{
 char *query_template = "SELECT messageblk, is_header "
  "FROM %smessageblks "
  "WHERE physmessage_id = %llu "
  "ORDER BY messageblk_idnr";
 return _retrieve(self, query_template);
}
[1 Aug 2009 14:50] Valeriy Kravchuk
This looks like a (simpler) duplicate of bug #44969. Please, check.

Send also the results of

SHOW INDEXES FROM  dbmail_messageblks;
[1 Aug 2009 15:27] Harald Reindl
Here they are

I fix the dbmail-source for this time  with "FORCE INDEX (physmessage_id_index)" and suggest upstream also to do so, but finally this should never happen in mysql because it seems to be a simple query where mysql should decide use this index instead of the auto_increment primary key used for order

mysql> SHOW INDEXES FROM  dbmail_messageblks;
+--------------------+------------+--------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
| Table              | Non_unique | Key_name                       | Seq_in_index | Column_name     | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------------------+------------+--------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
| dbmail_messageblks |          0 | PRIMARY                        |            1 | messageblk_idnr | A         |     6933765 |     NULL | NULL   |      | BTREE      |         |
| dbmail_messageblks |          1 | physmessage_id_index           |            1 | physmessage_id  | A         |     2311255 |     NULL | NULL   |      | BTREE      |         |
| dbmail_messageblks |          1 | physmessage_id_is_header_index |            1 | physmessage_id  | A         |     3466882 |     NULL | NULL   |      | BTREE      |         |
| dbmail_messageblks |          1 | physmessage_id_is_header_index |            2 | is_header       | A         |     6933765 |     NULL | NULL   |      | BTREE      |         |
+--------------------+------------+--------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
4 rows in set (0.47 sec)
[1 Aug 2009 18:50] Valeriy Kravchuk
As cardinality looks correct enough (and proper index was declared just after primary key) this should be a duplicate of bug #36259 actually. It should be fixed in 5.1.37. Let's wait for it to be released officially and check.
[1 Aug 2009 22:31] Harald Reindl
Nice to hear and thank you for your response (on other projects like php and mozilla bugreports are everytime ignored or set to "notabug" until you make no monkey-dance :-)

When will 5.1.37 be released?
Maybe it solves another problem

> Invalid memory reads and writes were generated when altering merge and base 
> tables. This could lead to a crash or Valgrind errors: 

If i make a null-alter-table to innodb-tables there is no disk-io and no cpu-usage and it works poorly slow, so after some minutes less mb are written in the tmp-file, the table is locked and i do not understand what the hell takes so long on a machine which can write 100 MB/Sek. to disk...
[5 Aug 2009 11:57] Harald Reindl
I can confirm the fix in mysql 5.1.37

mysql> SELECT SQL_NO_CACHE messageblk, is_header FROM dbmail_messageblks WHERE physmessage_id=99842 ORDER BY messageblk_idnr;
2 rows in set (0.02 sec)

mysql> explain SELECT SQL_NO_CACHE messageblk, is_header FROM dbmail_messageblks WHERE physmessage_id=99842 ORDER BY messageblk_idnr;
+----+-------------+--------------------+------+-----------------------------------------------------+----------------------+---------+-------+------+-------------+
| id | select_type | table              | type | possible_keys                                       | key                  | key_len | ref   | rows | Extra       |
+----+-------------+--------------------+------+-----------------------------------------------------+----------------------+---------+-------+------+-------------+
|  1 | SIMPLE      | dbmail_messageblks | ref  | physmessage_id_index,physmessage_id_is_header_index | physmessage_id_index | 8       | const |    2 | Using where |
+----+-------------+--------------------+------+-----------------------------------------------------+----------------------+---------+-------+------+-------------+
1 row in set (0.01 sec)