Bug #37759 Performance/Index problem updating from 5.1.22 to 5.1.25
Submitted: 1 Jul 2008 9:41 Modified: 1 Jul 2008 19:56
Reporter: Andreas Mller Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S1 (Critical)
Version:5.1.25 OS:Any
Assigned to: CPU Architecture:Any

[1 Jul 2008 9:41] Andreas Mller
Description:
I have a InnoDB table with about 90 million recrods:

CREATE TABLE `workassociation` (
  `workassociation_id` int(11) NOT NULL AUTO_INCREMENT,
  `createdate` datetime DEFAULT NULL,
  `create_user_id` int(11) DEFAULT NULL,
  `editdate` datetime DEFAULT NULL,
  `edit_user_id` int(11) DEFAULT NULL,
  `workvalue_id` int(11) NOT NULL DEFAULT '0',
  `direction_id` int(11) NOT NULL DEFAULT '0',
  `sortnr` int(11) NOT NULL DEFAULT '0',
  `workprocess_id` int(11) NOT NULL DEFAULT '0',
  `value` double DEFAULT '0',
  `calc` tinyint(4) NOT NULL DEFAULT '1',
  `estimated` tinyint(4) NOT NULL DEFAULT '0',
  `view` tinyint(4) NOT NULL DEFAULT '1',
  PRIMARY KEY (`workassociation_id`),
  UNIQUE KEY `unique1` (`workvalue_id`,`direction_id`,`workprocess_id`),
  KEY `workvalue_id` (`workvalue_id`),
  KEY `workprocess_id` (`workprocess_id`),
  KEY `sel_index` (`workvalue_id`,`direction_id`)
) ENGINE=InnoDB AUTO_INCREMENT=143600478 DEFAULT CHARSET=latin1;

After moving the database to an other machine wir 5.1.25 instead of 5.1.22 I found a problem. Executing:

explain
SELECT *
FROM workassociation
WHERE workvalue_id=20260883
ORDER BY workassociation_id

shows that PRIMARY index is used instead of index "unique1" or "workvalue_id". There are only 2 rows for this workvalue_id so optimizer should take an other index that PRIMARY even the result set should be sortet by the PRIMARY column.

Executing:

explain
SELECT *
FROM workassociation
WHERE workvalue_id=20260883

show that index "unique1" will be used.

So I guess that there is a problem with either the index stats (OPTIMIZE TABLE has been running) or the optimizer.

On version 5.1.22 always the right index is used.

How to repeat:
Try with the table format above. Because the table is very big I can't give other examples.
[1 Jul 2008 13:28] Susanne Ebrecht
Andreas,

many thanks for writing a bug report.

But I can't see a bug here. Please correct me, when I am wrong.

UNIQUE KEY `unique1` (`workvalue_id`,`direction_id`,`workprocess_id`),

This will implicit means that you have three index possibilites:
1) workvalue_id (non unique)
2) workvalue_id and direction_id (non unique)
3) workvalue_id and direction_id and workprocess_id (unique)

So, you don't need an extra index for workvalue_id here. You can just remove this key here: KEY `workvalue_id` (`workvalue_id`), KEY `sel_index` (`workvalue_id`,`direction_id`) because they are duplicates.

You can read more about this here:
http://dev.mysql.com/doc/refman/5.1/en/mysql-indexes.html
http://dev.mysql.com/doc/refman/5.1/de/mysql-indexes.html

"If the table has a multiple-column index, any leftmost prefix of the index can be used by the optimizer to find rows. For example, if you have a three-column index on (col1, col2, col3), you have indexed search capabilities on (col1), (col1, col2), and (col1, col2, col3)."

When I understand you right, your problem was just this index stuff. You don't have performance issues. If you have them, please just feel free to re-open this bug report.
[1 Jul 2008 15:29] Andreas Mller
Hello Susanne,

thanks for you answer but I think you don't get the point - so I try so explain it a little bit better.

First: Yes you're right an I know that there are 2 indieces too much:
- KEY `workvalue_id` (`workvalue_id`)
- KEY `sel_index` (`workvalue_id`,`direction_id`)
can be removed. They are based on historical changes to the table but a ALTER TABLE tooks about 12 hours on this table so no one has removed this up today.

BUT: This ist not the Problem.

The Query:

SELECT *
FROM workassociation
WHERE workvalue_id=20260883
ORDER BY workassociation_id

uses the primary key:

  PRIMARY KEY (`workassociation_id`)

and not one of the key's with column "workvalue_id". This is wrong.

The query:

SELECT *
FROM workassociation
WHERE workvalue_id=20260883

(without the ORDER BY workassociation_id) uses key "unique1". This is right.

The record count of the query above is 2 - the complete table has about 90 million rows. So using the primary key with is based on column "workassociation_id" results in a full table scan over the complete table to select 2 rows. This could not be right because optimizer should see that he shoult take a index with column "workvalue_id".

This is not append in 5.1.22 - so there must be a change in the optimizer or the index statistics upto version 5.1.25.

And this I would call a bug.
[1 Jul 2008 18:50] Valeriy Kravchuk
Looks like a duplicate of bug #36259. Please, check.
[1 Jul 2008 19:52] Andreas Mller
It looks like that. Sorry for the doublicated report but I searched for similar bug report but found nothing.

If the report #36259 behaves like my problem and removing of the order by will course in optimizer is taking the right index than it is realy doublicated. This feature is't checkt in the other report. But Valeriy you had verified the other bug so you simply could check this.

This would also be good for me because I need a awe version an so I know with source i've to compile.
[1 Jul 2008 19:56] Valeriy Kravchuk
I think this is a duplicate (or a kind of) bug #36259.