Bug #29857 Optimizer is getting confused by extra WHERE condition
Submitted: 17 Jul 2007 21:11 Modified: 5 May 2009 18:20
Reporter: Gabriel Tataranu Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0.41, 5.0.45, 5.0.67 OS:Linux (64 bit, gcc 4.1.1)
Assigned to: CPU Architecture:Any

[17 Jul 2007 21:11] Gabriel Tataranu
Description:
Query 1:
mysql>  explain select count(*) from t1 where  t1.gender = 'f' AND t1.date_of_birth <= '1989-07-17' AND t1.date_of_birth >= '1952-07-17'; 

mysql>  explain select count(*) from t1 where  t1.gender = 'f' AND t1.bdate <= '1989-07-17' AND t1.bdate >= '1952-07-17'; 
+----+-------------+------------+-------+------------------------------+----------------+---------+------+---------+--------------------------+
| id | select_type | table      | type  | possible_keys                | key            | key_len | ref  | rows    | Extra                    |
+----+-------------+------------+-------+------------------------------+----------------+---------+------+---------+--------------------------+
|  1 | SIMPLE      | t1         | range | t1_helper_idx,t1_helper2_idx | t1_helper2_idx | 4       | NULL | 1789851 | Using where; Using index | 
+----+-------------+------------+-------+------------------------------+----------------+---------+------+---------+--------------------------+
1 row in set (0.00 sec)

Query 1 with one more WHERE condition:

mysql>  explain select count(*) from t1 where  t1.gender = 'f' AND t1.bdate <= '1989-07-17' AND t1.bdate >= '1952-07-17' AND t1.lge IN('e','m');
+----+-------------+------------+------+-----------------------------------------+----------------+---------+-------+---------+-------------+
| id | select_type | table      | type | possible_keys                           | key            | key_len | ref   | rows    | Extra       |
+----+-------------+------------+------+-----------------------------------------+----------------+---------+-------+---------+-------------+
|  1 | SIMPLE      | t1         | ref  | t1_helper_idx,t1_helper2_idx,t1_he3_idx | t1_helper2_idx | 1       | const | 8256338 | Using where | 
+----+-------------+------------+------+-----------------------------------------+----------------+---------+-------+---------+-------------+
1 row in set (0.00 sec)

Please note -on the second query- that index selected is the same as Query 1, the values in "key_len" and "rows" are just weird. The query run time is supporting the fact that second query goes (needlessly) trough about 5 times more rows than Query 1.

t1_helper2_idx is a combined index on (gender,bdate). As far as I can tell the value for "rows" does not match to any part of the index - gender or bdate.

How to repeat:
The table has been optimized/analyzed before running "explain". 

Suggested fix:
There is no fix.
[18 Jul 2007 6:45] Sveta Smirnova
Thank you for the report.

Version 5.0.41 is a bit old. Please upgrade to current version 5.0.45, try with it and if you can repeat described behaviour provide output of SHOW CREATE TABLE t1 and SHOW TABLE STATUS LIKE 't1'
[6 Aug 2007 10:53] Valeriy Kravchuk
Thank you for additional information. Can you try to create and upload smaller set of data that demonstates the behaviour described?
[7 Aug 2007 0:55] Gabriel Tataranu
I would, if I could, but I'm only allowed to upload 500KB. My test case has 86MB - gzipped- so I'm waiting for instructions on alternate upload procedure. Thanks.
[7 Aug 2007 6:21] Valeriy Kravchuk
Please, upload your file to ftp://ftp.mysql.com/pub/mysql/upload/, with bug #, 29857, in a file name, and inform me when done.
[7 Aug 2007 13:30] Gabriel Tataranu
Done. Please dispose of the data when done.
[1 Feb 2008 18:36] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.51a, and inform about the results.
[2 Mar 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[17 Mar 2008 10:12] Susanne Ebrecht
We still need to know if this issues still occur by using our newest version (5.0.51a).
[24 Mar 2008 17:22] Gabriel Tataranu
Well, it seems that this bug is fixed. It does show more records for the query with the extra "WHERE" but in a much smaller degree (2.9M records with extra "WHERE" vs 2.2M without. Note: the count gives 2.9M so the estimate seems a bit off for the query without the extra "WHERE"). However, the run time of the 2 queries is VERY different (19.3s for the query with extra "WHERE" vs 3.7s for the query without). The difference is from processing of the extra "WHERE". The slow logs is consistent with the "EXPLAIN" estimate. Overall a success, still something fishy is still present as why the row estimate is different for the 2 queries.
[9 Aug 2008 6:02] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.67, and inform about the results.
[20 Aug 2008 20:01] Gabriel Tataranu
The bug is back.

mysql> explain select count(*) from t1 where  t1.gender = 'f' AND t1.bdate <='1989-07-17' AND t1.bdate >= '1952-07-17';
+----+-------------+-------+-------+-----------------------------+---------------+---------+------+---------+--------------------------+
| id | select_type | table | type  | possible_keys               | key           | key_len | ref  | rows    | Extra                    |
+----+-------------+-------+-------+-----------------------------+---------------+---------+------+---------+--------------------------+
|  1 | SIMPLE      | t1    | range | t1_helper_idx,t1_dobgen_idx | t1_helper_idx | 4       | NULL | 2400885 | Using where; Using index | 
+----+-------------+-------+-------+-----------------------------+---------------+---------+------+---------+--------------------------+
1 row in set (0.00 sec)

mysql> explain select count(*) from t1 where  t1.gender = 'f' AND t1.bdate <='1989-07-17' AND t1.bdate >= '1952-07-17' AND t1.lge IN('e','m');
+----+-------------+-------+------+-------------------------------------+---------------+---------+-------+----------+-------------+
| id | select_type | table | type | possible_keys                       | key           | key_len | ref   | rows     | Extra       |
+----+-------------+-------+------+-------------------------------------+---------------+---------+-------+----------+-------------+
|  1 | SIMPLE      | t1    | ref  | t1_helper_idx,t1_dobgen_idx,t1g_idx | t1_helper_idx | 1       | const | 10764408 | Using where | 
+----+-------------+-------+------+-------------------------------------+---------------+---------+-------+----------+-------------+
1 row in set (0.00 sec)

Same result after "optimize table".
[5 Apr 2009 18:20] Valeriy Kravchuk
Please, try to repeat with a newer version, 5.0.77 at least, and inform about the results.
[5 May 2009 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".