Bug #44604 Wrong number of results returned when executing query
Submitted: 1 May 2009 14:26 Modified: 16 May 2009 14:43
Reporter: Pim Pronk Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S1 (Critical)
Version:5.1.34, 5.0, 5.1 bzr, 5.4 OS:Linux (Cent OS v5)
Assigned to:
Tags: innodb, nested join, Optimizer, wrong results
Triage: Triaged: D2 (Serious)

[1 May 2009 14:26] Pim Pronk
Description:
We have an innodb table which has a reference to itself (parent_id). When executing the query listed below a wrong number of results is returned. If we change the first left join for table2 to a inner join, the query does return the correct number of records. The missing records are the records which have a parent_id. 
In our application, records in table1 with a parent_id almost never have a link to table4. And records without a parent_id always have a link to table4.

How to repeat:
Import attached database.

Run the following query:
SELECT count(t.table1_id), sum(if(t.parent_id is null, 0, 1))
FROM table1 AS t
LEFT JOIN (
table2 AS t2
INNER JOIN table3 AS t3 ON t3.table3_id = t2.table3_id
) ON t2.table2_id = t.table2_id
LEFT JOIN (
table4 AS t4
LEFT JOIN table5 AS t5 ON t5.table5_id = t4.table5_id
LEFT JOIN table6 AS t6 ON t6.table6_id = t4.table6_id
LEFT JOIN (
table7 AS t7
INNER JOIN table8 AS t8 ON t8.table8_id = t7.table8_id
) ON t7.table4_id = t4.table4_id
LEFT JOIN (
table9 AS t9
LEFT JOIN table10 AS t10 ON t10.table10_id = t9.table10_id
) ON t9.table4_id = t4.table4_id
) ON t4.table4_id = t.table4_id
WHERE t2.custom_id =1
GROUP BY t2.custom_id;
Result:
+--------------------+------------------------------------+
| count(t.table1_id) | sum(if(t.parent_id is null, 0, 1)) |
+--------------------+------------------------------------+
|               1986 |                                  0 |
+--------------------+------------------------------------+
1 row in set (0.03 sec)

Now change the LEFT JOIN for table2 to INNER JOIN:
SELECT count( t.table1_id ) , sum( if( t.parent_id IS NULL , 0, 1 ) ) 
FROM table1 AS t
INNER JOIN (
table2 AS t2
INNER JOIN table3 AS t3 ON t3.table3_id = t2.table3_id
) ON t2.table2_id = t.table2_id
LEFT JOIN (
table4 AS t4
LEFT JOIN table5 AS t5 ON t5.table5_id = t4.table5_id
LEFT JOIN table6 AS t6 ON t6.table6_id = t4.table6_id
LEFT JOIN (
table7 AS t7
INNER JOIN table8 AS t8 ON t8.table8_id = t7.table8_id
) ON t7.table4_id = t4.table4_id
LEFT JOIN (
table9 AS t9
LEFT JOIN table10 AS t10 ON t10.table10_id = t9.table10_id
) ON t9.table4_id = t4.table4_id
) ON t4.table4_id = t.table4_id
WHERE t2.custom_id =1
GROUP BY t2.custom_id;
+----------------------+-----------------------------------------+
| count( t.table1_id ) | sum( if( t.parent_id IS NULL , 0, 1 ) ) |
+----------------------+-----------------------------------------+
|                 1992 |                                       6 |
+----------------------+-----------------------------------------+
1 row in set (0.03 sec)

We expect both queries to return the same results because the left join should be a automatically converted to an inner join because of the where clause.
[1 May 2009 14:32] Pim Pronk
Database has been uploaded with ftp
[3 May 2009 1:17] Miguel Solorzano
Thank you for the bug report. Which file name have you uploaded?. Thanks in advance.
[3 May 2009 6:34] Pim Pronk
The filename I used is 'bug-data-44604.zip'. The zip contains a README.txt and a testdb.sql file.
You can also find it here: http://www.trentt.com/bug-data-44604.zip
[4 May 2009 5:58] Sveta Smirnova
Thank you for the report.

Verified as described. GROUP BY is not needed to repeat the problem: just WHERE. Everything other is required.

This is solved in version 6.0
[6 May 2009 7:38] Sveta Smirnova
Bug exists in version 5.4 as well.
[11 May 2009 14:27] Pim Pronk
Why is the target version of this bug 6.0-rc? 

I'd actually rather have NO data at all then data I cannot trust upon. Can you at least explain in detail why and when this bug is triggered?
[14 May 2009 19:37] Omer Barnir
Issue does not exist in 6.0 - it will not be fixed in 5.1
[14 May 2009 22:33] Pim Pronk
Ok, really. Wtf?!?! Can you at least give me an estimate on when MySQL-community v6.0 be released? 

If not and if in any other case, why won't you fix this in v5.1? Really, how can you take yourself seriously? Returning incorrent/incomplete data with queries which are not (yet) detectable is like saying don't use MySQL in an entreprise environment but use a database like Oracle or MSSQL instead.

If this update by Omer Barnir really represents MySQL AB/Sun Microsystems,Inc 's opinion, this is only prove for people saying opensource software is not yet mature or suitable for entreprise environments. And as a believer in open-source software I would really hate that.

Please, I almost would like to beg you: this bug is rated S1 (Critical) and D2 (Serious). At least give me a believable comment about why MySQL AB chooses not to fix this in v5.1 and/or v5.4.

And yes, if you didn't notice: I am very much disappointed by this very short and unsatisfying update/explanation..
[15 May 2009 11:31] Sergei Golubchik
The reason for "won't fix" - the bug was considered to be "low impact", it doesn't affect many queries or many users. And it was thought that the risk of adding new bugs to the GA version when fixing it overweight the benefits :(

But this bug is not present in 6.0.
And most probably it won't be in 5.4 either.
[16 May 2009 14:43] Pim Pronk
Still, when and why does this problem occur? Is this known or have you not looked into that? 

The reason I have to know this is that I then can prevent users from creating this kind of queries. Now we have detected this bug almost by accident, but it is possible that a lot of queries we are running are returning incorrect results. I just don't know this. For you it's maybe clear that in general this bug does not affect many queries. But in my situation with the information you have provided me so far it could be that if one of our query returns incorrect data all my queries are returning incorrect results. Do I have to check them all one by one or is there some kind of pattern when and why this bug occurs?

And again, when will this bug be fixed? If this bug is fixed in v5.4, when will v5.4 approx. become GA? If this bug is fixed in v6.0, when will v6.0 approx. become GA? You have to give me some date which I can tell the management...
[17 May 2009 9:48] Sergei Golubchik
Answering the second question: according to the new server development cycle
(http://forge.mysql.com/wiki/Development_Cycle and pdf presentation referenced there) a new milestone release (5.4, 5.5, 5.6, etc) is planned every 3-6 months.