| Bug #32400 | Complex SELECT query returns correct result only on some occasions | ||
|---|---|---|---|
| Submitted: | 14 Nov 2007 22:49 | Modified: | 14 Jan 2008 19:24 |
| Reporter: | huji huji | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Optimizer | Severity: | S2 (Serious) |
| Version: | 5.0.41/5.0BK/5.1BK | OS: | Any |
| Assigned to: | Georgi Kodinov | CPU Architecture: | Any |
| Tags: | bad result, complex, nested, SELECT | ||
[15 Nov 2007 2:46]
MySQL Verification Team
Thank you for the bug report. Server 5.0 and 5.1 affected, server 4.1 gives
expected result:
[miguel@skybr 5.1]$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.23-rc-debug Source distribution
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> create index temp_index
-> on temp(rev_timestamp);
Query OK, 1939 rows affected (0.03 sec)
Records: 1939 Duplicates: 0 Warnings: 0
mysql> select rev_page as pg2, min(rev_timestamp) as m2 from temp
-> where rev_timestamp > (select min(rev_timestamp) from temp where rev_page = pg2)
-> and rev_page < 4
-> group by pg2
-> order by pg2 asc;
Empty set (6.22 sec)
mysql>
[miguel@skybr 4.1]$ bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 2 to server version: 4.1.24-debug
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
mysql> create index temp_index
-> on temp(rev_timestamp);
Query OK, 1939 rows affected (0.01 sec)
Records: 1939 Duplicates: 0 Warnings: 0
mysql> select rev_page as pg2, min(rev_timestamp) as m2 from temp
-> where rev_timestamp > (select min(rev_timestamp) from temp where rev_page = pg2)
-> and rev_page < 4
-> group by pg2
-> order by pg2 asc;
+------+----------------+
| pg2 | m2 |
+------+----------------+
| 0 | 20070430155538 |
| 1 | 20060113175304 |
| 2 | 20060113175851 |
| 3 | 20060730182601 |
+------+----------------+
4 rows in set (3.07 sec)
mysql>
[15 Nov 2007 2:50]
MySQL Verification Team
Data file to populate table
Attachment: bug32400.zip (application/zip, text), 8.28 KiB.
[15 Nov 2007 2:53]
MySQL Verification Team
create table temp( rev_page int(8) unsigned, rev_timestamp varchar(14) ); load data infile 'f168cd947.txt' into table temp; create index temp_index on temp(rev_timestamp); select rev_page as pg2, min(rev_timestamp) as m2 from temp where rev_timestamp > (select min(rev_timestamp) from temp where rev_page = pg2) and rev_page < 4 group by pg2 order by pg2 asc;
[20 Nov 2007 17:18]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/38157 ChangeSet@1.2575, 2007-11-20 19:18:21+02:00, gkodinov@magare.gmz +11 -0 Bug #32400: Complex SELECT query returns correct result only on some occasions Referencing an element from the SELECT list in a WHERE clause is not permitted. The namespace of the WHERE clause is the table columns only. This was not enforced correctly when resolving outer references in sub-queries. Fixed by not allowing references to aliases in a sub-query in WHERE.
[20 Nov 2007 20:02]
huji huji
Thanks for your reply here. If I don't get things wrong here, the solution for this problem has been not to allow running a query like what I posted above. In this case, I would like to know if there is an alternative way for getting the "second minimum" value, like what I described above. Please advise.
[22 Nov 2007 16:04]
Georgi Kodinov
Hi, What's prohibited is the use of aliases in WHERE. Here's how to rewrite your query for the second minimum without aliases in WHERE: select rev_page pg2, min(rev_timestamp) as m2 from temp temp_outer where rev_timestamp > (select min(rev_timestamp) from temp where rev_page = temp_outer.rev_page) and rev_page < 4 group by pg2 order by pg2 asc;
[22 Nov 2007 19:58]
huji huji
Thank you Georgi, but the command you offered here returns "nothing" again :( Use the same temp table provided above. I think even preventing usage of aliases in WHERE hasn't solved the issue completely.
[23 Nov 2007 13:49]
Georgi Kodinov
Please try with a newer version. With my fix applied I'm getting: create table temp( rev_page int(8) unsigned, rev_timestamp varchar(14) ) -------------- Query OK, 0 rows affected (0.00 sec) -------------- load data infile 'f168cd947.txt' into table temp -------------- Query OK, 1939 rows affected, 1938 warnings (0.01 sec) Records: 1939 Deleted: 0 Skipped: 0 Warnings: 1938 -------------- create index temp_index on temp(rev_timestamp) -------------- Query OK, 1939 rows affected (0.01 sec) Records: 1939 Duplicates: 0 Warnings: 0 -------------- select rev_page as pg2, min(rev_timestamp) as m2 from temp outr where rev_timestamp > (select min(rev_timestamp) from temp where rev_page = outr.rev_page) and rev_page < 4 group by pg2 order by pg2 asc -------------- +------+----------------+ | pg2 | m2 | +------+----------------+ | 0 | 20070430155538 | | 1 | 20060113175304 | | 2 | 20060113175851 | | 3 | 20060730182601 | +------+----------------+ 4 rows in set (2.56 sec)
[23 Nov 2007 17:29]
huji huji
I repeated the process you said, and used the same query. What I get is just like yours, except for the last step (nested select) which returns: pg2 m2 ------ ----- 0 record(s) selected [Fetch MetaData: 0/ms] [Fetch Data: 0/ms] [Executed: 11/23/07 8:58:55 PM GMT+03:30 ] [Execution: 4797/ms] Ran on Windows XP SP2 and MySQL 5.0.41-community-nt
[23 Nov 2007 17:41]
huji huji
I repeated the same procedure on another machine, and it worked:
mysql> create table temp( rev_page int(8) unsigned, rev_timestamp varchar(14) );Query OK, 0 rows affected (0.04 sec)
mysql> load data infile 'f168cd947.txt' into table temp;
Query OK, 1939 rows affected, 1938 warnings (0.09 sec)
Records: 1939 Deleted: 0 Skipped: 0 Warnings: 1938
mysql> create index temp_index on temp(rev_timestamp);
Query OK, 1939 rows affected (0.11 sec)
Records: 1939 Duplicates: 0 Warnings: 0
mysql> select rev_page as pg2, min(rev_timestamp) as m2 from temp outr
-> where rev_timestamp > (select min(rev_timestamp) from temp where rev_page =
-> outr.rev_page)
-> and rev_page < 4
-> group by pg2
-> order by pg2 asc;
+------+----------------+
| pg2 | m2 |
+------+----------------+
| 0 | 20070430155538 |
| 1 | 20060113175304 |
| 2 | 20060113175851 |
| 3 | 20060730182601 |
+------+----------------+
4 rows in set (2.11 sec)
mysql> select version();
+------------------------------+
| version() |
+------------------------------+
| 5.0.38-Ubuntu_0ubuntu1.1-log |
+------------------------------+
1 row in set (0.01 sec)
I'm not sure why it still exists on my Windows MySQL installation. Please also notice that the version of the database differs in Windows and Ubuntu installations.
Seeking your advice,
huji
[14 Dec 2007 8:14]
Bugs System
Pushed into 5.0.54
[14 Dec 2007 8:17]
Bugs System
Pushed into 5.1.23-rc
[14 Dec 2007 8:21]
Bugs System
Pushed into 6.0.5-alpha
[14 Jan 2008 19:24]
Jon Stephens
Documented in the 5.0.54, 5.1.23, and 6.0.5 changelogs as follows:
Referencing within a subquery an alias used in the
SELECT list of the outer query was incorrectly permitted.

Description: The structure of the table I quried is like this (table name is "revision"): Field Type Null Key Default Extra -------------- ------------------- ------- ------ ---------- -------- ... rev_page int(8) unsigned NO 0 ... rev_timestamp varchar(14) NO ... For each rev_page, there are several records, which differ in their rev_timestamp. I could simply get the minimum value of rev_timestamp for each rev_page like this: select rev_page as pg1, min(rev_timestamp) as m1 from revision group by pg1 Then, I tried to get the "second minimum" value of rev_timestamp for each rev_page, like this: select rev_page as pg2, min(rev_timestamp) as m2 from revision where rev_timestamp > (select min(rev_timestamp) from revision where rev_page = pg2) group by pg2 This is where the bug appeared. The query returns zero records on most occasions (and it only takes part of a second for it to execute, and it never gets an error message). However, if you re-execute the query over and over again, on some occasions it takes a few seconds, and returns correct results like this: pg2 m2 ------ -------------- 0 20070430155538 1 20060113175304 2 20060113175851 3 20060113175520 4 20060113175919 5 20060113180021 6 20060113180645 7 20060113203013 ... How to repeat: Create a table like this: create table temp( rev_page int(8) unsigned, rev_timestamp varchar(14) ); Then import the data from the file I've uploaded to the following link: http://pastebin.com/pastebin.php?dl=f168cd947 It is 33KB in size, and contains 1939 lines of data. After importing this into the temp table, create an index: create index temp_index on temp(rev_timestamp); finally, try this minimal query: select rev_page as pg2, min(rev_timestamp) as m2 from temp where rev_timestamp > (select min(rev_timestamp) from temp where rev_page = pg2) and rev_page < 4 group by pg2 order by pg2 asc; Expected output is as follows: pg2 m2 ------ -------------- 0 20070430155538 1 20060113175304 2 20060113175851 3 20060730182601 If you get this: pg2 m2 ------ ----- 0 record(s) selected [Fetch MetaData: 0/ms] [Fetch Data: 0/ms] then you are seeing what I'm reporting as a bug here.