Bug #60447 | why are aggregate MAX(...) results different when using different "ORDER BY" ? | ||
---|---|---|---|
Submitted: | 13 Mar 2011 3:29 | Modified: | 17 Feb 2013 9:59 |
Reporter: | Jason Vas Dias | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server | Severity: | S4 (Feature request) |
Version: | 5.1.56 | OS: | Linux (mysql-server-5.1.56-1.el5.art) |
Assigned to: | CPU Architecture: | Any |
[13 Mar 2011 3:29]
Jason Vas Dias
[13 Mar 2011 4:08]
Jason Vas Dias
workaround : mysql> select id, msg, dt from a where ( dt = ( select max(dt) as dt from ( select * from a where msg="msg2" order by dt asc ) as al ) ) ; +----+------+---------------------+ | id | msg | dt | +----+------+---------------------+ | 3 | msg2 | 2011-03-13 04:04:46 | +----+------+---------------------+ 1 row in set (0.00 sec) mysql> select id, msg, dt from a where ( dt = ( select max(dt) as dt from ( select * from a where msg="msg2" order by dt desc ) as al ) ) ; +----+------+---------------------+ | id | msg | dt | +----+------+---------------------+ | 3 | msg2 | 2011-03-13 04:04:46 | +----+------+---------------------+ 1 row in set (0.00 sec) I think this issue needs better documentation and a better workaround.
[13 Mar 2011 5:03]
Jason Vas Dias
actually workaround should be : mysql> select id, msg, dt from a where ( ( dt = ( select max(dt) as dt from ( select * from a where msg="msg2" order by dt desc ) as al ) ) and (msg="msg2")) ; +----+------+---------------------+ | id | msg | dt | +----+------+---------------------+ | 3 | msg2 | 2011-03-13 04:04:46 | +----+------+---------------------+ ie. if you wanted to find the time and id of the latest "msg2" record , and not the id of the latest record. This is a nasty workaround ! Surely there must be a simpler way ?
[13 Mar 2011 10:48]
Valeriy Kravchuk
In your queries: select id, msg, max(dt) from ( select * from a order by dt desc ) as oa ; you use id and msg scalar columns, as well as max(dt) aggregate column, but you do not use GROUP BY. As a result, entire result set is considered one group, but in frames of this group id and message values are different. Thus MySQL server is free to use them from any row in the group, maybe the first one it gets from the derived table oa, and we see that order of rows in derived table oa matters... This is documented at http://dev.mysql.com/doc/refman/5.1/en/group-by-hidden-columns.html: "When using this feature, all rows in each group should have the same values for the columns that are ommitted from the GROUP BY part. The server is free to return any value from the group, so the results are indeterminate unless all values are the same." This is not a bug formally.
[13 Mar 2011 20:23]
Jason Vas Dias
Thanks to last commenter for the info - much appreciated ! I guess I agree that formally this is not a bug - or it is one of those "bugs which are features" or vice versa - but I really wish there was some log level setting that would throw an exception / error if the server ever finds itself having to return an "indeterminate result". And if anyone else finds themselves having to determine the latest record with a msg field of a given type, the final workaround SQL is : select id, msg, max(dt) from a where ( msg = ?) group by(msg) ; and without the "group by" the server finds itself returing an indeterminate result - it would be really nice if it could let us know that!
[13 Mar 2011 21:54]
Jason Vas Dias
My FINAL fix for this issue : DROP PROCEDURE IF EXISTS `LatestCommand` ; DELIMITER | CREATE PROCEDURE `LatestCommand`( IN Esn VARCHAR(10), IN Status ENUM("Pend","Sent","Done","Fail","Received","Confirm","Confirmed") , IN Command VARCHAR(32) ) BEGIN SELECT MAX(`id`) AS id, MAX(`sent_datetime`) AS sent_datetime , `change_datetime`, `sms_command`, `status`, `esn` as esn, `lm_sequence_id`, `lmu_ip`, `lm_message_type`, `lm_message_data`, `reply`, `client_sms_message` FROM ( SELECT * FROM `command_transaction` WHERE ( IF((Esn != NULL),(`command_transaction`.`esn` = Esn),TRUE) AND IF((Status != NULL),(`command_transaction`.`status` = Status),TRUE) AND IF((Command != NULL),(`command_transaction`.`sms_command` = Command),TRUE) ) ORDER BY `sent_datetime` ASC ) AS maxcmd; END | DELIMITER ;
[13 Mar 2011 22:19]
Jason Vas Dias
My FINAL workaround / fix for this issue : DROP PROCEDURE IF EXISTS `LatestCommand` ; DELIMITER | CREATE PROCEDURE `LatestCommand`( IN Esn VARCHAR(10), IN Status SET ("Pend","Sent","Done","Fail","Received","Confirm","Confirmed") , IN Command SET ("start","locate","unlock","trunk","sunroof","freeze","wild") ) BEGIN SELECT MAX(`id`) AS id, MAX(`sent_datetime`) AS sent_datetime , `change_datetime`, `sms_command`, `status`, `esn` as esn, `lm_sequence_id`, `lmu_ip`, `lm_message_type`, `lm_message_data`, `reply`, `client_sms_message` FROM ( SELECT * FROM `command_transaction` WHERE ( IF((Esn != NULL),(`command_transaction`.`esn`=Esn),TRUE) AND IF((Status != NULL),(`command_transaction`.`status` IN ( Status )),TRUE) AND IF((Command != NULL),(`command_transaction`.`sms_command` IN ( Command )),TRUE) ) GROUP BY `command_transaction`.`esn` ) AS maxcmd ; END | DELIMITER ;
[14 Mar 2011 8:11]
Valeriy Kravchuk
What about ONLY_FULL_GROUP_BY SQL mode (http://dev.mysql.com/doc/refman/5.5/en/server-sql-mode.html#sqlmode_only_full_group_by)?
[14 Apr 2011 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[15 May 2011 23:00]
Bugs System
No feedback was provided for this bug for over a month, so it is being suspended automatically. If you are able to provide the information that was originally requested, please do so and change the status of the bug back to "Open".
[28 May 2011 13:34]
Jason Vas Dias
Now I'm getting more weirdness even with "GROUP BY" : I have these two procedures defined : DROP PROCEDURE IF EXISTS `LatestCommand` ; DELIMITER | CREATE PROCEDURE `LatestCommand` ( IN Esn VARCHAR(10), IN Status ENUM('Pend','Sent','Done','Fail','Received','Confirm','Confirmed') , IN Command VARCHAR(32) ) BEGIN SELECT `id`, MAX(`sent_datetime`) AS sent_datetime, `change_datetime`, `sms_command`, `status`, `esn` as esn, `lm_sequence_id`, `lmu_ip`, `reply`, `lm_message_type`, `lm_message_data`, `client_sms_message` FROM `command_transaction` WHERE ( (`command_transaction`.`esn` = Esn) AND IF(Status IS NULL,TRUE,`command_transaction`.`status` = Status) AND IF(Command IS NULL,TRUE,`command_transaction`.`sms_command` = Command) ) GROUP BY `command_transaction`.`esn` ORDER BY `command_transaction`.`sent_datetime` DESC ; END | DELIMITER ; DROP PROCEDURE IF EXISTS `LatestCommands` ; DELIMITER | CREATE PROCEDURE `LatestCommands`(IN Esn VARCHAR(10), IN Status ENUM('Pend','Sent','Done','Fail','Received','Confirm','Confirmed') , IN Command VARCHAR(32) ) BEGIN SELECT `id`, `sent_datetime`, `change_datetime`, `sms_command`, `status`, `esn` as esn, `lm_sequence_id`, `lmu_ip`, `reply`, `lm_message_type`, `lm_message_data`, `client_sms_message` FROM `command_transaction` WHERE ( (`command_transaction`.`esn` = Esn) AND IF(Status IS NULL,TRUE,`command_transaction`.`status` = Status) AND IF(Command IS NULL,TRUE,`command_transaction`.`sms_command` = Command) ) ORDER BY `sent_datetime` DESC ; END | DELIMITER ; And LatestCommands, WITH an ORDER BY, returns different results to LatestCommand, WITHOUT and ORDER BY : $ echo 'call LatestCommand("1431001062",NULL,NULL);' | mysql --database XXXXXXXX id sent_datetime change_datetime sms_command `status` esn lm_sequence_id lmu_ip reply lm_message_type lm_message_data client_sms_message 6 2011-05-28 02:49:45 2010-11-18 12:00:06 locate NULL 1431001062 114 10.16.26.1:20510 0 LocateReport 830514310010620101020800724CE569A54CE569A51F148908C06B23E20000C3FA000000000030090002D0FFBD2F080E000000 GPS CURRENT STATIONARY\n Lat:52.1439496N\nLong:-106.6720286W\nhttps://www.text2car.com/@kPa\n $ echo 'call LatestCommands("1431001062",NULL,NULL);' | mysql --database text2car | head -n 2 id sent_datetime change_datetime sms_command `status` esn lm_sequence_id lmu_ip reply lm_message_type lm_message_data client_sms_message 61605 2011-05-28 02:49:45 2011-05-27 21:49:45 EVENT:REBOOT NULL 1431001062 1 10.16.26.98:20510 0 EventReport 830514310010620101010200014DE062A74DE0629F1F14B92AC077E7A10000000000000000000000260000FF8F00000E080A520000 REBOOT\nGPS CURRENT STATIONARY\n Lat:52.1451818N\nLong:-106.5883743W\nhttps://www.text2car.com/@mXc\n
[28 May 2011 13:44]
Jason Vas Dias
So even when LatestCommand is defined as in previous comment, or like this : -------------- DROP PROCEDURE IF EXISTS `LatestCommand` -------------- -------------- CREATE PROCEDURE `LatestCommand` ( IN Esn VARCHAR(10), IN Status ENUM('Pend','Sent','Done','Fail','Received','Confirm','Confirmed') , IN Command VARCHAR(32) ) BEGIN SELECT `id`, MAX(`sent_datetime`) AS sent_datetime, `change_datetime`, `sms_command`, `status`, `esn` as esn, `lm_sequence_id`, `lmu_ip`, `reply`, `lm_message_type`, `lm_message_data`, `client_sms_message` FROM `command_transaction` WHERE ( (`command_transaction`.`esn` = Esn) AND IF(Status IS NULL,TRUE,`command_transaction`.`status` = Status) AND IF(Command IS NULL,TRUE,`command_transaction`.`sms_command` = Command) ) GROUP BY `command_transaction`.`esn` ; END -------------- ie. with or without any ORDER BY , the query incorrectly and silently returns the maximum 'sent_datetime' value overall records and any values ( I think from the FIRST record returned by the query ) for all other fields . How can I get it to return just the values from the record having `sent_datetime` equal to MAX(`sent_datetime`) ?
[28 May 2011 13:45]
Jason Vas Dias
And adding 'HAVING `sent_datetime` = MAX(`sent_datetime`)' to LatestCommand has no effect.
[28 May 2011 13:57]
Jason Vas Dias
OK, I think I should raise another MAJOR bug about this - this query now enters an infinite loop : $ mysql --database text2car -v < routine.LatestCommand.sql -------------- DROP PROCEDURE IF EXISTS `LatestCommand` -------------- -------------- CREATE PROCEDURE `LatestCommand` ( IN Esn VARCHAR(10), IN Status ENUM('Pend','Sent','Done','Fail','Received','Confirm','Confirmed') , IN Command VARCHAR(32) ) BEGIN SELECT * FROM `command_transaction` WHERE ( ( `id`, `sent_datetime` ) IN (SELECT MAX(`id`) AS `id`, MAX(`sent_datetime`) AS sent_datetime FROM `command_transaction` WHERE ( (`command_transaction`.`esn` = Esn) AND IF(Status IS NULL,TRUE,`command_transaction`.`status` = Status) AND IF(Command IS NULL,TRUE,`command_transaction`.`sms_command` = Command) ) GROUP BY `command_transaction`.`esn` ) ); END -------------- [ root@:~/src/sql !] $ echo 'call LatestCommand("1431001062",NULL,NULL);' | mysql --database text2car Ctrl-C -- sending "KILL QUERY 29405" to server ... Ctrl-C -- query aborted. ERROR 1317 (70100) at line 1: Query execution was interrupted
[28 May 2011 14:00]
Valeriy Kravchuk
Please, send the results of EXPLAIN for the problematic query, with parameters replaced by their values from CALL statement.
[28 May 2011 14:33]
Jason Vas Dias
thanks for responding, Valeriy - Is this what you mean : mysql> EXPLAIN SELECT * FROM `command_transaction` WHERE ( ( `id`, `sent_datetime` ) IN (SELECT MAX(`id`) AS `id`, MAX(`sent_datetime`) AS sent_datetime FROM `command_transaction` WHERE ( (`command_transaction`.`esn` = "1431001062") ) GROUP BY `command_transaction`.`esn` ORDER BY `command_transaction`.`sent_datetime` DESC ) ); -------------- EXPLAIN SELECT * FROM `command_transaction` WHERE ( ( `id`, `sent_datetime` ) IN (SELECT MAX(`id`) AS `id`, MAX(`sent_datetime`) AS sent_datetime FROM `command_transaction` WHERE ( (`command_transaction`.`esn` = "1431001062") ) GROUP BY `command_transaction`.`esn` ORDER BY `command_transaction`.`sent_datetime` DESC ) ) -------------- +----+--------------------+---------------------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+--------------------+---------------------+------+---------------+------+---------+------+-------+-------------+ | 1 | PRIMARY | command_transaction | ALL | NULL | NULL | NULL | NULL | 26065 | Using where | | 2 | DEPENDENT SUBQUERY | command_transaction | ALL | NULL | NULL | NULL | NULL | 26065 | Using where | +----+--------------------+---------------------+------+---------------+------+---------+------+-------+-------------+ 2 rows in set (0.00 sec) mysql>
[28 May 2011 14:36]
Jason Vas Dias
this also enters infinite loop ( waited for 2mins, saw mysqld process growing, pressed CTRL+C ): mysql> SELECT * FROM `command_transaction` WHERE ( ( `id`, `sent_datetime` ) IN (SELECT MAX(`id`) AS `id`, MAX(`sent_datetime`) AS sent_datetime FROM `command_transaction` WHERE ( (`command_transaction`.`esn` = "1431001062") ) GROUP BY `command_transaction`.`esn` ORDER BY `command_transaction`.`sent_datetime` DESC ) ); -------------- SELECT * FROM `command_transaction` WHERE ( ( `id`, `sent_datetime` ) IN (SELECT MAX(`id`) AS `id`, MAX(`sent_datetime`) AS sent_datetime FROM `command_transaction` WHERE ( (`command_transaction`.`esn` = "1431001062") ) GROUP BY `command_transaction`.`esn` ORDER BY `command_transaction`.`sent_datetime` DESC ) ) -------------- Ctrl-C -- sending "KILL QUERY 29473" to server ... Ctrl-C -- query aborted. ERROR 1317 (70100): Query execution was interrupted mysql>
[28 May 2011 14:39]
Valeriy Kravchuk
Based on the execution plan, your query had to scan entire table, and for each row selected it has to scan the entire table again. This gives: mysql> select 26065*26065; +-------------+ | 26065*26065 | +-------------+ | 679384225 | +-------------+ 1 row in set (0.00 sec) that is, 679+ millions of individual row reads... Surely it takes some time for server or process. Leave it working for a day :) But where is the MySQL server bug here?
[28 May 2011 15:14]
Jason Vas Dias
WORKAROUND FOUND : $ mysql -v --database text2car < routine.LatestCommand.sql -------------- DROP PROCEDURE IF EXISTS `LatestCommand` -------------- -------------- CREATE PROCEDURE `LatestCommand` ( IN Esn VARCHAR(10), IN Status ENUM('Pend','Sent','Done','Fail','Received','Confirm','Confirmed') , IN Command VARCHAR(32) ) BEGIN SELECT * FROM `command_transaction` WHERE ( `id` = (SELECT MAX(`id`) AS `id` FROM `command_transaction` WHERE ( (`command_transaction`.`esn` = Esn) AND IF(Status IS NULL,TRUE,`command_transaction`.`status` = Status) AND IF(Command IS NULL,TRUE,`command_transaction`.`sms_command` = Command) ) GROUP BY `command_transaction`.`esn` ORDER BY `command_transaction`.`sent_datetime` DESC ) ); END -------------- $ echo 'call LatestCommand("1431001062",NULL,NULL);' | mysql --database text2car id esn lm_sequence_id status sent_datetime change_datetime reply sms_command source_msisdn dest_msisdn lmu_ip lm_message_type lm_message_data client_sms_message 61605 1431001062 1 Received 2011-05-28 02:49:45 2011-05-27 21:49:45 0 EVENT:REBOOT 13063719079 10.16.26.98:20510 EventReport 830514310010620101010200014DE062A74DE0629F1F14B92AC077E7A10000000000000000000000260000FF8F00000E080A520000 REBOOT\nGPS CURRENT STATIONARY\n Lat:52.1451818N\nLong:-106.5883743W\nhttps://www.text2car.com/@mXc\n
[28 May 2011 15:20]
Jason Vas Dias
Why does it appear to be so impossible to achieve what I'm trying to do here in a single query without sub-queries ?
[28 May 2011 15:22]
Jason Vas Dias
Why is the server allowing a query that it knows contains potentially random data to proceed ?
[17 Feb 2013 9:59]
MySQL Verification Team
You need to set sql_mode='ONLY_FULL_GROUP_BY' and an error will be thrown. At least in 5.6, the installer(s) create a my.cnf containing this setting by default.