Bug #36559 | INSERT .. SELECT takes more time to complete on versions 5.0.46 and newer | ||
---|---|---|---|
Submitted: | 7 May 2008 11:15 | Modified: | 10 May 2010 18:37 |
Reporter: | Victoria Reznichenko | Email Updates: | |
Status: | Not a Bug | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S1 (Critical) |
Version: | 5.0.46-5.0.60 | OS: | Any |
Assigned to: | CPU Architecture: | Any | |
Tags: | regression |
[7 May 2008 11:15]
Victoria Reznichenko
[7 May 2008 11:26]
MySQL Verification Team
File bug-36559.zip uploaded to ftp server.
[7 May 2008 12:55]
Robert Krzykawski
I feel that i have to add a comment to this bug, as it is affecting us very badly. It seems like it's verified that versions prior to 5.0.40 does not have this bug, so for now we solved this by running 5.0.32 (as we did before, but were advised to upgrade to 5.0.56 to solve some of our issues). It's not only that it is 8 times slower, it uses a lot more resources to complete. On a 5.0.32 server, issuing a select query as supplied in this bug causes 0,06 reads/s (during last 7 seconds in innodb status). The same query issued on the same data on 5.0.56 causes 2776 reads/s (also during last 7 seconds in innodb status). How many times more resources are used? 2776/0,06=46266,666... Now, what if you have a system doing 150-200 queries like this/s... We have large SAN's to support this particular system. While 5.0.32 gets an IO wait of approx 1% during peak hours, we get 99% IO wait on 5.0.56 during off peak hours. Converting those numbers to megabytes per second read from the SAN makes 45mb/s reads during off peak hours on 5.0.56 while 5.0.32 gives us around 1mb/s read during peak hours. We can conclude that 5.0.56 is unusable if you do sub selects like we do (supplied in the report) Anyone experiencing the same problem? We can't be the first.. We didn't find a good workaround for this issue other than downgrading. The subquery is really plain. There is just no other "good" way of issuing this query.. We can probably solve this in the application, but that would be kindof dirty..
[15 Sep 2009 12:56]
Olav Sandstå
I have done the following tests using the query from this bug report on MySQL 5.0.37 and MySQL 5.0.51a: The table that the query is run against contains 100.000 records. The time to execute the query (according to mysql) is respectively 0.01s (on 5.0.37) and 0.13s (on 5.0.51a). The output from "show engine innodb status" shows that InnoDB during this time reads 1 row using 5.0.37 and about 100.000 rows if using 5.0.51a. This indicates that instead of selecting a single row it now reads the entire table. To further illustrate the difference better I did: * Increased the size of the table to 1 million rows. Running the same query now still takes 0.01s using 5.0.37 but has increased to 1.63s using 5.0.51. Output from innodb status shows that the number of rows read is about 1 million. * Increased the size of the table to 10 million rows. The query still takes 0.01s using 5.0.37. Using 5.0.51a the query now takes about 18 seconds.
[21 Sep 2009 14:19]
Olav Sandstå
I have verified that this performance change was introduced by the following change committed as a fix for Bug#29095 "Insert into a select from a join b using(c) order by b.auto_id desc limit 1 fail": --- 1.533/sql/sql_select.cc 2007-06-20 13:06:21 +05:00 +++ 1.534/sql/sql_select.cc 2007-06-24 01:19:55 +05:00 @@ -1347,8 +1347,7 @@ JOIN::optimize() there are aggregate functions, because in all these cases we need all result rows. */ - ha_rows tmp_rows_limit= ((order == 0 || skip_sort_order || - test(select_options & OPTION_BUFFER_RESULT)) && + ha_rows tmp_rows_limit= ((order == 0 || skip_sort_order) && !tmp_group && !thd->lex->current_select->with_sum_func) ? select_limit : HA_POS_ERROR; By reversing this change the performance is back to what it was earlier and the number of records read from InnoDB goes down from a "full table scan" to just a single record. Need to understand this change better in order to see if it can be done in a way that does not influence the performance of this type of queries.
[10 May 2010 18:37]
Omer Barnir
There is no bug here. The baseline value mentioned above (0.03) is not a baseline that always existed in the code but an improvement that was done with the fix to bug#9676. This improvement caused a regression in other areas and had to be removed removed (fix to bug#29095). As such this bug described existing performance and not a performance regression.