Bug #63905 | handler_read_next is suprisingly high and slows queries | ||
---|---|---|---|
Submitted: | 2 Jan 2012 23:47 | Modified: | 16 Jan 2012 18:34 |
Reporter: | Philip Gladstone | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Optimizer | Severity: | S5 (Performance) |
Version: | 5.1.60 | OS: | Windows |
Assigned to: | CPU Architecture: | Any |
[2 Jan 2012 23:47]
Philip Gladstone
[3 Jan 2012 0:07]
MySQL Verification Team
Please try version 5.1.60 and comment the results her. Thanks.
[3 Jan 2012 2:32]
Philip Gladstone
I just tried with 5.1.60 on Windows and I get the same results.
[3 Jan 2012 2:43]
Philip Gladstone
I just tried 5.5.19 on windows (using MyISAM storage engine) and it gives the same large numbers (though there are now 7 status variables that match 'handler_read%')
[3 Jan 2012 2:47]
Philip Gladstone
[You have to remove some of the sql_no_cache clauses to make the test case work in 5.5.19. It appears that SQL_NO_CACHE is not permitted in subqueries]
[3 Jan 2012 11:03]
MySQL Verification Team
Please print here your results. Thanks.
[3 Jan 2012 13:39]
Philip Gladstone
The results from running the command file on 5.1.60 on Windows. I have annotated the output. Query f2 = 1 3 Variable_name Value Handler_read_first 0 Handler_read_key 1 * A single key read is all that is needed Handler_read_next 0 Handler_read_prev 0 Handler_read_rnd 0 Handler_read_rnd_next 0 Query f2 = 10 10 Variable_name Value Handler_read_first 0 Handler_read_key 1 * Same here Handler_read_next 0 Handler_read_prev 0 Handler_read_rnd 0 Handler_read_rnd_next 0 Query f2 in (1,10) 10 Variable_name Value Handler_read_first 0 Handler_read_key 2 * Two key reads is all that is *needed* Handler_read_next 1536 * but it appears to be scanning some table Handler_read_prev 0 Handler_read_rnd 0 Handler_read_rnd_next 0 Complex union 10 Variable_name Value Handler_read_first 0 Handler_read_key 2 * We can eliminate the scan with the nasty UNION Handler_read_next 0 Handler_read_prev 0 Handler_read_rnd 0 Handler_read_rnd_next 6 * I'm not sure about these, but the number is small. Query f3 = a (using subquery) 3 Variable_name Value Handler_read_first 0 Handler_read_key 2 * Should be two -- one for each table Handler_read_next 1 * Not sure about this, but number is small Handler_read_prev 0 Handler_read_rnd 0 Handler_read_rnd_next 0 Query f3 = a (using join) 3 Variable_name Value Handler_read_first 0 Handler_read_key 2 * one for each table Handler_read_next 1025 * What is it scanning? Handler_read_prev 0 Handler_read_rnd 0 Handler_read_rnd_next 0 Query f3 = c (using subquery) 10 Variable_name Value Handler_read_first 0 Handler_read_key 2 Handler_read_next 1 Handler_read_prev 0 Handler_read_rnd 0 Handler_read_rnd_next 0 Query f3 = c (using join) 10 Variable_name Value Handler_read_first 0 Handler_read_key 2 Handler_read_next 513 Handler_read_prev 0 Handler_read_rnd 0 Handler_read_rnd_next 0 Query f2 IN subquery 10 Variable_name Value Handler_read_first 1 Handler_read_key 3072 * Seems to be doing a lot of key operations Handler_read_next 3584 * and even more scanning! Handler_read_prev 0 Handler_read_rnd 0 Handler_read_rnd_next 0 Query using complex join 10 Variable_name Value Handler_read_first 0 Handler_read_key 4 * We can get the numbers small with the UNION approach Handler_read_next 2 Handler_read_prev 0 Handler_read_rnd 0 Handler_read_rnd_next 6
[3 Jan 2012 13:40]
Philip Gladstone
SQL commands to demonstrate problem. Run against empty database.
Attachment: ss.sql (application/octet-stream, text), 4.32 KiB.
[14 Jan 2012 9:58]
Sveta Smirnova
Thank you for the feedback. Hanler_read_next is increased when engine reads next row from index. So if index is used for f2 in (1,10) it has to read all index entries with f2 which satisfy condition, then find max value. In the test case provided there are 1536 such entries: select sql_no_cache count(f2) as 'Query f2' from t1 where f2 in (1,10); Query f2 1536 So having 1536 in Handler_read_next is expected. This is not whole index, neither data file. Regarding to cases when f2=CONSTANT optimizer uses special optimization: explain extended select sql_no_cache max(f1) as 'Query f2 = 10' from t1 where f2 = 10; id select_type table type possible_keys key key_len ref rows filtered Extra 1 SIMPLE NULL NULL NULL NULL NULL NULL NULL NULL Select tables optimized away This is why you don't see such increase in this case. So this is not a bug.
[15 Jan 2012 16:32]
Philip Gladstone
I'd like to reopon this as a performance enhancement request. It seems that the OR handling in the optimizer is rather weak. Is it always the case that SELECT x FROM y WHERE a OR b should be rewritten (for performance) as SELECT x FROM y WHERE a UNION SELECT x FROM y WHERE b where 'a' and 'b' are mutually exclusive?
[16 Jan 2012 18:34]
Sveta Smirnova
Thank you for the feedback. Verified as performance enhancement request. > Is it always the case that > > SELECT x FROM y WHERE a OR b > > should be rewritten (for performance) as > > SELECT x FROM y WHERE a UNION SELECT x FROM y WHERE b > > where 'a' and 'b' are mutually exclusive? Not, I don't think this would be a case, say, if a or b query unique key. Also I don't think this would be faster for more than 2 conditions even in setup similar to discussed in this report.