Bug #75428 | InnoDB persistent statistics not persistent enough | ||
---|---|---|---|
Submitted: | 7 Jan 2015 1:24 | Modified: | 14 Jan 2015 17:18 |
Reporter: | Eric Bergen (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S2 (Serious) |
Version: | 5.6 | OS: | Any |
Assigned to: | CPU Architecture: | Any |
[7 Jan 2015 1:24]
Eric Bergen
[7 Jan 2015 1:27]
Eric Bergen
The persistent statistics back things like data_length and index_length in information_schema.tables. When the stats don't update these values aren't updated which makes them nearly useless.
[7 Jan 2015 15:32]
Morgan Tocker
Is there value in having a maximum age attached to statistics, as well as the 10% threshold? This age threshold could be checked when opening the table. My thought process is that as well as rows being modified outdating the statistics, there might be changes to the statistics collection (via changes to stats_sample_pages or as bugs are fixed etc.) The historical bug #43660 comes to mind.
[8 Jan 2015 0:09]
Harrison Fisk
More details also at: https://www.facebook.com/MySQLatFacebook/posts/10152729777891696
[14 Jan 2015 9:51]
Vasil Dimov
"Also provide a method for triggering a statistics update that doesn't include doing a full analyze table" -- but the statistics update that is done by InnoDB internally, when 10% of the table is changed, is exactly the same as to what happens when you run ANALYZE TABLE. Notice that by default the stats gathering (done internally by InnoDB or by ANALYZE TABLE) will sample 20 leaf pages from each index. For big tables this may be insufficient to get a good representation. It can be overridden on a per table basis with: ALTER TABLE t STATS_SAMPLE_PAGES=...;
[14 Jan 2015 17:18]
Eric Bergen
The counter that tracks the number of rows updated is not persistent (joke in the bug title) so when mysql is restarted the counter gets reset. If mysql is restarted before 10% of the rows are changed the statistics don't get updated. When you upgrade mysql as often as we do some tables never get updated. Analyze table doesn't work if you have a mix of long and short running queries. For example if you're taking a backup and run analyze table the analyze will successfully update the table statistics but every query that attempts to access the table until your backup finishes will get blocked on "Waiting for flush". My idea with the session level stats was to be able to get access to the newest values for data_length and index_length but without the pain of analyze table.
[26 Feb 2015 22:00]
Ben Krug
Here is a test case. I'm sure it can be simplified, I have a few unnecessary steps for demonstration, and it also requires 4 restarts of the database. Not sure of a good way to script that into test cases. (Note it is not 100% deterministic, since InnoDB stats are not 100% deterministic. But it worked in my trials.) DROP TABLE IF EXISTS stats; CREATE TABLE stats (a int primary key auto_increment, b int, key idx_b (b)); INSERT INTO stats (b) values (1),(2),(3),(4); show indexes from stats; --- (cardinality of 4 for b) UPDATE stats SET b=1; --- (flush twice, sometimes estimates are bad) flush tables; show indexes from stats; flush tables; show indexes from stats; --- (should see lower cardinality for b) UPDATE stats SET b=a; flush tables; show indexes from stats; flush tables; show indexes from stats; -- (cardinality should be back to 4 for b) --shutdown UPDATE STATS set b=5 where a=1; --shutdown UPDATE STATS set b=5 where a=2; --shutdown UPDATE STATS set b=5 where a=3; --shutdown UPDATE STATS set b=5 where a=4; flush tables; show indexes from stats; flush tables; show indexes from stats; -- (stats still show cardinality 4 for b, although b=5 in all rows now)