Bug #25640 InnoDB: Analyze Table Should Allow User Selection of Index Dives
Submitted: 15 Jan 2007 22:07 Modified: 20 Jun 2010 17:23
Reporter: Bill Adams Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S4 (Feature request)
Version:All (5.0.27) OS:Linux (Linux, All)
Assigned to: Vasil Dimov CPU Architecture:Any
Tags: analyze table, cardinality, Contribution, feature request, innodb

[15 Jan 2007 22:07] Bill Adams
Description:
Running ANALYZE TABLE on InnoDB tables does 10 random dives into each index tree to determine cardinality. This seems to break on occasion on very long tables. (In my case a table with 8.5 billion records and the indexed column should average around 5-10k records per value.)

It would be nice to be able to run a query like: ANALYZE TABLE tbl DIVES=300; Far easier would be to implement a system variable that controls the number of dives for all tables.

Also, I think the 5.0 documentation has a bug as it says 10 dives but innobase/btr/btr0cur.c defines BTR_KEY_VAL_ESTIMATE_N_PAGES as 8.

See: http://dev.mysql.com/doc/refman/5.0/en/innodb-restrictions.html

Maybe I am misreading the code...

How to repeat:
Install Server.

Suggested fix:
Taking the easier route, adding a system variable to control the number of dives seems to be the right way to go. One could run "SET innodb_analyze_key_dives=20; ANALYZE TABLE tbl;" to achieve the same effect.
[15 Jan 2007 22:09] Bill Adams
Proposed Patch #1 For Dives Server Variable (Untested)

Attachment: key-dives.diff (text/x-patch), 5.19 KiB.

[30 Jan 2007 22:37] James Day
Bill, those dives are also done when the table is opened or when SHOW TABLE STATUS is done (and for other related operations).

What I'd like to see is the code in btr_estimate_number_of_different_key_vals starting at a smaller value of BTR_KEY_VAL_ESTIMATE_N_PAGES and looping the dives until it gets an estimated cardinality that doesn't change much (maybe BTR_KEY_VAL_ESTIMATE_MARGIN starting at 20%), plus at least some transitions or a limit on dives if there are still no transitions found. This way it would be fast (fewer dives) for indexes with high cardinality and only slow for those that are large (in rows or key size) with low cardinality.
[6 Feb 2007 18:00] Bill Adams
I will see if I can come up with a patch for that. Perhaps with a configurable parameter for the maximum number of dives allowed.
[2 Mar 2007 21:09] Bill Adams
Variable Key Dives Try #1 (Untested)

Attachment: variable-key-dives-1.patch (text/x-patch), 3.38 KiB.

[2 Mar 2007 21:34] Bill Adams
A couple of comments about the variable-key-dives-1.patch

The code compiles but I have not yet run it against a innodb table. Use at your own risk.

I am not sure that this patch will actually fix my problem. For the particular problem table I have, which is very big (13 billion rows), sometimes the cardinality gets set to eighteen (!). I can run analyze table again and it goes to 6.5M which is about right (on average about 8k rows in each index value). The problem is that somehow and only on occasion the random dives are hitting all the wrong index trees and giving a cardinality that is too low. That is why I want to be able to configure the minimum number of dives at run time.

So maybe a combination of the two patches?
[26 Jul 2007 19:08] Ian Kinnear
Hello, I'm wondering, were you able to test this patch? Is there any other solution?

I am having the same issue and it is a critical bug for me, because as soon as mysql starts to do a table scan after it decides not to use any index, all subsequent application select queries queue up behind the query (the query is a join, which unfortunately I can't change). Eventually we reach the maximum number of connections permitted which we have allowed and our app starts to return failure codes to the user. 
When analyze tables works and it uses the indexes, even if the query runs for a while other selects seem to be permitted and so the application continues to run.
[31 Jul 2007 18:06] Bill Adams
I have not heard anything back from the folks at MySQL. But I will try to revive my patch to make the number of key dives a configurable option because I still think that will best fix the problem.
[23 Jan 2008 14:09] Heikki Tuuri
Vasil should look at implementing this feature.
Regards, Heikki
[18 Mar 2009 0:23] Ken Jacobs
The new parameter innodb_stats_sample_pages was introduced in the InnoDB Plugin 1.0.2 to enable control of index cardinality estimates.
[18 Mar 2009 0:24] Ken Jacobs
The new parameter innodb_stats_sample_pages was introduced in the InnoDB Plugin 1.0.2 to enable control of index cardinality estimates.
[15 Jun 2010 8:18] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100615080459-smuswd9ooeywcxuc) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (merge vers: 5.1.47) (pib:16)
[15 Jun 2010 8:35] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615080558-cw01bzdqr1bdmmec) (version source revid:mmakela@bk-internal.mysql.com-20100415070122-1nxji8ym4mao13ao) (pib:16)