Bug #43664 SQLTables is very slow due to SHOW TABLE STATUS
Submitted: 16 Mar 2009 6:56 Modified: 24 May 2010 13:26
Reporter: Bogdan Degtyariov Email Updates:
Status: Closed Impact on me:
None 
Category:Connector / ODBC Severity:S4 (Feature request)
Version:5.1 OS:Any
Assigned to: Bogdan Degtyariov CPU Architecture:Any
Tags: information_schema, show table status, SQLTables

[16 Mar 2009 6:56] Bogdan Degtyariov
Description:
SQLTables() function requires ~2-4 minutes to return the list of 400 tables in a big database. The "SHOW TABLE STATUS" query used by SQLTables() is extremely slow for InnoDB tables with the big number of rows because the query is calculating the approximate number of rows in each table. In additional - the results cannot be cached due to non-deterministic nature of the result set (the row count is re-calculated every time), which makes the situation even worse.

How to repeat:
Create the database with 400 InnoDB tables. Each table has to contain at least 5 million rows. The more tables and records are in the database - the slower will be the result. This behavior is repeatable in mysql command line by running "SHOW TABLE STATUS".

Suggested fix:
Use Information_Schema
See the attached file with the patch
[16 Mar 2009 19:46] Jim Winstead
the I_S code should be conditional on the server version, unless we're choosing to abandon 4.1 support.

the query for when the catalog is or isn't specified needs to be slightly different -- if it is not specified, we need to use TABLE_SCHEMA =, and when it is specified, TABLE_SCHEMA LIKE. otherwise a _ or % in the current database name could cause screwy results.

using I_S does actually increase what SQLTables() can handle -- the old code can't handle a wildcard for the catalog name. a test or two should be added for this.
[17 Apr 2009 11:41] Bogdan Degtyariov
patch version 3

Attachment: I_S_SQLTables_v3.diff (application/octet-stream, text), 15.68 KiB.

[28 Apr 2009 0:54] Jim Winstead
there is one bit of redundancy in the show_views block of code, but it looks great aside from that.
[16 Jun 2009 12:29] Bogdan Degtyariov
patch version 6 (all tests pass). SQLForeignKeys() fixed

Attachment: I_S_SQLTables_v6.diff (application/octet-stream, text), 15.91 KiB.

[4 Dec 2009 18:09] Jim Winstead
I would change the option text to "Don't use INFORMATION_SCHEMA for metadata" instead of "Ignore Information_Schema database"
[4 Dec 2009 21:04] Lawrenty Novitsky
Considering
{
      if (is_minimum_version(stmt->dbc->mysql.server_version, "5.0", 3) && 
          !option_flag(stmt, FLAG_NO_INFORMATION_SCHEMA))
        return mysql_table_status_i_s(stmt, catalog, catalog_length,
                                             table, table_length, wildcard,
                                             show_tables, show_views);
      else
        return mysql_table_status_show(stmt, catalog, catalog_length,
                                             table, table_length, wildcard);
}

isn't checking of option FLAG_NO_INFORMATION_SCHEMA, and not for server version in the following chunk of code a error?

        int type_index= 2;
        int comment_index= 1;
        my_bool view;
        if(option_flag(stmt, FLAG_NO_INFORMATION_SCHEMA))
          type_index= comment_index= (stmt->result->field_count == 18) ? 17 : 15;

Didn't look very thoroughly - but looks suspicious from first glance.
Everything else looks good, but will finish review procedure on Monday.

And one more question. What do you think - should we assign a bitmask value for newly added options, or individual name is quite enough? I tend to latter.
[11 Apr 2010 15:46] Lawrenty Novitsky
the patch as it looks after merge and couple of changes

Attachment: I_S_SQLTables_v7.diff (text/x-diff), 18.92 KiB.

[11 Apr 2010 16:14] Lawrenty Novitsky
I've posted here the patch(it has been pushed to working_tree as rev#887) in its latest reincarnation after merge and with couple of things i thought had to be changed.
Most important change is following(giving here some abstracts - maybe somebode will have objections):
+  if (catalog && *catalog)
<--skipped-->
+  else
+  {
+    to= strmov(to, "TABLE_SCHEMA = DATABASE()");
+  }
before "else" code relied on stmt->dbc->database
Next change is:
+        if (stmt->dbc->ds->no_information_schema
+          || !server_has_i_s(stmt->dbc))
+          type_index= comment_index= (stmt->result->field_count == 18) ? 17 : 15;
i added server version condition here(and i that little function that checks if sever has I_S).
And there is also some number of really small changes those i didn't dignify by mentioning here.
[12 Apr 2010 15:49] Lawrenty Novitsky
Some additional stuff for the patch and fix of one merging error

Attachment: bug43664_add.diff (application/octet-stream, text), 3.13 KiB.

[12 Apr 2010 18:32] Lawrenty Novitsky
posted here addition to the previously pushed patch.
it contains fix of my merge mistake - merged odbcdialogparams.cpp manually, and name for the newly added option and related stuff. Without those 2 things windows setup dialog didn't work.
The disputable thing is the name I gave to the option - "NO_I_S". It's too general. It will be good name if we use this option in all catalog functions to decide which routine to use. i'm not sure it's a good idea. on othe hand I named it just like it's described in setup dialog and as it's named in data source structure :)
ah, also pushed changelog entry along with described above(working_tree rev#888)
[21 May 2010 7:58] Lawrenty Novitsky
pushed to the trunk
[24 May 2010 13:26] Tony Bedford
An entry has been added to the 5.1.7 changelog:

The SQLTables() function required approximately two to four minutes to return the list of 400 tables in a database. The SHOW TABLE STATUS query used by SQLTables() was extremely slow for InnoDB tables with a large number of rows because the query was calculating the approximate number of rows in each table. Further, the results could not be cached due to non-deterministic nature of the result set (the row count was re-calculated every time), impacting performance further.