Bug #25567 records() call performs scan in NDB, performance bug
Submitted: 12 Jan 2007 1:19 Modified: 25 Jan 2007 5:22
Reporter: Mikael Ronström Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Cluster: Cluster (NDB) storage engine Severity:S3 (Non-critical)
Version:5.1.15 OS:
Assigned to: Stewart Smith CPU Architecture:Any

[12 Jan 2007 1:19] Mikael Ronström
Description:
The function records() in the NDB handler is using the method
ndb_get_table_statistics(...) to retrieve the number of records.
This function is called once for each table in a SELECT statement.
This is a very high overhead on queries to perform a scan operation
per table per SELECT statement and should obviously be avoided.

How to repeat:
Run a normal select by primary key
Check statistics in cluster log and note that it adds
also to table scan counter

Suggested fix:
Read local info in handler object, this is initialised when handler object was opened.
Probably some more intelligent handling would be good.
[15 Jan 2007 2:24] Tomas Ulin
Unfortunately the MySQL server internally used the same table statistics to optimize the "select count(*)" command.

As many users choose to use "select count(*)" as a "test" for engine speed, we chose early on to update the internal "records" field with the accurate count of records, to avoid a full table scan on "select count(*)".

At the same time (because of performance issues, i.e. "records" is updated via an optimizer call before optimizing each query), we provided an option to remove this behavior --ndb-use-exact-count.

Note. the scan performed to retrieve the statistics, is not a "full table" scan, mearly retieving a count from each fragment of the table (which is done in parallell) from all fragments.  The relative cost for this is very high for a simple PK lookup, but for a more complex join, this relative cost is low.

Tomas
[15 Jan 2007 23:59] Stewart Smith
first run of patch

Attachment: ndb_records.patch (text/x-patch), 448 bytes.

[15 Jan 2007 23:59] Stewart Smith
i think this patch sholud be okay, going to test a bit more before saying "yes" but mysql-test-run is okay with it... just have to run with debug trace to make sure we're doing exactly the right thing...
[16 Jan 2007 0:38] Stewart Smith
the patch breaks things, don't use (namely count(*))

looks like i've found the real problem though.... problem in sql_select.cc
[16 Jan 2007 1:12] Stewart Smith
2nd patch, works.

Attachment: ndb_records.patch (text/x-patch), 461 bytes.

[23 Jan 2007 13:27] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/18624

ChangeSet@1.2398, 2007-01-24 00:27:19+11:00, stewart@willster.(none) +3 -0
  Bug#25567 records() call performs scan in NDB, performance bug
[23 Jan 2007 14:23] Stewart Smith
final patch approved by tomas, pushed to 5.1-ndb
[24 Jan 2007 2:08] Tomas Ulin
pushed to 5.1.15
[25 Jan 2007 5:22] Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release.

If necessary, you can access the source repository and build the latest available version, including the bug fix. More information about accessing the source trees is available at

    http://dev.mysql.com/doc/en/installing-source.html

Documented bugfix in 5.1.15 changelog.