Bug #48846 Too much time spent in ha_partition::records_in_range if not able to prune
Submitted: 17 Nov 2009 18:51 Modified: 19 Mar 2010 10:51
Reporter: Mattias Jonsson Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Partitions Severity:S3 (Non-critical)
Version:5.1+ OS:Any
Assigned to: Mattias Jonsson CPU Architecture:Any
Tags: pruning

[17 Nov 2009 18:51] Mattias Jonsson
Description:
When analysing bug#37252 and bug#47261 I found that too much time was spent in ha_partition::records_in_range since it calls records_in_range for all partitions (that are left after pruning).

The handler call records_in_range should return an estimation of records that a specific range for the given index would return. So it should be sufficient to do the estimation for the partitioning engine by only use a smaller subset of partitions for this instead of all (not pruned) partitions.

records_in_range are especially costly for MyISAM tables/partitions. My tests show that it is possible to get an 10X increase for small select queries on 1024 partitions. (For InnoDB the increase is about 2X for 1024 partitions, which is very good too).

How to repeat:
See bug#37252. Or use the included script (based on that bug). And of course compare to the attached patch.

Suggested fix:
See attached patch.
[17 Nov 2009 18:52] Mattias Jonsson
perl test script (uses mysql client).

Attachment: bug37252.no_dbi.pl (text/plain), 10.34 KiB.

[17 Nov 2009 19:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/90744

3199 Mattias Jonsson	2009-11-17
      Bug#48846: Too much time spent in ha_partition::records_in_range if not able to prune
      
      Problem was that ha_partition::records_in_range called
      records_in_range for all non pruned partitions, even if
      an estimate should be given.
      
      Solution is to only use 1/3 of the partitions (up to 10) for
      records_in_range and estimate the total from this subset.
      (And continue until a non zero return value from the called
      partitions records_in_range is given, since 0 means no rows
      will match.)
     @ sql/ha_partition.cc
        Bug#48846: Too much time spent in ha_partition::records_in_range if not able to prune
        
        estimate_rows_upper_bound and records_in_range are very similar
        (the only difference is the function and its parameters to use)
        so I created a common function for this.
        
        Since these calls from the optimizer are only estimates, it is
        not neccesary to call them for every partition, it can use
        a much smaller subset of the used partitions instead,
        which improves performance for selects.
     @ sql/ha_partition.h
        Bug#48846: Too much time spent in ha_partition::records_in_range if not able to prune
        
        Added two private functions to help some
        optimizer calls.
[17 Nov 2009 19:05] Mattias Jonsson
spreadsheet showing the performance figures

Attachment: bug48846_report.ods (application/vnd.oasis.opendocument.spreadsheet, text), 134.27 KiB.

[17 Nov 2009 19:09] Mattias Jonsson
To start with do 'ulimit -n 10000' (or at least > 2050) in the session where you start the server.
The following flags was added to partition_innodb-master.opt:
--innodb_flush_method=O_DIRECT --innodb_log_file_size=256M --innodb_flush_log_at_trx_commit=0 --innodb_buffer_pool_size=2G

and then I started the server by 'cd mysql-test ; ./mtr --mem --start partition_innodb'

(--innodb_file_per_table did take too much memory on my 4GB machine, but I could not see any big difference in performance of that flag, and the rest of the flags can probably be tuned further too...)
[23 Nov 2009 10:54] Mattias Jonsson
To the reviewers:
Note that the patch only uses the partition order to get the estimates, so the estimate will probably not use the last partition(s) which may be more relevant for range partitioning, and it will not search for the partition with the most rows and use. But since it is only an estimate I focused on speed and simplicity.
[15 Jan 2010 9:02] Bugs System
Pushed into 5.1.43 (revid:joro@sun.com-20100115085139-qkh0i0fpohd9u9p5) (version source revid:satya.bn@sun.com-20091223070903-6rn5kgl3chedqnxd) (merge vers: 5.1.42) (pib:16)
[18 Jan 2010 11:31] Jon Stephens
Documented fix in the 5.1.43 changelog as follows:

        When used on partitioned tables, the records_in_range handler 
        call checked all partitions, rather than the unpruned partitions 
        only.

Set status = NDI, waiting for merges to later trees.
[5 Feb 2010 11:50] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100204063540-9czpdmpixi3iw2yb) (version source revid:alik@sun.com-20100119163614-172adculixyu26j5) (pib:16)
[5 Feb 2010 11:56] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100205113942-oqovjy0eoqbarn7i) (version source revid:alik@sun.com-20100204064210-ljwanqvrjs83s1gq) (merge vers: 6.0.14-alpha) (pib:16)
[5 Feb 2010 12:01] Bugs System
Pushed into 5.5.2-m2 (revid:alik@sun.com-20100203172258-1n5dsotny40yufxw) (version source revid:alexey.kopytov@sun.com-20091225105650-qletdbs0wz9sx5nc) (merge vers: 5.5.1-m2) (pib:16)
[5 Feb 2010 14:18] Jon Stephens
Also documented in the 5.5.2, 5.6.0, and 6.0.14 changelogs.

Closed.
[7 Mar 2010 23:52] Paul DuBois
5.6.0 changelog entry unneeded.
[12 Mar 2010 14:13] Bugs System
Pushed into 5.1.44-ndb-7.0.14 (revid:jonas@mysql.com-20100312135944-t0z8s1da2orvl66x) (version source revid:jonas@mysql.com-20100312115609-woou0te4a6s4ae9y) (merge vers: 5.1.44-ndb-7.0.14) (pib:16)
[12 Mar 2010 14:29] Bugs System
Pushed into 5.1.44-ndb-6.2.19 (revid:jonas@mysql.com-20100312134846-tuqhd9w3tv4xgl3d) (version source revid:jonas@mysql.com-20100312060623-mx6407w2vx76h3by) (merge vers: 5.1.44-ndb-6.2.19) (pib:16)
[12 Mar 2010 14:44] Bugs System
Pushed into 5.1.44-ndb-6.3.33 (revid:jonas@mysql.com-20100312135724-xcw8vw2lu3mijrhn) (version source revid:jonas@mysql.com-20100312103652-snkltsd197l7q2yg) (merge vers: 5.1.44-ndb-6.3.33) (pib:16)
[15 Mar 2010 5:23] Jon Stephens
No additional changelog entries required. Returning to Closed state.
[18 Mar 2010 21:19] James Day
Docs, please change the release note text from:

When used on partitioned tables, the records_in_range handler call checked all partitions, rather than the unpruned partitions only.

To:

Improved partition range checking made some partition operations as much as two to ten times as fast in 1024 partition tests by reducing the number of unpruned partitions checked for statistics.

The purpose of the change is to give some idea of significance and to correct what looked like an error - seems to be reducing number of unpruned partitions checked, not simply changing from checking all to only checking unpruned.

Please also add the performance tag for this change, I've added that request to https://inside.mysql.com/wiki/Performance_improvements_docs also.
[19 Mar 2010 10:51] Jon Stephens
Updated changelog entry per James' comment and IRC discussion, now reads as follows:

        When used on partitioned tables, the records_in_range 
        handler call checked more partitions than necessary. The 
        fix for this issue reduces the number of unpruned partitions 
        checked for statistics in partition range checking, which has 
        resulted in some partition operations being performed up to 
        2-10 times faster than before this change was made, when 
        testing with tables having 1024 partitions.

Also added Performance tag as requested.

Docs commit is here: http://lists.mysql.com/commits/103805

Closed.