Bug #38185 ha_innobase::info can hold locks even when called with HA_STATUS_NO_LOCK
Submitted: 16 Jul 2008 22:03 Modified: 20 Jun 2010 17:25
Reporter: Neel Nadgir Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:5.1.24, 5.1 BZR OS:Any
Assigned to: CPU Architecture:Any
Tags: Contribution

[16 Jul 2008 22:03] Neel Nadgir
Description:
ha_innobase::info can hold locks even when called with HA_STATUS_NO_LOCK
causing scaling issues on multi-core systems.

From the source, the comment for HA_STATUS_NO_LOCK says

/*
  assuming the table keeps shared actual copy of the 'info' and
  local, possibly outdated copy, the following flag means that
  it should not try to get the actual data (locking the shared structure)
  slightly outdated version will suffice
*/
#define HA_STATUS_NO_LOCK        2

However, the fix for Bug#32440 causes ha_innobase::info to hold a lock 
even when it called with HA_STATUS_NO_LOCK

The function that holds the lock is fsp_get_available_space_in_free_extents()

make_join_statistics() calls storage_engine->info() with 
HA_STATUS_NO_LOCK. ::info() calls fsp_get_available_space_in_free_extents()
which locks a latch. This lock can get hot when multiple threads are 
executing queries on multi-core machines.

How to repeat:
A simple way to trigger the contention is to use sysbench 

sysbench  --max-requests=0 --test=oltp --oltp-table-size=10000000 --oltp-dist-type=special --oltp-read-only=on --oltp-point-selects=10 --oltp-simple-ranges=0 --oltp-sum-ranges=0 --oltp-order-ranges=0 --oltp-distinct-ranges=0 --num-threads=64 --max-time=30 run
[17 Jul 2008 5:13] Neel Nadgir
Proposed Fix
-------------
=== modified file 'storage/innobase/handler/ha_innodb.cc'
--- storage/innobase/handler/ha_innodb.cc       2008-06-12 00:08:07 +0000
+++ storage/innobase/handler/ha_innodb.cc       2008-07-16 22:13:03 +0000
@@ -5871,9 +5871,13 @@
                stats.index_file_length = ((ulonglong)
                                ib_table->stat_sum_of_other_index_sizes)
                                        * UNIV_PAGE_SIZE;
-               stats.delete_length =
-                       fsp_get_available_space_in_free_extents(
-                               ib_table->space);
+               if (!(flag & HA_STATUS_NO_LOCK)) {
+                       stats.delete_length =
+                               fsp_get_available_space_in_free_extents(
+                                       ib_table->space);
+               } else {
+                       stats.delete_length = 0;
+               }
                stats.check_time = 0;
===
[17 Jul 2008 12:44] Sveta Smirnova
Thank you for the report.

Verified as described using test, gdb and code analysis.

Backtrace for SELECT:

#0  fsp_get_available_space_in_free_extents (space=0) at fsp/fsp0fsp.c:2850
#1  0x00305696 in ha_innobase::info (this=0x5fcf028, flag=18) at handler/ha_innodb.cc:5795
#2  0x00146044 in make_join_statistics (join=0x5a3d888, tables=0x80bfc58, conds=0x5a3ea78, keyuse_array=0x5a3e994) at sql_select.cc:2486
#3  0x0014a30a in JOIN::optimize (this=0x5a3d888) at sql_select.cc:945
#4  0x0014db4b in mysql_select (thd=0x5a3b818, rref_pointer_array=0x80bf604, tables=0x80bfc58, wild_num=0, fields=@0x80bf5a0, conds=0x80c0330, og_num=0, order=0x0, group=0x0, having=0x0, proc_param=0x0, select_options=2417248768, result=0x80c0430, unit=0x80bf288, select_lex=0x80bf508) at sql_select.cc:2346
#5  0x00152bdb in handle_select (thd=0x5a3b818, lex=0x80bf228, result=0x80c0430, setup_tables_done_option=0) at sql_select.cc:269
#6  0x000ce237 in execute_sqlcom_select (thd=0x5a3b818, all_tables=0x80bfc58) at sql_parse.cc:4765
#7  0x000cfcd3 in mysql_execute_command (thd=0x5a3b818) at sql_parse.cc:2073
#8  0x00163021 in Prepared_statement::execute (this=0x801c818, expanded_query=0xb0aba838, open_cursor=false) at sql_prepare.cc:3568
#9  0x001632e0 in Prepared_statement::execute_loop (this=0x801c818, expanded_query=0xb0aba838, open_cursor=false, packet=0x54af022 "", packet_end=0x54af028 "") at sql_prepare.cc:3241
#10 0x001637e2 in mysql_stmt_execute (thd=0x5a3b818, packet_arg=0x54af019 "\001", packet_length=15) at sql_prepare.cc:2462
#11 0x000d9c28 in dispatch_command (command=COM_STMT_EXECUTE, thd=0x5a3b818, packet=0x54af019 "\001", packet_length=15) at sql_parse.cc:1092
#12 0x000dafce in do_command (thd=0x5a3b818) at sql_parse.cc:794
#13 0x000c717a in handle_one_connection (arg=0x5a3b818) at sql_connect.cc:1115
#14 0x90024227 in _pthread_body ()
[17 Jul 2008 23:51] Jeffrey Pugh
Neel also comments that "The regression was introduced in 5.1.24 as part of the fix for Bug#32440"
[18 Jul 2008 6:36] Vasil Dimov
Proposed fix for this bug (untested)

Attachment: bug38185.diff (application/octet-stream, text), 1.16 KiB.

[18 Jul 2008 6:46] Vasil Dimov
The proposed fix (by Neel) is safe, setting to R1.

I have attached a derived patch here:

http://bugs.mysql.com/file.php?id=9807

the difference from Neel's one is that delete_length is not set to 0 if HA_STATUS_NO_LOCK is specified and a comment is added.

Please do test this patch but do not commit it directly (we will send a snapshot via the standard procedure once you confirm it fixes the performance problem you have found).

If you are in doubt that this thing fixes the problem, you can apply the following patch instead:

--- cut ---
Index: handler/ha_innodb.cc
===================================================================
--- handler/ha_innodb.cc	(revision 2541)
+++ handler/ha_innodb.cc	(working copy)
@@ -5832,15 +5832,13 @@ ha_innobase::info(
 		stats.data_file_length = ((ulonglong)
 				ib_table->stat_clustered_index_size)
 					* UNIV_PAGE_SIZE;
 		stats.index_file_length = ((ulonglong)
 				ib_table->stat_sum_of_other_index_sizes)
 					* UNIV_PAGE_SIZE;
-		stats.delete_length =
-			fsp_get_available_space_in_free_extents(
-				ib_table->space) * 1024;
+		stats.delete_length = 0;
 		stats.check_time = 0;
 
 		if (stats.records == 0) {
 			stats.mean_rec_length = 0;
 		} else {
 			stats.mean_rec_length = (ulong) (stats.data_file_length / stats.records);
--- cut ---

it can help isolate the problem.
[18 Jul 2008 6:56] Vasil Dimov
This is a performance issue, setting to S5.

The code that executes is the same on all platforms but (probably?) has visible effects only on Solaris.
[22 Jul 2008 18:00] Vasil Dimov
Can someone confirm that this patch:

http://bugs.mysql.com/file.php?id=9807&text=1

fixes the problem?
[23 Jul 2008 6:50] Vasil Dimov
The proposed patch (by me or Neel) has the problem that it effectively sets back the behavior to what it was before Bug#32440 was fixed. I.e. with this patch Bug#32440 would resurface.

This is because ::info() is also called with HA_STATUS_NO_LOCK when a user issues SELECT * FROM information_schema.tables;

I created a new patch that changes this, so HA_STATUS_NO_LOCK is not used when user SELECTs from information_schema.tables. The change is in get_schema_tables_record() which seems to be used only to fill information_schema.tables so there should be no performance degradation. The new patch touches a MySQL file: sql/sql_show.cc.

Please test the new patch.

PS Thanks to Tim for writing a test for Bug#32440!
[23 Jul 2008 6:50] Vasil Dimov
New patch

Attachment: bug38185.diff (application/octet-stream, text), 1.41 KiB.

[25 Jul 2008 14:04] Vasil Dimov
The part of the fix that changes storage/innobase/handler/ha_innodb.cc is committed in the InnoDB 5.1 repository under r2545.

The rest of the fix need to be committed at the MySQL end:

--- cut ---
--- sql/sql_show.cc.orig	2008-07-23 09:32:14.000000000 +0300
+++ sql/sql_show.cc	2008-07-23 09:32:19.000000000 +0300
@@ -3549,8 +3549,7 @@ static int get_schema_tables_record(THD 
 
     if(file)
     {
-      file->info(HA_STATUS_VARIABLE | HA_STATUS_TIME | HA_STATUS_AUTO |
-                 HA_STATUS_NO_LOCK);
+      file->info(HA_STATUS_VARIABLE | HA_STATUS_TIME | HA_STATUS_AUTO);
       enum row_type row_type = file->get_row_type();
       switch (row_type) {
       case ROW_TYPE_NOT_USED:
--- cut ---
[20 Aug 2008 22:23] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/52096

2715 Timothy Smith	2008-08-20
      Cherry-pick some changes from innodb-5.1-ss2545 snapshot.  Includes fixes for
      Bug#37531, Bug#36941, Bug#36941, Bug#36942, Bug#38185.
      
      Also include test case from Bug 34300 which was left out from earlier snapshot
      (5.1-ss2387).
      
      Also include fix for Bug #29507, "TRUNCATE shows to many rows effected", since
      the fix for Bug 37531 depends on it.
[21 Aug 2008 17:59] Bugs System
Pushed into 5.1.28  (revid:timothy.smith@sun.com-20080820221833-brmru9b77ddt3tfx) (version source revid:azundris@mysql.com-20080821081500-f2d61fh4u61owz3p) (pib:3)
[21 Aug 2008 23:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/52267

2794 Timothy Smith	2008-08-21 [merge]
      Merge up from 5.1-bugteam.  Among other things, this applies some changes from
      the innodb-5.1-ss2545 snapshot into 6.0, fixing Bug#37531, Bug#36941,
      Bug#36942, and Bug#38185.
[21 Aug 2008 23:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/52268

2794 Timothy Smith	2008-08-21 [merge]
      Merge up from 5.1-bugteam.  Among other things, this applies some changes from
      the innodb-5.1-ss2545 snapshot into 6.0, fixing Bug#37531, Bug#36941,
      Bug#36942, and Bug#38185.
[27 Aug 2008 1:42] Paul DuBois
Noted in 5.1.28 changelog.

Over-aggressive lock acquisition by InnoDB could result in
performance degradation when multiple threads were executing
statements on multi-core machines.

Setting report to NDI pending push into 6.0.x.
[13 Sep 2008 23:42] Bugs System
Pushed into 6.0.7-alpha  (revid:timothy.smith@sun.com-20080820221833-brmru9b77ddt3tfx) (version source revid:john.embretsen@sun.com-20080808091208-ht48kyzsk7rim74g) (pib:3)
[16 Sep 2008 14:39] Paul DuBois
Noted in 6.0.7 changelog.

Revised changelog entry:

Over-aggressive lock acquisition by InnoDB when calculating free 
space for tablespaces could result in performance degradation when
multiple threads were executing statements on multi-core machines.
[5 May 2010 15:25] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 15:57] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[28 May 2010 5:58] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:27] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:55] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 23:58] Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[17 Jun 2010 12:01] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:42] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:28] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)