Bug #79005 cache read view for ro transaction with READ COMMITTED isolation
Submitted: 28 Oct 2015 17:00 Modified: 20 Jan 2017 4:38
Reporter: zhai weixiang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.7 OS:Any
Assigned to: CPU Architecture:Any
Tags: Contribution

[28 Oct 2015 17:00] zhai weixiang
Description:
In MySQL 5.7, the read view object will be cached if the current transaction is a auto-commit select or a read only transaction with tx_isolation equals to REPEATABLE READ.

When it comes to READ COMMITTED isolation and non-autocommit read only transaction , read view was closed every time when the statement is finished. 

How to repeat:
Read the code

Suggested fix:
Cache read view object with RC isolation
[28 Oct 2015 17:04] zhai weixiang
If none rw transaction happened between two select inside same transaction, we can cache read-view object of the first query(just mark it as closed but not removed from view list), and reuse it while executing the second select, 

for example: begin; select; select;commit;
[24 Nov 2015 13:40] zhai weixiang
It's easy to verify the performance impact. let me show you an example.

Workload:
use sysbench ,  20 table with 1000000 records for each table, 128 concurrent threads

function event(thread_id)
   local table_name
   table_name = "sbtest".. sb_rand_uniform(1, oltp_tables_count)

   local id = sb_rand_uniform(1, oltp_table_size)

   rs = db_query("BEGIN")
   rs = db_query("SELECT * FROM ".. table_name .." WHERE id=" .. id)
   id = sb_rand_uniform(1, oltp_table_size)
   rs = db_query("SELECT * FROM ".. table_name .." WHERE id=" .. id)
   id = sb_rand_uniform(1, oltp_table_size)
   rs = db_query("SELECT * FROM ".. table_name .." WHERE id=" .. id)
   rs = db_query("COMMIT")
end

original  5.7.9

mysql> SELECT COUNT_STAR, SUM_TIMER_WAIT, AVG_TIMER_WAIT, EVENT_NAME FROM events_waits_summary_global_by_event_name where COUNT_STAR > 0 and EVENT_NAME like 'wait/synch/%' order by SUM_TIMER_WAIT desc limit 10;
+------------+-----------------+----------------+-------------------------------------------+
| COUNT_STAR | SUM_TIMER_WAIT | AVG_TIMER_WAIT | EVENT_NAME |
+------------+-----------------+----------------+-------------------------------------------+
| 16117725 | 101870546215590 | 6320115 | wait/synch/mutex/innodb/trx_sys_mutex |
| 16112771 | 13096148372655 | 812580 | wait/synch/mutex/sql/LOCK_table_cache |
| 45712614 | 8087412459300 | 176610 | wait/synch/mutex/sql/THD::LOCK_query_plan |
| 2686248 | 5321062761780 | 1980555 | wait/synch/mutex/innodb/lock_mutex |
| 72527768 | 4378556641620 | 60030 | wait/synch/mutex/innodb/trx_mutex |
| 31583663 | 3589077981300 | 113535 | wait/synch/mutex/sql/THD::LOCK_thd_data |
| 28216754 | 2898368312100 | 102660 | wait/synch/mutex/sql/THD::LOCK_thd_query |
| 8062458 | 2322703838055 | 287970 | wait/synch/rwlock/sql/LOCK_grant |
| 8116005 | 1783270226910 | 219675 | wait/synch/sxlock/innodb/btr_search_latch |
| 442525 | 87422938320 | 197490 | wait/synch/sxlock/innodb/hash_table_locks |
+------------+-----------------+----------------+-------------------------------------------+
10 rows in set (0.06 sec)

trx_sys_mutex is on the top of the list ,and the average wait time is 6320115

tps reported by sysbench:

[ 30s] threads: 128, tps: 37609.63, reads/s: 112840.08, writes/s: 0.00, response time: 4.90ms (95%)
[ 35s] threads: 128, tps: 37639.60, reads/s: 112912.79, writes/s: 0.00, response time: 4.90ms (95%)
[ 40s] threads: 128, tps: 37439.39, reads/s: 112324.76, writes/s: 0.00, response time: 4.93ms (95%)
[ 45s] threads: 128, tps: 37625.75, reads/s: 112869.64, writes/s: 0.00, response time: 4.97ms (95%)
[ 50s] threads: 128, tps: 37709.28, reads/s: 113131.03, writes/s: 0.00, response time: 4.90ms (95%)
[ 55s] threads: 128, tps: 37748.97, reads/s: 113243.51, writes/s: 0.00, response time: 4.83ms (95%)

With patch based on 5.7.9. This is a minor change and doesn't fully tested.

diff --git a/storage/innobase/handler/ha_innodb.cc b/storage/innobase/handler/ha_innodb.cc
index c6292a7..7798c73 100644
--- a/storage/innobase/handler/ha_innodb.cc
+++ b/storage/innobase/handler/ha_innodb.cc
@@ -15046,11 +15046,15 @@ ha_innobase::external_lock(
                } else if (trx->isolation_level <= TRX_ISO_READ_COMMITTED
                           && MVCC::is_view_active(trx->read_view)) {

-                       mutex_enter(&trx_sys->mutex);
+                       if (trx->read_only || trx->rsegs.m_redo.rseg == NULL) {
+                               trx_sys->mvcc->view_close(trx->read_view, false);
+                       } else {
+                               mutex_enter(&trx_sys->mutex);

-                       trx_sys->mvcc->view_close(trx->read_view, true);
+                               trx_sys->mvcc->view_close(trx->read_view, true);

-                       mutex_exit(&trx_sys->mutex);
+                               mutex_exit(&trx_sys->mutex);
+                       }
                }
        }

mysql> SELECT COUNT_STAR, SUM_TIMER_WAIT, AVG_TIMER_WAIT, EVENT_NAME FROM events_waits_summary_global_by_event_name where COUNT_STAR > 0 and EVENT_NAME like 'wait/synch/%' order by SUM_TIMER_WAIT desc limit 10;
+------------+----------------+----------------+-------------------------------------------+
| COUNT_STAR | SUM_TIMER_WAIT | AVG_TIMER_WAIT | EVENT_NAME |
+------------+----------------+----------------+-------------------------------------------+
| 4972632 | 17367020868240 | 3492180 | wait/synch/mutex/innodb/trx_sys_mutex |
| 9935938 | 7711063162170 | 776040 | wait/synch/mutex/sql/LOCK_table_cache |
| 26496483 | 4559905124460 | 171825 | wait/synch/mutex/sql/THD::LOCK_query_plan |
| 1657411 | 3968173670505 | 2393805 | wait/synch/mutex/innodb/lock_mutex |
| 44748953 | 2705635057980 | 60030 | wait/synch/mutex/innodb/trx_mutex |
| 18216115 | 1947553107300 | 106575 | wait/synch/mutex/sql/THD::LOCK_thd_data |
| 16559362 | 1536292656360 | 92655 | wait/synch/mutex/sql/THD::LOCK_thd_query |
| 4972279 | 1422802905495 | 285795 | wait/synch/rwlock/sql/LOCK_grant |
| 5008275 | 880175606565 | 175740 | wait/synch/sxlock/innodb/btr_search_latch |
| 128 | 172012478040 | 1343847240 | wait/synch/mutex/sql/LOCK_error_log |
+------------+----------------+----------------+-------------------------------------------+

[ 35s] threads: 128, tps: 38229.84, reads/s: 114691.52, writes/s: 0.00, response time: 4.68ms (95%)
[ 40s] threads: 128, tps: 38260.68, reads/s: 114781.43, writes/s: 0.00, response time: 4.81ms (95%)
[ 45s] threads: 128, tps: 38196.91, reads/s: 114591.14, writes/s: 0.00, response time: 4.81ms (95%)
[ 50s] threads: 128, tps: 38293.58, reads/s: 114877.74, writes/s: 0.00, response time: 4.81ms (95%)
[ 55s] threads: 128, tps: 38217.91, reads/s: 114656.74, writes/s: 0.00, response time: 4.80ms (95%)
[ 60s] threads: 128, tps: 38329.85, reads/s: 114988.54, writes/s: 0.00, response time: 4.78ms (95%)
[ 65s] threads: 128, tps: 38376.94, reads/s: 115131.83, writes/s: 0.00, response time: 4.75ms (95%)

Summary:
With the patch , decrease the avg wait time of trx_sys->mutex from 6320115 to 3492180, tps increased from about 37700 to 38300
[13 Sep 2016 1:40] zhai weixiang
Any comment about this bug ??? 
I can still see the perofmrance improvement in 5.7.15 with same workload.

  before
  
        transactions: 10015821 (36795.95 per sec.)
      	AVG_TIMER_WAIT of trx_sys->mutex: 4719750
      	
  after patching
      
      	transactions: 10014866 (37072.80 per sec.)
      	AVG_TIMER_WAIT of trx_sys->mutex: 2600430
[13 Sep 2016 1:41] zhai weixiang
Patch for testing purpose

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: view.diff (application/octet-stream, text), 1.28 KiB.

[20 Jan 2017 4:38] MySQL Verification Team
Hello Zhai,

Thank you for the report and contribution.

Thanks,
Umesh