Bug #84274 READ COMMITTED does not scale after 36 threads (in 5.6 after 16 threads)
Submitted: 20 Dec 2016 12:10 Modified: 23 Dec 2016 4:03
Reporter: Sveta Smirnova (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S5 (Performance)
Version:5.7.17, 5.6.35 OS:Any
Assigned to: CPU Architecture:Any

[20 Dec 2016 12:10] Sveta Smirnova
Description:
If I run two identical SysBench tests on 144 machine using REPEATABLE READ and READ COMMITTED RR scales up to 256 threads (expected) while RC only scales up to 36 threads.

Results. (rr = REPEATABLE READ, rc = READ COMMITTED)

5.6.35

	1 	2 		4 		8 		16 		32 		36 		64 		72 		128 	144 	256 	512 	1024
rr	792	1438	2871	5617	10945	19704	21485	23629	17679	18130	17757	14449	9047	5139
rc	725	1488	2922	5677	10827	13887	12361	7575	7087	4807	4472	2702	1465	782

5.7.17

	1 	2 		4 		8 		16 		32 		36 		64 		72 		128 	144 	256 	512 	1024
rr	719	1467	2882	5613	10848	19257	21530	35296	37797	49547	49500	49095	45772	43181
rc	710	1388	2864	5617	10766	18678	20341	17058	14928	14659	15220	16008	15346	14586

How to repeat:
Hardware configuration:

Processors: physical = 4, cores = 72, virtual = 144, hyperthreading = yes
Memory: 3.0T
Disk speed: 3K IOPS or ramdisk

1. Install MySQL Sandbox
2. Stop server
3. Change configuration:

[mysqld]
user               = smirnova
port               = 5717
socket             = /tmp/mysql_sandbox5717.sock
basedir            = /data/sveta/5.7.17
datadir            = /mnt/ramdisk/my-5.7.17/data
tmpdir             = /mnt/ramdisk/tmp
lower_case_table_names = 0
pid-file           = /data/sveta/sandbox/msb_5_7_17/data/mysql_sandbox5717.pid
bind-address       = 127.0.0.1
# slow-query-log = on
# slow-query-log-file=/data/sveta/sandbox/msb_5_7_17/data/msandbox-slow.log
# general-log = on
# general-log-file=/data/sveta/sandbox/msb_5_7_17/data/msandbox-general.log
#
# additional options passed through 'my_clause' 
#
log-error=msandbox.err

table_open_cache = 8000
table_open_cache_instances = 16
query_cache_type = 0
join_buffer_size=32k
sort_buffer_size=32k
max_connections=16000
back_log=5000
innodb_open_files=4000

#Monitoring
performance-schema=0
#InnoDB General                                                 
innodb_buffer_pool_load_at_startup=1
innodb_buffer_pool_dump_at_shutdown=1
#innodb_numa_interleave=1
innodb_file_per_table=1
innodb_file_format=barracuda
innodb_flush_method=O_DIRECT_NO_FSYNC
innodb_doublewrite=1
innodb_support_xa=1
innodb_checksums=1

#Concurrency
innodb_thread_concurrency=144
#innodb_page_cleaners=8
innodb_purge_threads=4
innodb_spin_wait_delay=12
innodb_log_file_size=8G
innodb_log_files_in_group=16
innodb_buffer_pool_size=128G
innodb_buffer_pool_instances=128
innodb_io_capacity=18000
innodb_io_capacity_max=36000
innodb_flush_log_at_timeout=0
innodb_flush_log_at_trx_commit=2
#innodb_flush_sync=1
innodb_adaptive_flushing=1
innodb_flush_neighbors = 0
innodb_max_dirty_pages_pct=90
innodb_max_dirty_pages_pct_lwm=10
innodb_lru_scan_depth=4000
innodb_adaptive_hash_index=0
innodb_change_buffering=none
optimizer_switch="index_condition_pushdown=off"

4. Start mysqld
5. Create database sbtest and run:

LD_PRELOAD=/data/sveta/5.7.14/lib/mysql/libjemalloc.so /data/sveta/sbkk/bin/sysbench --test=/data/sveta/sysbench/sysbench/tests/db/oltp_prepared.lua --db-driver=mysql --oltp-tables-count=8 --oltp-table-size=10000000 --mysql-table-engine=innodb --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox5717.sock --num-threads=144 --max-requests=0 --max-time=60 --percentile=0 run

Replace paths, use SysBench from this branch: https://github.com/akopytov/sysbench/tree/concurrency_kit

Test cases with prepared statement support are located at https://github.com/akopytov/sysbench/pull/94 and https://github.com/svetasmirnova/open-database-bench

6. Run tests:

for i in `echo 1 2 4 8 16 32 36 64 72 128 144 256 512 1024`; do LD_PRELOAD=/data/sveta/5.7.14/lib/mysql/libjemalloc.so /data/sveta/sbkk/bin/sysbench  --test=/data/sveta/sysbench/sysbench/tests/db/oltp_prepared.lua --db-driver=mysql --oltp-tables-count=8 --oltp-table-size=10000000 --mysql-table-engine=innodb --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox5717.sock --num-threads=$i --max-requests=0 --max-time=300 --percentile=0 run >> result_rr.log; done

7. Change global transaction isolation mode to READ COMMITTED, then repeat tests:

for i in `echo 1 2 4 8 16 32 36 64 72 128 144 256 512 1024`; do LD_PRELOAD=/data/sveta/5.7.14/lib/mysql/libjemalloc.so /data/sveta/sbkk/bin/sysbench  --test=/data/sveta/sysbench/sysbench/tests/db/oltp_prepared.lua --db-driver=mysql --oltp-tables-count=8 --oltp-table-size=10000000 --mysql-table-engine=innodb --mysql-user=msandbox --mysql-password=msandbox --mysql-socket=/tmp/mysql_sandbox5717.sock --num-threads=$i --max-requests=0 --max-time=300 --percentile=0 run >> result_rc.log; done

8. Compare results

Suggested fix:
Work on READ COMMITTED scalability.
[20 Dec 2016 16:34] Mark Callaghan
More results at http://smalldatum.blogspot.com/2016/11/sysbench-innodb-transaction-isolation.html
[22 Dec 2016 12:53] Sveta Smirnova
Note bug is not repeatable on smaller number of cores. For example, on 24-core machine I have expected results:

1    2        4        8        16        32        36        64        72        128        144        256        512        1024
659    1325    2437    4658    7466    9498    9505    9455    9452    9424    9464    9564    9309    6610
674    1356    2566    4732    7695    9600    9600    9526    9546    9517    9579    9566    9265    6422
[23 Dec 2016 4:03] MySQL Verification Team
Hello Sveta,

Thank you for the report, test case and feedback.
Imho this behavior is well known as our Dimitri's blog post already explained this at great length   http://dimitrik.free.fr/blog/archives/2015/02/mysql-performance-impact-of-innodb-transacti...

I agree that there is a "new" point in the bug report which was probably just missed in the past i.e. about "scalability limit" but as Dimitri observed in his post there is contention we have on trx_sys when RC is used, and thus hard to imagine it could scale.  

Also, Morgan has post about a poll inviting to vote for RR/RC default - http://www.tocker.ca/2015/01/14/proposal-to-change-replication-and-innodb-settings-in-mysq...

Regards,
Umesh
[23 Dec 2016 4:33] zhai weixiang
Will bug#79005 hlep improving the scalability ?