Bug #85132 Performance regression with FusionIO between 5.6.16 and 5.6.34
Submitted: 22 Feb 2017 17:59 Modified: 27 Aug 2018 12:00
Reporter: Geoff Montee Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.6.34 OS:Any
Assigned to: CPU Architecture:Any
Tags: fusionio, innodb, regression, sysbench

[22 Feb 2017 17:59] Geoff Montee
Description:
A user is seeing a performance regression between 5.6.16 and 5.6.34 on a server with the following specifications:

Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz
256GB RAM
FusionIO drive (partition mounted as type xfs (rw,nodev,noatime,nodiratime,inode64,no barrier))
CentOS 6.6/6.7

The performance regression between the two versions seems to be larger when using a smaller value for innodb_buffer_pool_size--possibly related to more disk access.

When tested with sysbench 0.5, the user saw the following performance differences:

8 threads = 1.12% reduction in TPS
16 threads = 1.71% reduction in TPS
32 threads = 2.19% reduction in TPS
64 threads = 14.38% reduction in TPS
128 threads = 34.59% reduction in TPS

This regression does not seem to affect servers not using FusionIO.

The server is using the following configuration file:

[mysqld]

key_buffer_size                 = 128M
max_allowed_packet              = 32M
tmp_table_size                  = 64M
max_heap_table_size             = 64M
myisam_sort_buffer_size         = 256k

open-files-limit                = 8192
net_write_timeout               = 900
net_read_timeout                = 900

sync_binlog                     = 0
innodb_support_xa               = 1

log-bin                         = /datadir/log/binlog/mysql-binlog
relay-log                       = /datadir/log/binlog/mysql-relay
relay-log-index                 = /datadir/log/binlog/mysql-relay.index
relay-log-info-file             = /datadir/log/binlog/mysql-relay.info

expire_logs_days                = 4
log-slave-updates               = 1
read-only                       = 0
binlog_cache_size               = 1M
slave_compressed_protocol       = 1
log-bin-trust-function-creators = 1

default-storage-engine          = innodb
character-set-server            = utf8
innodb_file_format              = Barracuda

query_cache_size                = 0
query_cache_type                = 0

eq_range_index_dive_limit       = 0

innodb_use_native_aio           = 1
innodb_stats_persistent         = 1
innodb_adaptive_flushing        = 1
innodb_checksum_algorithm       = crc32
innodb_log_buffer_size          = 16M
innodb_flush_log_at_trx_commit  = 0
innodb_stats_on_metadata        = 0
innodb_flush_method             = O_DIRECT
innodb_lock_wait_timeout        = 50
innodb_thread_concurrency       = 0
innodb_file_per_table
innodb_print_all_deadlocks      = on

# Performance schema: We want this to be ON by default.
performance_schema              = on

innodb_data_file_path           = ibdata1:10M:autoextend
innodb_log_group_home_dir       = /datadir/log

# Security:
old_passwords                   = 0
secure-auth                     = on
local-infile                    = off

bind-address                    = 0.0.0.0

ssl-ca                          = /datadir/cacert/cacert.pem
ssl-cert                        = /datadir/key/mysql_servers.crt
ssl-key                         = /datadir/key/mysql_servers.key

# Disabling symbolic-links is recommended to prevent assorted security risks;
# to do so, uncomment this line:
symbolic-links                  = 0

# Disable unnecessary Engines:
blackhole                       = off
archive                         = off
federated                       = off

# InnoDB monitoring: Too impactful. Can enable dynamically for performance diagnostics.
innodb_monitor_disable          = ALL

long_query_time                 = 1
log_queries_not_using_indexes   = 1

performance_schema              = OFF
innodb_buffer_pool_size         = 3G
innodb_log_file_size            = 32M
innodb_io_capacity              = 8192
innodb_buffer_pool_instances    = 8
innodb_additional_mem_pool_size = 2M
innodb_open_files               = 300
table_open_cache                = 400
table_definition_cache          = 400
sort_buffer_size                = 256k
max_connections                 = 480
key_buffer_size                 = 80M
tmp_table_size                  = 32M
max_heap_table_size             = 32M

How to repeat:
Run a test with sysbench 0.5 on a similar system that has FusionIO. e.g. to set up sysbench:

git clone https://github.com/akopytov/sysbench.git
cd sysbench
git checkout 0.5
./autogen.sh
./configure --prefix=/home/ec2-user/sysbench-0.5 --with-mysql --with-lua
make
make install

Then initialize the sysbench schema:

mysql -u root --execute="CREATE DATABASE sysbench;"
/home/ec2-user/sysbench-0.5/bin/sysbench \
--test=/home/ec2-user/sysbench-0.5/share/sysbench/parallel_prepare.lua \
--oltp-tables-count=40 \
--mysql-host=127.0.0.1 \
--mysql-port=3306 \
--mysql-user=root \
--mysql-db=sysbench \
--mysql-table-engine=innodb \
--num-threads=20 \
--oltp-table-size=1000000 \
run

And then run the test with $NUM_THREADS:

/home/ec2-user/sysbench-0.5/bin/sysbench \
--test=/home/ec2-user/sysbench-0.5/share/sysbench/oltp.lua \
--oltp-tables-count=40 \
--mysql-host=127.0.0.1 \
--mysql-port=3306 \
--mysql-user=root \
--mysql-db=sysbench \
--mysql-table-engine=innodb \
--num-threads=$NUM_THREADS \
--max-requests=0 \
--max-time=600 \
--oltp-table-size=1000000 \
run

Suggested fix:
If there is a performance regression, determine the cause.
[25 Jul 2018 13:23] MySQL Verification Team
Hi,

Thank you for your report.

We do need additional info from you. Is FusionIO containing a SSD ??? If yes, then we can not accept performance regression reports when MySQL server is not configured optimally for SSD devices. You have other misconfigurations, like the sizes of some logs that can also cause performance regressions.

Our Reference Manual is full of the information on how to optimise our server for specific devices .....
[25 Jul 2018 15:06] MySQL Verification Team
One more small addition.

We are welcoming performance regression reports between two consecutive releases, like 5.6.34 and 5.6.35.

With twenty interim releases, it is impossible to find a culprit, since there are several thousands of code patches, which is impossible to analyse.
[26 Aug 2018 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".