Description:
If all of the following conditions are met, SQL thread in 5.6 is significantly slower than 5.1/5.6.
* Almost all data fit in memory
* Very heavily inserted/updated and committed (7000~15000 commits/s)
* 5.6 Multi-threaded slave can not be used (due to application side restrictions etc)
Performance drop becomes much worse if:
* Crash safe slave is enabled (relay_log_info_repository=table and sync_relay_log_info=1)
* log-slave-update is set on the slave (for GTID etc)
Here are my micro-benchmark numbers (See "how to repeat" for benchmark conditions).
version: commits/s
5.1.69: 14520/s (no crash safe slave)
5.5.33: 12062/s (no crash safe slave)
fb5.1: 12505/s (crash safe slave)
5.6.15 TABLE: 6103/s (crash safe slave)
5.6.15p FILE: 10264/s (no crash safe slave)
5.6.15p TABLE: 7528/s (crash safe slave)
* TABLE means relay_log_info_repository=table
* 5.6.15p has a patch for bug#70342, a patch for skipping load_db_opt_by_name(), and a patch for not opening/fetching/closing slave_relay_log_info table every transaction
If log-slave-updates is not set:
5.1.69: 18692/s (no crash safe slave)
5.5.33: 15593/s (no crash safe slave)
fb5.1: 17056/s (crash safe slave)
5.6.15 TABLE: 7890/s (crash safe slave)
5.6.15p FILE: 15244/s (no crash safe slave)
5.6.15p TABLE: 9993/s (crash safe slave)
There are a few observations.
- 17% drop from 5.1.69 to 5.5.33. I have not profiled much in 5.5, but I suspect it's mostly caused by Metadata Locking.
- From 5.5.33 to 5.6.15p(FILE), 15% drop if log-slave-updates was enabled, 2.5% drop if log-slave-updates is disabled. Writing to binlog (from single threaded SQL thread) in 5.6 takes more time than 5.1/5.5.
- relay_log_info_repository=table adds significant overhead. Compared to fb5.1(crash safe slave), performance drop was 51.2% (14520/s->6103/s). slave_relay_log_info table is opened/fetched/updated/closed per every transaction commit. By keeping the table opened and skipping fetching row (updating only), performance improved to 7528/s (with log-slave-updates) and 9993/s (without log-slave-updates), but these are still much lower than fb5.1 (12505/s and 17056/s).
How to repeat:
my.cnf
loose_performance-schema=0
loose_binlog_checksum=0
log-bin
innodb-flush-log-at-trx-commit=0
log-slave-updates
sync-binlog=0
loose_relay_log_info_repository=table
loose_relay_log_recovery=1
loose_sync_relay_log_info=1
loose_sync_master_info=0
innodb_file_per_table
default-storage-engine=innodb
query_cache_type=0
query_cache_size=0
Setup master and slave. It is fine to run both of them on the same machine.
# Let the slave to receive all relay logs, so that SQL thread won't conflict with I/O thread
mysql -h $SLAVE -e "stop slave sql_thread"
mysqlslap -h $MASTER --auto-generate-sql --number-of-queries=500000 --concurrency=50 --auto-generate-sql-load-type=write
# Start SQL thread
mysql -h $SLAVE -e "start slave sql_thread"
# Then check how many transactions are processed on the slave
mysqladmin extended-status -i 10 -r -h $SLAVE | grep Com_insert
Suggested fix:
- Make relay_log_info_repository=table faster
- Writing to binlog in 5.6 has to be as fast as 5.1/5.6
- SQL thread took some time on load_db_opt_by_name() (~5% on FILE based benchmark). This function can be skipped many times. Consider adding a parameter to skip it.
- I think ~17% performance drops from 5.1 to 5.5 mostly came from Metadata Locking. Verify that and improve if it's possible