Description:
master_info_repository=FILE (5.6 default) is very expensive. It calls my_b_printf(memcpy) per each master_info field, while in 5.1 it called just once per binlog event.
This may cause performance problem when using semi-sync replication. When using master_info_repository=TABLE, concurrent insert (100 clients via mysqlslap) throughput on semisync slave was 22434/s on my test server. When using FILE, concurrent insert throughput was only 11981/s. And top perf stat was memcpy as below.
5.39% 476 mysqld libc-2.14.1.so [.] __memcpy_ssse3_back
|
--- __memcpy_ssse3_back
|
|--65.83%-- my_b_printf
| |
| |--40.80%-- Rpl_info_file::do_set_info(int, char const*)
| | |
| | |--58.66%-- bool Rpl_info_handler::set_info<char*>(char*)
| | | Master_info::write_info(Rpl_info_handler*)
| | | Master_info::flush_info(bool)
| | | flush_master_info(Master_info*, bool)
| | | handle_slave_io
| | | start_thread
As documented, "master_info_repository=TABLE" is faster because it doesn't update anything per 10000 binary log events by default. By using TABLE, slave_master_info.master_log_{file,pos} is not trustful, but this is actually not a problem because slave_relay_log_info is always updated per commit and slave_relay_log_info is the source of truth for crash recovery.
FILE currently does:
1. Writing to filesystem cache (master.info) per each binlog event (same as 5.1)
2. Calling memcpy per each master.info attribute (in 5.6)
How to repeat:
Read source code:
/* 51 code */
rpl_mi.cc: flush_master_info()
my_b_seek(file, 0L);
my_b_printf(file,
"%u\n%s\n%s\n%s\n%s\n%s\n%d\n%d\n%d\n%s\n%s\n%s\n%s\n%s\n%d\n",
LINES_IN_MASTER_INFO,
mi->master_log_name, llstr(mi->master_log_pos, lbuf),
mi->host, mi->user,
mi->password, mi->port, mi->connect_retry,
(int)(mi->ssl), mi->ssl_ca, mi->ssl_capath, mi->ssl_cert,
mi->ssl_cipher, mi->ssl_key, mi->ssl_verify_server_cert);
DBUG_RETURN(-flush_io_cache(file));
/* 56 code */
bool Master_info::write_info()
{
if (to->prepare_info_for_write() ||
to->set_info((int) LINES_IN_MASTER_INFO) ||
to->set_info(master_log_name) ||
to->set_info((ulong) master_log_pos) ||
to->set_info(host) ||
to->set_info(user) ||
to->set_info(password) ||
to->set_info((int) port) ||
to->set_info((int) connect_retry) ||
to->set_info((int) ssl) ||
to->set_info(ssl_ca) ||
to->set_info(ssl_capath) ||
to->set_info(ssl_cert) ||
to->set_info(ssl_cipher) ||
to->set_info(ssl_key) ||
to->set_info((int) ssl_verify_server_cert) ||
to->set_info(heartbeat_period) ||
to->set_info(bind_addr) ||
to->set_info(ignore_server_ids) ||
to->set_info(master_uuid) ||
to->set_info(retry_count) ||
to->set_info(ssl_crl) ||
to->set_info(ssl_crlpath) ||
to->set_info((int) auto_position))
DBUG_RETURN(TRUE);
}
/* Each set_info calls my_b_printf in do_set_info */
bool Rpl_info_file::do_set_info(const int pos, const char *value)
{
return (my_b_printf(&info_file, "%s\n", value) > (size_t) 0 ?
FALSE : TRUE);
}
Suggested fix:
FILE performance can be improved by:
1. Writing to filesystem cache per sync_master_info events (same as TABLE)
2. Calling memcpy in batch (not per attribute)