Bug #70342 master_info_repository=FILE calls my_b_printf too often
Submitted: 13 Sep 2013 22:18 Modified: 21 Sep 2013 15:43
Reporter: Yoshinori Matsunobu (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S5 (Performance)
Version:5.6.13 OS:Any
Assigned to: CPU Architecture:Any

[13 Sep 2013 22:18] Yoshinori Matsunobu
Description:
master_info_repository=FILE (5.6 default) is very expensive. It calls my_b_printf(memcpy) per each master_info field, while in 5.1 it called just once per binlog event.

This may cause performance problem when using semi-sync replication. When using master_info_repository=TABLE, concurrent insert (100 clients via mysqlslap) throughput on semisync slave was 22434/s on my test server. When using FILE, concurrent insert throughput was only 11981/s. And top perf stat was memcpy as below.

     5.39%        476   mysqld  libc-2.14.1.so        [.] __memcpy_ssse3_back

             |
             --- __memcpy_ssse3_back
                |
                |--65.83%-- my_b_printf
                |          |
                |          |--40.80%-- Rpl_info_file::do_set_info(int, char const*)
                |          |          |
                |          |          |--58.66%-- bool Rpl_info_handler::set_info<char*>(char*)
                |          |          |          Master_info::write_info(Rpl_info_handler*)
                |          |          |          Master_info::flush_info(bool)
                |          |          |          flush_master_info(Master_info*, bool)
                |          |          |          handle_slave_io
                |          |          |          start_thread

As documented, "master_info_repository=TABLE" is faster because it doesn't update anything per 10000 binary log events by default. By using TABLE, slave_master_info.master_log_{file,pos} is not trustful, but this is actually not a problem because slave_relay_log_info is always updated per commit and slave_relay_log_info is the source of truth for crash recovery.

FILE currently does:
1. Writing to filesystem cache (master.info) per each binlog event (same as 5.1)
2. Calling memcpy per each master.info attribute (in 5.6)

How to repeat:
Read source code:

/* 51 code */
rpl_mi.cc: flush_master_info()
 my_b_seek(file, 0L);
 my_b_printf(file,
 "%u\n%s\n%s\n%s\n%s\n%s\n%d\n%d\n%d\n%s\n%s\n%s\n%s\n%s\n%d\n",
 LINES_IN_MASTER_INFO,
 mi->master_log_name, llstr(mi->master_log_pos, lbuf),
 mi->host, mi->user,
 mi->password, mi->port, mi->connect_retry,
 (int)(mi->ssl), mi->ssl_ca, mi->ssl_capath, mi->ssl_cert,
 mi->ssl_cipher, mi->ssl_key, mi->ssl_verify_server_cert);
 DBUG_RETURN(-flush_io_cache(file));

/* 56 code */

bool Master_info::write_info()
{
 if (to->prepare_info_for_write() ||
 to->set_info((int) LINES_IN_MASTER_INFO) ||
 to->set_info(master_log_name) ||
 to->set_info((ulong) master_log_pos) ||
 to->set_info(host) ||
 to->set_info(user) ||
 to->set_info(password) ||
 to->set_info((int) port) ||
 to->set_info((int) connect_retry) ||
 to->set_info((int) ssl) ||
 to->set_info(ssl_ca) ||
 to->set_info(ssl_capath) ||
 to->set_info(ssl_cert) ||
 to->set_info(ssl_cipher) ||
 to->set_info(ssl_key) ||
 to->set_info((int) ssl_verify_server_cert) ||
 to->set_info(heartbeat_period) ||
 to->set_info(bind_addr) ||
 to->set_info(ignore_server_ids) ||
 to->set_info(master_uuid) ||
 to->set_info(retry_count) ||
 to->set_info(ssl_crl) ||
 to->set_info(ssl_crlpath) ||
 to->set_info((int) auto_position))
 DBUG_RETURN(TRUE);
}

/* Each set_info calls my_b_printf in do_set_info */

bool Rpl_info_file::do_set_info(const int pos, const char *value)
{
 return (my_b_printf(&info_file, "%s\n", value) > (size_t) 0 ?
 FALSE : TRUE);
}

Suggested fix:
FILE performance can be improved by:
1. Writing to filesystem cache per sync_master_info events (same as TABLE)
2. Calling memcpy in batch (not per attribute)
[21 Sep 2013 15:43] MySQL Verification Team
Hello Yoshinori,

Thank you for the bug report.

Thanks,
Umesh
[16 Dec 2013 18:38] Yoshinori Matsunobu
The inefficiency applies to relay log info file, too.