Bug #67805 | Very slow replication with MySQL-5.5.28 compare to MySQL-5.5.24 | ||
---|---|---|---|
Submitted: | 4 Dec 2012 18:57 | Modified: | 23 May 2013 18:19 |
Reporter: | Jean-Christophe Petit | Email Updates: | |
Status: | Can't repeat | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S3 (Non-critical) |
Version: | MySQL-5.5.28 | OS: | Linux (Redhat-6.1) |
Assigned to: | CPU Architecture: | Any | |
Tags: | MySQL-5.5.28, replication, slow performance |
[4 Dec 2012 18:57]
Jean-Christophe Petit
[4 Dec 2012 19:47]
Sveta Smirnova
Thank you for the report. Which binary log format do you use? Which storage engine do you use?
[4 Dec 2012 20:16]
Jean-Christophe Petit
bin log format is STATEMENT Storage engine is innodb replication is asynchronous File System is XFS Have a good day
[4 Dec 2012 20:42]
Valeriy Kravchuk
Please, send my.cnf file content and the output of: show global status; from your problematic 5.5.28 slave.
[22 Dec 2012 16:01]
MySQL Verification Team
but is it replication specific or the server in general. I mean, if you make the 5.5.28 a master, does the 5.5.28 master become slower than the 5.5.24 masters ?
[2 Jan 2013 9:26]
Jean-Christophe Petit
It is specific to the Slave with version 5.5.28 Other slaves with 5.5.24 fine. Master is 5.5.24 and slave 5.5.28 Thank you
[2 Jan 2013 10:13]
MySQL Verification Team
Haven't had any luck repeating this yet with certainty. I'm using blackhole tables and benchmarking a 200M statement binlog. 5.5.24 and 5.5.28 slaves play it within 3 seconds of each other in a > 1 minute test run, so they are comparable. On a side note, for windows slaves I have managed to identify two things to make replication *much* faster (on any mysql version): 1. put "symbolic-links=0" in my.cnf. And do not use .sym files anywhere. 2. make sure there is a "db.opt" file in all databases, else a failed open takes a lot of time for each statement.
[7 Jan 2013 11:44]
Jean-Christophe Petit
It happened with an other customer. This time, Mysql-server was 5.1.67 and slave was 5.5.28 The replication was extremely slow the same as our first report. Switching the slave to 5.5.24 resolved the issue. This customer was attempting a migration from 5.1.X to 5.5.X Thank you
[23 May 2013 17:40]
MySQL Verification Team
I see no performance difference between my 5.5.24/5.5.28 slaves. Except 5.5.28 tends to crash when I add user variables into my test. We need to know what kind of queries are run to hit the problem. Could it be some optimizer change, therefore deletes take longer if they have complex where clauses? Is autoincrement variables and user variables used? Do you have some benchmark/testcase, or profiling data that shows the problem?
[23 May 2013 17:54]
Jean-Christophe Petit
Hopefully, I get a response after 4 months.. The problem seems to be fixed with 5.5.31
[23 May 2013 18:19]
MySQL Verification Team
That's good news indeed.
[25 May 2013 4:19]
Jervin R
I can still repeat this on 5.5.31, on my tests I have a 5.1.67 master, 5.1.69 and 5.5.31 slaves. The same lagging behavior can be observed on 5.5.24 and 5.5.30. Default configurations on all except server-id, log-bin on master. 1. STOP SLAVE on all slaves 2. Run mysqlslap on the master mysqlslap --auto-generate-sql --auto-generate-sql-load-type=write --commit=1 --concurrency=8 --create-schema=slap_test --detach=100 --iterations=5 --number-char-cols=5 --number-int-cols=5 --number-of-queries=50000 -umsandbox -pmsandbox 3. START SLAVE on the slaves, watch the 5.5.31 slave lag behind while 5.1.69 complete in a few seconds. My test below is both slaves are on the same SSD drive, even if I set innodb_flush_log_at_trx_commit = 2 it will only cut the lag down some 50%. With innodb_flush_log_at_trx_commit = 1, the lag went up to 170 seconds and then a steep drop after about 3mins. [revin@forge msb]$ for t in {1..360}; do echo $(date +%Y-%m-%d_%H_%M_%S) && ( for s in 5169 5531; do echo "$s $(sb $s -e 'show slave status \G'|grep -Ei 'read_master_log_pos|exec_master_log_pos|seconds'|xargs)"; done ) && sleep 2; done 2013-05-25_00_12_18 5169 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 238544568 Seconds_Behind_Master: 18 5531 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 200156247 Seconds_Behind_Master: 19 2013-05-25_00_12_20 5169 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 299611065 Seconds_Behind_Master: 18 5531 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 203850830 Seconds_Behind_Master: 21 2013-05-25_00_12_22 5169 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 360706440 Seconds_Behind_Master: 18 5531 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 207445179 Seconds_Behind_Master: 23 2013-05-25_00_12_24 5169 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 395422698 Seconds_Behind_Master: 0 5531 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 210512293 Seconds_Behind_Master: 24 ... 2013-05-25_00_14_51 5169 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 395422698 Seconds_Behind_Master: 0 5531 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 387776125 Seconds_Behind_Master: 166 2013-05-25_00_14_53 5169 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 395422698 Seconds_Behind_Master: 0 5531 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 390456496 Seconds_Behind_Master: 168 2013-05-25_00_14_55 5169 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 395422698 Seconds_Behind_Master: 0 5531 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 393220532 Seconds_Behind_Master: 170 2013-05-25_00_14_57 5169 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 395422698 Seconds_Behind_Master: 0 5531 Read_Master_Log_Pos: 395422698 Exec_Master_Log_Pos: 395422698 Seconds_Behind_Master: 0 pt-pmp shows most of the time on AIO: [revin@forge ~]$ pt-pmp -i 10 -s 1 -p 5604 Sat May 25 00:17:48 EDT 2013 100 libaio::??(libaio.so.1),os_aio_linux_collect(os0file.c:4539),os_aio_linux_handle(os0file.c:4539),fil_aio_wait(fil0fil.c:4620),io_handler_thread(srv0start.c:482),start_thread(libpthread.so.0),clone(libc.so.6) 10 sigwait(libpthread.so.0),signal_hand(mysqld.cc:2700),start_thread(libpthread.so.0),clone(libc.so.6) 10 select(libc.so.6),os_thread_sleep(os0thread.c:259),srv_master_thread(srv0srv.c:2871),start_thread(libpthread.so.0),clone(libc.so.6) 10 read(libpthread.so.0),vio_read_buff(viosocket.c:95),my_real_read(net_serv.cc:831),my_net_read(net_serv.cc:1028),cli_safe_read(client.c:735),read_event(slave.cc:2226),handle_slave_io(slave.cc:2226),start_thread(libpthread.so.0),clone(libc.so.6) 10 pthread_cond_timedwait,os_cond_wait_timed(os0sync.c:171),os_event_wait_time_low(os0sync.c:171),srv_monitor_thread(srv0srv.c:2183),start_thread(libpthread.so.0),clone(libc.so.6) 10 pthread_cond_timedwait,os_cond_wait_timed(os0sync.c:171),os_event_wait_time_low(os0sync.c:171),srv_lock_timeout_thread(srv0srv.c:2333),start_thread(libpthread.so.0),clone(libc.so.6) 10 pthread_cond_timedwait,os_cond_wait_timed(os0sync.c:171),os_event_wait_time_low(os0sync.c:171),srv_error_monitor_thread(srv0srv.c:2516),start_thread(libpthread.so.0),clone(libc.so.6) 10 poll(libc.so.6),handle_connections_sockets(mysqld.cc:5179),mysqld_main(mysqld.cc:4625),__libc_start_main(libc.so.6),_start 7 fsync(libpthread.so.0),os_file_fsync(os0file.c:2073),os_file_flush_func(os0file.c:2073),pfs_os_file_flush_func(os0file.ic:381),fil_flush(os0file.ic:381),log_write_up_to(log0log.c:1514),trx_commit_complete_for_mysql(trx0trx.c:1724),innobase_commit(ha_innodb.cc:2818),ha_commit_one_phase(handler.cc:1393),ha_commit_trans(handler.cc:1327),trans_commit_stmt(transaction.cc:299),mysql_execute_command(sql_parse.cc:4485),mysql_parse(sql_parse.cc:5627),Query_log_event::do_apply_event(log_event.cc:3472),apply_event(log_event.h:1165),apply_event_and_update_pos(log_event.h:1165),exec_relay_log_event(slave.cc:2552),handle_slave_sql(slave.cc:3377),start_thread(libpthread.so.0),clone(libc.so.6) 1 memcpy(libc.so.6),Log_event::read_log_event(log_event.cc:1188),next_event(slave.cc:4614),exec_relay_log_event(slave.cc:4614),handle_slave_sql(slave.cc:3377),start_thread(libpthread.so.0),clone(libc.so.6) 1 Log_event::do_shall_skip(libc.so.6),shall_skip(log_event.h:1190),apply_event_and_update_pos(log_event.h:1190),exec_relay_log_event(slave.cc:2552),handle_slave_sql(slave.cc:3377),start_thread(libpthread.so.0),clone(libc.so.6) 1 lex_one_token(libc.so.6),MYSQLlex(sql_lex.cc:889),MYSQLparse(sql_yacc.cc:16806),parse_sql(sql_parse.cc:7365),mysql_parse(sql_parse.cc:5582),Query_log_event::do_apply_event(log_event.cc:3472),apply_event(log_event.h:1165),apply_event_and_update_pos(log_event.h:1165),exec_relay_log_event(slave.cc:2552),handle_slave_sql(slave.cc:3377),start_thread(libpthread.so.0),clone(libc.so.6)
[25 May 2013 4:45]
Jervin R
Repeatable on 5.5.13, 5.5.24, 5.6.11
[25 May 2013 4:52]
Jervin R
I also tested with both slaves 5.1.68 and 5.1.69 just to make sure its not IO, and both are able to catchup without problems: [revin@forge msb]$ for t in {1..360}; do echo $(date +%Y-%m-%d_%H_%M_%S) && ( for s in 5169 5168; do echo "$s $(sb $s -e 'show slave status \G'|grep -Ei 'read_master_log_pos|exec_master_log_pos|seconds'|xargs)"; done ) && sleep 2; done 2013-05-25_00_51_20 5169 Read_Master_Log_Pos: 197711402 Exec_Master_Log_Pos: 35565992 Seconds_Behind_Master: 11 5168 Read_Master_Log_Pos: 197711402 Exec_Master_Log_Pos: 30511490 Seconds_Behind_Master: 11 2013-05-25_00_51_22 5169 Read_Master_Log_Pos: 197711402 Exec_Master_Log_Pos: 97201230 Seconds_Behind_Master: 11 5168 Read_Master_Log_Pos: 197711402 Exec_Master_Log_Pos: 84266626 Seconds_Behind_Master: 12 2013-05-25_00_51_24 5169 Read_Master_Log_Pos: 197711402 Exec_Master_Log_Pos: 158789902 Seconds_Behind_Master: 11 5168 Read_Master_Log_Pos: 197711402 Exec_Master_Log_Pos: 138285706 Seconds_Behind_Master: 12 2013-05-25_00_51_26 5169 Read_Master_Log_Pos: 197711402 Exec_Master_Log_Pos: 197711402 Seconds_Behind_Master: 0 5168 Read_Master_Log_Pos: 197711402 Exec_Master_Log_Pos: 192343463 Seconds_Behind_Master: 13 2013-05-25_00_51_29 5169 Read_Master_Log_Pos: 197711402 Exec_Master_Log_Pos: 197711402 Seconds_Behind_Master: 0 5168 Read_Master_Log_Pos: 197711402 Exec_Master_Log_Pos: 197711402 Seconds_Behind_Master: 0 ^C [revin@forge msb]$
[1 Jun 2013 3:36]
Jervin R
Ignore my last 2 comments, apparently, mysqlslap does not have its table definitions explicitly set, so it was MyISAM on 5.1 and InnoDB on 5.5 causing major replication differences. Running the same tests with default-storage-engine=InnoDB yields the same 5.5 numbers to 5.1 slaves.