Bug #44360 | Replication failed | ||
---|---|---|---|
Submitted: | 20 Apr 2009 7:34 | Modified: | 18 Jul 2011 8:47 |
Reporter: | Nick Chang | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S1 (Critical) |
Version: | 5.1.33, 5.1.37, 5.1.39, 5.1.49 | OS: | Any (CentOS 4.6, windows) |
Assigned to: | Sven Sandberg | CPU Architecture: | Any |
Tags: | valgrind |
[20 Apr 2009 7:34]
Nick Chang
[20 Apr 2009 9:48]
Nick Chang
Master's my.cnf [client] port = 3306 socket = /var/lib/mysql/mysql.sock default-character-set=utf8 [mysqld] default-character-set=utf8 default-collation=utf8_general_ci init_connect='SET NAMES utf8' port = 3306 socket = /var/lib/mysql/mysql.sock skip-locking key_buffer = 256M max_allowed_packet = 501M table_cache = 256 sort_buffer_size = 1M read_buffer_size = 1M read_rnd_buffer_size = 4M myisam_sort_buffer_size = 64M thread_cache_size = 8 query_cache_size= 16M thread_concurrency = 8 log-bin=mysql-bin server-id = 1 old_passwords=1 max_connections=1000 long_query_time = 30 slow_query_log = 1 slow_query_log_file = slow_queries.log log-bin-trust-function-creators = 1 skip-name-resolve relay-log=db1-relay-bin sync_binlog=1 binlog_format=mixed [mysql] no-auto-rehash [isamchk] key_buffer = 128M sort_buffer_size = 128M read_buffer = 2M write_buffer = 2M [myisamchk] key_buffer = 128M sort_buffer_size = 128M read_buffer = 2M write_buffer = 2M [mysqld_safe] err-log=/var/log/mysqld.log pid-file=/var/run/mysqld/mysqld.pid slave's my.cnf [client] port = 3306 socket = /var/lib/mysql/mysql.sock default-character-set=utf8 [mysqld] default-character-set=utf8 default-collation=utf8_general_ci init_connect='SET NAMES utf8' port = 3306 socket = /var/lib/mysql/mysql.sock skip-locking key_buffer_size = 256M max_allowed_packet = 601M table_open_cache = 256 sort_buffer_size = 1M read_buffer_size = 1M read_rnd_buffer_size = 4M myisam_sort_buffer_size = 64M thread_cache_size = 8 query_cache_size= 32k query_cache_limit = 4M thread_concurrency = 8 log-bin=mysql-bin server-id = 2 old_passwords=1 max_connections=500 long_query_time = 30 slow_query_log = 1 slow_query_log_file = slow_queries.log log-bin-trust-function-creators = 1 skip-name-resolve relay-log=db2-relay-bin slave_skip_errors=all sync_binlog=1 binlog_format=mixed wait_timeout=2880000 interactive_timeout = 2880000 innodb_flush_log_at_trx_commit=2 [mysql] no-auto-rehash [isamchk] key_buffer_size = 128M sort_buffer_size = 2M read_buffer_size = 2M [myisamchk] key_buffer_size = 128M sort_buffer_size = 128M read_buffer_size = 2M [myisamchk] key_buffer_size = 128M sort_buffer_size = 128M read_buffer_size = 2M [mysqld_safe] err-log=/var/log/mysqld.log pid-file=/var/run/mysqld/mysqld.pid
[20 Apr 2009 9:57]
Nick Chang
I updating mysql data to ftp. Please help me solve this problem. Thanks
[21 Apr 2009 11:16]
Valeriy Kravchuk
Had you uploaded mysql-bin.000030 (compressed) to our FTP server already? Please, add bug number, 44360, to the file name.
[22 Apr 2009 1:47]
Nick Chang
I updated mysql-bin.00030 to your ftp and rename file to bug-data-4436.tar.gz Thanks for your support. Please help me solve this problem. Thanks BR. Nick
[12 Jun 2009 8:42]
Susanne Ebrecht
Could it be that you replication user don't have the rights for delete?
[12 Jun 2009 8:56]
Nick Chang
Hi Your mean, i need give delete privs to replication user?? Thanks BR Nick
[12 Jun 2009 9:05]
Susanne Ebrecht
Yes of course. When user don't have delete privs then it is not allowed to delete. On replication the user used on slave is the replication user.
[17 Jun 2009 9:13]
Nick Chang
Hello I add delete privs to repl then I start slave. But it show error message then mysql crash. 090612 17:16:03 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000042' at position 97652470, relay log './db2-relay-bin.000001' position: 4 090612 17:16:03 [Note] Slave I/O thread: connected to master 'repl@10.10.20.146:3306',replication started in log 'mysql-bin.000042' at position 97652470 090612 18:26:00 - mysqld got signal 11 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=268435456 read_buffer_size=1048576 max_used_connections=1 max_threads=500 threads_connected=0 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1289319 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. thd: 0xa67b0b8 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0xa2b6d32c thread_stack 0x30000 /usr/sbin/mysqld(my_print_stacktrace+0x21)[0x845a840] /usr/sbin/mysqld(handle_segfault+0x358)[0x81d2a54] /lib/tls/libpthread.so.0[0x5c4a98] /usr/sbin/mysqld(_ZN14Rows_log_event8find_rowEPK14Relay_log_info+0x2c0)[0x828a254] /usr/sbin/mysqld(_ZN21Update_rows_log_event11do_exec_rowEPK14Relay_log_info+0x1b)[0x828aa13] /usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0x200)[0x82863cc] /usr/sbin/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14Relay_log_infob+0xc9)[0x82e9d6b] /usr/sbin/mysqld(handle_slave_sql+0x6ee)[0x82eacf0] /lib/tls/libpthread.so.0[0x5be3cc] /lib/tls/libc.so.6(__clone+0x5e)[0x4c21ae] Trying to get some variables. Some pointers may be invalid and cause the dump to abort... thd->query at (nil) is an invalid pointer thd->thread_id=9 thd->killed=NOT_KILLED The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash.
[1 Jul 2009 7:56]
Susanne Ebrecht
How did you add the delete priviledges?
[2 Jul 2009 3:32]
Nick Chang
Yes, I add the delete priviledges.
[8 Jul 2009 11:50]
Susanne Ebrecht
Sorry, seems you misunderstood me. Please explain step by step what you did to add the delete privileges.
[10 Jul 2009 9:46]
Nick Chang
Hello It's my sql command. UPDATE `mysql`.`user` SET `Delete_priv` = 'Y' WHERE CONVERT( `user`.`Host` USING utf8 ) = '%' AND CONVERT( `user`.`User` USING utf8 ) = 'repl' LIMIT 1 ; Thanks
[13 Jul 2009 4:26]
Susanne Ebrecht
You need to give delete grant on replication user including correct host. Consider that '%' is not the correct host for the user. When it is on same machine then the host is 'localhost' otherwise it is the exact host. SELECT user, host FROM mysql.user; will show you right host and user name for the replication user.
[13 Jul 2009 6:48]
Nick Chang
Hello. mysql> select user,host from mysql.user; +----------+------------------+ | user | host | +----------+------------------+ | admin | % | | fetsport | % | | readonly | % | | repl | % | | root | 10.10.20.147 | | root | 127.0.0.1 | | root | db1.fetsport.com | | admin | localhost | | root | localhost | +----------+------------------+ 9 rows in set (0.00 sec)
[16 Jul 2009 9:23]
Nick Chang
I upgrade mysql to 5.1.33. But it don't solve this problem.
[13 Aug 2009 9:31]
Susanne Ebrecht
Please use GRANT for adding DELETE priviledges: http://dev.mysql.com/doc/refman/5.1/en/grant.html
[18 Aug 2009 9:27]
Nick Chang
I input command 'grant delete on *.* to repl@'%';' But I still see the same error. Please help me to slove problem. Thanks Nick
[19 Aug 2009 22:37]
Sveta Smirnova
Thank you for the feedback. Please also provide full slave and master error log files. Thanks in advance.
[24 Aug 2009 1:38]
Nick Chang
error log
Attachment: bug-data-44360.zip (application/x-zip-compressed, text), 51.31 KiB.
[24 Aug 2009 1:39]
Nick Chang
I upload error log. Please help me to slave this problem. Thanks Nick
[24 Aug 2009 7:53]
Sveta Smirnova
Thank you for the feedback. It looks like you still get crashes only Saturday mornings. In this case please upgrade to current version 5.1.37, try with it and inform us if it solves the problem. Please also upload mysql-bin.000045 and db2-relay-bin.000002 files, because I can not see errors for previous one mysql-bin.000030.
[24 Aug 2009 10:19]
Nick Chang
I already upload to MySQL's FTP. Thanks Nick
[24 Aug 2009 20:18]
Sveta Smirnova
Thank you for the feedback. I get following error when try to unzip the file: $unzip bug-data-44360.zip Archive: bug-data-44360.zip End-of-central-directory signature not found. Either this file is not a zipfile, or it constitutes one disk of a multi-part archive. In the latter case the central directory and zipfile comment will be found on the last disk(s) of this archive. $ls -la bug-data-44360.zip -rw-r--r-- 1 ssmirnova ssmirnova 42795916 Aug 24 22:15 bug-data-44360.zip Please check if archive is valid.
[25 Aug 2009 4:04]
Nick Chang
Hello Can you rename bug-data-44360.zip to bug-data-44360.tar.gz Then you can use tar -zxvf bug-data-44360.tar.gz Nick
[25 Aug 2009 9:04]
Sveta Smirnova
Thank you for the feedback. Verified as described. Backtrace: (gdb) bt #0 0x0000003429e0b002 in pthread_kill () from /lib64/libpthread.so.0 #1 0x0000000000afa6e5 in my_write_core (sig=11) at stacktrace.c:310 #2 0x000000000068eed6 in handle_segfault (sig=11) at mysqld.cc:2552 #3 <signal handler called> #4 0x00000000005c665a in Field::set_notnull (this=0xb0dc918, row_offset=0) at ../../sql/field.h:262 #5 0x00000000007a474d in unpack_row (rli=0xb0beda0, table=0xb0de8d8, colcnt=26, row_data=0xafecdb8 "\024", cols=0xae7fa18, row_end=0xae7fab0, master_reclength=0xae7fa60) at rpl_record.cc:231 #6 0x00000000007a387a in Rows_log_event::unpack_current_row (this=0xae7f9b8, rli=0xb0beda0) at log_event.h:3549 #7 0x00000000007a0ac5 in Rows_log_event::find_row (this=0xae7f9b8, rli=0xb0beda0) at log_event.cc:8746 #8 0x00000000007a1c1d in Update_rows_log_event::do_exec_row (this=0xae7f9b8, rli=0xb0beda0) at log_event.cc:9182 #9 0x000000000079c53e in Rows_log_event::do_apply_event (this=0xae7f9b8, rli=0xb0beda0) at log_event.cc:7380 #10 0x00000000008453d8 in Log_event::apply_event (this=0xae7f9b8, rli=0xb0beda0) at log_event.h:1059 #11 0x000000000083f85e in apply_event_and_update_pos (ev=0xae7f9b8, thd=0xae04098, rli=0xb0beda0, skip=false) at slave.cc:2164 #12 0x000000000089b3cf in mysql_client_binlog_statement (thd=0xae04098) at sql_binlog.cc:212 #13 0x00000000006a8a99 in mysql_execute_command (thd=0xae04098) at sql_parse.cc:4845 #14 0x00000000006ab691 in mysql_parse (thd=0xae04098, inBuf=0xb038688 "BINLOG '\nQA2GShMBAAAAcAAAAAJ0", 'A' <repeats 16 times>, "CGZldHNwb3J0AA9JX01MQl9OQkFfRVZFTlQAFQgP\nAw8PDwMPDw8PDw8PDw8MDw8PByAwAB4AAwAwAB4AaQAPAAYABgAsAQ8ALAFIAIcAPADcBf7/Dw==\nQA2GShMBAAAAXQAAAF90", 'A' <repeats 16 times>, "C"..., length=1366, found_semicolon=0x452786c0) at sql_parse.cc:5931 #15 0x000000000069dfa5 in dispatch_command (command=COM_QUERY, thd=0xae04098, packet=0xafd30e9 "BINLOG '\nQA2GShMBAAAAcAAAAAJ0", 'A' <repeats 16 times>, "CGZldHNwb3J0AA9JX01MQl9OQkFfRVZFTlQAFQgP\nAw8PDwMPDw8PDw8PDw8MDw8PByAwAB4AAwAwAB4AaQAPAAYABgAsAQ8ALAFIAIcAPADcBf7/Dw==\nQA2GShMBAAAAXQAAAF90", 'A' <repeats 16 times>, "C"..., packet_length=1366) at sql_parse.cc:1213 #16 0x000000000069d02b in do_command (thd=0xae04098) at sql_parse.cc:854 #17 0x000000000069b4bd in handle_one_connection (arg=0xae04098) at sql_connect.cc:1127 #18 0x0000003429e061b5 in #16 0x000000000069d02b in do_command (thd=0xae04098) at sql_parse.cc:854 #17 0x000000000069b4bd in handle_one_connection (arg=0xae04098) at sql_connect.cc:1127 #18 0x0000003429e061b5 in start_thread () from /lib64/libpthread.so.0 #19 0x00000034292cd39d in clone () from /lib64/libc.so.6 #20 0x0000000000000000 in ?? () How to repeat is in private comment.
[26 Aug 2009 7:40]
MySQL Verification Team
stack trace with my reduced testcase, on 5.1.39-debug: mysqld.exe!LeadDownVec()[memcpy.asm:485] mysqld.exe!Field_varstring::unpack()[field.cc:7347] mysqld.exe!unpack_row()[rpl_record.cc:242] mysqld.exe!Rows_log_event::unpack_current_row()[log_event.h:3549] mysqld.exe!Rows_log_event::find_row()[log_event.cc:8746] mysqld.exe!Update_rows_log_event::do_exec_row()[log_event.cc:9182] mysqld.exe!Rows_log_event::do_apply_event()[log_event.cc:7380] mysqld.exe!Log_event::apply_event()[log_event.h:1059] mysqld.exe!apply_event_and_update_pos()[slave.cc:2164] mysqld.exe!mysql_client_binlog_statement()[sql_binlog.cc:212] mysqld.exe!mysql_execute_command()[sql_parse.cc:4845] mysqld.exe!mysql_parse()[sql_parse.cc:5931] mysqld.exe!dispatch_command()[sql_parse.cc:1213] mysqld.exe!do_command()[sql_parse.cc:854] mysqld.exe!handle_one_connection()[sql_connect.cc:1127] mysqld.exe!pthread_start()[my_winthread.c:87] mysqld.exe!_callthreadstart()[thread.c:293] mysqld.exe!_threadstart()[thread.c:277] kernel32.dll!FlsSetValue()
[26 Aug 2009 8:04]
MySQL Verification Team
on linux my reduced testcase caused 5.1.39-debug valgrind errors on both binlog statements.
Attachment: bug44360_reduced_testcase_valgrind_errors.sql (text/x-sql), 31.22 KiB.
[26 Aug 2009 8:14]
Sveta Smirnova
Reduced test case crashes Linux server for me as well, but on Mac it fails with strange error: ERROR 1146 (42S02) at line 113: Table 'fetsport.S_GAME' doesn't exist
[18 Jun 2010 21:35]
Roel Van de Paar
See also bug #46597
[18 Jun 2010 21:45]
Roel Van de Paar
Related to bug #39648?
[18 Jun 2010 21:49]
Roel Van de Paar
Related to bug #47322?
[2 Aug 2010 14:48]
MySQL Verification Team
this is caused by a corrupted binlog. how the binlog got into this state is another story, but many table id's are all zero, e.g: Table_map: `test`.`tt1` mapped to number 0
[2 Sep 2010 13:12]
MySQL Verification Team
i think some DDL is done on the master and this trashes the table_map id's in the binlogs. can anybody confirm if this is the case ?
[31 May 2011 12:48]
MySQL Verification Team
still crashes 5.1.57 and 5.5.12 using the testcase: http://bugs.mysql.com/file.php?id=12774
[18 Jul 2011 8:47]
Jon Stephens
Documented fix in the 5.1.59, 5.5.16, and 5.6.3 changelogs, as follows: Processing of corrupted table map events could cause the server to crash. This was especially likely if the events mapped different tables to the same identifier, such as could happen due to Bug#56226. Now, before applying a table map event, the server checks whether the table has already been mapped with different settings, and if so, an error is raised and the slave SQL thread stops. (If it has been mapped with the same settings, or if the table is set to be ignored by filtering rules, there is no change in behavior: the event is skipped and identifiers are not checked. Closed.