Bug #44360 Replication failed
Submitted: 20 Apr 2009 7:34 Modified: 18 Jul 2011 8:47
Reporter: Nick Chang Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.1.33, 5.1.37, 5.1.39, 5.1.49 OS:Any (CentOS 4.6, windows)
Assigned to: Sven Sandberg
Tags: valgrind
Triage: Triaged: D1 (Critical)

[20 Apr 2009 7:34] Nick Chang
Description:
Every SAT morning, replication always replication failed.

Why i can't execute Delete_rows event??

mysql> show slave status \G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 10.10.20.146
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000030
          Read_Master_Log_Pos: 57229867
               Relay_Log_File: db2-relay-bin.000009
                Relay_Log_Pos: 9835212
        Relay_Master_Log_File: mysql-bin.000030
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1610
                   Last_Error: Could not execute Delete_rows event on table fetsport.I_MLB_BOXSCORE_FINISH; Corrupted replication event was detected, Error_code: 1610; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000030, end_log_pos 14355998
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 14355418
              Relay_Log_Space: 52712006
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1610
               Last_SQL_Error: Could not execute Delete_rows event on table fetsport.I_MLB_BOXSCORE_FINISH; Corrupted replication event was detected, Error_code: 1610; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.000030, end_log_pos 14355998
1 row in set (0.00 sec)

ERROR:
No query specified

mysql> show variables like '%binlog%';
+--------------------------------+------------+
| Variable_name                  | Value      |
+--------------------------------+------------+
| binlog_cache_size              | 32768      |
| binlog_format                  | MIXED      |
| innodb_locks_unsafe_for_binlog | OFF        |
| max_binlog_cache_size          | 4294967295 |
| max_binlog_size                | 1073741824 |
| sync_binlog                    | 1          |
+--------------------------------+------------+

How to repeat:
Setup New replication also display this problem.
[20 Apr 2009 9:48] Nick Chang
Master's my.cnf
[client]
port            = 3306
socket          = /var/lib/mysql/mysql.sock
default-character-set=utf8

[mysqld]
default-character-set=utf8
default-collation=utf8_general_ci
init_connect='SET NAMES utf8'

port            = 3306
socket          = /var/lib/mysql/mysql.sock
skip-locking
key_buffer = 256M
max_allowed_packet = 501M
table_cache = 256
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size= 16M
thread_concurrency = 8
log-bin=mysql-bin
server-id       = 1
old_passwords=1
max_connections=1000
long_query_time = 30
slow_query_log = 1
slow_query_log_file = slow_queries.log
log-bin-trust-function-creators = 1
skip-name-resolve
relay-log=db1-relay-bin
sync_binlog=1
binlog_format=mixed
[mysql]
no-auto-rehash

[isamchk]
key_buffer = 128M
sort_buffer_size = 128M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer = 128M
sort_buffer_size = 128M
read_buffer = 2M
write_buffer = 2M

[mysqld_safe]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid

slave's my.cnf
[client]
port            = 3306
socket          = /var/lib/mysql/mysql.sock
default-character-set=utf8

[mysqld]
default-character-set=utf8
default-collation=utf8_general_ci
init_connect='SET NAMES utf8'

port            = 3306
socket          = /var/lib/mysql/mysql.sock
skip-locking
key_buffer_size = 256M
max_allowed_packet = 601M
table_open_cache = 256
sort_buffer_size = 1M
read_buffer_size = 1M
read_rnd_buffer_size = 4M
myisam_sort_buffer_size = 64M
thread_cache_size = 8
query_cache_size= 32k
query_cache_limit = 4M
thread_concurrency = 8
log-bin=mysql-bin
server-id       = 2
old_passwords=1
max_connections=500
long_query_time = 30
slow_query_log = 1
slow_query_log_file = slow_queries.log
log-bin-trust-function-creators = 1
skip-name-resolve
relay-log=db2-relay-bin
slave_skip_errors=all
sync_binlog=1
binlog_format=mixed
wait_timeout=2880000
interactive_timeout = 2880000
innodb_flush_log_at_trx_commit=2

[mysql]
no-auto-rehash

[isamchk]
key_buffer_size = 128M
sort_buffer_size = 2M
read_buffer_size = 2M

[myisamchk]
key_buffer_size = 128M
sort_buffer_size = 128M
read_buffer_size = 2M
[myisamchk]
key_buffer_size = 128M
sort_buffer_size = 128M
read_buffer_size = 2M

[mysqld_safe]
err-log=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pid
[20 Apr 2009 9:57] Nick Chang
I updating mysql data to ftp.

Please help me solve this problem.

Thanks
[21 Apr 2009 11:16] Valerii Kravchuk
Had you uploaded mysql-bin.000030 (compressed) to our FTP server already? Please, add bug number, 44360, to the file name.
[22 Apr 2009 1:47] Nick Chang
I updated mysql-bin.00030 to your ftp and rename file to bug-data-4436.tar.gz

Thanks for your support. Please help me solve this problem.

Thanks BR.
Nick
[12 Jun 2009 8:42] Susanne Ebrecht
Could it be that you replication user don't have the rights for delete?
[12 Jun 2009 8:56] Nick Chang
Hi

Your mean, i need give delete privs to replication user??

Thanks
BR 

Nick
[12 Jun 2009 9:05] Susanne Ebrecht
Yes of course. When user don't have delete privs then it is not allowed to delete.

On replication the user used on slave is the replication user.
[17 Jun 2009 9:13] Nick Chang
Hello

I add delete privs to repl then I start slave.
But it show error message then mysql crash.

090612 17:16:03 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000042' at position 97652470, relay log './db2-relay-bin.000001' position: 4
090612 17:16:03 [Note] Slave I/O thread: connected to master 'repl@10.10.20.146:3306',replication started in log 'mysql-bin.000042' at position 97652470
090612 18:26:00 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=268435456
read_buffer_size=1048576
max_used_connections=1
max_threads=500
threads_connected=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1289319 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0xa67b0b8
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0xa2b6d32c thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x21)[0x845a840]
/usr/sbin/mysqld(handle_segfault+0x358)[0x81d2a54]
/lib/tls/libpthread.so.0[0x5c4a98]
/usr/sbin/mysqld(_ZN14Rows_log_event8find_rowEPK14Relay_log_info+0x2c0)[0x828a254]
/usr/sbin/mysqld(_ZN21Update_rows_log_event11do_exec_rowEPK14Relay_log_info+0x1b)[0x828aa13]
/usr/sbin/mysqld(_ZN14Rows_log_event14do_apply_eventEPK14Relay_log_info+0x200)[0x82863cc]
/usr/sbin/mysqld(_Z26apply_event_and_update_posP9Log_eventP3THDP14Relay_log_infob+0xc9)[0x82e9d6b]
/usr/sbin/mysqld(handle_slave_sql+0x6ee)[0x82eacf0]
/lib/tls/libpthread.so.0[0x5be3cc]
/lib/tls/libc.so.6(__clone+0x5e)[0x4c21ae]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at (nil) is an invalid pointer
thd->thread_id=9
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
[1 Jul 2009 7:56] Susanne Ebrecht
How did you add the delete priviledges?
[2 Jul 2009 3:32] Nick Chang
Yes, I add the delete priviledges.
[8 Jul 2009 11:50] Susanne Ebrecht
Sorry, seems you misunderstood me.

Please explain step by step what you did to add the delete privileges.
[10 Jul 2009 9:46] Nick Chang
Hello

It's my sql command.

UPDATE `mysql`.`user` SET `Delete_priv` = 'Y' WHERE CONVERT( `user`.`Host` USING utf8 ) = '%' AND CONVERT( `user`.`User` USING utf8 ) = 'repl' LIMIT 1 ;

Thanks
[13 Jul 2009 4:26] Susanne Ebrecht
You need to give delete grant on replication user including correct host. Consider that '%' is not the correct host for the user. When it is on same machine then the host is 'localhost' otherwise it is the exact host.

SELECT user, host FROM mysql.user; will show you right host and user name for the replication user.
[13 Jul 2009 6:48] Nick Chang
Hello. 

mysql> select user,host from mysql.user;
+----------+------------------+
| user     | host             |
+----------+------------------+
| admin    | %                |
| fetsport | %                |
| readonly | %                |
| repl     | %                |
| root     | 10.10.20.147     |
| root     | 127.0.0.1        |
| root     | db1.fetsport.com |
| admin    | localhost        |
| root     | localhost        |
+----------+------------------+
9 rows in set (0.00 sec)
[16 Jul 2009 9:23] Nick Chang
I upgrade mysql to 5.1.33. But it don't solve this problem.
[13 Aug 2009 9:31] Susanne Ebrecht
Please use GRANT for adding DELETE priviledges:

http://dev.mysql.com/doc/refman/5.1/en/grant.html
[18 Aug 2009 9:27] Nick Chang
I input command 'grant delete on *.* to repl@'%';'

But I still see the same error.
Please help me to slove problem.
Thanks
Nick
[19 Aug 2009 22:37] Sveta Smirnova
Thank you for the feedback.

Please also provide full slave and master error log files. Thanks in advance.
[24 Aug 2009 1:38] Nick Chang
error log

Attachment: bug-data-44360.zip (application/x-zip-compressed, text), 51.31 KiB.

[24 Aug 2009 1:39] Nick Chang
I upload error log.
Please help me to slave this problem.
Thanks
Nick
[24 Aug 2009 7:53] Sveta Smirnova
Thank you for the feedback.

It looks like you still get crashes only Saturday mornings. In this case please upgrade to current version 5.1.37, try with it and inform us if it solves the problem.

Please also upload mysql-bin.000045 and db2-relay-bin.000002 files, because I can not see errors for previous one mysql-bin.000030.
[24 Aug 2009 10:19] Nick Chang
I already upload to MySQL's FTP.

Thanks

Nick
[24 Aug 2009 20:18] Sveta Smirnova
Thank you for the feedback.

I get following error when try to unzip the file:

$unzip bug-data-44360.zip 
Archive:  bug-data-44360.zip
  End-of-central-directory signature not found.  Either this file is not
  a zipfile, or it constitutes one disk of a multi-part archive.  In the
  latter case the central directory and zipfile comment will be found on
  the last disk(s) of this archive.

$ls -la bug-data-44360.zip 
-rw-r--r-- 1 ssmirnova ssmirnova 42795916 Aug 24 22:15 bug-data-44360.zip

Please check if archive is valid.
[25 Aug 2009 4:04] Nick Chang
Hello

Can you rename bug-data-44360.zip to bug-data-44360.tar.gz
Then you can use tar -zxvf bug-data-44360.tar.gz

Nick
[25 Aug 2009 9:04] Sveta Smirnova
Thank you for the feedback.

Verified as described.

Backtrace:

(gdb) bt
#0  0x0000003429e0b002 in pthread_kill () from /lib64/libpthread.so.0
#1  0x0000000000afa6e5 in my_write_core (sig=11) at stacktrace.c:310
#2  0x000000000068eed6 in handle_segfault (sig=11) at mysqld.cc:2552
#3  <signal handler called>
#4  0x00000000005c665a in Field::set_notnull (this=0xb0dc918, row_offset=0) at ../../sql/field.h:262
#5  0x00000000007a474d in unpack_row (rli=0xb0beda0, table=0xb0de8d8, colcnt=26, row_data=0xafecdb8 "\024", cols=0xae7fa18, row_end=0xae7fab0, master_reclength=0xae7fa60) at rpl_record.cc:231
#6  0x00000000007a387a in Rows_log_event::unpack_current_row (this=0xae7f9b8, rli=0xb0beda0) at log_event.h:3549
#7  0x00000000007a0ac5 in Rows_log_event::find_row (this=0xae7f9b8, rli=0xb0beda0) at log_event.cc:8746
#8  0x00000000007a1c1d in Update_rows_log_event::do_exec_row (this=0xae7f9b8, rli=0xb0beda0) at log_event.cc:9182
#9  0x000000000079c53e in Rows_log_event::do_apply_event (this=0xae7f9b8, rli=0xb0beda0) at log_event.cc:7380
#10 0x00000000008453d8 in Log_event::apply_event (this=0xae7f9b8, rli=0xb0beda0) at log_event.h:1059
#11 0x000000000083f85e in apply_event_and_update_pos (ev=0xae7f9b8, thd=0xae04098, rli=0xb0beda0, skip=false) at slave.cc:2164
#12 0x000000000089b3cf in mysql_client_binlog_statement (thd=0xae04098) at sql_binlog.cc:212
#13 0x00000000006a8a99 in mysql_execute_command (thd=0xae04098) at sql_parse.cc:4845
#14 0x00000000006ab691 in mysql_parse (thd=0xae04098, 
    inBuf=0xb038688 "BINLOG '\nQA2GShMBAAAAcAAAAAJ0", 'A' <repeats 16 times>, "CGZldHNwb3J0AA9JX01MQl9OQkFfRVZFTlQAFQgP\nAw8PDwMPDw8PDw8PDw8MDw8PByAwAB4AAwAwAB4AaQAPAAYABgAsAQ8ALAFIAIcAPADcBf7/Dw==\nQA2GShMBAAAAXQAAAF90", 'A' <repeats 16 times>, "C"..., length=1366, found_semicolon=0x452786c0) at sql_parse.cc:5931
#15 0x000000000069dfa5 in dispatch_command (command=COM_QUERY, thd=0xae04098, 
    packet=0xafd30e9 "BINLOG '\nQA2GShMBAAAAcAAAAAJ0", 'A' <repeats 16 times>, "CGZldHNwb3J0AA9JX01MQl9OQkFfRVZFTlQAFQgP\nAw8PDwMPDw8PDw8PDw8MDw8PByAwAB4AAwAwAB4AaQAPAAYABgAsAQ8ALAFIAIcAPADcBf7/Dw==\nQA2GShMBAAAAXQAAAF90", 'A' <repeats 16 times>, "C"..., packet_length=1366) at sql_parse.cc:1213
#16 0x000000000069d02b in do_command (thd=0xae04098) at sql_parse.cc:854
#17 0x000000000069b4bd in handle_one_connection (arg=0xae04098) at sql_connect.cc:1127
#18 0x0000003429e061b5 in #16 0x000000000069d02b in do_command (thd=0xae04098) at sql_parse.cc:854
#17 0x000000000069b4bd in handle_one_connection (arg=0xae04098) at sql_connect.cc:1127
#18 0x0000003429e061b5 in start_thread () from /lib64/libpthread.so.0
#19 0x00000034292cd39d in clone () from /lib64/libc.so.6
#20 0x0000000000000000 in ?? ()

How to repeat is in private comment.
[26 Aug 2009 7:40] Shane Bester
stack trace with my reduced testcase, on 5.1.39-debug:

mysqld.exe!LeadDownVec()[memcpy.asm:485]
mysqld.exe!Field_varstring::unpack()[field.cc:7347]
mysqld.exe!unpack_row()[rpl_record.cc:242]
mysqld.exe!Rows_log_event::unpack_current_row()[log_event.h:3549]
mysqld.exe!Rows_log_event::find_row()[log_event.cc:8746]
mysqld.exe!Update_rows_log_event::do_exec_row()[log_event.cc:9182]
mysqld.exe!Rows_log_event::do_apply_event()[log_event.cc:7380]
mysqld.exe!Log_event::apply_event()[log_event.h:1059]
mysqld.exe!apply_event_and_update_pos()[slave.cc:2164]
mysqld.exe!mysql_client_binlog_statement()[sql_binlog.cc:212]
mysqld.exe!mysql_execute_command()[sql_parse.cc:4845]
mysqld.exe!mysql_parse()[sql_parse.cc:5931]
mysqld.exe!dispatch_command()[sql_parse.cc:1213]
mysqld.exe!do_command()[sql_parse.cc:854]
mysqld.exe!handle_one_connection()[sql_connect.cc:1127]
mysqld.exe!pthread_start()[my_winthread.c:87]
mysqld.exe!_callthreadstart()[thread.c:293]
mysqld.exe!_threadstart()[thread.c:277]
kernel32.dll!FlsSetValue()
[26 Aug 2009 8:04] Shane Bester
on linux my reduced testcase caused 5.1.39-debug valgrind errors on both binlog statements.

Attachment: bug44360_reduced_testcase_valgrind_errors.sql (text/x-sql), 31.22 KiB.

[26 Aug 2009 8:14] Sveta Smirnova
Reduced test case crashes Linux server for me as well, but on Mac it fails with strange error:

ERROR 1146 (42S02) at line 113: Table 'fetsport.S_GAME' doesn't exist
[18 Jun 2010 21:35] Roel Van de Paar
See also bug #46597
[18 Jun 2010 21:45] Roel Van de Paar
Related to bug #39648?
[18 Jun 2010 21:49] Roel Van de Paar
Related to bug #47322?
[2 Aug 2010 14:48] Shane Bester
this is caused by a corrupted binlog.  how the binlog got into this state is
another story, but many table id's are all zero, e.g:
Table_map: `test`.`tt1` mapped to number 0
[2 Sep 2010 13:12] Shane Bester
i think some DDL is done on the master and this trashes the table_map id's in the binlogs. can anybody confirm if this is the case ?
[31 May 2011 12:48] Shane Bester
still crashes 5.1.57 and 5.5.12 using the testcase: http://bugs.mysql.com/file.php?id=12774
[18 Jul 2011 8:47] Jon Stephens
Documented fix in the 5.1.59, 5.5.16, and 5.6.3 changelogs, as follows:

        Processing of corrupted table map events could cause the server
        to crash. This was especially likely if the events mapped
        different tables to the same identifier, such as could happen
        due to Bug#56226.

        Now, before applying a table map event, the server checks
        whether the table has already been mapped with different
        settings, and if so, an error is raised and the slave SQL thread
        stops. (If it has been mapped with the same settings, or if the
        table is set to be ignored by filtering rules, there is no
        change in behavior: the event is skipped and identifiers are
        not checked.

Closed.