Bug #48091 valgrind errors when slave has double not null and master has double null
Submitted: 15 Oct 2009 17:37 Modified: 15 Mar 2010 4:52
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Row Based Replication ( RBR ) Severity:S1 (Critical)
Version:5.1.39 OS:Any
Assigned to: Alfranio Correia CPU Architecture:Any
Tags: regression, valgrind
Triage: Triaged: D2 (Serious)

[15 Oct 2009 17:37] Shane Bester
Description:
when a table on master contains nullable double field while the slave contains
not null able double field, the slave reads invalid memory according to valgrind.

One such sample (will attach entire error log as file)

Invalid read of size 1
at memcpy (mc_replace_strmem.c:482)
by Field_real::unpack (field.cc:1498)
by unpack_row (rpl_record.cc:242)
by Rows_log_event::write_row (log_event.h:3549)
by Write_rows_log_event::do_exec_row (log_event.cc:8623)
by Rows_log_event::do_apply_event (log_event.cc:7407)
by apply_event_and_update_pos (log_event.h:1059)
by handle_slave_sql (slave.cc:2281)
by start_thread (in /lib/libpthread-2.5.so)
by clone (in /lib/libc-2.5.so)

How to repeat:
setup a working row-based replication while running the slave under valgrind.
then run the following on the respective connections:

#on the master:
drop table if exists `t1`;
create table `t1` (`col000` double default null) engine=myisam;

#on the slave:
drop table if exists `t1`;
create table `t1` (`col000` double not null) engine=myisam;

#on the master:
insert into t1 values (null);

Suggested fix:
don't read invalid memory
[15 Oct 2009 17:38] Shane Bester
complete valgrind output from 5.1.39

Attachment: bug48091_valgrind_5.1.39_output.txt (text/plain), 23.09 KiB.

[19 Oct 2009 5:51] Roel Van de Paar
Showing the error twice on 5.1.39 + Valgrind information after shutdown

Attachment: valgrind_reads.txt (text/plain), 11.69 KiB.

[19 Oct 2009 5:51] Roel Van de Paar
Verified as described.
[19 Oct 2009 6:32] Roel Van de Paar
5.1.36 does not show the issue, but fails correctly on the slave:

             Slave_IO_Running: Yes
            Slave_SQL_Running: No
                   Last_Errno: 1048
                   Last_Error: Error 'Column 'col000' cannot be null' on query. Default database: 'roelt'. Query: 'insert into t4 values (null)'
[19 Oct 2009 6:39] Roel Van de Paar
Notice that 5.1.39, even after the error, keeps running:

             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
                   Last_Errno: 0
                   Last_Error:
[20 Oct 2009 16:53] Luis Soares
May be related: BUG#43789.

Also, while testing this I hit an assertion (in MySQL Version 5.1.41):

TEST CASE
=========

-- source include/master-slave.inc
-- source include/have_binlog_format_row.inc

-- connection slave

SET SQL_LOG_BIN=0;
create table `t1` (c1 double not null) engine=myisam;
SET SQL_LOG_BIN=1;

-- connection master

SET SQL_LOG_BIN=0;
create table `t1` (c1 double default null) engine=myisam;
SET SQL_LOG_BIN=1;

insert into t1 values (null); 
DROP TABLE t1; 
-- sync_slave_with_master

ASSERTION
=========

Thread 1 (process 18289):
#0  0x00007f97578601f6 in pthread_kill () from /lib/libpthread.so.0
#1  0x0000000000b30e47 in my_write_core (sig=6) at stacktrace.c:310
#2  0x00000000006bd11e in handle_segfault (sig=6) at mysqld.cc:2570
#3  <signal handler called>
#4  0x00007f9756a21fb5 in raise () from /lib/libc.so.6
#5  0x00007f9756a23bc3 in abort () from /lib/libc.so.6
#6  0x00007f9756a1af09 in __assert_fail () from /lib/libc.so.6
#7  0x00000000007d5c60 in Rows_log_event::unpack_current_row (this=0x1aebda0, rli=0x1aace30) at log_event.h:3552
#8  0x00000000007bfab0 in Rows_log_event::write_row (this=0x1aebda0, rli=0x1aace30, overwrite=false) at log_event.cc:8459
#9  0x00000000007c06c3 in Write_rows_log_event::do_exec_row (this=0x1aebda0, rli=0x1aace30) at log_event.cc:8645
#10 0x00000000007cafe2 in Rows_log_event::do_apply_event (this=0x1aebda0, rli=0x1aace30) at log_event.cc:7423
#11 0x0000000000882670 in Log_event::apply_event (this=0x1aebda0, rli=0x1aace30) at log_event.h:1059
#12 0x0000000000878665 in apply_event_and_update_pos (ev=0x1aebda0, thd=0x1ae8e90, rli=0x1aace30) at slave.cc:2136
#13 0x000000000087b467 in exec_relay_log_event (thd=0x1ae8e90, rli=0x1aace30) at slave.cc:2277
#14 0x000000000087c2fe in handle_slave_sql (arg=0x1aaba40) at slave.cc:3023
#15 0x00007f975785b3ba in start_thread () from /lib/libpthread.so.0
#16 0x00007f9756ad4fcd in clone () from /lib/libc.so.6
#17 0x0000000000000000 in ?? ()
[22 Oct 2009 0:10] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/87702

3188 Alfranio Correia	2009-10-22
      BUG#48091 valgrind errors when slave has double not null and master has double null
      
      Backporting BUG#38173 Field doesn't have a default value with row-based replication
[22 Oct 2009 0:11] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/87703

3188 Alfranio Correia	2009-10-22
      BUG#48091 valgrind errors when slave has double not null and master has double null
      
      Backporting BUG#38173 to mysql-5.1-bugteam
      
      The reason of  the bug was incompatibile with the master side behaviour.
      INSERT query on the master is allowed to insert into a table without specifying
      values of DEFAULT-less fields if sql_mode is not strict.
                        
      Fixed with checking sql_mode by the sql thread to decide how to react.
      Non-strict sql_mode should allow Write_rows event to complete.
                        
      todo: warnings can be shown via show slave status, still this is a 
      separate rather general issue how to show warnings for the slave threads.
[22 Oct 2009 0:16] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/87704

3189 Alfranio Correia	2009-10-22
      BUG#48091 valgrind errors when slave has double not null and master has double null
            
      Backporting BUG#43789 to mysql-5.1-bugteam
                                    
      The replication was generating corrupted data, warning messages on Valgrind
      and aborting on debug mode while replicating a "null" to "not null" field.
      Specifically the unpack_row routine, was considering the slave's table
      definition and trying to retrieve a field value, where there was nothing to be
      retrieved, ignoring the fact that the value was defined as "null" by the master.
                                    
      To fix the problem, we proceed as follows:
                                    
      1 - If it is not STRICT sql_mode, implicit default values are used, regardless
      if it is multi-row or single-row statement.
                                    
      2 - However, if it is STRICT mode, then a we do what follows:
                                    
      2.1 If it is a transactional engine, we do a rollback on the first NULL that is
      to be set into a NOT NULL column and return an error.
                                    
      2.2 If it is a non-transactional engine and it is the first row to be inserted
      with multi-row, we also return the error. Otherwise, we proceed with the
      execution, use implicit default values and print out warning messages.
                              
      Unfortunately, the current patch cannot mimic the behavior showed by the master
      for updates on multi-tables and multi-row inserts. This happens because such
      statements are unfolded in different row events. For instance, considering the
      following updates and strict mode:
                              
      (master)
      create table t1 (a int);
      create table t2 (a int not null);
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
                              
      t1 would have (10) and t2 would have (0) as this would be handled as a
      multi-row update. On the other hand, if we had the following updates:
                              
      (master)
      create table t1 (a int);
      create table t2 (a int);
                              
      (slave)
      create table t1 (a int);
      create table t2 (a int not null);
                              
      (master)
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
                              
      On the master t1 would have (10) and t2 would have (NULL). On
      the slave, t1 would have (10) but the update on t1 would fail.
[22 Oct 2009 0:20] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/87705

3190 Alfranio Correia	2009-10-22
      BUG#48091 valgrind errors when slave has double not null and master has double null
            
      Backporting BUG#43789 to mysql-5.1-bugteam
      
      Post-fix for BUG#43789.
[22 Oct 2009 0:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/87706

3191 Alfranio Correia	2009-10-22
      BUG#48091 valgrind errors when slave has double not null and master has double null
            
      Backporting BUG#47741 to mysql-5.1-bugteam
[2 Dec 2009 8:07] Bugs System
Pushed into 5.1.42 (revid:joro@sun.com-20091202080033-mndu4sxwx19lz2zs) (version source revid:davi.arnaut@sun.com-20091125130912-d7hrln14ef7y5d7i) (merge vers: 5.1.42) (pib:13)
[3 Dec 2009 14:18] Jon Stephens
Noted in 5.1.42 changelog (see BUG#43789). Closed.
[16 Dec 2009 8:40] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091216083311-xorsasf5kopjxshf) (version source revid:alik@sun.com-20091214191830-wznm8245ku8xo702) (merge vers: 6.0.14-alpha) (pib:14)
[16 Dec 2009 8:47] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091216082430-s0gtzibcgkv4pqul) (version source revid:alexey.kopytov@sun.com-20091124083136-iqm136jm31sfdwg3) (merge vers: 5.5.0-beta) (pib:14)
[16 Dec 2009 8:54] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20091216083231-rp8ecpnvkkbhtb27) (version source revid:alik@sun.com-20091212203859-fx4rx5uab47wwuzd) (merge vers: 5.6.0-beta) (pib:14)
[16 Dec 2009 15:46] Jon Stephens
No new changelog entries required (see BUG#43789). Closed.
[12 Mar 2010 14:16] Bugs System
Pushed into 5.1.44-ndb-7.0.14 (revid:jonas@mysql.com-20100312135944-t0z8s1da2orvl66x) (version source revid:jonas@mysql.com-20100312115609-woou0te4a6s4ae9y) (merge vers: 5.1.44-ndb-7.0.14) (pib:16)
[12 Mar 2010 14:32] Bugs System
Pushed into 5.1.44-ndb-6.2.19 (revid:jonas@mysql.com-20100312134846-tuqhd9w3tv4xgl3d) (version source revid:jonas@mysql.com-20100312060623-mx6407w2vx76h3by) (merge vers: 5.1.44-ndb-6.2.19) (pib:16)
[12 Mar 2010 14:48] Bugs System
Pushed into 5.1.44-ndb-6.3.33 (revid:jonas@mysql.com-20100312135724-xcw8vw2lu3mijrhn) (version source revid:jonas@mysql.com-20100312103652-snkltsd197l7q2yg) (merge vers: 5.1.44-ndb-6.3.33) (pib:16)
[15 Mar 2010 4:52] Jon Stephens
5.1-main -> telco merge; no new changelog entries required. (Also see my previous comments.) Closed.