Bug #43789 different master/slave table defs cause crash: text/varchar null vs not null
Submitted: 22 Mar 2009 11:55
Modified: 15 Mar 2010 4:41
Reporter: Shane Bester (Platinum Quality Contributor)
Status: Closed
Category: MySQL Server: Row Based Replication (RBR)
Severity: S1 (Critical)
Version: 5.1.32
OS: Any
Assigned to: Alfranio Correia
CPU Architecture: Any
Tags: valgrind
Triage: Triaged: D1 (Critical) / R2 (Low) / E3 (Medium)

[22 Mar 2009 11:55] Shane Bester
Description:
Similar to bug #43785, bug #38850, and bug #43783, just with a different column type.

TEXT or VARCHAR columns declared as NULL on the master but NOT NULL on the slave cause the slave to crash:

Invalid read of size 1
at 0x85078F5: my_utf8_uni (ctype-utf8.c:1954)
by 0x84FF8C8: my_well_formed_len_mb (ctype-mb.c:297)
by 0x81F0464: well_formed_copy_nchars (sql_string.cc:990)
by 0x81D733B: Field_blob::store (field.cc:7754)
by 0x81CD187: Field_blob::unpack (field.cc:8132)
by 0x82CB38E: unpack_row (rpl_record.cc:242)
by 0x82C838B: Rows_log_event::write_row (log_event.h:3548)
by 0x82C8D21: Write_rows_log_event::do_exec_row (log_event.cc:8513)
by 0x82C79CD: Rows_log_event::do_apply_event (log_event.cc:7281)
by 0x8348BC1: apply_event_and_update_pos (log_event.h:1056)
by 0x834BBD6: exec_relay_log_event (slave.cc:2130)
by 0x834C292: handle_slave_sql (slave.cc:2801)
Address 0x64A4E7E is 1 bytes after a block of size 5 alloc'd
at 0x4005400: malloc (vg_replace_malloc.c:149)
by 0x84DBE59: my_malloc (my_malloc.c:34)
by 0x82BBD97: Rows_log_event::Rows_log_event (log_event.cc:6961)
by 0x82BBF22: Write_rows_log_event::Write_rows_log_event (log_event.cc:8131)
by 0x82C45BB: Log_event::read_log_event (log_event.cc:1183)
by 0x82C4AE1: Log_event::read_log_event (log_event.cc:1032)
by 0x834A24C: next_event (slave.cc:3834)
by 0x834B411: exec_relay_log_event (slave.cc:2095)
by 0x834C292: handle_slave_sql (slave.cc:2801)
by 0x4893DA: start_thread 
by 0x3D606D: clone 

I won't report any more 'null vs not null' type bugs. Therefore marking this one as "QA test needed" because each column type must be tested.

How to repeat:
table def on master:

create table `t1` ( `a` int(11) not null auto_increment,
  `col000` varchar(255) default null , primary key (`a`)
) engine=myisam default charset=latin1;

table def on slave:

create table `t1` ( `a` int(11) not null auto_increment,
  `col000` varchar(255)  not null, primary key (`a`)
) engine=myisam default charset=latin1;

insert a row on the master:

insert into t1 values (1,null);
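
A minimal Python sketch of the failure mode, not the server's code. It assumes a simplified row image made of one null-bitmap byte followed by values for non-NULL columns only; the names pack_row, unpack_row and slave_not_null are hypothetical. Under that assumption the row (1, NULL) packs to 5 bytes, which would be consistent with the 5-byte block in the Valgrind output, and an unpacker that reads the varchar anyway because the slave declares it NOT NULL steps past the end of the buffer.

```python
import struct

# Hypothetical two-column layout mirroring t1: a INT, col000 VARCHAR(255).
COLUMNS = ("int", "varchar")


def pack_row(values):
    """Pack a row as: 1 null-bitmap byte, then values of non-NULL columns only."""
    null_bits = 0
    body = b""
    for i, (kind, value) in enumerate(zip(COLUMNS, values)):
        if value is None:
            null_bits |= 1 << i              # mark column i as NULL, store no bytes
        elif kind == "int":
            body += struct.pack("<i", value)  # 4-byte little-endian integer
        else:
            data = value.encode("latin-1")
            body += bytes([len(data)]) + data  # 1 length byte, then the string
    return bytes([null_bits]) + body


def unpack_row(buf, slave_not_null=()):
    """Unpack a row. Columns listed in slave_not_null are read even when the
    master's null bit is set, which models the behavior described in the report."""
    null_bits, pos, row = buf[0], 1, []
    for i, kind in enumerate(COLUMNS):
        is_null = bool(null_bits & (1 << i))
        if is_null and i not in slave_not_null:
            row.append(None)
            continue
        if kind == "int":
            row.append(struct.unpack_from("<i", buf, pos)[0])
            pos += 4
        else:
            length = buf[pos]                # raises IndexError past the buffer end
            row.append(buf[pos + 1:pos + 1 + length].decode("latin-1"))
            pos += 1 + length
    return row


buf = pack_row((1, None))
print(len(buf))  # 5
```

Calling unpack_row(buf) returns [1, None], while unpack_row(buf, slave_not_null={1}) raises IndexError; here the IndexError stands in for the invalid read that Valgrind reports.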
[30 Mar 2009 4:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/70780

2837 Alfranio Correia	2009-03-30
      BUG#43789 different master/slave table defs cause crash: text/varchar null vs not null
      
      Replication was generating corrupted data, Valgrind warnings, and an
      abort in debug mode when replicating a "null" value into a "not null"
      field. Specifically, the unpack_row routine was consulting the slave's
      table definition and trying to retrieve a field value where there was
      nothing to retrieve, ignoring the fact that the master had defined the
      value as "null".
      
      To fix the problem, we ignore the slave's table definition, which enables
      the following behavior: if there is a default value, it is assigned to
      the field.
[30 Mar 2009 4:48] Alfranio Correia
The different types have no influence on the failure.
BUG#43785, BUG#38850 and BUG#43783 should be set as duplicates.
[1 Apr 2009 2:50] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/70990

2837 Alfranio Correia	2009-04-01
      BUG#43789 different master/slave table defs cause crash: text/varchar null
      vs not null
            
      Replication was generating corrupted data, Valgrind warnings, and an
      abort in debug mode when replicating a "null" value into a "not null"
      field. Specifically, the unpack_row routine was consulting the slave's
      table definition and trying to retrieve a field value where there was
      nothing to retrieve, ignoring the fact that the master had defined the
      value as "null".
      
      To fix the problem, we throw an error if the slave's table definition
      does not accept nulls, and proceed with the execution otherwise.
      
      added:
        mysql-test/suite/rpl/r/rpl_not_null.result
        mysql-test/suite/rpl/t/rpl_not_null.test
      modified:
        sql/rpl_record.cc
        sql/log_event.cc
        sql/log_event_old.cc
[5 Apr 2009 12:13] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71399

2839 Alfranio Correia	2009-04-05
      BUG#43789 different master/slave table defs cause crash: text/varchar null
      vs not null
                  
      Replication was generating corrupted data, Valgrind warnings, and an
      abort in debug mode when replicating a "null" value into a "not null"
      field. Specifically, the unpack_row routine was consulting the slave's
      table definition and trying to retrieve a field value where there was
      nothing to retrieve, ignoring the fact that the master had defined the
      value as "null".
      
      To fix the problem, we throw an error if the slave's table definition
      does not accept nulls, and proceed with the execution otherwise.
            
      added:
        mysql-test/suite/rpl/r/rpl_not_null.result
        mysql-test/suite/rpl/t/rpl_not_null.test
      modified:
        sql/rpl_record.cc
        sql/log_event.cc
        sql/log_event_old.cc
[10 Apr 2009 13:38] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/71862

2838 Alfranio Correia	2009-04-10
      BUG#43789 different master/slave table defs cause crash: text/varchar null
                vs not null
      
      Replication was generating corrupted data, Valgrind warnings, and an abort in
      debug mode when replicating a "null" value into a "not null" field.
      Specifically, the unpack_row routine was consulting the slave's table
      definition and trying to retrieve a field value where there was nothing to
      retrieve, ignoring the fact that the master had defined the value as "null".
      
      To fix the problem, we proceed as follows:
      
      1 - If sql_mode is not STRICT, implicit default values are used, regardless
      of whether it is a multi-row or single-row statement.
      
      2 - However, if sql_mode is STRICT, we do the following:
      
      2.1 If it is a transactional engine, we roll back on the first NULL that is
      to be set into a NOT NULL column and return an error.
      
      2.2 If it is a non-transactional engine and it is the first row to be inserted
      by a multi-row statement, we also return the error. Otherwise, we proceed with
      the execution, use implicit default values, and print warning messages.
     @ sql/log_event.cc
        Changed as follows:
        
        1 - Added the parameter row (i.e. the row number in a multi-row statement) to
        the do_exec_row method and, in particular, to write_row, which is responsible
        for handling inserts.
        2 - When appropriate, we call prepare_record and unpack_current_row with
        the row number and the flag abort_on_warnings, which states whether we are
        running in STRICT mode.
        3 - When appropriate, errors are caught and the execution is stopped.
     @ sql/log_event.h
        Changed as follows:
        
        1 - Changed the signature of the methods do_exec_row and write_row to
        accept the parameter row (i.e. the row number in a multi-row statement).
        2 - Changed the signature of unpack_current_row() to accept the parameters
        row and abort_on_warning, which are passed to the function unpack_record().
     @ sql/rpl_record.cc
        Changed to implement the defined rules.
     @ sql/rpl_record.h
        Changed to implement the defined rules.
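
The rules listed in this commit message (1, 2.1, 2.2) can be read as a small decision table. The Python sketch below is only an illustration of that reading, not the code in sql/rpl_record.cc; the Action enum, the function name and its three boolean parameters are hypothetical.

```python
from enum import Enum


class Action(Enum):
    USE_IMPLICIT_DEFAULT = "use implicit default"
    USE_DEFAULT_AND_WARN = "use implicit default and warn"
    ROLLBACK_AND_ERROR = "roll back and return error"
    RETURN_ERROR = "return error"


def null_into_not_null(strict, transactional, is_first_row):
    """Decide what the slave does when a NULL arrives for a NOT NULL column.

    All three parameters are hypothetical booleans standing in for the
    sql_mode, the storage engine type and the position of the row.
    """
    if not strict:
        return Action.USE_IMPLICIT_DEFAULT    # rule 1
    if transactional:
        return Action.ROLLBACK_AND_ERROR      # rule 2.1
    if is_first_row:
        return Action.RETURN_ERROR            # rule 2.2, first row
    return Action.USE_DEFAULT_AND_WARN        # rule 2.2, later rows


# Example: STRICT mode, MyISAM-like engine, second row of a multi-row insert.
print(null_into_not_null(strict=True, transactional=False, is_first_row=False).value)
```

Only the non-transactional, not-first-row case continues with a warning in STRICT mode; every other STRICT case stops with an error.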
[18 Apr 2009 19:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/72462

2844 Alfranio Correia	2009-04-18
      BUG#43789 different master/slave table defs cause crash: text/varchar null
                vs not null
            
      Replication was generating corrupted data, Valgrind warnings, and an abort in
      debug mode when replicating a "null" value into a "not null" field.
      Specifically, the unpack_row routine was consulting the slave's table
      definition and trying to retrieve a field value where there was nothing to
      retrieve, ignoring the fact that the master had defined the value as "null".
      
      To fix the problem, we proceed as follows:
      
      1 - If sql_mode is not STRICT, implicit default values are used, regardless
      of whether it is a multi-row or single-row statement.
      
      2 - However, if sql_mode is STRICT, we do the following:
      
      2.1 If it is a transactional engine, we roll back on the first NULL that is
      to be set into a NOT NULL column and return an error.
      
      2.2 If it is a non-transactional engine and it is the first row to be inserted
      by a multi-row statement, we also return the error. Otherwise, we proceed with
      the execution, use implicit default values, and print warning messages.
      
      Unfortunately, the current patch cannot mimic the behavior shown by the master
      for multi-table updates and multi-row inserts. This happens because such
      statements are unfolded into different row events. For instance, consider the
      following updates in strict mode:
      
      (master)
      create table t1 (a int);
      create table t2 (a int not null);
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
      
      t1 would have (10) and t2 would have (0) as this would be handled as a
      multi-row update. On the other hand, if we had the following updates:
      
      (master)
      create table t1 (a int);
      create table t2 (a int);
      
      (slave)
      create table t1 (a int);
      create table t2 (a int not null);
      
      (master)
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
      
      On the master, t1 would have (10) and t2 would have (NULL). On
      the slave, t1 would have (10) but the update on t2 would fail.
     @ sql/log_event.cc
        Changed as follows:
         
        1 - When appropriate, we call prepare_record and unpack_current_row with
          both the flag abort_on_warnings, which states whether we are running in
          STRICT mode, and the flag first_row, which indicates whether the first row
          is being processed.
                
        2 - When appropriate, errors are caught and the execution is stopped.
     @ sql/log_event.h
        Changed the signature of the unpack_current_row() to accept both the
        flag abort_on_warnings and the flag first_row.
     @ sql/rpl_record.cc
        Changed to implement the defined rules.
     @ sql/rpl_record.h
        Changed to implement the defined rules.
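
The limitation described in this commit message, that one multi-table update reaches the slave as separate row events, can be illustrated with a small Python simulation. It is a sketch under stated assumptions: each table holds one row in a single column a, the engine is non-transactional so already-applied events stay applied, and apply_row_events is a hypothetical helper rather than anything in the server.

```python
def apply_row_events(slave, events, not_null, strict=True):
    """Apply (table, new_value) row events one at a time, as a slave would.

    slave    -- dict mapping table name to the value of its single column a
    not_null -- set of tables whose column a is NOT NULL on the slave
    Returns the name of the table whose event failed, or None.
    """
    for table, new_value in events:
        if new_value is None and table in not_null:
            if strict:
                return table      # this event fails; earlier events stay applied
            new_value = 0         # implicit default for an INT column
        slave[table] = new_value
    return None


# "update t1, t2 SET t1.a=10, t2.a=NULL" unfolded into one event per table.
slave = {"t1": 1, "t2": 2}
events = [("t1", 10), ("t2", None)]
failed = apply_row_events(slave, events, not_null={"t2"})
print(slave, failed)  # {'t1': 10, 't2': 2} t2
```

Because each event is applied on its own, the slave cannot treat the two changes as one statement the way the master does: the t1 change is kept while the t2 change is rejected.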
[20 Apr 2009 14:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/72507

2844 Alfranio Correia	2009-04-20
      BUG#43789 different master/slave table defs cause crash: text/varchar null
                vs not null
                  
      Replication was generating corrupted data, Valgrind warnings, and an abort in
      debug mode when replicating a "null" value into a "not null" field.
      Specifically, the unpack_row routine was consulting the slave's table
      definition and trying to retrieve a field value where there was nothing to
      retrieve, ignoring the fact that the master had defined the value as "null".
      
      To fix the problem, we proceed as follows:
      
      1 - If sql_mode is not STRICT, implicit default values are used, regardless
      of whether it is a multi-row or single-row statement.
      
      2 - However, if sql_mode is STRICT, we do the following:
      
      2.1 If it is a transactional engine, we roll back on the first NULL that is
      to be set into a NOT NULL column and return an error.
      
      2.2 If it is a non-transactional engine and it is the first row to be inserted
      by a multi-row statement, we also return the error. Otherwise, we proceed with
      the execution, use implicit default values, and print warning messages.
      
      Unfortunately, the current patch cannot mimic the behavior shown by the master
      for multi-table updates and multi-row inserts. This happens because such
      statements are unfolded into different row events. For instance, consider the
      following updates in strict mode:
            
      (master)
      create table t1 (a int);
      create table t2 (a int not null);
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
            
      t1 would have (10) and t2 would have (0) as this would be handled as a
      multi-row update. On the other hand, if we had the following updates:
            
      (master)
      create table t1 (a int);
      create table t2 (a int);
            
      (slave)
      create table t1 (a int);
      create table t2 (a int not null);
            
      (master)
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
            
      On the master, t1 would have (10) and t2 would have (NULL). On
      the slave, t1 would have (10) but the update on t2 would fail.
     @ sql/log_event.cc
        Changed as follows:
                 
        1 - When appropriate, we call prepare_record and unpack_current_row with
        both the flag abort_on_warnings, which states whether we are running in
        STRICT mode, and the flag first_row, which indicates whether the first row
        is being processed.
                        
        2 - When appropriate, errors are caught and the execution is stopped.
     @ sql/log_event.h
        Changed the signature of the unpack_current_row() to accept both the
        flag abort_on_warnings and the flag first_row.
     @ sql/rpl_record.cc
        Changed to implement the defined rules.
     @ sql/rpl_record.h
        Changed to implement the defined rules.
[23 Apr 2009 23:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/72744

2844 Alfranio Correia	2009-04-24
      BUG#43789 different master/slave table defs cause crash: text/varchar null
                vs not null
                        
      Replication was generating corrupted data, Valgrind warnings, and an abort in
      debug mode when replicating a "null" value into a "not null" field.
      Specifically, the unpack_row routine was consulting the slave's table
      definition and trying to retrieve a field value where there was nothing to
      retrieve, ignoring the fact that the master had defined the value as "null".
      
      To fix the problem, we proceed as follows:
      
      1 - If sql_mode is not STRICT, implicit default values are used, regardless
      of whether it is a multi-row or single-row statement.
      
      2 - However, if sql_mode is STRICT, we do the following:
      
      2.1 If it is a transactional engine, we roll back on the first NULL that is
      to be set into a NOT NULL column and return an error.
      
      2.2 If it is a non-transactional engine and it is the first row to be inserted
      by a multi-row statement, we also return the error. Otherwise, we proceed with
      the execution, use implicit default values, and print warning messages.
      
      Unfortunately, the current patch cannot mimic the behavior shown by the master
      for multi-table updates and multi-row inserts. This happens because such
      statements are unfolded into different row events. For instance, consider the
      following updates in strict mode:
                  
      (master)
      create table t1 (a int);
      create table t2 (a int not null);
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
                  
      t1 would have (10) and t2 would have (0) as this would be handled as a
      multi-row update. On the other hand, if we had the following updates:
                  
      (master)
      create table t1 (a int);
      create table t2 (a int);
                  
      (slave)
      create table t1 (a int);
      create table t2 (a int not null);
                  
      (master)
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
                  
      On the master, t1 would have (10) and t2 would have (NULL). On
      the slave, t1 would have (10) but the update on t2 would fail.
     @ sql/log_event.cc
        Changed as follows:
                         
        1 - When appropriate, we call prepare_record and unpack_current_row with
        both the flag abort_on_warnings, which states whether we are running in
        STRICT mode, and the flag first_row, which indicates whether the first row
        is being processed.
        
        2 - When appropriate, errors are caught and the execution is stopped.
     @ sql/log_event.h
        Changed the signature of the unpack_current_row() to accept both the
        flag abort_on_warnings and the flag first_row.
     @ sql/rpl_record.cc
        Changed to implement the defined rules.
     @ sql/rpl_record.h
        Changed to implement the defined rules.
[24 Apr 2009 0:05] Alfranio Correia
Pushed to 6.0-rpl.
[13 May 2009 3:31] Bugs System
Pushed into 6.0.12-alpha (revid:alik@sun.com-20090513032549-rxa73jbxd1qv09xc) (version source revid:aelkin@mysql.com-20090429125820-vu261kl1z4z5f0iv) (merge vers: 6.0.12-alpha) (pib:6)
[13 May 2009 13:40] Jon Stephens
Documented bugfix in the 6.0.12 changelog as follows:

        Replicating TEXT or VARCHAR columns declared as NULL on the master 
        but NOT NULL on the slave caused the slave to crash.
[29 Sep 2009 14:19] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85035

3117 Alfranio Correia	2009-09-29
      BUG#43789 different master/slave table defs cause crash: text/varchar null
                vs not null
      
      NOTE: Backporting the patch to next-mr.
                              
      Replication was generating corrupted data, Valgrind warnings, and an abort in
      debug mode when replicating a "null" value into a "not null" field.
      Specifically, the unpack_row routine was consulting the slave's table
      definition and trying to retrieve a field value where there was nothing to
      retrieve, ignoring the fact that the master had defined the value as "null".
      
      To fix the problem, we proceed as follows:
      
      1 - If sql_mode is not STRICT, implicit default values are used, regardless
      of whether it is a multi-row or single-row statement.
      
      2 - However, if sql_mode is STRICT, we do the following:
      
      2.1 If it is a transactional engine, we roll back on the first NULL that is
      to be set into a NOT NULL column and return an error.
      
      2.2 If it is a non-transactional engine and it is the first row to be inserted
      by a multi-row statement, we also return the error. Otherwise, we proceed with
      the execution, use implicit default values, and print warning messages.
      
      Unfortunately, the current patch cannot mimic the behavior shown by the master
      for multi-table updates and multi-row inserts. This happens because such
      statements are unfolded into different row events. For instance, consider the
      following updates in strict mode:
                        
      (master)
      create table t1 (a int);
      create table t2 (a int not null);
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
                        
      t1 would have (10) and t2 would have (0) as this would be handled as a
      multi-row update. On the other hand, if we had the following updates:
                        
      (master)
      create table t1 (a int);
      create table t2 (a int);
                        
      (slave)
      create table t1 (a int);
      create table t2 (a int not null);
                        
      (master)
      insert into t1 values (1);
      insert into t2 values (2);
      update t1, t2 SET t1.a=10, t2.a=NULL;
                        
      On the master, t1 would have (10) and t2 would have (NULL). On
      the slave, t1 would have (10) but the update on t2 would fail.
[30 Sep 2009 12:59] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85207

3120 Alfranio Correia	2009-09-30
      Post-fix for BUG#43789
      
      NOTE: Backporting the patch to next-mr.
[30 Sep 2009 14:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/85219

3121 Alfranio Correia	2009-09-30
      Post-fix for BUG#43789
      
      NOTE: Backporting the patch to next-mr.
[27 Oct 2009 9:50] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091027094604-9p7kplu1vd2cvcju) (version source revid:zhenxing.he@sun.com-20091026140226-uhnqejkyqx1aeilc) (merge vers: 6.0.14-alpha) (pib:13)
[28 Oct 2009 6:28] Jon Stephens
Already documented in 6.0.12 changelog. Closed.
[12 Nov 2009 8:20] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091110093229-0bh5hix780cyeicl) (version source revid:alik@sun.com-20091027095744-rf45u3x3q5d1f5y0) (merge vers: 5.5.0-beta) (pib:13)
[12 Nov 2009 12:13] Jon Stephens
Also documented in 5.5.0 changelog; closed.
[2 Dec 2009 8:08] Bugs System
Pushed into 5.1.42 (revid:joro@sun.com-20091202080033-mndu4sxwx19lz2zs) (version source revid:davi.arnaut@sun.com-20091125130912-d7hrln14ef7y5d7i) (merge vers: 5.1.42) (pib:13)
[3 Dec 2009 11:29] Jon Stephens
Also documented in 5.1.42 changelog; closed.
[16 Dec 2009 8:41] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091216083311-xorsasf5kopjxshf) (version source revid:alik@sun.com-20091214191830-wznm8245ku8xo702) (merge vers: 6.0.14-alpha) (pib:14)
[16 Dec 2009 8:48] Bugs System
Pushed into 5.5.0-beta (revid:alik@sun.com-20091216082430-s0gtzibcgkv4pqul) (version source revid:alexey.kopytov@sun.com-20091124083136-iqm136jm31sfdwg3) (merge vers: 5.5.0-beta) (pib:14)
[16 Dec 2009 8:55] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20091216083231-rp8ecpnvkkbhtb27) (version source revid:alik@sun.com-20091212203859-fx4rx5uab47wwuzd) (merge vers: 5.6.0-beta) (pib:14)
[16 Dec 2009 15:40] Jon Stephens
Also documented in the 5.5.1, 5.6.0, and 6.0.14 changelogs. Closed.
[16 Dec 2009 15:41] Jon Stephens
Disregard previous comment; fix already documented in 5.5.0 and 6.0.12 changelogs.

Also documented in 5.6.0 changelog. Closed.
[8 Mar 2010 0:12] Paul Dubois
5.6.0 changelog entry unneeded.
[12 Mar 2010 14:09] Bugs System
Pushed into 5.1.44-ndb-7.0.14 (revid:jonas@mysql.com-20100312135944-t0z8s1da2orvl66x) (version source revid:jonas@mysql.com-20100312115609-woou0te4a6s4ae9y) (merge vers: 5.1.44-ndb-7.0.14) (pib:16)
[12 Mar 2010 14:25] Bugs System
Pushed into 5.1.44-ndb-6.2.19 (revid:jonas@mysql.com-20100312134846-tuqhd9w3tv4xgl3d) (version source revid:jonas@mysql.com-20100312060623-mx6407w2vx76h3by) (merge vers: 5.1.44-ndb-6.2.19) (pib:16)
[12 Mar 2010 14:39] Bugs System
Pushed into 5.1.44-ndb-6.3.33 (revid:jonas@mysql.com-20100312135724-xcw8vw2lu3mijrhn) (version source revid:jonas@mysql.com-20100312103652-snkltsd197l7q2yg) (merge vers: 5.1.44-ndb-6.3.33) (pib:16)
[15 Mar 2010 4:41] Jon Stephens
No new changelog entries required. Closed.