Bug #52202 mysqlbinlog_row* fail in daily-trunk on Sol10 x86_64 debug_max
Submitted: 19 Mar 2010 8:12 Modified: 15 Nov 2010 12:56
Reporter: Alexander Nozdrin Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Replication Severity:S1 (Critical)
Version:5.5.3-M3 (Celosia) OS:Solaris (sol10 x86_64 debug_max, sol11 (snv111* debug))
Assigned to: Luis Soares CPU Architecture:Any
Tags: experimental, pb2, test failure

[19 Mar 2010 8:12] Alexander Nozdrin
Description:
main.mysqlbinlog* fail in daily-trunk on sol10 x86_64 debug_max only.

Symptoms:

  - main.mysqlbinlog_row:
-###   @1='2001:02:03' /* DATE meta=0 nullable=1 is_null=0 */
+###   @1='2001:02:03' /* DATE meta=0 nullable=1 is_null=0 */

  - main.mysqlbinlog_row_myisam
-###   @28='1000:01:01' /* DATE meta=0 nullable=1 is_null=0 */
+###   @28='1000:01:01' /* DATE meta=0 nullable=1 is_null=0 */

  - main.mysqlbinlog_row_innodb
-###   @28='1000:01:01' /* DATE meta=0 nullable=1 is_null=0 */
+###   @28='1000:01:01' /* DATE meta=0 nullable=1 is_null=0 */

How to repeat:
Checkk out daily-trunk in PB.
[19 Mar 2010 8:58] Alexander Nozdrin
They also fail on all OpenSolaris (snv111) boxes:
  - http://tinyurl.com/yznjfo7
[22 Mar 2010 11:53] Sveta Smirnova
Thank you for the report.

Verified as described using link provided. Not repeatable in my environment.
[22 Mar 2010 12:56] Alexander Nozdrin
Making the test cases experimental on Solaris starting from trunk.
[25 Mar 2010 0:21] Luis Soares
I have made analysis on the failures! Here are some details.

SETTINGS
========

I have analyzed this failure with the following settings:

  - host: loki11 (SunOS 5.10 ix86 S10/64x). 
  - tree: mysql-trunk-bugfixing
  - revid: alik@sun.com-20100322130038-5kkj9jxgx7jrwf27
  - build: env similar with PB2. 

    CFLAGS=-g
    CXXFLAGS=-g0
    ASFLAGS=-g
    LDFLAGS=-g
    CFLAGS='-g -Xa -xstrconst -mt -D_FORTEC_'
    CXXFLAGS='-g0 -noex -mt -D_FORTEC_'
    CFLAGS='-g -Xa -xstrconst -mt -D_FORTEC_ -m64'
    CXXFLAGS='-g0 -noex -mt -D_FORTEC_ -m64'
    ASFLAGS='-g -m64'
    LDFLAGS='-g -m64'

    export CC=cc-5.0 CXX=CC
    export CFLAGS
    export CXXFLAGS
    export CPPFLAGS
    export ASFLAGS
    export LDFLAGS

   $ cmake . -DENABLE_THREAD_SAFE_CLIENT=1 -DENABLE_LOCAL_INFILE=1 -DWITH_PIC=1 -DWITH_CLIENT_LDFLAGS=_STATIC=1 -DWITH_MYSQLD_LDFLAGS=_STATIC=1 -DWITH_ZLIB=bundled -DWITHOUT_NDB_DEBUG=1 -DWITH_BIG_TABLES=1 -DWITH_SSL=bundled -DWITH_READLINE=1 -DWITH_EMBEDDED_SERVER=1 -DWITH_ARCHIVE_STORAGE_ENGINE=1 -DWITH_BLACKHOLE_STORAGE_ENGINE=1 -DWITH_CSV_STORAGE_ENGINE=1 -DWITH_EXAMPLE_STORAGE_ENGINE=1 -DWITH_FEDERATED_STORAGE_ENGINE=1 -DWITH_PARTITION=1 -DWITH_EXTRA_CHARSETS=all -DWITH_INNODB=1 -DWITH_NDBCLUSTER=1 -DWITH_DEBUG=1 -DWITH_LIBEVENT=1

   $ make

OBSERVATIONS
============

 O1. Running mysqlbinlog_row test case after building
     mysql-trunk-bugfixing I hit the failure exhibited by PB2.

 O2. I narrowed down the test case and came up with the following
     smaller excerpt that reproduces it:

     -- source include/have_log_bin.inc
     -- source include/have_binlog_format_row.inc

     CREATE TABLE t1 (c28 DATE);
     INSERT INTO t1 VALUES ('2001-02-03');
     DELETE FROM t1 WHERE c28='2001-02-03';
     DROP TABLE t1;
     -- exit

     Running this snippet and analyzing the output from
     mysqlbinlog:

     $ perl mysql-test-run.pl --mysqld=--binlog-format=row main.b52202
     $ mysqlbinlog -v -v var/mysqld.1/data/master-bin.000001 | less

     We find that there is an extra char in the output (mind the
     @1='....^B'):

     (...)
     ### INSERT INTO test.t1
     ### SET
     ###   @1='2001:02:03^B' /* DATE meta=0 nullable=1 is_null=0 */
     (...)

 O3. Using solaris debugger dbx, I traced down the execution of
     mysqlbinlog to log_event.cc:log_event_print_value, in
     particular to:

(...)
  case MYSQL_TYPE_NEWDATE:
    {
      uint32 tmp= uint3korr(ptr);
      int part;
      char buf[10];
      char *pos= &buf[10];

      /* Copied from field.cc */
      *pos--=0;                                 // End NULL
      part=(int) (tmp & 31);
      *pos--= (char) ('0'+part%10);
      *pos--= (char) ('0'+part/10);
      *pos--= ':';
      part=(int) (tmp >> 5 & 15);
      *pos--= (char) ('0'+part%10);
      *pos--= (char) ('0'+part/10);
      *pos--= ':';
      part=(int) (tmp >> 9);
      *pos--= (char) ('0'+part%10); part/=10;
      *pos--= (char) ('0'+part%10); part/=10;
      *pos--= (char) ('0'+part%10); part/=10;
      *pos=   (char) ('0'+part);
      my_b_printf(file , "'%s'", buf);
(...)
 
     In this snippet, we find that we are actually writing 11
     chars to buf (mind the 0 in buf[10]), which is sized at
     10. Stepping through dbx, we confirm that this is an overrun:

     (dbx) stop at log_event.cc:1717
     (dbx) run -v -v var/mysqld.1/data/master-bin.000001
     (...)
     (dbx) list
     1724         *pos--=0;                                 // End NULL
     (dbx) print &part
     &part = 0xfffffd7fffdfcc4c    
     (dbx) print pos
     pos = 0xfffffd7fffdfcc4c ""
     (dbx) print &buf[10]
     dbx: warning: subscript out of range
     &buf[10] = 0xfffffd7fffdfcc4c ""

     So we start by writing 0 to 0xfffffd7fffdfcc4c which is
     shared between buf[10] (out of range) and part. 
     
     - buf[10] is written into, right in the beginning of the
       assignments and is set to 0.

     - part is written into in several places through the list of
       assignments, overwriting the previous 0.

     As such, instead of a null terminated buffer, we will end up
     with part value in the given byte. Since the last value that
     part gets is 2 (see date output above and note that buf[0]
     gets ('0' + 2)), it is consistent with the char ^B in the
     faulty output of mysqlbinlog.

 O4. This seems to be a regression of WL#5151. This code was
     introduced/last changed by:

      revision-id: mats@sun.com-20091216122847-x86am4tb4fivqrpu
      message:
        WL#5151: Conversion between different types when
                 replicating
  
        Fixes to make mysqlbinlog work correctly and updates
        to rpl_ndb result files.

CONCLUSION
==========

 C1. The simple fix for this is to increase buf size (as we are 
     indeed writing 11 chars: 10 for date and 1 for NULL).
[25 Mar 2010 0:31] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/104274

2998 Luis Soares	2010-03-25
      BUG#52202: mysqlbinlog_row* fail in daily-trunk on Sol10 x86_64
                 debug_max
      
      There was a buffer overrun when unpacking the date
      field. Incidentaly, this seems to affect only solaris x86_64
      debug builds, but others platforms may be vulnerable as well.
      
      In particular, the buffer size used was not taking into
      consideration that the '\0' character would be written into
      it.
      
      We fix this by increasing the size of the buffer used to
      accommodate one extra byte (the one for the '\0').
[16 Apr 2010 0:47] Luis Soares
Queued in mysql-trunk-bugfixing, mysql-next-mr-bugfixing and mysql-6.0-codebase-bugfixing:
 - http://pb2.norway.sun.com/web.py?template=push_details&push=1172817
 - http://pb2.norway.sun.com/web.py?template=push_details&push=1172819
 - http://pb2.norway.sun.com/web.py?template=push_details&push=1172818
[27 Apr 2010 9:45] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100427094135-5s49ecp3ckson6e2) (version source revid:alik@sun.com-20100427093843-uekr85qkd7orx12t) (merge vers: 6.0.14-alpha) (pib:16)
[27 Apr 2010 9:48] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100427093804-a2k3rrjpwu5jegu8) (version source revid:alik@sun.com-20100427093804-a2k3rrjpwu5jegu8) (merge vers: 5.5.5-m3) (pib:16)
[27 Apr 2010 9:51] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100427094036-38frbg3famdlvjup) (version source revid:alik@sun.com-20100427093825-92wc8b22d4yg34ju) (pib:16)
[28 Apr 2010 12:47] Jon Stephens
Appears to be test failure only, closing without further action.
[28 Apr 2010 13:53] Jon Stephens
Discussed with Luis on IRC; documented fix as follows in the 5.5.5 and 6.0.14 changelogs:

      A buffer overrun in the handling of DATE column values could cause
      mysqlbinlog to fail when reading back logs containing certain 
      combinations of DML on a table having a DATE column followed by 
      dropping the table.

Keeping Closed status for now, but asking that this be backported to Cluster due to its origins in WL#5151.
[28 Apr 2010 13:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/106832
[28 Apr 2010 14:16] Magnus BlÄudd
Pushed to 6.3.34, 7.0.15 and 7.1.4
[28 Apr 2010 14:45] Jon Stephens
Also documented in the NDB-6.3.34, 7.0.15, and 7.1.4 changelogs.

See previous comment for changelog entry.

Closed.
[5 Oct 2010 17:38] Anitha Gopi
Test still experimental on next-mr. If the bug is fixed remove test from experimental group. http://trollheim.norway.sun.com/archive/2329363.disabled_tests.html
[5 Oct 2010 22:15] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/120038

3096 Luis Soares	2010-10-05
      BUG#52202: mysqlbinlog_row* fail in daily-trunk on Sol10 
                 x86_64 debug_max
      
      Removed test cases affected by this bug from experimental 
      list.
[5 Oct 2010 22:22] Luis Soares
Also listed experimental in mysql-5.5-bugteam and mysql-trunk-merge. 
Will push to these trees.
[6 Oct 2010 10:49] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/120092

3096 Luis Soares	2010-10-06
      BUG#52202: mysqlbinlog_row* fail in daily-trunk on Sol10 
                 x86_64 debug_max
            
      Removed test cases affected by this bug from experimental 
      list.
[6 Oct 2010 10:58] Luis Soares
Pushed updated experimental list to mysql-5.5-bugteam and mysql-trunk-merge:
- http://pb2.norway.sun.com/web.py?template=push_details&push=1627179
- http://pb2.norway.sun.com/web.py?template=push_details&push=1627160
[6 Oct 2010 12:23] Jon Stephens
No new user-facing changes; fix was already documented as previously noted, so closing without further action.
[13 Oct 2010 6:49] Jon Stephens
Setting back to Closed state -- see my comment of 06 October.
[29 Oct 2010 14:18] Jon Stephens
Already documented in 5.5.5 changelog; set back to Closed.
[9 Nov 2010 19:42] Bugs System
Pushed into mysql-5.5 5.5.7-rc (revid:sunanda.menon@sun.com-20101109182959-otkxq8vo2dcd13la) (version source revid:sunanda.menon@sun.com-20101109182959-otkxq8vo2dcd13la) (merge vers: 5.5.7-rc) (pib:21)
[13 Nov 2010 16:22] Bugs System
Pushed into mysql-trunk 5.6.99-m5 (revid:alexander.nozdrin@oracle.com-20101113155825-czmva9kg4n31anmu) (version source revid:alexander.nozdrin@oracle.com-20101113152450-2zzcm50e7i4j35v7) (merge vers: 5.6.1-m4) (pib:21)
[13 Nov 2010 16:29] Bugs System
Pushed into mysql-next-mr (revid:alexander.nozdrin@oracle.com-20101113160336-atmtmfb3mzm4pz4i) (version source revid:vasil.dimov@oracle.com-20100629074804-359l9m9gniauxr94) (pib:21)
[15 Nov 2010 12:56] Jon Stephens
Set back to Closed. See previous comments.