Bug #42749 infinite loop writing to row based binlog - processlist shows "freeing items"
Submitted: 11 Feb 2009 6:59 Modified: 25 Jun 2009 9:06
Reporter: Shane Bester (Platinum Quality Contributor) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Row Based Replication ( RBR ) Severity:S1 (Critical)
Version:5.1.30, 5.1.32, 6.0.10 OS:Any (Windows, Linux)
Assigned to: Luis Soares CPU Architecture:Any

[11 Feb 2009 6:59] Shane Bester
Description:
when --log-bin --binlog_format=row is enabled and importing a mysqldump, it can hang like this:

mysql> show processlist\G
*************************** 1. row *****************
     Id: 1
   User: root
   Host:
     db: test
Command: Query
   Time: 214
  State: freeing items
   Info: INSERT INTO `xxx` VALUES (1234,'1234 ....

Call stack for thread:

mysqld.exe!MYSQL_BIN_LOG::write_cache  Line 4287
mysqld.exe!MYSQL_BIN_LOG::write Line 4397
mysqld.exe!binlog_end_trans  Line 1401
mysqld.exe!MYSQL_BIN_LOG::log_xid  Line 5440
mysqld.exe!ha_commit_trans Line 1163
mysqld.exe!ha_autocommit_or_rollback  Line 1333
mysqld.exe!dispatch_command
mysqld.exe!do_command
mysqld.exe!handle_one_connection
mysqld.exe!pthread_start
mysqld.exe!_callthreadstart
mysqld.exe!_threadstart

no other connections will be able to write to the binlog during this hang.

How to repeat:
start server with --log-bin --binlog-format=row and import the privately attached mysqldump
[11 Feb 2009 8:11] MySQL Verification Team
in 6.0.10 it is looping in function MYSQL_BIN_LOG::write_cache:

while (hdr_offs < length)
{
  if (hdr_offs + LOG_EVENT_HEADER_LEN > length)
  {
   <cut>
  }
  else
  {
    uchar *log_pos= (uchar *)cache->read_pos + hdr_offs + LOG_POS_OFFSET;

    /* fix end_log_pos */
    val= uint4korr(log_pos) + group;
    int4store(log_pos, val);

    /* next event header at ... */
    log_pos= (uchar *)cache->read_pos + hdr_offs + EVENT_LEN_OFFSET;
    hdr_offs += uint4korr(log_pos);
  }
}

when the looping happens:

hdr_offs 12271	unsigned int
length	13381	unsigned int

hdr_offs never increases because "uint4korr(log_pos);" is always
returning 0.
[11 Feb 2009 14:08] Mats Kindahl
Is the server a 32-bit build or a 64-bit build?
[11 Feb 2009 14:30] MySQL Verification Team
I used 64-bit 5.1.30 and 32-bit 6.0.10.  Customer used 5.1.30 64-bit on linux.
[9 Mar 2009 8:28] MySQL Verification Team
actually, this use case is pretty common. it can affect every replication setup or every setup using binlogging with row based format..  there will likely be many 5.1 users needing a fix.
[30 Apr 2009 7:24] Sveta Smirnova
Bug #44548 looks like same case. Reporter uses MIXED binary log format.
[30 Apr 2009 16:26] Sveta Smirnova
Bug #44548 was confirmed to be duplicate of this one.
[12 May 2009 7:11] Mats Kindahl
See BUG#44752, which is a duplicate
[12 May 2009 7:17] Mats Kindahl
Analyzing corrupt binlog sent to me by Luís.

Have a look at this:

$ client/mysqlbinlog master-bin.000001
...
# at 8727
...
# at 8795
ERROR: Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 176128, event_type: 74
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;

We start off with a good event (02 == QUERY_LOG_EVENT)

$ od -t x1 -j8727 -N19 master-bin.000001
0021027 6a 73 08 4a 02 01 00 00 00 44 00 00 00 5b 22 00
        ^^^^^^^^^^^ ^^             ^^^^^^^^^^^
          timestamp  type            length
0021047 00 00 00
0021052

Length is 0x00000044 == 68

$ expr 8727 + 68
8795
$ od -t x1 -j8795 -N19 master-bin.000001
0021133 6a 73 08 4a 13 01 00 00 00 f7 01 00 00 52 24 00
        ^^^^^^^^^^^                ^^^^^^^^^^^
0021153 00 00 00
0021156

Length is 0x000001f7 == 503

$ expr 8795 + 503
9298
$ od -t x1 -j9298 -N19 master-bin.000001
0022122 03 6a 73 08 4a 17 01 00 00 00 b0 02 00 5b ca 04
        ^^^^^^^^^^^ ^^             ^^^^^^^^^^^
0022142 00 00 00
0022145

The first bytes are the timestamp, which should not vary very much in a run.
Also, the 5:th byte is the type, which is 0x4a == 74 and length becomes
0x0002b000 == 176128 which is what mysqlbinlog reported as type and length
respectively.

The length of the table map event is not correct.
[12 May 2009 11:32] Luis Soares
It seems that the calculation of the event size in
Table_map_log_event, in its constructor, is wrong by one byte, when
m_field_metadata_size exceeds 255. In log_event.cc:

(...)
7863   /*
7864     Create an array for the field metadata and store it.
7865   */
7866   m_field_metadata_size= save_field_metadata();
7867   DBUG_ASSERT(m_field_metadata_size <= (m_colcnt * 2));
7868 
7869   /*
7870     Now set the size of the data to the size of the field metadata array
7871     plus one or two bytes for number of elements in the field metadata array.
7872   */
7873   if (m_field_metadata_size > 255)
7874     m_data_size+= m_field_metadata_size + 2;
7875   else
7876     m_data_size+= m_field_metadata_size + 1;
(...)

The m_data_size above should be +3 and not +2 as later, when writing the
body (Table_map_log_event::write_data_body) there is the following call,
in log_event.cc:

(...)
8134   /*
8135     Store the size of the field metadata.
8136   */
8137   uchar mbuf[sizeof(m_field_metadata_size)];
8138   uchar *const mbuf_end= net_store_length(mbuf,m_field_metadata_size);
(...)

Inside net_store_length, the return value is actually 3 bytes shifted and
not 2 bytes as expected in the constructor (note the 1 increment in the
packet and the +2 in the return). In pack.c:

(...)
 97 uchar *net_store_length(uchar *packet, ulonglong length)
 98 {
 99   if (length < (ulonglong) LL(251))
100   {
101     *packet=(uchar) length;
102     return packet+1;
103   }
104   /* 251 is reserved for NULL */
105   if (length < (ulonglong) LL(65536))
106   {
107     *packet++=252;
108     int2store(packet,(uint) length);
109     return packet+2;
110   }
(...)
[12 May 2009 11:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/73811

2887 Luis Soares	2009-05-12
      BUG#42749: infinite loop writing to row based binlog - processlist shows
      "freeing items"
      
      The calculation of the table map log event in the event constructor
      was one byte shorter than what would be actually written. This would
      lead to a mismatch between the number of bytes written and the event
      end_log_pos, causing bad event alignment in the binlog (corrupted
      binlog) or in the transaction cache while fixing positions
      (MYSQL_BIN_LOG::write_cache). This could lead to impossible to read
      binlog or even infinite loops in MYSQL_BIN_LOG::write_cache.
      
      This patch addresses this issue by correcting the expected event
      length in the Table_map_log_event constructor, when the field metadata
      size exceeds 255.
     @ sql/log_event.cc
        Added the extra byte as net_store_length imposes.
[14 May 2009 7:31] Zhenxing He
BUG#44627 is a dup of this.
[14 May 2009 11:59] Luis Soares
Pushed to 5.1-bugteam and 6.0-bugteam.
[28 May 2009 12:39] Jon Stephens
Documented bugfix in the 5.1.36 changelog as follows:

        When using row-based logging, the length of an event for which
        the field metadata exceeded 255 bytes in size was incorrectly
        calculated. This could lead to corruption of the binary log, or
        cause the server to hang.
[28 May 2009 13:07] Jon Stephens
Set to NDI pending push to 5.1+.
[17 Jun 2009 19:23] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090616183122-chjzbaa30qopdra9) (version source revid:luis.soares@sun.com-20090514115255-seesp0i0tn8tuhus) (merge vers: 6.0.12-alpha) (pib:11)
[25 Jun 2009 9:06] Jon Stephens
Also documented in the 5.4.4 changelog. Closed.
[12 Aug 2009 22:06] Paul DuBois
Noted in 5.4.2 changelog because next 5.4 version will be 5.4.2 and not 5.4.4.
[14 Aug 2009 22:57] Paul DuBois
Ignore previous comment about 5.4.2.
[26 Aug 2009 13:45] Bugs System
Pushed into 5.1.37-ndb-7.0.8 (revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[26 Aug 2009 13:46] Bugs System
Pushed into 5.1.37-ndb-6.3.27 (revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (version source revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (merge vers: 5.1.37-ndb-6.3.27) (pib:11)
[26 Aug 2009 13:48] Bugs System
Pushed into 5.1.37-ndb-6.2.19 (revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (version source revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (merge vers: 5.1.37-ndb-6.2.19) (pib:11)
[27 Aug 2009 16:32] Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:magnus.blaudd@sun.com-20090827163030-6o3kk6r2oua159hr) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[7 Oct 2009 1:51] Paul DuBois
The 5.4 fix has been pushed into 5.4.2.