Bug #42749 | infinite loop writing to row based binlog - processlist shows "freeing items" | ||
---|---|---|---|
Submitted: | 11 Feb 2009 6:59 | Modified: | 25 Jun 2009 9:06 |
Reporter: | Shane Bester (Platinum Quality Contributor) | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Row Based Replication ( RBR ) | Severity: | S1 (Critical) |
Version: | 5.1.30, 5.1.32, 6.0.10 | OS: | Any (Windows, Linux) |
Assigned to: | Luis Soares | CPU Architecture: | Any |
[11 Feb 2009 6:59]
Shane Bester
[11 Feb 2009 8:11]
MySQL Verification Team
in 6.0.10 it is looping in function MYSQL_BIN_LOG::write_cache: while (hdr_offs < length) { if (hdr_offs + LOG_EVENT_HEADER_LEN > length) { <cut> } else { uchar *log_pos= (uchar *)cache->read_pos + hdr_offs + LOG_POS_OFFSET; /* fix end_log_pos */ val= uint4korr(log_pos) + group; int4store(log_pos, val); /* next event header at ... */ log_pos= (uchar *)cache->read_pos + hdr_offs + EVENT_LEN_OFFSET; hdr_offs += uint4korr(log_pos); } } when the looping happens: hdr_offs 12271 unsigned int length 13381 unsigned int hdr_offs never increases because "uint4korr(log_pos);" is always returning 0.
[11 Feb 2009 14:08]
Mats Kindahl
Is the server a 32-bit build or a 64-bit build?
[11 Feb 2009 14:30]
MySQL Verification Team
I used 64-bit 5.1.30 and 32-bit 6.0.10. Customer used 5.1.30 64-bit on linux.
[9 Mar 2009 8:28]
MySQL Verification Team
actually, this use case is pretty common. it can affect every replication setup or every setup using binlogging with row based format.. there will likely be many 5.1 users needing a fix.
[30 Apr 2009 7:24]
Sveta Smirnova
Bug #44548 looks like same case. Reporter uses MIXED binary log format.
[30 Apr 2009 16:26]
Sveta Smirnova
Bug #44548 was confirmed to be duplicate of this one.
[12 May 2009 7:11]
Mats Kindahl
See BUG#44752, which is a duplicate
[12 May 2009 7:17]
Mats Kindahl
Analyzing corrupt binlog sent to me by Luís. Have a look at this: $ client/mysqlbinlog master-bin.000001 ... # at 8727 ... # at 8795 ERROR: Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 176128, event_type: 74 DELIMITER ; # End of log file ROLLBACK /* added by mysqlbinlog */; /*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/; We start off with a good event (02 == QUERY_LOG_EVENT) $ od -t x1 -j8727 -N19 master-bin.000001 0021027 6a 73 08 4a 02 01 00 00 00 44 00 00 00 5b 22 00 ^^^^^^^^^^^ ^^ ^^^^^^^^^^^ timestamp type length 0021047 00 00 00 0021052 Length is 0x00000044 == 68 $ expr 8727 + 68 8795 $ od -t x1 -j8795 -N19 master-bin.000001 0021133 6a 73 08 4a 13 01 00 00 00 f7 01 00 00 52 24 00 ^^^^^^^^^^^ ^^^^^^^^^^^ 0021153 00 00 00 0021156 Length is 0x000001f7 == 503 $ expr 8795 + 503 9298 $ od -t x1 -j9298 -N19 master-bin.000001 0022122 03 6a 73 08 4a 17 01 00 00 00 b0 02 00 5b ca 04 ^^^^^^^^^^^ ^^ ^^^^^^^^^^^ 0022142 00 00 00 0022145 The first bytes are the timestamp, which should not vary very much in a run. Also, the 5:th byte is the type, which is 0x4a == 74 and length becomes 0x0002b000 == 176128 which is what mysqlbinlog reported as type and length respectively. The length of the table map event is not correct.
[12 May 2009 11:32]
Luis Soares
It seems that the calculation of the event size in Table_map_log_event, in its constructor, is wrong by one byte, when m_field_metadata_size exceeds 255. In log_event.cc: (...) 7863 /* 7864 Create an array for the field metadata and store it. 7865 */ 7866 m_field_metadata_size= save_field_metadata(); 7867 DBUG_ASSERT(m_field_metadata_size <= (m_colcnt * 2)); 7868 7869 /* 7870 Now set the size of the data to the size of the field metadata array 7871 plus one or two bytes for number of elements in the field metadata array. 7872 */ 7873 if (m_field_metadata_size > 255) 7874 m_data_size+= m_field_metadata_size + 2; 7875 else 7876 m_data_size+= m_field_metadata_size + 1; (...) The m_data_size above should be +3 and not +2 as later, when writing the body (Table_map_log_event::write_data_body) there is the following call, in log_event.cc: (...) 8134 /* 8135 Store the size of the field metadata. 8136 */ 8137 uchar mbuf[sizeof(m_field_metadata_size)]; 8138 uchar *const mbuf_end= net_store_length(mbuf,m_field_metadata_size); (...) Inside net_store_length, the return value is actually 3 bytes shifted and not 2 bytes as expected in the constructor (note the 1 increment in the packet and the +2 in the return). In pack.c: (...) 97 uchar *net_store_length(uchar *packet, ulonglong length) 98 { 99 if (length < (ulonglong) LL(251)) 100 { 101 *packet=(uchar) length; 102 return packet+1; 103 } 104 /* 251 is reserved for NULL */ 105 if (length < (ulonglong) LL(65536)) 106 { 107 *packet++=252; 108 int2store(packet,(uint) length); 109 return packet+2; 110 } (...)
[12 May 2009 11:54]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/73811 2887 Luis Soares 2009-05-12 BUG#42749: infinite loop writing to row based binlog - processlist shows "freeing items" The calculation of the table map log event in the event constructor was one byte shorter than what would be actually written. This would lead to a mismatch between the number of bytes written and the event end_log_pos, causing bad event alignment in the binlog (corrupted binlog) or in the transaction cache while fixing positions (MYSQL_BIN_LOG::write_cache). This could lead to impossible to read binlog or even infinite loops in MYSQL_BIN_LOG::write_cache. This patch addresses this issue by correcting the expected event length in the Table_map_log_event constructor, when the field metadata size exceeds 255. @ sql/log_event.cc Added the extra byte as net_store_length imposes.
[14 May 2009 7:31]
Zhenxing He
BUG#44627 is a dup of this.
[14 May 2009 11:59]
Luis Soares
Pushed to 5.1-bugteam and 6.0-bugteam.
[28 May 2009 12:39]
Jon Stephens
Documented bugfix in the 5.1.36 changelog as follows: When using row-based logging, the length of an event for which the field metadata exceeded 255 bytes in size was incorrectly calculated. This could lead to corruption of the binary log, or cause the server to hang.
[28 May 2009 13:07]
Jon Stephens
Set to NDI pending push to 5.1+.
[17 Jun 2009 19:23]
Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090616183122-chjzbaa30qopdra9) (version source revid:luis.soares@sun.com-20090514115255-seesp0i0tn8tuhus) (merge vers: 6.0.12-alpha) (pib:11)
[25 Jun 2009 9:06]
Jon Stephens
Also documented in the 5.4.4 changelog. Closed.
[12 Aug 2009 22:06]
Paul DuBois
Noted in 5.4.2 changelog because next 5.4 version will be 5.4.2 and not 5.4.4.
[14 Aug 2009 22:57]
Paul DuBois
Ignore previous comment about 5.4.2.
[26 Aug 2009 13:45]
Bugs System
Pushed into 5.1.37-ndb-7.0.8 (revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[26 Aug 2009 13:46]
Bugs System
Pushed into 5.1.37-ndb-6.3.27 (revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (version source revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (merge vers: 5.1.37-ndb-6.3.27) (pib:11)
[26 Aug 2009 13:48]
Bugs System
Pushed into 5.1.37-ndb-6.2.19 (revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (version source revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (merge vers: 5.1.37-ndb-6.2.19) (pib:11)
[27 Aug 2009 16:32]
Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:magnus.blaudd@sun.com-20090827163030-6o3kk6r2oua159hr) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[7 Oct 2009 1:51]
Paul DuBois
The 5.4 fix has been pushed into 5.4.2.