| Bug #42749 | infinite loop writing to row based binlog - processlist shows "freeing items" | ||
|---|---|---|---|
| Submitted: | 11 Feb 2009 6:59 | Modified: | 25 Jun 2009 9:06 |
| Reporter: | Shane Bester (Platinum Quality Contributor) | Email Updates: | |
| Status: | Closed | Impact on me: | |
| Category: | MySQL Server: Row Based Replication ( RBR ) | Severity: | S1 (Critical) |
| Version: | 5.1.30, 5.1.32, 6.0.10 | OS: | Any (Windows, Linux) |
| Assigned to: | Luis Soares | CPU Architecture: | Any |
[11 Feb 2009 8:11]
MySQL Verification Team
in 6.0.10 it is looping in function MYSQL_BIN_LOG::write_cache:
while (hdr_offs < length)
{
if (hdr_offs + LOG_EVENT_HEADER_LEN > length)
{
<cut>
}
else
{
uchar *log_pos= (uchar *)cache->read_pos + hdr_offs + LOG_POS_OFFSET;
/* fix end_log_pos */
val= uint4korr(log_pos) + group;
int4store(log_pos, val);
/* next event header at ... */
log_pos= (uchar *)cache->read_pos + hdr_offs + EVENT_LEN_OFFSET;
hdr_offs += uint4korr(log_pos);
}
}
when the looping happens:
hdr_offs 12271 unsigned int
length 13381 unsigned int
hdr_offs never increases because "uint4korr(log_pos);" is always
returning 0.
[11 Feb 2009 14:08]
Mats Kindahl
Is the server a 32-bit build or a 64-bit build?
[11 Feb 2009 14:30]
MySQL Verification Team
I used 64-bit 5.1.30 and 32-bit 6.0.10. Customer used 5.1.30 64-bit on linux.
[9 Mar 2009 8:28]
MySQL Verification Team
actually, this use case is pretty common. it can affect every replication setup or every setup using binlogging with row based format.. there will likely be many 5.1 users needing a fix.
[30 Apr 2009 7:24]
Sveta Smirnova
Bug #44548 looks like same case. Reporter uses MIXED binary log format.
[30 Apr 2009 16:26]
Sveta Smirnova
Bug #44548 was confirmed to be duplicate of this one.
[12 May 2009 7:11]
Mats Kindahl
See BUG#44752, which is a duplicate
[12 May 2009 7:17]
Mats Kindahl
Analyzing corrupt binlog sent to me by Luís.
Have a look at this:
$ client/mysqlbinlog master-bin.000001
...
# at 8727
...
# at 8795
ERROR: Error in Log_event::read_log_event(): 'Sanity check failed', data_len: 176128, event_type: 74
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
We start off with a good event (02 == QUERY_LOG_EVENT)
$ od -t x1 -j8727 -N19 master-bin.000001
0021027 6a 73 08 4a 02 01 00 00 00 44 00 00 00 5b 22 00
^^^^^^^^^^^ ^^ ^^^^^^^^^^^
timestamp type length
0021047 00 00 00
0021052
Length is 0x00000044 == 68
$ expr 8727 + 68
8795
$ od -t x1 -j8795 -N19 master-bin.000001
0021133 6a 73 08 4a 13 01 00 00 00 f7 01 00 00 52 24 00
^^^^^^^^^^^ ^^^^^^^^^^^
0021153 00 00 00
0021156
Length is 0x000001f7 == 503
$ expr 8795 + 503
9298
$ od -t x1 -j9298 -N19 master-bin.000001
0022122 03 6a 73 08 4a 17 01 00 00 00 b0 02 00 5b ca 04
^^^^^^^^^^^ ^^ ^^^^^^^^^^^
0022142 00 00 00
0022145
The first bytes are the timestamp, which should not vary very much in a run.
Also, the 5:th byte is the type, which is 0x4a == 74 and length becomes
0x0002b000 == 176128 which is what mysqlbinlog reported as type and length
respectively.
The length of the table map event is not correct.
[12 May 2009 11:32]
Luis Soares
It seems that the calculation of the event size in
Table_map_log_event, in its constructor, is wrong by one byte, when
m_field_metadata_size exceeds 255. In log_event.cc:
(...)
7863 /*
7864 Create an array for the field metadata and store it.
7865 */
7866 m_field_metadata_size= save_field_metadata();
7867 DBUG_ASSERT(m_field_metadata_size <= (m_colcnt * 2));
7868
7869 /*
7870 Now set the size of the data to the size of the field metadata array
7871 plus one or two bytes for number of elements in the field metadata array.
7872 */
7873 if (m_field_metadata_size > 255)
7874 m_data_size+= m_field_metadata_size + 2;
7875 else
7876 m_data_size+= m_field_metadata_size + 1;
(...)
The m_data_size above should be +3 and not +2 as later, when writing the
body (Table_map_log_event::write_data_body) there is the following call,
in log_event.cc:
(...)
8134 /*
8135 Store the size of the field metadata.
8136 */
8137 uchar mbuf[sizeof(m_field_metadata_size)];
8138 uchar *const mbuf_end= net_store_length(mbuf,m_field_metadata_size);
(...)
Inside net_store_length, the return value is actually 3 bytes shifted and
not 2 bytes as expected in the constructor (note the 1 increment in the
packet and the +2 in the return). In pack.c:
(...)
97 uchar *net_store_length(uchar *packet, ulonglong length)
98 {
99 if (length < (ulonglong) LL(251))
100 {
101 *packet=(uchar) length;
102 return packet+1;
103 }
104 /* 251 is reserved for NULL */
105 if (length < (ulonglong) LL(65536))
106 {
107 *packet++=252;
108 int2store(packet,(uint) length);
109 return packet+2;
110 }
(...)
[12 May 2009 11:54]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/73811 2887 Luis Soares 2009-05-12 BUG#42749: infinite loop writing to row based binlog - processlist shows "freeing items" The calculation of the table map log event in the event constructor was one byte shorter than what would be actually written. This would lead to a mismatch between the number of bytes written and the event end_log_pos, causing bad event alignment in the binlog (corrupted binlog) or in the transaction cache while fixing positions (MYSQL_BIN_LOG::write_cache). This could lead to impossible to read binlog or even infinite loops in MYSQL_BIN_LOG::write_cache. This patch addresses this issue by correcting the expected event length in the Table_map_log_event constructor, when the field metadata size exceeds 255. @ sql/log_event.cc Added the extra byte as net_store_length imposes.
[14 May 2009 7:31]
Zhenxing He
BUG#44627 is a dup of this.
[14 May 2009 11:59]
Luis Soares
Pushed to 5.1-bugteam and 6.0-bugteam.
[28 May 2009 12:39]
Jon Stephens
Documented bugfix in the 5.1.36 changelog as follows:
When using row-based logging, the length of an event for which
the field metadata exceeded 255 bytes in size was incorrectly
calculated. This could lead to corruption of the binary log, or
cause the server to hang.
[28 May 2009 13:07]
Jon Stephens
Set to NDI pending push to 5.1+.
[17 Jun 2009 19:23]
Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090616183122-chjzbaa30qopdra9) (version source revid:luis.soares@sun.com-20090514115255-seesp0i0tn8tuhus) (merge vers: 6.0.12-alpha) (pib:11)
[25 Jun 2009 9:06]
Jon Stephens
Also documented in the 5.4.4 changelog. Closed.
[12 Aug 2009 22:06]
Paul DuBois
Noted in 5.4.2 changelog because next 5.4 version will be 5.4.2 and not 5.4.4.
[14 Aug 2009 22:57]
Paul DuBois
Ignore previous comment about 5.4.2.
[26 Aug 2009 13:45]
Bugs System
Pushed into 5.1.37-ndb-7.0.8 (revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[26 Aug 2009 13:46]
Bugs System
Pushed into 5.1.37-ndb-6.3.27 (revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (version source revid:jonas@mysql.com-20090826105955-bkj027t47gfbamnc) (merge vers: 5.1.37-ndb-6.3.27) (pib:11)
[26 Aug 2009 13:48]
Bugs System
Pushed into 5.1.37-ndb-6.2.19 (revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (version source revid:jonas@mysql.com-20090825194404-37rtosk049t9koc4) (merge vers: 5.1.37-ndb-6.2.19) (pib:11)
[27 Aug 2009 16:32]
Bugs System
Pushed into 5.1.35-ndb-7.1.0 (revid:magnus.blaudd@sun.com-20090827163030-6o3kk6r2oua159hr) (version source revid:jonas@mysql.com-20090826132541-yablppc59e3yb54l) (merge vers: 5.1.37-ndb-7.0.8) (pib:11)
[7 Oct 2009 1:51]
Paul DuBois
The 5.4 fix has been pushed into 5.4.2.

Description: when --log-bin --binlog_format=row is enabled and importing a mysqldump, it can hang like this: mysql> show processlist\G *************************** 1. row ***************** Id: 1 User: root Host: db: test Command: Query Time: 214 State: freeing items Info: INSERT INTO `xxx` VALUES (1234,'1234 .... Call stack for thread: mysqld.exe!MYSQL_BIN_LOG::write_cache Line 4287 mysqld.exe!MYSQL_BIN_LOG::write Line 4397 mysqld.exe!binlog_end_trans Line 1401 mysqld.exe!MYSQL_BIN_LOG::log_xid Line 5440 mysqld.exe!ha_commit_trans Line 1163 mysqld.exe!ha_autocommit_or_rollback Line 1333 mysqld.exe!dispatch_command mysqld.exe!do_command mysqld.exe!handle_one_connection mysqld.exe!pthread_start mysqld.exe!_callthreadstart mysqld.exe!_threadstart no other connections will be able to write to the binlog during this hang. How to repeat: start server with --log-bin --binlog-format=row and import the privately attached mysqldump