MySQL Bugs: #36468: Falcon replication failure, engine reports HA_ERR_END_OF

Bug #36468	Falcon replication failure, engine reports HA_ERR_END_OF_FILE on slave
Submitted:	2 May 2008 11:24	Modified:	30 Jul 2008 13:55
Reporter:	Philip Stoev	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Falcon storage engine	Severity:	S1 (Critical)
Version:	6.0.5	OS:	Linux (64bit)
Assigned to:	Christopher Powers	CPU Architecture:	Any
Tags:	replication

Description:
When executing a replication workload, the slave SQL thread stops, with Last_Error:

Could not execute Update_rows event on table tpcb.account; Can't find record in 'account', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 12552

How to repeat:
Simpler testcase will follow shortly.

Init file for bug 36468

Attachment: bug36468_init.test (application/octet-stream, text), 14.35 KiB.

Test case for bug 36468

Attachment: bug36468_run.test (application/octet-stream, text), 897 bytes.

To reproduce the bug, please run this on a 64 bit linux machine:

# Start the master and slave

$ perl mysql-test-run.pl --start-and-exit --skip-ndb rpl_alter

# Create tables and such

$ perl mysql-test-run.pl --extern --socket=var/tmp/master.sock --user=root bug36468_init

# Make sure master and slave are still up, and run

$ perl mysql-test-run.pl --extern --socket=var/tmp/master.sock --user=root bug36468_run

# Cat slave.err and see the message

$ cat var/log/slave.err

Another way to get the same error, even on 32bit:

$ perl mysql-test-run.pl --mysqld=--skip-innodb --mysqld=--default-storage-engine=falcon rpl.rpl_row_basic_8partition

TEST                           RESULT         TIME (ms)
-------------------------------------------------------

rpl.rpl_row_basic_8partition 'row' [ fail ]

=== SHOW MASTER STATUS ===
---- 1. ----
File    slave-bin.000001
Position        5982
Binlog_Do_DB
Binlog_Ignore_DB
==========================

=== SHOW SLAVE STATUS ===
---- 1. ----
Slave_IO_State  Waiting for master to send event
Master_Host     127.0.0.1
Master_User     root
Master_Port     9306
Connect_Retry   1
Master_Log_File master-bin.000001
Read_Master_Log_Pos     6199
Relay_Log_File  slave-relay-bin.000002
Relay_Log_Pos   6019
Relay_Master_Log_File   master-bin.000001
Slave_IO_Running        Yes
Slave_SQL_Running       No
Replicate_Do_DB
Replicate_Ignore_DB
Replicate_Do_Table
Replicate_Ignore_Table
Replicate_Wild_Do_Table
Replicate_Wild_Ignore_Table
Last_Errno      0
Last_Error      Could not execute Delete_rows event on table test.t1; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 6130
Skip_Counter    0
Exec_Master_Log_Pos     5872
Relay_Log_Space 6502
Until_Condition None
Until_Log_File
Until_Log_Pos   0
Master_SSL_Allowed      No
Master_SSL_CA_File
Master_SSL_CA_Path
Master_SSL_Cert
Master_SSL_Cipher
Master_SSL_Key
Seconds_Behind_Master
Master_SSL_Verify_Server_Cert   No
Last_IO_Errno   0
Last_IO_Error
Last_SQL_Errno  0
Last_SQL_Error  Could not execute Delete_rows event on table test.t1; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 6130
=========================

mysqltest: In included file "./include/rpl_multi_engine3.inc": At line 56: could not sync with master ('select master_pos_wait('master-bin.000001', 6199)' returned NULL)

The result from queries just before the failure was:
< snip >
4       1       Testing MySQL databases is a cool       Must make it bug free for the customer  654321.4321 15.21   0       1965    1985-11-14
42      1       Testing MySQL databases is a cool       Must make it bug free for the customer  654321.4321 15.21   0       1965    1905-11-14
142     1       Testing MySQL databases is a cool       Must make it bug free for the customer  654321.4321 15.21   0       1965    1995-11-14
412     1       Testing MySQL databases is a cool       Must make it bug free for the customer  654321.4321 15.21   0       1965    2005-11-14
--- Update t1 on master --
UPDATE t1 SET b1 = 0, t="2006-02-22" WHERE id = 412;
--- Check the update on master ---
SELECT id,hex(b1),vc,bc,d,f,total,y,t FROM t1 WHERE id = 412;
id      hex(b1) vc      bc      d       f       total   y       t
412     0       Testing MySQL databases is a cool       Must make it bug free for the customer  654321.4321 15.21   0       1965    2006-02-22
--- Check Update on slave ---
SELECT id,hex(b1),vc,bc,d,f,total,y,t FROM t1 WHERE id = 412;
id      hex(b1) vc      bc      d       f       total   y       t
412     0       Testing MySQL databases is a cool       Must make it bug free for the customer  654321.4321 15.21   0       1965    2006-02-22
--- Remove a record from t1 on master ---
DELETE FROM t1 WHERE id = 42;
--- Show current count on master for t1 ---
SELECT COUNT(*) FROM t1;
COUNT(*)
4

There is no indication that this is problem in Falcon. Lars, can your team take a quick look at this?

6.0.4 was not affected by this bug. Innodb and MyISAM do not fail with the provided test.

The problem is that a Delete_rows_log_event containing garbage is written to the binlog. This causes the slave to stop replicating when the event is executed. This happens with Falcon tables that contain at least one column of type 'text', whenever a row is deleted from the table.

Here's a minimal test case:

==== begin rpl_bug36468.test ====
source include/master-slave.inc;
source include/have_binlog_format_row.inc;

# create a falcon table with a single 'text' column and a single row
CREATE TABLE t1(a text) ENGINE=falcon;
INSERT INTO t1 VALUES('Hello world! This test case reproduces BUG#36468.');
SELECT * FROM t1;
sync_slave_with_master;
SELECT * FROM t1;
connection master;

# deleting from this table produces a Delete_rows_log_event containing
# garbage in the binlog, which causes the slave to stop with an error.
DELETE FROM t1;
sync_slave_with_master;

connection master;
drop table t1;
sync_slave_with_master;
==== end rpl_bug36468.test ====

If you run the test case and then run mysqlbinlog --hexdump var/log/master-bin.000001, you see this Delete_rows_log_event:

#080515 18:36:37 server id 1  end_log_pos 658 
# Position  Timestamp   Type   Master ID        Size      Master Pos    Flags
#      241 95 66 2c 48   19   01 00 00 00   51 00 00 00   92 02 00 00   10 00
#      254 10 00 00 00 00 00 01 00  01 01 fe 31 00 ee ee ee |...........1....|
#      264 ee ee ee ee ee ee ee ee  ee ee ee ee ee ee ee ee |................|
#      274 ee ee ee ee ee ee ee ee  ee ee ee ee ee ee ee ee |................|
#      284 ee ee ee ee ee ee ee ee  ee ee ee ee ee ee       |..............|
#       Delete_rows: table id 16 flags: STMT_END_F

Instead of the string in the 'text' column, you get a string of 0xee bytes, of the same length.

Sven, I was wondering, can you imagine a situation where the garbage written to the binlog accidentially happens to constitute a valid Delete event, that will delete a record different from the one that is supposed to be deleted.

Philip, yes, for each string of length X that master deletes, slave will try to delete a string consisting of only the character 0xee repeated X times. So if the slave has such strings in the table, then is is possible that the slave successfully removes them instead of stopping with an error. Here's an example (modified version of the test above):

==== begin rpl_bug36468_a.test ====
source include/master-slave.inc;
source include/have_binlog_format_row.inc;

# create a falcon table with a single 'text' column and a single row
CREATE TABLE t1(a text) ENGINE=falcon;
INSERT INTO t1 VALUES(CHAR(0xee, 0xee, 0xee, 0xee));
INSERT INTO t1 VALUES('test');

SELECT * FROM t1;
sync_slave_with_master;
SELECT * FROM t1;
connection master;

# deleting a string of length four on this table produces a
# Delete_rows_log_event which tries to delete
# CHAR(0xee, 0xee, 0xee, 0xee). Since this string happens to
# exist in the table, the operation is successful and the wrong
# row is removed from the table.
DELETE FROM t1 WHERE a='test' ;
SELECT * FROM t1;
sync_slave_with_master;
SELECT * FROM t1;

connection master;
drop table t1;
sync_slave_with_master;
==== end rpl_bug36468_a.test ====

Sven, it appears that more than just DELETEs on a TEXT column are affected. My initial example causes the error by calling a stored procedure that does not use any DELETE , just INSERTs and UPDATEs. Also, the server reports the error in the "account" table, which does not contain TEXT, but contains INT, DECIMAL and CHAR. Mysqlbinlog reports this at the location of the failure:

#080517 17:26:31 server id 1  end_log_pos 12552
# Position  Timestamp   Type   Master ID        Size      Master Pos    Flags
#     30cf 17 eb 2e 48   18   01 00 00 00   39 00 00 00   08 31 00 00   10 00
#     30e2 14 00 00 00 00 00 01 00  04 01 0c fe 17 00 00 00 |................|
#     30f2 fc 7f ff ff fe ff 0f 61  63 63 6f 75 6e 74 20 75 |.......account.u|
#     3102 70 64 61 74 65 64  |pdated|
#       Update_rows: table id 20 flags: STMT_END_F

Unfortunately, I can not narrow down my initial example further because I keep hitting bug #32709 in a debug binary. Please let me know if I should keep trying or the bug is now sufficiently well undersood.

Philip, I think we probably have two bugs here. The bug described by Hakan and which I have investigated shows corruption in the binlog in a deterministic manner. It can be consistently reproduced the same way each time the test is run. The symptoms you describe are different: high workload is needed, not easily reproducible, and the binlog is not corrupted in the same way.

So we should probably open a new bug. If you can, please continue trying to simplify the test case.

Sven, both test cases on this bug require no concurrency. They are both easy to reproduce and require no high loads.

Chris Powers says:

This bug involves the failure of deleted records having blob-type fields (e.g. TEXT) to replicate.

Replication fails because the DELETE entry in the binlog contains garbage for the blob field. The blob field contains garbage because StorageInterface::delete_row() frees the blob memory before the DELETE operation is added to the binlog (see ha_falcon.cpp and handler.cc.)

This is only a problem for deleted records. Inserts and and updates work fine.

I don't know why we haven't seen this before, and I have not found any code changes that account for this apparent change in behavior.

The simplest solution is to remove freeActiveBlobs() from delete_row(). This resolves the replication problem, but does it introduce potential memory problems?

Here is the issue in more detail:

Each instance of StorageInterface maintains a list of StorageBlob objects that are allocated when a blob field in the current record is decoded, for example during table or index scans. StorageBlobs contains a blob id and a pointer to blob data. The blob data is always freed before the next record is decoded, however, the StorageBlob object itself is retained for reuse.

The problem is that StorageInterface::delete_row() calls freeActiveBlobs() in order to free the blob memory allocated from the preceding positioning operation, e.g rnd_next(). This is a problem for replication because handler::ha_delete_row() binlogs the contents of the deleted record immediately after calling delete_row().

I think we can safely remove the call to freeActiveBlobs() in delete_row() because the blob memory will either be freed by a subsequent record positioning (the final rnd_next, rnd_pos, index_next), or when the StorageInterface object destructs.

The risk is that somehow delete_row() can be called repeatedly before freeActiveBlobs() is called, resulting a rapid increase in StorageBlob memory allocation. This might be mitigated by adding freeActiveBlobs() elsewhere, perhaps rnd_init() or index_init().

I invite those more familiar with the storage interface to give this some thought. In the meantime, I can give Philip a patch to test and push the changes if there are no objections.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/47159

ChangeSet@1.2685, 2008-05-28 10:48:58-05:00, cpowers@xeno.mysql.com +1 -0
  Bug#36468, Falcon replication failure, engine reports HA_ERR_END_OF_FILE on slave
  
  Replication requires that blob data remain intact until the DELETE operation is 
  binlogged. Removed freeActiveBlobs() from StorageInterface::delete_rows().

Fix approved by Jim Starkey.

rpl.rpl_row_basic_8partition.test (modified for Falcon) passes.

A bit more elaboration on the previous comments.

Removing freeActiveBlobs() from delete_row() does not introduce the potential for a memory leak because, as noted above, subsequent positioning commands (rnd_next, rnd_pos, index_next) will release blob memory.

Although this bug initially appeared to be a regression, the delete_row() method has always freed blob data, so the replication of blob data types would have never worked. I suspect that Falcon blob replication has not been sufficiently tested until now.

The rpl_row_basic_8partition test case now passes fine in the latest 6.0-falcon-team.

However, the bug36468_init.test/bug36468_run.test continues to fail on a non-debug build of 6.0-falcon-test.

New test case for bug 36468

Attachment: bug36468.test (application/octet-stream, text), 834 bytes.

A new test case has been uploaded that fails against the current 6.0-falcon-team, non-debug comple. To run, please do:

$ perl mysql-test-run.pl --start-and-exit --skip-ndb rpl_alter
$ perl mysql-test-run.pl --extern --socket=var/tmp/master.sock --user=root bug36468
$ cat var/log/slave.err

Please note that for the bug to show up, the last UPDATE must be issued from a new thread. If the test case is modified not to reconnect before the UPDATE, the bug will not show up.

Chris Powers said in an email:

"I reproduced the second bug yesterday (Sunday) on both Windows and Linux, debug and non-debug. I debugged in depth today after coding a workaround for unrelated replication Bug#32709. 

The testcase creates two clients on the master server. Client A inserts several rows into a table from within a stored procedure. Although no error is reported, the inserts do not replicate. Since the table is incomplete on the slave, updates replicated from Client B fail with the error that Philip reported.

Tomorrow morning I will determine why the inserts from Client A fail to replicate. It is not clear whether the problem is with Falcon or Replication."

Here is an even simpler test case for the remaining bug:
--source include/master-slave.inc

--connection slave
set global binlog_format = 'row';
--connection master
set global binlog_format = 'row';

CREATE TABLE IF NOT EXISTS account (id INT) ENGINE = Falcon;
SET AUTOCOMMIT=0;
INSERT INTO account VALUES (1);
SET AUTOCOMMIT=1;

The INSERT will not make it to the slave. The AUTOCOMMIT=0 is required to cause the problem. Innodb is not affected. Maybe the assert from bug 32709 signifies that the binary log file has been insufficiently flushed by the time the connection is closed, meaning that log records were lost?

One more observation:

SET AUTOCOMMIT=1 is supposed to cause implicit COMMIT. However this only causes the insert to appear on the master, and not on the slave.

On the other hand, if you use explicit COMMIT, the row appears on both.

In both cases, Innodb is not affected.

Mats,

The implicit commit generated by SET AUTOCOMMIT=1 appears to prevent pending events from being flushed to the binlog. Subsequent START TRANSACTION/COMMIT or changes to AUTOCOMMIT have no effect. Exiting the client triggers the assert detailed in Bug#32709.

Note that sequence SET AUTOCOMMIT=0, <do stuff>, COMMIT works ok.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/47694

ChangeSet@1.2626, 2008-06-10 10:54:51-05:00, cpowers@xeno.mysql.com +1 -0
  Bug#36468, "Falcon replication failure, engine reports HA_ERR_END_OF_FILE on slave"
  - Removed freeActiveBlobs() from StorageInterface::delete_rows(). Replication requires that blob
  data remain intact until DELETE operations are binlogged.

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/47704

2692 Christopher Powers	2008-06-10
      Bug#36468, "Falcon replication failure"
      
      - Removed freeActiveBlobs() from StorageInterface::deleteRow(). Replication requires blob memory to remain intact until the DELETE event is binlogged.

OK to push to mysql-6.0-falcon in Bazzar.
This change was pushed to the 6.0.5 clone-off.
Chris, please confirm.

This change is in the 6.0.5 release.

Pushed into 6.0.7-alpha  (revid:serg@mysql.com-20080722121106-wy84j0yvceyu72zr) (pib:2)

A note has been added to the 6.0.5 changelog: 

Replicating a Falcon table that contained a TEXT or BLOG column would fail during a DELETE operation with the error HA_ERR_END_OF_FILE.

Pushed into 6.0.7-alpha  (revid:sp1r-cpowers@xeno.mysql.com-20080610155451-14755) (version source revid:vvaintroub@mysql.com-20080804094710-jb2qpqxpf2ir2gf3) (pib:3)