Bug #38360 BLACKHOLE replication with RBR is broken
Submitted: 25 Jul 2008 3:55 Modified: 17 Oct 2008 18:20
Reporter: Jon Stephens Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:5.1, 6.0 BZR OS:Any
Assigned to: Mats Kindahl
Triage: Triaged: D2 (Serious) / R3 (Medium) / E3 (Medium)

[25 Jul 2008 3:55] Jon Stephens
Description:
Todd Farmer wrote:
> The documentation for BLACKHOLE (at 
> http://dev.mysql.com/doc/refman/5.1/en/blackhole-storage-engine.html) 
> makes the following claim:
> 
> "Inserts into a BLACKHOLE table do not store any data, but if the binary 
> log is enabled, the SQL statements are logged (and replicated to slave 
> servers). This can be useful as a repeater or filter mechanism."
> 
> This is true, in the most literal sense (INSERT operations being the 
> key).  However, it is worth pointing out that BLACKHOLE does not play 
> nicely with row-based binary logging and UPDATE and/or DELETE statements:
> 
> C:\WINDOWS\system32>mysql -uroot -P3310 test
> Welcome to the MySQL monitor.  Commands end with ; or \g.
> Your MySQL connection id is 2
> Server version: 5.1.26-rc-community-log MySQL Community Server (GPL)
> 
> Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
> 
> mysql> SET SESSION binlog_format = 'STATEMENT';
> Query OK, 0 rows affected (0.02 sec)
> 
> mysql> drop table if exists a;
> Query OK, 0 rows affected, 1 warning (0.03 sec)
> 
> mysql> create table a (a INT PRIMARY KEY) ENGINE = BLACKHOLE;
> Query OK, 0 rows affected (0.08 sec)
> 
> mysql> SET SQL_LOG_BIN = OFF;
> Query OK, 0 rows affected (0.03 sec)
> 
> mysql> ALTER TABLE a ENGINE = MyISAM;
> Query OK, 0 rows affected (0.06 sec)
> Records: 0  Duplicates: 0  Warnings: 0
> 
> mysql> SET SQL_LOG_BIN = ON;
> Query OK, 0 rows affected (0.00 sec)
> 
> mysql> SET SESSION binlog_format = 'ROW';
> Query OK, 0 rows affected (0.00 sec)
> 
> mysql> INSERT INTO a VALUES (1);
> Query OK, 1 row affected (0.00 sec)
> 
> mysql> UPDATE a SET a = 2;
> Query OK, 1 row affected (0.00 sec)
> Rows matched: 1  Changed: 1  Warnings: 0
> 
> mysql> DELETE FROM a;
> Query OK, 1 row affected (0.00 sec)
> 
> mysql> exit
> Bye
> 
> C:\WINDOWS\system32>mysql -uroot -P3311
> Welcome to the MySQL monitor.  Commands end with ; or \g.
> Your MySQL connection id is 4
> Server version: 5.1.26-rc-community-log MySQL Community Server (GPL)
> 
> Type 'help;' or '\h' for help. Type '\c' to clear the buffer.
> 
> mysql> show slave status\G
> *************************** 1. row ***************************
>                Slave_IO_State: Waiting for master to send event
>                   Master_Host: localhost
>                   Master_User: root
>                   Master_Port: 3310
>                 Connect_Retry: 60
>               Master_Log_File: knopfler-bin.000001
>           Read_Master_Log_Pos: 946
>                Relay_Log_File: knopfler-relay-bin.000002
>                 Relay_Log_Pos: 666
>         Relay_Master_Log_File: knopfler-bin.000001
>              Slave_IO_Running: Yes
>             Slave_SQL_Running: No
>               Replicate_Do_DB:
>           Replicate_Ignore_DB:
>            Replicate_Do_Table:
>        Replicate_Ignore_Table:
>       Replicate_Wild_Do_Table:
>   Replicate_Wild_Ignore_Table:
>                    Last_Errno: 1032
>                    Last_Error: Could not execute Update_rows event on 
> table test.a; Can't find record in 'a', Error_code
> : 1032; handler error HA_ERR_END_OF_FILE; the event's master log 
> knopfler-bin.000001, end_log_pos 666
>                  Skip_Counter: 0
>           Exec_Master_Log_Pos: 518
>               Relay_Log_Space: 1252
>               Until_Condition: None
>                Until_Log_File:
>                 Until_Log_Pos: 0
>            Master_SSL_Allowed: No
>            Master_SSL_CA_File:
>            Master_SSL_CA_Path:
>               Master_SSL_Cert:
>             Master_SSL_Cipher:
>                Master_SSL_Key:
>         Seconds_Behind_Master: NULL
> Master_SSL_Verify_Server_Cert: No
>                 Last_IO_Errno: 0
>                 Last_IO_Error:
>                Last_SQL_Errno: 1032
>                Last_SQL_Error: Could not execute Update_rows event on 
> table test.a; Can't find record in 'a', Error_code
> : 1032; handler error HA_ERR_END_OF_FILE; the event's master log 
> knopfler-bin.000001, end_log_pos 666
> 1 row in set (0.00 sec)
> 
> A customer ran into this, expecting BLACKHOLE to work with RBR in the 
> same way it works with SBR.
> 
> Thanks!

How to repeat:
See Description.

Suggested fix:
Update docs with info from Todd.
[25 Jul 2008 3:56] Jon Stephens
Todd Farmer wrote:
> Todd Farmer wrote, On 7/24/2008 5:09 PM:
> 
>> This is true, in the most literal sense (INSERT operations being the 
>> key).  However, it is worth pointing out that BLACKHOLE does not play 
>> nicely with row-based binary logging and UPDATE and/or DELETE statements:
> 
> Perhaps also worth adding, and a little more comprehensive:
> mysql> ALTER TABLE a ENGINE = BLACKHOLE;
> Query OK, 0 rows affected (0.06 sec)
> Records: 0  Duplicates: 0  Warnings: 0
> 
> mysql> SET SQL_LOG_BIN = ON;
> Query OK, 0 rows affected (0.00 sec)
> 
> mysql> SET SESSION binlog_format = 'ROW';
> Query OK, 0 rows affected (0.00 sec)
> 
> mysql> INSERT INTO a VALUES (1);
> ERROR 1598 (HY000): Binary logging not possible. Message: Row-based 
> format required for this statement, but not allowed
> by this combination of engines
[25 Jul 2008 7:24] Jon Stephens
See Bug#38355
[25 Jul 2008 10:16] Sveta Smirnova
I believe this is not documentation issue. Test below will show this:

--source include/master-slave.inc

create table t (id int) engine=myisam;

connection slave;
alter table t engine=blackhole;

connection master;
set session binlog_format='ROW';
insert into t values (1), (2);

sleep 1;

connection slave;
select count(*) from t;

connection master;
set session binlog_format='MIXED';
delete from t where id=1;

sleep 1;

connection slave;
show slave status;

connection master;
set session binlog_format='ROW';
delete from t where id=1;

sleep 1;

connection slave;
show slave status;

This doesn't fail.

But if remove

connection master;
set session binlog_format='MIXED';
delete from t where id=1;

test will fail. So I think this is most likely server and not documentation bug.
[25 Jul 2008 10:31] Susanne Ebrecht
bug #38355 is a duplicate of this bug here
[25 Jul 2008 10:32] Sveta Smirnova
Category changed to replication, because reason provided in my above comment.
[29 Jul 2008 5:40] Todd Farmer
I'm still not seeing this as a server bug (unless the lack of support for BLACKHOLE in replicating using row-based binary logging for UPDATE/DELETE statements is incorrect).

When the first test run is executed, the initial DELETE statement deletes the row, and is written to the binary log as a statement.  When the second DELETE statement is executed, no rows are affected, so even though the binary log format is set to "ROW", nothing is written to the binary logs and sent to the slave.  Hence, no error.

Once the initial (binlog_format = 'MIXED') DELETE statement is removed, the DELETE statement executed when binlog_format = 'ROW' actually affects a row, is written to the binary log, replicated to the slave, where it fails.

Changing the test so that the subsequent DELETE statement affects a different row each time causes consistent failure:

--source include/master-slave.inc

create table t (id int) engine=myisam;

--save_master_pos

connection slave;

--sync_with_master
sleep 2 ;
alter table t engine=blackhole;

connection master;
set session binlog_format='ROW';
insert into t values (1), (2);
--save_master_pos
sleep 2 ;

connection slave;
--sync_with_master
select count(*) from t;

connection master;
set session binlog_format='MIXED';
delete from t where id=1;
--save_master_pos

connection slave;
--sync_with_master
--vertical_results
show slave status;

connection master;
set session binlog_format='ROW';
delete from t where id=2;
--save_master_pos

connection slave;
--sync_with_master
--vertical_results
show slave status;
[29 Jul 2008 20:39] Giuseppe Maxia
IMO, this is a server bug, as the following test shows.

Use the replication sandbox (http://launchpad.net/mysql-sandbox) to create a group of 3 servers:

export NODE_OPTIONS="-c log-slave-updates -c replicate-same-server-id=0"
./make_multiple_sandbox /path/to/tarball/mysql-5.1-26-rc-yourOS.tar.gz 

use the following script to set a multi-layer replication
(n1 = master ; n2 = slave of n1 and master of n3;  n3 = slave of n2)

#!/bin/bash
set -x
./use_all "set global binlog_format='statement'"
# ./use_all "set global binlog_format='row'"
./use_all "stop slave; reset master; reset slave"

X=msandbox
H=127.0.0.1
O1=MASTER_HOST
O2=MASTER_PORT
O3=MASTER_USER
O4=MASTER_PASSWORD

./n2 -e "stop slave"
./n2 -e "change master to $O1='$H', $O2=14727, $O3='$X', $O4='$X'"
./n2 -e "start slave"
sleep 1

./n3 -e "stop slave"
./n3 -e "change master to $O1='$H', $O2=14728, $O3='$X', $O4='$X'"
./n3 -e "start slave"
sleep 1

./use_all "show slave status\G" |grep Running

##
## test starts here
##

./n1 -e "drop table if exists test.t1"
sleep 1
./n1 -e "create table test.t1 (id int, c char(10))"

./n2 -e "set sql_log_bin=0; alter table test.t1 engine=blackhole; set sql_log_bin=1"

./n1 -e "insert into test.t1 values (1, 'aaa')"
./n1 -e "insert into test.t1 values (2, 'bbb')"

sleep 1
./use_all "select * from test.t1"
./n1 -e "update test.t1 set c = 'zzz' where id = 1"

sleep 1
./use_all "select * from test.t1"

./n1 -e "delete from test.t1  where id = 1"

sleep 1
./use_all "select * from test.t1"

### --- END TEST --- CUT HERE

With Statement based replication, the blackhole table passes the statements just right, and the output is the following:

+ ./n1 -e 'drop table if exists test.t1'
+ sleep 1
+ ./n1 -e 'create table test.t1 (id int, c char(10))'
+ ./n2 -e 'set sql_log_bin=0; alter table test.t1 engine=blackhole; set sql_log_bin=1'
+ ./n1 -e 'insert into test.t1 values (1, '\''aaa'\'')'
+ ./n1 -e 'insert into test.t1 values (2, '\''bbb'\'')'
+ sleep 1
+ ./use_all 'select * from test.t1'
# server: 1: 
id	c
1	aaa
2	bbb
# server: 2: 
(empty)
# server: 3: 
id	c
1	aaa
2	bbb
+ ./n1 -e 'update test.t1 set c = '\''zzz'\'' where id = 1'
+ sleep 1
+ ./use_all 'select * from test.t1'
# server: 1: 
id	c
1	zzz
2	bbb
# server: 2:
(empty) 
# server: 3: 
id	c
1	zzz
2	bbb
+ ./n1 -e 'delete from test.t1  where id = 1'
+ sleep 1
+ ./use_all 'select * from test.t1'
# server: 1: 
id	c
2	bbb
# server: 2:
(empty) 
# server: 3: 
id	c
2	bbb

(The UPDATE and DELETE statements are relayed as expected) 

To test for row based replication, just change the first lines of the file:
# ./use_all "set global binlog_format='statement'"
./use_all "set global binlog_format='row'"

Repeat the test, and see what happens. The blackhole table does not replicate at all.

+ ./n1 -e 'drop table if exists test.t1'
+ sleep 1
+ ./n1 -e 'create table test.t1 (id int, c char(10))'
+ ./n2 -e 'set sql_log_bin=0; alter table test.t1 engine=blackhole; set sql_log_bin=1'
+ ./n1 -e 'insert into test.t1 values (1, '\''aaa'\'')'
+ ./n1 -e 'insert into test.t1 values (2, '\''bbb'\'')'
+ sleep 1
+ ./use_all 'select * from test.t1'
# server: 1: 
id	c
1	aaa
2	bbb
# server: 2: 
(empty)
# server: 3: 
(empty)
+ ./n1 -e 'update test.t1 set c = '\''zzz'\'' where id = 1'
+ sleep 1
+ ./use_all 'select * from test.t1'
# server: 1: 
id	c
1	zzz
2	bbb
# server: 2: 
(empty)
# server: 3: 
(empty)
+ ./n1 -e 'delete from test.t1  where id = 1'
+ sleep 1
+ ./use_all 'select * from test.t1'
# server: 1: 
id	c
2	bbb
# server: 2: 
(empty)
# server: 3: 
(empty)
[29 Jul 2008 20:58] Giuseppe Maxia
In addition to the above test, the error log of n2 reports the following:

080729 22:50:15 [ERROR] Slave SQL: Error 'Binary logging not possible. Message: Row-based format required for this statement, but not allowed by this combination of engines' on opening tables, Error_code: 1598
080729 22:50:15 [Warning] Slave: Binary logging not possible. Message: Row-based format required for this statement, but not allowed by this combination of engines Error_code: 1598

I am not sure how to interpret the above messages. It seems to imply that Blackhole is not supposed to replicate in RBR.
[30 Jul 2008 6:20] Giuseppe Maxia
Changed description to reflect the current status.
The bug is verified. 
RBR breaks replication when using BLACKHOLE tables.
Replication works fine with BLACKHOLE and SBR.
[30 Jul 2008 8:51] Lars Thalmann
Most likely the solution is to let Blackhole be only statement-capable 
for logging purposes.  Brian supports this:

On Wed, Jul 30, 2008 at 12:22:15AM -0700, Brian Aker wrote:
> RBR was never meant to work with Blackhole. Blackhole was to be tagged
> statement only.
[30 Jul 2008 10:41] Lars Thalmann
I've re-read the bug report more carefully and realize the problem is
different from what I originally thought.

There is a bunch of flags that can be set differently for different
engines (e.g. NDB has different settings).

I think we can and should set these so that a BLACKHOLE slave can
accept row-based events.  The bug fixer will check this out.
[30 Jul 2008 16:13] Todd Farmer
Does "accepting row-based events" also mean the following?

* BLACKHOLE will accept UPDATE/DELETE events, even though it does not have a matching row to affect, without generating an error

* BLACKHOLE will log the incoming RBR events to the slave binary log (when --log-slave-updates is set), even though the DML executed on the BLACKHOLE slave table affected no data.

If those things don't happen, changing BLACKHOLE to "accept row-based events" won't be of any real value, IMO.

I'm sure this is understood, but just want to make sure we're all on the same page here.
[31 Jul 2008 10:40] Lars Thalmann
Thanks Todd, for clarifying.  I agree with these items.
[31 Jul 2008 16:28] Todd Farmer
One more note:

I propose that the solution has to be independent of the setting for slave_exec_mode.  The documentation at http://dev.mysql.com/doc/refman/5.1/en/server-system-variables.html#option_mysqld_slave_ex... is unclear regarding *which* errors are skipped, or what "other special replication scenarios" might require usage of IDEMPOTENT, but it seems unsafe to require the slave to be set to ignore data consistency errors for ALL tables, simply to support BLACKHOLE engine usage.

Ideally, such an option would be a table-level option, rather than affecting error checking for all tables on a slave.
[22 Sep 2008 19:00] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/54459

2683 Mats Kindahl	2008-09-22
      Bug #38360: BLACKHOLE replication with RBR is broken
      
      The Blackhole engine did not support row-based replication
      since the delete_row() and update_row() functions were not
      implemented.
      
      This patch adds row-based replication support for the
      Blackhole engine by implementing the two functions mentioned
      above, and making the engine pretend that it has found the
      correct row to delete or update when executed from the slave
      SQL thread.
      
      It is necessary to only pretend this for the SQL thread, since
      a SELECT executed on the Blackhole engine will otherwise never
      return EOF, causing a livelock.
[2 Oct 2008 8:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55018

2683 Mats Kindahl	2008-10-02
      Bug #38360: BLACKHOLE replication with RBR is broken
      
      The Blackhole engine did not support row-based replication
      since the delete_row(), update_row(), and the index and range
      searching functions were not implemented.
      
      This patch adds row-based replication support for the
      Blackhole engine by implementing the two functions mentioned
      above, and making the engine pretend that it has found the
      correct row to delete or update when executed from the slave
      SQL thread by implementing index and range searching functions.
      
      It is necessary to only pretend this for the SQL thread, since
      a SELECT executed on the Blackhole engine will otherwise never
      return EOF, causing a livelock.
[2 Oct 2008 9:03] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55024

2683 Mats Kindahl	2008-10-02
      Bug #38360: BLACKHOLE replication with RBR is broken
      
      The Blackhole engine did not support row-based replication
      since the delete_row(), update_row(), and the index and range
      searching functions were not implemented.
      
      This patch adds row-based replication support for the
      Blackhole engine by implementing the two functions mentioned
      above, and making the engine pretend that it has found the
      correct row to delete or update when executed from the slave
      SQL thread by implementing index and range searching functions.
      
      It is necessary to only pretend this for the SQL thread, since
      a SELECT executed on the Blackhole engine will otherwise never
      return EOF, causing a livelock.
[3 Oct 2008 9:21] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55195

2684 Mats Kindahl	2008-10-02
      Bug #38360: BLACKHOLE replication with RBR is broken
      
      Incremental patch to add comments to test cases.
[6 Oct 2008 14:30] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55463

2669 Alfranio Correia	2008-10-06
      Created a new test case that aims at verifying that 
      replication works with an intermediate slave using BLACKHOLE 
      (BUG#38360) storage ngine.
[7 Oct 2008 13:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/55592

2669 Alfranio Correia	2008-10-07
      BUG#38360 - Created a new test case that aims at verifying that replication works with 
      an intermediate slave using BLACKHOLE storage ngine.
[7 Oct 2008 20:02] Paul Dubois
Noted in 5.1.29 changelog.

Replication of BLACKHOLE tables did not work with row-based binary 
logging. 

Leaving report status unchanged; this is early documentation of an upcoming push into 5.1.29.
[9 Oct 2008 17:54] Bugs System
Pushed into 5.1.30  (revid:mats@sun.com-20081002191315-3hq5hhz88qaz4hcp) (version source revid:kgeorge@mysql.com-20081007082452-gk4l86zq8k53wwyo) (pib:4)
[9 Oct 2008 18:44] Paul Dubois
Setting report to NDI pending push into 6.0.x.
[17 Oct 2008 16:45] Bugs System
Pushed into 6.0.8-alpha  (revid:mats@sun.com-20081002191315-3hq5hhz88qaz4hcp) (version source revid:kgeorge@mysql.com-20081007153644-uypi14yjgque9obc) (pib:5)
[17 Oct 2008 18:20] Paul Dubois
Noted in 6.0.8 changelog.
[28 Oct 2008 21:05] Bugs System
Pushed into 5.1.29-ndb-6.2.17  (revid:mats@sun.com-20081002191315-3hq5hhz88qaz4hcp) (version source revid:tomas.ulin@sun.com-20081028140209-u4emkk1xphi5tkfb) (pib:5)
[28 Oct 2008 22:24] Bugs System
Pushed into 5.1.29-ndb-6.3.19  (revid:mats@sun.com-20081002191315-3hq5hhz88qaz4hcp) (version source revid:tomas.ulin@sun.com-20081028194045-0353yg8cvd2c7dd1) (pib:5)
[1 Nov 2008 9:49] Bugs System
Pushed into 5.1.29-ndb-6.4.0  (revid:mats@sun.com-20081002191315-3hq5hhz88qaz4hcp) (version source revid:jonas@mysql.com-20081101082305-qx5a1bj0z7i8ueys) (pib:5)