Bug #32605 Binary log and replication filtering options differ between RBR and SBR
Submitted: 21 Nov 2007 22:56 Modified: 10 Jan 2008 15:37
Reporter: Todd Farmer (OCA) Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S2 (Serious)
Version:5.1.22 OS:Any
Assigned to: Jon Stephens CPU Architecture:Any

[21 Nov 2007 22:56] Todd Farmer
Description:
MySQL has long provided replication and binary log filtering options, and one of the quirks of many of these options is that they rely on the default database rather than the actual database of the affected table.  These behaviors are well documented for statement-based replication, and no updates to this documentation have been made (see http://dev.mysql.com/doc/refman/5.1/en/binary-log.html , http://dev.mysql.com/doc/refman/5.1/en/replication-options.html and http://dev.mysql.com/doc/refman/5.1/en/replication-rules.html in particular), despite the fact that RBR behaves entirely differently (and perhaps more logically):

mysql> use test;
Database changed
mysql> set binlog_format = 'STATEMENT';
Query OK, 0 rows affected (0.00 sec)

mysql> create table filter.t1 (i int);
Query OK, 0 rows affected (0.11 sec)

mysql> create table filter.t2 (i int);
Query OK, 0 rows affected (0.09 sec)

mysql> set binlog_format = 'ROW';
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO filter.t1 VALUES (1);
Query OK, 1 row affected (0.00 sec)

mysql> set binlog_format = 'STATEMENT';
Query OK, 0 rows affected (0.00 sec)

mysql> INSERT INTO filter.t1 VALUES (2);
Query OK, 1 row affected (0.00 sec)

# result on slave with --replicate-ignore-db=filter:

mysql> select * from t1;
+------+
| i    |
+------+
|    2 |
+------+
1 row in set (0.00 sec)

mysql> show slave status\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: localhost
                  Master_User: rep
                  Master_Port: 3307
                Connect_Retry: 60
              Master_Log_File: knopfler-bin.000006
          Read_Master_Log_Pos: 464
               Relay_Log_File: knopfler-relay-bin.000028
                Relay_Log_Pos: 612
        Relay_Master_Log_File: knopfler-bin.000006
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB: filter
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 464
              Relay_Log_Space: 816
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
1 row in set (0.00 sec)

mysql>

This will cause immense headaches for those who have become dependent on the existing SBR filtering mechanisms (particularly if running in MIXED mode).

How to repeat:
Configure master-slave replication with --replicate-ignore-db=filter on slave, then:

# on master:

CREATE DATABASE filter;

# on slave:

CREATE DATABASE filter;

# on master:

SET binlog_format = 'STATEMENT';
USE test

CREATE TABLE filter.t1 (i INT);
INSERT INTO filter.t1 VALUES (1);

SET binlog_format = 'ROW';
INSERT INTO filter.t1 VALUES (2);

# on slave:

SELECT * FROM filter.t1; -- note that only one row replicated

Suggested fix:
Either make make RBR implement the filtering logic used in SBR, or update documentation accordingly.  There should be significant warnings about running in MIXED mode with filtering of any kind, as well as documentation of the differences in behavior between STATEMENT and ROW.
[21 Nov 2007 23:13] Todd Farmer
Another example, this dealing with binlog-ignore-db:

mysql> use test;
Database changed
mysql> set binlog_format = 'ROW';
Query OK, 0 rows affected (0.00 sec)

mysql> insert into filter.t2 values (1);
Query OK, 1 row affected (0.28 sec)

mysql> show binlog events in 'knopfler-bin.000004';
+---------------------+-----+-------------+-----------+-------------+----------------------------------------------------+
| Log_name            | Pos | Event_type  | Server_id | End_log_pos | Info                                               |
+---------------------+-----+-------------+-----------+-------------+----------------------------------------------------+
| knopfler-bin.000004 |   4 | Format_desc |         6 |         106 | Server ver: 5.1.22-rc-community-log, Binlog ver: 4 |
| knopfler-bin.000004 | 106 | Query       |         6 |         199 | use `test`; create table filter.t2 (i int)         |
| knopfler-bin.000004 | 199 | Query       |         6 |         292 | use `test`; create table filter.t3 (i int)         |
+---------------------+-----+-------------+-----------+-------------+----------------------------------------------------+
3 rows in set (0.00 sec)

mysql> show master status;
+---------------------+----------+--------------+------------------+
| File                | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+---------------------+----------+--------------+------------------+
| knopfler-bin.000004 |      292 |              | filter           |
+---------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

mysql> create table ft (i int);
Query OK, 0 rows affected (0.11 sec)

mysql> use filter
Database changed
mysql> insert into test.ft values (1);
Query OK, 1 row affected (0.05 sec)

mysql> show binlog events in 'knopfler-bin.000004';
+---------------------+-----+-------------+-----------+-------------+----------------------------------------------------+
| Log_name            | Pos | Event_type  | Server_id | End_log_pos | Info                                               |
+---------------------+-----+-------------+-----------+-------------+----------------------------------------------------+
| knopfler-bin.000004 |   4 | Format_desc |         6 |         106 | Server ver: 5.1.22-rc-community-log, Binlog ver: 4 |
| knopfler-bin.000004 | 106 | Query       |         6 |         199 | use `test`; create table filter.t2 (i int)         |
| knopfler-bin.000004 | 199 | Query       |         6 |         292 | use `test`; create table filter.t3 (i int)         |
| knopfler-bin.000004 | 292 | Query       |         6 |         378 | use `test`; create table ft (i int)                |
| knopfler-bin.000004 | 378 | Table_map   |         6 |         419 | table_id: 16 (test.ft)                             |
| knopfler-bin.000004 | 419 | Write_rows  |         6 |         453 | table_id: 16 flags: STMT_END_F                     |
+---------------------+-----+-------------+-----------+-------------+----------------------------------------------------+
6 rows in set (0.02 sec)
[22 Nov 2007 9:27] Mats Kindahl
Yes, row-based replication is filtering the rows on the actual table (including the database) and not on the currently selected database. This is a deliberate decision made to resolve some of the problems with statement-based replication (e.g., multi-table updates). It is unfortunate that it has not been documented properly.

In general when using statement-based replication, one should never qualify the tables with the database, and instead set the current database to the correct one. If that rule is followed, mixed mode replication will not have any problems either.

For some more information, please have a look at my post regarding this, which is available under:

    http://mysqlmusings.blogspot.com/2006/09/replication-and-disappearing.html
[26 Nov 2007 15:02] Jon Stephens
Changed Category to Docs so this will show up in my queue.
[10 Jan 2008 15:37] Jon Stephens
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.