MySQL Bugs: #17620: Replicate (Row Based) Fails when Query Cache enabled on slave.

Bug #17620	Replicate (Row Based) Fails when Query Cache enabled on slave.
Submitted:	21 Feb 2006 15:50	Modified:	25 Sep 2006 17:23
Reporter:	Jeff C	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: Row Based Replication ( RBR )	Severity:	S2 (Serious)
Version:	5.1.9/5.1BK	OS:	FreeBSD (Freebsd 6.0)
Assigned to:	Mats Kindahl	CPU Architecture:	Any

Description:
Row Based Replication fails to update the slave tables when query cache is enabled in /etc/my.cnf

Note that the binary logs update correctly on both the master and slave. The table does not get
updated.

Problem:

On fresh installs, with minimal /etc/my.cnf settings, RBR worked flawlessly... rows added on master, showed on slave
instantly.

However, when you add these statements to /etc/my.cnf, RBR statements fail to update the tables...
"insert into..." fails, while "truncate table" will work (master is setup for binlog-format=row)..

query_cache_size=16M
query_cache_limit=2M
query_cache_type=1

How I discovered this: My slave wasn't updating so I tried to isolate the my.cnf settings causing the problems...

So I toggled each setting, restarted mysql, inserted on master, checked slave. On each iteration, it worked until
Query Cache is enabled... Then the RBR statements fail to execute, however the "truncate table..." statements execute immediately.

I have tested this on freebsd 4.11, and freebsd 6.0 ...

$$$$ note $$$$ : When you're on the slave, and the table is not updating... running: set global query_cache_size=0; select * from table; reveals the updates.

MASTER:

[mysqld]
log-bin=new-bin
server-id = 1
binlog-format=row
binlog-do-db=go

SLAVE:

[mysqld]
query_cache_size=16M
query_cache_limit=2M
query_cache_type=1
server-id = 2

However, when you drop the "query*" settings on the slave, the tables will update properly.

How to repeat:
Add query* cache settings above to /etc/my.cnf

To see that it is in fact a problem with query cache, on slave run:
set global query_cache_size=0; select * from table; and you'll see the RBR statements have updated the table.

Suggested fix:
See why the query cache blocks the inserts.

Jeff,

Did you do a select before the updates on the slave, then a select after the updates which didn't see the changed data until you cleared the query cache? If yes, that would suggest that the query cache isn't being reset properly by row-based replication.

52 days in "analyzing"... perhaps they need to reassign bugs to people that are not overworked or something.  I saw another bug go from analyzing to verified within 1 day.  This is a S2 bug and it's at 52 days.  At least say "We did not experience that" or "this bug is verified" so that it can get the ball rolling on being fixed.

I spent a few hours figuring out why this was happening, and I've seen others with the comments in the Replication forums experiencing the samething.  

Thanks,
Jeff

Jeff,

Did you do a select before the updates on the slave, then a select after the
updates which didn't see the changed data until you cleared the query cache?

You can safely assume that everyone working on bugs is heavily overworked. That means the greatest possible help from bug reporters will really speed things up. Bugs can go from reported to patch available in less than 24 hours sometimes. It depends on the details of the bug report and the severity of the problem. There's enough detail not specified in your report to slow things down - it's why I asked the narrowing down question I asked. At present this is assigned a priority which will result in it _never_ being looked at by any programmer. If you answer my question it would probably be either noted as fixed already or be fixed in the next release.

I don't normally work on bugs - this needs to move so I'm asking you the question it'll take to get it moving.

James,
   I respect your comments.

   Are you saying, 
     1) bug report too confusing
     2) you tried it, did not get the same results
     3) never tried it, trying to get further information to help the bug along.

   If you're at #1, I'll find the time to further report it...  lay it out step by step.. ( I thought I did).

   If you're at 2, I'll lay it out step by step.

   If you're at 3, well.. dunno :)

At this point, I will just continually keep checking new releases until the bug is fixed and then report back... I find it amusing to count the days in analyzing (btw, who reset it?)

Ok, seeing how this bug is 5 months old... I'll see If I can't get it addressed.

Bug: once a query gets executed on the slave table, all updates coming in from master block.  

1) Setup master/slave for replication.

Master: 
     [mysqld]
      log-bin=new-bin
      server-id   = 1
      binlog-format=row

Slave: 
     [mysqld]
      server-id       = 100
      replicate-wild-do-table=test.%

2) Get slave talking to master;
   
   change master to
      master_host='123.123.123.123',
      master_user='repl',
      master_password='HopeThisGetsFixed';

   start slave;

3) Disable query cache on slave;
     
     *SLAVE*

    set global query_cache_size=0;

4) Create new table, and insert sample data on MASTER

    *MASTER*
 
    use test;
    create table testing (id int unsigned primary key);
    insert into testing (id) values (1),(2),(3);

5) Toggle Query Caching on Slave

    *SLAVE*

    set global query_cache_size=16*1024*1024;

6) Update more data on master

    *MASTER*

    insert into testing (id) values (4),(5),(6);

7) Look at slave data

   *SLAVE*

    select * from testing;

8)  Insert more data on master (HERE IS WHERE BUG IS)

   *MASTER*
  
    insert into testing (id) values (7),(8),(9);

9) BUG:  check slave data

    select * from testing;

10) Note:  all updates will be blocked on slave until you run:

    *SLAVE*

    set global query_cache_size=0;
    select * from testing;

 * now you see the data * 

    use test;
    select * from testing;

Thank you for the bug report. Verified as described on Suse Linux 10:

miguel@hegel:~/dbs/5.1> bin/mysql --defaults-file=/home/miguel/dbs/5.1/my.cnf -uroot --prompt='master>'
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1 to server version: 5.1.11-beta-debug-log

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

master>grant all on *.* to 'slaveuser'@'localhost';
Query OK, 0 rows affected (0.09 sec)

master>show master status\G
*************************** 1. row ***************************
            File: new-bin.000001
        Position: 348
    Binlog_Do_DB: 
Binlog_Ignore_DB: 
1 row in set (0.02 sec)

master>

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1 to server version: 5.1.11-beta-debug

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

slave>CHANGE MASTER TO
    -> MASTER_HOST='localhost',
    -> MASTER_USER='slaveuser',
    -> MASTER_PASSWORD='',
    -> MASTER_LOG_FILE='new-bin.000001',
    -> MASTER_LOG_POS=348;
Query OK, 0 rows affected (0.03 sec)

slave>START SLAVE;
Query OK, 0 rows affected (0.00 sec)

slave>SHOW SLAVE STATUS\G
*************************** 1. row ***************************
             Slave_IO_State: Waiting for master to send event
                Master_Host: localhost
                Master_User: slaveuser
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: new-bin.000001
        Read_Master_Log_Pos: 348
             Relay_Log_File: hegel-relay-bin.000002
              Relay_Log_Pos: 241
      Relay_Master_Log_File: new-bin.000001
           Slave_IO_Running: Yes
          Slave_SQL_Running: Yes
            Replicate_Do_DB: 
        Replicate_Ignore_DB: 
         Replicate_Do_Table: 
     Replicate_Ignore_Table: 
    Replicate_Wild_Do_Table: test.%
Replicate_Wild_Ignore_Table: 
                 Last_Errno: 0
                 Last_Error: 
               Skip_Counter: 0
        Exec_Master_Log_Pos: 348
            Relay_Log_Space: 241
            Until_Condition: None
             Until_Log_File: 
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File: 
         Master_SSL_CA_Path: 
            Master_SSL_Cert: 
          Master_SSL_Cipher: 
             Master_SSL_Key: 
      Seconds_Behind_Master: 0
1 row in set (0.01 sec)

slave>set global query_cache_size=0;
Query OK, 0 rows affected (0.02 sec)

master>use test;
Database changed
master>    create table testing (id int unsigned primary key);
Query OK, 0 rows affected (0.01 sec)

master>    insert into testing (id) values (1),(2),(3);
Query OK, 3 rows affected (0.05 sec)
Records: 3  Duplicates: 0  Warnings: 0

slave>set global query_cache_size=16*1024*1024;
Query OK, 0 rows affected (0.11 sec)

master>insert into testing (id) values (4),(5),(6);
Query OK, 3 rows affected (0.02 sec)
Records: 3  Duplicates: 0  Warnings: 0

slave>use test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
slave>select * from testing;
+----+
| id |
+----+
|  1 | 
|  2 | 
|  3 | 
|  4 | 
|  5 | 
|  6 | 
+----+
6 rows in set (0.00 sec)

master>insert into testing (id) values (7),(8),(9);
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

slave>select * from testing;
+----+
| id |
+----+
|  1 | 
|  2 | 
|  3 | 
|  4 | 
|  5 | 
|  6 | 
+----+
6 rows in set (0.00 sec)

slave>set global query_cache_size=0;
Query OK, 0 rows affected (0.10 sec)

slave>    select * from testing;
+----+
| id |
+----+
|  1 | 
|  2 | 
|  3 | 
|  4 | 
|  5 | 
|  6 | 
|  7 | 
|  8 | 
|  9 | 
+----+
9 rows in set (0.00 sec)

A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/11480

ChangeSet@1.2257, 2006-09-06 17:45:27+02:00, mats@romeo.(none) +3 -0
  BUG#17620: Row-based replication fails when query cache enabled on slave
  Invalidating query cache when processing rows for a statement on the slave.

Patch approved by Lars. Setting status on his behalf.

Pushed into 5.1.12

Noted in 5.1.12 changelog.