Bug #58489 FLUSH TABLES WITH READ LOCK - issue with concurrent load
Submitted: 25 Nov 2010 12:42 Modified: 5 Mar 2013 14:18
Reporter: Victor Kirkebo Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server: Locking Severity:S3 (Non-critical)
Version:5.5 OS:Any
Assigned to: CPU Architecture:Any

[25 Nov 2010 12:42] Victor Kirkebo
Description:
While testing MySQL Enterprise Backup with 5.5.7-rc I have seen an issue when the mysqlbackup application attempts to get a global read lock by issuing the command FLUSH TABLES WITH READ LOCK.

At some point mysqlbackup performs these steps:
--------------
create table binlogmarker(i int) engine=innodb;
set autocommit=0;
insert into binlogmarker values(1);
flush tables with read lock;
commit;
show master status;
unlock tables;
--------------

mysqlbackup expects the SHOW MASTER STATUS to return the binlog position of the commit following the FLUSH TABLES WITH READ LOCK. However, in 5.5.7-rc the binlog position returned sometimes is larger than it should be. The cause seems to be that some other concurrent transactions are able to commit before SHOW MASTER STATUS. This seems to be wrong - it has always been the case when running backup with 5.1 that any concurrent transactions could not commit until UNLOCK TABLES had been performed.

How to repeat:
I will upload a test soon
[25 Nov 2010 21:57] Davi Arnaut
Hi Victor,

Could you test with 5.5.8? There were quite a few recent changes related to FLUSH TABLES WITH READ LOCK. See Bug#57006 and Bug#54673.
[25 Nov 2010 22:02] Davi Arnaut
More especially, from the patch that re-implement FTWRL, there is this following excerpt:

"Similar to the old implementation acquisition of GRL in new implementation is two-step. During the first step we block all concurrent DML and DDL statements by acquiring global S metadata lock (each DML and DDL statement acquires global IX lock for its duration). During the second step we block commits by acquiring global S lock in COMMIT namespace (commit code acquires global IX lock in this namespace)."

It would be nice to know whether this test case applies to the old or the new implementation of FTWRL.
[26 Nov 2010 1:17] Davi Arnaut
Victor,

I'm not sure this test does what it claims to do. The FTWRL will implicitly commit the transaction. The following COMMIT will do nothing, otherwise it would just fail.
[26 Nov 2010 1:23] Davi Arnaut
http://dev.mysql.com/doc/refman/5.5/en/implicit-commit.html

This page explain things a bit. It is equivalent as if a COMMIT had been executed before the FTWRL, so there is a window for other transactions to commit before FTWRL starts/finishes. I guess we should make this remark clear in the documentation of FTWRL.
[26 Nov 2010 1:30] Davi Arnaut
Hum, if Enterprise Backup perform those steps, it might be interesting to know why..
[26 Nov 2010 7:26] Dmitry Lenev
Hi Victor!

Please also note that even in 5.1 FTWRL doesn't block all statements which
concurrently try to be written to the binary log. Particularly DROP TEMPORARY TABLES statement is not blocked by FTWRL but is written to the binary log (there are other cases as well).

In 5.5.8 we no longer try to block any statements affecting only temporary tables. As result you can't really rely that binlog position after FTWRL is stable. Still binlog won't contain any statements affecting objects which
can become part of backup (temporary tables do not fall into this category).
[26 Nov 2010 10:46] Victor Kirkebo
I just made an attempt to test if FTWRL really performs an implicit commit.
It seems it does indeed happen in 5.5 but not in 5.1:

5.1:
----
mysql> select @@version;
+-------------------------------------------+
| @@version                                 |
+-------------------------------------------+
| 5.1.53-enterprise-commercial-advanced-log |
+-------------------------------------------+
1 row in set (0.00 sec)

mysql> create database tst;
Query OK, 1 row affected (0.01 sec)

mysql> use tst;
Database changed

mysql> create table t1(i int) engine=innodb;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW MASTER STATUS;
+-------------------+----------+--------------+------------------+
| File              | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+-------------------+----------+--------------+------------------+
| master-bin.000002 | 17066763 |              |                  |
+-------------------+----------+--------------+------------------+
1 row in set (0.00 sec)

mysql> set autocommit=0;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1 values(1);
Query OK, 1 row affected (0.00 sec)

mysql> FLUSH TABLES WITH READ LOCK;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW BINLOG EVENTS IN 'master-bin.000002' FROM 17066763;
Empty set (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (0.01 sec)

mysql> SHOW BINLOG EVENTS IN 'master-bin.000002' FROM 17066763;
+-------------------+----------+------------+-----------+-------------+-------------------------------------+
| Log_name          | Pos      | Event_type | Server_id | End_log_pos | Info                                |
+-------------------+----------+------------+-----------+-------------+-------------------------------------+
| master-bin.000002 | 17066763 | Query      |         1 |    17066830 | BEGIN                               |
| master-bin.000002 | 17066830 | Query      |         1 |    17066916 | use `tst`; insert into t1 values(1) |
| master-bin.000002 | 17066916 | Xid        |         1 |    17066943 | COMMIT /* xid=588959 */             |
+-------------------+----------+------------+-----------+-------------+-------------------------------------+
3 rows in set (0.00 sec)
[26 Nov 2010 10:54] Davi Arnaut
Indeed, the implicit commit was added in 6.0 (and later ported to 5.5) as a preparation for retaining metadata locks in the scope of a transaction. It was needed because at the time we didn't have a deadlock detector and a session could deadlock itself if already holding metadata locks at the time of the flush. The implicit commit was added to quite a few statements and the documentation was also updated to reflect this.
[26 Nov 2010 12:01] Victor Kirkebo
Hi Davi,
If I understand correctly we now have a functioning deadlock detector. If so, does this imply that it is possible to remove the newly introduced FTRWL implicit commit in 5.5 if we wanted to?
[26 Nov 2010 12:02] Davi Arnaut
Hi Victor,

Yes, but we would like to avoid it. We are already talking with the Backup guys about what would be the best approach going forward. Thanks for you help!
[1 Dec 2010 10:55] Susanne Ebrecht
Setting to verified according to the comments above.
[5 Oct 2012 19:05] MySQL Verification Team
This problem also affects the ability of using the backup to initialize a slave and is not limited to just point-in-time recoveries. 

If the wrong position is recorded in the backup log, then starting replication at that position can fail due to the 'extra' commits that happen beyond that point that are in the image.  This is exacerbated if --binlog-format=ROW as there can be specific mismatches between what is on the table and what the binary log is telling the slave what to do.
[5 Mar 2013 13:54] Sanjay Manwani
The synchronization between server and MEB was taken care of by using the 
ibbackup_binlog_marker table.
The details of the ibbackup_binlog_marker table are added to the docs.
[5 Mar 2013 14:18] Ingo Strüwing
MEB has a workaround for this since 3.5.3.
The binlog position is still not accurate in all cases, but the deviation should consist of modifications to temporary tables only. So it should not affect point-in-time-recovery or setting up a slave.