Bug #53706 transaction isolation does not work on subqueries
Submitted: 17 May 2010 14:27 Modified: 1 Jun 2010 9:35
Reporter: Sjoerd van Hagen Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S2 (Serious)
Version:5.1.42, 5.1.47, 5.1.48-bzr, 5.6.99 OS:Linux
Assigned to: Assigned Account CPU Architecture:Any
Tags: ISOLATION, regression, subqueries, transaction, wrong results

[17 May 2010 14:27] Sjoerd van Hagen
Description:
It is possible to insert values into a table while there is a transaction running. A repeated select inside the transaction will correctly show the table as it was before the insert. However, when running a select with a subquery the subquery seems to be using the version of the table after the insert. This leads to wrong results. Repetition of the problem is quite easy, although sometimes it does not work, and will make clear what goes wrong.

I think this is a serious bug, as is any bug returning the wrong results.

How to repeat:
I will be using two mysql client connections to do this, mysql1 and mysql2.

mysql1> create table list ( id integer ) ;
mysql1> insert into table list (id) values (1) ;

mysql2> begin ;
mysql2> select * from list ;
+------+
| id   |
+------+
|    1 | 
+------+

mysql1> insert into table list (id) values (2) ;

mysql2> select * from list;
+------+
| id   |
+------+
|    1 | 
+------+

Until here everything is as expected. Sometimes the second insert blocked, which I thought was incorrect behavior, which did prevent getting to the bug. I had to try it again just once to get it to work, maybe a little faster.
The problem is in running the next query:

mysql2> select * from list where exists (select * from list where id=2);
+------+
| id   |
+------+
|    1 | 
+------+

In case you think this has something to do with the exists keyword or you do not like quantors , I have two other examples:

mysql2> select * from list where 3 in (select id from list);
+------+
| id   |
+------+
|    1 | 
+------+

or

select count(*), (select count(*) from list ) as count  from list ;
+----------+-------+
| count(*) | count |
+----------+-------+
|        1 |     2 | 
+----------+-------+

It is not to hard to see that the result is wrong. I would like to make my suspicion that it has to do with the isolation in the subquery a bit more convincing by showing that after deleting the row with id 2 in the mysql1 connection but that one seems to keep blocking.

Just to show that the transaction isolation setting is correct:

mysql> show variables like '%tx_%';
+---------------+-----------------+
| Variable_name | Value           |
+---------------+-----------------+
| tx_isolation  | REPEATABLE-READ | 
+---------------+-----------------+

Suggested fix:
I am not a database programmer but it seems to me that the transaction context in which the subquery is run should be identical to the transaction context in which the main one is run. It seems that this is not the case. Making sure that it is should solve this problem I think.
[25 May 2010 9:12] Sveta Smirnova
Thank you for the report.

I can not repeat described behavior. Additionally version 5.1.42 is outdated. Please try with current version 5.1.47 and inform us if problem still exists.
[25 May 2010 9:20] Sjoerd van Hagen
As the database runs on amazon RDS, I am not at liberty to update it. What I can do is ask them when they will update. 
Also reproducing the bug did not work for me at all times. Sometimes the update queries blocked. It seems one will have to move quickly to have a higher probability of reproducing the bug.
Is there any chance you could try it on 5.1.42 to see whether that works?
[25 May 2010 23:20] Eric Ferreira
I've tested and reproduced this on 5.1.47

Server version: 5.1.47-log MySQL Community Server (GPL)

Session 1:

mysql> create database test;
Query OK, 1 row affected (0.00 sec)

mysql> use test
Database changed

mysql> create table list ( id integer ) ;
Query OK, 0 rows affected (0.04 sec)

mysql> insert into list (id) values (1) ;
Query OK, 1 row affected (0.02 sec)

###T1####

mysql> insert into list (id) values (2) ;
Query OK, 1 row affected (0.02 sec)

Session 2:

mysql> select * from list ;
+------+
| id   |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

###T1####

mysql>  select * from list;
+------+
| id   |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

mysql> select * from list where exists (select * from list where id=2);
+------+
| id   |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

mysql> select * from list where 3 in (select id from list);
Empty set (0.00 sec)

mysql> select * from list where 2 in (select id from list);
+------+
| id   |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

mysql> select count(*), (select count(*) from list ) as count  from list ;
+----------+-------+
| count(*) | count |
+----------+-------+
|        1 |     2 |
+----------+-------+
1 row in set (0.00 sec)

mysql> show variables like '%tx_%';
+---------------+-----------------+
| Variable_name | Value           |
+---------------+-----------------+
| tx_isolation  | REPEATABLE-READ |
+---------------+-----------------+
1 row in set (0.00 sec)
[26 May 2010 5:33] Sveta Smirnova
Thank you for the feedback.

I still can not repeat described behavior. Please inform us about package you are using (filename you downloaded) and send us configuration file.
[26 May 2010 7:45] Sjoerd van Hagen
Thank you Eric. I do not have full access to the database, or time to download and install one myself. Any other information I may be able to provide.
[26 May 2010 8:10] Sveta Smirnova
Sjoerd,

thank you fot the feedback. Probably Eric would be able to provide necessary information.
[26 May 2010 21:08] Eric Ferreira
Here is the link

http://dev.mysql.com/get/Downloads/MySQL-5.1/mysql-5.1.47-linux-x86_64-glibc23.tar.gz/from...

I would be curios to see the copy/paste of the steps Sveta took and not reproduced it.
[27 May 2010 8:42] Sveta Smirnova
Thank you for the feedback.

I tested with same binaries. Here is the copy-paste:

mysql1>  create table list ( id integer ) engine=innodb;
Query OK, 0 rows affected (0.01 sec)

mysql1> insert into list (id) values (1) ;
Query OK, 1 row affected (0.00 sec)

mysql2> begin;
Query OK, 0 rows affected (0.00 sec)

mysql2> select * from list;
+------+
| id   |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

mysql1> insert into list (id) values (2) ;
Query OK, 1 row affected (0.00 sec)

mysql2> select * from list;
+------+
| id   |
+------+
|    1 |
+------+
1 row in set (0.00 sec)

mysql2> select * from list where exists (select * from list where id=2);
Empty set (0.00 sec)

mysql2> select * from list where 3 in (select id from list);
Empty set (0.00 sec)

mysql2> select count(*), (select count(*) from list ) as count  from list ;
+----------+-------+
| count(*) | count |
+----------+-------+
|        1 |     1 |
+----------+-------+
1 row in set (0.00 sec)

What I don't see in your output is table engine and BEGIN satement. Do you use auto_commit? Do you use default-storage-enine option? What is the output of SHOW CREATE TABLE list in your environment?
[27 May 2010 15:32] Eric Ferreira
I missed the begin; from the copy and past.
I do use innoDB as default

Here is the table

mysql> show create table list
    -> ;
+-------+------------------------------------------------------------------------------------------+
| Table | Create Table                                                                             |
+-------+------------------------------------------------------------------------------------------+
| list  | CREATE TABLE `list` (
  `id` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1 |
+-------+------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

and autocommit is on too

mysql> show global variables like '%commit%';
+--------------------------------+-------+
| Variable_name                  | Value |
+--------------------------------+-------+
| autocommit                     | ON    |
| innodb_commit_concurrency      | 0     |
| innodb_flush_log_at_trx_commit | 1     |
+--------------------------------+-------+

below is my my.cnf file

[client]
port            = 8443
socket          = /tmp/mysql.sock

[mysqldump]
quick
max_allowed_packet = 16M

[mysql]
no-auto-rehash

[isamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[myisamchk]
key_buffer = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M

[mysqlhotcopy]
interactive-timeout

[mysqld]
basedir = '/systemdbbin/mysql'
binlog_cache_size = '32768'
binlog_format = 'MIXED'
datadir = '/systemdbdata/db/'
default_storage_engine = 'InnoDB'
innodb-additional-mem-pool-size = '2097152'
innodb-buffer-pool-size = '6039797760'
innodb-data-home-dir = '/systemdbdata/db/innodb'
innodb-file-per-table = '1'
innodb-flush-method = 'O_DIRECT'
innodb-log-buffer-size = '8388608'
innodb-log-file-size = '134217728'
innodb-log-group-home-dir = '/systemdbdata/log/innodb'
key_buffer_size = '16777216'
local-infile = '1'
log-error = '/systemdbdata/log/error/mysql-error.log'
log-output = 'TABLE'
max_binlog_size = '134217728'
max_connections = '640'
pid-file = '/systemdbdata/log/mysql-8443.pid'
plugin_dir = '/systemdbbin/mysql/lib/mysql/plugin'
port = '8443'
read_buffer_size = '262144'
read_rnd_buffer_size = '524288'
secure-file-priv = '/tmp'
socket = '/tmp/mysql.sock'
thread_stack = '196608'
tmpdir = '/systemdbdata/tmp'
log-bin = off
[27 May 2010 15:45] Eric Ferreira
Sveta, 
are you using linux ?

I can still reproduce on my linux box (with that my.cnf I sent you), but when I donload the same version for the mac ( http://www.mysql.com/get/Downloads/MySQL-5.1/mysql-5.1.47-osx10.6-x86_64.dmg/from/http://mysql.llarian.net/ ) with default everthing it does not reproduce.

Eric
[27 May 2010 16:17] Sveta Smirnova
Thank you for the feedback.

Verified as described. Option binlog matters.

Test case for MTR:

--source include/have_innodb.inc
--source include/have_log_bin.inc

create table list1 ( id integer ) engine=innodb;
insert into list1 (id) values (1) ;

connect (addconroot, localhost, root,,);
connection addconroot;

begin;
select * from list1;

connection default;
insert into list1 (id) values (2) ;

connection addconroot;
select * from list1;
select * from list1 where exists (select * from list1 where id=2);
select * from list1 where 3 in (select id from list1);
select count(*), (select count(*) from list1 ) as count  from list1 ;
[27 May 2010 16:22] Sjoerd van Hagen
Removed the running on RDS bit as the bug is reproducible on other machines as well.
[28 May 2010 17:08] Mikhail Izioumtchenko
reproduced with the current plugin code.
Here we performed a locking read in the subquery, so we return the actual
count of the records in the table locking them in the process.
Yes, we end up with all records in the table locked after this SELECT,
I think there's a separate bug for this.

MySQL decides to lock link1 using TL_READ_NO_INSERT

#0  ha_innodb::store_lock (this=0x129baad0, thd=0x129533f0, to=0x1295f140,
    lock_type=TL_READ_NO_INSERT) at handler/ha_innodb.cc:9121
#1  0x00000000005ee8f3 in get_lock_data (thd=0x129533f0, table_ptr=0x1295b5f8,
    count=2, flags=2, write_lock_used=0x4a222e68) at lock.cc:875
#2  0x00000000005ef9b7 in mysql_lock_tables (thd=0x129533f0,
    tables=0x1295b5f8, count=2, flags=4, need_reopen=0x4a222fa3) at lock.cc:213
#3  0x000000000065a2e6 in lock_tables (thd=0x129533f0, tables=0x1295b230,
    count=2, need_reopen=0x4a222fa3) at sql_base.cc:5341
#4  0x00000000006636a7 in open_and_lock_tables_derived (thd=0x129533f0,
    tables=0x1295b230, derived=true) at sql_base.cc:5026
#5  0x0000000000616eda in open_and_lock_tables (thd=0x129533f0,
    tables=0x1295b230) at mysql_priv.h:1564
#6  0x00000000006095c7 in execute_sqlcom_select (thd=0x129533f0,
    all_tables=0x1295b230) at sql_parse.cc:5032
#7  0x000000000060a492 in mysql_execute_command (thd=0x129533f0)
    at sql_parse.cc:2263
#8  0x0000000000612f75 in mysql_parse (thd=0x129533f0,
    inBuf=0x1295a1e0 "select count(*),(select count(*) from list1) as c2 from list1", length=61, found_semicolon=0x4a224ee0) at sql_parse.cc:5986

why? I don't know, it looks MyISAMish. I think we should allow this and
do the right thing in ::store_lock() where we have a lot of code 
to deal with it.

for example we correctly downgrade the table lock:

We especially allow concurrent inserts if MySQL is at the
                start of a stored procedure call (SQLCOM_CALL)
                (MySQL does have thd_in_lock_tables() TRUE there). */

                if elock_type == TL_READ_NO_INSERT
                    && sql_command != SQLCOM_LOCK_TABLES) {

                        lock_type = TL_READ;
                }

however in this case we set prebuilt->select_lock_type to 2 (LOCK_S)
which then enforces the locking read.

Our only chance to set prebuilt->select_lock_type to 5 (LOCK_NONE) and
use consistent read is here:

 if ((srv_locks_unsafe_for_binlog
                     || isolation_level <= TRX_ISO_READ_COMMITTED)
                    && isolation_level != TRX_ISO_SERIALIZABLE
                    && (lock_type == TL_READ || lock_type == TL_READ_NO_INSERT)
                    && (sql_command == SQLCOM_INSERT_SELECT
                        || sql_command == SQLCOM_REPLACE_SELECT
                        || sql_command == SQLCOM_UPDATE
                        || sql_command == SQLCOM_CREATE_TABLE)) {

                        /* If we either have innobase_locks_unsafe_for_binlog
                        option set or this session is using READ COMMITTED
                        isolation level and isolation level of the transaction
                        is not set to serializable and MySQL is doing
                        INSERT INTO...SELECT or REPLACE INTO...SELECT
                        or UPDATE ... = (SELECT ...) or CREATE  ...
                        SELECT... without FOR UPDATE or IN SHARE
                        MODE in select, then we use consistent read
                        for select. */

                        prebuilt->select_lock_type = LOCK_NONE;
                        prebuilt->stored_select_lock_type = LOCK_NONE;

In our case, txn isolation is REPEATABLE READ and sql_command is 0 which is
SQL_SELECT. So, changing the above condition to read:

if (isolation_level != TRX_ISO_SERIALIZABLE
                    && (lock_type == TL_READ || lock_type == TL_READ_NO_INSERT)
                    && (sql_command == SQLCOM_INSERT_SELECT
                        || sql_command == SQLCOM_REPLACE_SELECT
                        || sql_command == SQLCOM_UPDATE
                        || sql_command == SQLCOM_SELECT
                        || sql_command == SQLCOM_CREATE_TABLE)) {

fixes the test case, not sure about the correctness of the fix.
Why did we exclude REPEATABLE READ in the first place?
Why SQLCOM_UPDATE is there but not SQLCOM_DELETE or SQLCOM_INSERT
[28 May 2010 17:26] Mikhail Izioumtchenko
can be related to bug#46947
[28 May 2010 17:35] Mikhail Izioumtchenko
wouldn't reproduce in 5.5 (mysql-trunk-innodb), maybe due to
reportedly more thorough fix to  bug#46947. Another related bug is bug#39843.
We should double check if MySQL locking mode is too strong here
and it's an Optimizer bug after all, not InnoDB.
[31 May 2010 7:43] Jimmy Yang
This looks like a dup of bug #46947, and fix described in   http://lists.mysql.com/commits/109039 would fix the problem. And also explain the cause of the problem:

 3394 Dmitry Lenev	2010-05-24
      A 5.1-only version of  fix for bug #46947 "Embedded SELECT
      without FOR UPDATE is causing a lock".
      
      SELECT statements with subqueries referencing InnoDB tables
      were acquiring shared locks on rows in these tables when they
      were executed in REPEATABLE-READ mode and with statement or
      mixed mode binary logging turned on.
      
      This was a regression which were introduced when fixing
      bug 39843.
      
      The problem was that for tables belonging to subqueries
      parser set TL_READ_DEFAULT as a lock type. In cases when
      statement/mixed binary logging at open_tables() time this
      type of lock was converted to TL_READ_NO_INSERT lock at
      open_tables() time and caused InnoDB engine to acquire
      shared locks on reads from these tables. Although in some
      cases such behavior was correct (e.g. for subqueries in
      DELETE) in case of SELECT it has caused unnecessary locking.
[31 May 2010 9:10] Sjoerd van Hagen
This bug is clearly not a duplicate of #46947. Yes, it does happen that the table is sometimes locked when it should not be, but sometimes it does not lock and the query produces wrong results. In the one case it is too aggressive in locking, in this case it does not lock (it should not) but it does produce wrong results.
[31 May 2010 14:52] Mikhail Izioumtchenko
I always saw locking read in my experiments, it can easily be seen
by using BEGIN - INSERT instead of the autocomitted INSERT in the testcase.
I verified http://lists.mysql.com/commits/109039 does fix this problem.
[31 May 2010 15:05] Sjoerd van Hagen
It is quite odd that some of use can reproduce the problem without difficulty while others observe locking (which also should not take place). I observed locking sometimes but if I would just start again and moved quickly (still manually using the client) it mostly just accepted my inserts, like it should. The problem seems to be that the query is running on a snapshot of the table while the subquery runs on the updated version of the table, hence the wrong results in the query.

I hope that it is clear that this problem should not be solved by just locking up the table and that this bug is not just a duplicate of #46947. Why do you think it is a duplicate Jimmy Yang? I do not see the similarities in unnecessary locking and failing the repeatable-read invariant.
[1 Jun 2010 9:22] Jimmy Yang
Sjoerd, this bug can be reproduced easily with log-bin = off, that determines if we use binary logging.

This problem is not to "be solved by just locking up the table", rather, it should not acquire shared lock for the subquery as described in #46947.

The problem is just like what described by Dmitry Lenev, the subquery should not be called with TL_READ_NO_INSERT, rather it should be called by TL_READ. And this problem exhibits exactly the scenario described in Dimitry's fix in 46947, its subquery uses TL_READ_DEFAULT, then convert to TL_READ_NO_INSERT at  open_tables(), and then cause InnoDB to acquire unnecessary share locks and result addition row that should not return in the subquery.

And again, the description by Dmitry for his fix described exactly our problem here:

      The problem was that for tables belonging to subqueries
      parser set TL_READ_DEFAULT as a lock type. In cases when
      statement/mixed binary logging at open_tables() time this
      type of lock was converted to TL_READ_NO_INSERT lock at
      open_tables() time and caused InnoDB engine to acquire
      shared locks on reads from these tables. Although in some
      cases such behavior was correct (e.g. for subqueries in
      DELETE) in case of SELECT it has caused unnecessary locking.

Please check the simple fix in http://lists.mysql.com/commits/109039, it would fix this problem correctly.
[1 Jun 2010 9:35] Sjoerd van Hagen
So both the excessive logging observed in #46947 and the incorrect results I saw stem from the same underlying bug.

I will not be able to apply the batch because I use amazon RDS which means amazon maintains the database.
[2 Jun 2010 15:51] Eric Ferreira
What 5.1.XX release will this patch be part of ?