Bug #19887 concurrent updates restricted by queries not using the binlog
Submitted: 17 May 2006 15:17 Modified: 21 May 2006 15:19
Reporter: Martin Friebe (Gold Quality Contributor) (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Locking Severity:S4 (Feature request)
Version:4.1 / 5.0 OS:Any (*)
Assigned to: CPU Architecture:Any

[17 May 2006 15:17] Martin Friebe
Description:
According to http://dev.mysql.com/doc/refman/4.1/en/concurrent-inserts.html and http://dev.mysql.com/doc/refman/5.0/en/concurrent-inserts.html the select in an "insert .. select... " can allow concurrent inserts, unless the binlog is used.

>If you are using the binary log, concurrent inserts are converted to normal inserts for 
>CREATE ... SELECT or INSERT ... SELECT  statements. This is done to ensure that you 
>can re-create an exact copy of your tables by applying the log during a backup operation.

It appears that mysql does only check the global "log-bin" option.

the statement:
INSERT INTO  tbl1  SELECT * FROM  tbl2;

mysql does allow concurrent inserts on tbl2, if log-bin=0
mysql does not allow concurrent inserts on tbl2, if log-bin=1 (correct)

mysql also does not allow concurrent updates on tbl2, if the bin-log does not apply to the statement, due to either "binlog-ignore-db=", or "SET SQL_LOG_BIN=0;"

In both cases the binlog is not active for the statement. The restriction is not necessary, because the statement can not get logged in the wrong order.

How to repeat:
# prepare

create table if not exists dummy (stuff varchar(255)) ENGINE=MyISAM;
create table if not exists dummy2 (stuff varchar(255));
insert into dummy (stuff) values('1'), ('2'), ('3'), ('4');

# repeat 20 to 30 times,
# so that: "insert into dummy2 select * from dummy" takes about 10 second
insert into dummy select * from dummy;

# How to repeat
# run the below, from 2 seperate mysql session, start the 2nd insert immediatly after the first

#mysql1>
insert into dummy2 select  * from dummy;
#mysql2>
insert into dummy (stuff) values('1');

###
If runing the above without binlog, the 2nd query will return without waiting for the first, as documented.

if running with binlog (and binlog active for this session/query), it will wait as expected.

try running the test after executing, in the first session (mysql1):
 SET SQL_LOG_BIN=0;
The "insert select" will not be in the binlog, but the 2nd query will wait.

The same applies if setting "binlog-ignore-db=" for the database used to execute the query (and again mysql will honor the setting, by not bin-logging the sql)

# below are the relevant lines from my my.cnf file, which have been toggled as described
[mysqld]
server-id=1
skip-name-resolve
skip-show-database
skip-external-locking
log-bin=/usr/local/mysql/db-bin
log-bin-index=/usr/local/mysql/db-bin
binlog-ignore-db=test

Suggested fix:
Instead of just checking the global option for the bin-log, verify that if it will apply to this query.
This check is done anyway, as its result is used for determining, if the query goes to the binlog. It should also determine, if restrictions like this are applied.
[19 May 2006 10:43] Martin Friebe
tested on 5.0.17 too
[21 May 2006 13:41] Valeriy Kravchuk
Thank you for a problem report. If you want to check on latest versions, use 4.1.19 and 5.0.21, please. But it looks like this behaviour is intended and described at that manual pages you quoted:

"If you are using the binary log, concurrent inserts are converted to normal inserts for CREATE ... SELECT or INSERT ... SELECT  statements."

So, it's clear: "If you are using binary log", not if your particular statement "should be written to the binary log" etc.

I think this is a valid and reasonable feature request (maybe, even easy to implement one), but not a bug, formally. Do you agree?
[21 May 2006 13:53] Martin Friebe
agreed to feature request.

I believe closed bug #7879, describes the locations, where this would need to be implemented or checked. I believe 7879 was where those restrictions were implemented?

on a personal note: I believe this feature could benefit many. As it would allow multi-step statistics/reports/analytics on large data, with far less impact on running applications

Thanks
[4 Oct 2008 20:43] Konstantin Osipov
As part of fix for Bug#34306, if you use row-based replication concurrent insert is employed automatically.
We don't take into account replicate-do* patterns though.