Bug #61209 auto_increment_offset != 1 + innodb_autoinc_lock_mode=1 => bulk inserts fail
Submitted: 18 May 2011 1:39 Modified: 17 May 2012 20:14
Reporter: Bruce Weirdan Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.1.57, 5.5.12, 5.6.2-m5 OS:Linux (Linux weirdan 2.6.38-2-amd64 #1 SMP Sun May 8 13:51:57 UTC 2011 x86_64 GNU/Linux )
Assigned to: Sunny Bains CPU Architecture:Any

[18 May 2011 1:39] Bruce Weirdan
Description:
Running concurrent bulk inserts on a server with auto_increment_offset!=1, auto_increment_increment>1 and innodb_autoinc_lock_mode=1 results in intermittent errors like this: Duplicate entry '87674' for key 'PRIMARY'
even though primary key is set to auto_increment and is not specified in the insert statement. 

There's no errors if auto_increment_offset=1 or innodb_autoinc_lock_mode=0 ('traditional')

It's reproducible on both current GA releases (5.1.57 and 5.5.12).

How to repeat:
Start mysqld with innodb_autoinc_lock_mode=1

Run attached script (note: it will delete all *log files in the current directory)
prerequisites: PHP 5.3 with mysql extension

php run.php -h 127.0.0.1 -p 3306 -uroot -P -dtest

It generates a number of log files (t*.log) in current directory (one log file per insert process).

Expected results: log files should not contain any 'Duplicate entry '...' for key 'PRIMARY'' messages, regardless of auto_increment_offset setting.

Actual results: when auto_increment_offset is set to something != 1 in thr.php some inserts fail.
[18 May 2011 1:40] Bruce Weirdan
Reproduce script

Attachment: duplicate-primary.tgz (application/octet-stream, text), 1.05 KiB.

[18 May 2011 14:18] Bruce Weirdan
5.6.2-m5 is also affected
[18 Sep 2011 0:12] Bruce Weirdan
Any progress on this?
[13 Oct 2011 21:39] Zach Bailey
We are also running into this issue, both on 5.1.57 and 5.5.12 - it's blocking our upgrade to 5.5.

Have tried exactly what Bruce mentioned (changing the auto inc lock mode) and that did not seem to help either
[2 Dec 2011 0:29] Zach Bailey
Bruce,

Are you using multi-master replication as well?
[2 Dec 2011 0:36] Zach Bailey
Could we get an official comment on this? It's a huge blocker for us on upgrading to either 5.1 or 5.5.

We are using multi-master replication with 1 < auto_inc_offset < auto_inc_inc

I have to think those things are all tied to this bug somehow, so when someone has the opportunity to look into this on the MySQL side, you need to use those settings to reproduce!
[2 Dec 2011 0:53] Bruce Weirdan
Zach,

We intended to use MM replication (thus autoincrements set so that primary keys would interleave) but never actually used MM, only straight MS replication.

In my tests the problem was reproducible with a single MySQL instance, no slaves connected. The exact my.cnf used is included in the attached .tgz - all other settings were at their default values.
[2 Dec 2011 1:05] Zach Bailey
Thanks Bruce, good to know that makes reproducing it even simpler. Have you ever tried your hand at debugging MySQL? Seems like submitting a patch is the only way to get things like this fixed - either that or paying $TEXAS for an Oracle support contract :(
[2 Dec 2011 14:13] Bruce Weirdan
Unfortunately debugging multithreaded C++ is beyond my skills, and this issue is not so pressing for us as to hire external developer to solve it.
[30 Dec 2011 10:34] Patryk Pomykalski
patch for 5.1 branch

Attachment: 61209_fix_51.diff (application/octet-stream, text), 461 bytes.

[30 Dec 2011 10:37] Patryk Pomykalski
oops, there is typo in the previous one, this should work

Attachment: 61209_fix_51_working.diff (application/octet-stream, text), 466 bytes.

[3 Jan 2012 15:52] Zach Bailey
Is anyone from MySQL/Oracle able to comment on the validity of Patryk's patch and if it's something that is possible to merge in and/or "bless" to fix this bug?

Alternatively, Patryk, how hard is it to apply this sort of patch to the MySQL code base and how reliable is the included test suite with MySQL to ensure that this patch does not introduce other bugs or problems?
[7 Jan 2012 14:32] Patryk Pomykalski
My fix brakes some edge cases. I'm still working on it...
[26 Jan 2012 23:18] Patryk Pomykalski
5.5 branch fix with a test to reproduce this bug

Attachment: mysql55_bug_61209.patch (application/octet-stream, text), 13.59 KiB.

[9 Feb 2012 1:39] Zach Bailey
Why doesn't anyone at Oracle care about fixing this BLOCKER bug for people trying to upgrade to 5.1 or 5.5 from 5.0?
[22 Feb 2012 2:37] Zach Bailey
9 minor releases for 5.5 since this bug was reported and still no one has even looked at it?
[22 Feb 2012 3:34] Calvin Sun
A patch is in review.
[20 Mar 2012 1:59] Zach Bailey
Any updates on the patch in review? Again, this is a blocker issue that's keeping us from taking advantage of all the awesome improvements 5.5 has to offer (we're stuck on 5.0).
[20 Mar 2012 10:14] Calvin Sun
The fix has been pushed into 5.6; will backport the fix to 5.5 soon.
[20 Mar 2012 14:34] Zach Bailey
Fantastic, thanks Calvin.
[31 Mar 2012 17:20] Zach Bailey
Is anyone from Percona able to comment on which Percona 5.5 release will incorporate this bugfix?

Thanks!
[9 Apr 2012 22:18] Sergei Golubchik
your fix may be cool and all, but it does not fix the bug.
See the attached patch - it contains a test case. This test case still fails with your bug fix. So, bulk insert *still* fails.
[9 Apr 2012 22:20] Sergei Golubchik
I mean, Oracle's bug fix doesn't fix the bug
[9 Apr 2012 22:33] Sunny Bains
My tests show that it works.

user@tyr03 autoinc> php run.php -h 127.0.0.1 -p 5400 -uroot -P -dtest
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t10.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t1.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t2.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t3.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t4.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t5.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t6.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t7.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t8.log

mysql> show variables like '%auto%';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| auto_increment_increment    | 3     |
| auto_increment_offset       | 3     |
| autocommit                  | ON    |
| automatic_sp_privileges     | ON    |
| innodb_autoextend_increment | 8     |
| innodb_autoinc_lock_mode    | 1     |
| sql_auto_is_null            | OFF   |
+-----------------------------+-------+
7 rows in set (0.00 sec)

-rw-r--r-- 1 subains wheel 0 Apr 10 00:31 t9.log
[9 Apr 2012 22:53] Sergei Golubchik
not the PHP one, I didn't try it. But from the patch later. Here's the test case:
# This is the test case for bug #61209 - duplicate key errors
# when using auto_increment_increment > 1 and auto_increment_offset > 1
#
--source include/have_innodb.inc

#
# Precautionary clean up.
#
--disable_warnings
DROP TABLE IF EXISTS bug_61209;
--enable_warnings

#
# Create test data.
#
CREATE TABLE bug_61209 (a INT auto_increment, PRIMARY KEY(a)) ENGINE=InnoDB;

INSERT INTO bug_61209 VALUES (), ();

--echo
--echo # Connect con1
--connect (con1,localhost,root,,)
--echo
--echo # Connect con2
--connect (con2,localhost,root,,)

--echo
--echo # Connection con1
--connection con1
SET SESSION auto_increment_increment=3;
SET SESSION auto_increment_offset=2;
SELECT GET_LOCK('a', 9);

--echo
--echo # Connection con2
--connection con2
SET SESSION auto_increment_increment=3;
SET SESSION auto_increment_offset=2;
send INSERT INTO bug_61209 (a) VALUES (NULL), (NULL), (NULL + GET_LOCK('a', 10));

--echo
--echo # Connection con1 reap
--connection con1
INSERT INTO bug_61209 (a) VALUES (NULL), (NULL), (NULL);
SELECT RELEASE_LOCK('a');

--echo
--echo # Connection con2 reap
--connection con2
reap;

#
# Clean up
#
DROP TABLE bug_61209;
[10 Apr 2012 0:04] Sergei Golubchik
if you comment the second SET auto_increment_increment it will not fail. May be it's a different bug?
[10 Apr 2012 5:38] Sergei Golubchik
A different bug, indeed.
See this line in the test:
  INSERT INTO bug_61209 VALUES (), ();
if you change it to insert (),(),(),() - that is, four rows, there will be no duplicate key error.
And the bug is here:

   next = (current - offset) / step;
   ...
   next_value = next * step;

this rounds current-offset *down*, while you should round it *up*. A possible fix could be

-   next = (current - offset) / step;
+   next = (current - offset - 1) / step + 1;
[10 Apr 2012 11:55] Laurynas Biveinis
Internal Oracle bug id #13817703, Oracle fix in 5.6 revisions 3737,3739,3774, 5.5 revision 3774.
[11 Apr 2012 3:31] Lixun Peng
rev 3774 can't backport to 5.1 .
[11 Apr 2012 13:13] Laurynas Biveinis
Sergei -

With your suggested fix applied there are several duplicate key regressions, including the innodb-autoinc test. I do not agree with your analysis for the second bug.  If I understand the innobase_next_autoinc() code correctly (spent only little time...), it is meaning truncating down by design, not rounding up, and it returns the end of interval for multi-row inserts.

For your testcase this method with the Oracle fix generates the following ids (numbers in parentheses are numbers to be used in the reserved intervals for the multi-row inserts):
1
3 (2)
11 (8) (5)
20 (17) (14)

No duplicate values here. With your rounding-up change there appear holes in the sequence, probably masking the issue.

What makes it more interesting, at least for us, is that the testcase randomly passes on Percona Server, perhaps 5% of time.
[11 Apr 2012 15:57] Sergei Golubchik
Oh, sure.
My last comment is totally wrong, I just didn't want to spam the bug report with numerous theories about what the problem is. I'll post the final bug fix after the UC, unless Sunny fixes it first.
[11 Apr 2012 16:14] Patryk Pomykalski
I haven't tested the Oracle fix, but the code looks needlessly complicated. The fix used in MariaDB looks a lot simpler and works (simplified my above patch).
[17 May 2012 20:14] John Russell
Added to changelog for 5.5.23, 5.5.24, 5.6.6: 

Running concurrent bulk inserts on a server with
auto_increment_offset=1, auto_increment_increment greater than 1, and
innodb_autoinc_lock_mode=1 could result in intermittent errors like
the following, even with the primary key set to auto_increment and
omitted from the INSERT statement:

Duplicate entry 'value' for key 'PRIMARY'

The workaround was to set auto_increment_offset=1 or
innodb_autoinc_lock_mode=0 ("traditional").