MySQL Bugs: #61209: auto_increment_offset != 1 + innodb_autoinc_lock

Bug #61209	auto_increment_offset != 1 + innodb_autoinc_lock_mode=1 => bulk inserts fail
Submitted:	18 May 2011 1:39	Modified:	17 May 2012 20:14
Reporter:	Bruce Weirdan	Email Updates:
Status:	Closed	Impact on me:	None
Category:	MySQL Server: InnoDB storage engine	Severity:	S3 (Non-critical)
Version:	5.1.57, 5.5.12, 5.6.2-m5	OS:	Linux (Linux weirdan 2.6.38-2-amd64 #1 SMP Sun May 8 13:51:57 UTC 2011 x86_64 GNU/Linux )
Assigned to:	Sunny Bains	CPU Architecture:	Any

Description:
Running concurrent bulk inserts on a server with auto_increment_offset!=1, auto_increment_increment>1 and innodb_autoinc_lock_mode=1 results in intermittent errors like this: Duplicate entry '87674' for key 'PRIMARY'
even though primary key is set to auto_increment and is not specified in the insert statement. 

There's no errors if auto_increment_offset=1 or innodb_autoinc_lock_mode=0 ('traditional')

It's reproducible on both current GA releases (5.1.57 and 5.5.12).

How to repeat:
Start mysqld with innodb_autoinc_lock_mode=1

Run attached script (note: it will delete all *log files in the current directory)
prerequisites: PHP 5.3 with mysql extension

php run.php -h 127.0.0.1 -p 3306 -uroot -P -dtest

It generates a number of log files (t*.log) in current directory (one log file per insert process).

Expected results: log files should not contain any 'Duplicate entry '...' for key 'PRIMARY'' messages, regardless of auto_increment_offset setting.

Actual results: when auto_increment_offset is set to something != 1 in thr.php some inserts fail.

Reproduce script

Attachment: duplicate-primary.tgz (application/octet-stream, text), 1.05 KiB.

5.6.2-m5 is also affected

Any progress on this?

We are also running into this issue, both on 5.1.57 and 5.5.12 - it's blocking our upgrade to 5.5.

Have tried exactly what Bruce mentioned (changing the auto inc lock mode) and that did not seem to help either

Bruce,

Are you using multi-master replication as well?

Could we get an official comment on this? It's a huge blocker for us on upgrading to either 5.1 or 5.5.

We are using multi-master replication with 1 < auto_inc_offset < auto_inc_inc

I have to think those things are all tied to this bug somehow, so when someone has the opportunity to look into this on the MySQL side, you need to use those settings to reproduce!

Zach,

We intended to use MM replication (thus autoincrements set so that primary keys would interleave) but never actually used MM, only straight MS replication.

In my tests the problem was reproducible with a single MySQL instance, no slaves connected. The exact my.cnf used is included in the attached .tgz - all other settings were at their default values.

Thanks Bruce, good to know that makes reproducing it even simpler. Have you ever tried your hand at debugging MySQL? Seems like submitting a patch is the only way to get things like this fixed - either that or paying $TEXAS for an Oracle support contract :(

Unfortunately debugging multithreaded C++ is beyond my skills, and this issue is not so pressing for us as to hire external developer to solve it.

patch for 5.1 branch

Attachment: 61209_fix_51.diff (application/octet-stream, text), 461 bytes.

oops, there is typo in the previous one, this should work

Attachment: 61209_fix_51_working.diff (application/octet-stream, text), 466 bytes.

Is anyone from MySQL/Oracle able to comment on the validity of Patryk's patch and if it's something that is possible to merge in and/or "bless" to fix this bug?

Alternatively, Patryk, how hard is it to apply this sort of patch to the MySQL code base and how reliable is the included test suite with MySQL to ensure that this patch does not introduce other bugs or problems?

My fix brakes some edge cases. I'm still working on it...

5.5 branch fix with a test to reproduce this bug

Attachment: mysql55_bug_61209.patch (application/octet-stream, text), 13.59 KiB.

Why doesn't anyone at Oracle care about fixing this BLOCKER bug for people trying to upgrade to 5.1 or 5.5 from 5.0?

9 minor releases for 5.5 since this bug was reported and still no one has even looked at it?

A patch is in review.

Any updates on the patch in review? Again, this is a blocker issue that's keeping us from taking advantage of all the awesome improvements 5.5 has to offer (we're stuck on 5.0).

The fix has been pushed into 5.6; will backport the fix to 5.5 soon.

Fantastic, thanks Calvin.

Is anyone from Percona able to comment on which Percona 5.5 release will incorporate this bugfix?

Thanks!

your fix may be cool and all, but it does not fix the bug.
See the attached patch - it contains a test case. This test case still fails with your bug fix. So, bulk insert *still* fails.

I mean, Oracle's bug fix doesn't fix the bug

My tests show that it works.

user@tyr03 autoinc> php run.php -h 127.0.0.1 -p 5400 -uroot -P -dtest
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t10.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t1.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t2.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t3.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t4.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t5.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t6.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t7.log
-rw-r--r-- 1 user wheel 0 Apr 10 00:31 t8.log

mysql> show variables like '%auto%';
+-----------------------------+-------+
| Variable_name               | Value |
+-----------------------------+-------+
| auto_increment_increment    | 3     |
| auto_increment_offset       | 3     |
| autocommit                  | ON    |
| automatic_sp_privileges     | ON    |
| innodb_autoextend_increment | 8     |
| innodb_autoinc_lock_mode    | 1     |
| sql_auto_is_null            | OFF   |
+-----------------------------+-------+
7 rows in set (0.00 sec)

-rw-r--r-- 1 subains wheel 0 Apr 10 00:31 t9.log

not the PHP one, I didn't try it. But from the patch later. Here's the test case:
# This is the test case for bug #61209 - duplicate key errors
# when using auto_increment_increment > 1 and auto_increment_offset > 1
#
--source include/have_innodb.inc

#
# Precautionary clean up.
#
--disable_warnings
DROP TABLE IF EXISTS bug_61209;
--enable_warnings

#
# Create test data.
#
CREATE TABLE bug_61209 (a INT auto_increment, PRIMARY KEY(a)) ENGINE=InnoDB;

INSERT INTO bug_61209 VALUES (), ();

--echo
--echo # Connect con1
--connect (con1,localhost,root,,)
--echo
--echo # Connect con2
--connect (con2,localhost,root,,)

--echo
--echo # Connection con1
--connection con1
SET SESSION auto_increment_increment=3;
SET SESSION auto_increment_offset=2;
SELECT GET_LOCK('a', 9);

--echo
--echo # Connection con2
--connection con2
SET SESSION auto_increment_increment=3;
SET SESSION auto_increment_offset=2;
send INSERT INTO bug_61209 (a) VALUES (NULL), (NULL), (NULL + GET_LOCK('a', 10));

--echo
--echo # Connection con1 reap
--connection con1
INSERT INTO bug_61209 (a) VALUES (NULL), (NULL), (NULL);
SELECT RELEASE_LOCK('a');

--echo
--echo # Connection con2 reap
--connection con2
reap;

#
# Clean up
#
DROP TABLE bug_61209;

if you comment the second SET auto_increment_increment it will not fail. May be it's a different bug?

A different bug, indeed.
See this line in the test:
  INSERT INTO bug_61209 VALUES (), ();
if you change it to insert (),(),(),() - that is, four rows, there will be no duplicate key error.
And the bug is here:

   next = (current - offset) / step;
   ...
   next_value = next * step;

this rounds current-offset *down*, while you should round it *up*. A possible fix could be

-   next = (current - offset) / step;
+   next = (current - offset - 1) / step + 1;

Internal Oracle bug id #13817703, Oracle fix in 5.6 revisions 3737,3739,3774, 5.5 revision 3774.

rev 3774 can't backport to 5.1 .

Sergei -

With your suggested fix applied there are several duplicate key regressions, including the innodb-autoinc test. I do not agree with your analysis for the second bug.  If I understand the innobase_next_autoinc() code correctly (spent only little time...), it is meaning truncating down by design, not rounding up, and it returns the end of interval for multi-row inserts.

For your testcase this method with the Oracle fix generates the following ids (numbers in parentheses are numbers to be used in the reserved intervals for the multi-row inserts):
1
3 (2)
11 (8) (5)
20 (17) (14)

No duplicate values here. With your rounding-up change there appear holes in the sequence, probably masking the issue.

What makes it more interesting, at least for us, is that the testcase randomly passes on Percona Server, perhaps 5% of time.

Oh, sure.
My last comment is totally wrong, I just didn't want to spam the bug report with numerous theories about what the problem is. I'll post the final bug fix after the UC, unless Sunny fixes it first.

I haven't tested the Oracle fix, but the code looks needlessly complicated. The fix used in MariaDB looks a lot simpler and works (simplified my above patch).

Added to changelog for 5.5.23, 5.5.24, 5.6.6: 

Running concurrent bulk inserts on a server with
auto_increment_offset=1, auto_increment_increment greater than 1, and
innodb_autoinc_lock_mode=1 could result in intermittent errors like
the following, even with the primary key set to auto_increment and
omitted from the INSERT statement:

Duplicate entry 'value' for key 'PRIMARY'

The workaround was to set auto_increment_offset=1 or
innodb_autoinc_lock_mode=0 ("traditional").