MySQL Bugs: #28705: INSERT... ON DUPLICATE KEY UPDATE skips auto

Bug #28705	INSERT... ON DUPLICATE KEY UPDATE skips auto_increment values in ndb engine
Submitted:	27 May 2007 15:25	Modified:	21 Jun 2007 14:01
Reporter:	johnny slakva	Email Updates:
Status:	Not a Bug	Impact on me:	None
Category:	MySQL Cluster: Cluster (NDB) storage engine	Severity:	S3 (Non-critical)
Version:	5.0.41	OS:	Linux
Assigned to:		CPU Architecture:	Any
Tags:	auto_increment, ON DUPLICATE KEY UPDATE

Description:
bug described as #24432 is still actual for ndbcluster engine.

How to repeat:
mysql> CREATE TABLE `test` (
`id` INT NOT NULL AUTO_INCREMENT PRIMARY KEY ,
`t` VARCHAR( 10 ) NOT NULL ,
`c` INT NOT NULL ,
UNIQUE (`t`)
) ENGINE = ndbcluster;
Query OK, 0 rows affected (1.31 sec)
mysql> insert into test (t, c) values ('a', 1) on duplicate key update c=c+1;
Query OK, 1 row affected (0.00 sec)
mysql> insert into test (t, c) values ('a', 1) on duplicate key update c=c+1;
Query OK, 2 rows affected (0.01 sec)
mysql> insert into test (t, c) values ('b', 1) on duplicate key update c=c+1;
Query OK, 1 row affected (0.01 sec)

mysql> select * from test;
+----+---+---+
| id | t | c |
+----+---+---+
|  3 | b | 1 |    // expected id=2!
|  1 | a | 2 |
+----+---+---+
2 rows in set (0.04 sec)

also, when insert fails and respective update occurs, affected rows number is wrong (see above); can i open another bugreport for that?

Suggested fix:
same as for #24432

Perhaps I am missing something, but, your insert statement;
insert into test (t, c) values ('b', 1) on duplicate key update c=c+1;
The value 'b' for column 't' is not a duplicate, therefore will insert the original value for c. If you wanted c to = 2 for b, you need to run your insert once more (for t=b).

Your data for NDB in 5.0;
+----+---+---+
| id | t | c |
+----+---+---+
|  1 | a | 2 | 
|  3 | b | 1 | 
+----+---+---+

Your data for innodb in 5.0;
mysql> select * from test1 ORDER BY id;
+----+---+---+
| id | t | c |
+----+---+---+
|  1 | a | 2 | 
|  3 | b | 1 | 
+----+---+---+

Please let me know if I have missed something. As this appears to be working.

sorry, i had to describe the bug little more.

i didnt want c to = 2 for b, i wanted id to = 2 for it, and not 3 as it appears to be.

problem is that autoincrement value increases regardless whether insert occured or update, that is what i wanted to point.

to be clearer, samples follow.

my data in myisam 5.0.41:
select * from test;
+----+---+---+
| id | t | c |
+----+---+---+
|  1 | a | 2 |
|  2 | b | 1 |
+----+---+---+

my data in ndb 5.0.41:
select * from test1 order by id;
+----+---+---+
| id | t | c |
+----+---+---+
|  1 | a | 2 | 
|  3 | b | 1 |    // id is the problem 
+----+---+---+

when huge number of updates occur, this wastes autoincrement values, and also, as pointed in #24432, this will cause problems with replication, especially if i wanted to replicate ndb tables to non-ndb.

This behavior replicates fine, eg the missing auto_increment value is also missing in myisam. I have tested this locally (cluster > myisam) row based replication.

the auto_increment behavior is not a bug, the only guarantee we give on auto_increment values is that:

a) it will not create duplicate values
b) each new id will be larger than the previous one

there is no guarantee that there will be no gaps,
and there are situations were gaps will indeed occure,
e.g on all transactional storage engines when a transaction
in which an auto_increment INSERT occured are rolled back,
in this case the auto_increment value(s) returned to this
transaction are lost

with cluster the situation is even more complicated, here
auto_increment values have to be fetched in advance, so
even if an INSERT ON DUPLICATE KEY UPDATE fails in the
INSERT part due to a duplicate key the prefetched id
value is 'lost' for further operations 

another case in which cluster may produce gaps is when
you do a INSERT INTO ... SELECT into a table with an
auto_increment field. Here auto_increment values are
fetched in batches of (by default) 32 and not one by
one for performance reasons. Unused values from the
last batch fetched are lost here, too, creating gaps
of up to 31 values when using default settings.

The affected rows output indeed looks like an error 
though, but creating a new bug report for this one
indeed seems to make sense by now ...

The affected rows count actually works as documented
(and is not cluster specific), see

  http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html

  The rows-affected value is 1 if the row is inserted
  as a new record and 2 if an existing record is updated.