Bug #42087 Statement.executeUpdate() gives strange update count with INSERT ...ON DUPLICATE
Submitted: 13 Jan 2009 15:52 Modified: 18 Mar 2009 11:38
Reporter: Laurent Goujon Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: General Severity:S2 (Serious)
Version:5.1.30 OS:Any
Assigned to: CPU Architecture:Any

[13 Jan 2009 15:52] Laurent Goujon
Description:
Hi,

I'm unsure if it is a real bug or if it's just a documentation problem but I think it's worth to write it down to a bug.

We have a current system inserting/updating lots of data into a table using INSERT ... ON DUPLICATE KEY statements as it greatly improved general speed of the application. It relies on the Statement.executeUpdate return value as a way to control/check how many rows have been inserted/updated.

The current version is running on MySQL Server 4.1 and it's just working fine but for various reasons, transitioning to MySQL Server 5.1 is planned in some future.

We were aware that the way MySQL computed the affected row value changed between 4.1 and 5.0 (see bug #19978) but theorically there should be no impact since our application ensures that all updates are really modifying the database. However with MySQL Server 5.1 we got updateCount values which we don't fully understand and which break the compatibility with our application.

For example: 
- with mysql 4.1, updating 3 rows would have return 6 but will return 9 with mysql 5.1
- with mysql 4.1, inserting 1 row, modifying 1 row, and updating 1 row (unmodified) will return 5 and in mysql 5.1 it will return 6

Using the new useAffectedRows parameter is a possible workaround for us and seems to really match the description found in the INSERT ... ON DUPLICATE KEY documentation page.

So, real bug or simple documentation problem?

How to repeat:
See the attached test-case
[13 Jan 2009 15:54] Laurent Goujon
Simple test case to exhibit the problem

Attachment: MySQLTest.java (text/plain), 2.11 KiB.

[13 Jan 2009 15:55] Tonci Grgin
Hi Laurent and thanks for your report. I think this is a duplicate, let me see.
[13 Jan 2009 15:56] Laurent Goujon
The attached test case can be used to exhibit the problem. You need to modify host/user/password information and set up access (with basic privileges + create/drop privileges) to the test db.

You can also modify the useAffectedRows parameter but you need to use the latest J/Connecter and it won't have any effect on a MySQL 4.1 server.
[13 Jan 2009 16:02] Tonci Grgin
Laurent, as I suspected, this *is* a duplicate of Bug#39352. Please see discussion there.
[13 Jan 2009 16:16] Laurent Goujon
I already saw this bug and that's why I used the 5.1.7 version so I could play with useAffectedRows.

But still, there might some improvements to be done in the documentation. How can the result of 9 to be explained for 3 modified rows ? 3 found rows and 3 updates ?

and why the returned value is different if you're using a MySQL 4.1 server.

I'm not sure that's the same problem (but definitively in the same area)
[14 Jan 2009 12:29] Laurent Goujon
I used JDBC trace options to sniff network packets and the numbers provided by the driver are those provided by the server

So I looked at the server code and found something strange into sql/sql_insert.cc : the variable info.touched is incremented twice !

In fact in the function write_record, the same snippet of code is duplicated:

1444         if (table->next_number_field)
1445           table->file->adjust_next_insert_id_after_explicit_value(
1446             table->next_number_field->val_int());
1447         info->touched++;

and 

1482         if (table->next_number_field)
1483           table->file->adjust_next_insert_id_after_explicit_value(
1484             table->next_number_field->val_int());
1485         info->touched++;

I would suggest removing one of those.

This also explainsw why using useAffectedRows is okay: info.updated is only incremented once.
[14 Jan 2009 20:19] Laurent Goujon
It seems this has been introduced between 5.1.20 and 5.1.21. Maybe a bad merge ?

5.0.67 looks okay on the other side
[15 Jan 2009 9:38] Valeriy Kravchuk
This is what I have with your test case in command line:

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot -P3308 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.0.74-enterprise-gpl-nt-log MySQL Enterprise Server - Pro Editi
on (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> CREATE TABLE test_onduplicate (id integer not null auto_increment primary
 key,
    -> foo varchar(255)) ENGINE=INNODB;
Query OK, 0 rows affected (0.80 sec)

mysql> INSERT INTO test_onduplicate(id, foo) VALUES (2,"a"), (4,"b"), (6,"c")
    -> ON DUPLICATE KEY UPDATE foo=values(foo);
Query OK, 3 rows affected (0.13 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> INSERT INTO test_onduplicate(id, foo) VALUES (2,"a"), (4,"d"), (6,"e")
    -> ON DUPLICATE KEY UPDATE foo=values(foo);
Query OK, 4 rows affected (0.03 sec)
Records: 3  Duplicates: 2  Warnings: 0

mysql> INSERT INTO test_onduplicate(id, foo) VALUES (2,"g"), (4,"h"), (6,"i")
    -> ON DUPLICATE KEY UPDATE foo=values(foo);
Query OK, 6 rows affected (0.01 sec)
Records: 3  Duplicates: 3  Warnings: 0

mysql> exit
Bye

C:\Program Files\MySQL\MySQL Server 5.0\bin>mysql -uroot -proot -P3310 test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1
Server version: 5.1.30-community-log MySQL Community Server (GPL)

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> CREATE TABLE test_onduplicate (id integer not null auto_increment primary
 key,
    -> foo varchar(255)) ENGINE=INNODB;
Query OK, 0 rows affected (0.78 sec)

mysql> INSERT INTO test_onduplicate(id, foo) VALUES (2,"a"), (4,"b"), (6,"c")
    -> ON DUPLICATE KEY UPDATE foo=values(foo);
Query OK, 3 rows affected (0.16 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> INSERT INTO test_onduplicate(id, foo) VALUES (2,"a"), (4,"d"), (6,"e")
    -> ON DUPLICATE KEY UPDATE foo=values(foo);
Query OK, 4 rows affected (0.06 sec)
Records: 3  Duplicates: 2  Warnings: 0

mysql> INSERT INTO test_onduplicate(id, foo) VALUES (2,"g"), (4,"h"), (6,"i")
    -> ON DUPLICATE KEY UPDATE foo=values(foo);
Query OK, 6 rows affected (0.06 sec)
Records: 3  Duplicates: 3  Warnings: 0

Do you see any difference 5.0 vs. 5.1 or contradiction to the documentation? I do not. So, if a bug is somewhere, it is NOT in the server, IMHO. Even if code looks really different.
[15 Jan 2009 9:56] Laurent Goujon
This is quite normal since your client has capability CLIENT_FOUND_ROWS so what you get is the "new" count behaviour: 1 per insert and 2 per effectively modified row. In the sql_insert_code, this is using info->copied and info->updated informations to compute this number. If you set the JDBC Driver option useAffectedRows to true (available starting 5.1.7 version of the connector), the driver will tell the server that it also has the capability CLIENT_FOUND_ROWS, and you'll get exactly the same result.

However the JDBC driver doesn't set this by default as it is non compliant with JDBC specifications so what you'll get as result is the previous count: 1 per insert and 2 per "touched" row, so it means using info->touched instead of info->updated

See this snippet of code to choose how the computation is done into sql/sql_insert.cc
 949   if (values_list.elements == 1 && (!(thd->options & OPTION_WARNINGS) ||
 950                     !thd->cuted_fields))
 951   {
 952     thd->row_count_func= info.copied + info.deleted +
 953                          ((thd->client_capabilities & CLIENT_FOUND_ROWS) ?
 954                           info.touched : info.updated);
 955     my_ok(thd, (ulong) thd->row_count_func, id);
 956   }
 957   else
 958   {
 959     char buff[160];
 960     ha_rows updated=((thd->client_capabilities & CLIENT_FOUND_ROWS) ?
 961                      info.touched : info.updated);
 962     if (ignore)
 963       sprintf(buff, ER(ER_INSERT_INFO), (ulong) info.records,
 964           (lock_type == TL_WRITE_DELAYED) ? (ulong) 0 :
 965           (ulong) (info.records - info.copied), (ulong) thd->cuted_fields);
 966     else
 967       sprintf(buff, ER(ER_INSERT_INFO), (ulong) info.records,
 968           (ulong) (info.deleted + updated), (ulong) thd->cuted_fields);
 969     thd->row_count_func= info.copied + info.deleted + updated;
 970     ::my_ok(thd, (ulong) thd->row_count_func, id, buff);
 971   }

So to my humble opinion:
- The INSERT ... ON DUPLICATE KEY is okay but should emphasize that old clients or JDBC driver might request the previous way of computing row count and so get different results depending on it
- A regression has been introduced between 5.1.20 and 5.1.21 when using the old way of computing row count, by incrementing touched twice. 
- 4.1 and 5.0 servers are not impacted and provide the expected results
[15 Jan 2009 10:26] Laurent Goujon
Sorry, this is the other way around.

The count is incorrect when capability CLIENT_FOUND_ROW is set (and the latest version of the JDBC driver now enables to disable this flag which was set by default during the 5.0.x series to match JDBC specs).

I dunno if there's a way to set this capability from the commandline client, just to check...
[23 Mar 2009 16:09] Susanne Ebrecht
Bug #43311 is set as duplicate of this bug here
[3 Apr 2009 7:40] Susanne Ebrecht
For documentation please also read bug #43311.

Consider here that lots of documentation (jdbc, c-api, ...) needs overworked when this is fixed.