Bug #46675 ON DUPLICATE KEY UPDATE and updateCount() possibly wrong
Submitted: 12 Aug 2009 13:35 Modified: 19 Aug 2011 15:15
Reporter: Ed Slattery Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:server 5.1.37 OS:Any (MS Windows, Mac, Linux)
Assigned to: CPU Architecture:Any
Tags: Connector/J

[12 Aug 2009 13:35] Ed Slattery
Description:
With all the history of this (#39352 etc) I am wary of reporting this - but I can't see why I am getting the values I get.

I am using server 5.1.37 and connector/j 5.1.8

I insert twice to a row with a primary key, on the second insert using "ON DUPLICATE KEY UPDATE" syntax. 

Using mysql server version 5.0.45, I get an update count of 2 for the secomd insert.
Switching to server 5.1.37, I get an update count of 3.

I was expected either 1 or 2 depending on the useAffectedRows flag - I cannot see how 3 is produced.

How to repeat:
I connect to the server, setting the flag "useAffectedRows" to either true or false - doesnt change the result.

I create a table with a primary key:

CREATE TABLE test(id INT, label CHAR(1), PRIMARY KEY(id))

Insert:

INSERT INTO test(id, label) VALUES (1, "a")

Insert again:

INSERT INTO test(id, label) VALUES (1, "a") ON DUPLICATE KEY UPDATE id = 4

and either look at the value returned by executeUpdate, or use 

mystatement.getUpdateCount().

The value returned is 3.
The number of rows in the database is 1, and it contains (4,'a')

I could see how the value could be 1 or 2 depending on the flag - but how does it get to be 3?
[12 Aug 2009 13:44] MySQL Verification Team
Thank you for the bug report. Could you please provide the test case (code sample). Thanks in advance.
[12 Aug 2009 14:23] Ed Slattery
This snippet shows the returned value of 3:

import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.util.Map;
import java.util.HashMap;

public class mysql_test {
  public static void main(String[] args) {

    Map<String, Object> keys = new HashMap<String, Object>();
    keys.put("serverName", "localhost");
    keys.put("port", 3306);
    keys.put("user", "root");
    keys.put("useAffectedRows", false);
    keys.put("password", "");
    
    String clazz = "com.mysql.jdbc.jdbc2.optional.MysqlDataSource";

    ClassLoader contextClassLoader = Thread.currentThread().getContextClassLoader();
    try {
        DataSource ds = (DataSource) contextClassLoader.loadClass(clazz).newInstance();
	Connection cs = ds.getConnection();

        Statement s = cs.createStatement();

        s.execute("drop database if exists test");
        s.execute("create database test");
        s.execute("use test");
        s.execute("CREATE TABLE test(id INT, label CHAR(1), PRIMARY KEY(id))");
        s.executeUpdate("INSERT INTO test(id, label) VALUES (1, 'a')");
        s.executeUpdate("INSERT INTO test(id, label) VALUES (1, 'a') ON DUPLICATE KEY UPDATE id = 4");
          
        System.out.println("Rows Affected:"+s.getUpdateCount());

        cs.close();
 
   } catch (Exception e) {
       e.printStackTrace();
   }
}
}
[18 Aug 2009 9:07] Susanne Ebrecht
Which storage engine are you using?
[19 Aug 2009 7:47] Ed Slattery
innoDB for both 5.1.37 and 5.0.45
[19 Aug 2009 12:33] Tonci Grgin
Hi Ed. Where should I start...

Please check Bug#39352, Bug#43311, Bug#37458 and Bug#41532 for starters and tell me if you're still puzzled about this.
[19 Aug 2009 13:19] Ed Slattery
Yes, I had read those and similar defects, but I didnt find anything which would explain a single update returning a value of 3.  My application was using this value to get the number of rows affected by the change - but as I understand the responses to the other defects, that was a bad idea. How does one normally get the number of affected rows after an indert/update?
[19 Aug 2009 13:20] Ed Slattery
I meant 'insert' of course.
[19 Aug 2009 13:31] Tonci Grgin
Well, I guess I'll have to put up wireshark and check what's on the wire... Guess is c/J reports whatever server sends.
[24 Aug 2009 9:16] Tonci Grgin
Hi Ed.

Although 3 is indeed returned by server:
Mon Aug 24 10:34:36 CEST 2009 TRACE: ReadAheadInputStream.readIfNecessary([B@19209ea,0,7) not all data available in buffer, must read from stream
Mon Aug 24 10:34:36 CEST 2009 TRACE:   ReadAheadInputStream.fill(7), buffer_size=16384, current_position=0, need to read 7 bytes to fill request, underlying InputStream reports 7 total bytes available, attempting to read 7 bytes.
Mon Aug 24 10:34:36 CEST 2009 TRACE: reuseAndReadPacket() payload:
00 03 00 02 00 00 00        . . . . . . . 

I am still setting the report to "Verified" on the basis of what Susanne said in her private comment and due to the fact that neither server docs describe situation where 3 could be returned nor c/J docs describe "useAffectedRows" connection string property.

Mark?
[24 Aug 2009 13:22] Mark Matthews
The "useAffectedRows" property has the following text (that should end up automagically in the documentation):

"Don't set the CLIENT_FOUND_ROWS flag when connecting to the server (not JDBC-compliant, will break most applications that rely on "found" rows vs. "affected rows" for DML statements), but does cause "correct" update counts from "INSERT ... ON DUPLICATE KEY UPDATE" statements to be returned by the server."

Since it's not ending up in the documentation as part of the documentation publishing process, I'm turning this into a documentation bug.
[25 Aug 2009 15:57] Paul DuBois
Aside from the issue noted by Mark that part of the Connector/J doc info isn't getting into the manual, I think this is actually a server bug. Left unexplained by any of the preceding discussion is why there should be a server version difference (5.0 returns 2, 5.1 returns 3). I created a similar test program using Perl DBI, which has a mysql_client_found_rows flag that can be enabled or disabled at connect time, and here is what I find when executing the INSERT ... ON DUPLICATE KEY UPDATE statement and checking the rows return count.

mysql_client_found_rows = 0: The second INSERT returns a row count of 2 in all MySQL versions.

mysql_client_found_rows = 1: The second INSERT returns this row count:

Before MySQL 5.1.20: 2
MySQL 5.1.20: undef on Mac OS X, 139775481 on Linux (initialized value? garbage?)
MySQL 5.1.21 and up: 3

Looking in the 5.1.20 changelog, I see Bug#28505 which concerns mysql_affected_rows() and CLIENT_FOUND_ROWS. However, this change was supposed to have been made in both 5.0.44 and 5.1.20, and the change in row count to return 3 occurs only in 5.1. (I checked 5.0.43, 5.0.44, 5.0.45 and all of them return 2 rows, expected.)

It looks to me like something went wrong with the 5.1 fix. I don't know why there was a change from returning undef/139775481 to returning 3 between 5.1.20 and 5.1.21. I don't see anything that looks like it's relevant in the 5.1.21 changelog.
[25 Aug 2009 16:04] Paul DuBois
re: "MySQL 5.1.20: undef on Mac OS X, 139775481 on Linux (initialized value? garbage?)"

That should say "uninitialized value?"
[26 Aug 2009 6:01] Tonci Grgin
Paul, thank you very much.
[13 Nov 2009 14:00] Tony Bedford
It appears there must have been a temporary glitch in the documentation publishing process. This appears to be resolved now. The "missing" documentation was found to be present and correct on the following pages:

http://dev.mysql.com/doc/refman/5.4/en/connector-j-reference-configuration-properties.html
http://dev.mysql.com/doc/refman/5.1/en/connector-j-reference-configuration-properties.html
http://dev.mysql.com/doc/refman/5.0/en/connector-j-reference-configuration-properties.html
http://dev.mysql.com/doc/refman/4.1/en/connector-j-reference-configuration-properties.html

However, I am not closing this bug just yet as Paul pointed out there could be a server bug involved. I will let Paul decide whether to use this bug or raise another bug accordingly.
[16 Nov 2009 10:31] Tony Bedford
After email with Omer I am setting status to Open, lead to Joro, and category to Server. This is so the server guys can have a look and see if there is indeed a server bug (as Paul suggested) lurking in this report.
[16 Nov 2009 10:42] Tonci Grgin
Cleaning up statuses and assigning to Sveta for further checking.
[16 Nov 2009 13:08] Sveta Smirnova
Verified as server bug using following code:

#include <stdio.h>
#include <mysql.h>
#include <assert.h>
#include <string.h>

int main(int argc, char **argv)
{
	MYSQL conn;
	int OK;
	
	const char* query1= "drop table if exists test;";
	const char* query2= "CREATE TABLE test(id INT, label CHAR(1), PRIMARY KEY(id));";
	const char* query3= "INSERT INTO test(id, label) VALUES (1, 'a');";
	const char* query4= "INSERT INTO test(id, label) VALUES (1, 'a') ON DUPLICATE KEY UPDATE id = 4;";
	
	mysql_init(&conn);
	
	if (!mysql_real_connect(&conn, "127.0.0.1", "root", "", "test", 4040, NULL,CLIENT_FOUND_ROWS))
         printf("Error: %s\n", mysql_error(&conn));
	 
	 OK = mysql_real_query (&conn, query1, strlen(query1));
     assert(0 == OK);
	 OK = mysql_real_query (&conn, query2, strlen(query2));
     assert(0 == OK);
	 OK = mysql_real_query (&conn, query3, strlen(query3));
     assert(0 == OK);
	 OK = mysql_real_query (&conn, query4, strlen(query4));
     assert(0 == OK);
	 
	 printf("%ld rows affected", (long) mysql_affected_rows(&conn));
	 mysql_close(&conn);
	 
	 return 0;
}

Result:

$./bug46675 
3 rows affected

Documentation at http://dev.mysql.com/doc/refman/5.1/en/mysql-affected-rows.html says:

For UPDATE statements, if you specify the CLIENT_FOUND_ROWS flag when connecting to mysqld, mysql_affected_rows() returns the number of rows matched by the WHERE  clause. Otherwise, the default behavior is to return the number of rows actually changed. 

So I expect "1 rows affected" as result and not 3.

MySQL Proxy confirms this is server who sends wrong result:

we got a normal query: INSERT INTO test(id, label) VALUES (1, 'a') ON DUPLICATE KEY UPDATE id = 4;
query-time: 0.211ms
response-time: 0.214ms
affected rows: 3
[18 May 2011 17:12] Mark Matthews
Note that this is related to Bug#61213. The only real fix I can see is to extend the protocol to return the actual *list* of generated/retained keys, not a row count and the initial starting point.
[19 Aug 2011 15:15] Paul DuBois
Noted in 5.5.16, 5.6.3 changelogs.

The mysql_affected_rows() C API function returned 3 (instead of 2)
for INSERT ... ON DUPLICATE KEY UPDATE statements where there was a
duplicated key value.
[5 Jul 2013 19:12] Paul DuBois
Addition to changelog entry:

Now the affected-rows value per row is 1 if the row is inserted as a
new row, 2 if an existing row is updated, and 0 if an existing row is
set to its current values. If you specify the CLIENT_FOUND_ROWS flag
to mysql_real_connect() when connecting to mysqld, the affected-rows
value is 1 (not 0) if an existing row is set to its current values.