Bug #56565 Multiple update syntax doesn't update rows multiple times
Submitted: 5 Sep 2010 10:08 Modified: 6 Sep 2010 20:50
Reporter: Winfried Trümper Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: DML Severity:S3 (Non-critical)
Version:5.1.36 OS:Any
Assigned to: CPU Architecture:Any

[5 Sep 2010 10:08] Winfried Trümper
Description:
My interpretation of the the documentation for the multiple update syntax is, that a row could be updated multiple times with multiple update syntax. If you can take table_references and where_condition from a SELECT into an UPDATE, shouldn't the affected rows be the same?

Below is an example which shows the difference: "5 rows in set" for the SELECT , but only "Rows matched: 2" in the equivalent UPDATE. Meaning that the where_condition and table_references are identical, but results are not.

I refer to the following promising part of the documentation: "For the multiple-table syntax, UPDATE updates rows in each table named in table_references that satisfy the conditions. [..] where_condition is an expression that evaluates to true for each row to be updated.
table_references and where_condition are is specified as described in Section 12.2.8, “SELECT Syntax”."

How to repeat:
CREATE TABLE some_properties (
	has_property enum('No', 'Yes') NOT NULL default 'No',
        p_flat varchar(255) NOT NULL default '',
	p_counter int unsigned NOT NULL default 0
);
INSERT INTO some_properties VALUES
        ('No', 'A-1', 0),
        ('Yes', 'A', 0),
        ('No', 'A-2', 0),
        ('No', 'B-8', 0),
        ('No', 'B-9', 0),
        ('Yes', 'B', 0);

SELECT	p1.p_flat, count(p2.p_flat)
FROM some_properties AS p1 INNER JOIN some_properties AS p2
 ON (p2.p_flat LIKE CONCAT(p1.p_flat, '-%'))
WHERE (p1.has_property = 'Yes')
GROUP BY p1.p_flat;

UPDATE some_properties AS p1 INNER JOIN some_properties AS p2
 ON (p2.p_flat LIKE CONCAT(p1.p_flat, '-%'))
SET	p1.p_counter = p1.p_counter +1
WHERE (p1.has_property = 'Yes');

SELECT * FROM some_properties;

DROP TABLE IF EXISTS some_properties;

Suggested fix:
If you consider it a feature, then add the following to the documentation: "[..] but each row is update once only, even if the join of the equivalent select would list the row multiple times".
[6 Sep 2010 8:39] Christian Winkgen
I stumbled upon a similar thing:

> select * from a;
+----+------+
| id | val  |
+----+------+
|  1 |   10 |
+----+------+
1 row in set (0.00 sec)

> select * from b;
+----+------+
| id | val  |
+----+------+
|  1 |  100 |
|  1 |  200 |
+----+------+
2 rows in set (0.00 sec)

> update a join b using( id ) set a.val = a.val + b.val;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

> select * from a;
+----+------+
| id | val  |
+----+------+
|  1 |  110 |
+----+------+
1 row in set (0.00 sec)

Reproduced with 5.1.41 and 5.0.45.

This is not what I would expect from reading the UPDATE statement syntax.
Changing the UPDATE to SELECT would at least display both values in table b.

Workaround for this is using a sub-select with group by:

update a join ( 
  select id, sum(val) val from b group by id 
) b using( id ) set a.val = a.val + b.val;
[6 Sep 2010 16:45] Sveta Smirnova
Thank you for the report.

I don't see mismatched information in the test result:

SELECT  p1.p_flat, count(p2.p_flat)
FROM some_properties AS p1 INNER JOIN some_properties AS p2
ON (p2.p_flat LIKE CONCAT(p1.p_flat, '-%'))
WHERE (p1.has_property = 'Yes')
GROUP BY p1.p_flat;
p_flat  count(p2.p_flat)
A       2
B       2
affected rows: 2
UPDATE some_properties AS p1 INNER JOIN some_properties AS p2
ON (p2.p_flat LIKE CONCAT(p1.p_flat, '-%'))
SET     p1.p_counter = p1.p_counter +1
WHERE (p1.has_property = 'Yes');
affected rows: 2
info: Rows matched: 2  Changed: 2  Warnings: 0

What is the issue here?
[6 Sep 2010 17:32] Winfried Trümper
Christian, your example misses the select for comparison. Remember my argument is the difference between select and update. Unless you explicitly challenge the line of argumentation, we should stick to it. Does the following code reflect your example properly? Because it shows no difference in number of selected/updated rows, thus no problem. Please, if you contribute, give the statements, not the output. Thanks. -Winfried

CREATE TABLE a (id int, val int);
CREATE TABLE b (id int, val int);
INSERT INTO a VALUES (1, 10);
INSERT INTO b VALUES
       (1, 100),
       (2, 200);
SELECT a.val, (a.val + b.val)
FROM a JOIN b USING( id );
UPDATE a JOIN b USING( id )
SET a.val = a.val + b.val;
DROP TABLE b;
DROP TABLE a;
[6 Sep 2010 17:43] Winfried Trümper
Sorry, while cutting down my real world code to an essential example, I left the wrong select. This is an attempt for repair with minimum changes. Carries the right select. It is four selected rows vs. two updated rows. This example is overly complicated, but that is not the point now. -Winfried

CREATE TABLE some_properties (
	has_property enum('No', 'Yes') NOT NULL default 'No',
        p_flat varchar(255) NOT NULL default '',
	p_counter int unsigned NOT NULL default 0
);
INSERT INTO some_properties VALUES
        ('No', 'A-1', 0),
        ('Yes', 'A', 0),
        ('No', 'A-2', 0),
        ('No', 'B-8', 0),
        ('No', 'B-9', 0),
        ('Yes', 'B', 0);
SELECT p1.p_flat, (p1.p_counter +1)
FROM some_properties AS p1 INNER JOIN some_properties AS p2
 ON (p2.p_flat LIKE CONCAT(p1.p_flat, '-%'))
WHERE (p1.has_property = 'Yes');
UPDATE some_properties AS p1 INNER JOIN some_properties AS p2
 ON (p2.p_flat LIKE CONCAT(p1.p_flat, '-%'))
SET	p1.p_counter = p1.p_counter +1
WHERE (p1.has_property = 'Yes');
DROP TABLE IF EXISTS some_properties;
[6 Sep 2010 18:05] Sveta Smirnova
Thank you for the feedback.

MySQL is smart enough and does not update row if no change needed. Therefore number of affected rows is lower than in SELECT case. This partially described at http://dev.mysql.com/doc/refman/5.1/en/information-functions.html: "For UPDATE statements, the affected-rows value by default is the number of rows actually changed." Therefore this is not a bug. Closing it as such.
[6 Sep 2010 18:44] Winfried Trümper
Sveta, it is clear that UPDATE distinguishes between matching and changed rows. The mysql text client displays this information for every UPDATE, so it's hard to miss. Your answer would apply if the UPDATE would give 4/2, but it gives 2/2 (matched/changed).
[6 Sep 2010 19:01] Sveta Smirnova
Winfried,

SELECT shows duplicate rows, therefore matched == changed.
[6 Sep 2010 19:43] Winfried Trümper
Great, looks like we're making progress. May I reformulate your answer in the following way:

"When it comes to duplicate rows there is no equivalent to the ALL and DISTINCT options of SELECT for the multi-table UPDATE syntax. Instead it is always DISTINCT for UPDATE with multi-table syntax, while the default option for SELECT is ALL."

Does this description reflect the actual internal implementation properly?

Would you agree that this sentences would be helpful pieces of information for readers of chapter 12.2.11? In that case you might want to re-open the issue, so that the great MySQL manual can become even more informative.

See, I never claimed it's a real bug. I just wrote in the suggested fix if it's a feature, document it in chapter 12.2.11 (multiple-table update syntax). Because it reads too much like "UPDATE is like SELECT in all aspects".

Thanks

-Winfried
[6 Sep 2010 20:50] Winfried Trümper
Change to my original suggestion: "implicit DISTINCT+GROUP BY" makes it more clear.

"When it comes to duplicate rows there is no equivalent to the ALL and DISTINCT options of SELECT for the multi-table UPDATE syntax. Instead it is always an implicit DISTINCT+GROUP BY for UPDATE with multi-table syntax, while the default option for SELECT is ALL."