Bug #1873 Insert on duplicate key update strict mode
Submitted: 18 Nov 2003 7:58 Modified: 14 Feb 2008 10:46
Reporter: Olaf van der Spek (Basic Quality Contributor) Email Updates:
Status: Won't fix Impact on me:
None 
Category:MySQL Server Severity:S4 (Feature request)
Version:6.0 OS:Any
Assigned to: Sergei Golubchik CPU Architecture:Any
Tags: qc

[18 Nov 2003 7:58] Olaf van der Spek
Description:
I'd like to be able to update multi-rows (independent) in a single table with a single query. The table has a primary key. See "how to repeat" and "suggested fix".

How to repeat:
+------------+---------------+------+-----+---------+----------------+
| Field      | Type          | Null | Key | Default | Extra          |
+------------+---------------+------+-----+---------+----------------+
| pid        | int(11)       |      | PRI | NULL    | auto_increment |
| rank       | int(11)       |      |     | 0       |                |
+------------+---------------+------+-----+---------+----------------+

update set rank = 1 where pid = 1;
update set rank = 2 where pid = 4;
update set rank = 3 where pid = 9;

Suggested fix:
Modify UPDATE to allow REPLACE-like behaviour:
UPDATE2 t (rank, pid) VALUES (1, 1), (2, 4), (3, 9)
[18 Nov 2003 10:17] Sergei Golubchik
Thank you for your bug report. This issue has already been fixed
in the latest released version of that product, which you can download at 
http://www.mysql.com/downloads/

use

 INSERT t (rank, pid) VALUES (1, 1), (2, 4), (3, 9)
  ON DUPLICATE KEY UPDATE pid=VALUES(pid)
[18 Nov 2003 10:18] Sergei Golubchik
sorry, forgot to say.

it's in 4.1.0
[18 Nov 2003 12:17] Olaf van der Spek
I assume you mean rank = VALUES(rank)?
Because pid is the primary key.

BTW, it's 4.1.1, not 4.1.0: "Since MySQL 4.1.1 one can use function VALUES(col_name) to refer to the column value in the INSERT part of the INSERT ... UPDATE command - that is the value that would be inserted if there would be no duplicate key conflict."

When will 4.1 be "Production release" instead of "Alpha release"?
[17 Apr 2004 12:15] Olaf van der Spek
I also wanted to ask, what if you only want to update. So if a row doesn't exist, to ignore the update for that row and continue with the other rows?
[5 Sep 2004 0:06] Olaf van der Spek
Has this feature (update, but don't insert if non-existing) already been added?
[5 Sep 2004 10:45] Sergei Golubchik
no not yet.
We don't currently have plans to support this (non-standard) syntax
[23 May 2007 18:13] Olaf van der Spek
I'd like to reopen this feature request.

When strict mode is enable, the 'ON DUPLICATE KEY UPDATE' trick doesn't work anymore, because although the insert case should never be executed, there may be fields that don't have a default value there.
[15 Dec 2007 21:42] Olaf van der Spek
Tags += qc
[15 Dec 2007 23:05] Peter Laursen
I am not sure that I agree with Olaf.  Not sure I fully understand either!

But I DO agree that the docs at 
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
should clarify the impact of sql_modes on 'ON DUPLICATE KEY UPDATE' clause.

I also think that I found at least 2 inconsistencies:

using a variation of the example that Olaf posted at 
http://bugs.mysql.com/bug.php?id=33261

set sql_mode = '';

drop table if exists t;
create table t 
(
    pid int not null, 
    rank int not null, 
    points int not null,  
    primary key (pid)
);

INSERT t (rank, pid) VALUES (1, 1), (2, 4), (3, 9);
-- (3 row(s)affected)
INSERT t (rank, pid) VALUES (4, 1), (5, 4), (6, 9)
    ON DUPLICATE KEY UPDATE points = 1;
-- (6 row(s)affected) 
--  1st inconsistency: why 6 rows ????

set sql_mode = 'strict_all_tables';

drop table if exists t;
create table t 
(
    pid int not null, 
    rank int not null, 
    points int not null,  
    primary key (pid)
);

INSERT t (rank, pid, points) VALUES (1, 1, 0), (2, 4, 0), (3, 9, 0);
-- (3 row(s)affected)
INSERT t (rank, pid) VALUES (4, 1), (5, 4), (6, 9)
    ON DUPLICATE KEY UPDATE points = 1;
-- Error Code : 1364
-- Field 'points' doesn't have a default value
-- ?????
-- 2nd inconsistency: it should update 'points' column with '1' so it won't use that default anyway - why then demand it in the statement ??!!
[15 Dec 2007 23:10] Olaf van der Spek
> --  1st inconsistency: why 6 rows ????

What version? Can't duplicate on 5.1.22.

> why then demand it in the statement??!!

It checks before it's trying to insert the rows.
[15 Dec 2007 23:19] Peter Laursen
1) version is 5.0.45

2) I understand, but there is no need that it should.  It should simply write '1' and shut up!
[15 Dec 2007 23:22] Olaf van der Spek
I'm not sure. In that case you have a query that'll sometimes cause query errors (depending on whether certain rows exist or not).
[15 Dec 2007 23:28] Peter Laursen
It will of course have to check for the EXISTENCY of the column, but as the column should be updated as the statement tells, there is no need to check for a DEFAULT.

Now, I will also *shut up* here and see what the MySQL people have to say!
[15 Dec 2007 23:30] Olaf van der Spek
> It will of course have to check for the EXISTENCY of the column, but as the column should be updated as the statement tells, there is no need to check for a DEFAULT.

*Only* if the key exists already. Otherwise, it'll have to do an insert. And that's data-dependent.
[17 Dec 2007 14:32] Susanne Ebrecht
Olaf, Peter,

I really can't follow your discussion. 
Please, could you summarize what is the problem here?
Do you want a new feature or just a documentation change?
Also, please let us know the version.
[17 Dec 2007 14:40] Olaf van der Spek
It's a request for a new feature. I don't think a version number makes sense for such a request.

The summary from the initial request is still valid.
[17 Dec 2007 14:58] Susanne Ebrecht
Ahhh, now I understand.

You want the following implementation for update:

 UPDATE tabelle
    SET { col = expression |
          ( col [, ...] ) = ( expression [, ...] ) } [, ...]
    [ FROM from-list ]
    [ WHERE condition ]
[17 Dec 2007 15:13] Olaf van der Spek
Eh, no. Or maybe. 

What I'd like is the functionality of the following statement without the insert part. So, if a key does not exist, that value pair should be ignored.

INSERT t (rank, pid) VALUES (1, 1), (2, 4), (3, 9) ON DUPLICATE KEY UPDATE pid=VALUES(pid)

Supporting a query instead of "(rank, pid) VALUES (1, 1), (2, 4), (3, 9)" would be a nice addition.
[17 Dec 2007 15:25] Susanne Ebrecht
Let me try again:

The SQL standard says, this could be possible:

UPDATE tab set (col1, col2, ...) = (val1, val2, ....), (col3, col4,...) = (val3, val4, ...), ...

You want:
UPDATE tab set (col1, co2, ...) = (val1, val2, ....), (val3, val4, ....), ...

or similar for a single col:
UPDATE tab set col1 = val1, val2, val3,...

That would be an add-on to the standard SQL syntax.

Did I understand it right, now?
[17 Dec 2007 15:33] Peter Laursen
Now I think I understand

Won't "UPDATE ... IGNORE ..." do the trick?

http://dev.mysql.com/doc/refman/5.1/en/update.html

"If you use the IGNORE keyword, the update statement does not abort even if errors occur during the update. Rows for which duplicate-key conflicts occur are not updated. Rows for which columns are updated to values that would cause data conversion errors are updated to the closest valid values instead"
[17 Dec 2007 15:36] Peter Laursen
@ Susanne

Please tell if I should create seperate reports for the 2 small issues I reported [16 Dec 0:05]
[17 Dec 2007 16:22] Olaf van der Spek
I'm not familiar with the SQL standard.

> UPDATE tab set (col1, col2, ...) = (val1, val2, ....), (col3, col4,...)
= (val3, val4, ...), ...

What would be the semantics of this statement?
Would it be equivalent to this one?
UPDATE tab set (col1, col2, ...) = (val1, val2, ....), ...
UPDATE tab set (col3, col4,...) = (val3, val4, ...), ...

> You want:

Yes, although the syntax from on duplicate key update looks more flexible.

> or similar for a single col:

No, that won't work, as you wouldn't have a key to identify the row.

> That would be an add-on to the standard SQL syntax.

Could be, as I said, I'm not familiar with the standard.

> Did I understand it right, now?

I hope so.
[17 Dec 2007 16:54] Peter Laursen
If I understand, what you do is to UPDATE existing rows and NOT(= never?) to INSERT new rows.

Then it would IMHO be logical to use the UPDATE statement and not the INSERT statement! IGNORE keyword will prevent an error to occur if the WHERE clause does not 'hit' any existing columns.
[17 Dec 2007 17:02] Olaf van der Spek
> Then it would IMHO be logical to use the UPDATE statement and not the INSERT statement!

The update statement doesn't support independent updates. You can only update multiple rows in the same way.
[17 Dec 2007 17:20] Susanne Ebrecht
Ok, I think, it is clear now:

Your wish:
UDATE tab set (col1, col2, col3, ...) = (val1, val2, val3, ...), (val4, val5, ....), (val10, val11, val12, ...), ...

If this is implemented once, it would also work:
INSERT ... ON DUPLICATE KEY UPDATE (col1, col2, col3, ...) = (val1, val2, val3, ...), (val4, val5, val6, ...), ....
[17 Dec 2007 17:45] Olaf van der Spek
Yes, I think that's it.
[18 Dec 2007 12:46] Susanne Ebrecht
I think, we both make thinking errors here.

UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...)
This could be possible.

UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...), (col3, col4, ...) = (val4, val5, ...)
This could be possible too.

But:
UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...), (val4, val5, val6, ...)
This can't be possible.

Just an example:
Update tab set (a,b)=(1,2),(3,4),(5,6) where c=5;

What do you expect the update will do?
Which values have "a" and "b" after the update?

when you have 3 rows where c=5 is true what should happen?

first step all "a" of that rows = 1 and all "b" of that rows = 2
second step all "a" of that rows = 3 and all "b" of that rows = 4
third step all "a" of that rows = 5 and all "b" of that rows = 6

That would mean, that at the end all "a" of the rows are 5 and all "b" are 6.
That could be easier solved with: Update tab set (a,b)=(5,6) where c=5;
In my eyes, this would be senseless.

The other possibility:
first of the three rows where c=5: a will get 1 and b will get 2
second of the three rows where c=5: a will get 3 and b will get 4
third of the three rows where c=5: a will get 5 and b will get 6

This is very difficult to implement and will miss the sense of update in my eyes.

The only think, we could implement is:
UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...), (col3, col4, ...) = (val4, val5, ...)

as example:
Update tab set (a,b)=(1,2), (c,d)=(3,4), (e,f)=(5,6) where c=5;

This would mean that all columns a of the three rows will get 1, all b will get 2, all c will get 3, and so on.

Of course, for this syntax also should be possible something like this:

Update tab set (a,b)=(select x,y from tab2 where ..), (c,d)=(select u,w ...), (e,f)=((select ...),6) where c=5;

The advantage is, that you have the columns as list first and then the values as list, like at the insert statement. Look:

INSERT into tab (col1, col2, ...) values (val1, val2, ...)
UPDATE tab set (col1, col2, ...)    =     (val1, val2, ...)
[18 Dec 2007 13:07] Olaf van der Spek
> UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...), (val4, val5, val6, ...)
> This can't be possible.

> Just an example:
> Update tab set (a,b)=(1,2),(3,4),(5,6) where c=5;

My request is for functionality comparable to insert ... on duplicate key update ...
So, that'd be:
insert into t (a,b) values (1,2),(3,4),(5,6) on duplicate key update b = values(b);
Assume a is a unique key. Then it'd be equivalent to:
update t set b = 2 where a = 1;
update t set b = 4 where a = 3;
update t set b = 6 where a = 5;

If you want to have a where clause, you could append it, but that's not necessary in my case.

> This is very difficult to implement and will miss the sense of update in my eyes.

Yes, and it's not what I requested. ;)

> The only think, we could implement is:
> UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...), (col3, col4, ...) = (val4,
> val5, ...)

> as example:
> Update tab set (a,b)=(1,2), (c,d)=(3,4), (e,f)=(5,6) where c=5;

> This would mean that all columns a of the three rows will get 1, all b will get 2, all c
> will get 3, and so on.

> Of course, for this syntax also should be possible something like this:

> Update tab set (a,b)=(select x,y from tab2 where ..), (c,d)=(select u,w ...),
> (e,f)=((select ...),6) where c=5;

Is that part of the SQL standard?
It's a very complex operation, a bit like:
update t set (a,b,c,d) = (select x,y from t2 where ... append_columns select u, w from t3 where ...)

But not what I asked for. Could be useful though.
[19 Dec 2007 15:27] Susanne Ebrecht
Hi Olaf,

now, I have it ...

This already works in MySQL 6.0 and also in MySQL 5.0.51.

mysql> create table t1(id integer, num integer, primary key(id));
Query OK, 0 rows affected (0.03 sec)

mysql> set sql_mode='STRICT_TRANS_TABLES';
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t1 values(1,1),(2,2),(3,3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> insert into t1 values(1,2),(2,3),(3,4) on duplicate key update num=values(num);
Query OK, 6 rows affected (0.03 sec)
Records: 3  Duplicates: 3  Warnings: 0

mysql> select * from t1\G
*************************** 1. row ***************************
 id: 1
num: 2
*************************** 2. row ***************************
 id: 2
num: 3
*************************** 3. row ***************************
 id: 3
num: 4
3 rows in set (0.00 sec)

I will close this bug. Please, feel free to open it again, if it's still not, what you wanted.
[19 Dec 2007 15:30] Susanne Ebrecht
Sorry, I forget. Besides version 5.0.51, I tested with:

mysql> select version()\G
*************************** 1. row ***************************
version(): 6.0.3-alpha-debug
[19 Dec 2007 15:44] Olaf van der Spek
Of course *that* works.

Please read this again:
> When strict mode is enable, the 'ON DUPLICATE KEY UPDATE' trick doesn't work 
> anymore, because although the insert case should never be executed, there may be > fields that don't have a default value there.

Try this instead:
mysql> set sql_mode='strict_trans_tables';

mysql> drop table if exists t1;
Query OK, 0 rows affected (0.00 sec)

mysql> drop table if exists t1;
create table t1 (id integer, num integer, a int not null, primary key (id));
Query OK, 0 rows affected (0.00 sec)

mysql> create table t1 (id integer, num integer, a int not null, primary key (id));
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t1 (id, num, a) values (1, 1, 1),(3, 3, 3);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into t1 (id, num) values (1, 2), (2, 3), (3, 4) on duplicate key update num = values(num);
ERROR 1364 (HY000): Field 'a' doesn't have a default value

mysql> set sql_mode='';
Query OK, 0 rows affected (0.00 sec)

mysql> drop table if exists t1;

mysql> create table t1 (id integer, num integer, a int not null, primary key (id));
Query OK, 0 rows affected (0.00 sec)

mysql> create table t1 (id integer, num integer, a int not null, primary key (id));
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t1 (id, num, a) values (1, 1, 1),(3, 3, 3);
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into t1 (id, num) values (1, 2), (2, 3), (3, 4) on duplicate key update num = values(num);
Query OK, 5 rows affected, 1 warning (0.00 sec)
Records: 3  Duplicates: 2  Warnings: 0

mysql> select * from t1;
+----+------+---+
| id | num  | a |
+----+------+---+
|  1 |    2 | 1 | 
|  3 |    4 | 3 | 
|  2 |    3 | 0 | 
+----+------+---+
3 rows in set (0.00 sec)

mysql>
[19 Dec 2007 15:59] Peter Laursen
mysql> insert into t1 values(1,2),(2,3),(3,4) on duplicate key update
num=values(num);
Query OK, 6 rows affected (0.03 sec)

Why 6 rows??
[19 Dec 2007 16:14] Olaf van der Spek
> Why 6 rows??

That's a bug, please submit a new report for it.
[19 Dec 2007 16:29] Peter Laursen
done http://bugs.mysql.com/bug.php?id=33370
[19 Dec 2007 19:06] Susanne Ebrecht
6 is a little bit logical.
http://dev.mysql.com/doc/refman/5.1/en/constraint-invalid-data.html

In my eyes:

Query OK, 6 rows affected (0.03 sec)
Records: 3  Duplicates: 3  Warnings: 0

3 records + 3 duplicates = 6

In my eyes: first it tries to insert the statements. Therefor 3 rows are affected. Because all are duplicate it tries to update them, therefor 3 rows are affected. The sum is 6.
[19 Dec 2007 19:13] Susanne Ebrecht
This is also documented at the MySQL 5.0 Certification Study Guide:

Chapter 11.2.3 Using INSERT ... ON DUPLICATE KEY UPDATE

"Notice the difference in the 'row affected' value returned by the server for each INSERT statement: If a new record is inserted, the value is 1; if an already existing record is updated, the value is 2"
[21 Dec 2007 15:34] Olaf van der Spek
This one isn't supposed to be closed.
[3 Jan 2008 13:01] Susanne Ebrecht
Please read bug #33370 and bug #33371 for further information.
[3 Jan 2008 13:11] Olaf van der Spek
> Please read bug #33370 and bug #33371 for further information.

Sorry, but I don't see how that invalidates this feature request.
See one of my last posts: [19 Dec 2007 16:44] Olaf van der Spek
[3 Jan 2008 14:10] Susanne Ebrecht
Sorry, but I can't see a feature request here.
What exactly should the feature request be?
[3 Jan 2008 14:19] Olaf van der Spek
The following code should work by introducing a new update statement with semantics comparable to insert ... on duplicate key update ... without the insert part (and thus avoiding the issues with strict mode).

mysql> set sql_mode='strict_trans_tables';

mysql> drop table if exists t1;
Query OK, 0 rows affected (0.00 sec)

mysql> create table t1 (id integer, num integer, a int not null, primary key (id));
Query OK, 0 rows affected (0.00 sec)

mysql> create table t1 (id integer, num integer, a int not null, primary key (id));
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t1 (id, num, a) values (1, 1, 1),(3, 3, 3);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into t1 (id, num) values (1, 2), (2, 3), (3, 4) on duplicate key update num
= values(num);
ERROR 1364 (HY000): Field 'a' doesn't have a default value
[4 Jan 2008 17:11] Susanne Ebrecht
Please read bug #33371.

Using strict mode also means, that you are more strict to the SQL Standard and the expected behaviour of the state-of-the-art.

What should happen, when you made a thinking error, and there is no column, that could be updated? Then the insert will fail because of missing values. The system can't guess if you know, that this statement is always an UPDATE and not an INSERT.

Usually, INSERT ... ON DUPLICATE KEY UPDATE is used, when you are not sure if the row exist or not. And strict mode is used to avoid thinking errors or having inconsistent data.
[4 Jan 2008 17:34] Olaf van der Spek
> Please read bug #33371.

I have read #33371.

> Using strict mode also means, that you are more strict to the SQL Standard and > the expected behaviour of the state-of-the-art.

I know what strict mode means.

> What should happen, when you made a thinking error, 

I don't make thinking errors. ;)

> and there is no column, that could be updated? Then the insert will fail because of missing values. The system can't guess if

I know. That's why this is a *feature* request for a statement that only does updates, no inserts.
[7 Jan 2008 14:01] Susanne Ebrecht
But that's no feature request. Such a feature is implemented since years and is called: "UPDATE". :)

When you know, that it always just is an "UPDATE" and no "INSERT" is needed, why using "INSERT ... ON DUPLICATE KEY UPDATE .."?
Why not just using "UPDATE ...."?
[7 Jan 2008 14:05] Olaf van der Spek
Because, like the initial description says: "I'd like to be able to update multiple rows (independent) in a single table with a single
query."
UPDATE only allows dependent updates, while I need to do independent updates.

Again, please read the initial description.
[7 Jan 2008 15:19] Susanne Ebrecht
when you want to update multiple rows with different values, the system must know, which rows you want to update with which values.
This means, you need a condition for every row you want to update.

something like: UPDATE tab set a=1 where id=50, set b=5 where id=75, set a=7 where id=27, ...

This is not possible to implement.

The solution for your problem is writing a procedure/function, where you can handle all these conditions and update the rows.

After this you only will have one statement: "CALL PROCEDURE(...)" or "SELECT FUNCTION(...)"

Or you use "INSERT ... ON DUPLICATE KEY UPDATE" without using strict mode, when the strict mode is the problem here.
[7 Jan 2008 15:22] Olaf van der Spek
> This is not possible to implement.

Why not?
I don't see how removing the insert part from insert ... on duplicate key update ... is impossible.
[17 Jan 2008 15:11] Olaf van der Spek
Susanne?
[14 Feb 2008 10:37] Susanne Ebrecht
Sorry for delay.

Because it's against SQL standard rules.
[14 Feb 2008 10:46] Olaf van der Spek
> Because it's against SQL standard rules.

What rule says you're not allowed to implement this extension?