MySQL Bugs: #1873: Insert on duplicate key update strict mode

Bug #1873	Insert on duplicate key update strict mode
Submitted:	18 Nov 2003 7:58	Modified:	14 Feb 2008 10:46
Reporter:	Olaf van der Spek (Basic Quality Contributor)
Status:	Won't fix
Category:	MySQL Server	Severity:	S4 (Feature request)
Version:	6.0	OS:	Any
Assigned to:	Sergei Golubchik	CPU Architecture:	Any
Tags:	qc

Description:
I'd like to be able to update multiple rows (independent) in a single table with a single query. The table has a primary key. See "How to repeat" and "Suggested fix".

How to repeat:
+------------+---------------+------+-----+---------+----------------+
| Field      | Type          | Null | Key | Default | Extra          |
+------------+---------------+------+-----+---------+----------------+
| pid        | int(11)       |      | PRI | NULL    | auto_increment |
| rank       | int(11)       |      |     | 0       |                |
+------------+---------------+------+-----+---------+----------------+

update t set rank = 1 where pid = 1;
update t set rank = 2 where pid = 4;
update t set rank = 3 where pid = 9;

Suggested fix:
Modify UPDATE to allow REPLACE-like behaviour:
UPDATE2 t (rank, pid) VALUES (1, 1), (2, 4), (3, 9)

Thank you for your bug report. This issue has already been fixed
in the latest released version of that product, which you can download at 
http://www.mysql.com/downloads/

use

 INSERT t (rank, pid) VALUES (1, 1), (2, 4), (3, 9)
  ON DUPLICATE KEY UPDATE pid=VALUES(pid)

sorry, forgot to say.

it's in 4.1.0

I assume you mean rank = VALUES(rank)?
Because pid is the primary key.

BTW, it's 4.1.1, not 4.1.0: "Since MySQL 4.1.1 one can use function VALUES(col_name) to refer to the column value in the INSERT part of the INSERT ... UPDATE command - that is the value that would be inserted if there would be no duplicate key conflict."
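
With that correction applied, the suggested statement would presumably read as follows. This is only a sketch against the table t from the report; the backquotes around rank are an addition, needed on MySQL 8.0 and later where RANK is a reserved word.

```sql
-- Sketch only: table t (pid, rank) is the one from the report above.
-- pid is the primary key, so the UPDATE part assigns rank, not pid.
INSERT INTO t (`rank`, pid)
VALUES (1, 1), (2, 4), (3, 9)
ON DUPLICATE KEY UPDATE `rank` = VALUES(`rank`);
```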

When will 4.1 be "Production release" instead of "Alpha release"?

I also wanted to ask, what if you only want to update. So if a row doesn't exist, to ignore the update for that row and continue with the other rows?

Has this feature (update, but don't insert if non-existing) already been added?

No, not yet.
We don't currently have plans to support this (non-standard) syntax.

I'd like to reopen this feature request.

When strict mode is enabled, the 'ON DUPLICATE KEY UPDATE' trick doesn't work anymore, because although the insert case should never be executed, there may be fields that don't have a default value there.

Tags += qc

I am not sure that I agree with Olaf.  Not sure I fully understand either!

But I DO agree that the docs at 
http://dev.mysql.com/doc/refman/5.0/en/insert-on-duplicate.html
should clarify the impact of sql_modes on 'ON DUPLICATE KEY UPDATE' clause.

I also think that I found at least 2 inconsistencies:

using a variation of the example that Olaf posted at 
http://bugs.mysql.com/bug.php?id=33261

set sql_mode = '';

drop table if exists t;
create table t 
(
    pid int not null, 
    rank int not null, 
    points int not null,  
    primary key (pid)
);

INSERT t (rank, pid) VALUES (1, 1), (2, 4), (3, 9);
-- (3 row(s)affected)
INSERT t (rank, pid) VALUES (4, 1), (5, 4), (6, 9)
    ON DUPLICATE KEY UPDATE points = 1;
-- (6 row(s)affected) 
--  1st inconsistency: why 6 rows ????

set sql_mode = 'strict_all_tables';

drop table if exists t;
create table t 
(
    pid int not null, 
    rank int not null, 
    points int not null,  
    primary key (pid)
);

INSERT t (rank, pid, points) VALUES (1, 1, 0), (2, 4, 0), (3, 9, 0);
-- (3 row(s)affected)
INSERT t (rank, pid) VALUES (4, 1), (5, 4), (6, 9)
    ON DUPLICATE KEY UPDATE points = 1;
-- Error Code : 1364
-- Field 'points' doesn't have a default value
-- ?????
-- 2nd inconsistency: it should update 'points' column with '1' so it won't use that default anyway - why then demand it in the statement ??!!

> --  1st inconsistency: why 6 rows ????

What version? Can't duplicate on 5.1.22.

> why then demand it in the statement??!!

It checks before it tries to insert the rows.

1) version is 5.0.45

2) I understand, but there is no need that it should.  It should simply write '1' and shut up!

I'm not sure. In that case you have a query that'll sometimes cause query errors (depending on whether certain rows exist or not).

It will of course have to check for the EXISTENCE of the column, but as the column should be updated as the statement tells, there is no need to check for a DEFAULT.

Now, I will also *shut up* here and see what the MySQL people have to say!

> It will of course have to check for the EXISTENCE of the column, but as the column should be updated as the statement tells, there is no need to check for a DEFAULT.

*Only* if the key exists already. Otherwise, it'll have to do an insert. And that's data-dependent.

Olaf, Peter,

I really can't follow your discussion. 
Please, could you summarize what is the problem here?
Do you want a new feature or just a documentation change?
Also, please let us know the version.

It's a request for a new feature. I don't think a version number makes sense for such a request.

The summary from the initial request is still valid.

Ahhh, now I understand.

You want the following implementation for update:

 UPDATE tab
    SET { col = expression |
          ( col [, ...] ) = ( expression [, ...] ) } [, ...]
    [ FROM from-list ]
    [ WHERE condition ]

Eh, no. Or maybe. 

What I'd like is the functionality of the following statement without the insert part. So, if a key does not exist, that value pair should be ignored.

INSERT t (rank, pid) VALUES (1, 1), (2, 4), (3, 9) ON DUPLICATE KEY UPDATE pid=VALUES(pid)

Supporting a query instead of "(rank, pid) VALUES (1, 1), (2, 4), (3, 9)" would be a nice addition.

Let me try again:

The SQL standard says, this could be possible:

UPDATE tab set (col1, col2, ...) = (val1, val2, ....), (col3, col4,...) = (val3, val4, ...), ...

You want:
UPDATE tab set (col1, col2, ...) = (val1, val2, ....), (val3, val4, ....), ...

or similar for a single col:
UPDATE tab set col1 = val1, val2, val3,...

That would be an add-on to the standard SQL syntax.

Did I understand it right, now?

Now I think I understand

Won't "UPDATE ... IGNORE ..." do the trick?

http://dev.mysql.com/doc/refman/5.1/en/update.html

"If you use the IGNORE keyword, the update statement does not abort even if errors occur during the update. Rows for which duplicate-key conflicts occur are not updated. Rows for which columns are updated to values that would cause data conversion errors are updated to the closest valid values instead"

@ Susanne

Please tell me if I should create separate reports for the 2 small issues I reported [16 Dec 0:05]

I'm not familiar with the SQL standard.

> UPDATE tab set (col1, col2, ...) = (val1, val2, ....), (col3, col4,...) = (val3, val4, ...), ...

What would be the semantics of this statement?
Would it be equivalent to this one?
UPDATE tab set (col1, col2, ...) = (val1, val2, ....), ...
UPDATE tab set (col3, col4,...) = (val3, val4, ...), ...

> You want:

Yes, although the syntax from on duplicate key update looks more flexible.

> or similar for a single col:

No, that won't work, as you wouldn't have a key to identify the row.

> That would be an add-on to the standard SQL syntax.

Could be, as I said, I'm not familiar with the standard.

> Did I understand it right, now?

I hope so.

If I understand, what you do is to UPDATE existing rows and NOT(= never?) to INSERT new rows.

Then it would IMHO be logical to use the UPDATE statement and not the INSERT statement! The IGNORE keyword will prevent an error from occurring if the WHERE clause does not 'hit' any existing rows.

> Then it would IMHO be logical to use the UPDATE statement and not the INSERT statement!

The update statement doesn't support independent updates. You can only update multiple rows in the same way.
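
For comparison, a single UPDATE can be made to assign per-row values with a CASE expression, although it is clumsier than the VALUES-list syntax being requested. A minimal sketch, assuming the table t (pid, rank) from the original report; the backquotes around rank are an addition for MySQL 8.0 and later, where RANK is a reserved word.

```sql
-- Sketch only: one statement, three independent assignments.
-- Keys that do not exist in t are simply not matched; nothing is inserted.
UPDATE t
SET `rank` = CASE pid
        WHEN 1 THEN 1
        WHEN 4 THEN 2
        WHEN 9 THEN 3
    END
WHERE pid IN (1, 4, 9);
```

The WHERE clause matters here: without it, rows whose pid is not listed in the CASE expression would be assigned NULL.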

Ok, I think, it is clear now:

Your wish:
UPDATE tab set (col1, col2, col3, ...) = (val1, val2, val3, ...), (val4, val5, ....), (val10, val11, val12, ...), ...

Once this is implemented, the following would also work:
INSERT ... ON DUPLICATE KEY UPDATE (col1, col2, col3, ...) = (val1, val2, val3, ...), (val4, val5, val6, ...), ....

Yes, I think that's it.

I think we are both making a thinking error here.

UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...)
This could be possible.

UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...), (col3, col4, ...) = (val4, val5, ...)
This could be possible too.

But:
UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...), (val4, val5, val6, ...)
This can't be possible.

Just an example:
Update tab set (a,b)=(1,2),(3,4),(5,6) where c=5;

What do you expect the update will do?
Which values have "a" and "b" after the update?

When 3 rows match c=5, what should happen?

One possibility:
first step: "a" = 1 and "b" = 2 in all of those rows
second step: "a" = 3 and "b" = 4 in all of those rows
third step: "a" = 5 and "b" = 6 in all of those rows

That would mean that, at the end, "a" is 5 and "b" is 6 in all of those rows.
That could be achieved more easily with: Update tab set (a,b)=(5,6) where c=5;
In my eyes, this would be pointless.

The other possibility:
the first of the three rows where c=5 gets a=1 and b=2
the second of the three rows where c=5 gets a=3 and b=4
the third of the three rows where c=5 gets a=5 and b=6

This is very difficult to implement and will miss the sense of update in my eyes.

The only thing we could implement is:
UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...), (col3, col4, ...) = (val4, val5, ...)

as example:
Update tab set (a,b)=(1,2), (c,d)=(3,4), (e,f)=(5,6) where c=5;

This would mean that all columns a of the three rows will get 1, all b will get 2, all c will get 3, and so on.

Of course, for this syntax also should be possible something like this:

Update tab set (a,b)=(select x,y from tab2 where ..), (c,d)=(select u,w ...), (e,f)=((select ...),6) where c=5;

The advantage is that you have the columns as a list first and then the values as a list, as in the INSERT statement. Look:

INSERT into tab (col1, col2, ...) values (val1, val2, ...)
UPDATE tab set (col1, col2, ...)    =     (val1, val2, ...)

> UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...), (val4, val5, val6, ...)
> This can't be possible.

> Just an example:
> Update tab set (a,b)=(1,2),(3,4),(5,6) where c=5;

My request is for functionality comparable to insert ... on duplicate key update ...
So, that'd be:
insert into t (a,b) values (1,2),(3,4),(5,6) on duplicate key update b = values(b);
Assume a is a unique key. Then it'd be equivalent to:
update t set b = 2 where a = 1;
update t set b = 4 where a = 3;
update t set b = 6 where a = 5;

If you want to have a where clause, you could append it, but that's not necessary in my case.
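
The update-only behaviour described here can be approximated in one statement with MySQL's multi-table UPDATE by joining against an inline list of value pairs. A sketch under the same assumption as above (a is a unique key of t); the derived-table alias v is introduced here purely for illustration.

```sql
-- Sketch only: single-statement equivalent of the three UPDATEs above.
-- Pairs whose key a does not exist in t find no join partner and are
-- ignored, so no insert (and no strict-mode default check) is involved.
UPDATE t
JOIN (
    SELECT 1 AS a, 2 AS b
    UNION ALL SELECT 3, 4
    UNION ALL SELECT 5, 6
) AS v ON t.a = v.a
SET t.b = v.b;
```

A further WHERE clause could be appended after the SET list if the update should be restricted further.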

> This is very difficult to implement and will miss the sense of update in my eyes.

Yes, and it's not what I requested. ;)

> The only thing we could implement is:
> UPDATE tab set (col1, col2, ...) = (val1, val2, val3, ...), (col3, col4, ...) = (val4,
> val5, ...)

> as example:
> Update tab set (a,b)=(1,2), (c,d)=(3,4), (e,f)=(5,6) where c=5;

> This would mean that all columns a of the three rows will get 1, all b will get 2, all c
> will get 3, and so on.

> Of course, for this syntax also should be possible something like this:

> Update tab set (a,b)=(select x,y from tab2 where ..), (c,d)=(select u,w ...),
> (e,f)=((select ...),6) where c=5;

Is that part of the SQL standard?
It's a very complex operation, a bit like:
update t set (a,b,c,d) = (select x,y from t2 where ... append_columns select u, w from t3 where ...)

But not what I asked for. Could be useful though.

Hi Olaf,

now, I have it ...

This already works in MySQL 6.0 and also in MySQL 5.0.51.

mysql> create table t1(id integer, num integer, primary key(id));
Query OK, 0 rows affected (0.03 sec)

mysql> set sql_mode='STRICT_TRANS_TABLES';
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t1 values(1,1),(2,2),(3,3);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> insert into t1 values(1,2),(2,3),(3,4) on duplicate key update num=values(num);
Query OK, 6 rows affected (0.03 sec)
Records: 3  Duplicates: 3  Warnings: 0

mysql> select * from t1\G
*************************** 1. row ***************************
 id: 1
num: 2
*************************** 2. row ***************************
 id: 2
num: 3
*************************** 3. row ***************************
 id: 3
num: 4
3 rows in set (0.00 sec)

I will close this bug. Please feel free to reopen it if it's still not what you wanted.

Sorry, I forgot. Besides version 5.0.51, I tested with:

mysql> select version()\G
*************************** 1. row ***************************
version(): 6.0.3-alpha-debug

Of course *that* works.

Please read this again:
> When strict mode is enabled, the 'ON DUPLICATE KEY UPDATE' trick doesn't work
> anymore, because although the insert case should never be executed, there may be
> fields that don't have a default value there.

Try this instead:
mysql> set sql_mode='strict_trans_tables';

mysql> drop table if exists t1;
Query OK, 0 rows affected (0.00 sec)

mysql> create table t1 (id integer, num integer, a int not null, primary key (id));
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t1 (id, num, a) values (1, 1, 1),(3, 3, 3);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into t1 (id, num) values (1, 2), (2, 3), (3, 4) on duplicate key update num = values(num);
ERROR 1364 (HY000): Field 'a' doesn't have a default value

mysql> set sql_mode='';
Query OK, 0 rows affected (0.00 sec)

mysql> drop table if exists t1;

mysql> create table t1 (id integer, num integer, a int not null, primary key (id));
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1 (id, num, a) values (1, 1, 1),(3, 3, 3);
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into t1 (id, num) values (1, 2), (2, 3), (3, 4) on duplicate key update num = values(num);
Query OK, 5 rows affected, 1 warning (0.00 sec)
Records: 3  Duplicates: 2  Warnings: 0

mysql> select * from t1;
+----+------+---+
| id | num  | a |
+----+------+---+
|  1 |    2 | 1 | 
|  3 |    4 | 3 | 
|  2 |    3 | 0 | 
+----+------+---+
3 rows in set (0.00 sec)

mysql>

mysql> insert into t1 values(1,2),(2,3),(3,4) on duplicate key update num=values(num);
Query OK, 6 rows affected (0.03 sec)

Why 6 rows??

> Why 6 rows??

That's a bug, please submit a new report for it.

done http://bugs.mysql.com/bug.php?id=33370

6 is a little bit logical.
http://dev.mysql.com/doc/refman/5.1/en/constraint-invalid-data.html

In my eyes:

Query OK, 6 rows affected (0.03 sec)
Records: 3  Duplicates: 3  Warnings: 0

3 records + 3 duplicates = 6
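
As a sketch of that arithmetic, assuming the t1 sessions shown earlier in this thread (the MySQL manual describes the affected-rows value per row as 1 for a newly inserted row and 2 for an existing row that is updated):

```sql
-- Sketch only; t1 and its contents are the ones from the sessions above.

-- Case 1: ids 1, 2 and 3 all exist, with different num values.
INSERT INTO t1 VALUES (1, 2), (2, 3), (3, 4)
ON DUPLICATE KEY UPDATE num = VALUES(num);
-- ROW_COUNT() reports the affected-rows value of the previous statement.
SELECT ROW_COUNT();
-- 3 updated rows, each counted twice: 3 * 2 = 6 rows affected.

-- Case 2 (the non-strict session where only ids 1 and 3 existed):
-- 2 updated rows * 2 + 1 inserted row * 1 = 5, matching the
-- "5 rows affected" output shown earlier.
```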

In my eyes: first it tries to insert the rows. Therefore 3 rows are affected. Because all are duplicates it tries to update them, therefore 3 more rows are affected. The sum is 6.

This is also documented in the MySQL 5.0 Certification Study Guide:

Chapter 11.2.3 Using INSERT ... ON DUPLICATE KEY UPDATE

"Notice the difference in the 'row affected' value returned by the server for each INSERT statement: If a new record is inserted, the value is 1; if an already existing record is updated, the value is 2"

This one isn't supposed to be closed.

Please read bug #33370 and bug #33371 for further information.

> Please read bug #33370 and bug #33371 for further information.

Sorry, but I don't see how that invalidates this feature request.
See one of my last posts: [19 Dec 2007 16:44] Olaf van der Spek

Sorry, but I can't see a feature request here.
What exactly should the feature request be?

The following code should work by introducing a new update statement with semantics comparable to insert ... on duplicate key update ... without the insert part (and thus avoiding the issues with strict mode).

mysql> set sql_mode='strict_trans_tables';

mysql> drop table if exists t1;
Query OK, 0 rows affected (0.00 sec)

mysql> create table t1 (id integer, num integer, a int not null, primary key (id));
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1 (id, num, a) values (1, 1, 1),(3, 3, 3);
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> insert into t1 (id, num) values (1, 2), (2, 3), (3, 4) on duplicate key update num = values(num);
ERROR 1364 (HY000): Field 'a' doesn't have a default value

Please read bug #33371.

Using strict mode also means, that you are more strict to the SQL Standard and the expected behaviour of the state-of-the-art.

What should happen, when you made a thinking error, and there is no column, that could be updated? Then the insert will fail because of missing values. The system can't guess if you know, that this statement is always an UPDATE and not an INSERT.

Usually, INSERT ... ON DUPLICATE KEY UPDATE is used, when you are not sure if the row exist or not. And strict mode is used to avoid thinking errors or having inconsistent data.

> Please read bug #33371.

I have read #33371.

> Using strict mode also means, that you are more strict to the SQL Standard and
> the expected behaviour of the state-of-the-art.

I know what strict mode means.

> What should happen, when you made a thinking error, 

I don't make thinking errors. ;)

> and there is no column, that could be updated? Then the insert will fail because of missing values. The system can't guess if

I know. That's why this is a *feature* request for a statement that only does updates, no inserts.

But that's no feature request. Such a feature has been implemented for years and is called: "UPDATE". :)

When you know that it is always just an "UPDATE" and no "INSERT" is needed, why use "INSERT ... ON DUPLICATE KEY UPDATE .."?
Why not just use "UPDATE ...."?

Because, as the initial description says: "I'd like to be able to update multiple rows (independent) in a single table with a single query."
UPDATE only allows dependent updates, while I need to do independent updates.

Again, please read the initial description.

When you want to update multiple rows with different values, the system must know which rows you want to update with which values.
This means you need a condition for every row you want to update.

something like: UPDATE tab set a=1 where id=50, set b=5 where id=75, set a=7 where id=27, ...

This is not possible to implement.

The solution for your problem is to write a procedure/function in which you can handle all these conditions and update the rows.

After this you will only have one statement: "CALL PROCEDURE(...)" or "SELECT FUNCTION(...)"

Or you use "INSERT ... ON DUPLICATE KEY UPDATE" without strict mode, if strict mode is the problem here.
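
The last suggestion could be sketched as follows, switching strict mode off for one statement only and restoring it afterwards. The table t1 (id, num, a) is the one from the sessions above; the user variable @old_sql_mode is a name introduced here for illustration.

```sql
-- Sketch only: remember the session's current SQL mode.
SET @old_sql_mode = @@SESSION.sql_mode;
SET SESSION sql_mode = '';

INSERT INTO t1 (id, num) VALUES (1, 2), (2, 3), (3, 4)
ON DUPLICATE KEY UPDATE num = VALUES(num);

-- Restore the previous mode for the rest of the session.
SET SESSION sql_mode = @old_sql_mode;
```

Note that this does not give update-only behaviour: as the earlier non-strict session shows, a key that does not exist yet (id 2) is still inserted, with the implicit default 0 for column a.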

> This is not possible to implement.

Why not?
I don't see how removing the insert part from insert ... on duplicate key update ... is impossible.

Susanne?

Sorry for the delay.

Because it's against SQL standard rules.

> Because it's against SQL standard rules.

What rule says you're not allowed to implement this extension?