Bug #20916 ambigous field with INSERT ... SELECT ... ON DUPLICATE KEY UPDATE
Submitted: 8 Jul 2006 17:35 Modified: 30 Oct 2006 19:08
Reporter: Daniel Kinzler Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Documentation Severity:S3 (Non-critical)
Version:4.1.13, 5.0.21 OS:Linux (Linux)
Assigned to: Paul DuBois CPU Architecture:Any

[8 Jul 2006 17:35] Daniel Kinzler
Description:
When using INSERT ... SELECT ... ON DUPLICATE KEY UPDATE, and the select references the table the insert points to, then any reference to a field in the UDPATE results in ERROR 1052 (23000): Column 'xxx' in field list is ambiguous.

Since the UPDATE part can only refer to fields in the existing row (using an unqualified name) and in the row pending insertion (using VALUES(x) syntax), there can be no ambiguity. Any attempt to reference fields from the select appear misguided.

I have tried several ways of aliasing/prefixing, to no avail. I also tried to wrap the select in a derived table, to rename all the fields. This results in a different error: ERROR 1110 (42000): COLUMN 'xxx' specified twice

It should be notes that qualifying the field names in the UPDATE part with a table alias defined in the SELECT gets rid of the error in 5.0 but not in 4.1 - but that addresses the *wrong* value - i.e. not the one from the existing row!

The examples below are what I originally had - they can probably be simplified.
Also, it would be worth trying to reference a different table in the select, instead of doing a self-referntial insert. If the other table has a conflicting field name, this error may or may not also occur (i suspect it does but did not try).

The following bugs seem related, but are marked as fixed: #12631, #8732, #13385, #10109 (the search in the bug tracker is pretty bad, btw)

How to repeat:
CREATE TABLE `meta2_tmp_alias` (
  `oldmeaning` int(12) NOT NULL default '0',
  `newmeaning` int(12) NOT NULL default '0'
) ENGINE=InnoDB DEFAULT CHARSET=binary 

CREATE TABLE `meta2_neighbours` (
  `frommeaning` int(12) NOT NULL default '0',
  `tomeaning` int(12) NOT NULL default '0',
  `conflict` tinyint(4) NOT NULL default '0',
  `level` tinyint(4) NOT NULL default '1',
  `via` varbinary(255) default NULL,
  UNIQUE KEY `fromto` (`frommeaning`,`tomeaning`),
  KEY `src` (`frommeaning`,`conflict`),
  KEY `tgt` (`tomeaning`,`conflict`),
  KEY `conflict` (`conflict`),
  KEY `level` (`level`)
) ENGINE=InnoDB DEFAULT CHARSET=binary

INSERT INTO meta2_neighbours (frommeaning, tomeaning, conflict, level)
                              SELECT newmeaning as frommeaning,
                                     tomeaning,
                                     conflict,
                                     level
                              FROM meta2_neighbours
                              JOIN meta2_tmp_alias ON oldmeaning =  frommeaning
                              ON DUPLICATE KEY UPDATE level = IF(level < VALUES(level), level, VALUES(level)),
                                                      conflict = IF(conflict > VALUES(conflict), conflict, VALUES(conflict))
                                         
-- produces  ERROR 1052 (23000): Column 'level' in field list is ambiguous

-- to solve this, i fiddeled around a bit.
-- all I manages was to get a different error:

INSERT INTO meta2_neighbours (frommeaning, tomeaning, conflict, level)
                              SELECT f, t, c, l FROM ( SELECT
                                     X.newmeaning as f,
                                     N.tomeaning as t,
                                     N.conflict as c,
                                     N.level as l
                              FROM meta2_neighbours as N
                              JOIN meta2_tmp_alias as X ON oldmeaning =  frommeaning ) as Y
                              ON DUPLICATE KEY UPDATE level = 1, conflict = 2

-- produces ERROR 1110 (42000): COLUMN 'level' specified twice

-- this can also be produced using a simpler query, without join:

INSERT INTO meta2_neighbours (frommeaning, tomeaning, conflict, level)
                              SELECT N.frommeaning as f,
                                     N.tomeaning as t,
                                     N.conflict as c,
                                     N.level as l
                              FROM meta2_neighbours as N
                              WHERE N.frommeaning = 1234
                              ON DUPLICATE KEY UPDATE level = 1, conflict = 2
                                                      
-- produces ERROR 1110 (42000): COLUMN 'level' specified twice

-- domas just supplied me with a much smaller test case:

insert into ma select a from ma on duplicate key update a=a;

-- produces ERROR 1052 (23000): Column 'a' in field list is ambiguous

-- the following produces an error in 4.1, but not in 5.0; 
-- it doen NOT however express the desired logic, since ma.a is the 
-- same as VALUES(a), i.e. a value from the row pending insertion, 
-- not from teh row that is already there.

insert into ma select a from ma x on duplicate key update a=values(a)+ma.a;

Suggested fix:
Inside an ON DUPLICATE KEY UPDATE clause, only the fields of the existing row (without prefix) and the ones from the (conflicting) row to be inserted (using VALUES(x)) should be visible. They refer to the fields given in the field list of the INSERT clause. No conflicts are possible.

If access to values from the select are supposed to be visible in the UPDATE clause after all (for some strange reason), then a form like CURRENT(x) should be introduced to address fields of the existing row unambiguously.
[8 Jul 2006 17:52] Daniel Kinzler
I just found a workaround: wrap the select in another select to rename the fields (just aliasing does not work) AND do not specify a field list for the INSERT. Then it works. Here's the code:

INSERT INTO meta2_neighbours (frommeaning, tomeaning, conflict, level, via)
                              SELECT f,t,c,l,v FROM ( SELECT newmeaning as f,
                                     tomeaning as t,
                                     conflict as c,
                                     level as l,
                              FROM meta2_neighbours
                              JOIN meta2_tmp_alias ON oldmeaning =  frommeaning ) as X
                              ON DUPLICATE KEY UPDATE level = IF(level < VALUES(level), level, VALUES(level)),
                                                      conflict = IF(conflict > VALUES(conflict), conflict, VALUES(conflict))

Or, using the more terse example:

insert into ma select x from (select a as x from ma) as Z on duplicate key update a=a;
[14 Jul 2006 12:44] Valeriy Kravchuk
With 5.0.25-BK I've got:

openxs@suse:~/dbs/5.0> bin/mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 3 to server version: 5.0.25

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> CREATE TABLE `meta2_tmp_alias` (
    ->   `oldmeaning` int(12) NOT NULL default '0',
    ->   `newmeaning` int(12) NOT NULL default '0'
    -> ) ENGINE=InnoDB DEFAULT CHARSET=binary ;
Query OK, 0 rows affected (0.10 sec)

mysql> CREATE TABLE `meta2_neighbours` (
    ->   `frommeaning` int(12) NOT NULL default '0',
    ->   `tomeaning` int(12) NOT NULL default '0',
    ->   `conflict` tinyint(4) NOT NULL default '0',
    ->   `level` tinyint(4) NOT NULL default '1',
    ->   `via` varbinary(255) default NULL,
    ->   UNIQUE KEY `fromto` (`frommeaning`,`tomeaning`),
    ->   KEY `src` (`frommeaning`,`conflict`),
    ->   KEY `tgt` (`tomeaning`,`conflict`),
    ->   KEY `conflict` (`conflict`),
    ->   KEY `level` (`level`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=binary;
Query OK, 0 rows affected (0.05 sec)

mysql> INSERT INTO meta2_neighbours (frommeaning, tomeaning, conflict, level)
    ->                               SELECT newmeaning as frommeaning,
    ->                                      tomeaning,
    ->                                      conflict,
    ->                                      level
    ->                               FROM meta2_neighbours
    ->                               JOIN meta2_tmp_alias ON oldmeaning =  from
meaning
    ->                               ON DUPLICATE KEY UPDATE level = IF(level <
    -> VALUES(level), level, VALUES(level)),
    ->                                                       conflict = IF(conf
lict >
    -> VALUES(conflict), conflict, VALUES(conflict));
ERROR 1052 (23000): Column 'level' in field list is ambiguous
mysql> INSERT INTO meta2_neighbours (frommeaning, tomeaning, conflict, level)
    ->                               SELECT f, t, c, l FROM ( SELECT
    ->                                      X.newmeaning as f,
    ->                                      N.tomeaning as t,
    ->                                      N.conflict as c,
    ->                                      N.level as l
    ->                               FROM meta2_neighbours as N
    ->                               JOIN meta2_tmp_alias as X ON oldmeaning =
    -> frommeaning ) as Y
    ->                               ON DUPLICATE KEY UPDATE level = 1, conflic
t = 2;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> INSERT INTO meta2_neighbours (frommeaning, tomeaning, conflict, level)
    ->                               SELECT N.frommeaning as f,
    ->                                      N.tomeaning as t,
    ->                                      N.conflict as c,
    ->                                      N.level as l
    ->                               FROM meta2_neighbours as N
    ->                               WHERE N.frommeaning = 1234
    ->                               ON DUPLICATE KEY UPDATE level = 1, conflic
t = 2;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> create table ma (a int);
Query OK, 0 rows affected (0.02 sec)

mysql> insert into ma select a from ma on duplicate key update a=a;
ERROR 1052 (23000): Column 'a' in field list is ambiguous
mysql> insert into ma select a from ma on duplicate key update a=1;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> insert into ma select a from ma on duplicate key update a=ma.a;
ERROR 1052 (23000): Column 'ma.a' in field list is ambiguous
mysql> insert into ma select a from ma x on duplicate key update a=values(a)+ma
.a;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

While with 4.1.21-BK I've got:

openxs@suse:~/dbs/4.1> bin/mysql -uroot test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 1 to server version: 4.1.21

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> create table ma (a int);
Query OK, 0 rows affected (0.01 sec)

mysql> insert into ma select a from ma x on duplicate key update a=values(a)+ma
.a;
Query OK, 0 rows affected (0.01 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> CREATE TABLE `meta2_tmp_alias` (
    ->   `oldmeaning` int(12) NOT NULL default '0',
    ->   `newmeaning` int(12) NOT NULL default '0'
    -> ) ENGINE=InnoDB DEFAULT CHARSET=binary;
Query OK, 0 rows affected (0.01 sec)

mysql> CREATE TABLE `meta2_neighbours` (
    ->   `frommeaning` int(12) NOT NULL default '0',
    ->   `tomeaning` int(12) NOT NULL default '0',
    ->   `conflict` tinyint(4) NOT NULL default '0',
    ->   `level` tinyint(4) NOT NULL default '1',
    ->   `via` varbinary(255) default NULL,
    ->   UNIQUE KEY `fromto` (`frommeaning`,`tomeaning`),
    ->   KEY `src` (`frommeaning`,`conflict`),
    ->   KEY `tgt` (`tomeaning`,`conflict`),
    ->   KEY `conflict` (`conflict`),
    ->   KEY `level` (`level`)
    -> ) ENGINE=InnoDB DEFAULT CHARSET=binary;
Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO meta2_neighbours (frommeaning, tomeaning, conflict, level)
    ->                               SELECT newmeaning as frommeaning,
    ->                                      tomeaning,
    ->                                      conflict,
    ->                                      level
    ->                               FROM meta2_neighbours
    ->                               JOIN meta2_tmp_alias ON oldmeaning =  from
meaning
    ->                               ON DUPLICATE KEY UPDATE level = IF(level <
    -> VALUES(level), level, VALUES(level)),
    ->                                                       conflict = IF(conf
lict >
    -> VALUES(conflict), conflict, VALUES(conflict));
ERROR 1052 (23000): Column 'level' in field list is ambiguous
mysql> INSERT INTO meta2_neighbours (frommeaning, tomeaning, conflict, level)
    ->                               SELECT f, t, c, l FROM ( SELECT
    ->                                      X.newmeaning as f,
    ->                                      N.tomeaning as t,
    ->                                      N.conflict as c,
    ->                                      N.level as l
    ->                               FROM meta2_neighbours as N
    ->                               JOIN meta2_tmp_alias as X ON oldmeaning =
    -> frommeaning ) as Y
    ->                               ON DUPLICATE KEY UPDATE level = 1, conflic
t = 2;
Query OK, 0 rows affected (0.00 sec)
Records: 0  Duplicates: 0  Warnings: 0

So, both 5.0-BK and 4.1-BK are consistent now. I think, this is not a bug: we have a way to write the statement. It just have to be documented explicitely.
[30 Oct 2006 19:08] Paul DuBois
Thank you for your bug report. This issue has been addressed in the documentation. The updated documentation will appear on our website shortly, and will be included in the next release of the relevant products.

I'll add a note to:

http://dev.mysql.com/doc/refman/5.0/en/insert-select.html

pointing out the need to use table and column aliases in the
SELECT part to avoid ambiguous column errors.