Description:
INSERT [IGNORE] INTO ... SELECT ... FROM ... GROUP BY works OK.
INSERT INTO ... SELECT ... FROM ... ON DUPLICATE KEY UPDATE ... works OK, too
But combining GROUP BY with ON DUPLICATE KEY UPDATE does not work, because aggregation columns are not visible to the outer UPDATE part.
How to repeat:
> desc a;
+-------+---------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| id | int(11) | NO | PRI | 0 | |
| val | int(11) | YES | | NULL | |
+-------+---------+------+-----+---------+-------+
> select * from a;
+----+------+
| id | val |
+----+------+
| 1 | 10 |
+----+------+
1 row in set (0.00 sec)
> select * from b;
+----+------+
| id | val |
+----+------+
| 1 | 100 |
| 1 | 110 |
| 2 | 200 |
+----+------+
3 rows in set (0.00 sec)
-- SIMPLE INSERT IGNORE WORKS:
> insert ignore into a select id, sum( val ) sum_val from b group by id;
Query OK, 1 row affected (0.00 sec)
Records: 2 Duplicates: 1 Warnings: 0
-- SAME SUBSELECT DOES NOT WORK WITH "ON DUPLICATE KEY UPDATE":
> insert into a select id, sum( val ) sum_val from b group by id on duplicate key update val = val + sum_val;
ERROR 1054 (42S22): Unknown column 'sum_val' in 'field list'
> insert into a select id, sum( val ) sum_val from b group by id on duplicate key update a.val = a.val + sum( val );
ERROR 1111 (HY000): Invalid use of group function
> insert into a select id, sum( val ) sum_val from b group by id on duplicate key update a.val = a.val + sum( b.val );
ERROR 1111 (HY000): Invalid use of group function
*I found two workarounds for this issue which are not quite satisfactory:*
-- WORKAROUND 1 -> VIEW:
> create view b_view as select id, sum( val ) sum_val from b group by id;
Query OK, 0 rows affected (0.07 sec)
> insert into a select id, sum_val from b_view on duplicate key update val = val + sum_val;
Query OK, 3 rows affected (0.00 sec)
Records: 2 Duplicates: 1 Warnings: 0
> select * from a;
+----+------+
| id | val |
+----+------+
| 1 | 220 |
| 2 | 200 |
+----+------+
-- WORKAROUND 2 -> VARIABLES:
> insert into a select id, @sum := sum( val ) from b group by id on duplicate key update val = val + @sum;
Query OK, 3 rows affected (0.00 sec)
Records: 2 Duplicates: 1 Warnings: 0
Suggested fix:
The aggregation column of the inner SELECT should be visible to the ON DUPLICATE KEY UPDATE part of the statement.