Bug #29717 INSERT INTO SELECT inserts values even if SELECT statement itself returns empty
Submitted: 11 Jul 2007 9:58 Modified: 3 Aug 2007 16:59
Reporter: Sveta Smirnova Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server Severity:S2 (Serious)
Version:4.1, 5.0, 5.1 BK OS:Any (Linux)
Assigned to: Alexey Botchkov CPU Architecture:Any
Tags: bfsm_2007_07_19

[11 Jul 2007 9:58] Sveta Smirnova
Description:
For some data INSERT INTO SELECT inserts values even if SELECT statement itself returns empty

How to repeat:
Use attached test case
[11 Jul 2007 10:00] Sveta Smirnova
test case

Attachment: bug29717.test (application/octet-stream, text), 4.22 KiB.

[25 Jul 2007 15:14] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31546

ChangeSet@1.2677, 2007-07-25 19:13:02+05:00, holyfoot@mysql.com +6 -0
  Bug #29717 INSERT INTO SELECT inserts values even if SELECT statement itself returns empty.
  
  We use different way to handle this SELECT in the INSERT INTO as we insert
  in the same table what we use in the select. So we use temporary table,
  and as we have suitable index, we can calculate group values at once
  and store them in the temporary table. At the same time the table
  whose field is in the GROUP BY contains just a single row, so optimizer
  decides to remove the group list at all.
  Still SELECT min(x) from empty_table; and
        SELECT min(x) from empty_table GROUP BY y; have to return different
  results - first query should return the single (NULL) row, second -
  an empty recordset.
  So this fix remembers the case when GROUP BY existed and was removed
  by optimizer and suppress the (NULL) row if that was the case.
[27 Jul 2007 12:48] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31695

ChangeSet@1.2677, 2007-07-27 16:46:47+05:00, holyfoot@mysql.com +4 -0
  Bug #29717 INSERT INTO SELECT inserts values even if
   SELECT statement itself returns empty.
  
  As a result of this bug some SELECT ... GROUP BY queries
  return (NULL) row instead of empty recordset.
  
  Ultimately failure happens in end_send_group() where we decide to
  return the (NULL) row as JOIN::group is FALSE
  
  We use different way to handle this SELECT in the INSERT INTO as we insert
  in the same table what we use in the select. So we use temporary table,
  and as we have suitable index, we can calculate group values at once
  and store them in the temporary table. At the same time the table
  whose field is in the GROUP BY contains just a single row, so optimizer
  decides to remove the group list at all.
  Still SELECT min(x) from empty_table; and
        SELECT min(x) from empty_table GROUP BY y; have to return different
  results - first query should return the single (NULL) row, second -
  an empty recordset.
  So this fix remembers the case when GROUP BY existed and was removed
  by optimizer and suppress the (NULL) row if that was the case.
[30 Jul 2007 20:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31853

ChangeSet@1.2677, 2007-07-31 00:44:50+05:00, holyfoot@mysql.com +6 -0
  Bug #29717 INSERT INTO SELECT inserts values even if
   SELECT statement itself returns empty.
  
  As a result of this bug 'SELECT AGGREGATE_FUNCTION(fld) ... GROUP BY'
  can return one row instead of empty recordset.
  
  When GROUP BY only has fields of a 'constant' table
  (with a single row), optimizer deletes the group_list.
  After that we lost the information about whether we had an
  GROUP BY statement. Though it's important
  as SELECT min(x) from empty_table; and
     SELECT min(x) from empty_table GROUP BY y; have to return
  different results - first query should return one row,
  second - an empty recordset.
  So here we add the 'group_optimized_away' flag to remember this case
  when GROUP BY exists in the query and was removed
  by optimizer, and check this flag in end_send_group()
[31 Jul 2007 6:30] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31866

ChangeSet@1.2677, 2007-07-31 10:28:50+05:00, holyfoot@mysql.com +6 -0
  Bug #29717 INSERT INTO SELECT inserts values even if
   SELECT statement itself returns empty.
  
  As a result of this bug 'SELECT AGGREGATE_FUNCTION(fld) ... GROUP BY'
  can return one row instead of an empty result set.
  
  When GROUP BY only has fields of constant tables
  (with a single row), the optimizer deletes the group_list.
  After that we lose the information about whether we had an
  GROUP BY statement. Though it's important
  as SELECT min(x) from empty_table; and
     SELECT min(x) from empty_table GROUP BY y; have to return
  different results - the first query should return one row,
  second - an empty result set.
  So here we add the 'group_optimized_away' flag to remember this case
  when GROUP BY exists in the query and is removed
  by the optimizer, and check this flag in end_send_group()
[31 Jul 2007 6:47] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/31868

ChangeSet@1.2677, 2007-07-31 10:46:04+05:00, holyfoot@mysql.com +6 -0
  Bug #29717 INSERT INTO SELECT inserts values even if
   SELECT statement itself returns empty.
  
  As a result of this bug 'SELECT AGGREGATE_FUNCTION(fld) ... GROUP BY'
  can return one row instead of an empty result set.
  
  When GROUP BY only has fields of constant tables
  (with a single row), the optimizer deletes the group_list.
  After that we lose the information about whether we had an
  GROUP BY statement. Though it's important
  as SELECT min(x) from empty_table; and
     SELECT min(x) from empty_table GROUP BY y; have to return
  different results - the first query should return one row,
  second - an empty result set.
  So here we add the 'group_optimized_away' flag to remember this case
  when GROUP BY exists in the query and is removed
  by the optimizer, and check this flag in end_send_group()
[2 Aug 2007 19:13] Bugs System
Pushed into 5.1.21-beta
[2 Aug 2007 19:14] Bugs System
Pushed into 5.0.48
[2 Aug 2007 19:16] Bugs System
Pushed into 4.1.24
[3 Aug 2007 16:59] Paul DuBois
Noted in 4.1.24, 5.0.48, 5.1.21 changelogs.

In some cases, INSERT INTO ... SELECT ... GROUP BY could insert rows
even if the SELECT by itself produced an empty result.