Bug #16105 Strange result on REPLACE statement with aggregate SELECT
Submitted: 30 Dec 2005 17:51 Modified: 14 Jul 2006 16:06
Reporter: Nathan Tanner Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.0.16 OS:Linux (Linux)
Assigned to: Georgi Kodinov CPU Architecture:Any

[30 Dec 2005 17:51] Nathan Tanner
Description:
Using REPLACE with an aggregate SELECT statement on a VARCHAR primary key produces unexpected results and warnings.

In the "How to repeat" queries, the SELECT statement returns:

mysql> SELECT IFNULL(ID,'4'), IFNULL(Field1+1,1) FROM test WHERE ID='4' HAVING COUNT(*) IN (0,1);

+----------------+--------------------+
| IFNULL(ID,'4') | IFNULL(Field1+1,1) |
+----------------+--------------------+
| 4              |                  1 |
+----------------+--------------------+
1 row in set (0.00 sec)

Yet the REPLACE using that same SELECT returns:

Query OK, 1 row affected, 2 warnings (0.02 sec)
Records: 1  Duplicates: 0  Warnings: 2

SHOW WARNINGS returns:

| Warning | 1263 | Column set to default value; NULL supplied to NOT NULL column 'ID' at row 0     |
| Warning | 1263 | Column set to default value; NULL supplied to NOT NULL column 'Field1' at row 0 |

And the 'test' table ends up containing:

mysql> select * from test;
+----+--------+
| ID | Field1 |
+----+--------+
| 1  |      1 |
| 2  |      1 |
| 3  |      1 |
|    |      0 |
+----+--------+

When I would expect it to contain:

mysql> select * from test;
+----+--------+
| ID | Field1 |
+----+--------+
| 1  |      1 |
| 2  |      1 |
| 3  |      1 |
| 4  |      1 |
+----+--------+

I have tried this same test case with ID as a CHAR and INT, and it works correctly with those data types, so it must be related to the VARCHAR type. Also of note is that if the table starts out empty the test case produces correct results. Once the table has a couple entries, it begins failing.

How to repeat:
CREATE TABLE `test` (
  `ID` varchar(10) NOT NULL default '0',
  `Field1` int(11) NOT NULL,
  PRIMARY KEY  (`ID`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

INSERT INTO test VALUES ('1',1),('2',1),('3',1);

SELECT IFNULL(ID,'4'), IFNULL(Field1+1,1) FROM test WHERE ID='4' HAVING COUNT(*) IN (0,1);

REPLACE INTO test SELECT IFNULL(ID,'4'), IFNULL(Field1+1,1) FROM test WHERE ID='4' HAVING COUNT(*) IN (0,1);

Suggested fix:
Take steps to ensure that data returned by a SELECT statement can be used correctly in a REPLACE statement.

The work-around in this instance is to change the ID field to CHAR type instead of VARCHAR.
[30 Dec 2005 18:23] Valeriy Kravchuk
Thank you for a bug report. Verified just as described on 5.0.19-BK (ChangeSet@1.2003.3.1, 2005-12-28 20:30:57+01:00) on Linux:

mysql> CREATE TABLE `test` (
    ->   `ID` varchar(10) NOT NULL default '0',
    ->   `Field1` int(11) NOT NULL,
    ->   PRIMARY KEY  (`ID`)
    -> ) ENGINE=MyISAM DEFAULT CHARSET=latin1;

Query OK, 0 rows affected (0.01 sec)

mysql> INSERT INTO test VALUES ('1',1),('2',1),('3',1);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT IFNULL(ID,'4'), IFNULL(Field1+1,1) FROM test WHERE ID='4' HAVING COUNT(*)
    -> IN (0,1);

+----------------+--------------------+
| IFNULL(ID,'4') | IFNULL(Field1+1,1) |
+----------------+--------------------+
| 4              |                  1 |
+----------------+--------------------+
1 row in set (0.02 sec)

mysql> REPLACE INTO test SELECT IFNULL(ID,'4'), IFNULL(Field1+1,1) FROM test WHERE
    -> ID='4' HAVING COUNT(*) IN (0,1);
Query OK, 1 row affected, 2 warnings (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 2

mysql> show warnings;
+---------+------+---------------------------------------------------------------------------------+
| Level   | Code | Message                   |
+---------+------+---------------------------------------------------------------------------------+
| Warning | 1263 | Column set to default value; NULL supplied to NOT NULL column 'ID' at row 0     |
| Warning | 1263 | Column set to default value; NULL supplied to NOT NULL column 'Field1' at row 0 |
+---------+------+---------------------------------------------------------------------------------+
2 rows in set (0.00 sec)

mysql> select * from test;
+----+--------+
| ID | Field1 |
+----+--------+
| 1  |      1 |
| 2  |      1 |
| 3  |      1 |
|    |      0 |
+----+--------+
4 rows in set (0.00 sec)

mysql> delete from test where Field1=0;
Query OK, 1 row affected (0.00 sec)

mysql> alter table test modify `ID` char(10) NOT NULL default '0';
Query OK, 3 rows affected (0.03 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> SELECT IFNULL(ID,'4'), IFNULL(Field1+1,1) FROM test WHERE ID='4' HAVING COUNT(*) IN (0,1);
+----------------+--------------------+
| IFNULL(ID,'4') | IFNULL(Field1+1,1) |
+----------------+--------------------+
| 4              |                  1 |
+----------------+--------------------+
1 row in set (0.00 sec)

mysql> REPLACE INTO test SELECT IFNULL(ID,'4'), IFNULL(Field1+1,1) FROM test WHERE ID='4' HAVING COUNT(*) IN (0,1);
Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

mysql> select * from test;
+----+--------+
| ID | Field1 |
+----+--------+
| 1  |      1 |
| 2  |      1 |
| 3  |      1 |
| 4  |      1 |
+----+--------+
4 rows in set (0.01 sec)

mysql> select version();
+-----------+
| version() |
+-----------+
| 5.0.19    |
+-----------+
1 row in set (0.00 sec)

This difference should be explained somehow...
[21 Jun 2006 15:44] Georgi Kodinov
Thank you for taking the time to write to us, but this is not
a bug. Please double-check the documentation available at
http://www.mysql.com/documentation/ and the instructions on
how to report a bug at http://bugs.mysql.com/how-to-report.php

Additional info:

This behavior a consequence of the combination of following factors :
1. You are inserting into the same table from witch you are selecting. This causes the compiler to create a temporary table for the SELECT part of REPLACE ... SELECT.
So basically the SELECT part of REPLACE ... SELECT can be expressed as a standalone SELECT statement by adding the SQL_BUFFER_RESULT option :
SELECT SQL_BUFFER_RESULT IFNULL(ID,'4'), IFNULL(Field1+1,1) FROM test WHERE
ID='4' HAVING COUNT(*) IN (0,1);

2. According to the Reference manual (section 13.2.7) :
"The HAVING clause is applied nearly last, just before items are sent to the client, with no optimization. (LIMIT is applied after HAVING.)". Evidently HAVING is applied applied after reading from the temp table. However the IFNULL() function calls are called when filling up the temp table, so they never fire and you get NULL values in the result set. 
The IFNULL functions work in the standalone SELECT scenario because the application of HAVING happens to coincide with the calculation of the SELECT list columns.
[21 Jun 2006 15:57] Nathan Tanner
My only objection to this not being classified as a bug is the fact that the behaviour changes merely by changing the ID field to type CHAR rather than VARCHAR... if this is not a bug, and is document-supported normal behaviour, you would expect that it should behave the same whether the ID field is CHAR or VARCHAR.
[22 Jun 2006 10:50] Georgi Kodinov
The reason of behaviour change is evident from the explain(s) :
with a varchar column:
+----+-------------+-------+------+---------------+---------+---------+-------+------+------------------------------+
| id | select_type | table | type | possible_keys | key     | key_len | ref   | rows | Extra                        |
+----+-------------+-------+------+---------------+---------+---------+-------+------+------------------------------+
|  1 | SIMPLE      | test  | ref  | PRIMARY       | PRIMARY | 12      | const |    1 | Using where; Using temporary |
+----+-------------+-------+------+---------------+---------+---------+-------+------+------------------------------+
1 row in set (10 min 8.20 sec)

with char column:
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
| id | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra               |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+
|  1 | SIMPLE      | NULL  | NULL | NULL          | NULL | NULL    | NULL | NULL | Impossible WHERE noticed after reading const tables |
+----+-------------+-------+------+---------------+------+---------+------+------+-----------------------------------------------------+

This comes from equality with constant value compilation optimization firing and removing the temp table. This optimization is not applicable in both cases because of how trailing spaces are handled in VARCHAR and CHAR columns in indexed comparisons. The VARCHAR has to compensate for the trailing spaces and one value may have many matches in the key (example : 'a' matches 'a ', 'a  ' etc) so it can't really tell if one value will match 1 or more rows even if the key is unique, whereas CHAR compares like binary.
[22 Jun 2006 14:23] Nathan Tanner
Thank you, that makes sense now.
[14 Jul 2006 16:06] Paul DuBois
Added note requested by Georgi to INSERT ... SELECT section.