Bug #18360 SELECT with Aggregate (i.e., COUNT) and IN returns Wrong Results
Submitted: 20 Mar 2006 17:21 Modified: 20 Oct 2006 17:28
Reporter: Chris Calender Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1 OS:Any (All)
Assigned to: Evgeny Potemkin CPU Architecture:Any

[20 Mar 2006 17:21] Chris Calender
Description:
SELECT with Aggregate (i.e., COUNT) and IN returns Wrong Results when you quote some strings and not others in the IN clause.

As you'll see when repeating it, the first two SELECT statements return the correct results.  However, the third query returns the correct rows in the table for id1 when the value is an integer (i.e., 4 and 5).  However, when id1 is a character (i.e., 'z') and it is mixed with integers that are not enclosed by quotes, then it returns all results for any non-integer values, and the correct results for the integer values.

I have tested this on 4.1.18 and 5.0.18 and received the same results on both versions.

How to repeat:
create table t1 (id1 char(1), id2 int);
insert into t1 (id1, id2) values ('4', 16), ('5', 17), ('z', 18), ('a', 19);

select id1, count(*) from t1 where id1 in ('4', '5', 'z') group by id1;
select id1, count(*) from t1 where id1 in (4,5) group by id1;
select id1, count(*) from t1 where id1 in (4,5,'z') group by id1;
[21 Mar 2006 13:56] MySQL Verification Team
Thank you for the bug report.

mysql> select id1, count(*) from t1 where id1 in (4,5,'z') group by id1;
+------+----------+
| id1  | count(*) |
+------+----------+
| 4    | 1        |
| 5    | 1        |
| a    | 1        |
| z    | 1        |
+------+----------+
4 rows in set, 1 warning (0.01 sec)

mysql> 
mysql> select version();
+--------------+
| version()    |
+--------------+
| 5.0.20-debug |
+--------------+
1 row in set (0.00 sec)

mysql> select id1, count(*) from t1 where id1 in (4,5,'z') group by id1;
+------+----------+
| id1  | count(*) |
+------+----------+
| 4    |        1 |
| 5    |        1 |
| z    |        1 |
+------+----------+
3 rows in set (0.00 sec)

mysql> 
mysql> select version();
+------------------+
| version()        |
+------------------+
| 4.0.27-debug-log |
+------------------+
1 row in set (0.00 sec)
[21 Mar 2006 13:58] MySQL Verification Team
Forgot to add 5.1:

mysql> select id1, count(*) from t1 where id1 in (4,5,'z') group by id1;
+------+----------+
| id1  | count(*) |
+------+----------+
| 4    | 1        |
| 5    | 1        |
| a    | 1        |
| z    | 1        |
+------+----------+
4 rows in set, 1 warning (0.01 sec)

mysql> select version();
+------------------+
| version()        |
+------------------+
| 5.1.8-beta-debug |
+------------------+
1 row in set (0.00 sec)
[28 Apr 2006 10:07] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/5679
[3 May 2006 12:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/5874
[6 May 2006 17:44] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/6059
[15 May 2006 19:55] Evgeny Potemkin
The fix for this bug is a part of fix for DATE/TIME functions/fields comparison flaw.
Another part is the fix for bug#16377. These parts should be commited together.
This will be done when both of them will be approved.
[29 May 2006 19:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/6990
[29 May 2006 20:36] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/6991
[19 Jun 2006 0:12] Evgeny Potemkin
Fixed in 4.1.21, 5.0.23, 5.1.12
[19 Jun 2006 22:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7883
[20 Jun 2006 10:54] Evgeny Potemkin
The patch is reverted because it breaks current comparison rules.
[20 Jun 2006 18:44] Michael Widenius
Current patch needs to be removed from 4.1 and 5.0.  Fix needs to be reworked in 5.1 so that result does not depend on values beeing const or not.
(Have discussed with Evgeny how to do this)
[20 Jun 2006 19:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/7957
[9 Aug 2006 20:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/10221

ChangeSet@1.2247, 2006-08-10 00:01:54+04:00, evgen@moonbone.local +4 -0
  Fixed bug #18360: Type aggregation for IN and CASE may lead to a wrong
  result
  
  The IN function aggregates result types of all expressions. It uses that 
  type in comparison of left expression and expressions in right part. 
  This approach works in most cases. But let's consider the case when the
  right part contains both strings and integers. In that case this approach may
  cause wrong results because all strings which do not start with a digit are
  evaluated as 0.
  CASE uses the same approach when a CASE expression is given thus it's also
  affected.
  
  The idea behind this fix is to make IN function to compare expressions with
  different result types differently. For example a string in the left
  part will be compared as string with strings specified in right part and
  will be converted to real for comparison to int or real items in the right
  part.
  
  The agg_cmp_type() function now aggregates the type of the left expression and 
  each expression in the right part individually. It collects different
  result types found. It also can be forced to aggregate collected result types
  to preserve compatibility.
  
  The cmp_item class now has the flag value_added. It indicates that a value is
  already stored.
  
  The Item_func_in class now can refer up to 5 cmp_item objects: 1 for each
  result type for comparison purposes. cmp_item objects are allocated according
  to found result types. The comparison of the left expression with any
  right part expression is now based only on result types of these expressions.
  
  The Item_func_case class is modified in the similar way when a CASE
  expression is specified. Now it can allocate up to 5 cmp_item objects
  to compare CASE expression with WHEN expressions of different types.
  The comparison of the CASE expression with any WHEN expression now based only 
  on result types of these expressions.
[13 Sep 2006 20:05] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/11887

ChangeSet@1.2247, 2006-09-14 00:02:03+04:00, evgen@moonbone.local +8 -0
  Fixed bug #18360: Type aggregation for IN and CASE may lead to a wrong
  result
  
  The IN function aggregates result types of all expressions. It uses that 
  type in comparison of left expression and expressions in right part. 
  This approach works in most cases. But let's consider the case when the
  right part contains both strings and integers. In that case this approach may
  cause wrong results because all strings which do not start with a digit are
  evaluated as 0.
  CASE uses the same approach when a CASE expression is given thus it's also
  affected.
  
  The idea behind this fix is to make IN function to compare expressions with
  different result types differently. For example a string in the left
  part will be compared as string with strings specified in right part and
  will be converted to real for comparison to int or real items in the right
  part.
  
  A new function called collect_cmp_types() is added. It collects different
  result types for comparison of first item in the provided list with each 
  other item in the list. 
  
  The Item_func_in class now can refer up to 5 cmp_item objects: 1 for each
  result type for comparison purposes. cmp_item objects are allocated according
  to found result types. The comparison of the left expression with any
  right part expression is now based only on result types of these expressions.
  
  The Item_func_case class is modified in the similar way when a CASE
  expression is specified. Now it can allocate up to 5 cmp_item objects
  to compare CASE expression with WHEN expressions of different types.
  The comparison of the CASE expression with any WHEN expression now based only 
  on result types of these expressions.
[18 Sep 2006 19:34] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/12167

ChangeSet@1.2247, 2006-09-18 23:34:39+04:00, evgen@moonbone.local +9 -0
  Fixed bug #18360: Type aggregation for IN and CASE may lead to a wrong
  result
  
  The IN function aggregates result types of all expressions. It uses that 
  type in comparison of left expression and expressions in right part. 
  This approach works in most cases. But let's consider the case when the
  right part contains both strings and integers. In that case this approach may
  cause wrong results because all strings which do not start with a digit are
  evaluated as 0.
  CASE uses the same approach when a CASE expression is given thus it's also
  affected.
  
  The idea behind this fix is to make IN function to compare expressions with
  different result types differently. For example a string in the left
  part will be compared as string with strings specified in right part and
  will be converted to real for comparison to int or real items in the right
  part.
  
  A new function called collect_cmp_types() is added. It collects different
  result types for comparison of first item in the provided list with each 
  other item in the list. 
  
  The Item_func_in class now can refer up to 5 cmp_item objects: 1 for each
  result type for comparison purposes. cmp_item objects are allocated according
  to found result types. The comparison of the left expression with any
  right part expression is now based only on result types of these expressions.
  
  The Item_func_case class is modified in the similar way when a CASE
  expression is specified. Now it can allocate up to 5 cmp_item objects
  to compare CASE expression with WHEN expressions of different types.
  The comparison of the CASE expression with any WHEN expression now based only 
  on result types of these expressions.
[25 Sep 2006 15:52] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/12485

ChangeSet@1.2247, 2006-09-25 00:27:46+04:00, evgen@moonbone.local +10 -0
  Fixed bug #18360: Type aggregation for IN and CASE may lead to a wrong
  result
  
  The IN function aggregates result types of all expressions. It uses that 
  type in comparison of left expression and expressions in right part. 
  This approach works in most cases. But let's consider the case when the
  right part contains both strings and integers. In that case this approach may
  cause wrong results because all strings which do not start with a digit are
  evaluated as 0.
  CASE uses the same approach when a CASE expression is given thus it's also
  affected.
  
  The idea behind this fix is to make IN function to compare expressions with
  different result types differently. For example a string in the left
  part will be compared as string with strings specified in right part and
  will be converted to real for comparison to int or real items in the right
  part.
  
  A new function called collect_cmp_types() is added. It collects different
  result types for comparison of first item in the provided list with each 
  other item in the list. 
  
  The Item_func_in class now can refer up to 5 cmp_item objects: 1 for each
  result type for comparison purposes. cmp_item objects are allocated according
  to found result types. The comparison of the left expression with any
  right part expression is now based only on result types of these expressions.
  
  The Item_func_case class is modified in the similar way when a CASE
  expression is specified. Now it can allocate up to 5 cmp_item objects
  to compare CASE expression with WHEN expressions of different types.
  The comparison of the CASE expression with any WHEN expression now based only 
  on result types of these expressions.
[1 Oct 2006 9:01] Georgi Kodinov
Pushed in 5.1.12
[20 Oct 2006 17:28] Paul DuBois
Noted in 5.1.12 changelog.