Bug #63470 Weird things happen with MAXMIN on BIT types
Submitted: 29 Nov 2011 9:43 Modified: 1 Dec 2011 8:39
Reporter: Peter Laursen (Basic Quality Contributor) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Data Types Severity:S3 (Non-critical)
Version:5.1.58, 5.5.18 OS:Any
Assigned to: CPU Architecture:Any
Tags: qc

[29 Nov 2011 9:43] Peter Laursen
Description:
Using MAX/MIN on a BIT changes datatype or value?

How to repeat:
CREATE TABLE `test`.`TableName1`(
   `id` INT NOT NULL AUTO_INCREMENT ,
   `b` BIT ,
   PRIMARY KEY (`id`)
  );

INSERT INTO bittest (b) VALUES (b'1'), (b'0');

-- Now query the table: 
-- In different clients it looks different (wheter the client is textbased or not - whether it has some 'smart' client side handling of BIT added or not)

mysql> SELECT b FROM bittest;
+------+
| b    |
+------+
| ☺    |
|      |
+------+
2 rows in set (0.00 sec)

mysql> SELECT MAX(b) FROM bittest;
+--------+
| MAX(b) |
+--------+
| 1      |
+--------+
1 row in set (0.00 sec)

Suggested fix:
(I am perfectly aware of this http://dev.mysql.com/doc/refman/5.5/en/bit-field-values.html "To display them in printable form, add 0 or use a conversion function such as BIN()" - but this is not the point here).

I am not sure exactly what happens so I cannot advice.  But I'd expect that the client returned one of the same values for the agggregate as for the non-aggregated statement. 

SELECT MAX(b) FROM bittest;
.. returns what I only think
SELECT MAX(b) + 0 FROM bittest;
.. etc. should do

Also see: http://bugs.mysql.com/bug.php?id=63457
[29 Nov 2011 9:58] Peter Laursen
SELECT HEX(b) FROM bittest; 
/*
HEX(b)
------
1     
0     
*/

SELECT HEX(MAX(b)) FROM bittest; 
/* hex(MAX(b))
-----------
1  
*/   

.. so it looks it is the meta-information about the data type in the result set that changes with MAX().
[29 Nov 2011 10:51] Peter Laursen
Updated version (5.1.18 was wrong)
[29 Nov 2011 12:06] Peter Laursen
HAHA!

mysql> select avg(b) from bittest;
+--------+
| avg(b) |
+--------+
| 0.5000 |
+--------+
1 row in set (0.00 sec)

What is this?  AVG will cast values to DOUBLE before doing the calculation? Could make sense. But how does BIT b'1' become DOUBLE '1'? If there is a cast to DOUBLE I think the result for both columns should be '0'. Just like when the string constant 'a' is cast.
[29 Nov 2011 12:26] Peter Laursen
compare with strings:

ALTER TABLE `test`.`bittest` 
   ADD COLUMN `t` CHAR(1) NULL AFTER `b`;
INSERT INTO bittest (t) VALUES ('a'), ('b');
SELECT AVG(t) FROM bittest; -- 0
-- what I think also "SELECT AVG(b) FROM bittest;" should return.
[29 Nov 2011 12:42] Peter Laursen
One more:

CREATE TABLE `test`.`bittest2`(
   `id` INT NOT NULL AUTO_INCREMENT ,
   `b` BIT(2) ,
   PRIMARY KEY (`id`)
  );

INSERT INTO bittest2 (b)  VALUES (b'11'), (b'00');

SELECT MAX(b) FROM bittest2; -- 3 (in command line client)
SELECT AVG(b) FROM bittest2; -- 1.5 (the average of 0 and 3)
[29 Nov 2011 19:25] Valeriy Kravchuk
Well, I do see some problem with metadata also:

C:\Program Files\MySQL\MySQL Server 5.5\bin>mysql -uroot -proot -P3310 --column-
type-info test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.1.58-community-log MySQL Community Server (GPL)

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> select b from `TableName1`;
Field   1:  `b`
Catalog:    `def`
Database:   `test`
Table:      `TableName1`
Org_table:  `tablename1`
Type:       BIT
Collation:  binary (63)
Length:     1
Max_length: 1
Decimals:   0
Flags:      UNSIGNED

+------+
| b    |
+------+
| ☺    |
|      |
+------+
2 rows in set (0.00 sec)

mysql> select max(b) from `TableName1`;
Field   1:  `max(b)`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       BIT
Collation:  binary (63)
Length:     1
Max_length: 1
Decimals:   0
Flags:      UNSIGNED BINARY

+--------+
| max(b) |
+--------+
| 1      |
+--------+
1 row in set (0.00 sec)

I would expect metadata for b and max(b) (other than column name) to be exactly the same in this case.
[29 Nov 2011 20:08] Peter Laursen
1) I have tried a few different clients and they all come out differently on MAX(bit-values).  But no-one returns the same as for the BIT itself.  So my guess is that metadata inconsistence confuse the clients.

2)
Another problem is the behavior with AVG(bit-values). The average is calculated like you would on base-10 numbers and returns a base-10 number (that may even be fractional). I think it should either return an error (disallow AVG on BIT) or a BIT even if the 'modulus' of the base-2 division should have to been thrown away (like "3 DIV 2 = 1 (with a modulus of 1)" ).  If current behaviour is kept it should be documented that "with the AVG agg. function the vales are converted to base-10 numbers before division"

Quesstion is: is this a new report or shall we wait for a conclusion?

(COUNT(bit-value) is fine after all!)
[1 Dec 2011 8:39] Peter Laursen
Maybe the comment I posted here is the underlying reason:
http://bugs.mysql.com/bug.php?id=63505

"I think I understand that *BIT* applies for storage in a table, but not when loaded into to memory. In memory it will need to be handled as a BINARY, a CHAR or a numerical of some kind"

That could explain that MAX|MIN aggregates on a BIT will need to be delievered as an integer (comparison only possible in memory whre BIT does not apply).  But in that case metadata should clearly identify them as integers.