Bug #42832 INFORMATION_SCHEMA.COLUMNS reports wrong CHARACTER_MAXIMUM_LENGTH for SET
Submitted: 13 Feb 2009 13:45 Modified: 13 Feb 2009 14:02
Reporter: Ulf Wendel Email Updates:
Status: Not a Bug Impact on me:
None 
Category:MySQL Server: Information schema Severity:S3 (Non-critical)
Version:5.1.30-debug OS:Linux (OpenSuse 10.3)
Assigned to: CPU Architecture:Any

[13 Feb 2009 13:45] Ulf Wendel
Description:
The I_S reports the wrong maximum number of characters for columns of type SET through INFORMATION_SCHEMA.COLUMNS.CHARACTER_MAXIMUM_LENGTH. It seems a bit like values are multiplied by two or as if SET would use a multi-byte character set regardless of the user settings.

Note the difference between SET and ENUM at "How to repeat". ENUM gives me what I expect to get from SET. Also note that I am using LENGTH in the first two example but I am using CHAR_LENGTH in the third SET w utf8 example. 

As the Connector/J is using the I_S, its likely that Connector/J is affected.

mysql> show variables like "%version%";
+-------------------------+---------------------+
| Variable_name           | Value               |
+-------------------------+---------------------+
| protocol_version        | 10                  |
| version                 | 5.1.30-debug        |
| version_comment         | Source distribution |
| version_compile_machine | x86_64              |
| version_compile_os      | suse-linux-gnu      |
+-------------------------+---------------------+

mysql> show variables like "%character%";
+--------------------------+----------------------------------------+
| Variable_name            | Value                                  |
+--------------------------+----------------------------------------+
| character_set_client     | latin1                                 |
| character_set_connection | latin1                                 |
| character_set_database   | latin1                                 |
| character_set_filesystem | binary                                 |
| character_set_results    | latin1                                 |
| character_set_server     | latin1                                 |
| character_set_system     | utf8                                   |
| character_sets_dir       | /usr/local/mysql/share/mysql/charsets/ |
+--------------------------+----------------------------------------+
8 rows in set (0.00 sec)

How to repeat:
----------- Test with SET('yes', 'no') CHARSET ascii ---------------------

mysql> SELECT VERSION();
+--------------+
| VERSION()    |
+--------------+
| 5.1.30-debug |
+--------------+
1 row in set (0.01 sec)

mysql> DROP TABLE IF EXISTS test.test;
Query OK, 0 rows affected (0.00 sec)

mysql>  CREATE TABLE test.test(col1 SET('yes', 'no')) CHARSET ascii;
Query OK, 0 rows affected (0.01 sec)

mysql> SHOW CREATE TABLE test.test;
+-------+---------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                      |
+-------+---------------------------------------------------------------------------------------------------+
| test  | CREATE TABLE `test` (
  `col1` set('yes','no') DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=ascii |
+-------+---------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> INSERT INTO test.test(col1) VALUES ('yes'), ('no'), (NULL);
Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

ysql> SELECT col1, length(col1) FROM test.test;
+------+--------------+
| col1 | length(col1) |
+------+--------------+
| yes  |            3 |
| no   |            2 |
| NULL |         NULL |
+------+--------------+
3 rows in set (0.00 sec)

mysql> SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, CHARACTER_MAXIMUM_LENGTH, CHARACTER_OCTET_LENGTH FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA LIKE "test" AND TABLE_NAME LIKE "test" AND COLUMN_NAME LIKE "col1";
+--------------+------------+-------------+--------------------------+------------------------+
| TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME | CHARACTER_MAXIMUM_LENGTH | CHARACTER_OCTET_LENGTH |
+--------------+------------+-------------+--------------------------+------------------------+
| test         | test       | col1        |                        6 |                      6 |
+--------------+------------+-------------+--------------------------+------------------------+
1 row in set (0.00 sec)

----------- Test with SET('yes', 'no') CHARSET ucs2 ---------------------

mysql> SELECT VERSION(); DROP TABLE IF EXISTS test.test; CREATE TABLE test.test(col1 SET('yes', 'no')) CHARSET ucs2; INSERT INTO test.test(col1) VALUES ('yes'), ('no'), (NULL); SELECT col1, length(col1) FROM test.test; SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, CHARACTER_MAXIMUM_LENGTH, CHARACTER_OCTET_LENGTH FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA LIKE "test" AND TABLE_NAME LIKE "test" AND COLUMN_NAME LIKE "col1";
+--------------+
| VERSION()    |
+--------------+
| 5.1.30-debug |
+--------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.04 sec)

Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

+------+--------------+
| col1 | length(col1) |
+------+--------------+
| yes  |            6 |
| no   |            4 |
| NULL |         NULL |
+------+--------------+
3 rows in set (0.00 sec)

+--------------+------------+-------------+--------------------------+------------------------+
| TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME | CHARACTER_MAXIMUM_LENGTH | CHARACTER_OCTET_LENGTH |
+--------------+------------+-------------+--------------------------+------------------------+
| test         | test       | col1        |                        6 |                     12 |
+--------------+------------+-------------+--------------------------+------------------------+
1 row in set (0.01 sec)

-------------- SET with utf8 -------------------

mysql> SELECT VERSION(); DROP TABLE IF EXISTS test.test; CREATE TABLE test.test(col1 ENUM('yes', 'no')) CHARSET utf8; INSERT INTO test.test(col1) VALUES ('yes'), ('no'), (NULL); SELECT col1, char_length(col1) FROM test.test; SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, CHARACTER_MAXIMUM_LENGTH, CHARACTER_OCTET_LENGTH FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA LIKE "test" AND TABLE_NAME LIKE "test" AND COLUMN_NAME LIKE "col1";
+--------------+
| VERSION()    |
+--------------+
| 5.1.30-debug |
+--------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.01 sec)

Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

+------+-------------------+
| col1 | char_length(col1) |
+------+-------------------+
| yes  |                 3 |
| no   |                 2 |
| NULL |              NULL |
+------+-------------------+
3 rows in set (0.00 sec)

+--------------+------------+-------------+--------------------------+------------------------+
| TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME | CHARACTER_MAXIMUM_LENGTH | CHARACTER_OCTET_LENGTH |
+--------------+------------+-------------+--------------------------+------------------------+
| test         | test       | col1        |                        3 |                      9 |
+--------------+------------+-------------+--------------------------+------------------------+
1 row in set (0.01 sec)

------------- Correct: ENUM -------------------------------------

mysql> SELECT VERSION(); DROP TABLE IF EXISTS test.test; CREATE TABLE test.test(col1 SET('yes', 'no')) CHARSET ucs2; INSERT INTO test.test(col1) VALUES ('yes'), ('no'), (NULL); SELECT col1, length(col1) FROM test.test; SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, CHARACTER_MAXIMUM_LENGTH, CHARACTER_OCTET_LENGTH FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA LIKE "test" AND TABLE_NAME LIKE "test" AND COLUMN_NAME LIKE "col1";
+--------------+
| VERSION()    |
+--------------+
| 5.1.30-debug |
+--------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.04 sec)

Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

+------+--------------+
| col1 | length(col1) |
+------+--------------+
| yes  |            6 |
| no   |            4 |
| NULL |         NULL |
+------+--------------+
3 rows in set (0.00 sec)

+--------------+------------+-------------+--------------------------+------------------------+
| TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME | CHARACTER_MAXIMUM_LENGTH | CHARACTER_OCTET_LENGTH |
+--------------+------------+-------------+--------------------------+------------------------+
| test         | test       | col1        |                        6 |                     12 |
+--------------+------------+-------------+--------------------------+------------------------+
1 row in set (0.01 sec)

mysql> SELECT VERSION(); DROP TABLE IF EXISTS test.test; CREATE TABLE test.test(col1 ENUM('yes', 'no')) CHARSET ascii; INSERT INTO test.test(col1) VALUES ('yes'), ('no'), (NULL); SELECT col1, length(col1) FROM test.test; SELECT TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, CHARACTER_MAXIMUM_LENGTH, CHARACTER_OCTET_LENGTH FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA LIKE "test" AND TABLE_NAME LIKE "test" AND COLUMN_NAME LIKE "col1";
+--------------+
| VERSION()    |
+--------------+
| 5.1.30-debug |
+--------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

Query OK, 0 rows affected (0.01 sec)

Query OK, 3 rows affected (0.00 sec)
Records: 3  Duplicates: 0  Warnings: 0

+------+--------------+
| col1 | length(col1) |
+------+--------------+
| yes  |            3 |
| no   |            2 |
| NULL |         NULL |
+------+--------------+
3 rows in set (0.00 sec)

+--------------+------------+-------------+--------------------------+------------------------+
| TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME | CHARACTER_MAXIMUM_LENGTH | CHARACTER_OCTET_LENGTH |
+--------------+------------+-------------+--------------------------+------------------------+
| test         | test       | col1        |                        3 |                      3 |
+--------------+------------+-------------+--------------------------+------------------------+
1 row in set (0.01 sec)
[13 Feb 2009 13:49] Ulf Wendel
Metadata and I_S seem to differ.

+-------+--------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                     |
+-------+--------------------------------------------------------------------------------------------------+
| test  | CREATE TABLE `test` (
  `col1` set('yes','no') DEFAULT NULL
) ENGINE=MyISAM DEFAULT CHARSET=ucs2 |
+-------+--------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> SELECT col1, length(col1), char_length(col1) FROM test.test;
Field   1:  `col1`
Catalog:    `def`
Database:   `test`
Table:      `test`
Org_table:  `test`
Type:       STRING
Collation:  latin1_swedish_ci (8)
Length:     6
Max_length: 3
Decimals:   0
Flags:      SET

Field   2:  `length(col1)`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     10
Max_length: 1
Decimals:   0
Flags:      BINARY NUM

Field   3:  `char_length(col1)`
Catalog:    `def`
Database:   ``
Table:      ``
Org_table:  ``
Type:       LONGLONG
Collation:  binary (63)
Length:     10
Max_length: 1
Decimals:   0
Flags:      BINARY NUM

+------+--------------+-------------------+
| col1 | length(col1) | char_length(col1) |
+------+--------------+-------------------+
| yes  |            6 |                 3 |
| no   |            4 |                 2 |
| NULL |         NULL |              NULL |
+------+--------------+-------------------+
3 rows in set (0.00 sec)
[13 Feb 2009 14:01] Andrey Hristov
It is very strange. I use the command line to dump the metadata sent from the server. For simplicity character_set_results is latin1. I try different enums:
enum('yes', 'no')
enum('yes, 'no', 'maybe')
enum('yes', 'no, 'probably')

It seems that the length is the combined of all strings inclusive a space (probably a \0) in between. Seems like the length of the definition, not what is the maximal length of a field. Enum has one choice, would have it been a SET, maybe then this would have been correct.

+--------------+
| VERSION()    |
+--------------+
| 5.1.31-debug |
+--------------+
1 row in set (0.00 sec)

DROP TABLE IF EXISTS test.test; CREATE TABLE test.test(col1 SET('yes', 'no')) CHARSET latin1; INSERT INTO test.test(col1) VALUES ('yes'), ('no'), (NULL); SELECT col1, length(col1), char_length(col1) FROM test.test;

Records: 3  Duplicates: 0  Warnings: 0

Field   1:  `col1`
Catalog:    `def`
Database:   `test`
Table:      `test`
Org_table:  `test`
Type:       STRING
Collation:  latin1_swedish_ci (8)
Length:     6
Max_length: 3
Decimals:   0
Flags:      SET

...

+------+--------------+-------------------+
| col1 | length(col1) | char_length(col1) |
+------+--------------+-------------------+
| yes  |            3 |                 3 |
| no   |            2 |                 2 |
| NULL |         NULL |              NULL |
+------+--------------+-------------------+
3 rows in set (0.00 sec)

-----------------------------------------------------------
DROP TABLE IF EXISTS test.test; CREATE TABLE test.test(col1 SET('yes', 'no', 'maybe')) CHARSET latin1; INSERT INTO test.test(col1) VALUES ('yes'), ('no'), (NULL); SELECT col1, length(col1), char_length(col1) FROM test.test;

Field   1:  `col1`
Catalog:    `def`
Database:   `test`
Table:      `test`
Org_table:  `test`
Type:       STRING
Collation:  latin1_swedish_ci (8)
Length:     12
Max_length: 3
Decimals:   0
Flags:      SET
....
+------+--------------+-------------------+
| col1 | length(col1) | char_length(col1) |
+------+--------------+-------------------+
| yes  |            3 |                 3 |
| no   |            2 |                 2 |
| NULL |         NULL |              NULL |
+------+--------------+-------------------+

--------------------------------------------
DROP TABLE IF EXISTS test.test; CREATE TABLE test.test(col1 SET('yes', 'no', 'maybe', 'prolly')) CHARSET latin1; INSERT INTO test.test(col1) VALUES ('yes'), ('no'), (NULL); SELECT col1, length(col1), char_length(col1) FROM test.test;

Field   1:  `col1`
Catalog:    `def`
Database:   `test`
Table:      `test`
Org_table:  `test`
Type:       STRING
Collation:  latin1_swedish_ci (8)
Length:     19
Max_length: 3
Decimals:   0
Flags:      SET
....
+------+--------------+-------------------+
| col1 | length(col1) | char_length(col1) |
+------+--------------+-------------------+
| yes  |            3 |                 3 |
| no   |            2 |                 2 |
| NULL |         NULL |              NULL |
+------+--------------+-------------------+

This affects the I_S.COLUMNS which gives wrong data, like
*************************** 1. row ***************************
            TABLE_SCHEMA: test
              TABLE_NAME: test
             COLUMN_NAME: col1
CHARACTER_MAXIMUM_LENGTH: 19
  CHARACTER_OCTET_LENGTH: 19
1 row in set (0.01 sec)

For the last defition ('yes','no','maybe','prolly')
[13 Feb 2009 14:02] Andrey Hristov
Not a bug, this is a SET, not an ENUM, values are concatenated