Description:
The patch for bug#83895 refactored code such that UNION, COALESCE, and LEAST/GREATEST all use the same aggregation functions for determining resulting data type. One issue arising from this refactoring is that the aggregated data type becomes VARCHAR whenever the result type is STRING_RESULT whose length is longer than 255 characters (max length for CHAR in MySQL).
It's Item::aggregate_string_properties that's causing this particular behavioral change.
if (collation.collation != &my_charset_bin &&
max_length > char_to_byte_length_safe(MAX_FIELD_CHARLENGTH,
collation.collation->mbmaxlen))
set_data_type(MYSQL_TYPE_VARCHAR);
After the refactoring, this check has been discovered to be a bit too eager. While it should probably only include CHAR/BINARY fields, it currently covers all field types whose result type is a STRING_RESULT, e.g. CHAR/VARCHAR/BLOG/JSON/GIS etc. There are two problems with this:
Primary: Change of type for variable-sized columns types.
Secondary: CHAR fields are generated that are above the documented limit (255 characters).
How to repeat:
Before patch:
CREATE TABLE t1 (j JSON);
INSERT INTO t1 VALUE ('[2]'), ('["2"]');
CREATE TABLE t2 AS SELECT COALESCE(j, NULL) from t1;
SHOW CREATE TABLE t2;
+-------+-----------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+-----------------------------------------------------------------------------------------------------+
| t2 | CREATE TABLE `t2` (
`coalesce(j, NULL)` json DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 |
+-------+-----------------------------------------------------------------------------------------------------+
After the patch (current behavior):
CREATE TABLE t1 (j JSON);
INSERT INTO t1 VALUE ('[2]'), ('["2"]');
CREATE TABLE t2 AS SELECT COALESCE(j, NULL) from t1;
SHOW CREATE TABLE t2;
+-------+----------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------+----------------------------------------------------------------------------------------------------------------------+
| t2 | CREATE TABLE `t2` (
`a` longtext CHARACTER SET utf8mb4 COLLATE utf8mb4_bin
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 |
+-------+----------------------------------------------------------------------------------------------------------------------+
Suggested fix:
A suggested patch by Roy:
diff --git a/sql/item.cc b/sql/item.cc
index 7da7123..39dc997 100644
--- a/sql/item.cc
+++ b/sql/item.cc
@@ -7979,9 +7979,13 @@ bool Item::aggregate_string_properties(const char *name,
set_if_smaller(decimals, NOT_FIXED_DEC);
aggregate_char_length(items, nitems);
- if (collation.collation != &my_charset_bin &&
- max_length > char_to_byte_length_safe(MAX_FIELD_CHARLENGTH,
- collation.collation->mbmaxlen))
+ /*
+ If the resulting data type is a fixed length character or binary string
+ and the result maximum length in characters is longer than the MySQL
+ maximum CHAR/BINARY size, convert to a variable-sized type.
+ */
+ if (data_type() == MYSQL_TYPE_STRING &&
+ max_char_length() > MAX_FIELD_CHARLENGTH)
set_data_type(MYSQL_TYPE_VARCHAR);
return false;