Bug #58008 wrong datatype (blob) in the resultset if the string is larger than 512 bytes
Submitted: 5 Nov 2010 9:38 Modified: 5 Nov 2010 9:41
Reporter: Alexander Rubin Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Optimizer Severity:S3 (Non-critical)
Version:5.1, 5.0 OS:Any
Assigned to: CPU Architecture:Any
Tags: BLOB, datatype, performance, string, temporary table
Triage: Triaged: D3 (Medium)

[5 Nov 2010 9:38] Alexander Rubin
Description:
If we have SQL: select <STRING LARGER THAN 512 bytes> mysql will use blob datatype in the resultset. This is important if we do 
"select distinct concat() ... ", mysql may create temporary table on disk, not in memory, as MEMORY engine does not support BLOB/TEXT.
Example: 
1. select distinct concat(LONG_STRING) where...
2. If LONG_STRING field is larger than 512 bytes, mysql will autotype the resulting datatype to TEXT or BLOB.
3. As we have distinct, MySQL may need to create a temporary table and this table will be created on disk right away, as MEMORY engine does not support BLOB/TEXT.

The problem is: the code uses make_string_field() method in item.cc, code should probably use method string_field_type() instead.

How to repeat:
Test case with the simple string:

-- ==== 512 bytes: ==== ---

mysql>  drop table if exists tmpmem;  create table tmpmem as select
_latin1
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
as a; show create table tmpmem;
Query OK, 0 rows affected, 1 warning (0.00 sec)

Query OK, 1 row affected (0.01 sec)
Records: 1  Duplicates: 0  Warnings: 0

+--------+----------------------------------------------------------------------------------------+
| Table  | Create Table                                                
                          |
+--------+----------------------------------------------------------------------------------------+
| tmpmem | CREATE TABLE `tmpmem` (
  `a` longtext NOT NULL
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+--------+----------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

-- ==== 513 bytes: ==== ---

mysql>  drop table if exists tmpmem;  create table tmpmem as select
_latin1
'aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa'
as a; show create table tmpmem;
Query OK, 0 rows affected (0.00 sec)

Query OK, 1 row affected (0.00 sec)
Records: 1  Duplicates: 0  Warnings: 0

+--------+-------------------------------------------------------------------------------------------------------+
| Table  | Create Table                                                
                                         |
+--------+-------------------------------------------------------------------------------------------------------+
| tmpmem | CREATE TABLE `tmpmem` (
  `a` varchar(512) NOT NULL DEFAULT ''
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+--------+-------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

VERSION:

mysql> select version();
+------------+
| version()  |
+------------+
| 5.1.50-log |
+------------+
1 row in set (0.00 sec)

Suggested fix:
Use string_field_type() method to autotype instead of  make_string_field()

sql/item.cc

/*
  Create a field to hold a string value from an item

  SYNOPSIS
    make_string_field()
    table               Table for which the field is created

  IMPLEMENTATION
    If max_length > CONVERT_IF_BIGGER_TO_BLOB create a blob
    If max_length > 0 create a varchar
    If max_length == 0 create a CHAR(0)
*/

Field *Item::make_string_field(TABLE *table)
{
  Field *field;
  DBUG_ASSERT(collation.collation);
  if (max_length/collation.collation->mbmaxlen > CONVERT_IF_BIGGER_TO_BLOB)
    field= new Field_blob(max_length, maybe_null, name,
                          collation.collation);
  /* Item_type_holder holds the exact type, do not change it */
  else if (max_length > 0 &&
      (type() != Item::TYPE_HOLDER || field_type() != MYSQL_TYPE_STRING))
    field= new Field_varstring(max_length, maybe_null, name, table->s,
                               collation.collation);
  else
    field= new Field_string(max_length, maybe_null, name,
                            collation.collation);
  if (field)
    field->init(table);
  return field;
}
...

enum_field_types Item::string_field_type() const
{
  enum_field_types f_type= MYSQL_TYPE_VAR_STRING;
  if (max_length >= 16777216)
    f_type= MYSQL_TYPE_LONG_BLOB;
  else if (max_length >= 65536)
    f_type= MYSQL_TYPE_MEDIUM_BLOB;
  return f_type;
}
[5 Nov 2010 17:38] Peter Laursen
I reported another issue with the result set here:
http://bugs.mysql.com/bug.php?id=57709

It would be nice to have both fixed.