Bug #33911 alter table add index in place does not work when charset is UTF-8
Submitted: 17 Jan 2008 22:02 Modified: 29 Oct 2008 6:51
Reporter: Bill Mitchell Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Charsets Severity:S2 (Serious)
Version:5.1.22-rc OS:Any
Assigned to: CPU Architecture:Any
Tags: Contribution

[17 Jan 2008 22:02] Bill Mitchell
Description:
I am building a storage engine which can support indexes when they are added to an existing table in place, but which does not support temporary tables and therefore cannot add indexes by copying to a new temp table.  In other words, to the question alter_table_flags, this storage engine returns all of the index flags set including HA_ONLINE_ADD_INDEX, HA_ONLINE_ADD_UNIQUE_INDEX, HA_ONLINE_ADD_PK_INDEX.

On the instance of the server installed with the default charset being Latin1, this works fine.  But on a different instance of the server installed with the default charset being UTF-8, the ALTER TABLE ADD INDEX command fails.  In sql_table.cc when the routine compare_tables asks, for each field, is the new_field identical to the old_field (field->is_equal(new_field), on the UTF-8 server the answer comes back false.  On further debugging, I find that Field_str::is_equal and Field_varstring::is_equal compare new_field->length == max_display_length().  In my case, when the charset is UTF-8, the value in new_field->length is 255 and the value returned by max_display_length() is 755.  Obviously max_display_length allows for the worst-case UTF-8 expansion, whereas the new_field->length at this point is the character count.  

The real data being processed in my handler is UTF-8, so it would certainly be awkward to have to lie and configure the server as Latin-1 just to be able to add indexes.  

How to repeat:
I looked for a way to demonstrate this using the supplied storage engines, but it appears none of them support HA_ONLINE_ADD_INDEX.  Of course, one could add an alter_table_flags routine to ha_myisam.h just to demonstrate the problem.  

Suggested fix:
I believe that in both these routines, the comparison should be between new_field->char_length and char_length().  I suspect the same change applies to Field_new_decimal and maybe to Field_blob as well.  You can see the recommended changes in the routines below.  I have tested this change and after applying it ALTER TABLE ADD INDEX works the same irrespective of whether the character set is Latin-1 or UTF-8.

uint Field_str::is_equal(Create_field *new_field)
{
  if (compare_str_field_flags(new_field, flags))
    return 0;

  return ((new_field->sql_type == real_type()) &&
	  new_field->charset == field_charset &&
	  new_field->char_length == char_length());
}

uint Field_varstring::is_equal(Create_field *new_field)
{
  if (new_field->sql_type == real_type() &&
      new_field->charset == field_charset)
  {
    if (new_field->char_length == char_length())
      return IS_EQUAL_YES;
    if (new_field->char_length > char_length() &&
	((new_field->char_length <= 255 && char_length() <= 255) ||
	 (new_field->char_length > 255 && char_length() > 255)))
      return IS_EQUAL_PACK_LENGTH; // VARCHAR, longer variable length
  }
  return IS_EQUAL_NO;
}
[26 Jan 2008 6:44] Alexander Barkov
Hello Bill,

I'm not sure that there are bugs in is_equal() method.
In parser, MySQL converts character length to byte length
and then uses byte length in most parts of the code.

The bug is possibly in the upper level code which calls is_equal().

Can you please print values of these variables:

- new_field->charset->name (string)
- field_charset->name (string)
- new_field->length (number)
- max_display_length()  (number)

so we can see what is the exact reason for is_equal() to return false.

Please also tell us the full "ALTER TABLE ADD INDEX" statement,
and "SHOW CREATE TABLE" for the table being altered.

Thanks!
[28 Jan 2008 20:54] Bill Mitchell
Alexander, here is the information you requested:
new_field->charset == _my_charset_utf8_general_ci
this->charset == _my_charset_utf8_general_ci
new_field->length == 0x00ff
new_field->char_length == 0x00ff
max_display_length, i.e., this->field_length = 0x02fd

This error is diagnosed on the very first column of the table, the _handle column described below.  

The create table statement used to create this table:
CREATE TABLE customer_master (
  _handle VARCHAR (255) NOT NULL,
  cr_customer_id VARCHAR (20) NOT NULL,
  cr_company_name VARCHAR (40) NOT NULL,
  cr_contact_name VARCHAR (40) NOT NULL,
  cr_contact_title VARCHAR (30) NOT NULL,
  cr_customer_addr_line_1 VARCHAR (40) NOT NULL,
  cr_customer_addr_line_2 VARCHAR (40)     NULL,
  cr_city VARCHAR (20) NOT NULL,
  cr_state VARCHAR (2) NOT NULL,
  cr_postal_code VARCHAR (10) NOT NULL,
  cr_country VARCHAR (20) NOT NULL,
  cr_phone_number VARCHAR (12) NOT NULL,
  cr_discount DECIMAL (2, 2) NOT NULL,
  cr_created DATE NOT NULL,
  cr_last_modified DATE NOT NULL,
  cr_last_contacted DATE NOT NULL
  ) ENGINE = FRAMEWARE;   

The ALTER TABLE statement that fails:
ALTER TABLE customer_master ADD PRIMARY KEY (cr_customer_id);

Just for reference, here is the callstack when we arrive here:
>	mysqld.exe!Field_varstring::is_equal(Create_field * new_field=0x016940f0)  Line 7111	C++
 	mysqld.exe!compare_tables(st_table * table=0x0168f040, Alter_info * alter_info=0x0272ef50, st_ha_create_information * create_info=0x0272efb0, unsigned int order_num=0x00000000, enum_alter_table_change_level * need_copy_table=0x0272d75c, st_key * * key_info_buffer=0x0272d868, unsigned int * * index_drop_buffer=0x0272d850, unsigned int * index_drop_count=0x0272d85c, unsigned int * * index_add_buffer=0x0272d838, unsigned int * index_add_count=0x0272d844)  Line 5019 + 0x16 bytes	C++
 	mysqld.exe!mysql_alter_table(THD * thd=0x01660b18, char * new_db=0x01693f30, char * new_name=0x01693d20, st_ha_create_information * create_info=0x0272efb0, TABLE_LIST * table_list=0x01693d58, Alter_info * alter_info=0x0272ef50, unsigned int order_num=0x00000000, st_order * order=0x00000000, bool ignore=false)  Line 6077 + 0x3f bytes	C++
...

The reason I inferred that the error lies here in Field_str::equal and Field_varstring::equal, is the comment in the definition of the Create_field class, that Create_field::length is sometimes the length in bytes and sometimes the length in characters, depending on the stage of execution.  So it seemed to me that at this point in the ALTER TABLE logic, we must be at a point that it is the length in characters, not the length in bytes, to allow for the situation where ALTER TABLE might be changing the character set, and might need to create a new temp table with the "same" columns defined in a different character set.
[29 Sep 2008 6:51] Susanne Ebrecht
Bill,

many thanks for your feedback. Please can you test with newest version MySQL 5.1.28 and let us know if error still occurs.
[30 Oct 2008 0:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".