Bug #49609 check_db_name is incorrect for single-byte character sets
Submitted: 11 Dec 2009 5:10 Modified: 23 Dec 2009 21:23
Reporter: Mark Callaghan
Status: Verified
Category:MySQL Server: General Severity:S3 (Non-critical)
Version:5.1.38, 5.1, next-mr OS:Any
Assigned to: Assigned Account CPU Architecture:Any
Tags: check_db_name, regression
Triage: Triaged: D5 (Feature request)

[11 Dec 2009 5:10] Mark Callaghan
Description:
Yan found this, I am just the messenger.

We are trying to run MySQL with the system charset changed from utf8 to latin1 by changing this code in sql/mysqld.cc
  system_charset_info= &my_charset_utf8_general_ci;
to 
  system_charset_info= &my_charset_latin1;

This change works in MySQL 5.0, but with 5.1 mysqld crashes when bootstrap is done during mysql-test-run. The problem is this code in check_db_name from sql/table.cc. The logic for the 'return' in the 'if use_mb' block is not the same as in the 'else' block. In the 'if use_mb' block the code returns TRUE when the last character is a space or name_length is too big. In the 'else' block the code returns TRUE when the last character is not a space or name_length is too big.

The fix is to change from this:
    return ((org_name->str[org_name->length - 1] != ' ') ||
to this:
    return ((org_name->str[org_name->length - 1] == ' ') ||

#if defined(USE_MB) && defined(USE_MB_IDENT)
  if (use_mb(system_charset_info))
  {
    name_length= 0;
    bool last_char_is_space= TRUE;
    char *end= name + org_name->length;
    while (name < end)
    {
      int len;
      last_char_is_space= my_isspace(system_charset_info, *name);
      len= my_ismbchar(system_charset_info, name, end);
      if (!len)
        len= 1;
      name+= len;
      name_length++;
    }
    return (last_char_is_space || name_length > NAME_CHAR_LEN);
  }
  else
#endif
    return ((org_name->str[org_name->length - 1] != ' ') ||
            (name_length > NAME_CHAR_LEN)); /* purecov: inspected */
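
The difference between the reported single-byte branch and the suggested fix can be seen in a minimal standalone sketch. This is not the server code: kNameCharLen is a hypothetical stand-in for NAME_CHAR_LEN (assumed here to be 64), both function names are made up for illustration, and the empty-name guard is added only so the sketch is safe on its own.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for NAME_CHAR_LEN (assumed to be 64 here).
constexpr std::size_t kNameCharLen = 64;

// Single-byte ('else') branch as quoted in the report: it returns true
// (name rejected) when the last character is NOT a space.
bool rejects_as_reported(const std::string &name) {
  if (name.empty())  // guard added for the sketch only
    return true;
  return name[name.size() - 1] != ' ' || name.size() > kNameCharLen;
}

// Same branch with the suggested fix: it returns true (name rejected)
// when the last character IS a space, matching the 'if use_mb' branch.
bool rejects_with_fix(const std::string &name) {
  if (name.empty())  // guard added for the sketch only
    return true;
  return name[name.size() - 1] == ' ' || name.size() > kNameCharLen;
}
```

With these definitions, rejects_as_reported("mysql") is true while rejects_with_fix("mysql") is false, which is consistent with an ordinary database name being rejected during bootstrap once the single-byte branch is taken.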

How to repeat:
change system charset as described above
run mysql-test-run

Suggested fix:
change check_db_name as described above
[11 Dec 2009 7:55] Sveta Smirnova
Thank you for the report.

This patch (which changes character set) does not work for me in 5.1:

$bzr diff
=== modified file 'sql/mysqld.cc'
--- sql/mysqld.cc       2009-11-20 12:09:50 +0000
+++ sql/mysqld.cc       2009-12-11 06:50:47 +0000
@@ -4695,7 +4695,8 @@
                                                   "MySQLShutdown"), 10);
 
   /* Must be initialized early for comparison of service name */
-  system_charset_info= &my_charset_utf8_general_ci;
+  system_charset_info= &my_charset_latin1;
+fprintf(stderr,"charset:%s\n", system_charset_info);
 
   if (Service.GetOS()) /* true NT family */
   {

$cat mysql-test/t/bug49609.test 
show variables like 'char%';

$do_test -b mysql-5.1
Logging: ./mysql-test-run.pl  --record --force bug49609
091211  8:51:23 [Note] Plugin 'FEDERATED' is disabled.
091211  8:51:23 [Note] Plugin 'ndbcluster' is disabled.
MySQL Version 5.1.43
Checking supported features...
 - using ndbcluster when necessary, mysqld supports it
 - SSL connections supported
 - binaries are debug compiled
Collecting tests...
vardir: /users/ssmirnova/blade12/src/mysql-5.1/mysql-test/var
Checking leftover processes...
Removing old var directory...
Creating var directory '/users/ssmirnova/blade12/src/mysql-5.1/mysql-test/var'...
Installing system database...
Using server port 39267

==============================================================================

TEST                                      RESULT   TIME (ms)
------------------------------------------------------------

worker[1] Using MTR_BUILD_THREAD 300, with reserved ports 13000..13009
main.bug49609                            [ pass ]      4
------------------------------------------------------------
The servers were restarted 0 times
Spent 0.004 of 2 seconds executing testcases

All 1 tests were successful.

bug49609.result
=====mysql-5.1=====
=====bug49609=====
show variables like 'char%';
Variable_name   Value
character_set_client    latin1
character_set_connection        latin1
character_set_database  latin1
character_set_filesystem        binary
character_set_results   latin1
character_set_server    latin1
character_set_system    utf8
character_sets_dir      /users/ssmirnova/blade12/src/mysql-5.1/sql/share/charsets/

With version 5.0, however, it works fine. Is this the only change you made? How do you compile the MySQL server?
[16 Dec 2009 16:39] Mark Callaghan
I think you changed the Windows version. Keep searching in the file for the non-Windows code to change.
[16 Dec 2009 23:17] Sveta Smirnova
Thank you for the feedback.

Verified as described.

Bzr diff against the latest sources that reproduces the described behavior:

=== modified file 'sql/mysqld.cc'
--- sql/mysqld.cc       2009-11-20 12:09:50 +0000
+++ sql/mysqld.cc       2009-12-16 23:07:13 +0000
@@ -7692,7 +7692,7 @@
   key_map_full.set_all();
 
   /* Character sets */
-  system_charset_info= &my_charset_utf8_general_ci;
+  system_charset_info= &my_charset_latin1;
   files_charset_info= &my_charset_utf8_general_ci;
   national_charset_info= &my_charset_utf8_general_ci;
   table_alias_charset= &my_charset_bin;
[17 Dec 2009 6:42] Alexander Barkov
This problem is already fixed in 6.0 sources:

bool check_table_name(const char *name, uint length)
{
  if (!length || length > NAME_LEN || name[length - 1] == ' ')
    return 1;
  LEX_STRING ident;
  ident.str= (char*) name;
  ident.length= length;
  return check_identifier_name(&ident);
}
[17 Dec 2009 21:39] Mark Callaghan
Why is this a feature request?
[23 Dec 2009 21:17] Shawn Green
On [17 Dec 22:39] Mark Callaghan asked: "Why is this a feature request?"

This is a feature request because the internals of MySQL are not designed to operate with any character set other than UTF-8, nor is the code written in such a way that the internal character set can be modified by the end user.

Making the code capable of such a change safely will require, at a minimum, a full audit of all functions that touch internal objects to make sure that their string management and other functions will cooperate with a different character set.

The modification or validation of the MySQL internals to ensure they will be compatible with a character set other than UTF-8 is the crux of the bug you have raised. Such engineering qualifies this report as a "feature request" because the code does not yet have the capability to do what you are asking it to do.
[23 Dec 2009 21:23] Mark Callaghan
Then code in check_db_name for non-utf8 charsets should be removed and replaced with an assert.
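
A rough sketch of what that could look like, written as standalone C++ and not as a patch against the server: CharsetSketch, kNameCharLenSketch and rejects_db_name_sketch are hypothetical names, the character counting assumes valid UTF-8 input, and only an ASCII space is treated as trailing whitespace (the real code uses my_isspace).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the charset descriptor; not CHARSET_INFO.
struct CharsetSketch {
  bool multibyte;
};

// Hypothetical stand-in for NAME_CHAR_LEN (assumed to be 64 here).
constexpr std::size_t kNameCharLenSketch = 64;

// Returns true when the name should be rejected. There is no single-byte
// branch: a non-multibyte system charset trips the assert instead.
bool rejects_db_name_sketch(const CharsetSketch &cs, const std::string &name) {
  assert(cs.multibyte && "system charset is expected to be multi-byte");
  if (name.empty())
    return true;
  std::size_t chars = 0;
  for (unsigned char byte : name) {
    // Count every byte that is not a UTF-8 continuation byte (10xxxxxx).
    if ((byte & 0xC0) != 0x80)
      ++chars;
  }
  return name.back() == ' ' || chars > kNameCharLenSketch;
}
```

In a build with NDEBUG defined the assert compiles away, so a debug-only check like this would document the assumption without changing release behavior.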

We haven't asked for a my.cnf option to choose the system character set. About 500 unit tests must be fixed for that to be possible.