Bug #37339 SHOW VARIABLES not working properly with multi-byte datadir
Submitted: 11 Jun 2008 12:30
Modified: 8 Dec 2008 16:53
Reporter: Matthew Lord
Status: Closed
Category: MySQL Server: Charsets
Severity: S3 (Non-critical)
Version: 5.0.62
OS: Microsoft Windows (XP)
Assigned to: Georgi Kodinov
CPU Architecture: Any
Triage: Triaged: D3 (Medium)

[11 Jun 2008 12:30] Matthew Lord
Description:
mysqld-nt is able to correctly use a datadir value that contains multi-byte
characters but it is not displayed correctly in SHOW VARIABLES.

This is NOT a problem on *nix.  

How to repeat:
dos>mkdir c:\téstü
dos>"C:\Program Files\MySQL\MySQL Server 5.0\bin\mysqld-nt" --no-defaults --skip-grant-tables --datadir=C:\téstü

dos>"C:\Program Files\MySQL\MySQL Server 5.0\bin\mysql"

mysql>set names utf8;
mysql>show global variables like "datadir";
[11 Jun 2008 12:46] Tonci Grgin
Hi Matt and thanks for your report. Verified as described on WinXP SP2:

C:\mysql-5-0-64-pb1103-win32\bin>chcp 1250  (tested cp850, 852 ... too)
Active code page: 1250

C:\mysql-5-0-64-pb1103-win32\bin>mysql -uroot test
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 6
Server version: 5.0.64-pb1103 MySQL Pushbuild Edition, build 1103

Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> set names utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> show global variables like "datadir";
+---------------+--------+
| Variable_name | Value  |
+---------------+--------+
| datadir       | C:\Ton |
+---------------+--------+
1 row in set, 1 warning (0.00 sec)

mysql> quit
Bye

However, the server knows where datadir is ("C:\Tonči" in my case), so this could be just a problem with the Windows console.
[11 Jun 2008 16:14] Michael Kuznetsov
It isn't a console issue. JDBC gives the same result. Run SHOW WARNINGS and you will see that the database has a problem.
[21 Oct 2008 13:08] Georgi Kodinov
The problem at hand:
Metadata code in the server expects data in character_set_system, but it's getting data in whatever character set the OS uses.
MySQL always has character_set_system set to UTF-8.
On MS Windows, however, the OS treats mysqld as a non-Unicode application and thus sends data (such as command line parameters) to it in the so-called ANSI (non-wide) character set. In the current example I assume it's a variant of Latin1 (or windows-1250).
Note that the option file has a similar problem: mysql reads it as binary without any conversions.

There are fewer problems with modern Linux distributions, where the default OS character set is usually UTF-8. This causes the binary data read from the command line or the option files to have the correct UTF-8 encoding for the metadata handlers to handle correctly.
But on Windows the metadata handlers are trying to treat a binary stream of chars in the ANSI character set as if they were UTF-8. As a result you get a conversion error on the first non-ASCII character.
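
The failure mode described above can be reproduced outside the server. What follows is a minimal Python sketch, not the server's code. It assumes the ANSI code page is windows-1250 (the code page used in the verification comment) and reuses the "C:\Tonči" path from that comment; the helper name valid_utf8_prefix is hypothetical.

```python
# Sketch only: imitates a reader that treats ANSI bytes as if they were UTF-8.
# Assumption: the OS ANSI code page is windows-1250 (cp1250).

def valid_utf8_prefix(raw: bytes) -> str:
    """Return the longest leading part of raw that decodes as UTF-8.

    Hypothetical helper, used here only to mimic a value that is cut
    off at the first byte sequence that is not valid UTF-8.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first offending byte.
        return raw[:err.start].decode("utf-8")


datadir = "C:\\Tonči"
ansi_bytes = datadir.encode("cp1250")   # bytes a non-Unicode process would receive
shown = valid_utf8_prefix(ansi_bytes)   # what survives when read as UTF-8

print(shown)  # prints C:\Ton
```

The printed value stops right before the first non-ASCII character, which is the same shape as the truncated datadir value in the SHOW VARIABLES output earlier in this report.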

There are several possible solutions with varying degrees of complexity and risk of regressions:
  1. Encode all the options to UTF-8 from the OS character set on read
  2. Set character_set_system to match the OS character set
  3. Encode the options to UTF-8 when filling in the data for metadata calls.
[21 Oct 2008 15:29] Peter Laursen
Somehow related, I think: http://bugs.mysql.com/bug.php?id=36458
[21 Oct 2008 15:43] Peter Laursen
I also think this is not always true: "... sends data (such as command line parameters) to it in the so called ANSI (non-wide) character set."

It only does so *if* the folder name is valid within *one* ANSI codepage. Try with a folder name in हिंदी (Hindi) ... I think it will use (little-endian) UTF-16 (the native Windows Unicode implementation) to encode the folder name. The same applies if the name contains both Western non-ASCII characters and non-Western characters at the same time (like 'æøåрусский'). Those simply cannot be represented as ANSI because 1) no ANSI codepage exists for Hindi and 2) no single ANSI codepage covers both sets of characters.
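
The "one ANSI codepage" condition can be checked with a short Python sketch. It assumes that the Python codecs cp874 and cp1250 through cp1258 are a fair stand-in for the single-byte Windows ANSI code pages; the function name code_pages_that_fit is hypothetical, and the sketch says nothing about what Windows actually passes to the process in the failing cases.

```python
# Sketch only: which single-byte Windows code pages can hold a given name?
# Assumption: this codec list approximates the single-byte "ANSI" code pages.
ANSI_CODE_PAGES = ["cp874"] + [f"cp{n}" for n in range(1250, 1259)]


def code_pages_that_fit(name: str) -> list:
    """Return the code pages from ANSI_CODE_PAGES that can encode name."""
    fits = []
    for code_page in ANSI_CODE_PAGES:
        try:
            name.encode(code_page)
        except UnicodeEncodeError:
            continue  # at least one character is missing from this code page
        fits.append(code_page)
    return fits


for name in ["téstü", "æøåрусский", "हिंदी"]:
    print(name, code_pages_that_fit(name))
```

Under that assumption, the folder name from the original report fits in at least one code page, while the mixed Latin/Cyrillic name and the Hindi name fit in none of them.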
[3 Nov 2008 17:01] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/57708

2715 Georgi Kodinov	2008-11-03
      Bug #37339: SHOW VARIABLES not working properly with multi-byte datadir
      
      The SHOW VARIABLES LIKE .../SELECT @@/SELECT ... FROM INFORMATION_SCHEMA.VARIABLES
      were assuming that all the system variables are in system charset (UTF-8).
      However the variables that are settable through command line will have a different
      character set (character_set_filesystem).
      Fixed the server to remember the correct character set of basedir, datadir, tmpdir,
      ssl variables; init_connect and init_slave variables and use it when processing
      data.
[22 Nov 2008 14:29] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/59620

2714 Georgi Kodinov	2008-11-22
      Bug #37339: SHOW VARIABLES not working properly with multi-byte datadir
      
      The SHOW VARIABLES LIKE .../SELECT @@/SELECT ... FROM INFORMATION_SCHEMA.VARIABLES
      were assuming that all the system variables are in system charset (UTF-8).
      However the variables that are settable through command line will have a different
      character set (character_set_filesystem).
      Fixed the server to remember the correct character set of basedir, datadir, tmpdir,
      ssl, plugin_dir, slave_load_tmpdir, innodb variables; init_connect and init_slave 
      variables and use it when processing data.
[24 Nov 2008 11:58] Alexander Barkov
The patch http://lists.mysql.com/commits/59620 looks ok to push.
[25 Nov 2008 14:55] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/59806

2714 Georgi Kodinov	2008-11-25
      Bug #37339: SHOW VARIABLES not working properly with multi-byte datadir
      
      The SHOW VARIABLES LIKE .../SELECT @@/SELECT ... FROM INFORMATION_SCHEMA.VARIABLES
      were assuming that all the system variables are in system charset (UTF-8).
      However the variables that are settable through command line will have a different
      character set (character_set_filesystem).
      Fixed the server to remember the correct character set of basedir, datadir, tmpdir,
      ssl, plugin_dir, slave_load_tmpdir, innodb variables; init_connect and init_slave 
      variables and use it when processing data.
[29 Nov 2008 11:18] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/60240

2714 Georgi Kodinov	2008-11-28
      Bug #37339: SHOW VARIABLES not working properly with multi-byte datadir
            
      The SHOW VARIABLES LIKE .../SELECT @@/SELECT ... FROM INFORMATION_SCHEMA.VARIABLES
      were assuming that all the system variables are in system charset (UTF-8).
      However the variables that are settable through command line will have a different
      character set (character_set_filesystem).
      Fixed the server to remember the correct character set of basedir, datadir, tmpdir,
      ssl, plugin_dir, slave_load_tmpdir, innodb variables; init_connect and init_slave 
      variables and use it when processing data.
[1 Dec 2008 11:35] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/60269

2728 Georgi Kodinov	2008-12-01
      Addendum to bug #37339 : make the test case portable to windows
      by using and taking out a full path.
[2 Dec 2008 13:02] Bugs System
Pushed into 5.0.74  (revid:kgeorge@mysql.com-20081201113453-i8sfukv7tsya4uzp) (version source revid:kgeorge@mysql.com-20081201113453-i8sfukv7tsya4uzp) (pib:5)
[3 Dec 2008 2:29] Paul Dubois
Noted in 5.0.74 changelog.

Statements that displayed the value of system variables (for example,
SHOW VARIABLES) expect variable values to be encoded in
character_set_system. However, variables set from the command line
such as basedir or datadir were encoded using
character_set_filesystem and not converted correctly.
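
The idea of remembering each variable's character set and converting at display time can be illustrated with a small Python sketch. This is not the server implementation: the SysVar class, its fields, and display_bytes are hypothetical names, and cp1250 merely stands in for the character set of the command-line value.

```python
# Sketch only: each variable value carries the charset of its raw bytes,
# and is converted to the system charset when it is displayed.
from dataclasses import dataclass

SYSTEM_CHARSET = "utf-8"  # stands in for character_set_system


@dataclass
class SysVar:
    """Hypothetical model of a system variable with a remembered charset."""
    name: str
    raw: bytes
    charset: str = SYSTEM_CHARSET  # charset in which `raw` is encoded

    def display_bytes(self) -> bytes:
        """Return the value encoded in the system charset."""
        if self.charset == SYSTEM_CHARSET:
            return self.raw
        return self.raw.decode(self.charset).encode(SYSTEM_CHARSET)


# Assumption: the command-line value arrived encoded as cp1250.
datadir = SysVar("datadir", "C:\\Tonči".encode("cp1250"), charset="cp1250")
version = SysVar("version", b"5.0.74")

print(datadir.display_bytes().decode(SYSTEM_CHARSET))
print(version.display_bytes().decode(SYSTEM_CHARSET))
```

Values already in the system charset pass through untouched; only values tagged with a different charset are transcoded before display.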

Resetting report to NDI pending push into 5.1.x, 6.0.x.
[8 Dec 2008 10:22] Bugs System
Pushed into 5.1.31  (revid:kgeorge@mysql.com-20081201113453-i8sfukv7tsya4uzp) (version source revid:kgeorge@mysql.com-20081201131533-tmowqmfs474jcqz1) (pib:5)
[8 Dec 2008 11:33] Bugs System
Pushed into 6.0.9-alpha  (revid:kgeorge@mysql.com-20081201113453-i8sfukv7tsya4uzp) (version source revid:kgeorge@mysql.com-20081201133421-g0tgi1455kgn1xqh) (pib:5)
[8 Dec 2008 16:53] Paul Dubois
Noted in 5.1.31, 6.0.9 changelogs.
[19 Jan 2009 11:26] Bugs System
Pushed into 5.1.31-ndb-6.2.17 (revid:tomas.ulin@sun.com-20090119095303-uwwvxiibtr38djii) (version source revid:tomas.ulin@sun.com-20090108105244-8opp3i85jw0uj5ib) (merge vers: 5.1.31-ndb-6.2.17) (pib:6)
[19 Jan 2009 13:03] Bugs System
Pushed into 5.1.31-ndb-6.3.21 (revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (version source revid:tomas.ulin@sun.com-20090119104956-guxz190n2kh31fxl) (merge vers: 5.1.31-ndb-6.3.21) (pib:6)
[19 Jan 2009 16:09] Bugs System
Pushed into 5.1.31-ndb-6.4.1 (revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (version source revid:tomas.ulin@sun.com-20090119144033-4aylstx5czzz88i5) (merge vers: 5.1.31-ndb-6.4.1) (pib:6)