Bug #45516 SQL thread does not use database charset properly
Submitted: 16 Jun 2009 8:25
Modified: 7 Mar 2010 2:16
Reporter: Yoshinori Matsunobu
Status: Closed
Category: MySQL Server: Replication
Severity: S3 (Non-critical)
Version: 5.0, 5.1, 6.0 bzr
OS: Any
Assigned to: Libing Song
CPU Architecture: Any
Tags: replication

[16 Jun 2009 8:25] Yoshinori Matsunobu
Description:
There are cases where the replication SQL thread does not set thd->variables.collation_database to the database charset properly.
For example, LOAD DATA (which uses the database character set as the client encoding) does not convert characters correctly.

How to repeat:
master:
# character-set-server=latin1/cp932/etc. (anything other than utf8)

mysql> charset utf8;
mysql> create database d charset utf8;
mysql> use d;
mysql> create table t (c1 varchar(100)) charset utf8;

mysql> load data local infile '/path/to/attached-load.sql' into table t fields terminated by ',' lines terminated by '\n';

mysql> select c1, hex(c1) from t;
+----------+--------------------------+
| c1       | hex(c1)                  |
+----------+--------------------------+
| テスト1  | E38386E382B9E3838831     |
| テスト2  | E38386E382B9E3838832     |
| テスト㈱ | E38386E382B9E38388E388B1 |
+----------+--------------------------+
3 rows in set (0.03 sec)

slave:
# character-set-server=latin1/cp932/etc. (anything other than utf8)

mysql> charset utf8;
mysql> select c1, hex(c1) from t;
+----------------+------------------------------------------------------+
| c1             | hex(c1)                                              |
+----------------+------------------------------------------------------+
| ãƒ†ã‚¹ãƒˆ1     | C3A3C692E280A0C3A3E2809AC2B9C3A3C692CB8631           |
| ãƒ†ã‚¹ãƒˆ2     | C3A3C692E280A0C3A3E2809AC2B9C3A3C692CB8632           |
| ãƒ†ã‚¹ãƒˆãˆ±   | C3A3C692E280A0C3A3E2809AC2B9C3A3C692CB86C3A3CB86C2B1 |
+----------------+------------------------------------------------------+
3 rows in set (0.01 sec)
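
The slave's hex values are consistent with the UTF-8 bytes of the data file having been read as a single-byte Western charset and then converted to utf8 a second time. A minimal Python sketch of that double encoding follows. It assumes that MySQL's latin1 maps the bytes involved the same way Python's cp1252 codec does, and the function names are illustrative, not part of MySQL.

```python
# Illustration only: reproduce the slave's hex values by double-encoding.
# Assumption: MySQL's latin1 maps these bytes the same way Python's cp1252 does.


def double_encode_hex(text: str) -> str:
    """Encode as UTF-8, misread the bytes as cp1252, then encode as UTF-8 again."""
    utf8_bytes = text.encode("utf-8")       # bytes as they appear in the data file
    misread = utf8_bytes.decode("cp1252")   # file interpreted with the wrong charset
    return misread.encode("utf-8").hex().upper()  # value stored in the utf8 column


def recover(hex_value: str) -> str:
    """Undo the double encoding for a hex value read back from the slave."""
    stored = bytes.fromhex(hex_value).decode("utf-8")
    return stored.encode("cp1252").decode("utf-8")


if __name__ == "__main__":
    for row in ("テスト1", "テスト2", "テスト㈱"):
        print(row, double_encode_hex(row))
```

If the printed values match the slave's hex(c1) column, that supports the reading that the SQL thread used the server default charset instead of the database charset when it applied LOAD DATA.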

Suggested fix:
Add the line marked with + to Query_log_event::do_apply_event() in sql/log_event.cc:
int Query_log_event::do_apply_event(Relay_log_info const *rli,
                                      const char *query_arg, uint32 q_len_arg)

  new_db.length= db_len;
  new_db.str= (char *) rpl_filter->get_rewrite_db(db, &new_db.length);
+  thd->db_charset= get_default_db_collation(thd, new_db.str);
  thd->set_db(new_db.str, new_db.length);       /* allocates a copy of 'db' */
[16 Jun 2009 8:26] Yoshinori Matsunobu
test data

Attachment: load.txt (text/plain), 35 bytes.

[16 Jun 2009 8:41] Sveta Smirnova
Thank you for the report.

Verified as described.

Workaround since 5.1: set binlog_format='row';
[10 Jul 2009 10:57] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/78372

2772 Li-Bing.Song@sun.com	2009-07-10
      Fix bug #45516
      Assign thd->db's charset to thd->db_charset
[13 Jul 2009 7:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/78496
[16 Jul 2009 8:02] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/78808

2772 Li-Bing.Song@sun.com	2009-07-16
      BUG#45516 SQL thread does not use database charset properly
      
      The replication SQL thread does not set the database default charset in
      thd->variables.collation_database properly when executing LOAD DATA events
      from the binlog. The bug can be reproduced by running LOAD DATA in
      STATEMENT mode.
      
      This patch adds code that finds the default character set of the current
      database and assigns it to thd->db_charset when the slave server begins to
      execute a relay log. The test for this bug is added to
      rpl_loaddata_charset.test.
[16 Jul 2009 9:45] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/78820
[17 Jul 2009 5:46] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/78912
[31 Jul 2009 11:33] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/79760
[10 Aug 2009 7:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80420
[11 Aug 2009 2:41] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80514
[11 Aug 2009 2:43] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80515
[11 Aug 2009 5:26] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/80520
[19 Aug 2009 5:52] Libing Song
The patch was pushed to mysql-5.0-bugteam, mysql-5.1-bugteam and mysql-pe
[20 Aug 2009 10:22] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/81138
[2 Sep 2009 16:41] Bugs System
Pushed into 5.1.39 (revid:joro@sun.com-20090902154533-8actmfcsjfqovgsb) (version source revid:li-bing.song@sun.com-20090812053156-m7vyqji0x12sl9kf) (merge vers: 5.1.38) (pib:11)
[8 Sep 2009 19:32] Jon Stephens
Documented bugfix in the 5.1.39 changelog as follows:

        When using statement-based replication, database-level character
        sets were not always honored by the replication SQL thread. This
        could cause data inserted on the master using LOAD DATA to be
        replicated using the wrong character set.

        NOTE. This was not an issue when using row-based replication.

Set status to NDI, waiting for pushes to 5.0 and 5.4 trees.
[14 Sep 2009 16:03] Bugs System
Pushed into 5.4.4-alpha (revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (version source revid:alik@sun.com-20090914155317-m1g9wodmndzdj4l1) (merge vers: 5.4.4-alpha) (pib:11)
[16 Sep 2009 9:48] Jon Stephens
Also documented in the 5.4.4 changelog.

Set status to NDI, waiting for push to 5.0 tree.
[1 Oct 2009 5:58] Bugs System
Pushed into 5.1.39-ndb-6.3.28 (revid:jonas@mysql.com-20091001055605-ap2kiaarr7p40mmv) (version source revid:jonas@mysql.com-20091001055605-ap2kiaarr7p40mmv) (merge vers: 5.1.39-ndb-6.3.28) (pib:11)
[1 Oct 2009 7:25] Bugs System
Pushed into 5.1.39-ndb-7.0.9 (revid:jonas@mysql.com-20091001072547-kv17uu06hfjhgjay) (version source revid:jonas@mysql.com-20091001071652-irejtnumzbpsbgk2) (merge vers: 5.1.39-ndb-7.0.9) (pib:11)
[1 Oct 2009 8:28] Jon Stephens
No need to document this separately for Cluster. 

Set status back to NDI, still waiting for push to 5.0.
[1 Oct 2009 13:25] Bugs System
Pushed into 5.1.39-ndb-7.1.0 (revid:jonas@mysql.com-20091001123013-g9ob2tsyctpw6zs0) (version source revid:jonas@mysql.com-20091001123013-g9ob2tsyctpw6zs0) (merge vers: 5.1.39-ndb-7.1.0) (pib:11)
[1 Oct 2009 14:29] Jon Stephens
See previous comment.
[2 Oct 2009 0:17] Paul Dubois
Moved 5.4 changelog entry from 5.4.4 to 5.4.3.
[5 Oct 2009 10:50] Bugs System
Pushed into 5.1.39-ndb-6.2.19 (revid:jonas@mysql.com-20091005103850-dwij2dojwpvf5hi6) (version source revid:jonas@mysql.com-20090930185117-bhud4ek1y0hsj1nv) (merge vers: 5.1.39-ndb-6.2.19) (pib:11)
[6 Oct 2009 8:14] Jon Stephens
As before, no need to document this separately for MySQL Cluster. 

Set status back to NDI, still waiting for push to 5.0.
[6 Oct 2009 9:05] Georgi Kodinov
Pushed into 5.1.40 (revid:joro@sun.com-20091006073316-lea2cpijh9r6on7c)
(version source
revid:ingo.struewing@sun.com-20090916070128-6053el2ucp5z7pyn) (merge
vers: 5.1.39) (pib:11)
[6 Oct 2009 9:06] Georgi Kodinov
Pushed into 5.0.87 (revid:joro@sun.com-20091006073202-rj21ggvo2gw032ks)
(version source
revid:kristofer.pettersson@sun.com-20090929151855-gvpblm4dnnubypdv)
(merge vers: 5.0.87) (pib:11)
[6 Oct 2009 11:14] Jon Stephens
Documented in the 5.0.87 changelog as follows:

        Database-level character sets were not always honored by the
        replication SQL thread. This could cause data inserted on the
        master using LOAD DATA to be replicated using the wrong character 
        set.

Closed.
[11 Dec 2009 6:01] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20091211055901-yp18b3c7xuhl87rf) (version source revid:alik@sun.com-20091211055401-43rjwq7gjed6ds83) (merge vers: 6.0.14-alpha) (pib:13)
[11 Dec 2009 6:03] Bugs System
Pushed into 5.6.0-beta (revid:alik@sun.com-20091211055628-ltr7fero363uev7r) (version source revid:alik@sun.com-20091211055453-717czhtezc74u8db) (merge vers: 5.6.0-beta) (pib:13)
[11 Dec 2009 19:46] Paul Dubois
Noted in 5.6.0, 6.0.14 changelogs.
[6 Mar 2010 10:54] Bugs System
Pushed into 5.5.3-m3 (revid:alik@sun.com-20100306103849-hha31z2enhh7jwt3) (version source revid:vvaintroub@mysql.com-20091211201717-03qf8ckwiw0np80p) (merge vers: 5.6.0-beta) (pib:16)
[7 Mar 2010 2:16] Paul Dubois
Moved 5.6.0 changelog entry to 5.5.3.