Bug #33023 | Online backup could mangle object names if non-std charset is used | ||
---|---|---|---|
Submitted: | 5 Dec 2007 21:33 | Modified: | 2 Sep 2008 18:29 |
Reporter: | Chuck Bell | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: Backup | Severity: | S3 (Non-critical) |
Version: | 6.0 | OS: | Any |
Assigned to: | Rafal Somla | CPU Architecture: | Any |
[5 Dec 2007 21:33]
Chuck Bell
[14 Dec 2007 8:57]
Rafal Somla
PROPOSED SOLUTION

When the server needs to identify a table when talking to a storage engine (e.g. in an open() call), it uses a path string in which the table and database names are encoded using a special character set. This path string for a given table/database name can be created using the build_table_filename() function (defined in sql_table.cc). The function assumes that the input table and database names are encoded using system_charset_info.

To give access to the internal representation of a table name, the following methods will be added to the Table_ref class:

    size_t Table_ref::internal_name(Iname_buf buf);
    size_t Table_ref::internal_name(char *buf, size_t buflen);

Type Table_ref::Iname_buf will be a char array of a size appropriate for storing the internal table name representation (FN_REFLEN?). It can be used as follows:

    Table_ref t;
    Table_ref::Iname_buf tname;
    size_t len= t.internal_name(tname);
    DBUG_ASSERT(tname[len] == '\0');

Alternatively, the caller can decide what buffer size to use:

    Table_ref t;
    char tname[1024];
    size_t len= t.internal_name(tname,1024);
[14 Dec 2007 13:28]
Rafal Somla
Here is a small test script which can be used to see how a table name in a non-Latin character set is handled by the online backup system. In the current tree it already exhibits some problems:

- restore fails when executing the DROP TABLE statement, since the table name is not quoted properly,
- restore fails when the connection character set settings are not UTF8.

These issues will be addressed in the patch. The main issue of not translating the table name to the internal representation (in this case it should be 'test/@ff71@ff71@ff71') does not appear in the current tree, since we have no native backup engines yet.

------------------------------------------------------------------
SET NAMES utf8;
SET character_set_database = utf8;
USE test;
CREATE TABLE `アアア`(`キキキ` char(5)) DEFAULT CHARSET = utf8;
SHOW TABLES;
SHOW CREATE TABLE `アアア`;
INSERT INTO `アアア` VALUES ("Rafal");
SELECT * FROM `アアア`;
BACKUP DATABASE test TO "test.bak";
DROP DATABASE test;
SET NAMES latin1;
RESTORE FROM "test.bak";
SELECT @@character_set_client;
SELECT @@character_set_results;
SELECT @@character_set_connection;
SHOW TABLES IN test;
SET NAMES utf8;
SHOW TABLES IN test;
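To make the internal representation mentioned above more concrete, here is a rough sketch of how a name such as 'test/@ff71@ff71@ff71' could be derived. It is a simplification and not the server's build_table_filename() logic: it assumes that ASCII letters, digits and '_' pass through unchanged and that every other character becomes '@' followed by its code point as four lowercase hex digits. The function names encode_name_sketch() and internal_name_sketch() are hypothetical, and the real filename character set has additional mappings that are not modelled here.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper, NOT the server's implementation.
// Assumption: ASCII letters, digits and '_' are kept as-is; any other
// code point is written as '@' plus its value in lowercase hex,
// zero-padded to at least four digits.
std::string encode_name_sketch(const std::u32string &name)
{
  std::string out;
  for (char32_t cp : name)
  {
    const bool plain = (cp >= U'0' && cp <= U'9') ||
                       (cp >= U'A' && cp <= U'Z') ||
                       (cp >= U'a' && cp <= U'z') ||
                       cp == U'_';
    if (plain)
    {
      out.push_back(static_cast<char>(cp));
    }
    else
    {
      char buf[16];
      std::snprintf(buf, sizeof buf, "@%04x", static_cast<unsigned>(cp));
      out += buf;
    }
  }
  return out;
}

// Joins the encoded database and table names with '/', which is the
// shape of the example name quoted in this report.
std::string internal_name_sketch(const std::u32string &db,
                                 const std::u32string &table)
{
  assert(!db.empty() && !table.empty());
  return encode_name_sketch(db) + "/" + encode_name_sketch(table);
}
```

Calling internal_name_sketch(U"test", U"\uFF71\uFF71\uFF71") yields an ASCII-only string, where U+FF71 is the code point implied by the '@ff71' in the expected name. Because the result contains only ASCII characters, it does not depend on the connection character set in effect when it is used.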
[14 Dec 2007 16:19]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

  http://lists.mysql.com/commits/40014

ChangeSet@1.2752, 2007-12-14 17:17:32+01:00, rafal@quant.(none) +5 -0

  BUG#33023 (Table name mangling).

  This patch defines Table_ref::internal_name() method for getting an internal, character set independent string identifying given table. This string is in the format expected by storage engines.

  Apart from that, two more issues are fixed:

  - When constructing a DROP statement (used for dropping objects during restore), the name of the object is quoted as necessary.
  - The default character sets of the connection executing RESTORE command are set to system's default (utf8) which is used in the queries creating the objects. Without that, restore was failing if non-standard characters were used in object names and the default character set was different from utf8.
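The quoting of object names in a generated DROP statement can be illustrated with a short sketch. It is not the code from the patch. It only applies the usual MySQL rule that an identifier is wrapped in backticks and any embedded backtick is doubled. The names quote_identifier_sketch() and drop_table_statement_sketch() are hypothetical, and the input is assumed to be UTF-8, where the backtick byte never occurs inside a multi-byte sequence.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: wrap an identifier in backticks and double any
// backtick it contains. Assumes UTF-8 input, so scanning byte by byte
// cannot split a multi-byte character.
std::string quote_identifier_sketch(const std::string &name)
{
  assert(!name.empty());
  std::string out = "`";
  for (char c : name)
  {
    if (c == '`')
      out += "``";      // escape an embedded backtick by doubling it
    else
      out.push_back(c);
  }
  out.push_back('`');
  return out;
}

// Hypothetical helper: build a DROP TABLE statement from quoted parts.
std::string drop_table_statement_sketch(const std::string &db,
                                        const std::string &table)
{
  return "DROP TABLE " + quote_identifier_sketch(db) + "." +
         quote_identifier_sketch(table);
}
```

With quoting applied, a table name containing spaces, backticks or non-ASCII characters remains a single identifier in the generated statement.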
[14 Dec 2007 16:35]
Rafal Somla
There are small differences between the proposed solution and what the patch implements:

- The Table_ref::Iname_buf type is named Table_ref::name_buf. The same type is used for the Table_ref::internal_name() and Table_ref::describe() methods.
- Method Table_ref::internal_name() returns a pointer to the resulting string, not its length.
[22 Apr 2008 9:36]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

  http://lists.mysql.com/commits/45805

ChangeSet@1.2612, 2008-04-22 11:35:52+02:00, rafal@quant.(none) +11 -0

  BUG#33023 (Online backup could mangle object names if non-std charset is used)

  The following problems are fixed by this patch:

  1. If during restore a connection character set different than at backup time was used, object names were wrongly interpreted, leading to failures.
  2. Character set and collation settings associated with a view were not stored in backup image and not restored correctly.
  3. When errors were detected during restore of table data, the Backup_restore_ctx object was not correctly set to error state.
  4. Tables in the list passed from backup kernel to backup/restore drivers were not identified using the same convention as used by storage engines. The internal table name representation uses only US-ASCII characters and can handle names written using any character set.

  The solutions are as follows.

  Ad 1) Charset settings are changed to system defaults inside obs::Obj::execute() method and restored to previous values at the end.
  Ad 2) obs::TableObj::serialize() method is modified to prepend "SET CHARACTER_SET_CLIENT" and "SET COLLATION_CONNECTION" statements in front of view's serialization string.
  Ad 3) Backup_restore_ctx::fatal_error() method is used for reporting errors which interrupt restore process.
  Ad 4) Method internal_name() is added to backup::Table_ref(). It can be used by backup/restore drivers to obtain internal table name representation.
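The approach in Ad 1), switching the charset settings to system defaults for the duration of a call and restoring them afterwards, maps naturally onto a scope guard. The sketch below is only an illustration of that pattern and not the code in obs::Obj::execute(). The types Session_charsets_sketch and Charset_guard_sketch, the function execute_with_defaults_sketch(), and the choice of utf8_general_ci as the default collation are all assumptions made for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the per-connection charset settings.
struct Session_charsets_sketch
{
  std::string character_set_client;
  std::string character_set_results;
  std::string collation_connection;
};

// Hypothetical scope guard: installs the defaults on construction and
// puts the saved settings back on destruction, including on early return.
class Charset_guard_sketch
{
public:
  Charset_guard_sketch(Session_charsets_sketch &session,
                       Session_charsets_sketch defaults)
    : m_session(session), m_saved(session)   // copy the current settings
  {
    m_session = std::move(defaults);
  }

  ~Charset_guard_sketch() { m_session = std::move(m_saved); }

  Charset_guard_sketch(const Charset_guard_sketch &) = delete;
  Charset_guard_sketch &operator=(const Charset_guard_sketch &) = delete;

private:
  Session_charsets_sketch &m_session;
  Session_charsets_sketch m_saved;
};

// Hypothetical stand-in for executing a statement: it only reports
// which client character set the statement would run under.
std::string execute_with_defaults_sketch(Session_charsets_sketch &session)
{
  // Assumed defaults: utf8 with the utf8_general_ci collation.
  Charset_guard_sketch guard(session, {"utf8", "utf8", "utf8_general_ci"});
  return session.character_set_client;
}
```

Tying the restore step to the destructor means the previous settings come back on every exit path, which matters when an error interrupts execution partway through.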
[22 Apr 2008 9:39]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

  http://lists.mysql.com/commits/45806

ChangeSet@1.2612, 2008-04-22 11:37:50+02:00, rafal@quant.(none) +13 -0

  BUG#33023 (Online backup could mangle object names if non-std charset is used)

  The following problems are fixed by this patch:

  1. If during restore a connection character set different than at backup time was used, object names were wrongly interpreted, leading to failures.
  2. Character set and collation settings associated with a view were not stored in backup image and not restored correctly.
  3. When errors were detected during restore of table data, the Backup_restore_ctx object was not correctly set to error state.
  4. Tables in the list passed from backup kernel to backup/restore drivers were not identified using the same convention as used by storage engines. The internal table name representation uses only US-ASCII characters and can handle names written using any character set.

  The solutions are as follows.

  Ad 1) Charset settings are changed to system defaults inside obs::Obj::execute() method and restored to previous values at the end.
  Ad 2) obs::TableObj::serialize() method is modified to prepend "SET CHARACTER_SET_CLIENT" and "SET COLLATION_CONNECTION" statements in front of view's serialization string.
  Ad 3) Backup_restore_ctx::fatal_error() method is used for reporting errors which interrupt restore process.
  Ad 4) Method internal_name() is added to backup::Table_ref(). It can be used by backup/restore drivers to obtain internal table name representation.
[22 Apr 2008 9:41]
Rafal Somla
Since backup kernel has changed considerably, a new patch for this bug had to be created. Please review the new patch.
[22 Apr 2008 9:49]
Rafal Somla
BUG#33022 is a duplicate of this one.
[5 May 2008 13:54]
Chuck Bell
Patch approved.
[5 May 2008 15:05]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from:

  http://lists.mysql.com/commits/46356

ChangeSet@1.2612, 2008-05-05 17:03:24+02:00, rafal@quant.(none) +13 -0

  BUG#33023 (Online backup could mangle object names if non-std charset is used)

  The following problems are fixed by this patch:

  1. If during restore a connection character set different than at backup time was used, object names were wrongly interpreted, leading to failures.
  2. Character set and collation settings associated with a view were not stored in backup image and not restored correctly.
  3. When errors were detected during restore of table data, the Backup_restore_ctx object was not correctly set to error state.
  4. Tables in the list passed from backup kernel to backup/restore drivers were not identified using the same convention as used by storage engines. The internal table name representation uses only US-ASCII characters and can handle names written using any character set.

  The solutions are as follows.

  Ad 1) Charset settings are changed to system defaults inside obs::Obj::execute() method and restored to previous values at the end.
  Ad 2) obs::TableObj::serialize() method is modified to prepend "SET CHARACTER_SET_CLIENT" and "SET COLLATION_CONNECTION" statements in front of view's serialization string.
  Ad 3) Backup_restore_ctx::fatal_error() method is used for reporting errors which interrupt restore process.
  Ad 4) Method internal_name() is added to backup::Table_ref(). It can be used by backup/restore drivers to obtain internal table name representation.
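The idea in Ad 2), storing a view's charset context by placing SET statements ahead of its definition, can be sketched as follows. This is an illustration only and not the serialize() code from the patch. The function serialize_view_sketch() is hypothetical, and returning the statements as a list (rather than one string with a particular separator) is an assumption made to avoid guessing the actual image format.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: return the statements that would recreate a view
// under the charset context it was created with. The two SET statements
// come first so that the definition is parsed with the right settings.
std::vector<std::string> serialize_view_sketch(
    const std::string &create_view_stmt,
    const std::string &client_charset,
    const std::string &connection_collation)
{
  assert(!create_view_stmt.empty());
  return {
    "SET CHARACTER_SET_CLIENT = " + client_charset,
    "SET COLLATION_CONNECTION = " + connection_collation,
    create_view_stmt
  };
}
```

On restore, replaying the statements in order would first establish the recorded character set and collation and only then create the view, so the view text is not reinterpreted under whatever settings the restoring connection happens to use.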
[1 Sep 2008 14:01]
Rafal Somla
Updating status, as this patch has already been pushed into the main 6.0.7 tree.
[2 Sep 2008 18:29]
Paul DuBois
Noted in 6.0.7 changelog. BACKUP DATABASE followed by RESTORE could mangle object names if a non-standard charset was used.
[14 Sep 2008 5:05]
Bugs System
Pushed into 6.0.7-alpha (revid:sp1r-rafal@quant.(none)-20080505150324-05691) (version source revid:john.embretsen@sun.com-20080724122511-9c0oudz1xrdrs6y6) (pib:3)