Bug #16298 | InnoDB segfaults in INSERTs in upgrade of 4.0 -> 5.0 tables with VARCHAR BINARY | ||
---|---|---|---|
Submitted: | 9 Jan 2006 11:19 | Modified: | 18 Jun 2010 2:03 |
Reporter: | Anders Henke | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server: InnoDB storage engine | Severity: | S1 (Critical) |
Version: | 5.0.18 | OS: | Linux (Linux) |
Assigned to: | Marko Mäkelä | CPU Architecture: | Any |
[9 Jan 2006 11:19]
Anders Henke
[9 Jan 2006 17:38]
Heikki Tuuri
Hi! I am assigning Marko to fix this. The bug is that in MySQL-4.0, InnoDB did not store mbminlen and mbmaxlen to dtype. The assertion should be removed. A similar wrong debug assertion is in row0sel.c. The bug has been in 5.0 at least for a year. The bug is not present in 4.1. Thank you, Heikki row0mysql.c: /* Remove trailing spaces from old style VARCHAR columns. */ /* Handle UCS2 strings differently. */ ulint mbminlen = dtype_get_mbminlen(dtype); ptr = mysql_data; if (mbminlen == 2) { /* space=0x0020 */ /* Trim "half-chars", just in case. */ col_len &= ~1; while (col_len >= 2 && ptr[col_len - 2] == 0x00 && ptr[col_len - 1] == 0x20) { col_len -= 2; } } else { ut_a(mbminlen == 1); /* space=0x20 */ while (col_len > 0 && ptr[col_len - 1] == 0x20) { col_len--; } } row0sel.c: } else { ut_ad(templ->mbminlen == 1); /* space=0x20 */ memset(pad_ptr, 0x20, field_end - pad_ptr); }
[9 Jan 2006 21:26]
Marko Mäkelä
I tried creating a table with a VARCHAR(255) column in MySQL 4.0.26 and updating it in 5.0, but it succeeded. Could we please have the CREATE TABLE statement? The table and column names can be mangled if they are sensitive; I'm only after the data type definitions.
[9 Jan 2006 21:32]
Dimitrij HIlt
Hi Marko, here is a 'table create' statement: mysql> show create table Reply\G *************************** 1. row *************************** Table: Reply Create Table: CREATE TABLE `Reply` ( `user_name` varchar(255) binary default NULL, `huntgroup_id` int(11) NOT NULL default '0', `attribute_id` int(11) NOT NULL default '0', `value` varchar(255) NOT NULL default '', `timestamp` datetime NOT NULL default '0000-00-00 00:00:00', KEY `user_name_idx` (`user_name`), KEY `Reply_idx_timestamp` (`timestamp`) ) TYPE=InnoDB 1 row in set (0.18 sec) mysql> Regards, Dimi
[10 Jan 2006 1:04]
Heikki Tuuri
Marko, 5.0.xx looks at the charset-collation code that we stored to the 'precise type' in the InnoDB internal data dictionary, and then uses a MySQL function to get the mbminlen and mbmaxlen for the charset. But at least 4.0.12 did not store any character set - collation code at all there. 5.0.xx then thinks that the charset number is 0. I tested with 5.0.16: (gdb) print all_charsets[0] $5 = (CHARSET_INFO *) 0x0 The function returns 0 as mbminlen and mbmaxlen. --- Hmm... but how could an upgrade 4.0 -> 4.1 ever succeed if the server used a non-latin1_swedish_cs default charset-collation? How could InnoDB sort the values at all, if it did not know charset? Maybe all non-latin1_swedish_cs users have rebuilt their tables in the upgrade: http://dev.mysql.com/doc/refman/4.1/en/upgrading-from-4-0.html Function get_charset() in 4.1.16 returns NULL if the charset number is 0. Regards, Heikki ha_innodb.cc in 4.0.12: /********************************************************************* Creates a table definition to an InnoDB database. */ static int create_table_def( /*=============*/ trx_t* trx, /* in: InnoDB transaction handle */ TABLE* form, /* in: information on table columns and indexes */ const char* table_name) /* in: table name */ { Field* field; dict_table_t* table; ulint n_cols; int error; ulint col_type; ulint nulls_allowed; ulint unsigned_type; ulint i; DBUG_ENTER("create_table_def"); DBUG_PRINT("enter", ("table_name: %s", table_name)); n_cols = form->fields; /* The '0' below specifies that everything is currently created in tablespace 0 */ table = dict_mem_table_create((char*) table_name, 0, n_cols); for (i = 0; i < n_cols; i++) { field = form->field[i]; col_type = get_innobase_type_from_mysql_type(field); if (field->null_ptr) { nulls_allowed = 0; } else { nulls_allowed = DATA_NOT_NULL; } if (field->flags & UNSIGNED_FLAG) { unsigned_type = DATA_UNSIGNED; } else { unsigned_type = 0; } dict_mem_table_add_col(table, (char*) field->field_name, col_type, (ulint)field->type() | nulls_allowed | unsigned_type, field->pack_length(), 0); } error = row_create_table_for_mysql(table, trx); error = convert_error_code_to_mysql(error, NULL); DBUG_RETURN(error); } ha_innodb.cc in 5.0.16: /********************************************************************** Get the variable length bounds of the given character set. NOTE that the exact prototype of this function has to be in /innobase/data/data0type.ic! */ extern "C" void innobase_get_cset_width( /*====================*/ ulint cset, /* in: MySQL charset-collation code */ ulint* mbminlen, /* out: minimum length of a char (in bytes) */ ulint* mbmaxlen) /* out: maximum length of a char (in bytes) */ { CHARSET_INFO* cs; ut_ad(cset < 256); ut_ad(mbminlen); ut_ad(mbmaxlen); cs = all_charsets[cset]; if (cs) { *mbminlen = cs->mbminlen; *mbmaxlen = cs->mbmaxlen; } else { ut_a(cset == 0); *mbminlen = *mbmaxlen = 0; } } ha_innodb.cc in 4.1.16: switch (mysql_tp) { case FIELD_TYPE_STRING: case FIELD_TYPE_VAR_STRING: case FIELD_TYPE_TINY_BLOB: case FIELD_TYPE_MEDIUM_BLOB: case FIELD_TYPE_BLOB: case FIELD_TYPE_LONG_BLOB: /* Use the charset number to pick the right charset struct for the comparison. Since the MySQL function get_charset may be slow before Bar removes the mutex operation there, we first look at 2 common charsets directly. */ if (charset_number == default_charset_info->number) { charset = default_charset_info; } else if (charset_number == my_charset_latin1.number) { charset = &my_charset_latin1; } else { charset = get_charset(charset_number, MYF(MY_WME)); if (charset == NULL) { fprintf(stderr, "InnoDB: fatal error: InnoDB needs charset %lu for doing a comparison,\n" "InnoDB: but MySQL cannot find that charset.\n", (ulong)charset_number); ut_a(0); } }
[10 Jan 2006 1:13]
Heikki Tuuri
Anders, Dimitrij, did you use latin1 or latin1_de as the default charset in the 4.0 server? Regards, Heikki
[10 Jan 2006 2:10]
Heikki Tuuri
All, I found out that InnoDB does use default_charset_info->number as the charset-collation for tables created with 4.0 or earlier. That is why an upgrade 4.0 -> 4.1 could work in the first place. Anders, did you change the default charset of the server in the upgrade? Does your upgraded server contain the default charset of 4.0? If MySQL thinks that the default charset is 0, that would explain how the assertion could fail in Anders' case. Regards, Heikki ha_innodb.cc in 5.0.16: data_mysql_default_charset_coll = (ulint)default_charset_info->number; dict0load.c in 5.0.16: if (dtype_is_non_binary_string_type(mtype, prtype) && dtype_get_charset_coll(prtype) == 0) { /* This is a non-binary string type, and the table was created with < 4.1.2. Use the default charset. */ prtype = dtype_form_prtype(prtype, data_mysql_default_charset_coll); } ha_innodb.cc in 5.0.16: /********************************************************************** Get the variable length bounds of the given character set. NOTE that the exact prototype of this function has to be in /innobase/data/data0type.ic! */ extern "C" void innobase_get_cset_width( /*====================*/ ulint cset, /* in: MySQL charset-collation code */ ulint* mbminlen, /* out: minimum length of a char (in bytes) */ ulint* mbmaxlen) /* out: maximum length of a char (in bytes) */ { CHARSET_INFO* cs; ut_ad(cset < 256); ut_ad(mbminlen); ut_ad(mbmaxlen); cs = all_charsets[cset]; if (cs) { *mbminlen = cs->mbminlen; *mbmaxlen = cs->mbmaxlen; } else { ut_a(cset == 0); *mbminlen = *mbmaxlen = 0; } }
[10 Jan 2006 2:19]
Heikki Tuuri
Hi! Ok, the mystery solved! A VARCHAR was defined as BINARY under 4.0. In that case, dict0load.c does NOT put the server default charset to the 'precise type'. The charset code is then 0. Marko, please check all places where mbminlen and mbmaxlen are used. Can the above break something also in other places? Regards, Heikki
[10 Jan 2006 8:34]
Dimitrij HIlt
Hi all, in our MySQL 4.0 is latin1 as default character_set defined. Dimi
[10 Jan 2006 10:07]
Marko Mäkelä
Verified by running the following two statements in 4.0.26 and then the last one in 5.0: CREATE TABLE `Reply` ( `user_name` varchar(255) binary default NULL, `huntgroup_id` int(11) NOT NULL default '0', `attribute_id` int(11) NOT NULL default '0', `value` varchar(255) NOT NULL default '', `timestamp` datetime NOT NULL default '0000-00-00 00:00:00', KEY `user_name_idx` (`user_name`), KEY `Reply_idx_timestamp` (`timestamp`) ) TYPE=InnoDB; INSERT INTO Reply VALUES(); UPDATE Reply SET user_name='fubar'; It looks like Heikki is right: the two assertions about mbminlen==1 can be safely removed. I'm still studying the uses of mbminlen and mbmaxlen before submitting a fix.
[10 Jan 2006 10:51]
Anders Henke
Sorry to correct Dimitrij, but in our 4.0-based servers we've always had german1 defined as default charset, while 4.1 and 5.0-based servers do use latin1 with collition latin1_german2_ci. german1 is basically latin1 with a different sorting order, so the difference is marginal, but still there.
[10 Jan 2006 11:25]
Heikki Tuuri
Marko, a better fix would be to give a charset also for old 'binary' type columns in dict0load.c? It is easy to forget that 4.0 DATA_BINARY etc. do not have a valid charset in InnoDB, and that can cause further bugs in the future. Regards, Heikki
[10 Jan 2006 11:43]
Marko Mäkelä
In row_ins_cascade_calc_update_vec() in row0ins.c, there is an ut_error assertion when mbminlen is not 1 or 2. I would guess that we should pad those old 4.0 columns (for which mbminlen=mbmaxlen=0) with 0x20 bytes. Also some assertions in row_sel_field_store_in_mysql_format() and one assertion in row_sel_store_mysql_rec() may fail. A simple fix would be to initialize mbminlen and mbmaxlen to 1 by default instead of 0.
[10 Jan 2006 12:35]
Marko Mäkelä
Preliminary patch to fix Bug #16298 in MySQL 5.0
Attachment: bug16298.patch (text/x-patch), 986 bytes.
[10 Jan 2006 12:38]
Marko Mäkelä
The preliminary patch that I just attached to this bug report initializes the charset-collation code of binary columns with unknown charset-collation to 63, the binary collation. This should ensure that mbminlen and mbmaxlen will be initialized to 1. (The fields will still remain 0 for non-string types, allowing the assertions to fail in case other than string columns are subjected to the trimming or padding of spaces.)
[10 Jan 2006 15:10]
Heikki Tuuri
Marko, I would prefer only setting the charset numebr to 63 if dtype_is_binary_string_type() is TRUE. We do not want to define a 'charset' for an INT column, for example. Regards, Heikki
[11 Jan 2006 9:04]
Marko Mäkelä
Improved preliminary patch to fix Bug #16298
Attachment: bug16298-2.patch (text/x-patch), 1.07 KiB.
[11 Jan 2006 9:07]
Marko Mäkelä
Heikki, I'm sorry, it was an oversight on my part. Like I wrote in the comment after the first patch, setting mbminlen=mbmaxlen=0 for non-string columns provides additional safety. I noticed that the definition of the symbol DATA_MYSQL_BINARY_CHARSET_COLL has not yet made it into the MySQL source repository. Those users who might want to try out the patch should substitute it with 63.
[31 Jan 2006 18:51]
Alexander Ivanov
Fixed in 5.0.19. Patch: http://lists.mysql.com/commits/1958.
[2 Feb 2006 13:26]
Jon Stephens
Thank you for your bug report. This issue has been committed to our source repository of that product and will be incorporated into the next release. If necessary, you can access the source repository and build the latest available version, including the bugfix, yourself. More information about accessing the source trees is available at http://www.mysql.com/doc/en/Installing_source_tree.html Additional info: Documented bugfix in 5.0.19 changelog. Closed.
[5 May 2010 15:27]
Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 1:49]
Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[28 May 2010 5:59]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:27]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:55]
Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 2:44]
Paul DuBois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[17 Jun 2010 12:02]
Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:43]
Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:29]
Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)