Bug #16298 InnoDB segfaults in INSERTs in upgrade of 4.0 -> 5.0 tables with VARCHAR BINARY
Submitted: 9 Jan 2006 11:19 Modified: 18 Jun 2010 2:03
Reporter: Anders Henke Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S1 (Critical)
Version:5.0.18 OS:Linux (Linux)
Assigned to: Marko Mäkelä CPU Architecture:Any

[9 Jan 2006 11:19] Anders Henke
Description:
After upgrading a replication slave from 4.0.25 to 5.0.18, several Insert/Update/Delete operations on innodb table space lead to a segfault.

Issueing 'alter table xyz type=innodb' recreates the table to a new one', afterwards the same operations affecting the same table do no longer crash (yet other, unconverted tables still do - so I've tagged this bug as S2/Serious instead of S1/Critical).

Bug #15550 states that a similar issue has been fixed in 5.0.18, so I guess that this issue has not entirely been fixed.

How to repeat:
Create Innodb-based tables in MySQL 4.0 series (latest server 4.0.25), use
those tables for two years :-), afterwards upgrade to 5.0.18.

In my case, I'm upgrading a replication slave:
-shutdown 4.0.25
-start up 5.0.18 with skip-slave-start
-upon startup, Innodb upgrades innodb tablespace:

InnoDB: You are upgrading to an InnoDB version which allows multiple
InnoDB: tablespaces. Wait that purge and insert buffer merge run to
InnoDB: completion...
InnoDB: Full purge and insert buffer merge completed.
InnoDB: You have now successfully upgraded to the multiple tablespaces
InnoDB: format. You should NOT DOWNGRADE to an earlier version of
InnoDB: InnoDB! But if you absolutely need to downgrade, see
InnoDB: http://dev.mysql.com/doc/mysql/en/Multiple_tablespaces.html
InnoDB: for instructions.

-manually upgrade grant tables in 5.0.18 using fix_mysql_privileges.sql
-restart 5.0.18 (still with skip-slave-start enabled) to activate the new grant tables.
-remember values from SHOW SLAVE STATUS.
-RESET SLAVE;
-CHANGE MASTER TO ... $remembered values
-START SLAVE IO_THREAD fetches all required master logs successfully.

After issueing START SLAVE SQL_THREAD, one of the first insert/update/delete operions replayed from the replication logs leads to a segfault (reproducable) in innodb code. After issueing a 'ALTER TABLE xyz TYPE=INNODB', replication replay continues to a different insert/update/delete operation affecting a different table. It is notable that not every Insert/Update/Delete operation leads to a segfault, only a fraction (I guess somewhere between 10-20%) do. However, that's still lethal enough for replication.

Used 4.0.25 is the official MySQL rpm, 
5.0.18 the official "Linux (x86, glibc-2.2, static (Standard only), gcc)"-tgz.

060109 12:01:17 [Note] Slave I/O thread: connected to master 'repl@dbxyz.schlund.de:3306',  replication started in log 'binlog.13446' at position 41860247
060109 12:01:18InnoDB: Assertion failure in thread 2668686256 in file row0mysql.c line 341
InnoDB: Failing assertion: mbminlen == 1
InnoDB: We intentionally generate a memory trap.
InnoDB: Submit a detailed bug report to http://bugs.mysql.com.
InnoDB: If you get repeated assertion failures or crashes, even
InnoDB: immediately after the mysqld startup, there may be
InnoDB: corruption in the InnoDB tablespace. Please refer to
InnoDB: http://dev.mysql.com/doc/mysql/en/Forcing_recovery.html
InnoDB: about forcing recovery.
mysqld got signal 11;

Decoded Stacktrace from 5.0.18:

0x809f182 handle_segfault + 426
0x82dceb8 pthread_sighandler + 184
0x81b61df row_mysql_store_col_in_innobase_format + 287
0x81b63b4 row_mysql_convert_row_to_innobase + 120
0x81b70f1 row_insert_for_mysql + 329
0x8139b12 write_row__11ha_innobasePc + 890
0x80f8609 write_record__FP3THDP8st_tableP12st_copy_info + 1221
0x80f7762 mysql_insert__FP3THDP13st_table_listRt4List1Z4ItemRt4List1Zt4List1Z4It
emT2T215enum_duplicatesb + 1682
0x80b235e mysql_execute_command__FP3THD + 7738
0x80b7378 mysql_parse__FP3THDPcUi + 288
0x810b506 exec_event__15Query_log_eventP17st_relay_log_infoPCcUi + 866
0x810b19f exec_event__15Query_log_eventP17st_relay_log_info + 27
0x81661c9 exec_relay_log_event__FP3THDP17st_relay_log_info + 729
0x816700a handle_slave_sql + 798
0x82da66c pthread_start_thread + 220
0x8303faa thread_start + 4

The position/stacktrace is the same for all seen queries; the bug does not occur on
all operations, but on about 10-20% of all I/U/D operations on table space (and reproducably the same queries).

Suggested fix:
It's likely to be some format issue within innodb table space, as the 
'alter table xyz type=innob' does "only" re-create the same table freshly,
yet resolves the issue.

I've also setup a second replication slave using mysqldumps, this slave has been working fine for the full weekend, so it's really some on-disk-format issue.

We also used the backports.org-Package of MySQL 5.0.18 (to get a "differently compiled") release and repeated that issue on a different replication setup, the same error occurs and the stacktrace there decodes to the same issue like quoted above.

Maybe recheck that "compact" thing from bug 15550, but maybe it's a completely different issue..
[9 Jan 2006 17:38] Heikki Tuuri
Hi!

I am assigning Marko to fix this. The bug is that in MySQL-4.0, InnoDB did not store mbminlen and mbmaxlen to dtype. The assertion should be removed. A similar wrong debug assertion is in row0sel.c.

The bug has been in 5.0 at least for a year. The bug is not present in 4.1.

Thank you,

Heikki

row0mysql.c:
                        /* Remove trailing spaces from old style VARCHAR
                        columns. */

                        /* Handle UCS2 strings differently. */
                        ulint   mbminlen        = dtype_get_mbminlen(dtype);

                        ptr = mysql_data;

                        if (mbminlen == 2) {
                                /* space=0x0020 */
                                /* Trim "half-chars", just in case. */
                                col_len &= ~1;

                                while (col_len >= 2 && ptr[col_len - 2] == 0x00
                                                && ptr[col_len - 1] == 0x20) {
                                        col_len -= 2;
                                }
                        } else {
                                ut_a(mbminlen == 1);
                                /* space=0x20 */
                                while (col_len > 0
                                                && ptr[col_len - 1] == 0x20) {
                                        col_len--;
                                }
                        }

row0sel.c:
                } else {
                        ut_ad(templ->mbminlen == 1);
                        /* space=0x20 */

                        memset(pad_ptr, 0x20, field_end - pad_ptr);
                }
[9 Jan 2006 21:26] Marko Mäkelä
I tried creating a table with a VARCHAR(255) column in MySQL 4.0.26 and updating it in 5.0, but it succeeded. Could we please have the CREATE TABLE statement? The table and column names can be mangled if they are sensitive; I'm only after the data type definitions.
[9 Jan 2006 21:32] Dimitrij HIlt
Hi Marko,

here is a 'table create' statement:
mysql> show create table Reply\G
*************************** 1. row ***************************
       Table: Reply
Create Table: CREATE TABLE `Reply` (
  `user_name` varchar(255) binary default NULL,
  `huntgroup_id` int(11) NOT NULL default '0',
  `attribute_id` int(11) NOT NULL default '0',
  `value` varchar(255) NOT NULL default '',
  `timestamp` datetime NOT NULL default '0000-00-00 00:00:00',
  KEY `user_name_idx` (`user_name`),
  KEY `Reply_idx_timestamp` (`timestamp`)
) TYPE=InnoDB
1 row in set (0.18 sec)

mysql>

Regards,

Dimi
[10 Jan 2006 1:04] Heikki Tuuri
Marko,

5.0.xx looks at the charset-collation code that we stored to the 'precise type' in the InnoDB internal data dictionary, and then uses a MySQL function to get the mbminlen and mbmaxlen for the charset.

But at least 4.0.12 did not store any character set - collation code at all there. 5.0.xx then thinks that the charset number is 0. 

I tested with 5.0.16:

(gdb) print all_charsets[0]
$5 = (CHARSET_INFO *) 0x0

The function returns 0 as mbminlen and mbmaxlen.

---

Hmm... but how could an upgrade 4.0 -> 4.1 ever succeed if the server used a non-latin1_swedish_cs default charset-collation? How could InnoDB sort the values at all, if it did not know charset? Maybe all non-latin1_swedish_cs users have rebuilt their tables in the upgrade:

http://dev.mysql.com/doc/refman/4.1/en/upgrading-from-4-0.html

Function get_charset() in 4.1.16 returns NULL if the charset number is 0.

Regards,

Heikki

ha_innodb.cc in 4.0.12:

/*********************************************************************
Creates a table definition to an InnoDB database. */
static
int
create_table_def(
/*=============*/
	trx_t*		trx,		/* in: InnoDB transaction handle */
	TABLE*		form,		/* in: information on table
					columns and indexes */
	const char*	table_name)	/* in: table name */
{
	Field*		field;
	dict_table_t*	table;
	ulint		n_cols;
  	int 		error;
  	ulint		col_type;
  	ulint		nulls_allowed;
	ulint		unsigned_type;
  	ulint		i;

  	DBUG_ENTER("create_table_def");
  	DBUG_PRINT("enter", ("table_name: %s", table_name));

	n_cols = form->fields;

	/* The '0' below specifies that everything is currently
	created in tablespace 0 */

	table = dict_mem_table_create((char*) table_name, 0, n_cols);

	for (i = 0; i < n_cols; i++) {
		field = form->field[i];

		col_type = get_innobase_type_from_mysql_type(field);
		if (field->null_ptr) {
			nulls_allowed = 0;
		} else {
			nulls_allowed = DATA_NOT_NULL;
		}

		if (field->flags & UNSIGNED_FLAG) {
			unsigned_type = DATA_UNSIGNED;
		} else {
			unsigned_type = 0;
		}

		dict_mem_table_add_col(table, (char*) field->field_name,
					col_type, (ulint)field->type()
					| nulls_allowed | unsigned_type,
					field->pack_length(), 0);
	}

	error = row_create_table_for_mysql(table, trx);

	error = convert_error_code_to_mysql(error, NULL);

	DBUG_RETURN(error);
}

ha_innodb.cc in 5.0.16:

/**********************************************************************
Get the variable length bounds of the given character set.

NOTE that the exact prototype of this function has to be in
/innobase/data/data0type.ic! */
extern "C"
void
innobase_get_cset_width(
/*====================*/
        ulint   cset,           /* in: MySQL charset-collation code */
        ulint*  mbminlen,       /* out: minimum length of a char (in bytes) */
        ulint*  mbmaxlen)       /* out: maximum length of a char (in bytes) */
{
        CHARSET_INFO*   cs;
        ut_ad(cset < 256);
        ut_ad(mbminlen);
        ut_ad(mbmaxlen);

        cs = all_charsets[cset];
        if (cs) {
                *mbminlen = cs->mbminlen;
                *mbmaxlen = cs->mbmaxlen;
        } else {
                ut_a(cset == 0);
                *mbminlen = *mbmaxlen = 0;
        }
}

ha_innodb.cc in 4.1.16:

        switch (mysql_tp) {

        case FIELD_TYPE_STRING:
        case FIELD_TYPE_VAR_STRING:
        case FIELD_TYPE_TINY_BLOB:
        case FIELD_TYPE_MEDIUM_BLOB:
        case FIELD_TYPE_BLOB:
        case FIELD_TYPE_LONG_BLOB:
                /* Use the charset number to pick the right charset struct for
                the comparison. Since the MySQL function get_charset may be
                slow before Bar removes the mutex operation there, we first
                look at 2 common charsets directly. */

                if (charset_number == default_charset_info->number) {
                        charset = default_charset_info;
                } else if (charset_number == my_charset_latin1.number) {
                        charset = &my_charset_latin1;
                } else {
                        charset = get_charset(charset_number, MYF(MY_WME));

                        if (charset == NULL) {
                                fprintf(stderr,
"InnoDB: fatal error: InnoDB needs charset %lu for doing a comparison,\n"
"InnoDB: but MySQL cannot find that charset.\n", (ulong)charset_number);
                                ut_a(0);
                        }
                }
[10 Jan 2006 1:13] Heikki Tuuri
Anders, Dimitrij,

did you use latin1 or latin1_de as the default charset in the 4.0 server?

Regards,

Heikki
[10 Jan 2006 2:10] Heikki Tuuri
All,

I found out that InnoDB does use default_charset_info->number as the charset-collation for tables created with 4.0 or earlier. That is why an upgrade 4.0 -> 4.1 could work in the first place.

Anders, did you change the default charset of the server in the upgrade?

Does your upgraded server contain the default charset of 4.0?

If MySQL thinks that the default charset is 0, that would explain how the assertion could fail in Anders' case.

Regards,

Heikki

ha_innodb.cc in 5.0.16:

        data_mysql_default_charset_coll = (ulint)default_charset_info->number;

dict0load.c in 5.0.16:

                if (dtype_is_non_binary_string_type(mtype, prtype)
                    && dtype_get_charset_coll(prtype) == 0) {
                        /* This is a non-binary string type, and the table
                        was created with < 4.1.2. Use the default charset. */

                        prtype = dtype_form_prtype(prtype,
                                        data_mysql_default_charset_coll);
                }

ha_innodb.cc in 5.0.16:

/**********************************************************************
Get the variable length bounds of the given character set.

NOTE that the exact prototype of this function has to be in
/innobase/data/data0type.ic! */
extern "C"
void
innobase_get_cset_width(
/*====================*/
        ulint   cset,           /* in: MySQL charset-collation code */
        ulint*  mbminlen,       /* out: minimum length of a char (in bytes) */
        ulint*  mbmaxlen)       /* out: maximum length of a char (in bytes) */
{
        CHARSET_INFO*   cs;
        ut_ad(cset < 256);
        ut_ad(mbminlen);
        ut_ad(mbmaxlen);

        cs = all_charsets[cset];
        if (cs) {
                *mbminlen = cs->mbminlen;
                *mbmaxlen = cs->mbmaxlen;
        } else {
                ut_a(cset == 0);
                *mbminlen = *mbmaxlen = 0;
        }
}
[10 Jan 2006 2:19] Heikki Tuuri
Hi!

Ok, the mystery solved! A VARCHAR was defined as BINARY under 4.0. In that case, dict0load.c does NOT put the server default charset to the 'precise type'. The charset code is then 0.

Marko, please check all places where mbminlen and mbmaxlen are used. Can the above break something also in other places?

Regards,

Heikki
[10 Jan 2006 8:34] Dimitrij HIlt
Hi all,

in our MySQL 4.0 is latin1 as default character_set defined.

Dimi
[10 Jan 2006 10:07] Marko Mäkelä
Verified by running the following two statements in 4.0.26 and then the last one in 5.0:

CREATE TABLE `Reply` (
  `user_name` varchar(255) binary default NULL,
  `huntgroup_id` int(11) NOT NULL default '0',
  `attribute_id` int(11) NOT NULL default '0',
  `value` varchar(255) NOT NULL default '',
  `timestamp` datetime NOT NULL default '0000-00-00 00:00:00',
  KEY `user_name_idx` (`user_name`),
  KEY `Reply_idx_timestamp` (`timestamp`)
) TYPE=InnoDB;

INSERT INTO Reply VALUES();

UPDATE Reply SET user_name='fubar';

It looks like Heikki is right: the two assertions about mbminlen==1 can be safely removed. I'm still studying the uses of mbminlen and mbmaxlen before submitting a fix.
[10 Jan 2006 10:51] Anders Henke
Sorry to correct Dimitrij, but in our 4.0-based servers we've always had german1
defined as default charset, while 4.1 and 5.0-based servers do use latin1 with 
collition latin1_german2_ci.

german1 is basically latin1 with a different sorting order, so the difference is marginal, but still there.
[10 Jan 2006 11:25] Heikki Tuuri
Marko,

a better fix would be to give a charset also for old 'binary' type columns in dict0load.c?

It is easy to forget that 4.0 DATA_BINARY etc. do not have a valid charset in InnoDB, and that can cause further bugs in the future.

Regards,

Heikki
[10 Jan 2006 11:43] Marko Mäkelä
In row_ins_cascade_calc_update_vec() in row0ins.c, there is an ut_error assertion when mbminlen is not 1 or 2. I would guess that we should pad those old 4.0 columns (for which mbminlen=mbmaxlen=0) with 0x20 bytes.

Also some assertions in row_sel_field_store_in_mysql_format() and one assertion in row_sel_store_mysql_rec() may fail.

A simple fix would be to initialize mbminlen and mbmaxlen to 1 by default instead of 0.
[10 Jan 2006 12:35] Marko Mäkelä
Preliminary patch to fix Bug #16298 in MySQL 5.0

Attachment: bug16298.patch (text/x-patch), 986 bytes.

[10 Jan 2006 12:38] Marko Mäkelä
The preliminary patch that I just attached to this bug report initializes the charset-collation code of binary columns with unknown charset-collation to 63, the binary collation. This should ensure that mbminlen and mbmaxlen will be initialized to 1. (The fields will still remain 0 for non-string types, allowing the assertions to fail in case other than string columns are subjected to the trimming or padding of spaces.)
[10 Jan 2006 15:10] Heikki Tuuri
Marko,

I would prefer only setting the charset numebr to 63 if dtype_is_binary_string_type() is TRUE. We do not want to define a 'charset' for an INT column, for example.

Regards,

Heikki
[11 Jan 2006 9:04] Marko Mäkelä
Improved preliminary patch to fix Bug #16298

Attachment: bug16298-2.patch (text/x-patch), 1.07 KiB.

[11 Jan 2006 9:07] Marko Mäkelä
Heikki,
I'm sorry, it was an oversight on my part. Like I wrote in the comment after the first patch, setting mbminlen=mbmaxlen=0 for non-string columns provides additional safety.

I noticed that the definition of the symbol DATA_MYSQL_BINARY_CHARSET_COLL has not yet made it into the MySQL source repository. Those users who might want to try out the patch should substitute it with 63.
[31 Jan 2006 18:51] Alexander Ivanov
Fixed in 5.0.19.
Patch: http://lists.mysql.com/commits/1958.
[2 Feb 2006 13:26] Jon Stephens
Thank you for your bug report. This issue has been committed to our
source repository of that product and will be incorporated into the
next release.

If necessary, you can access the source repository and build the latest
available version, including the bugfix, yourself. More information 
about accessing the source trees is available at
    http://www.mysql.com/doc/en/Installing_source_tree.html

Additional info:

Documented bugfix in 5.0.19 changelog. Closed.
[5 May 2010 15:27] Bugs System
Pushed into 5.1.47 (revid:joro@sun.com-20100505145753-ivlt4hclbrjy8eye) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[6 May 2010 1:49] Paul Dubois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug. Re-closing.
[28 May 2010 5:59] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100524190136-egaq7e8zgkwb9aqi) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (pib:16)
[28 May 2010 6:27] Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100524190941-nuudpx60if25wsvx) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[28 May 2010 6:55] Bugs System
Pushed into 5.5.5-m3 (revid:alik@sun.com-20100524185725-c8k5q7v60i5nix3t) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[29 May 2010 2:44] Paul Dubois
Push resulted from incorporation of InnoDB tree. No changes pertinent to this bug.
Re-closing.
[17 Jun 2010 12:02] Bugs System
Pushed into 5.1.47-ndb-7.0.16 (revid:martin.skold@mysql.com-20100617114014-bva0dy24yyd67697) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 12:43] Bugs System
Pushed into 5.1.47-ndb-6.2.19 (revid:martin.skold@mysql.com-20100617115448-idrbic6gbki37h1c) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)
[17 Jun 2010 13:29] Bugs System
Pushed into 5.1.47-ndb-6.3.35 (revid:martin.skold@mysql.com-20100617114611-61aqbb52j752y116) (version source revid:vasil.dimov@oracle.com-20100331130613-8ja7n0vh36a80457) (merge vers: 5.1.46) (pib:16)