Bug #42170 MySQL crashes with InnoDB, many columns and an anti-join
Submitted: 16 Jan 2009 20:09 Modified: 30 Apr 2012 12:36
Reporter: Tim Soderstrom Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: InnoDB storage engine Severity:S3 (Non-critical)
Version:5.0.67, 5.1.30 OS:Linux (ES 5.2)
Assigned to: Assigned Account CPU Architecture:Any
Tags: anti-join, innodb

[16 Jan 2009 20:09] Tim Soderstrom
Description:
MySQL crashes when running a query using MyISAM with a small number of columns and an InnoDB table with a very large number of columns with a NOT IN clause and a sub-select. Specifically:

DELETE FROM `property` WHERE `MLNumber` NOT IN (SELECT `primary` FROM `active` WHERE `table` = 'property' AND `timestamp_updated` >= CURDATE())

As the property table is very large, I have attached a SQL file so that it can be easily imported into MySQL (as well as the error output)

Increasing the innodb_log_buffer_size from 1M to 8M fixes this issue, however MySQL should not be crashing in this case. Instead it should flag an error (to the user, the error log, or both).

How to repeat:
1. Use the provided my.cnf or simply configure InnoDB as follows:

innodb_log_file_size=100M
innodb_buffer_pool_size=512M
innodb_log_buffer_size=1M

2. Import the provided test.sql onto a test database (called 'test' from this point forward):

mysql> CREATE DATABASE test;
# mysql test < test.sql

3. Edit the populate.php script to change the database, username and password and run it:

# php populate.php

(You could also any method of populating the table - just be sure it is around 250,000 rows in size.

4. Run the following query:

DELETE FROM `property` WHERE `MLNumber` NOT IN (SELECT `primary` FROM `active` WHERE `table` = 'property' AND `timestamp_updated` >= CURDATE())

5. Watch the error log. Note that the query may complete before MySQL crashes (it seems to crash a few seconds to a few minutes after the query runs).

6. Edit the my.cnf and set 'innodb_log_file_size=8M' If MySQL is already in a crash loop, simply wait for it to restart and it should be stable from that point forward.

Suggested fix:
Though I'm not sure exactly why, it seems that the innodb_log_buffer_size is too small to run the above query. Instead of crashing, however, MySQL should flag an error to the user, the error log, or both. Either way, MySQL should not crash.
[16 Jan 2009 20:10] Tim Soderstrom
Error Log, my.cnf, .SQL and .PHP scripts to reproduce bug

Attachment: bug.tar.gz (application/x-tar, text), 10.16 KiB.

[19 Jan 2009 17:11] MySQL Verification Team
Thank you for the bug report. To try your test case I modified some columns of InnoDB tables to accept NOT NULL values and I couldn't repeat. Did you try the test case you provided?. Thanks in advance.
[19 Jan 2009 17:59] Tim Soderstrom
Hi Miguel!

Doh! My apologies! The provided my.cnf has the innodb_log_buffer_size set to 8MB. MySQL crashes when this is set to 1MB, not 8. Sorry about that!

Otherwise, I have not modified the test case apart what is provided in the bugs.tar.gz.
[19 Jan 2009 20:13] Mikhail Izioumtchenko
Please continue verification and triage. Assigning to Inaam
in the meantime as Inaam is our log specialist, to determine if 
returning an error instead of asserting is feasible, if so, where exactly
it should be thrown etc
[20 Jan 2009 0:59] MySQL Verification Team
I couldn't repeat with latest source server on Windows. I will try with more records.

mysql 5.1 >DELETE FROM `property` WHERE `MLNumber` NOT IN (SELECT `primary` FROM `active` WHERE
    -> `table` = 'property' AND `timestamp_updated` >= CURDATE())
    -> ;
Query OK, 949991 rows affected (8 min 24.07 sec)

mysql 5.1 >show variables like "innodb_log_buffer_size";
+------------------------+---------+
| Variable_name          | Value   |
+------------------------+---------+
| innodb_log_buffer_size | 1048576 |
+------------------------+---------+
1 row in set (0.00 sec)
[30 Apr 2012 12:36] MySQL Verification Team
Hi Tim,

I'm calling this a duplicate of bug #50366
[9 Jun 2012 9:01] yin ye
I also found this problem in some business. using the test case of Tim Soderstrom, I repeat the problem in version 5.1.61:

120607  9:53:47  InnoDB: Assertion failure in thread 2969676688 in file log/log0log.c line 202
InnoDB: Failing assertion: len < log->buf_size / 2
... ...

/usr/local/mysql/libexec/mysqld [0x83c04f2]
/usr/local/mysql/libexec/mysqld(mtr_commit+0xad) [0x83cafcd]
/usr/local/mysql/libexec/mysqld [0x83eba62]
/usr/local/mysql/libexec/mysqld(row_purge_step+0x8e7) [0x83ec8d7]
/usr/local/mysql/libexec/mysqld(que_run_threads+0x50d) [0x83db8dd]
/usr/local/mysql/libexec/mysqld(trx_purge+0x2ec) [0x840501c]
/usr/local/mysql/libexec/mysqld(srv_master_thread+0x579) [0x83ff1c9]

Assert failed during purging. After parsed the redo log, I found the reason.
the mini-transactions occur during a B-tree compress produce many redo log, because all records in old page will move to new page, but  didn't set log mode to MTR_LOG_SHORT_INSERTS, which is different from B-tree split. And every redo log record contain all fields of the cluster index. If table has many fileds, the redo log record will become very big. I think innodb should take the same way with B-tree split during B-tree compress.

These are the redo log records of the last mini-transaction before crash:
begin--------------------------------------------
type:42, space:0, page no:3189, len:1290
type:46, space:0, page no:16409, len:1289
type:4, space:0, page no:16409, len:9
type:4, space:0, page no:3190, len:9
type:42, space:0, page no:62, len:1289
type:38, space:0, page no:62, len:1299
type:42, space:0, page no:16527, len:1291
type:46, space:0, page no:16526, len:1289
type:4, space:0, page no:16526, len:10
type:4, space:0, page no:25156, len:10
type:42, space:0, page no:62, len:1289
type:38, space:0, page no:16526, len:1301
type:38, space:0, page no:16526, len:1297
type:38, space:0, page no:16526, len:1297
type:38, space:0, page no:16526, len:1297
type:38, space:0, page no:16526, len:1297
type:38, space:0, page no:16526, len:1297
type:38, space:0, page no:16526, len:1297
type:38, space:0, page no:16526, len:1297
type:38, space:0, page no:16526, len:1297
type:38, space:0, page no:16526, len:1297
....

type:31, space:0, page no:0, len:1
end-----------------------------------------------

There are 600+ records with type 38,and every record is very big,and the total size is more than half of the redo log buffer.