| Bug #62037 | assert fails in row_search_for_mysql on TRX_ISO_READ_UNCOMMITTED | ||
|---|---|---|---|
| Submitted: | 30 Jul 2011 14:14 | Modified: | 13 Jul 2012 18:02 |
| Reporter: | Mark Callaghan | Email Updates: | |
| Status: | Duplicate | Impact on me: | |
| Category: | MySQL Server: InnoDB Plugin storage engine | Severity: | S1 (Critical) |
| Version: | 5.1.52, 5.1.58, 5.5.14 | OS: | Any |
| Assigned to: | CPU Architecture: | Any | |
| Tags: | innodb, regression | ||
[30 Jul 2011 14:58]
Mark Callaghan
There are many asserts for TRX_ISO_READ_UNCOMMITTED that are fairly new including this assert
3351.14.276 marko.makela@oracle.com 20101019 | /* Returning a row to MySQL */
|
| if (!row_sel_store_mysql_rec(buf, prebuilt, result_rec,
| result_rec != rec,
| offsets)) {
3351.14.170 marko.makela@oracle.com 20100630 | /* Only fresh inserts may contain
| incomplete externally stored
| columns. Pretend that such records do
| not exist. Such records may only be
| accessed at the READ UNCOMMITTED
3351.14.166 marko.makela@oracle.com 20100629 | isolation level or when rolling back a
3351.14.170 marko.makela@oracle.com 20100630 | recovered transaction. Rollback
| happens at a lower level, not here. */
3351.14.166 marko.makela@oracle.com 20100629 | ut_a(trx->isolation_level
| == TRX_ISO_READ_UNCOMMITTED);
| goto next_rec;
revno: 3351.14.166
committer: Marko Mäkelä <marko.makela@oracle.com>
branch nick: 5.1-innodb
timestamp: Tue 2010-06-29 15:55:18 +0300
message:
Bug#54358: READ UNCOMMITTED access failure of off-page DYNAMIC or COMPRESSED
columns
When the server crashes after a record stub has been inserted and
before all its off-page columns have been written, the record will
contain incomplete off-page columns after crash recovery. Such records
may only be accessed at the READ UNCOMMITTED isolation level or when
rolling back a recovered transaction in recv_recovery_rollback_active().
Skip these records at the READ UNCOMMITTED isolation level.
TODO: Add assertions for checking the above assumptions hold when an
incomplete BLOB is encountered.
btr_rec_copy_externally_stored_field(): Return NULL if the field is
incomplete.
row_prebuilt_t::templ_contains_blob: Clarify what "BLOB" means in this
context. Hint: MySQL BLOBs are not the same as InnoDB BLOBs.
row_sel_store_mysql_rec(): Return FALSE if not all columns could be
retrieved. Previously this function always returned TRUE. Assert that
the record is not delete-marked.
row_sel_push_cache_row_for_mysql(): Return FALSE if not all columns
could be retrieved.
row_search_for_mysql(): Skip records containing incomplete off-page
columns. Assert that the transaction isolation level is READ
UNCOMMITTED.
rb://380 approved by Jimmy Yang
[30 Jul 2011 15:19]
Mark Callaghan
And the root cause for the assert is that this function returns NULL
/*******************************************************************//**
Copies an externally stored field of a record to mem heap.
@return the field copied to heap, or NULL if the field is incomplete */
UNIV_INTERN
byte*
btr_rec_copy_externally_stored_field(
/*=================================*/
const rec_t* rec, /*!< in: record in a clustered index;
must be protected by a lock or a page latch */
const ulint* offsets,/*!< in: array returned by rec_get_offsets() */
ulint zip_size,/*!< in: nonzero=compressed BLOB page size,
zero for uncompressed BLOBs */
ulint no, /*!< in: field number */
ulint* len, /*!< out: length of the field */
mem_heap_t* heap) /*!< in: mem heap */
{
ulint local_len;
const byte* data;
ut_a(rec_offs_nth_extern(offsets, no));
/* An externally stored field can contain some initial
data from the field, and in the last 20 bytes it has the
space id, page number, and offset where the rest of the
field data is stored, and the data length in addition to
the data stored locally. We may need to store some data
locally to get the local record length above the 128 byte
limit so that field offsets are stored in two bytes, and
the extern bit is available in those two bytes. */
data = rec_get_nth_field(rec, offsets, no, &local_len);
ut_a(local_len >= BTR_EXTERN_FIELD_REF_SIZE);
if (UNIV_UNLIKELY
(!memcmp(data + local_len - BTR_EXTERN_FIELD_REF_SIZE,
field_ref_zero, BTR_EXTERN_FIELD_REF_SIZE))) {
/* The externally stored field was not written yet.
This record should only be seen by
recv_recovery_rollback_active() or any
TRX_ISO_READ_UNCOMMITTED transactions. */
return(NULL);
}
[30 Jul 2011 16:19]
Valeriy Kravchuk
It would be still nice to find out exact SQL failing or, at least, information about the table accessed while this assertion happens.
I had tried to run concurrent INSERTs, ROLLBACK WORK and SELECT (both all rows and by index on c2) for the table like this:
mysql> show create table tinno\G
*************************** 1. row ***************************
Table: tinno
Create Table: CREATE TABLE `tinno` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`c2` int(11) DEFAULT NULL,
`c3` blob,
PRIMARY KEY (`id`),
KEY `c2` (`c2`)
) ENGINE=InnoDB AUTO_INCREMENT=145390 DEFAULT CHARSET=latin1 ROW_FORMAT=DYNAMIC
1 row in set (0.06 sec)
mysql> show table status like 'tinno'\G
*************************** 1. row ***************************
Name: tinno
Engine: InnoDB
Version: 10
Row_format: Dynamic
Rows: 4098
Avg_row_length: 32883
Data_length: 134758400
Max_data_length: 0
Index_length: 98304
Data_free: 131072000
Auto_increment: 145390
Create_time: 2011-07-30 18:51:24
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options: row_format=DYNAMIC
Comment:
1 row in set (0.07 sec)
with innodb_file_per_table and Barracuda format set, but no assertions happened yet.
[30 Jul 2011 16:20]
Mark Callaghan
I think mysqldump SELECT statements were repeatedly crashing a server and am trying to reproduce that now
[30 Jul 2011 17:07]
Mark Callaghan
While row_search_for_mysql has good error messages for other errors -- messages that include the table_name, the error message in this case is extremely limited. So this is also a feature request to improve those error messges by including the table/database name or space ID. Otherwise, we must guess at the reproduction case. And please, any time new code adds an assertion to InnoDB, please think about what will be in the mysqld error log when that assertion fails. In many cases the log is insufficient unless more detail is included prior to the assert failure. Assume that the stack dump done by mysqld on asserts and segfaults doesn't always dump the SQL statement.
[31 Jul 2011 9:41]
MySQL Verification Team
On a clean datadir using 5.1.58 + plugin + compact tables + innodb file per table, I got this with 3 concurrent threads doing blob stuff: Version: '5.1.58-community' socket: '' port: 3306 MySQL Community Server (GPL) 110731 11:36:49 InnoDB: Assertion failure in thread 2036 in file .\row\row0sel.c line 4473 InnoDB: Failing assertion: trx->isolation_level == TRX_ISO_READ_UNCOMMITTED InnoDB: We intentionally generate a memory trap. ha_innodb_plugin.dll!row_search_for_mysql()[row0sel.c:4473] ha_innodb_plugin.dll!ha_innodb::index_read()[ha_innodb.cc:5479] ha_innodb_plugin.dll!ha_innodb::index_first()[ha_innodb.cc:5783] mysqld.exe!handler::read_range_first()[handler.cc:4261] mysqld.exe!handler::read_multi_range_first()[handler.cc:4139] mysqld.exe!QUICK_RANGE_SELECT::get_next()[opt_range.cc:8594] mysqld.exe!rr_quick()[records.cc:331] mysqld.exe!join_init_read_record()[sql_select.cc:12189] mysqld.exe!sub_select()[sql_select.cc:11459] mysqld.exe!do_select()[sql_select.cc:11217] mysqld.exe!JOIN::exec()[sql_select.cc:2343] mysqld.exe!mysql_select()[sql_select.cc:2555] mysqld.exe!handle_select()[sql_select.cc:269] mysqld.exe!execute_sqlcom_select()[sql_parse.cc:5171] mysqld.exe!mysql_execute_command()[sql_parse.cc:2307] mysqld.exe!mysql_parse()[sql_parse.cc:6092] mysqld.exe!dispatch_command()[sql_parse.cc:1277] mysqld.exe!do_command()[sql_parse.cc:893] mysqld.exe!handle_one_connection()[sql_connect.cc:1427] mysqld.exe!pthread_start()[my_winthread.c:85] mysqld.exe!_callthreadstart()[thread.c:293] Query (02517400): =select * from `blobtest_1` where `id`<=4495 limit 1 Repeating the query and dumping tables doesn't crash.
[31 Jul 2011 11:49]
MySQL Verification Team
I got this repeatable in seconds by running mysqld like this: ./bin/mysqld_safe --no-defaults --skip-gr --skip-na --ignore-builtin-innodb --plugin-load=ha_innodb_plugin.so --innodb-flush-log-at-trx-commit=0 --core-file then running two copies of the attached blob tester php app concurrently.
[31 Jul 2011 12:46]
Valeriy Kravchuk
Same test case crashed 64-bit release binaries of 5.5.14 on Mac OS X: 110731 15:41:46 [Note] /usr/local/mysql-5.5.14-osx10.5-x86_64/bin/mysqld: ready for connections. Version: '5.5.14' socket: '/tmp/mysql.sock' port: 3306 MySQL Community Server (GPL) 110731 15:42:41 InnoDB: Assertion failure in thread 4492226560 in file row0sel.c line 4471 InnoDB: Failing assertion: trx->isolation_level == TRX_ISO_READ_UNCOMMITTED InnoDB: We intentionally generate a memory trap.
[31 Jul 2011 14:28]
MySQL Verification Team
From my testcase results: 5.1.47: no assertion 5.1.48: no assertion 5.1.49: no assertion 5.1.50: hits assertion 5.1.58: hits assertion 5.5.14: hits assertion 5.5.15: no assertion. not sure why. On a side note, the SQL query was not printed in old versions due to bug #51817 .
[1 Aug 2011 5:53]
MySQL Verification Team
Could this be a duplicate of the following ? BUG 12612184 - RACE CONDITION AFTER BTR_CUR_PESSIMISTIC_UPDATE() http://lists.mysql.com/commits/138935 http://lists.mysql.com/commits/139001
[1 Aug 2011 6:40]
Marko Mäkelä
This looks like a duplicate of Oracle Bug#12612184, which I worked on in June. Since the beginning of InnoDB, there has been a race condition in updates (quoting what I posted in the bug in June):
----start quote----
When a record is updated so that its size grows, non-updated columns can be selected for external (off-page) storage. The bug is that the initially inserted updated record contains an all-zero BLOB pointer to the field that was not updated. Only after the BLOB is allocated, the pointer will be valid. This is the window of opportunity (quoting from row_upd_clust_rec() in MySQL 3.23.53):
err = btr_cur_pessimistic_update(BTR_NO_LOCKING_FLAG, btr_cur, &big_rec, node->update, node->cmpl_info, thr, mtr);
mtr_commit(mtr);
if (err == DB_SUCCESS && big_rec) {
mtr_start(mtr);
ut_a(btr_pcur_restore_position(BTR_MODIFY_TREE, pcur, mtr));
Between the release of the page latch in mtr_commit(mtr) and the re-latching of the page in btr_pcur_restore_position(), other threads can see the invalid BLOB pointer. I checked that the invalid BLOB pointer would be zeroed in dtuple_convert_big_rec() even in 3.23.53; old versions of InnoDB had the bad habit of not initializing all fields.
The problem is amplified by the ROW_FORMAT=DYNAMIC and ROW_FORMAT=COMPRESSED that were introduced in innodb_file_format=barracuda in InnoDB Plugin, but the bug does exist in all InnoDB versions.
----end quote----
Apparently, the 5.1.58 release does not contain the fix, even though the files in the source archive are dated 2011-07-01. I pushed the fix to the mysql-5.1 tree on 2011-06-16.
There is a known bug in my fix: If the server crashes between btr_cur_mtr_commit_and_start() and the final mtr_commit(mtr) after all BLOBs have been written, at crash recovery any information about the non-updated BLOBs (which were moved to off-page storage due to the record size changing in the update) will be permanently lost.
I outlined a fix for that in Bug#12612184 on 2011-06-28, and I will get back to it after I have completed more urgent tasks that have been piling up while I was on vacation.
[1 Aug 2011 7:23]
Marko Mäkelä
The recovery bug is being tracked as Oracle Bug #12704861 CORRUPTION AFTER A CRASH DURING BLOB UPDATE.
[1 Aug 2011 15:34]
Mark Callaghan
12704861 describes a race condition between concurrent INSERT (or UPDATE) and SELECT. I can reproduce this with one SELECT statement. Is my table now corrupt with a zero-length blob reference?
[2 Aug 2011 7:02]
Marko Mäkelä
Yes Mark, your table could be corrupted so that a BLOB pointer is full of zero bytes (sort of a null pointer). This could be a transient or a permanent condition. The condition should be transient when another transaction is trying to access an off-page column just after an UPDATE transaction has done the first mtr_commit() and is about to x-latch the clustered index page so that it can write the off-page columns. If you get a crash and crash recovery at an unfortunate moment, you will have permanent corruption. I am not sure if there is an easy way to detect the corruption. The following is from memory and may be wrong. "CHECK TABLE t" would only count the records, not attempt to dereference all off-page columns. The statements "CHECKSUM TABLE t" or "SELECT * FROM t" would read all columns. Alas, the InnoDB BLOB code (or code for retrieving off-page columns a.k.a. externally stored fields) as it was originally written by Heikki Tuuri is lacking some error checks or assertions. It could silently ignore null BLOB pointers, pretending that the column contained just what it was able to retrieve. The BLOB storage should always be an all-or-nothing deal. Either the clustered index leaf page record points to the full BLOB suffix, or it is a null pointer. If you see off-page columns that are 0 or 768 bytes long, alarm bells should ring. Before InnoDB Plugin, InnoDB would store 768 bytes locally in the clustered index page. In the Barracuda format (ROW_FORMAT=DYNAMIC or ROW_FORMAT=COMPRESSED), the clustered index record does not store any prefix.
[2 Aug 2011 15:01]
Mark Callaghan
Can Shane confirm this doesn't reproduce with 5.1.59 or 5.1 trunk?
[2 Aug 2011 16:50]
Valeriy Kravchuk
I will run on QuadXeon under Linux tomorrow, but with recent pull from mysql-5.1 tree in bzr I am not able to get this assertion on Mac after running for more than 1 hour: macbook-pro:5.1 openxs$ tail -f var/macbook-pro.err InnoDB: Setting log file ./ib_logfile1 size to 5 MB InnoDB: Database physically writes the file full: wait... InnoDB: Doublewrite buffer not found: creating new InnoDB: Doublewrite buffer created InnoDB: Creating foreign key constraint system tables InnoDB: Foreign key constraint system tables created 110802 18:41:46 InnoDB Plugin 1.0.17 started; log sequence number 0 110802 18:41:46 [Note] Event Scheduler: Loaded 0 events 110802 18:41:46 [Note] /Users/openxs/dbs/5.1/libexec/mysqld: ready for connections. Version: '5.1.59' socket: '/tmp/mysql.sock' port: 3306 Source distribution ^C macbook-pro:5.1 openxs$ date вт 2 сер 2011 19:49:33 EEST This is the same Mac where I reproduced assertions on both 5.5.14 and 5.1.58 (official 64-bit binaries).
[3 May 2012 6:58]
zhai weixiang
this assertion was reproduced on Percona Server 5.5.18, and ROW_FORMAT=COMPACT. i don't know if this bug affects all types of ROW_FORMAT. below is part of error log 120503 9:20:30 InnoDB: Assertion failure in thread 140210779649792 in file row0sel.c line 4612 InnoDB: Failing assertion: trx->isolation_level == TRX_ISO_READ_UNCOMMITTED InnoDB: We intentionally generate a memory trap. InnoDB: Submit a detailed bug report to http://bugs.mysql.com. InnoDB: If you get repeated assertion failures or crashes, even InnoDB: immediately after the mysqld startup, there may be InnoDB: corruption in the InnoDB tablespace. Please refer to InnoDB: http://dev.mysql.com/doc/refman/5.5/en/forcing-innodb-recovery.html InnoDB: about forcing recovery. 120503 9:20:30 - mysqld got signal 6 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=104857600 read_buffer_size=1048576 max_used_connections=48 max_threads=2100 thread_count=47 connection_count=47 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 2488475 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x3f64eec0 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 0x7f855db6ee78 thread_stack 0x40000 /u01/mysql/bin/mysqld(my_print_stacktrace+0x39)[0x7b1b69] /u01/mysql/bin/mysqld(handle_segfault+0x43c)[0x4fa39c] /lib64/libpthread.so.0[0x36d0a0f520] /lib64/libc.so.6(gsignal+0x35)[0x36d0632a45] /lib64/libc.so.6(abort+0x175)[0x36d0634225] /u01/mysql/bin/mysqld[0x8479c1] /u01/mysql/bin/mysqld[0x81f4fe] /u01/mysql/bin/mysqld[0x5a27d5] /u01/mysql/bin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x79)[0x59a689] /u01/mysql/bin/mysqld[0x59f56d] /u01/mysql/bin/mysqld(_ZN4JOIN4execEv+0xc5e)[0x5b566e] /u01/mysql/bin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x119) [0x5b1249] /u01/mysql/bin/mysqld(_Z13handle_selectP3THDP3LEXP13select_resultm+0x1c5)[0x5b6e55] /u01/mysql/bin/mysqld[0x574028] /u01/mysql/bin/mysqld(_Z21mysql_execute_commandP3THD+0x1102)[0x577f32] /u01/mysql/bin/mysqld(_Z11mysql_parseP3THDPcjP12Parser_state+0x3a4)[0x57b674] /u01/mysql/bin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x1448)[0x57cac8] /u01/mysql/bin/mysqld(_Z24do_handle_one_connectionP3THD+0x10d)[0x61a70d] /u01/mysql/bin/mysqld(handle_one_connection+0x4c)[0x61a85c] /lib64/libpthread.so.0[0x36d0a077e1] /lib64/libc.so.6(clone+0x6d)[0x36d06e68ed] Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0x7f8448004bb0): SELECT * FROM `delayed_jobs` ORDER BY `id` LIMIT 0, 1000 Connection ID (thread ID): 28339 Status: NOT_KILLED
[13 Jul 2012 18:02]
Mark Callaghan
Can this get a status update?
[13 Jul 2012 18:26]
Valeriy Kravchuk
This bug was declared a duplicate of internal Bug #13721257, that is fixed in 5.5.22+ (see http://dev.mysql.com/doc/refman/5.5/en/news-5-5-22.html) and 5.1.63+ (Plugin only).
[26 Apr 2013 5:59]
liu hickey
/home/xiyu.lh/work/5.5 $ bzr log | grep 13721257 No 13721257 info in 5.5 launchpad ?
[9 Aug 2013 10:14]
zhai weixiang
it's not easy to find the related patch, because there isn't any comment in the bzr log.... May be this one ? http://bazaar.launchpad.net/~mysql/mysql-server/5.5/revision/3723 http://bazaar.launchpad.net/~mysql/mysql-server/5.5/revision/3730 I don't know if it's safe to patch the diff directly to MySQL5.5.18(we are using now..). and it's impossible for us to upgrade to a new version such as MySQL5.5.22(that means long time tests...)

Description: The SQL is not in the error log. The assertion is: 110730 2:22:17 InnoDB: Assertion failure in thread 1598527808 in file row/row0sel.c line 4460 InnoDB: Failing assertion: trx->isolation_level == TRX_ISO_READ_UNCOMMITTED The call stack from resolve_stack_dump is: 0x7d96b5 row_search_for_mysql + 6805 0x77fa67 _ZN11ha_innobase10index_readEPhPKhj16ha_rkey_function + 503 0x6d55b1 _ZN7handler18index_read_idx_mapEPhjPKhm16ha_rkey_function + 113 0x65058b _Z15join_read_constP13st_join_table + 235 0x65f1ee _Z21join_read_const_tableP13st_join_tableP11st_position + 78 0x65fc9c _Z20make_join_statisticsP4JOINP10TABLE_LISTP4ItemP16st_dynamic_array + 2252 0x663e01 _ZN4JOIN8optimizeEv + 1169 /usr/libexec/mysqld(mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned 0x66d4ed _Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select + 141 0x66df28 _Z13handle_selectP3THDP6st_lexP13select_resultm + 456 0x5f0454 _Z21execute_sqlcom_selectP3THDP10TABLE_LISTPy + 308 0x5f4f41 _Z21mysql_execute_commandP3THDPy + 9969 0x5fa603 _Z11mysql_parseP3THDPcjPPKcPy + 803 0x5fbb3c _Z16dispatch_command19enum_server_commandP3THDPcj + 5276 0x5fc193 _Z10do_commandP3THD + 275 0x5eba8a handle_one_connection + 1994 Other text in the error log: key_buffer_size=16777216 read_buffer_size=1048576 max_used_connections=2540 max_threads=5000 threads_connected=652 Trying to get some variables. Some pointers may be invalid and cause the dump to abort... thd->query at 0x2ab985504760 is an invalid pointer thd->thread_id=131391566 thd->killed=NOT_KILLED And I think I have seen this before ... Fatal signal 11 while backtracing mysqld: misc.cpp:103: int __cxa_pure_virtual(): Assertion `!"Aborted: pure virtual method called."' failed. mysqld: misc.cpp:103: int __cxa_pure_virtual(): Assertion `!"Aborted: pure virtual method called."' failed. This is the source code for the assert in row_search_for_mysql 4441 } else { 4442 if (prebuilt->template_type == ROW_MYSQL_DUMMY_TEMPLATE) { 4443 memcpy(buf + 4, result_rec 4444 - rec_offs_extra_size(offsets), 4445 rec_offs_size(offsets)); 4446 mach_write_to_4(buf, 4447 rec_offs_extra_size(offsets) + 4); 4448 } else { 4449 if (!row_sel_store_mysql_rec(buf, prebuilt, 4450 result_rec, offsets)) { 4451 /* Only fresh inserts may contain 4452 incomplete externally stored 4453 columns. Pretend that such records do 4454 not exist. Such records may only be 4455 accessed at the READ UNCOMMITTED 4456 isolation level or when rolling back a 4457 recovered transaction. Rollback 4458 happens at a lower level, not here. */ 4459 ut_a(trx->isolation_level 4460 == TRX_ISO_READ_UNCOMMITTED); 4461 goto next_rec; 4462 } How to repeat: I don't know Suggested fix: I don't know