Bug #53627 Mysql randomly crashing in read_view_sees_trx_id
Submitted: 13 May 2010 18:46 Modified: 24 Aug 2010 6:07
Reporter: Jonatas Cruz Email Updates:
Status: Duplicate Impact on me:
None 
Category:MySQL Server: Locking Severity:S1 (Critical)
Version:>=5.1.34, 5.1.46, 5.1.48-bzr OS:Any (RedHat 5.3 x64, Mac OS X, Windows)
Assigned to: CPU Architecture:Any
Tags: crash, innodb_locks_unsafe_for_binlog, read committed, regression

[13 May 2010 18:46] Jonatas Cruz
Description:
Mysql service crashing randomly with signal 11.

* Stacktrace:
********************************************************************************
100508  9:39:17 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=2147483648
read_buffer_size=6291456
max_used_connections=61
max_threads=500
threads_connected=6
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 6198237 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x2aadf0326140
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x46a89f10 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x85032e]
/usr/sbin/mysqld(handle_segfault+0x322)[0x5c1e92]
/lib64/libpthread.so.0[0x31d460e930]
/usr/sbin/mysqld(lock_sec_rec_cons_read_sees+0x42)[0x784da2]
/usr/sbin/mysqld(row_search_for_mysql+0x16a5)[0x7c0195]
/usr/sbin/mysqld(_ZN11ha_innobase13general_fetchEPhjj+0x79)[0x736759]
/usr/sbin/mysqld[0x61cf9a]
/usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x56)[0x62bb76]
/usr/sbin/mysqld[0x62bac0]
/usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x63)[0x62bb83]
/usr/sbin/mysqld[0x62de2a]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x502)[0x639112]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x16e)[0x63afde]
/usr/sbin/mysqld(_Z13handle_selectP3THDP6st_lexP13select_resultm+0x169)[0x63b8f9]
/usr/sbin/mysqld[0x5cc204]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x4c1)[0x5cf3b1]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x1fa)[0x5d49ca]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x100b)[0x5d59eb]
/usr/sbin/mysqld(_Z10do_commandP3THD+0xe4)[0x5d5fb4]
/usr/sbin/mysqld(handle_one_connection+0x6f0)[0x5c9010]
/lib64/libpthread.so.0[0x31d4606617]
/lib64/libc.so.6(clone+0x6d)[0x31d3ad3c2d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x2aadec3c5df0 is an invalid pointer
thd->thread_id=504249
thd->killed=KILL_BAD_DATA
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
100508 09:39:17 mysqld_safe Number of processes running now: 0
100508 09:39:17 mysqld_safe mysqld restarted
InnoDB: Log scan progressed past the checkpoint lsn 83 3930749744
100508  9:39:20  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.
InnoDB: Reading tablespace information from the .ibd files...
InnoDB: Restoring possible half-written data pages from the doublewrite
InnoDB: buffer...
InnoDB: Doing recovery: scanned up to log sequence number 83 3935992320
InnoDB: Doing recovery: scanned up to log sequence number 83 3941235200
InnoDB: Doing recovery: scanned up to log sequence number 83 3941735971
100508  9:39:23  InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percents: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99
InnoDB: Apply batch completed
InnoDB: Last MySQL binlog file position 0 159700687, file name /dados/binlogs/zeus.000179
100508  9:39:57  InnoDB: Started; log sequence number 83 3941735971
100508  9:39:57 [Note] Recovering after a crash using /dados/binlogs/zeus
100508  9:39:58 [Note] Starting crash recovery...
100508  9:39:58 [Note] Crash recovery finished.
********************************************************************************

* My.cnf
(see private)

* SHOW GLOBAL STATUS
(see private)

* Linux commands
uname -a     
Linux [hostname] 2.6.18-128.4.1.el5 #1 SMP Thu Jul 23 19:59:19 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux

free -m
             total       used       free     shared    buffers     cached
Mem:         24089      22012       2076          0        537       8991
-/+ buffers/cache:      12483      11606
Swap:        12001          0      12001

Thanks,
Jonatas

How to repeat:
There is no single query which causes msqyl to crash. DB has been running fine with thousands of queries for days, and suddenly crashes with signal 11, with no change in query or load.

Suggested fix:
Fix the crash.
[14 May 2010 14:30] Susanne Ebrecht
Many thanks for writing a bug report.

Is there something in the error log before the crash?
[14 May 2010 14:36] Jonatas Cruz
Negative.
[14 May 2010 14:43] MySQL Verification Team
this is same as crash as bug #39022 which is fixed. Is it really 5.1.46 here according to SELECT VERSION();  ?
[14 May 2010 14:49] MySQL Verification Team
we must notice the error log has KILL_BAD_DATA.  This means whatever the crashing query was doing, it hit some error due to strict mode violation, such as out of range values...  Perhaps some error path in this area wasn't tested...
[14 May 2010 15:01] Jonatas Cruz
Sure, the version is 5.1.46.
[15 May 2010 19:37] MySQL Verification Team
for the record, I repeated this crash on 5.1.46 with innodb plugin 1.0.7:

ha_innodb_plugin.dll!read_view_sees_trx_id()[read0read.ic:70]
ha_innodb_plugin.dll!lock_clust_rec_cons_read_sees()[lock0lock.c:523]
ha_innodb_plugin.dll!row_search_for_mysql()[row0sel.c:4106]
ha_innodb_plugin.dll!ha_innodb::general_fetch()[ha_innodb.cc:5612]
ha_innodb_plugin.dll!ha_innodb::rnd_next()[ha_innodb.cc:5804]
mysqld.exe!rr_sequential()[records.cc:385]
mysqld.exe!flush_cached_records()[sql_select.cc:11630]
mysqld.exe!sub_select_cache()[sql_select.cc:11140]
mysqld.exe!sub_select()[sql_select.cc:11287]
mysqld.exe!do_select()[sql_select.cc:11080]
mysqld.exe!JOIN::exec()[sql_select.cc:2315]
mysqld.exe!mysql_select()[sql_select.cc:2512]
mysqld.exe!handle_select()[sql_select.cc:269]
mysqld.exe!execute_sqlcom_select()[sql_parse.cc:5052]
mysqld.exe!mysql_execute_command()[sql_parse.cc:2248]
mysqld.exe!mysql_parse()[sql_parse.cc:5975]
mysqld.exe!dispatch_command()[sql_parse.cc:1235]
mysqld.exe!do_command()[sql_parse.cc:878]
mysqld.exe!handle_one_connection()[sql_connect.cc:1127]
mysqld.exe!pthread_start()[my_winthread.c:85]
mysqld.exe!_callthreadstart()[thread.c:293]

thd->query at 0257BCC0=select d from t2,t1 where d in(select a from t1 where t1.a > -27998)
thd->thread_id=2748
thd->killed=NOT_KILLED

Will make a testcase...
[16 May 2010 17:45] MySQL Verification Team
testcase. run the SP in multiple threads, README at top....

Attachment: bug53627_testcase_stored_routine.sql (application/octet-stream, text), 1.84 KiB.

[16 May 2010 18:23] MySQL Verification Team
I repeated the crash on:
5.1.34, 5.1.37, 5.1.38, 5.1.39, 5.1.40, 5.1.41, 5.1.42, 5.1.43, 5.1.44, 5.1.45, 5.1.46

5.1.32 and below didn't crash. therefore, a regression. I didn't see anything obvious in the changelogs of 5.1.34/5.1.33 that could explain it..
[16 May 2010 19:31] MySQL Verification Team
upon further checking, crash happens with repeatable read without a need for unsafe locks!  only workaround i found is to use binlog_format=row.
[17 May 2010 15:16] Valeriy Kravchuk
Verified just as described with recent 5.1.48 from bzr on Mac OS X:

...
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0xb026af64 thread_stack 0x30000
0   mysqld                              0x0057fcbe my_print_stacktrace + 44
1   mysqld                              0x001010a4 handle_segfault + 836
2   libSystem.B.dylib                   0x940472bb _sigtramp + 43
3   ???                                 0xffffffff 0x0 + 4294967295
4   mysqld                              0x003d1724 lock_clust_rec_cons_read_sees + 62
5   mysqld                              0x00412594 row_search_for_mysql + 5246
6   mysqld                              0x0043f49e _ZN11ha_innobase13general_fetchEPhjj + 242
7   mysqld                              0x0043f655 _ZN11ha_innobase8rnd_nextEPh + 207
8   mysqld                              0x00247a2c _Z13rr_sequentialP11READ_RECORD + 110
9   mysqld                              0x00189c0a _Z18error_if_full_joinP4JOIN + 4230
10  mysqld                              0x00189cb2 _Z16sub_select_cacheP4JOINP13st_join_tableb + 44
11  mysqld                              0x00188a17 _Z10sub_selectP4JOINP13st_join_tableb + 65
12  mysqld                              0x0019a499 _ZN4JOIN9join_freeEv + 1687
13  mysqld                              0x001acc61 _ZN4JOIN4execEv + 9485
14  mysqld                              0x001a76bd _Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex + 879
15  mysqld                              0x001acfe7 _Z13handle_selectP3THDP6st_lexP13select_resultm + 569
16  mysqld                              0x00112bb6 _Z15update_precheckP3THDP10TABLE_LIST + 1150
17  mysqld                              0x001151ee _Z21mysql_execute_commandP3THD + 2936
18  mysqld                              0x002f4c3f _ZN13sp_instr_stmt9exec_coreEP3THDPj + 17
19  mysqld                              0x002f4ecc _ZN13sp_lex_keeper23reset_lex_and_exec_coreEP3THDPjbP8sp_instr + 516
20  mysqld                              0x002fb9f8 _ZN13sp_instr_stmt7executeEP3THDPj + 508
21  mysqld                              0x002f7658 _ZN7sp_head7executeEP3THD + 1356
22  mysqld                              0x002f8401 _ZN7sp_head17execute_procedureEP3THDP4ListI4ItemE + 1971
23  mysqld                              0x0011c859 _Z21mysql_execute_commandP3THD + 33251
24  mysqld                              0x0011ebaf _Z11mysql_parseP3THDPKcjPS2_ + 625
25  mysqld                              0x0011f979 _Z16dispatch_command19enum_server_commandP3THDPcj + 3067
26  mysqld                              0x00120da4 _Z10do_commandP3THD + 666
27  mysqld                              0x0010b7d6 handle_one_connection + 372
28  libSystem.B.dylib                   0x9400c095 _pthread_start + 321
29  libSystem.B.dylib                   0x9400bf52 thread_start + 34
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x10b7428 = select * from t2,t1 where t2.d=t1.a or d in(select a from t1 where t2.d > a)
...
[17 May 2010 21:38] Omer Barnir
triage: setting tag to SR51MRU, SR55RC and setting to provate as bug has DoS vector
[18 May 2010 7:55] MySQL Verification Team
for some unknown reason i was not able to construct a command line cut 'n paste testcase to repeat this crash.  i hope somebody else can succeed.
[19 May 2010 11:55] Marko Mäkelä
Some code is using node->read_view != NULL as the condition to enter lock_clust_rec_cons_read_sees(), but this paths is using prebuilt->select_lock_type != LOCK_NONE. I will review the code to see when the read_view is created.
[25 May 2010 6:25] Marko Mäkelä
I can repeat this in 5.1.47 as well as the current InnoDB Plugin development tree in 5.1.
[25 May 2010 12:29] Marko Mäkelä
Somehow, the transaction has been committed while a join is executing:

assertion failed: prebuilt->sql_stat_start || trx->conc_state == TRX_ACTIVE
#4  0x082dcdff in rr_sequential(READ_RECORD*) ()
#5  0x08257771 in flush_cached_records(JOIN*, st_join_table*, bool) ()
#6  0x08257963 in sub_select_cache(JOIN*, st_join_table*, bool) ()
#7  0x0825887c in do_select(JOIN*, List<Item>*, st_table*, Procedure*) ()
#8  0x0826d249 in JOIN::exec() ()
#9  0x08269139 in mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_lex_unit*, st_select_lex*) ()
#10 0x0826e7bf in handle_select(THD*, st_lex*, select_result*, unsigned long) ()
#11 0x081fea53 in execute_sqlcom_select(THD*, TABLE_LIST*) ()

Sorry, I did not have symbol info for the C++ files enabled (can't see the failing statement).

This seems related to Bug #39022, Bug #45303, Bug #46175. I will add assertions to catch this type of errors earlier, but the bug has to be fixed in the MySQL core.
[25 May 2010 12:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109168
[25 May 2010 12:54] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109169
[25 May 2010 13:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109175
[25 May 2010 13:06] Bugs System
A patch for this bug has been committed. After review, it may
be pushed to the relevant source trees for release in the next
version. You can access the patch from:

  http://lists.mysql.com/commits/109176
[25 May 2010 13:09] Marko Mäkelä
The patches that I just committed will merely add InnoDB ut_ad debug assertions. To enable those, define UNIV_DEBUG or replace the ut_ad with ut_a.
[27 May 2010 13:22] Jonatas Cruz
Still crashing..
It will be fixed in the next release?
[28 May 2010 11:24] Gleb Shchepa
Most likely this bug is a regression of the fix for the bug #39843.
[31 May 2010 8:29] Bugs System
Pushed into 5.1.48 (revid:vasil.dimov@oracle.com-20100531082307-9x08gg1g7zybx2jy) (version source revid:vasil.dimov@oracle.com-20100531082307-9x08gg1g7zybx2jy) (merge vers: 5.1.48) (pib:16)
[1 Jun 2010 23:18] Gleb Shchepa
This bug is a duplicate of the bug #46947 (already fixed).
[2 Jun 2010 2:20] Jonatas Cruz
Are you sure?
That bug don`t mention any crash.
[2 Jun 2010 5:30] Gleb Shchepa
> [2 Jun 4:20] Jonatas Cruz
> Are you sure?

Yes.

> That bug don`t mention any crash.

Crash is just another issue of the same regression.

Will post a shorter test suite by the next message.
You can make sure that the crash goes away with that fix.
[2 Jun 2010 5:39] Gleb Shchepa
test file

Attachment: bug53627.c (text/x-csrc), 9.68 KiB.

[17 Jun 2010 6:12] Bugs System
Pushed into 5.5.5-m3 (revid:alexey.kopytov@sun.com-20100615145247-8bj0vmuqlotbqsn9) (version source revid:marko.makela@oracle.com-20100601103738-upm8awahesmeh9dr) (merge vers: 5.1.48) (pib:16)
[17 Jun 2010 6:16] Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100615150216-cubqoyn1fj9b6a2p) (version source revid:marko.makela@oracle.com-20100601103738-upm8awahesmeh9dr) (pib:16)
[29 Jun 2010 12:15] Jonatas Cruz
Crash again today, relase 5.1.48

100629  7:24:24 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=2147483648
read_buffer_size=6291456
max_used_connections=231
max_threads=1000
threads_connected=21
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 10299448 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x2aade88044d0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x47556f10 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x2e)[0x850fce]
/usr/sbin/mysqld(handle_segfault+0x322)[0x5c2622]
/lib64/libpthread.so.0[0x35b0c0e4c0]
/usr/sbin/mysqld(lock_sec_rec_cons_read_sees+0x42)[0x7862d2]
/usr/sbin/mysqld(row_search_for_mysql+0x16b5)[0x7c1155]
/usr/sbin/mysqld(_ZN11ha_innobase13general_fetchEPhjj+0x79)[0x737a79]
/usr/sbin/mysqld[0x61da2a]
/usr/sbin/mysqld(_Z10sub_selectP4JOINP13st_join_tableb+0x56)[0x62c646]
/usr/sbin/mysqld[0x62e8fa]
/usr/sbin/mysqld(_ZN4JOIN4execEv+0x502)[0x639be2]
/usr/sbin/mysqld(_Z12mysql_selectP3THDPPP4ItemP10TABLE_LISTjR4ListIS1_ES2_jP8st_orderSB_S2_SB_yP13select_resultP18st_select_lex_unitP13st_select_lex+0x16e)[0x63baae]
/usr/sbin/mysqld(_Z13handle_selectP3THDP6st_lexP13select_resultm+0x169)[0x63c3c9]
/usr/sbin/mysqld[0x5cca24]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THD+0x4c1)[0x5cfbd1]
/usr/sbin/mysqld(_Z11mysql_parseP3THDPKcjPS2_+0x1fa)[0x5d51ea]
/usr/sbin/mysqld(_Z16dispatch_command19enum_server_commandP3THDPcj+0x103b)[0x5d623b]
/usr/sbin/mysqld(_Z10do_commandP3THD+0xe4)[0x5d6844]
/usr/sbin/mysqld(handle_one_connection+0x6f0)[0x5c9820]
/lib64/libpthread.so.0[0x35b0c06367]
/lib64/libc.so.6(clone+0x6d)[0x35b00d30ad]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x2aadf82d6600 is an invalid pointer
thd->thread_id=16424975
thd->killed=KILL_BAD_DATA
[8 Aug 2010 15:20] MySQL Verification Team
same stack trace (and easier testcase) in bug #55580 ...
[17 Aug 2010 18:21] Gleb Shchepa
Unassigning myself -- can't repeat.
Most probably it's a duplicate of the bug #55580 (already fixed).
[18 Aug 2010 14:19] Georgi Kodinov
Valeriy, can you please try repeating the crash with the latest 5.1-bugteam (that contains the 55580 fix) ?
[24 Aug 2010 6:07] Valeriy Kravchuk
I can not repeat this with 5.1.51 (current mysql-5.1 tree), so let's consider this a duplicate of bug #55580.
[14 Oct 2010 8:30] Bugs System
Pushed into mysql-5.1-telco-7.0 5.1.51-ndb-7.0.20 (revid:martin.skold@mysql.com-20101014082627-jrmy9xbfbtrebw3c) (version source revid:vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1) (merge vers: 5.5.5-m3) (pib:21)
[14 Oct 2010 8:46] Bugs System
Pushed into mysql-5.1-telco-6.3 5.1.51-ndb-6.3.39 (revid:martin.skold@mysql.com-20101014083757-5qo48b86d69zjvzj) (version source revid:vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1) (merge vers: 5.5.5-m3) (pib:21)
[14 Oct 2010 9:00] Bugs System
Pushed into mysql-5.1-telco-6.2 5.1.51-ndb-6.2.19 (revid:martin.skold@mysql.com-20101014084420-y54ecj85j5we27oa) (version source revid:vasil.dimov@oracle.com-20100531152341-x2d4hma644icamh1) (merge vers: 5.5.5-m3) (pib:21)