Bug #50201 | Server crashes in explain_filename on an InnoDB partitioned table | ||
---|---|---|---|
Submitted: | 9 Jan 2010 0:49 | Modified: | 15 Mar 2010 15:50 |
Reporter: | Elena Stepanova | Email Updates: | |
Status: | Closed | Impact on me: | |
Category: | MySQL Server | Severity: | S2 (Serious) |
Version: | 5.5.1-m2 | OS: | Any |
Assigned to: | Mattias Jonsson | CPU Architecture: | Any |
[9 Jan 2010 0:49]
Elena Stepanova
[11 Jan 2010 13:15]
Mikhail Izioumtchenko
Is InnoDB code in 5.5.1-m2 any different from InnoDB in 5.1.42 and 5.5.1-m1 where the bug is not reproducing?
[11 Jan 2010 14:58]
Elena Stepanova
5.5.0-m2 contains InnoDB plugin 1.0.5, while 5.5.1-m2 goes with 1.0.6, so there is difference; also, bug#32430 was fixed in 5.5.1, but not in 5.5.0. However, 5.1.42 also contains InnoDB plugin 1.0.6 (and, presumably, bugfix for bug#32430), so I'm not sure whether there is a difference in InnoDB plugin code between 5.1.42 and 5.5.1-m2. (I could not reproduce the crash on 5.1.42 neither with the built-in InnoDB nor with the plugin).
[11 Jan 2010 16:59]
Mikhail Izioumtchenko
if this is not reproducible in 5.1.42 there's probably not much we could do here. On the other hand considering it's a SEGV there's a chance it's a simple bug debuggable by a good coredump and code review. Could you reproduce it with 5.5.1-m2 with mysqld compiled without -O3, those <value optimized out> coredumps are a pain to debug, then please provide the full stack backtrace (gdb's bt full) of the thread in question. Assigning to Jimmy in the meantime.
[12 Jan 2010 9:06]
Sergey Vojtovich
From the stack trace... it looks like explain_filename() is called by InnoDB "timeout_and_monitor" thread, which has no binding to THD object. When explain_filename() attempts to get some nice text ER(ER_PARTITION_NAME) macro is used. This macro uses current_thd. As there is no binding to THD object we end up with server crash. How to fix: add new ER_THD(thd, err_code) macro, which will use explicit thd object instead of current_thd.
[12 Jan 2010 15:13]
Elena Stepanova
bt full of all threads on debug version
Attachment: bug50201_all_threads_bt_full.out (application/octet-stream, text), 55.54 KiB.
[13 Jan 2010 7:29]
Marko Mäkelä
The function explain_filename was introduced by Mattias Jonsson in order to fix Bug #32430. I noticed and fixed some bugs in explain_filename before making InnoDB call it, that is, actually fixing Bug #32430 in InnoDB. Unfortunately, I did not notice that explain_filename could crash when called outside a MySQL connection handler thread. InnoDB background threads can report table names to the error log, even before any client connection has been started. Therefore, explain_filename must be callable when current_thd==NULL. I am tentatively assigning this to Mattias Jonsson and setting myself as a reviewer. Note that this bug is in 5.1 too, in both the built-in InnoDB and the Plugin.
[13 Jan 2010 9:21]
Mattias Jonsson
Seems like it is not affecting 5.1, since it is first in mysql-trunk (5.5+) that uses current_thd in the ER macro.
[13 Jan 2010 10:35]
Mattias Jonsson
This patch seems to fix the problem (as Sergey suggested), I'm currently looking how to create a repeatable test case for mysql-test-run. b50201-mysql-trunk-bugfixing$ bzr diff === modified file 'sql/sql_table.cc' --- sql/sql_table.cc 2009-12-11 09:39:38 +0000 +++ sql/sql_table.cc 2010-01-13 09:59:39 +0000 @@ -291,7 +291,7 @@ { if (explain_mode == EXPLAIN_ALL_VERBOSE) { - to_p= strnmov(to_p, ER(ER_DATABASE_NAME), end_p - to_p); + to_p= strnmov(to_p, ER_THD(thd, ER_DATABASE_NAME), end_p - to_p); *(to_p++)= ' '; to_p= add_identifier(thd, to_p, end_p, db_name, db_name_len); to_p= strnmov(to_p, ", ", end_p - to_p); @@ -304,7 +304,7 @@ } if (explain_mode == EXPLAIN_ALL_VERBOSE) { - to_p= strnmov(to_p, ER(ER_TABLE_NAME), end_p - to_p); + to_p= strnmov(to_p, ER_THD(thd, ER_TABLE_NAME), end_p - to_p); *(to_p++)= ' '; to_p= add_identifier(thd, to_p, end_p, table_name, table_name_len); } @@ -321,18 +321,18 @@ if (name_type != NORMAL) { if (name_type == TEMP) - to_p= strnmov(to_p, ER(ER_TEMPORARY_NAME), end_p - to_p); + to_p= strnmov(to_p, ER_THD(thd, ER_TEMPORARY_NAME), end_p - to_p); else - to_p= strnmov(to_p, ER(ER_RENAMED_NAME), end_p - to_p); + to_p= strnmov(to_p, ER_THD(thd, ER_RENAMED_NAME), end_p - to_p); to_p= strnmov(to_p, " ", end_p - to_p); } - to_p= strnmov(to_p, ER(ER_PARTITION_NAME), end_p - to_p); + to_p= strnmov(to_p, ER_THD(thd, ER_PARTITION_NAME), end_p - to_p); *(to_p++)= ' '; to_p= add_identifier(thd, to_p, end_p, part_name, part_name_len); if (subpart_name) { to_p= strnmov(to_p, ", ", end_p - to_p); - to_p= strnmov(to_p, ER(ER_SUBPARTITION_NAME), end_p - to_p); + to_p= strnmov(to_p, ER_THD(thd, ER_SUBPARTITION_NAME), end_p - to_p); *(to_p++)= ' '; to_p= add_identifier(thd, to_p, end_p, subpart_name, subpart_name_len); } === modified file 'sql/unireg.h' --- sql/unireg.h 2009-10-22 22:30:28 +0000 +++ sql/unireg.h 2010-01-13 09:58:29 +0000 @@ -46,6 +46,10 @@ #define ER(X) CURRENT_THD_ERRMSGS[(X) - ER_ERROR_FIRST] #define ER_DEFAULT(X) DEFAULT_ERRMSGS[(X) - ER_ERROR_FIRST] #define ER_SAFE(X) (((X) >= ER_ERROR_FIRST && (X) <= ER_ERROR_LAST) ? ER(X) : "Invalid error code") +#define ER_THD(thd,X) ((thd) ? \ + (thd)->variables.lc_messages->errmsgs->errmsgs[(X) - \ + ER_ERROR_FIRST] : \ + ER_DEFAULT(X)) #define ERRMAPP 1 /* Errormap f|r my_error */
[13 Jan 2010 10:35]
Mattias Jonsson
Removing 5.1.41+ from version field, since it is not affected.
[13 Jan 2010 10:43]
Marko Mäkelä
Matthias, from an educated guess how the macros might work, your patch looks OK to me. Side note: why is CURRENT_THD_ERRMSGS there in the first place? I was under the impression that current_thd expands to a slow function call that should be avoided whenever possible. As far as I understand, that perceived slowness is the reason why handler::ha_thd() and ha_innobase::user_thd exist.
[13 Jan 2010 11:07]
Mattias Jonsson
Marko: CURRENT_THD_ERRMSGS is new in 5.5+ and as I understand it is because of the new internationalization effort to get error messages to the current language setting per session. Btw, do you have an idea on how to easily create a repeatable test case for mysql-test-run, (perhaps by adding DEBUG_SYNC() somewhere in the mysql server code in a two thread test)?
[13 Jan 2010 11:56]
Marko Mäkelä
Could the repeatable test case be something simple like this: BEGIN; SELECT COUNT(*) FROM partitioned_table FOR UPDATE; --send SELECT SLEEP(10); -- another connection SET innodb_lock_wait_timeout=1; SELECT COUNT(*) FROM partitioned_table FOR UPDATE; -- lock wait timeout will occur A deadlock would be similar, but easiest to do with two tables. The first transaction would lock table A by a SELECT, the second one table B and attempt to access table A, and the first one would try to lock table B. You would need the --send and reap statements in mysql-test for this.
[13 Jan 2010 15:14]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/96792 2932 Mattias Jonsson 2010-01-13 Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table Problem was that in mysql-trunk the ER() macro is now dependent on current_thd and the innodb monitor thread has no binding to that thd object. This cause the crash because of bad derefencing. Solution was to add a new macro which take the thd as an argument (which the innodb thread uses for the call). @ mysql-test/r/partition_innodb_status_file.result Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table New result file @ mysql-test/t/partition_innodb_status_file-master.opt Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table New opt-file @ mysql-test/t/partition_innodb_status_file.test Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table New test file for testing the innodb monitor thread with partitioning. (i.e. possible bugs related to explain_filename and other functions). @ sql/sql_table.cc Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table Using the new ER_THD() macro instead, to supply the correct THD reference. @ sql/unireg.h Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table Added ER_THD() macro to be used when current_thd is not apropriate or does not resolve to the correct thd.
[13 Jan 2010 15:22]
Mattias Jonsson
Thanks Marko for the test. I finally got stuck on the timing since it seems to be hardcoded for the innodb monitor thread to only wake up every 15 seconds, which I don't know how to decrease. Do you know how one could do a test that does not take at least 15 seconds to run? If you (the reviewers) want, I can at least move the test to the parts-suite. Note that the crash only occurs when running with --innodb-status-file=1
[14 Jan 2010 10:58]
Marko Mäkelä
Sorry Mattias, I don’t know how the 15-second delay could be avoided in the test case. Your patch is OK to push, as far as I am concerned.
[10 Feb 2010 9:48]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/99789 2938 Mattias Jonsson 2010-02-10 Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table Problem was that in mysql-trunk the ER() macro is now dependent on current_thd and the innodb monitor thread has no binding to that thd object. This cause the crash because of bad derefencing. Solution was to add a new macro which take the thd as an argument (which the innodb thread uses for the call). (Updated according to reviewers comments, i.e. added ER_THD_OR_DEFAULT and moved test to suite parts.) @ mysql-test/suite/parts/r/partition_innodb_status_file.result Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table New test result file @ mysql-test/suite/parts/t/partition_innodb_status_file-master.opt Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table New test opt file @ mysql-test/suite/parts/t/partition_innodb_status_file.test Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table New test. Note that the innodb monitor thread only runs every 15 seconds, so this test will take at least 15 seconds, so I have moved it to the parts suite. @ sql/sql_table.cc Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table Using thd safe ER macro. @ sql/unireg.h Bug#50201: Server crashes in explain_filename on an InnoDB partitioned table Added ER macros for use with specified thd pointer.
[10 Feb 2010 11:08]
Bugs System
A patch for this bug has been committed. After review, it may be pushed to the relevant source trees for release in the next version. You can access the patch from: http://lists.mysql.com/commits/99801 3875 Mattias Jonsson 2010-02-10 [merge] manual merge of bug#50201 from mysql-next-mr-bugfixing into mysql-6.0-codebase-bugfixing, due to read only innodb_lock_wait_timeout variable
[10 Feb 2010 11:09]
Mattias Jonsson
pushed to mysql-trunk-bugfixing, mysql-next-mr-bugfixing and mysql-6.0-codebase-bugfixing.
[12 Feb 2010 17:41]
Bugs System
Pushed into 5.5.2-m2 (revid:joerg@mysql.com-20100212164100-jnurxdw5z88m472s) (version source revid:joerg@mysql.com-20100212164100-jnurxdw5z88m472s) (merge vers: 5.5.2-m2) (pib:16)
[13 Feb 2010 8:36]
Bugs System
Pushed into 6.0.14-alpha (revid:alik@sun.com-20100213083436-9pesg4h55w1mekxc) (version source revid:luis.soares@sun.com-20100211135109-t63avry9fqpgyh78) (merge vers: 6.0.14-alpha) (pib:16)
[13 Feb 2010 8:38]
Bugs System
Pushed into mysql-next-mr (revid:alik@sun.com-20100213083327-cee4ao3jpg33eggv) (version source revid:luis.soares@sun.com-20100211135018-1f9dbghg0itszigo) (pib:16)
[4 Mar 2010 17:10]
Paul DuBois
Noted in 5.5.2, 6.0.14 changelogs. The server crashed when an InnoDB background thread attempted to write a message containing a partitioned table name to the error log. Setting report to Need Merge pending push of Celosia to release tree.
[15 Mar 2010 15:50]
Paul DuBois
This is in mysql-trunk now. Closing.