Bug #65156 MySql crashes accessing information_schema.PROCESSLIST
Submitted: 30 Apr 2012 16:24 Modified: 27 Dec 2013 5:53
Reporter: sam munkes Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Information schema Severity:S1 (Critical)
Version:5.1.61-0+squeeze1-log OS:Linux (Debian Squeeze)
Assigned to: CPU Architecture:Any
Tags: crash processlist segfault proc_info

[30 Apr 2012 16:24] sam munkes
Description:
Our production MySql instance crashed when accessing the information_schema.PROCESSLIST table.

This is the second time this happened, so this is definitely a bug in MySql.

Error log dump:
120429  4:22:07 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=6442450944
read_buffer_size=2097152
max_used_connections=144
max_threads=500
threads_connected=52
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 9368608 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x7fa591c0cb00
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f9eab73be88 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x29) [0x7fa56cf23829]
/usr/sbin/mysqld(handle_segfault+0x404) [0x7fa56cc2db14]
/lib/libpthread.so.0(+0xef60) [0x7fa56c48ff60]
/lib/libc.so.6(+0x107382) [0x7fa56b008382]
/usr/sbin/mysqld(fill_schema_processlist(THD*, TABLE_LIST*, Item*)+0x185) [0x7fa56cd5b655]
/usr/sbin/mysqld(get_schema_tables_result(JOIN*, enum_schema_table_state)+0x200) [0x7fa56cd52210]
/usr/sbin/mysqld(JOIN::exec()+0x4e5) [0x7fa56ccaf235]
/usr/sbin/mysqld(mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_$
/usr/sbin/mysqld(handle_select(THD*, st_lex*, select_result*, unsigned long)+0x174) [0x7fa56ccb0f24]
/usr/sbin/mysqld(+0x37edfa) [0x7fa56cc3bdfa]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0x516) [0x7fa56cc3fdf6]
/usr/sbin/mysqld(mysql_parse(THD*, char const*, unsigned int, char const**)+0x3fb) [0x7fa56cc4530b]
/usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0xb34) [0x7fa56cc45e54]
/usr/sbin/mysqld(do_command(THD*)+0xea) [0x7fa56cc46d3a]
/usr/sbin/mysqld(handle_one_connection+0x235) [0x7fa56cc38a25]
/lib/libpthread.so.0(+0x68ba) [0x7fa56c4878ba]
/lib/libc.so.6(clone+0x6d) [0x7fa56afd002d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x7fa591cf5070 = SELECT COUNT(*) FROM information_schema.PROCESSLIST WHERE Command NOT IN ('Sleep', 'Daemon','Binlog Dump') AND USER <> 'system user'
thd->thread_id=471
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
120429  4:22:09 [Note] Plugin 'FEDERATED' is disabled.
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins
InnoDB: Compressed tables use zlib 1.2.3.4
120429  4:22:10  InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 1762350987596
120429  4:22:11  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.

Error log dump from the first time this happened (4/16):
120416 11:03:33 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=6442450944
read_buffer_size=2097152
max_used_connections=224
max_threads=500
threads_connected=94
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 9368608 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x7f1e0c348890
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x7f1dd7c70e88 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x29) [0x7f24dcdbc829]
/usr/sbin/mysqld(handle_segfault+0x404) [0x7f24dcac6b14]
/lib/libpthread.so.0(+0xef60) [0x7f24dc328f60]
/lib/libc.so.6(+0x107382) [0x7f24daea1382]
/usr/sbin/mysqld(fill_schema_processlist(THD*, TABLE_LIST*, Item*)+0x185) [0x7f24dcbf4655]
/usr/sbin/mysqld(get_schema_tables_result(JOIN*, enum_schema_table_state)+0x200) [0x7f24dcbeb210]
/usr/sbin/mysqld(JOIN::exec()+0x4e5) [0x7f24dcb48235]
/usr/sbin/mysqld(mysql_select(THD*, Item***, TABLE_LIST*, unsigned int, List<Item>&, Item*, unsigned int, st_order*, st_order*, Item*, st_order*, unsigned long long, select_result*, st_select_$
/usr/sbin/mysqld(handle_select(THD*, st_lex*, select_result*, unsigned long)+0x174) [0x7f24dcb49f24]
/usr/sbin/mysqld(+0x37edfa) [0x7f24dcad4dfa]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0x516) [0x7f24dcad8df6]
/usr/sbin/mysqld(mysql_parse(THD*, char const*, unsigned int, char const**)+0x3fb) [0x7f24dcade30b]
/usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0xb34) [0x7f24dcadee54]
/usr/sbin/mysqld(do_command(THD*)+0xea) [0x7f24dcadfd3a]
/usr/sbin/mysqld(handle_one_connection+0x235) [0x7f24dcad1a25]
/lib/libpthread.so.0(+0x68ba) [0x7f24dc3208ba]
/lib/libc.so.6(clone+0x6d) [0x7f24dae6902d]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x7f1df1c02540 is an invalid pointer
thd->thread_id=1911160
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
120416 11:03:36 [Note] Plugin 'FEDERATED' is disabled.
InnoDB: The InnoDB memory heap is disabled
InnoDB: Mutexes and rw_locks use GCC atomic builtins
InnoDB: Compressed tables use zlib 1.2.3.4
120416 11:03:37  InnoDB: highest supported file format is Barracuda.
InnoDB: Log scan progressed past the checkpoint lsn 1679439864940
120416 11:03:37  InnoDB: Database was not shut down normally!
InnoDB: Starting crash recovery.

See bug #51377 which was *supposed* to be fixed in 5.1.46.

Also, See Bugs #55034 & #59223

How to repeat:
Hard to repeat, must be a timing issue between threads.
[30 Apr 2012 19:40] Valeriy Kravchuk
Please, check if the same problem still happens with a newer version, 5.1.62.
[30 Apr 2012 20:20] sam munkes
Was there something that changed since 5.1.49 that would fix this issue?
[1 May 2012 8:49] Sveta Smirnova
Thank you for the feedback.

There were a lot of changes after 5.1.49 and this can be easily fixed. Trace also reminds me old bug #43357, but if analyse that old bug and its fix, then create test case, it does not fail in current 5.1 So, please, try current version of MySQL server in your environment, so we are sure problem is still repeatable.
[1 May 2012 14:29] sam munkes
Thanks,

We will upgrade to 5.1.61 (latest Debian stable) over the weekend. Since this issue is intermittent, it might take a while to confirm the issue was fixed.
[1 May 2012 15:12] MySQL Verification Team
also keep in mind you could be hitting bug #58198 which isn't fixed yet.
[2 Jun 2012 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[17 Jul 2012 16:45] sam munkes
Well, this bug just hit us again. Our production master DB crashed...

This is a very serious issue, as there is no way of knowing when this may hit again.

Any suggestions? 

This is what the error logs give:

mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

key_buffer_size=6442450944
read_buffer_size=2097152
max_used_connections=103
max_threads=500
thread_count=72
connection_count=70
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 9368631 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f8ffb176030
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f901a389e88 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x29) [0x7f96dd05eb59]
/usr/sbin/mysqld(handle_fatal_signal+0x483) [0x7f96dce734a3]
/lib/libpthread.so.0(+0xeff0) [0x7f96dc5c5ff0]
/usr/sbin/mysqld(mysqld_list_processes(THD*, char const*, bool)+0x8e2) [0x7f96dce94e12]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0xd75) [0x7f96dcd78035]
/usr/sbin/mysqld(mysql_parse(THD*, char*, unsigned int, char const**)+0x3fb) [0x7f96dcd7cdbb]
/usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x115a) [0x7f96dcd7df2a]
/usr/sbin/mysqld(do_command(THD*)+0xea) [0x7f96dcd7e85a]
/usr/sbin/mysqld(handle_one_connection+0x235) [0x7f96dcd70435]
/lib/libpthread.so.0(+0x68ca) [0x7f96dc5bd8ca]
/lib/libc.so.6(clone+0x6d) [0x7f96db10586d]
/lib/libpthread.so.0(+0x68ca) [0x7f96dc5bd8ca]
/lib/libc.so.6(clone+0x6d) [0x7f96db10586d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f8ffd98ee80): is an invalid pointer
Connection ID (thread ID): 977594
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

--
Thanks
[17 Jul 2012 16:47] sam munkes
One more note: since the first crash we switched from querying the information_schema.PROCESSLIST table directly, to the SHOW FULL PROCESSLIST command.
[17 Sep 2012 1:28] sam munkes
Guys, our MySql instance crashed again on Friday night...
Same bug, different day.

Can we please look into this & possible roll out a fix?

05:26:47 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed,
something is definitely wrong and this may fail.

key_buffer_size=6442450944
read_buffer_size=2097152
max_used_connections=129
max_threads=500
thread_count=42
connection_count=40
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 9368631 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f1393b07530
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f13b5785e88 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x29) [0x7f1a78aabb59]
/usr/sbin/mysqld(handle_fatal_signal+0x483) [0x7f1a788c04a3]
/lib/libpthread.so.0(+0xeff0) [0x7f1a78012ff0]
/usr/sbin/mysqld(mysqld_list_processes(THD*, char const*, bool)+0x8e2) [0x7f1a788e1e12]
/usr/sbin/mysqld(mysql_execute_command(THD*)+0xd75) [0x7f1a787c5035]
/usr/sbin/mysqld(mysql_parse(THD*, char*, unsigned int, char const**)+0x3fb) [0x7f1a787c9dbb]
/usr/sbin/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x115a) [0x7f1a787caf2a]
/usr/sbin/mysqld(do_command(THD*)+0xea) [0x7f1a787cb85a]
/usr/sbin/mysqld(handle_one_connection+0x235) [0x7f1a787bd435]
/lib/libpthread.so.0(+0x68ca) [0x7f1a7800a8ca]
/lib/libc.so.6(clone+0x6d) [0x7f1a76b5286d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f138d716300): is an invalid pointer
Connection ID (thread ID): 3905938
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
[17 Sep 2012 20:11] Sveta Smirnova
Thank you for the feedback.

Please check if this is not same as bug #58198: do you switch database while execute SHOW PROCESSLIST or SELECT ... FROM INFORMATION_SCHEMA.PROCESSLIST in parallel?

This can happen if you use USE DB; query or use full names, such as db_name.tbl_name, db_name.stored_routine_name while db_name is different from default database.
[20 Sep 2012 15:00] sam munkes
Since we use connection pooling & a single DB, it is unlikely that the issue is related to bug #58198 (The only time we USE DB is on the initial connect).

I think it has to do with the fact that the SHOW PROCESSLIST is trying to access a half-assed query that was either not fully created or not fully destroyed yet:

thd->query at 0x7f1df1c02540 is an invalid pointer
thd->thread_id=1911160
thd->killed=NOT_KILLED

Thanks.
[20 Sep 2012 18:49] sam munkes
Looking at the source code, is it possible that the segfault is due to a race condition between the checking and referencing of THD->proc_info?

void mysqld_list_processes(THD *thd,const char *user, bool verbose)
{
...
thd_info->state_info= (char*) (tmp->locked ? "Locked" :
   tmp->net.reading_or_writing ?
   (tmp->net.reading_or_writing == 2 ?
	"Writing to net" :
	thd_info->command == COM_SLEEP ? "" :
	"Reading from net") :
   tmp->proc_info ? tmp->proc_info :
   tmp->mysys_var &&
   tmp->mysys_var->current_cond ?
   "Waiting on cond" : NullS);
...

Is it possible that tmp->proc_info was changed from a valid pointer to 0 (via prepare_new_connection_state), and that is triggering the segfault?

Also, Is the Security_context immutable? IE the thread's security_ctx is always pointing to a Security_context that does not change?

Thanks.
[20 Sep 2012 19:42] Sveta Smirnova
Thank you for the feedback.

We have hidden bug #66412 about very same problem, but it has another backtrace.

Please check if you have repeating messages like "[ERROR] bin/mysqld: Sort aborted: Server shutdown in progress" in your server error log.
[20 Sep 2012 20:24] sam munkes
Sveta,

Got 3 Sort aborted in the past 6 months, but none specify why the sort was aborted.

$ grep -i "sort aborted" error.log
120402 12:07:42 [ERROR] /usr/sbin/mysqld: Sort aborted
120420 15:43:02 [ERROR] /usr/sbin/mysqld: Sort aborted
120828 10:32:47 [ERROR] /usr/sbin/mysqld: Sort aborted

Thanks.
[20 Sep 2012 22:50] sam munkes
Doing more research on the possibility of proc_info being modified by the other threads brought up this bug & fix showing similar segfault crashes with innodb's SHOW ENGINE STATUS:

Bug #38883 
Fix: http://lists.mysql.com/commits/58172

Can someone please apply the same fix (use a temp variable) to mysqld_list_processes & fill_schema_processlist?
[20 Sep 2012 22:50] sam munkes
updated tags
[1 Feb 2013 16:22] MySQL Verification Team
Sorry, we just can not blindly apply patches for the problems that we are unable to reproduce.

Can you please provide us with fully repeatable test case and we shall gladly attempt to fix a bug.

On the side note, sort is aborted most frequently due to lack of memory or disk space  on a partition which contains temporary directory.
[2 Mar 2013 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[27 Nov 2013 5:45] jony wang
Hi  We meet the same problem with  mysql  5.1.45

mysql crash  when we run "show processlist"

we found following message in mysql.err

======================================
131126 19:48:29 - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help diagnose
the problem, but since we have already crashed, something is definitely wrong
and this may fail.

key_buffer_size=268435456
read_buffer_size=2097152
max_used_connections=516
max_threads=2000
threads_connected=304
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 8474487 K
bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

thd: 0x7fa7536c1580
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0x46f801b8 thread_stack 0x30000
/home/mysql/mysql/libexec/mysqld(my_print_stacktrace+0x2f) [0x8d47b5]
/home/mysql/mysql/libexec/mysqld(handle_segfault+0x253) [0x5e1e61]
/lib64/tls/libpthread.so.0 [0x302b80c420]
/lib64/tls/libc.so.6(strlen+0x30) [0x302af6ff40]
/home/mysql/mysql/libexec/mysqld(strdup_root+0x19) [0x8bee0c]
/home/mysql/mysql/libexec/mysqld(Query_arena::strdup(char const*)+0x21) [0x5301bf]
/home/mysql/mysql/libexec/mysqld(mysqld_list_processes(THD*, char const*, bool)+0x541) [0x71c885]
/home/mysql/mysql/libexec/mysqld(mysql_execute_command(THD*)+0x2b1c) [0x5f1b3a]
/home/mysql/mysql/libexec/mysqld(mysql_parse(THD*, char const*, unsigned int, char const**)+0x19c) [0x5f7892]
/home/mysql/mysql/libexec/mysqld(dispatch_command(enum_server_command, THD*, char*, unsigned int)+0x984) [0x5edb72]
/home/mysql/mysql/libexec/mysqld(do_command(THD*)+0x13c) [0x5ed066]
/home/mysql/mysql/libexec/mysqld(handle_one_connection+0x149) [0x5ebb17]
/lib64/tls/libpthread.so.0 [0x302b80610a]
/lib64/tls/libc.so.6(__clone+0x73) [0x302afc6003]
Trying to get some variables.
Some pointers may be invalid and cause the dump to abort...
thd->query at 0x7fa74caa7dc0 is an invalid pointer
thd->thread_id=62698620
thd->killed=NOT_KILLED
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
131126 19:48:34 mysqld_safe Number of processes running now: 0
131126 19:48:34 mysqld_safe mysqld restarted
[27 Nov 2013 5:53] MySQL Verification Team
5.1.45 is too old. please try 5.1.72, 5.5.34, or 5.6.14.
At least, related bug #58198 is fixed in 5.1.72, hence my request.
[28 Dec 2013 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".