Bug #85852 flush query cache crashes MySQL
Submitted: 7 Apr 2017 9:58
Modified: 23 Nov 2017 14:21
Reporter: Matthias Wolle
Status: No Feedback
Category: MySQL Server: Query Cache
Severity: S2 (Serious)
Version: 5.6.31, 5.6.35, 5.7.17
OS: CentOS (6.4, 6.9, 7.3)
Assigned to:
CPU Architecture: Any
Tags: query_cache

[7 Apr 2017 9:58] Matthias Wolle
Description:
I have experienced infrequent server crashes while running flush query cache commands.
The following MySQL Community Server versions were tested and are affected:
Version: '5.6.31'
Version: '5.6.35'
Version: '5.7.17'

stack trace:
key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=20
max_threads=1024
thread_count=20
connection_count=20
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 423264 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7fcca029e360
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7fcd22b63e28 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0xf42a95]
/usr/sbin/mysqld(handle_fatal_signal+0x4a4)[0x7cd8b4]
/lib64/libpthread.so.0(+0xf7e0)[0x7fcd753947e0]
/usr/sbin/mysqld(_ZN17Query_cache_query16unlock_n_destroyEv+0x13)[0xcbc7d3]
/usr/sbin/mysqld(_ZN11Query_cache12move_by_typeEPPhPP17Query_cache_blockPmS3_+0x3c3)[0xcbfef3]
/usr/sbin/mysqld(_ZN11Query_cache10pack_cacheEv+0x5d)[0xcc044d]
/usr/sbin/mysqld(_ZN11Query_cache4packEmj+0x60)[0xcc0520]
/usr/sbin/mysqld(_Z20reload_acl_and_cacheP3THDmP10TABLE_LISTPi+0x435)[0xd3b2f5]
/usr/sbin/mysqld(_Z21mysql_execute_commandP3THDb+0x86d)[0xd0a0ed]
/usr/sbin/mysqld(_Z11mysql_parseP3THDP12Parser_state+0x3a5)[0xd0ed95]
/usr/sbin/mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0x1778)[0xd10578]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x194)[0xd10ea4]
/usr/sbin/mysqld(handle_connection+0x29c)[0xde3f1c]
/usr/sbin/mysqld(pfs_spawn_thread+0x174)[0xf60b74]
/lib64/libpthread.so.0(+0x7aa1)[0x7fcd7538caa1]
/lib64/libc.so.6(clone+0x6d)[0x7fcd742f6bbd]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7fcca01a78f0): is an invalid pointer
Connection ID (thread ID): 959
Status: NOT_KILLED
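
The frames in the backtrace above are mangled C++ symbols. As a minimal sketch, assuming the GNU binutils tool `c++filt` is installed, the trace can be made readable by piping it through that tool. The function name `demangle_trace` is hypothetical and used only for illustration.

```shell
#!/bin/sh
# Sketch: demangle the C++ symbols in a mysqld backtrace.
# Assumes c++filt (GNU binutils) is on PATH.
# demangle_trace is a hypothetical helper name, not part of MySQL.

demangle_trace() {
    # Reads backtrace lines on stdin and writes them to stdout with
    # mangled names such as _ZN11Query_cache10pack_cacheEv demangled.
    c++filt
}

# Example usage with one frame copied from the trace above:
# echo '/usr/sbin/mysqld(_ZN11Query_cache10pack_cacheEv+0x5d)[0xcc044d]' | demangle_trace
```

Demangled, the top frames read Query_cache_query::unlock_n_destroy(), called from Query_cache::move_by_type(...), Query_cache::pack_cache() and Query_cache::pack(...), which suggests the crash happens while the cache is being defragmented.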

How to repeat:
I stressed the database with the JMeter performance test for SugarCRM (https://github.com/sugarcrm/performance) against a SugarCRM 7.7.1.2 installation. JMeter ran 15 threads with 10 different users.
Unfortunately, the performance tool is invite-only: https://developer.sugarcrm.com/2015/07/27/sugar-7-unit-tests-and-performance-testing-tools...

The load test produced approximately 100 SELECT statements per second.
At the same time, I ran FLUSH QUERY CACHE in a loop:
while [ true ]; do echo  "flush query cache;"| mysql; done

Under these conditions the server crashed approximately every 3 minutes.
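
A slightly more controlled variant of the loop above can be sketched as follows. It sends a bounded number of flushes and stops at the first client error, which makes it easier to note when the server went away. The names `MYSQL_CMD` and `flush_loop` are hypothetical, and the sketch assumes the stock `mysql` client can connect without extra options (for example via ~/.my.cnf).

```shell
#!/bin/sh
# Sketch of the flush loop with a bounded iteration count and error detection.
# MYSQL_CMD and flush_loop are hypothetical names; MYSQL_CMD defaults to the
# stock "mysql" client and is assumed to find its credentials on its own.
MYSQL_CMD="${MYSQL_CMD:-mysql}"

flush_loop() {
    # $1: maximum number of FLUSH QUERY CACHE statements to send
    max="$1"
    i=0
    while [ "$i" -lt "$max" ]; do
        # Stop as soon as the client fails, e.g. because mysqld went away.
        if ! echo "FLUSH QUERY CACHE;" | $MYSQL_CMD; then
            echo "client failed after $i successful flushes" >&2
            return 1
        fi
        i=$((i + 1))
    done
    echo "sent $i flushes without a client error"
}

# Example usage while the load test is running:
# flush_loop 100000
```

A client error is only a hint that the server crashed; the server error log remains the place to confirm it.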
[7 Apr 2017 10:37] MySQL Verification Team
Hello Matthias Wolle,

Thank you for the report.
Could you please provide exact conf file used for these tests? Please mark it as private after posting here.

Thanks,
Umesh
[23 Oct 2017 14:21] MySQL Verification Team
Hi!

I would like to inform you first that the query cache is deprecated as of MySQL 5.7.20, and is removed in MySQL 8.0. That means that if this bug is to be fixed, it would be fixed only in 5.6. I must also underline that, since this is a crashing bug, we are very interested in fixing it.

However, in order to go any further with this bug, we require a test case. A test case must be complete and fully repeatable when run on any of our 5.6 binaries. That means it should include all the settings, all tables which were queried, their full dumps, and all the queries that filled up the query cache, whereupon the crash occurs. Please upload the file, with all the info that I specified above, to this bug report. Use the "Files" tab to perform the upload.

Only when we have a fully repeatable test case can we determine the cause of the crash and thereafter we could fix it.

Thank you in advance.
[24 Nov 2017 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".
[21 Dec 2017 4:38] Eugene Zheganin
MySQL 5.7.20 config

Attachment: my.cnf (application/octet-stream, text), 1.76 KiB.

[21 Dec 2017 4:43] Eugene Zheganin
Hi,

I'm getting these crashes very frequently.
Backtrace:

===Cut===
16:31:10 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=16777216
read_buffer_size=8388608
max_used_connections=103
max_threads=512
thread_count=78
connection_count=78
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 4336312 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f5a783229f0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f5bbdccfe70 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x3b)[0xef8feb]
/usr/sbin/mysqld(handle_fatal_signal+0x461)[0x7b0191]
/lib64/libpthread.so.0(+0xf5e0)[0x7f5ea279a5e0]
/usr/sbin/mysqld(_ZN11Query_cache14get_free_blockEmcm+0x42)[0xc78732]
/usr/sbin/mysqld(_ZN11Query_cache14allocate_blockEmcm+0x3e)[0xc7902e]
/usr/sbin/mysqld(_ZN11Query_cache19allocate_data_chainEPP17Query_cache_blockmS1_c+0xbf)[0xc7a8cf]
/usr/sbin/mysqld(_ZN11Query_cache17write_result_dataEPP17Query_cache_blockmPhS1_NS0_10block_typeE+0x32)[0xc7a992]
/usr/sbin/mysqld(_ZN11Query_cache18append_result_dataEPP17Query_cache_blockmPhS1_+0x15e)[0xc7ab8e]
/usr/sbin/mysqld(_ZN11Query_cache6insertEP15Query_cache_tlsPKcmj+0xfd)[0xc7ac9d]
/usr/sbin/mysqld(net_write_packet+0x38)[0xc1e578]
/usr/sbin/mysqld(net_flush+0x23)[0xc1e7c3]
/usr/sbin/mysqld(_Z12net_send_eofP3THDjj+0x105)[0xc2cdf5]
/usr/sbin/mysqld(_ZN3THD21send_statement_statusEv+0x9f)[0xc85cbf]
/usr/sbin/mysqld(_Z16dispatch_commandP3THDPK8COM_DATA19enum_server_command+0x335)[0xcc9d55]
/usr/sbin/mysqld(_Z10do_commandP3THD+0x19f)[0xccbeef]
/usr/sbin/mysqld(handle_connection+0x288)[0xd8b668]
/usr/sbin/mysqld(pfs_spawn_thread+0x1b4)[0x126f4a4]
/lib64/libpthread.so.0(+0x7e25)[0x7f5ea2792e25]
/lib64/libc.so.6(clone+0x6d)[0x7f5ea124f34d]

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (7f5a781f6fe0): SELECT * FROM `file` WHERE `id`=8681974
Connection ID (thread ID): 66006
Status: NOT_KILLED

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
===Cut===

And yeah, I'm pretty sure it can safely use up to 4 GB, because this server has 32 GB of RAM and it's a dedicated MySQL server.

I was also getting these crashes on 5.6.x after some version update. I then decided that this was probably a memory problem, but memtest 5.x didn't find any errors after running for a whole day (3 passes).

The crashes seem to have gone away after I disabled the query cache.
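
For readers who want to try the same workaround, a minimal sketch of switching the query cache off at runtime is shown below. The helper name `disable_query_cache_sql` is hypothetical. The two system variables it sets, query_cache_type and query_cache_size, exist in MySQL 5.6 and 5.7. This is not necessarily how the cache was disabled on the server described above.

```shell
#!/bin/sh
# Sketch: emit the statements that switch the query cache off at runtime.
# disable_query_cache_sql is a hypothetical helper name. The two system
# variables exist in MySQL 5.6 and 5.7 (the query cache is removed in 8.0).

disable_query_cache_sql() {
    cat <<'EOF'
SET GLOBAL query_cache_type = 0;
SET GLOBAL query_cache_size = 0;
EOF
}

# Example usage (requires an account allowed to set global variables):
# disable_query_cache_sql | mysql
```

To keep the setting across restarts, the same two variables would also be set to 0 in the [mysqld] section of my.cnf.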
[21 Dec 2017 4:43] Eugene Zheganin
Follow-up: and yeah, I've attached the config file you requested.
[21 Dec 2017 13:59] MySQL Verification Team
I will repeat what I wrote before:

In order to go any further with this bug, we require a test
case. A test case must be complete and fully repeatable when run on any
of our 5.6 binaries. That means it should include all the settings,
all tables which were queried, their full dumps, and all the queries that
filled up the query cache, whereupon the crash occurs. Please upload
the file, with all the info that I specified above, to this bug report. Use
the "Files" tab to perform the upload.

Only when we have a fully repeatable test case can we determine the
cause of the crash and thereafter we could fix it.

Thank you in advance.