Bug #72586 memcached daemon crashes and restarts on second ADD operation
Submitted: 9 May 2014 1:01 Modified: 30 Jul 2014 11:53
Reporter: David Schwartz Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Memcached Severity:S1 (Critical)
Version:5.6 OS:Linux (Centos 6.5)
Assigned to: CPU Architecture:Any

[9 May 2014 1:01] David Schwartz
Description:
Attempting to perform operations on a timed out key causes the server to crash.

There are several combinations of operations that can lead to this situation, including using SET or GETS and CAS also.

Here is the log info:

Version: '5.6.17'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)
Failed to read, and not due to blocking:
errno: 14 Bad address 
rcurr=7f8414000b5f ritem=1 rbuf=7f8414000b50 rlbytes=1 rsize=2048
Failed to read, and not due to blocking:
errno: 14 Bad address 
rcurr=7f8414000b5f ritem=1 rbuf=7f8414000b50 rlbytes=1 rsize=2048
10:35:47 UTC - mysqld got signal 7 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=0
max_threads=151
thread_count=0
connection_count=0
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68245 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/usr/sbin/mysqld(my_print_stacktrace+0x35)[0x8d6475]
/usr/sbin/mysqld(handle_fatal_signal+0x4a4)[0x663084]
/lib64/libpthread.so.0(+0xf710)[0x7f8466c4a710]
/lib64/libc.so.6(memmove+0xa7)[0x7f8465946b87]
/usr/lib64/mysql/plugin/libmemcached.so(conn_nread+0x125)[0x7f8464038b95]
/usr/lib64/mysql/plugin/libmemcached.so(+0x9cde)[0x7f846402dcde]
/usr/lib64/mysql/plugin/libmemcached.so(event_base_loop+0x286)[0x7f846403fe26]
/usr/lib64/mysql/plugin/libmemcached.so(+0x19872)[0x7f846403d872]
/lib64/libpthread.so.0(+0x79d1)[0x7f8466c429d1]
/lib64/libc.so.6(clone+0x6d)[0x7f84659abb6d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

How to repeat:
get one
delete one
add one 0 1 1
1
Wait two seconds then post the commands again.

Server crashes and restarts
[9 May 2014 1:05] David Schwartz
Name        : mysql-server
Arch        : x86_64
Version     : 5.1.73
Release     : 3.el6_5

mysql> SHOW VARIABLES LIKE "%version%";
+-------------------------+------------------------------+
| Variable_name           | Value                        |
+-------------------------+------------------------------+
| innodb_version          | 5.6.17                       |
| protocol_version        | 10                           |
| slave_type_conversions  |                              |
| version                 | 5.6.17                       |
| version_comment         | MySQL Community Server (GPL) |
| version_compile_machine | x86_64                       |
| version_compile_os      | Linux                        |
[8 Jul 2014 12:07] Umesh Shastry
Thank you for the report and test case.
In my tests I noticed that telnet session is terminated and error log is flooded with "errno: 14 Bad address":

# telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
get one
END
delete one
NOT_FOUND
add one 0 1 1
1
STORED
get one
END
delete one
DELETED
add one 0 1 1
1
Connection closed by foreign host.
# telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
gets one
END
delete one
NOT_FOUND
add one 0 1 1
1
STORED
get one
END
delete one
DELETED
add one 0 1 1
1
Connection closed by foreign host.
# telnet 127.0.0.1 11211
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
get one
END
delete one
NOT_FOUND
add one 0 1 1
1
STORED
gets one
END
delete one
DELETED
add one 0 1 1
1
Connection closed by foreign host.

// Error log

Failed to read, and not due to blocking:
errno: 14 Bad address
rcurr=7f6d8c000b5f ritem=1 rbuf=7f6d8c000b50 rlbytes=1 rsize=2048
Failed to read, and not due to blocking:
errno: 14 Bad address
rcurr=7f6d8c000b5f ritem=1 rbuf=7f6d8c000b50 rlbytes=1 rsize=2048
Failed to read, and not due to blocking:
errno: 14 Bad address
rcurr=7f6d8c000b5f ritem=1 rbuf=7f6d8c000b50 rlbytes=1 rsize=2048
Failed to read, and not due to blocking:
errno: 14 Bad address
rcurr=7f6d8c000b5f ritem=1 rbuf=7f6d8c000b50 rlbytes=1 rsize=2048
Failed to read, and not due to blocking:
errno: 14 Bad address
rcurr=7f6d8c000b5f ritem=1 rbuf=7f6d8c000b50 rlbytes=1 rsize=2048
Failed to read, and not due to blocking:
errno: 14 Bad address
rcurr=7f6d8c000b5f ritem=1 rbuf=7f6d8c000b50 rlbytes=1 rsize=2048
[root@cluster-repo mysql-advanced-5.6.20]#  date
Thu Jul 10 08:19:00 IST 2014
[8 Jul 2014 12:10] Umesh Shastry
// 5.6.20

mysql> show variables like '%version%';
+-------------------------+---------------------------------------------------------+
| Variable_name           | Value                                                   |
+-------------------------+---------------------------------------------------------+
| innodb_version          | 5.6.20                                                  |
| protocol_version        | 10                                                      |
| slave_type_conversions  |                                                         |
| version                 | 5.6.20-enterprise-commercial-advanced-log               |
| version_comment         | MySQL Enterprise Server - Advanced Edition (Commercial) |
| version_compile_machine | x86_64                                                  |
| version_compile_os      | Linux                                                   |
+-------------------------+---------------------------------------------------------+
7 rows in set (0.00 sec)
[30 Jul 2014 11:53] Daniel Price
Fixed as of the upcoming 5.6.21, 5.7.5 release, and here's the changelog entry:

Attempting to perform operations on a timed out key would cause the
"memcached" daemon to crash and restart. 

Thank you for the bug report.
[25 Sep 2014 12:07] Laurynas Biveinis
revno: 6098
committer: Jimmy Yang <jimmy.yang@oracle.com>
branch nick: mysql-5.6
timestamp: Wed 2014-07-30 09:35:07 +0800
message:
  fix Bug #19172013 - MEMCACHED DAEMON CRASHES AND RESTARTS ON SECOND ADD
  OPERATION 
              
  rb://6067 approved by Sunny Bains