Bug #93961 mysqld SEGFAULT under load from several memcached plugin clients
Submitted: 17 Jan 2019 11:25 Modified: 28 May 2019 12:49
Reporter: John Daragon Email Updates:
Status: Closed Impact on me:
None 
Category:MySQL Server: Memcached Severity:S1 (Critical)
Version:5.7.25 OS:Linux (Ubuntu Server 18.04)
Assigned to: CPU Architecture:x86 (8 core, 32GiB RAM, 100 GB SSD)

[17 Jan 2019 11:25] John Daragon
Description:
A similar problem was closed "Not a Bug" under #80273 (for an earlier release)on 24-NOV-18.

We have an Amazon AWS RDS MySQL instance running a single pre-populated table accessed via the MySQL memcached plugin.  We have a maximum of 16 concurrent clients, each of which is single-threaded, issuing many gets and many fewer sets. The payload for each get/set is relatively large; the columns of the table are two 8 byte CHARs and a BLOB, but all are much smaller than the maximum payload of 1MB.

Our clients are AWS Lambda functions so they start and stop asynchronously. We run a maximum of 16 concurrently. 

NOTE: We have not yet tried to demonstrate this behaviour with a single client.

Here is the stack trace (and some other log data supplied by the RDS Error Log)

16:21:21 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.
key_buffer_size=16777216
read_buffer_size=262144
max_used_connections=6
max_threads=2538
thread_count=5
connection_count=5
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1349805 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...

stack_bottom = 0 thread_stack 0x40000
/rdsdbbin/mysql/bin/mysqld(my_print_stacktrace+0x2c)[0xf28b3c]
/rdsdbbin/mysql/bin/mysqld(handle_fatal_signal+0x47b)[0x8432ab]
/lib64/libpthread.so.0(+0xf100)[0x2b7bdf80b100]
/lib64/libc.so.6(+0x150666)[0x2b7be08d9666]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin//innodb_engine.so(+0x8b0db)[0x2b824068f0db]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(+0xdeba)[0x2b7bdf532eba]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(+0x120ce)[0x2b7bdf5370ce]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(conn_parse_cmd+0x11)[0x2b7bdf5392d1]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(+0xade8)[0x2b7bdf52fde8]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(event_base_loop+0x605)[0x2b7bdf5433e5]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(+0x1a062)[0x2b7bdf53f062]
/lib64/libpthread.so.0(+0x7dc5)[0x2b7bdf803dc5]
/lib64/libc.so.6(clone+0x6d)[0x2b7be087fc9d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.  

How to repeat:
Because this has been reported before, I don't think it's specific to our workload or data. We may be able to supply a database dump and replication data.

1) Create database table with CHAR primary key and BLOB column.  
2) Map to memcached
3) Read and write at scale

Suggested fix:
None
[18 Jan 2019 13:33] MySQL Verification Team
Hi,

First of all, the older bug that you are mentioning was not closed as "Not a Bug", but we simply never got the feedback back from the reporter.

Next, in your case it seems that you are running of of the thread stack space. Increase it to 2 Mb and if it does not help, to 4 Mb.

If that does not help either, then please, send us the exact configuration, queries to run, number of threads and all details that are necessary to repeat the crash with 5.7.23 or 5.7.24.
[21 Jan 2019 9:18] John Daragon
Thanks for the fast response.

You're right, of course. I conflated the ""Not a Bug status with closure. Sorry!

I'll try to get the stack space increased on this managed instance.
[21 Jan 2019 13:25] MySQL Verification Team
Hi,

Please, first increase thread stack to the values that I recommended.
[22 Jan 2019 18:24] John Daragon
Increased stack to 8MiB, and we're still seeing the same symptoms:

18:10:34 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=16777216
read_buffer_size=262144
max_used_connections=4
max_threads=2538
thread_count=1
connection_count=1
It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 1349805 K bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x800000
/rdsdbbin/mysql/bin/mysqld(my_print_stacktrace+0x2c)[0xf28b3c]
/rdsdbbin/mysql/bin/mysqld(handle_fatal_signal+0x47b)[0x8432ab]
/lib64/libpthread.so.0(+0xf100)[0x2b43fb4cd100]
/lib64/libc.so.6(+0x150666)[0x2b43fc59b666]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin//innodb_engine.so(+0x8b0db)[0x2b43fca9a0db]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(+0xdeba)[0x2b43fb1f4eba]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(+0x120ce)[0x2b43fb1f90ce]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(conn_parse_cmd+0x11)[0x2b43fb1fb2d1]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(+0xade8)[0x2b43fb1f1de8]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(event_base_loop+0x605)[0x2b43fb2053e5]
/rdsdbbin/mysql-5.7.23.R4/lib/plugin/libmemcached.so(+0x1a062)[0x2b43fb201062]
/lib64/libpthread.so.0(+0x7dc5)[0x2b43fb4c5dc5]
/lib64/libc.so.6(clone+0x6d)[0x2b43fc541c9d]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
2019-01-22T18:10:46.739597Z 0 [Warning] 'NO_AUTO_CREATE_USER' sql mode was not set.
2019-01-22T18:10:46.739704Z 0 [Warning] Insecure configuration for --secure-file-priv: Location is accessible to all OS users. Consider choosing a different directory.
2019-01-22T18:10:46.739744Z 0 [Note] /rdsdbbin/mysql/bin/mysqld (mysqld 5.7.23) starting as process 3401 ...
2019-01-22T18:10:46.741227Z 0 [Warning] You need to use --log-bin to make --log-slave-updates work.
2019-01-22T18:10:46.743007Z 0 [Note] InnoDB: PUNCH HOLE support available
2019-01-22T18:10:46.743023Z 0 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2019-01-22T18:10:46.743027Z 0 [Note] InnoDB: Uses event mutexes
2019-01-22T18:10:46.743030Z 0 [Note] InnoDB: GCC builtin __atomic_thread_fence() is used for memory barrier
2019-01-22T18:10:46.743033Z 0 [Note] InnoDB: Compressed tables use zlib 1.2.8
2019-01-22T18:10:46.744107Z 0 [Note] InnoDB: Number of pools: 1
2019-01-22T18:10:46.744209Z 0 [Note] InnoDB: Using CPU crc32 instructions
2019-01-22T18:10:46.745930Z 0 [Note] InnoDB: Initializing buffer pool, total size = 23G, instances = 8, chunk size = 128M
2019-01-22T18:10:48.096816Z 0 [Note] InnoDB: Completed initialization of buffer pool
2019-01-22T18:10:48.414047Z 0 [Note] InnoDB: If the mysqld execution user is authorized, page cleaner thread priority can be changed. See the man page of setpriority().
2019-01-22T18:10:48.435700Z 0 [Note] InnoDB: Highest supported file format is Barracuda.
2019-01-22T18:10:48.662034Z 0 [Note] InnoDB: Log scan progressed past the checkpoint lsn 887232607054
2019-01-22T18:10:48.672998Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 887237849600
2019-01-22T18:10:48.684312Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 887243092480
2019-01-22T18:10:48.705712Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 887248335360
2019-01-22T18:10:48.723925Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 887253578240
2019-01-22T18:10:48.741644Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 887258821120
2019-01-22T18:10:48.760979Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 887264064000
2019-01-22T18:10:48.784980Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 887269306880
2019-01-22T18:10:48.818305Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 887274549760
2019-01-22T18:10:48.851021Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 887279792640
2019-01-22T18:10:48.873684Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 887285035520
2019-01-22T18:10:48.886871Z 0 [Note] InnoDB: Doing recovery: scanned up to log sequence number 887289352098
2019-01-22T18:10:48.886947Z 0 [Note] InnoDB: Database was not shutdown normally!
2019-01-22T18:10:48.886954Z 0 [Note] InnoDB: Starting crash recovery.
2019-01-22T18:10:49.176225Z 0 [Note] InnoDB: Starting an apply batch of log records to the database...
InnoDB: Progress in percent: 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 
2019-01-22T18:10:50.927371Z 0 [Note] InnoDB: Apply batch completed
2019-01-22T18:10:50.927408Z 0 [Note] InnoDB: Last MySQL binlog file position 0 511, file name OFF.000002
2019-01-22T18:10:51.045717Z 0 [Note] InnoDB: Removed temporary tablespace data file: "ibtmp1"
2019-01-22T18:10:51.045737Z 0 [Note] InnoDB: Creating shared tablespace for temporary tables
2019-01-22T18:10:51.045769Z 0 [Note] InnoDB: Setting file '/rdsdbdata/db/innodb/ibtmp1' size to 12 MB. Physically writing the file full; Please wait ...
2019-01-22T18:10:51.076972Z 0 [Note] InnoDB: File '/rdsdbdata/db/innodb/ibtmp1' size is now 12 MB.
2019-01-22T18:10:51.077526Z 0 [Note] InnoDB: 96 redo rollback segment(s) found. 96 redo rollback segment(s) are active.
2019-01-22T18:10:51.077534Z 0 [Note] InnoDB: 32 non-redo rollback segment(s) are active.
2019-01-22T18:10:51.077726Z 0 [Note] InnoDB: Waiting for purge to start
2019-01-22T18:10:51.127911Z 0 [Note] InnoDB: 5.7.23 started; log sequence number 887289352098
2019-01-22T18:10:51.128244Z 0 [Note] InnoDB: Loading buffer pool(s) from /rdsdbdata/db/innodb/ib_buffer_pool
2019-01-22T18:10:51.128338Z 0 [Note] Plugin 'FEDERATED' is disabled.
InnoDB MEMCACHED: Memcached uses atomic increment 
2019-01-22T18:10:51.135325Z 0 [Note] Skipping generation of SSL certificates as options related to SSL are specified.
2019-01-22T18:10:51.135975Z 0 [Note] Skipping generation of RSA key pair as key files are present in data directory.
2019-01-22T18:10:51.136061Z 0 [Note] Server hostname (bind-address): '*'; port: 3306
2019-01-22T18:10:51.136096Z 0 [Note] IPv6 is available.
2019-01-22T18:10:51.136104Z 0 [Note] - '::' resolves to '::';
2019-01-22T18:10:51.136112Z 0 [Note] Server socket created on IP: '::'.
Loaded engine: InnoDB Memcache 5.7.23
Supplying the following features: LRU, persistent storage, compare and swap
2019-01-22T18:10:51.205022Z 0 [Note] Event Scheduler: Loaded 0 events
2019-01-22T18:10:51.205185Z 0 [Note] /rdsdbbin/mysql/bin/mysqld: ready for connections.
Version: '5.7.23' socket: '/tmp/mysql.sock' port: 3306 Source distribution
2019-01-22T18:10:51.547838Z 0 [Note] InnoDB: Buffer pool(s) load completed at 190122 18:10:51
----------------------- END OF LOG ----------------------
[23 Jan 2019 13:09] MySQL Verification Team
Hi,

So, if this is not a case, there is no other way for us to proceed except to attempt to reproduce the behaviour.

What we require is a complete MySQL & OS configuration, all data that are necessary to reproduce the crash, how do you load plugins and all statements that should be run so that we experience the crash.

We shall wait for your data.
[23 Jan 2019 13:30] John Daragon
OK - As this is running in a managed AWS instance I'll build a standalone Linux server and reproduce it on that so that you can have some certainty.
[23 Jan 2019 13:32] MySQL Verification Team
Just make sure to reproduce it.

Thanks in advance.

If you are running it in the managed instance, it could be possible that you reach certain limit there.
[20 Feb 2019 11:21] John Daragon
I've now replicated this using MySQL 5.7.25 under Ubuntu Server 18.04 and libevent-dev ( a plugin dependency) 2.1.8.

Here's the log message :

10:43:25 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=16777216
read_buffer_size=131072
max_used_connections=0
max_threads=151
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 76388 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x30000
/usr/sbin/mysqld(my_print_stacktrace+0x3b)[0xebeb7b]
/usr/sbin/mysqld(handle_fatal_signal+0x48b)[0x790f8b]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x12890)[0x7f6304971890]
/lib/x86_64-linux-gnu/libc.so.6(+0x18eb3e)[0x7f6303db8b3e]
/usr/lib/mysql/plugin//innodb_engine.so(+0x982c)[0x7f62dc3b282c]
/usr/lib/mysql/plugin/libmemcached.so(+0xb822)[0x7f62e40c3822]
/usr/lib/mysql/plugin/libmemcached.so(+0x102ad)[0x7f62e40c82ad]
/usr/lib/mysql/plugin/libmemcached.so(conn_parse_cmd+0x11)[0x7f62e40c9f51]
/usr/lib/mysql/plugin/libmemcached.so(+0x87c1)[0x7f62e40c07c1]
/usr/lib/x86_64-linux-gnu/libevent_core-2.1.so.6(+0x1aed8)[0x7f62dc5dfed8]
/usr/lib/x86_64-linux-gnu/libevent_core-2.1.so.6(event_base_loop+0x53f)[0x7f62dc5e091f]
/usr/lib/mysql/plugin/libmemcached.so(+0x17d82)[0x7f62e40cfd82]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76db)[0x7f63049666db]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x3f)[0x7f6303d4b88f]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.

The replication technique is a bit complex, and there's a lot of data. I'll get it compressed and uploaded later today.
[20 Feb 2019 11:35] John Daragon
Updated to reflect replication on 5.7.25 and Ubuntu
[20 Feb 2019 16:17] John Daragon
File mysql-bug-data-93961.tar uploaded to sftp.oracle.com:/support/incoming
[21 Feb 2019 8:25] John Daragon
Built a debug version of 5.7.25 from source last night, and the problem is reproducible in this, too.
[21 Feb 2019 14:05] MySQL Verification Team
Thank you for your latest feedback.

We are in the process of preparing a setup for the reproduction of this bug report.
[21 Feb 2019 15:58] MySQL Verification Team
Temporary re-assignment.
[22 Feb 2019 15:09] MySQL Verification Team
Hi Jon,

Any output expected from this go program? I did not recompile just run your binary and I see that load is not going over 2 on my test box and I don't see any output except the initial 
...
value of endpoint is localhost
value of port is 11211
value of fpath is ./xbm
value of endpoint is localhost
value of port is 11211
value of fpath is ./xbp
value of endpoint is localhost
...

30min into test no crashes yet

all best
Bogdan
[22 Feb 2019 15:47] John Daragon
No other output expected - I *could* update the code to report every (say) 10000 messages?
[22 Feb 2019 16:19] MySQL Verification Team
You have reported a crashing bug, so we need a way to repeat it.

If we do not repeat the crash, we shall have to close this report as "Not a bug".
[22 Feb 2019 16:24] MySQL Verification Team
Hi John,

this is running for a while now, no need to update the code I can do that myself, I just asked if you already did (before I checked the source code).. 

as you can see, no crashes here

top - 17:21:00 up 190 days, 19:01,  2 users,  load average: 3.71, 2.57, 2.25
Tasks: 217 total,   1 running, 216 sleeping,   0 stopped,   0 zombie
%Cpu(s): 21.9 us,  4.3 sy,  0.0 ni,  0.0 id, 71.8 wa,  0.7 hi,  1.3 si,  0.0 st
KiB Mem :  4027708 total,   120552 free,   973040 used,  2934116 buff/cache
KiB Swap:  4194300 total,  4079956 free,   114344 used.  2632152 avail Mem

   PID USER      PR  NI    VIRT    RES    SHR S %CPU %MEM     TIME+ COMMAND
125249 arhimed   20   0 1201784 470884  19256 S  7.6 11.7   6:24.66 mysqld
125597 arhimed   20   0  222456  13752   4100 S  1.7  0.3   0:27.42 exerciseMemcach
125517 arhimed   20   0  214260  13216   4088 S  1.3  0.3   0:13.25 exerciseMemcach
125531 arhimed   20   0  214260  13760   4036 S  1.0  0.3   0:13.12 exerciseMemcach
125562 arhimed   20   0  214260  13868   4092 S  1.0  0.3   0:13.02 exerciseMemcach
125614 arhimed   20   0  214260  12856   4104 S  1.0  0.3   0:11.30 exerciseMemcach
125617 arhimed   20   0  214260  13172   4076 S  1.0  0.3   0:19.51 exerciseMemcach
125513 arhimed   20   0  214260  13564   4124 S  0.7  0.3   0:12.52 exerciseMemcach
125520 arhimed   20   0  214260  13548   4048 S  0.7  0.3   0:13.42 exerciseMemcach
125521 arhimed   20   0  214260  13520   4116 S  0.7  0.3   0:13.97 exerciseMemcach
125522 arhimed   20   0  214260  13932   4048 S  0.7  0.3   0:13.78 exerciseMemcach
125527 arhimed   20   0  214260  14600   4104 S  0.7  0.4   0:15.28 exerciseMemcach
125528 arhimed   20   0  214260  13280   4108 S  0.7  0.3   0:20.51 exerciseMemcach
125566 arhimed   20   0  279796  13544   4100 S  0.7  0.3   0:18.07 exerciseMemcach
125620 arhimed   20   0  214260  13120   4080 S  0.7  0.3   0:17.82 exerciseMemcach
125624 arhimed   20   0  214260  13424   4080 S  0.7  0.3   0:12.51 exerciseMemcach
   469 root      20   0       0      0      0 S  0.3  0.0   1:12.19 jbd2/sda1-8
 46324 root      20   0  164516   4432   3572 R  0.3  0.1   0:00.01 top
125512 arhimed   20   0  214260  13092   4112 S  0.3  0.3   0:14.10 exerciseMemcach
125514 arhimed   20   0  214260  12928   4124 S  0.3  0.3   0:12.73 exerciseMemcach
125515 arhimed   20   0  214260  13328   4116 S  0.3  0.3   0:14.06 exerciseMemcach
125516 arhimed   20   0  214260  12440   4048 S  0.3  0.3   0:13.13 exerciseMemcach
125518 arhimed   20   0  214260  13216   4084 S  0.3  0.3   0:12.57 exerciseMemcach
125523 arhimed   20   0  214260  14020   4092 S  0.3  0.3   0:15.83 exerciseMemcach
125524 arhimed   20   0  140528  12696   4112 S  0.3  0.3   0:13.16 exerciseMemcach
125525 arhimed   20   0  214260  13480   4028 S  0.3  0.3   0:17.59 exerciseMemcach
125526 arhimed   20   0  214260  12972   4116 S  0.3  0.3   0:16.62 exerciseMemcach
125529 arhimed   20   0  214260  13144   4052 S  0.3  0.3   0:14.00 exerciseMemcach
125530 arhimed   20   0  214260  13380   4056 S  0.3  0.3   0:14.98 exerciseMemcach
125563 arhimed   20   0  222456  13432   4124 S  0.3  0.3   0:14.00 exerciseMemcach
125564 arhimed   20   0  279796  14120   4104 S  0.3  0.4   0:16.74 exerciseMemcach
125615 arhimed   20   0  214260  13444   4032 S  0.3  0.3   0:14.15 exerciseMemcach
125616 arhimed   20   0  214260  13304   4088 S  0.3  0.3   0:18.04 exerciseMemcach
125618 arhimed   20   0  214260  14588   4096 S  0.3  0.4   0:16.95 exerciseMemcach
125621 arhimed   20   0  214260  13736   4036 S  0.3  0.3   0:18.88 exerciseMemcach
125622 arhimed   20   0  140528  12996   4080 S  0.3  0.3   0:13.13 exerciseMemcach
125623 arhimed   20   0  214260  12816   4084 S  0.3  0.3   0:13.92 exerciseMemcach
     1 root      20   0  128036   4356   3028 S  0.0  0.1   1:16.23 systemd
     2 root      20   0       0      0      0 S  0.0  0.0   0:00.09 kthreadd
     3 root      20   0       0      0      0 S  0.0  0.0   0:02.10 ksoftirqd/0
     5 root       0 -20       0      0      0 S  0.0  0.0   0:00.00 kworker/0:0H

but not very high load neither. I'll let this run overnight but if no crash happens I need to know how to proceed. 
 - are you running this on a single box (mysqld and memcached client)
 - what your load is on that box when you have a crash
 - can I run 20 more clients from box2 and another 20 from box3 using same "data files" to try to increase load on the mysqld server

all best
Bogdan
[22 Feb 2019 16:56] MySQL Verification Team
Hi John,

Took bit longer then 30minutes but I verified using the go app and your data sample, thanks.

Version: '5.7.25'  socket: '/tmp/mysql_sandbox5725.sock'  port: 5725  MySQL Community Server (GPL)
16:48:58 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
Attempting to collect some information that could help diagnose the problem.
As this is a crash and something is definitely wrong, the information
collection process might fail.

key_buffer_size=8388608
read_buffer_size=131072
max_used_connections=1
max_threads=151
thread_count=0
connection_count=0
It is possible that mysqld could use up to
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 68196 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
/home/arhimed/opt/mysql/5.7.25/bin/mysqld(my_print_stacktrace+0x35)[0xf4f775]
/home/arhimed/opt/mysql/5.7.25/bin/mysqld(handle_fatal_signal+0x4a4)[0x7d1d24]
/lib64/libpthread.so.0(+0x10a00)[0x7f28a5926a00]
/lib64/libc.so.6(+0x15b316)[0x7f28a43c6316]
/home/arhimed/opt/mysql/5.7.25/lib/plugin//innodb_engine.so(+0xafbf)[0x7f28843bbfbf]
/home/arhimed/opt/mysql/5.7.25/lib/plugin/libmemcached.so(+0x1052b)[0x7f28845da52b]
/home/arhimed/opt/mysql/5.7.25/lib/plugin/libmemcached.so(+0x15781)[0x7f28845df781]
/home/arhimed/opt/mysql/5.7.25/lib/plugin/libmemcached.so(+0x166ac)[0x7f28845e06ac]
/home/arhimed/opt/mysql/5.7.25/lib/plugin/libmemcached.so(conn_parse_cmd+0x11)[0x7f28845e12f1]
/home/arhimed/opt/mysql/5.7.25/lib/plugin/libmemcached.so(+0x958c)[0x7f28845d358c]
/home/arhimed/opt/mysql/5.7.25/lib/plugin/libmemcached.so(event_base_loop+0x286)[0x7f28845e5ab6]
/home/arhimed/opt/mysql/5.7.25/lib/plugin/libmemcached.so(+0x19582)[0x7f28845e3582]
/lib64/libpthread.so.0(+0x761a)[0x7f28a591d61a]
/lib64/libc.so.6(clone+0x6d)[0x7f28a436d5fd]
The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
2019-02-22T16:48:58.966846Z mysqld_safe Number of processes running now: 0
2019-02-22T16:48:58.971658Z mysqld_safe mysqld restarted
2019-02-22T16:48:59.085781Z 0 [Warning] Changed limits: max_open_files: 1024 (requested 5000)
2019-02-22T16:48:59.085839Z 0 [Warning] Changed limits: table_open_cache: 431 (requested 2000)
2019-02-22T16:48:59.227439Z 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_defaults_for_timestamp server option (see documentation for more details).
2019-02-22T16:48:59.227482Z 0 [Note] --secure-file-priv is set to NULL. Operations related to importing and exporting data are disabled
2019-02-22T16:48:59.227498Z 0 [Note] /home/arhimed/opt/mysql/5.7.25/bin/mysqld (mysqld 5.7.25) starting as process 46453 ...
2019-02-22T16:48:59.232721Z 0 [Note] InnoDB: PUNCH HOLE support available
...
[23 Feb 2019 6:55] MySQL Verification Team
is it possible somebody can build mysqld themselves, run it in gdb, get a crash, and put here a proper stack trace so it's searchable and we know where crash happens?
[23 Feb 2019 22:36] MySQL Verification Team
thread apply all bt

Attachment: thappallbt.txt (text/plain), 42.61 KiB.

[24 Feb 2019 2:32] MySQL Verification Team
weird to verify using gdb ... the thappallbt.txt attached is crash that happened when I killall all the go instances...

but attm I'm running the test case for 4 hours with gdb attached to mysqld and no crash in site... I'll let it run trough weekend but.. (it's same build that crashed when gdb was not attached)
[24 Feb 2019 5:09] MySQL Verification Team
thread apply all bt - after crash

Attachment: tapallbt-1.txt (text/plain), 47.24 KiB.

[24 Feb 2019 5:13] MySQL Verification Team
(gdb) backtrace full
#0  __memmove_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2710
No locals.
#1  0x00007fc47c3bbfbf in innodb_get (handle=0x7fc464008680, cookie=<optimized out>, item=0x7fc4637fcea8, key=<optimized out>,
    nkey=<optimized out>, vbucket=<optimized out>)
    at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/plugin/innodb_memcached/innodb_memcache/src/innodb_engine.c:1936
        col_value = 0x7fc4700835f0
        i = <optimized out>
        c_value = <optimized out>
        value_end = 0x7fc3b648fcf6 "T0tS1Ml3+xCL3hkItkGQRdEuNu2mffuFaM+MD5KsgydpA+Rv/obSDK0DxbP+338s/76gYROcZzDL//zlb3/98X9//vHXX1bw/wusfvlpIUY0CzmFiIrIKyRaELQKpAhJwb8Rf9BBIdsFNVmgyXiHbjJBW5pQW2Np2R7LciTAZgigBT2CXQhIOa0YFJvrjx+Qsf2EjgNP"...
        int_buf = "`\314\177c\000\000\000\000\003\000\000\000\000\000\000\000\000\000\352\001"
        total_len = 38
        innodb_eng = 0x7fc464008680
        crsr = 0x7fc47003b3d8
        err = <optimized out>
        result = 0x7fc470073740
        err_ret = <optimized out>
        conn_data = 0x7fc470073630
        meta_info = <optimized out>
        option_length = <optimized out>
        option_delimiter = 0x7fc464027a84 "|"
        key_len = 8
        report_table_switch = <optimized out>
        __PRETTY_FUNCTION__ = "innodb_get"
#2  0x00007fc47c5da52b in process_get_command (c=0x7fc46c052050, tokens=0x7fc4637fcf20, ntokens=<optimized out>,
    return_cas=1 '\001')
    at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/plugin/innodb_memcached/daemon_memcached/daemon/memcached.c:4076
        ret = <optimized out>
        key = 0x7fc436ddf065 "cb4e9218"
        nkey = 8
        i = 0
        it = 0x7fc470020c60
        key_token = 0x7fc4637fcf30
        __PRETTY_FUNCTION__ = "process_get_command"
#3  0x00007fc47c5df781 in process_command (c=0x7fc46c052050, command=<optimized out>)
    at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/plugin/innodb_memcached/daemon_memcached/daemon/memcached.c:4605
        tokens = {{value = 0x7fc436ddf060 "gets", length = 4}, {value = 0x7fc436ddf065 "cb4e9218", length = 8}, {value = 0x0,
            length = 0}, {value = 0x7fc4637fcfb0 "p\321\177c\304\177", length = 19277709}, {
            value = 0x3c <error: Cannot access memory at address 0x3c>, length = 140481669365792}, {
            value = 0x26 <error: Cannot access memory at address 0x26>, length = 60}, {value = 0x7fc4637fd180 "<\250\027\004",
            length = 38}, {value = 0x3c <error: Cannot access memory at address 0x3c>, length = 140481459638656}, {
            value = 0x161f440 "/export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/storage/innobase/btr/btr0pcur.cc", length = 0}, {value = 0x7fc4637fd170 ";", length = 16331034}, {value = 0x7fc4712812a0 "p\025JB\304\177", length = 1}, {
            value = 0x7fc464027c20 "\200|\002d\304\177", length = 140479916044584}, {value = 0x7fc451137b72 "\207B\304\177",
            length = 0}, {value = 0x7fc4637fd170 ";", length = 140481874611895}, {value = 0x7fc400000003 "l\304\177",
            length = 140479790317568}, {value = 0x7fc4637fd3af "", length = 8}, {value = 0x7fc464027970 "`{\002d\304\177",
            length = 140481672285992}, {
            value = 0x7fc3a565e468 "2e8ff61700000000|00000000|eJy8XFuu5LbRfv9XwQ38TN1401sWEeSNwCAeGBMYnsBjx+PdB1WUTosS1adb6jMBcrpGxZI+3op1o//vH/XvFYOEkpmj1N++/vH75/9++fznt0b+v9H+98+/fP/0+y81cKZcsWTP5FH/H0KV6FHYk0QPfyPesk3GUUFxSKU4pMSOCk"..., length = 0}, {
            value = 0x7fc4637fd1c0 "\240p~\a\304\177", length = 140481874611895}, {value = 0x7fc400000000 " ",
            length = 140479916044472}, {value = 0x7fc3a48b1568 "", length = 3832}, {
            value = 0x8 <error: Cannot access memory at address 0x8>, length = 140481459638552}, {value = 0x7fc4637fd0a0 "",
            length = 16696509}, {value = 0x7fc47182da40 "\230\066\272\a\304\177", length = 8}, {
            value = 0x7fc4637fd100 "\370\222,p\304\177", length = 17757769}, {
            value = 0x1 <error: Cannot access memory at address 0x1>, length = 8}, {value = 0x0, length = 0}, {value = 0x0,
            length = 140479916044584}, {value = 0x4 <error: Cannot access memory at address 0x4>, length = 0}, {
            value = 0x4 <error: Cannot access memory at address 0x4>, length = 140481672286968}}
        ntokens = <optimized out>
        comm = <optimized out>
        ret = 0x0
        __PRETTY_FUNCTION__ = "process_command"
[24 Feb 2019 5:13] MySQL Verification Team
#4  0x00007fc47c5e06ac in try_read_command (c=0x7fc46c052050)
    at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/plugin/innodb_memcached/daemon_memcached/daemon/memcached.c:4873
        cont = 0x7fc436ddf06f "0"
        lb = 13 '\r'
        thread = 0x7fc464023810
        el = 0x7fc436ddf06d ""
        left = <optimized out>
        __PRETTY_FUNCTION__ = "try_read_command"
#5  0x00007fc47c5e12f1 in conn_parse_cmd (c=0x7fc46c052050)
    at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/plugin/innodb_memcached/daemon_memcached/daemon/memcached.c:5303
No locals.
#6  0x00007fc47c5d358c in event_handler (fd=<optimized out>, which=<optimized out>, arg=0x7fc46c052050)
    at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/plugin/innodb_memcached/daemon_memcached/daemon/memcached.c:5707
        c = 0x7fc46c052050
        __PRETTY_FUNCTION__ = "event_handler"
        thr = 0x7fc464023810
        max_items = 256
        pending_close = {0x0 <repeats 178 times>, 0x7fc470000050, 0x7fc470000070, 0x7fc4637fdbb0,
          0x7fc49bb61e37 <sysmalloc+583>, 0x7fc4637fdb00, 0x7fc470000000, 0xfffffffffffff000, 0x2000, 0x9e000, 0x9c000, 0x0,
          0x0, 0x0, 0x7fc49bb61e37 <sysmalloc+583>, 0x0, 0x7fc470000078, 0x7fc470000020, 0x7fc470000078, 0x2710, 0xc, 0x1910,
          0x7fc49bb63003 <_int_malloc+3075>, 0x24, 0x7fc4637fdc90, 0x230, 0x7fc470000078, 0x7fc470000020, 0x7fc470000078,
          0xffff803b9c802371, 0x7fc4637fdc8f, 0x9, 0x3900000024, 0x1, 0x0, 0x6e0000005c, 0x0, 0x0, 0x7c00000077,
          0xffff803b9c802321, 0x7fc4637fdcdf, 0xa, 0x3a00000029, 0x1, 0x7fc47c5e800e <epoll_add+174>, 0xec00000001, 0x0,
          0x7fc46c01c350, 0x7fc46c01c350, 0x7fc464024970, 0x0, 0x7fc4637fdd60, 0x7fc47c5e54fc <event_add+316>, 0x7fc47009d140,
          0x7fc47c5d1886 <conn_reset_buffersize+374>, 0x288, 0x7fc470099e80, 0x7fc470099e80, 0x0, 0x7fc4637fdd50,
          0x7fc46c01c350, 0x7fc46c01c330, 0xec, 0x12, 0x7fc47c7f8810 <conn_cache>, 0x7fc4637fdd80,
          0x7fc47c5d1d62 <register_event+18>, 0x7fc464023810, 0x7fc46c01c350, 0x7fc4637fdde0, 0x7fc47c5d42ed <conn_new+733>,
          0x0, 0x800, 0x1, 0x7fc464024970, 0x7fc46c052050, 0x7fc464024970, 0x7fc46c052070, 0x7fc46401f5f0}
        n_pending_close = 0
#7  0x00007fc47c5e5ab6 in event_process_active (base=<optimized out>)
    at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/libevent/event.c:392
        activeq = 0x7fc46401f5f0
        i = <optimized out>
        ev = <optimized out>
        ncalls = 0
#8  event_base_loop (base=0x7fc464024970, flags=<optimized out>)
    at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/libevent/event.c:544
        evsel = 0x7fc47c7f4960 <epollops>
        evbase = 0x7fc464027820
        tv = {tv_sec = 0, tv_usec = 0}
        tv_p = <optimized out>
        res = <optimized out>
#9  0x00007fc47c5e3582 in worker_libevent (arg=<optimized out>)
    at /export/home/pb2/build/sb_0-32013917-1545390211.74/mysql-5.7.25/plugin/innodb_memcached/daemon_memcached/daemon/thread.c:309
        me = 0x7fc464023810
#10 0x00007fc49d19361a in start_thread (arg=0x7fc4637fe700) at pthread_create.c:334
        __res = <optimized out>
        pd = 0x7fc4637fe700
        now = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140481459644160, 5443721516807542131, 140481568675839, 140481459644160,
                8388608, 0, -5475465518075423373, -5475601085839387277}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0,
              0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
        pagesize_m1 = <optimized out>
        sp = <optimized out>
        freesize = <optimized out>
#11 0x00007fc49bbe35fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
No locals.
(gdb)
[24 Feb 2019 5:13] MySQL Verification Team
thread apply all bt full

Attachment: tapallbt-2.txt (text/plain), 104.66 KiB.

[29 Mar 2019 14:45] Joanna Holsgrove
Further investigation shows that the memcpy at mysql-server/plugin/innodb_memcached/innodb_memcache/srcinnodb_engine.c line 1937 is being given a null string and trying to copy 20 bytes when doing a get for a BLOB. This only seems to happen when more than one 'gets' is being processed at the same time.
Consistently, col_value has the same values for value_str and value_len before the copy fails. 
{value_str = 0x0, value_len = 20, value_int = 140330815543374, is_str = 1 '\001', is_unsigned = 88 'X', is_valid = 1 '\001', is_null = 80 'P', allocated = 0 '\000'}

Checking for a null string before the memcpy (innodb_engine line 1937) and returning ENGINE_KEY_ENOENT prevents MYSQL from crashing and returns a fetch error. 

BLOB meta data extra_col_info
{col_name = 0x7fa0e003f4d0 "msgs", col_name_len = 4, field_id = 3, col_meta = {type = IB_BLOB, attr = IB_COL_NONE, type_len = 12, client_type = 252, charset = 0x7f9a00000000}}
[2 Apr 2019 12:42] MySQL Verification Team
Thank you for your note.

It has been forwarded to the developers in charge of this plugin maintenance.
[23 Apr 2019 12:37] MySQL Verification Team
Thank you for your confirmation !!!!!!
[28 May 2019 12:49] Daniel Price
Posted by developer:
 
Fixed as of the upcoming 5.7.28, 8.0.18 release, and here's the changelog entry:

In READ UNCOMMITTED isolation level, a segmentation fault occurred under
heavy load from memcached clients. An externally stored BLOB column that
was being updated by one transaction was read by another transaction as
having a NULL value and a non-zero data length.