Bug #81222 Server repeatedly crashes without further notice
Submitted: 28 Apr 2016 12:21 Modified: 12 May 2016 7:20
Reporter: Martin Kirchner Email Updates:
Status: Can't repeat Impact on me: None
Category: MySQL Server Severity: S1 (Critical)
Version: 5.7.12 OS: Ubuntu (14.04)
Assigned to: CPU Architecture: Any

[28 Apr 2016 12:21] Martin Kirchner
Description:
I have a dump from a MySQL 5.0.91 database containing about 2000 databases.
It was created like this:
mysqldump --master-data --add-drop-database --routines ...
The databases mysql and information_schema were not included in the dump.

I tried to reload it under MySQL 5.7.12 using the mysql command-line client:

root@localhost [(none)]> source dump.sql;

After about 600-700 databases, the reload stops with the output "Killed".

I'll attach the log. Unfortunately it does not contain any information about why the server crashed.

How to repeat:
Restore my dump. Unfortunately it contains sensitive customer data and cannot be provided here.
[28 Apr 2016 12:22] Martin Kirchner
error-log

Attachment: error.log (application/octet-stream, text), 11.56 KiB.

[28 Apr 2016 14:20] Tsubasa Tanaka
Hi, I'm not an Oracle person, but I have had a similar experience.

Did you check your /var/log/messages?
Is there any trace of the OOM-Killer? Something like the grep below should surface it.
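(Just a sketch; on Ubuntu the kernel messages usually land in /var/log/syslog or /var/log/kern.log rather than /var/log/messages.)

grep -i -E 'oom|killed process' /var/log/syslog /var/log/kern.log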

I have seen this kind of error log many times when the OOM-Killer kills mysqld.

regards,
[28 Apr 2016 14:31] Martin Kirchner
@Tsubasa Tanaka
It seems you're right: in the syslog I found out-of-memory errors. It is strange, though, that the MySQL error log does not give any hint of this.

Apr 28 14:05:13 xyz-db1 kernel: [169959.732416] mysqld invoked oom-killer: gfp_mask=0x280da, order=0, oom_score_adj=0
Apr 28 14:05:13 xyz-db1 kernel: [169959.732420] mysqld cpuset=/ mems_allowed=0
Apr 28 14:05:13 xyz-db1 kernel: [169959.732424] CPU: 2 PID: 13912 Comm: mysqld Not tainted 3.19.0-25-generic #26~14.04.1-Ubuntu
[...]
Apr 28 14:05:13 xyz-db1 kernel: [169959.732604] Out of memory: Kill process 13852 (mysqld) score 942 or sacrifice child
Apr 28 14:05:13 xyz-db1 kernel: [169959.732631] Killed process 13852 (mysqld) total-vm:10235872kB, anon-rss:5715888kB, file-rss:0kB
Apr 28 14:05:13 xyz-db1 kernel: [169959.735388] vmtoolsd invoked oom-killer: gfp_mask=0x201da, order=0, oom_score_adj=0
Apr 28 14:05:13 xyz-db1 kernel: [169959.735391] vmtoolsd cpuset=/ mems_allowed=0
[...]
Apr 28 14:05:13 xyz-db1 kernel: [169959.735603] Out of memory: Kill process 9726 (mysql) score 5 or sacrifice child
Apr 28 14:05:13 xyz-db1 kernel: [169959.735628] Killed process 9726 (mysql) total-vm:76900kB, anon-rss:46788kB, file-rss:0kB
[28 Apr 2016 14:38] Tsubasa Tanaka
Hi Martin Kirchner,

The reason the MySQL error log contains no notice is that the OOM-Killer kills the process with SIGKILL.
When mysqld traps other signals (SIGSEGV, SIGABRT, and so on), its signal handler catches them and writes information to the error log.
SIGKILL cannot be caught by a signal handler. https://en.wikipedia.org/wiki/Unix_signal#SIGKILL
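You can see the difference on a throw-away test instance (a rough sketch; do not try this on a production server):

kill -SEGV $(pidof mysqld)   # mysqld's handler runs and writes a stack trace to the error log
kill -KILL $(pidof mysqld)   # mysqld disappears immediately and the error log stays silent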
[4 May 2016 9:28] MySQL Verification Team
Hello Martin,

Thank you for the report.
I tried to reproduce this but had no luck (also, I followed one series at a time rather than going straight from 5.0 to 5.7). As Tanaka-San nicely explained, the OOM-Killer is invoked upon memory exhaustion and mysqld is killed. If you can provide more information, feel free to add it to this bug and change the status back to 'Open'.

Thank you for your interest in MySQL.

Thanks,
Umesh
[11 May 2016 10:21] Martin Kirchner
Hello Umesh,

I invested some more time in this issue.

My use case: restore a dump from MySQL 5.0.91 on MySQL 5.7.12 in order to replace the existing server with a new one. The dump contains about 1600 databases and was created using mysqldump.

The machine has 24 GB of RAM and 2 GB of swap.
After I start restoring the dump, memory consumption increases continuously until mysqld has used all of the RAM and swap. Then the kernel kills the process.

Output from "vmstat 1" right before and after the crash:
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1 11 2220028 176464    368  19472    0    0 1324348   924 37618 35852  0 27  2 71  0
 1 11 2220028 177256    352  18592    8    8 1383992   620 41363 43980  0 30  1 69  0
 1 10 2220028 176572    344  19432    0    0 1027916   436 27221 27916  0 28  4 68  0
 1 10 2220028 176180    320  19416   32   12 1124472   236 35857 33904  0 26  6 69  0
 2  0 1356008 13096528   5352  23260 1844 1832 83288  2228 12031 6672  2 26 51 22  0
 2  0  83192 23885808   6500  61096  364    0 39260    28 12322 120824 12 25 50 13  0
 4  0  83192 21982220  19596 127864    0    0 80160     0 8967 93073 30 13 49  8  0
 
From what I configured in /etc/mysql/my.cnf and what mysqltuner shows me, MySQL should not be able to consume that much memory (see the rough breakdown after the config extract below).
mysqltuner says:
[--] Total buffers: 19.1G global + 8.6M per thread (80 max threads)
[OK] Maximum possible memory usage: 19.8G (84% of installed RAM)

Extract from /etc/mysql/my.cnf:

[mysqld_safe]
open_files_limit               = 65000
innodb_open_files              = 65000

[mysqld]
max_allowed_packet             = 256M   
max_connections                = 80                                 # Values < 1000 are typically good
max_user_connections           = 75                                 # Limit one specific user/application
thread_cache_size              = 80                                 # Up to max_connections makes sense
query_cache_type               = 1                                   # Set to 0 to avoid global QC Mutex
query_cache_size               = 32M                                 # Avoid too big (> 128M) QC because of QC clean-up lock!
sort_buffer_size               = 4M                                  # Could be too big for many small sorts
tmp_table_size                 = 200M                                 # Make sure your temporary results do NOT contain BLOB/TEXT attributes
read_buffer_size               = 384k
read_rnd_buffer_size           = 1M
join_buffer_size               = 3M
table_definition_cache         = 29000                               # As big as many tables you have
table_open_cache               = 29000                               # connections x tables/connection (~2)
binlog_cache_size              = 1M
binlog_stmt_cache_size         = 1M
max_binlog_size                = 512M                                # Make bigger for high traffic to reduce number of files
key_buffer_size                = 8M                                  # Set to 25 - 33 % of RAM if you still use MyISAM
max_heap_table_size            = 64M                                 # Should be greater or equal to tmp_table_size
innodb_strict_mode             = ON
innodb_file_format_check       = 1
innodb_buffer_pool_size        = 19G                               # Go up to 80% of your available RAM
innodb_buffer_pool_instances   = 8                                   # Bigger if huge InnoDB Buffer Pool or high concurrency
innodb_file_per_table          = 1                                   # Is the recommended way nowadays
innodb_flush_log_at_trx_commit = 0                                   # 1 for durability, 0 or 2 for performance
innodb_log_buffer_size         = 8M                                  # Bigger if innodb_flush_log_at_trx_commit = 0
innodb_log_file_size           = 256M                                # Bigger means more write throughput but longer recovery time
thread_stack                    = 256k
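
For what it's worth, the mysqltuner figures roughly add up from the values above (my own back-of-the-envelope, assuming mysqltuner sums the usual global and per-thread buffers):

global     ~ innodb_buffer_pool_size + query_cache_size + key_buffer_size + innodb_log_buffer_size
           ~ 19G + 32M + 8M + 8M ~ 19.1G
per thread ~ sort_buffer_size + read_buffer_size + read_rnd_buffer_size + join_buffer_size + thread_stack
           ~ 4M + 0.375M + 1M + 3M + 0.25M ~ 8.6M
maximum    ~ 19.1G + 80 * 8.6M ~ 19.8G

So even at that theoretical maximum, mysqld should stay around 20 GB, well below the 24 GB of RAM plus 2 GB of swap it ends up eating.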

I will attach screenshots from my monitoring, which tracks the memory consumption of the mysqld process specifically.

Thank you for your help.

Kind regards,
Martin
[11 May 2016 10:23] Martin Kirchner
Virtual memory as returned by ps faux

Attachment: MySQL_VirtualMemory.png (image/png, text), 18.79 KiB.

[11 May 2016 10:23] Martin Kirchner
Resident memory as returned by ps faux

Attachment: MySQL_ResidentMemory.png (image/png, text), 19.88 KiB.

[11 May 2016 10:24] Martin Kirchner
Memory in percent as returned by ps faux

Attachment: MySQL_MemoryPercent.png (image/png, text), 20.13 KiB.

[11 May 2016 10:24] Martin Kirchner
Screenshot from htop

Attachment: 2016-05-11 11_11_23-SDC1-SmartDesign Overview - Agile Board - JIRA.png (image/png, text), 8.13 KiB.

[11 May 2016 10:33] MySQL Verification Team
So, the 5.7.12 mysqld is consuming more memory than 5.0.
You can try to see where it is allocated by enabling the Performance Schema memory instruments like this:

UPDATE performance_schema.setup_instruments set enabled='YES' WHERE name like 'memory/%';
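
As a side note, instruments enabled at runtime only count allocations made from that point on; to account for everything from server startup they can also be enabled in my.cnf, roughly like this:

[mysqld]
performance-schema-instrument='memory/%=ON'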

then monitor the output of queries such as:

SELECT * FROM performance_schema.memory_summary_global_by_event_name ORDER BY CURRENT_NUMBER_OF_BYTES_USED DESC;
SELECT * FROM sys.memory_global_total;
SELECT * FROM sys.memory_global_by_current_bytes;
SELECT * FROM sys.memory_by_thread_by_current_bytes;
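
To capture these during the import, a simple loop is enough (just a sketch; adjust credentials, interval and file name as needed):

while true; do
  mysql -e "SELECT NOW(); SELECT * FROM sys.memory_global_by_current_bytes" >> memory_samples.log
  sleep 5
done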
[11 May 2016 19:03] Martin Kirchner
Hi Shane,

I made another attempt... This time the crash occurred around 15:54:50.

I'll attach, as files, the values from the performance_schema/sys tables measured directly before the crash.

If it is of any help, I can provide these status values at 5-second intervals from the beginning of the import until the crash.
Thank you for your help.

Kind regards,
Martin
[11 May 2016 19:03] Martin Kirchner
memory_by_thread_by_current_bytes

Attachment: memory_by_thread_by_current_bytes.log (application/octet-stream, text), 3.51 KiB.

[11 May 2016 19:04] Martin Kirchner
memory_global_by_current_bytes

Attachment: memory_global_by_current_bytes.log (application/octet-stream, text), 10.15 KiB.

[11 May 2016 19:04] Martin Kirchner
memory_global_total

Attachment: memory_global_total.log (application/octet-stream, text), 177 bytes.

[11 May 2016 19:05] Martin Kirchner
memory_summary_global_by_event_name

Attachment: memory_summary_global_by_event_name.xlsx (application/vnd.openxmlformats-officedocument.spreadsheetml.sheet, text), 29.68 KiB.

[11 May 2016 19:05] Martin Kirchner
memory_summary_global_by_event_name

Attachment: memory_summary_global_by_event_name.log (application/octet-stream, text), 22.86 KiB.

[12 May 2016 5:06] MySQL Verification Team
Can you please upload the actual my.cnf used? Your error log and outputs indicate a 4G InnoDB buffer pool size, but the my.cnf pasted in the bug report says 19G.
I don't see anything wrong so far.
[12 May 2016 5:26] Martin Kirchner
my.cnf

Attachment: my.cnf (application/octet-stream, text), 8.42 KiB.

[12 May 2016 5:30] Martin Kirchner
Result of "show global variables like '%size%'"

Attachment: variables_size.csv (application/vnd.ms-excel, text), 2.31 KiB.

[12 May 2016 5:40] MySQL Verification Team
Still not seeing an issue here. If the OOM kill occurred with a 19G buffer pool long before 19G was in use:

Killed process 13852 (mysqld) total-vm:10235872kB, anon-rss:5715888kB, file-rss:0kB

then you need to either free up memory on the server, add more memory, or tell the kernel not to kill mysqld. Hint: set /proc/PID/oom_score_adj to a lower number for the mysqld process.
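For example (the value is only illustrative, and the setting is per-process, so it has to be reapplied after mysqld restarts):

echo -500 | sudo tee /proc/$(pidof mysqld)/oom_score_adj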
[12 May 2016 7:20] Martin Kirchner
Hi Shane,

Sorry for that. The extract from the syslog was from several days ago, before I increased innodb_buffer_pool_size.
For yesterday's crash the log shows:
May 11 15:54:52 ecrm-db1 kernel: [185450.668687] Killed process 27070 (mysqld) total-vm:28650552kB, anon-rss:23969896kB, file-rss:0kB

That matches exactly what I saw in htop and ps: mysqld consumes all of the RAM and swap.

The machine is a dedicated mysql server with no other processes running that consume a lot of memory.

I will attach the complete relevant part of the syslog, where you can see the other processes.

Thank you,
Martin