Bug #67072 Out of memory error on replication slave
Submitted: 4 Oct 2012 7:35 Modified: 31 Jan 2013 14:04
Reporter: Jan Rusch Email Updates:
Status: Can't repeat Impact on me:
None 
Category:MySQL Server: Replication Severity:S3 (Non-critical)
Version:5.6.6 5.6.7 OS:Windows (Windows 7 64bit)
Assigned to: CPU Architecture:Any
Tags: out of memory, replication, slave

[4 Oct 2012 7:35] Jan Rusch
Description:
After upgrading one of our test slave replication servers to 5.6.6 or 5.6.7 we see the following error messages in the mysql error log about all 2-3 days while replication large amounts of data (heavy load):

121003  3:53:57 [ERROR] C:\MySQL\MySQL567\bin\mysqld: Out of memory (Needed 9057600 bytes)
121003  3:53:58 [ERROR] Out of memory; check if mysqld or some other process uses all available memory; if not, you may have to use 'ulimit' to allow mysqld to use more memory or you can add more swap space

How to repeat:
As this happens only under a high replication load reproducing the error might be a problem.

The master server is still running 5.6.5

I upgraded the test server by installing the new binarys. Creating a new service for MySQL 5.6.6, started the server and run mysql_upgrade afterwards without any error messages. The same happens when upgrading to 5.6.7. Everything runs without any problems, if MySQL 5.6.5 is used.

At night huge amounts of data are replaced on the master server. After the replacement a post processing via a sql script on the master server takes place, which creates new tables and updates several others.

There are no other connections on the slave server at the time the slave crashes.

At first I thought this might be an OS/host related problem, but it can be repeated on different hosts. I increased the amount of swap space to some (at least for this server) very high values. In this case the "Out of memory" message is not printed, but the reserved memory is never released afterwards, so the server then crashes the next day.

We see the increasing RAM usages in our monitoring system. Another slave server running 5.6.5 does not show this behaviour. If needed, I could mail some PDFs reflecting this.

my.ini config used at 5.6.6 / 5.6.7 slave is:
[client]
port=3306
[mysql]
default-character-set=utf8
[mysqld]
port=3306
basedir="C:/MySQL/MySQL567/"
datadir="D:/MySQL/data/"
tmpdir="E:/Temp/MySQL"
character-set-server=utf8
default-storage-engine=INNODB
default-tmp-storage-engine=INNODB
sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
max_connections=100
query_cache_size=181M
tmp_table_size=185M
thread_cache_size=38
myisam_max_sort_file_size=10G
myisam_sort_buffer_size=125M
key_buffer_size=55M
read_buffer_size=64K
read_rnd_buffer_size=256K
sort_buffer_size=256K
innodb_data_home_dir="D:/MySQL/data/"
innodb_additional_mem_pool_size=32M
innodb_flush_log_at_trx_commit=2
innodb_log_buffer_size=16M
innodb_buffer_pool_size=3G
innodb_log_file_size=100M
innodb_lock_wait_timeout=1200
innodb_table_locks=0
innodb_write_io_thread=8
innodb_read_io_thread=8
innodb_thread_concurrency=8
group_concat_max_len=20M
max_allowed_packet=500M
wait_timeout=3600
log-warnings
server-id=151
skip-slave-start
report-host=sbwus1
wait_timeout=3600
performance_schema
innodb_ft_min_token_size=2
ft_min_word_len=2

my.ini config used at 5.6.5 slave running without problems is:
[client]
port=3306
[mysql]
default-character-set=utf8
[mysqld]
port=3306
basedir="C:/MySQL/MySQL565/"
datadir="D:/MySQL/data/"
tmpdir="E:/Temp/MySQL"
character-set-server=utf8
default-storage-engine=INNODB
default-tmp-storage-engine=INNODB
sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
max_connections=160
query_cache_size=181M
tmp_table_size=185M
thread_cache_size=38
myisam_max_sort_file_size=10G
myisam_sort_buffer_size=125M
key_buffer_size=55M
read_buffer_size=64K
read_rnd_buffer_size=256K
sort_buffer_size=256K
innodb_data_home_dir="D:/MySQL/data/"
innodb_additional_mem_pool_size=32M
innodb_flush_log_at_trx_commit=2
innodb_log_buffer_size=16M
innodb_buffer_pool_size=5G
innodb_log_file_size=100M
innodb_lock_wait_timeout=1200
innodb_table_locks=0
innodb_write_io_thread=4
innodb_read_io_thread=8
innodb_thread_concurrency=8
group_concat_max_len=20M
max_allowed_packet=500M
wait_timeout=3600
log-warnings
server-id=75
skip-slave-start
report-host=sbwebdb2
wait_timeout=3600
performance_schema
innodb_ft_min_token_size=2
ft_min_word_len=2

my.ini config used at 5.6.5 master is:
[client]
port=3306
[mysql]
default-character-set=utf8
max_allowed_packet=500M
[mysqld]
port=3306
basedir="C:/MySQL/MySQL565/"
datadir="D:/MySQL/data/"
character-set-server=utf8
default-storage-engine=INNODB
sql-mode="STRICT_TRANS_TABLES,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION"
max_connections=100
query_cache_size=365M
table_cache=1520
tmp_table_size=369M
thread_cache_size=38
myisam_max_sort_file_size=100G
myisam_sort_buffer_size=616M
key_buffer_size=58M
read_buffer_size=64K
read_rnd_buffer_size=256K
sort_buffer_size=256K
innodb_data_home_dir="D:/MySQL/data/"
innodb_additional_mem_pool_size=32M
innodb_flush_log_at_trx_commit=2
innodb_log_buffer_size=16M
innodb_buffer_pool_size=6G
innodb_log_file_size=100M
innodb_max_dirty_pages_pct = 90
innodb_write_io_threads = 8
innodb_read_io_threads = 8
innodb_thread_concurrency=8
innodb_table_locks=0
innodb_lock_wait_timeout=360
innodb_io_capacity=1000
innodb_old_blocks_time=5000
server-id=74
skip-slave-start
log-bin=mysql-bin
sync_binlog=1
binlog-format=mixed
group_concat_max_len=20M
max_allowed_packet=500M
wait_timeout=3600
connect_timeout=120
log_warnings
performance_schema
innodb_ft_min_token_size=2
ft_min_word_len=2

Suggested fix:
Fix memory usage on replication slaves.
[10 Oct 2012 6:10] Jan Rusch
After doing some more testing, the problem is related to an Update-Statement calling a stored function which returns a longtext field. Inside the function GROUP_CONCAT is called, which generates a string at max 4 KB long.

When the server crashes with "Out of Memory" replication always stops at this statement, which in mixed replication setting is replicated as a statement. I configured the GROUP_CONCAT_MAX_LEN variable to 50M and everything works fine...

The question is: Why was this working in a replication from 5.6.5 -> 5.6.5 and is not working with 5.6.5 -> 5.6.6 or 5.6.7 ?
[12 Oct 2012 19:48] Sveta Smirnova
Thank you for the report.

We can not say why same statement worked fine in 5.6.5->5.6.5 replication while started to show issues when you upgraded slave.

Did you run mysql_upgrade? If not, please do now. Please also run CHECK TABLE command and ensure all involved tables are in good state.

If you still be continue experiencing issues, send us dump, or at least SHOW CREATE TABLE and SHOW TABLE STATUS statements, of all involved tables and problematic query.
[15 Oct 2012 13:43] Jan Rusch
After setting GROUP_CONCAT_MAX_LEN to a higher value, replication work fine with 5.6.7. Seems to be resoved.
[31 Jan 2013 13:55] Erlend Dahl
Jan, based on your latest comment, can we close this?
[31 Jan 2013 14:04] Jan Rusch
Please close, cannot be reproduced anymore
[31 Jan 2013 14:04] Jan Rusch
..
[31 Jan 2013 14:04] Erlend Dahl
Thanks!