Bug #65152 MySQL can't start if relay logs are removed and skip_slave_start = 0
Submitted: 30 Apr 2012 5:39
Modified: 4 Jun 2012 10:21
Reporter: Joffrey MICHAIE
Status: Closed
Category: MySQL Server: Replication
Severity: S2 (Serious)
Version: 5.6.5m8
OS: Linux (RHEL 6.2 64bits)
Assigned to:
CPU Architecture: Any
Tags: regression, remove relay logs, skip_slave_start, start failure

[30 Apr 2012 5:39] Joffrey MICHAIE
If you remove the relay logs, or change the relay-log parameter, while replication is enabled, mysqld fails to restart (and mysqld_safe keeps restarting it in a loop) with the following error:

120430  1:30:04 InnoDB: The InnoDB memory heap is disabled
120430  1:30:04 InnoDB: Mutexes and rw_locks use GCC atomic builtins
120430  1:30:04 InnoDB: Compressed tables use zlib 1.2.3
120430  1:30:04 InnoDB: Using Linux native AIO
120430  1:30:04 InnoDB: CPU supports crc32 instructions
120430  1:30:04 InnoDB: Initializing buffer pool, size = 64.0M
120430  1:30:04 InnoDB: Completed initialization of buffer pool
120430  1:30:04 InnoDB: highest supported file format is Barracuda.
120430  1:30:04 InnoDB: 128 rollback segment(s) are active.
120430  1:30:04 InnoDB: Waiting for the background threads to start
120430  1:30:04 InnoDB: 1.2.5 started; log sequence number 13152792
120430  1:30:04 [Note] Recovering after a crash using mysql-bin
120430  1:30:04 [Note] Starting crash recovery...
120430  1:30:04 [Note] Crash recovery finished.
120430  1:30:04 [Warning] Slave SQL: If a crash happens this configuration does not guarantee that the relay log info will be consistent, Error_code: 0
05:30:04 UTC - mysqld got signal 11 ;
This could be because you hit a bug. It is also possible that this binary
or one of the libraries it was linked against is corrupt, improperly built,
or misconfigured. This error can also be caused by malfunctioning hardware.
We will try our best to scrape up some info that will hopefully help
diagnose the problem, but since we have already crashed, 
something is definitely wrong and this may fail.

It is possible that mysqld could use up to 
key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 47920 K  bytes of memory
Hope that's ok; if not, decrease some variables in the equation.

Thread pointer: 0x7f0df8000990
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 7f0e1458a810 thread_stack 0x40000
120430  1:30:04 [Warning] Storing MySQL user name or password information in the master.info repository is not secure and is therefore not recommended. Please see the MySQL Manual for more about this issue and possible alternatives.
120430  1:30:04 [ERROR] Slave I/O: error connecting to master 'replic@' - retry-time: 60  retries: 1, Error_code: 2003
120430  1:30:04 [Note] Event Scheduler: Loaded 0 events

Trying to get some variables.
Some pointers may be invalid and cause the dump to abort.
Query (0): is an invalid pointer
Connection ID (thread ID): 1

The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains
information that should help you find out what is causing the crash.
120430  1:30:04 [Note] /usr/sbin/mysqld: ready for connections.
Version: '5.6.5-m8-log'  socket: '/var/lib/mysql/mysql.sock'  port: 3306  MySQL Community Server (GPL)
120430 01:30:04 mysqld_safe Number of processes running now: 0
120430 01:30:04 mysqld_safe mysqld restarted
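The restart loop shows up in the error log as repeated "mysqld_safe mysqld restarted" lines. As a minimal sketch, the helper below counts them; the function name count_restarts is made up for illustration, and the log path is passed as an argument because it depends on the installation.

```shell
# Hypothetical helper: count how many times mysqld_safe restarted mysqld,
# based on the "mysqld restarted" lines it writes to the error log.
count_restarts() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c 'mysqld_safe mysqld restarted' "$1"
}
```

A count that keeps growing while the server never accepts connections suggests the loop described above.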

How to repeat:
1) Enable replication between two MySQL 5.6 servers (the config file and commands are in bug http://bugs.mysql.com/bug.php?id=65151)

2) Stop MySQL
[root@nowhere ~]# mysqladmin shutdown

3) Remove the relay logs
[root@nowhere ~]# rm -rf /var/lib/mysql/*relay*

4) Start MySQL
[root@ip-10-54-79-221 ~]# /etc/init.d/mysql start
Starting MySQL....                                         [  OK  ]

5) Try to connect to mysql ...
[root@nowhere ~]# mysql
ERROR 2002 (HY000): Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock' (2)

6) Stop the looping mysqld_safe script (dangerous command)
[root@nowhere ~]# killall -9 mysqld_safe

7) Add skip_slave_start to the [mysqld] section of /etc/mysql/my.cnf

8) Start MySQL
/etc/init.d/mysql start

9) Start replication
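
Step 7 can be scripted. The following is only a sketch: the function name add_skip_slave_start is hypothetical, the config file path is passed in rather than assumed, and the option check looks at the whole file, not just the [mysqld] section. If the file has no [mysqld] line, nothing is added.

```shell
# Hypothetical helper: add skip_slave_start right after the [mysqld] header
# of a my.cnf-style file, unless the option already appears in the file.
add_skip_slave_start() {
    cnf="$1"
    # Option names may be written with '-' or '_'; accept both spellings.
    if grep -Eq '^[[:space:]]*skip[-_]slave[-_]start' "$cnf"; then
        return 0
    fi
    tmp="$(mktemp)" || return 1
    # Copy every line; after the [mysqld] header, emit the new option.
    if awk '{ print } /^\[mysqld\][ \t]*$/ { print "skip_slave_start" }' "$cnf" > "$tmp"; then
        # Write back with cat so the original file keeps its owner and mode.
        cat "$tmp" > "$cnf"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}
```

For example, add_skip_slave_start /etc/mysql/my.cnf before step 8. Once the server is up, replication can be started by hand from the mysql client with START SLAVE.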

Note: I didn't try converting the mysql.*repl* tables to InnoDB (as they will be in 5.6.6), but I don't think it would help.

Suggested fix:
Changing the relay-log parameter (or removing the relay log files) should not make mysqld crash when skip_slave_start is not set.
[1 May 2012 9:12] Sveta Smirnova
Thank you for the report.

Verified as described.

Backtrace in my case is a bit different:

mysqld: /home/ssmirnov/blade12/src/mysql-trunk/sql/rpl_slave.cc:5156: void* handle_slave_sql(void*): Assertion `rli->inited' failed.
09:09:16 UTC - mysqld got signal 6 ;
This could be because you hit a bug. It is also possible that this binary
Thread pointer: 0x0
Attempting backtrace. You can use the following information to find out
where mysqld died. If you see no messages after this, something went
terribly wrong...
stack_bottom = 0 thread_stack 0x40000
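
The two traces in this report differ in the signal: 11 in the original report and 6 here, where the assertion fired. On Linux these are SIGSEGV and SIGABRT respectively. As a rough sketch, the helper below pulls the signal numbers out of an error log and labels those two; the function name crash_signals is made up for illustration, and other signal numbers are printed unlabeled.

```shell
# Hypothetical helper: list the signals reported in "mysqld got signal N"
# lines of an error log, one per line, in the order they appear.
crash_signals() {
    sed -n 's/.*mysqld got signal \([0-9][0-9]*\).*/\1/p' "$1" |
    while read -r sig; do
        case "$sig" in
            6)  echo "6 SIGABRT" ;;   # abort(), e.g. a failed assertion
            11) echo "11 SIGSEGV" ;;  # invalid memory access
            *)  echo "$sig" ;;        # anything else is left unlabeled
        esac
    done
}
```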
[1 May 2012 9:35] Sveta Smirnova
The problem is not repeatable with version 5.5.
[4 Jun 2012 10:21] Jon Stephens
Fixed in 5.6. Documented as follows in the MySQL 5.6.6 changelog:

      If the relay logs were removed after the server was stopped, without
      stopping replication first, the server could not be started correctly.