Bug #70115 MySQL corrupts sha256 password RSA keys on crash
Submitted: 22 Aug 2013 3:19 Modified: 30 Sep 2013 8:18
Reporter: Nicholas Williams Email Updates:
Status: No Feedback Impact on me:
None 
Category:MySQL Server: Pluggable Authentication Severity:S1 (Critical)
Version:5.6.12 OS:Linux (OpenSUSE 12.3)
Assigned to: CPU Architecture:Any
Tags: sha256 rsa key corruption

[22 Aug 2013 3:19] Nicholas Williams
Description:
Our data center had a power failure today. When the servers came back up, MySQL failed to start properly. It recovered from the binary logs without issue, but then croaked on the RSA keys used for sha256 passwords / pluggable authentication. Here are the log contents from the restart:

130821 12:29:48 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
2013-08-21 12:29:49 1849 [Note] Plugin 'FEDERATED' is disabled.
2013-08-21 12:29:49 1849 [Note] InnoDB: The InnoDB memory heap is disabled
2013-08-21 12:29:49 1849 [Note] InnoDB: Mutexes and rw_locks use GCC atomic builtins
2013-08-21 12:29:49 1849 [Note] InnoDB: Compressed tables use zlib 1.2.8
2013-08-21 12:29:49 1849 [Note] InnoDB: Using Linux native AIO
2013-08-21 12:29:49 1849 [Note] InnoDB: Not using CPU crc32 instructions
2013-08-21 12:29:49 1849 [Note] InnoDB: Initializing buffer pool, size = 600.0M
2013-08-21 12:29:49 1849 [Note] InnoDB: Completed initialization of buffer pool
2013-08-21 12:29:49 1849 [Note] InnoDB: Highest supported file format is Barracuda.
2013-08-21 12:29:49 1849 [Note] InnoDB: The log sequence numbers 27259967 and 27259967 in ibdata files do not match the log sequence number 27533679 in the ib_logfiles!
2013-08-21 12:29:49 1849 [Note] InnoDB: Database was not shutdown normally!
2013-08-21 12:29:49 1849 [Note] InnoDB: Starting crash recovery.
2013-08-21 12:29:49 1849 [Note] InnoDB: Reading tablespace information from the .ibd files...
2013-08-21 12:29:49 1849 [Note] InnoDB: Restoring possible half-written data pages 
2013-08-21 12:29:49 1849 [Note] InnoDB: from the doublewrite buffer...
InnoDB: Last MySQL binlog file position 0 3159073, file name lynyrd-bin.000027
2013-08-21 12:29:49 1849 [Note] InnoDB: 128 rollback segment(s) are active.
2013-08-21 12:29:49 1849 [Note] InnoDB: Waiting for purge to start
2013-08-21 12:29:49 1849 [Note] InnoDB: 5.6.12 started; log sequence number 27533679
2013-08-21 12:29:49 1849 [Note] Recovering after a crash using /var/lib/mysql/lynyrd-bin
2013-08-21 12:29:49 1849 [Note] Starting crash recovery...
2013-08-21 12:29:49 1849 [Note] Crash recovery finished.
2013-08-21 12:29:49 1849 [ERROR] Failure to parse RSA public key (file exists): /var/lib/mysql//rsa-mysql-server.pub: error:0906D06C:PEM routines:PEM_read_bio:no start line
2013-08-21 12:29:49 1849 [ERROR] Aborting

2013-08-21 12:29:49 1849 [Note] Binlog end
2013-08-21 12:29:49 1849 [Note] Shutting down plugin 'partition'
2013-08-21 12:29:49 1849 [Note] Shutting down plugin 'ARCHIVE'
2013-08-21 12:29:49 1849 [Note] Shutting down plugin 'PERFORMANCE_SCHEMA'
2013-08-21 12:29:49 1849 [Note] Shutting down plugin 'BLACKHOLE'
2013-08-21 12:29:49 1849 [Note] Shutting down plugin 'INNODB_SYS_DATAFILES'
...
2013-08-21 12:29:49 1849 [Note] Shutting down plugin 'INNODB_TRX'
2013-08-21 12:29:49 1849 [Note] Shutting down plugin 'InnoDB'
2013-08-21 12:29:49 1849 [Note] InnoDB: FTS optimize thread exiting.
2013-08-21 12:29:49 1849 [Note] InnoDB: Starting shutdown...
2013-08-21 12:29:51 1849 [Note] InnoDB: Shutdown completed; log sequence number 27538069
2013-08-21 12:29:51 1849 [Note] Shutting down plugin 'CSV'
...
2013-08-21 12:29:51 1849 [Note] Shutting down plugin 'binlog'
2013-08-21 12:29:51 1849 [Note] /usr/sbin/mysqld: Shutdown complete

Notice the failure to read the RSA public key. Thinking "that's weird", I went to inspect the private and public keys. They were working yesterday...

$ openssl rsa -in rsa-mysql-server.pem 
writing RSA key
-----BEGIN RSA PRIVATE KEY-----
...
-----END RSA PRIVATE KEY-----

$ openssl rsa -in rsa-mysql-server.pub 
writing RSA key
-----BEGIN RSA PRIVATE KEY-----
...
-----END RSA PRIVATE KEY-----

Okay so "corrupt" really isn't the best word here. MySQL /replaces/ the public key file with the private key on a crash. That's crazy, and not good. Thankfully the public key was still recoverable:

$ openssl rsa -in rsa-mysql-server.pem -pubout
writing RSA key
-----BEGIN PUBLIC KEY-----
...
-----END PUBLIC KEY-----

So I just re-exported it back to the public key file:

$ openssl rsa -in rsa-mysql-server.pem -pubout > rsa-mysql-server.pub 
writing RSA key

And with that MySQL happily started back up. Kind-of. I had to restart my slaves, because for some reason they were now trying to connect without passwords?

2013-08-21 21:57:36 4023 [Warning] Access denied for user 'user'@'slave-1-hostname' (using password: NO)
2013-08-21 21:57:36 4023 [Warning] Access denied for user 'user'@'slave-2-hostname' (using password: NO)

Restarting the slaves cleared that up, too:

2013-08-21 22:09:30 4023 [Note] Start binlog_dump to master_thread_id(223) slave_server(97), pos(lynyrd-bin.000027, 3179475) <-- slave 1 restarted
2013-08-21 22:09:46 4023 [Warning] Access denied for user 'user'@'slave-1-hostname' (using password: NO)
2013-08-21 22:10:47 4023 [Warning] Access denied for user 'user'@'slave-2-hostname' (using password: NO)
2013-08-21 22:11:11 4023 [Note] Start binlog_dump to master_thread_id(244) slave_server(98), pos(lynyrd-bin.000027, 3179475) <-- slave 2 restarted

Here's some output from one of the slave logs:

2013-08-21 21:55:35 1634 [ERROR] Slave I/O: error reconnecting to master 'nwtsdns@master-ip:3307' - retry-time: 60  retries: 380, Error_code: 2003
2013-08-21 21:56:35 1634 [ERROR] Slave I/O: error reconnecting to master 'nwtsdns@master-ip:3307' - retry-time: 60  retries: 381, Error_code: 2003
2013-08-21 21:57:37 1634 [ERROR] Slave I/O: error reconnecting to master 'nwtsdns@master-ip:3307' - retry-time: 60  retries: 382, Error_code: 1045 <-- master finally got started with correct RSA keys
2013-08-21 21:58:37 1634 [ERROR] Slave I/O: error reconnecting to master 'nwtsdns@master-ip:3307' - retry-time: 60  retries: 383, Error_code: 1045

This is just bizarre behavior from MySQL during a crash recovery. Notice that it is somewhat related to bug 70060, because the error messages for that last problem are no help at all. Obviously the slaves weren't failing to provide a password.

How to repeat:
- Have RSA keys for sha256 authentication
- Crash
[27 Aug 2013 20:01] Sveta Smirnova
Thank you for the report.

Please check content of your  /var/lib/mysql//rsa-mysql-server.pub Looks like this file was corrupted, because power failure.
[27 Aug 2013 20:33] Nicholas Williams
Please read my entire bug report. I clearly said that I *DID* inspect the contents of my public key, and it was *NOT* corrupted. The two problems I identified were:

1) MySQL *replaced* the public key file contents with the private key file contents upon the power failure.
2) The slaves did not recover automatically. After the master recovered, the slaves had to be manually restarted.
[31 Aug 2013 8:18] Sveta Smirnova
Thank you for the feedback.

I don't see in MySQL sources how it can overwrite the file: it just opens it for read and closes immediately. Please send extract from the error log file prior the power failure: maybe it has information about other errors.
[1 Oct 2013 1:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".