Bug #59777 Replication over SSL; error 2026 most, but not all, of the time
Submitted: 27 Jan 2011 12:33 Modified: 28 Apr 2011 23:24
Reporter: Justin Ewing
Status: No Feedback
Category: MySQL Server: Replication
Severity: S1 (Critical)
Version: 5.1.48, 5.1.55
OS: Solaris
Assigned to:
CPU Architecture: Any

[27 Jan 2011 12:33] Justin Ewing
Description:
MySQL 5.1.48 and openssl-1.0.0a, compiled with Sun Studio 12 on Solaris 10.  The same build is packaged and deployed on both master and slave.  Both work fine; I am just trying to set up replication over SSL.

SSL keys and CSRs were generated on each host and signed by the same CA.
The CA certificate and the master's signed cert (along with its key, of course) are on the master.
The CA certificate and the slave's signed cert (along with its key, of course) are on the slave.

key is 2048 bit

in master's my.cnf:

ssl-ca = /opt/mysql/certs/CA.crt
ssl-key = /opt/mysql/private/master.key
ssl-cert = /opt/mysql/certs/master.crt

slave's master.info:
15
mysql-bin.001425
44982801
<FQDN of master server>
REPLslavesvr
slavesvrREPL!#
3306
1
1
/opt/mysql/certs/CA.crt

/opt/mysql/certs/slave.crt
DHE-RSA-AES256-SHA
/opt/mysql/private/slave.key
0
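
For readers unfamiliar with master.info, the positional lines above can be labelled with a short script. This is only a sketch: the field order is an assumption taken from the MySQL 5.1 manual's description of the file, the function name `label_master_info` is made up for illustration, and the password line is replaced with a placeholder.

```python
# Field order assumed from the MySQL 5.1 manual's description of master.info;
# it is not stated anywhere in this report.
MASTER_INFO_FIELDS = [
    "Number_of_lines",
    "Master_Log_File",
    "Read_Master_Log_Pos",
    "Master_Host",
    "Master_User",
    "Master_Password",
    "Master_Port",
    "Connect_Retry",
    "Master_SSL_Allowed",
    "Master_SSL_CA_File",
    "Master_SSL_CA_Path",
    "Master_SSL_Cert",
    "Master_SSL_Cipher",
    "Master_SSL_Key",
    "Master_SSL_Verify_Server_Cert",
]


def label_master_info(text):
    """Pair each line of a master.info dump with its assumed field name."""
    lines = text.split("\n")
    # Drop the single empty element left by a trailing newline, but keep
    # empty lines inside the file (an unset CA path is an empty line).
    if lines and lines[-1] == "":
        lines.pop()
    return dict(zip(MASTER_INFO_FIELDS, lines))


# The dump from the report, with the password replaced by a placeholder.
SAMPLE = """15
mysql-bin.001425
44982801
<FQDN of master server>
REPLslavesvr
<password>
3306
1
1
/opt/mysql/certs/CA.crt

/opt/mysql/certs/slave.crt
DHE-RSA-AES256-SHA
/opt/mysql/private/slave.key
0
"""

if __name__ == "__main__":
    for name, value in label_master_info(SAMPLE).items():
        print(f"{name:32s} {value!r}")
```

Read this way, the two `1` lines would be the connect retry interval and the SSL-allowed flag, and the empty line would be an unset CA path.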

From the user table on the master:

mysql> select * from user WHERE User = 'REPLslavesvr';
+----------------------------+---------------+-------------------------------------------+-------------+-------------+-------------+-------------+-------------+-----------+-------------+---------------+--------------+-----------+------------+-----------------+------------+------------+--------------+------------+-----------------------+------------------+--------------+-----------------+------------------+------------------+----------------+---------------------+--------------------+------------------+------------+--------------+----------+------------+-------------+--------------+---------------+-------------+-----------------+----------------------+
| Host                       | User          | Password                                  | Select_priv | Insert_priv | Update_priv | Delete_priv | Create_priv | Drop_priv | Reload_priv | Shutdown_priv | Process_priv | File_priv | Grant_priv | References_priv | Index_priv | Alter_priv | Show_db_priv | Super_priv | Create_tmp_table_priv | Lock_tables_priv | Execute_priv | Repl_slave_priv | Repl_client_priv | Create_view_priv | Show_view_priv | Create_routine_priv | Alter_routine_priv | Create_user_priv | Event_priv | Trigger_priv | ssl_type | ssl_cipher | x509_issuer | x509_subject | max_questions | max_updates | max_connections | max_user_connections |
+----------------------------+---------------+-------------------------------------------+-------------+-------------+-------------+-------------+-------------+-----------+-------------+---------------+--------------+-----------+------------+-----------------+------------+------------+--------------+------------+-----------------------+------------------+--------------+-----------------+------------------+------------------+----------------+---------------------+--------------------+------------------+------------+--------------+----------+------------+-------------+--------------+---------------+-------------+-----------------+----------------------+
| %                          | REPLslavesvr | *19A3D0E44828F58BFFD97C331FE270BA8317B478 | N           | N           | N           | N           | N           | N         | N           | N             | N            | N         | N          | N               | N          | N          | N            | N          | N                     | N                | N            | N               | N                | N                | N              | N                   | N                  | N                | N          | N            |          |            |             |              |             0 |           0 |               0 |                    0 |
| <FQDN of slave server> | REPLslavesvr | *19A3D0E44828F58BFFD97C331FE270BA8317B478 | N           | N           | N           | N           | N           | N         | N           | N             | N            | N         | N          | N               | N          | N          | N            | N          | N                     | N                | N            | Y               | N                | N                | N              | N                   | N                  | N                | N          | N            | X509     |            |             |              |             0 |           0 |               0 |                    0 |
+----------------------------+---------------+-------------------------------------------+-------------+-------------+-------------+-------------+-------------+-----------+-------------+---------------+--------------+-----------+------------+-----------------+------------+------------+--------------+------------+-----------------------+------------------+--------------+-----------------+------------------+------------------+----------------+---------------------+--------------------+------------------+------------+--------------+----------+------------+-------------+--------------+---------------+-------------+-----------------+----------------------+
2 rows in set (0.00 sec)

The slave starts and immediately goes into the following state:
+----------------------+---------------+-----------------------------------------------------------------------------------------------------------+
| Slave_IO_State       | Last_IO_Errno | Last_IO_Error                                                                                             |
| Connecting to master |          2026 | error connecting to master 'REPLdsdclvwdb@p1clv1d1.edc.cingular.net:3306' - retry-time: 1  retries: 86400 |
+----------------------+---------------+-----------------------------------------------------------------------------------------------------------+

eventually whatever is going on works itself out and...

+----------------------------------+---------------+---------------+
| Slave_IO_State                   | Last_IO_Errno | Last_IO_Error |
| Waiting for master to send event |             0 |               |
+----------------------------------+---------------+---------------+

and as long as the connection is there it's fine... but if it flaps it's back to error 2026.

As a side note, I can reproduce error 2026 from the command line.  When I execute

bin/mysql --ssl-ca=certs/CA.crt --ssl-cert=certs/slave.crt --ssl-key=private/slave.key -h<FQDN of master server> -uREPLslavesvr -pslavesvrREPL!#

I repeatedly get error 2026, then an occasional successful connection, and then error 2026 again.  The same happens with other users.  I added

--ssl-cipher=DHE-RSA-AES256-SHA 

and it seems to work more often, but not consistently.  I ran 'change master' on the slave to add the cipher and set the connect retry interval to 1 second, which is why those values are in the slave's master.info.
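
To put a number on "more often but not consistently", the command-line test above could be repeated in a loop with the failures counted. The following is a rough sketch, not a diagnostic tool: the helper names (`build_command`, `try_connect`, `failure_rate`) are invented for illustration, the paths are the relative ones from the command above, and any non-zero exit status of the client is counted as a failure without checking that it is specifically error 2026.

```python
import subprocess


def build_command(host, user, password, cipher=None):
    """Build a mysql client invocation mirroring the one in the report."""
    cmd = [
        "bin/mysql",
        "--ssl-ca=certs/CA.crt",
        "--ssl-cert=certs/slave.crt",
        "--ssl-key=private/slave.key",
    ]
    if cipher:
        cmd.append("--ssl-cipher=" + cipher)
    # The password is passed on the command line, as in the report;
    # it is therefore visible in the process list while the client runs.
    cmd += ["-h" + host, "-u" + user, "-p" + password, "-e", "SELECT 1"]
    return cmd


def try_connect(cmd, timeout=30):
    """Return True if the client exits with status 0, False otherwise."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def failure_rate(attempt, tries):
    """Call attempt() `tries` times and return the fraction that failed."""
    if tries <= 0:
        raise ValueError("tries must be positive")
    failures = sum(1 for _ in range(tries) if not attempt())
    return failures / tries


# Example use (host, user and password are placeholders):
#   cmd = build_command("master.example.com", "REPLslavesvr", "secret",
#                       cipher="DHE-RSA-AES256-SHA")
#   print(failure_rate(lambda: try_connect(cmd), 50))
```

Running it once with and once without the `cipher` argument would give two failure rates that can be compared directly.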

I have seen similar complaints about this on the internet, but no resolution, and I couldn't find a current bug.

How to repeat:
Configure the same as noted in description

Suggested fix:
Fix SSL implementation
[17 Feb 2011 17:15] Guillermo Simeon
Same thing on Debian with any version after 5.1.37 (last tested: 5.1.55).

Any SSL connection fails, whether the server and client certificates use the same or different Common Names.
[28 Mar 2011 23:24] Sveta Smirnova
Thank you for the report.

This can happen for different reasons.

Do you have the same issues if you set up replication without SSL?

Please add the option --log-warnings=2 to your master server and send us the error log file after a few connection failures.

Please also try increasing the timeouts (connect_timeout, net_write_timeout and net_read_timeout) and let us know if it helps.
[29 Apr 2011 23:00] Bugs System
No feedback was provided for this bug for over a month, so it is
being suspended automatically. If you are able to provide the
information that was originally requested, please do so and change
the status of the bug back to "Open".