Bug #73494 Slave can't reconnect after 2003 error when sha256_password is used
Submitted: 7 Aug 2014 8:15 Modified: 7 Aug 2014 20:49
Reporter: Hermitte Bastien Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:5.6.20 OS:Linux (CentOS 6.4)
Assigned to: CPU Architecture:Any
Tags: Error_code 1045, SHA256

[7 Aug 2014 8:15] Hermitte Bastien
Description:
When sha256_password authentication plugin is used for the replication user, and after a 2003 error on the slave (can't connect to master, after a network failure for example), slave can't reconnect automatically to the master and fails with error_code 1045.

A tcpdump on the master when slave try to connects show :
#28000Access denied for user '<slave_user>'@'<slave_ip_address>' (using password: NO)

So it seems that the slave doesn't send the password for automatically reconnecting.

The problem occur with MySQL officials RPMs (compiled with yaSSL) and with our MySQL RPMs (compiled with OpenSSL).

The servers are virtual machines on VMware vSphere 5.1.

Let me know if you need more details.

How to repeat:
Setup a replication between two nodes and use the sha256_password authentication plugin for the replication user (according to http://dev.mysql.com/doc/refman/5.6/en/sha256-authentication-plugin.html).

Note : with MySQL compiled with yaSSL the replication must use SSL for use SHA256 authentication, but not needed for MySQL compiled with OpenSSL.

Then generate a 2003 error (in my case I disconnect the network card of the slave VM, but you can also stop mysql on the master for few seconds).
After the master comes back up, the slave can't reconnect with error :
[ERROR] Slave I/O: error reconnecting to master '<slave_user>'@'<slave_ip_address>' - retry-time: 10  retries: 3879, Error_code: 1045

Configuration is default except :
default-authentication-plugin=sha256_password
old_passwords=2
slave_net_timeout=10
slave_compressed_protocol=1
slave_parallel_workers=8
log-slave-updates
binlog-format=MIXED

The server for replication user is IP address with % wildcard on the last octet.

Suggested fix:
Not a good/secure solution, but if '<slave_user>'@'<slave_ip_address>' is created on the master without password, the slave reconnects well.
[7 Aug 2014 20:49] Sveta Smirnova
Thank you for the report.

Verified as described.

To repeat.

1. Have to copy of MySQL installations (source and build dir are sufficient)
2. Start MTR: ./mtr --start --suite=rpl rpl_alter --mysqld=--gtid_mode=ON --mysqld=--log-slave-updates --mysqld=--enforce-gtid-consistency --mysqld=--innodb_buffer_pool_size=1G --mysqld=--tmp_table_size=1G --mysqld=--max_heap_table_size=1G --mysqld=--relay_log_info_repository=table --mysqld=--master_info_repository=table --mysqld=--sync-master-info=1 --mysqld=--ssl --mysqld=--ssl-key=`pwd`/std_data/server-key.pem --mysqld=--ssl-cert=`pwd`/std_data/server-cert.pem --mysqld=--ssl-ca=`pwd`/std_data/cacert.pem &
3. Connect to master and create a user:

mysql> grant replication slave on *.* to bug73494@'%' identified with sha256_password;
Query OK, 0 rows affected (0.03 sec)

mysql> SET old_passwords = 2;
Query OK, 0 rows affected (0.00 sec)

mysql> set password for bug73494@'%'=password('bug73494');
Query OK, 0 rows affected (0.03 sec)

4. In the second MYSQL installation dir, run MTR as in step 2. It will start servers with different ports.

5. Run CHANGE MASTER on slave, started at step 4:

mysql> change master to master_ssl=1, master_ssl_ca='/home/sveta/src/mysql-5.6/mysql-test/std_data/cacert.pem', master_ssl_key='/home/sveta/src/mysql-5.6/mysql-test/std_data/client-key.pem', master_ssl_cert='/home/sveta/src/mysql-5.6/mysql-test/std_data/client-cert.pem';
Query OK, 0 rows affected (0.47 sec)

mysql> change master to master_host='127.0.0.1', master_port=13000, master_user='bug73494', master_password='bug73494';
Query OK, 0 rows affected, 2 warnings (0.34 sec)

6. Start slave, check slave status, ensure slave is running.

7. Stop servers, started at step 2.

8. Restart MTR with start-dirty:

./mtr --start-dirty --suite=rpl rpl_alter --mysqld=--gtid_mode=ON --mysqld=--log-slave-updates --mysqld=--enforce-gtid-consistency --mysqld=--innodb_buffer_pool_size=1G --mysqld=--tmp_table_size=1G --mysqld=--max_heap_table_size=1G --mysqld=--relay_log_info_repository=table --mysqld=--master_info_repository=table --mysqld=--sync-master-info=1 --mysqld=--ssl --mysqld=--ssl-key=`pwd`/std_data/server-key.pem --mysqld=--ssl-cert=`pwd`/std_data/server-cert.pem --mysqld=--ssl-ca=`pwd`/std_data/cacert.pem &

9. Connect to the slave, find out that IO thread in "Connecting" state and cannot connect, because error 1045.

10. Check error log:

2014-08-07 23:42:19 903 [ERROR] Slave I/O: error reconnecting to master 'bug73494@127.0.0.1:13000' - retry-time: 60  retries: 1, Error_code: 2003
2014-08-07 23:43:19 903 [ERROR] Slave I/O: error reconnecting to master 'bug73494@127.0.0.1:13000' - retry-time: 60  retries: 2, Error_code: 1045
2014-08-07 23:44:20 903 [ERROR] Slave I/O: error reconnecting to master 'bug73494@127.0.0.1:13000' - retry-time: 60  retries: 3, Error_code: 1045
2014-08-07 23:45:20 903 [ERROR] Slave I/O: error reconnecting to master 'bug73494@127.0.0.1:13000' - retry-time: 60  retries: 4, Error_code: 1045
2014-08-07 23:46:20 903 [ERROR] Slave I/O: error reconnecting to master 'bug73494@127.0.0.1:13000' - retry-time: 60  retries: 5, Error_code: 1045
2014-08-07 23:47:20 903 [ERROR] Slave I/O: error reconnecting to master 'bug73494@127.0.0.1:13000' - retry-time: 60  retries: 6, Error_code: 1045
...

11. Connect to slave server, run STOP SLAVE; START SLAVE; ensure that this fixes the issue.
[30 Oct 2017 14:28] Shahriyar Rzayev
I have encountered the same situation but, it turns out it was due to blank users in MySQL :)

So created user as tried to run change master:

2017-10-30 14:03:59 DEBUG    Running -> /home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/bin/mysql -A -uroot -S/home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/socket.sock --force test -e 'select @@port'
2017-10-30 14:03:59 DEBUG    Running -> /home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/bin/mysql -A -uroot -S/home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/socket.sock --force test -e "CREATE USER 'repl'@'%' IDENTIFIED BY 'Baku12345'"
2017-10-30 14:03:59 DEBUG    Running -> /home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/bin/mysql -A -uroot -S/home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/socket.sock --force test -e "GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%'"
2017-10-30 14:03:59 DEBUG    Running -> /home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/bin/mysql -A -uroot -S/home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/sock0.sock --force test -e "CHANGE MASTER TO MASTER_HOST='127.0.0.1', MASTER_USER='repl', MASTER_PASSWORD='Baku12345', MASTER_PORT=10023, MASTER_AUTO_POSITION=1"
2017-10-30 14:03:59 DEBUG    Running -> /home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/bin/mysql -A -uroot -S/home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/sock0.sock --force test -e 'start slave'

After that got the error:

2017-10-30 13:22:05 1314 [ERROR] Slave I/O: error connecting to master 'repl@localhost:10023' - retry-time: 60  retries: 11, Error_code: 2003
2017-10-30 13:22:48 7182 [ERROR] Slave I/O: error connecting to master 'repl@localhost:10023' - retry-time: 60  retries: 106, Error_code: 2003
2017-10-30 13:22:57 28984 [ERROR] Slave I/O: error connecting to master 'repl@localhost:10023' - retry-time: 60  retries: 1316, Error_code: 2003
2017-10-30 13:23:05 1314 [ERROR] Slave I/O: error connecting to master 'repl@localhost:10023' - retry-time: 60  retries: 12, Error_code: 2003
2017-10-30 13:23:48 7182 [ERROR] Slave I/O: error connecting to master 'repl@localhost:10023' - retry-time: 60  retries: 107, Error_code: 2003
2017-10-30 13:23:57 28984 [ERROR] Slave I/O: error connecting to master 'repl@localhost:10023' - retry-time: 60  retries: 1317, Error_code: 1045
2017-10-30 13:24:05 1314 [ERROR] Slave I/O: error connecting to master 'repl@localhost:10023' - retry-time: 60  retries: 13, Error_code: 1045
2017-10-30 13:24:48 7182 [ERROR] Slave I/O: error connecting to master 'repl@localhost:10023' - retry-time: 60  retries: 108, Error_code: 1045

Tried to connect via mysql client:

$ /home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/bin/mysql -urepl --password='Baku12345' --port=10023 --socket=/home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/socket.sock
Warning: Using a password on the command line interface can be insecure.
ERROR 1045 (28000): Access denied for user 'repl'@'localhost' (using password: YES)

Removed blank users from master:

5.6.37>drop user ''@'localhost';
Query OK, 0 rows affected (0.01 sec)

5.6.37>drop user ''@'qaserver-02.ci.percona.com';
Query OK, 0 rows affected (0.00 sec)

Tried to reconnect:

$ /home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/bin/mysql -urepl --password='Baku12345' --port=10023 --socket=/home/shahriyar.rzaev/XB_TEST/server_dir/PS231017-percona-server-5.6.37-82.2-linux-x86_64-debug/socket.sock
Warning: Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 623
Server version: 5.6.37-82.2-debug-log MySQL Community Server (GPL)

Copyright (c) 2009-2017 Percona LLC and/or its affiliates
Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> 

So "Blank users" took my day here to figure out, why my script was failing.