Bug #89959 Replication fails when users are created using caching_sha2_password
Submitted: 8 Mar 10:59 Modified: 8 Mar 13:51
Reporter: Giuseppe Maxia (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:8.0.4 OS:Any
Assigned to: CPU Architecture:Any
Tags: auth, replication

[8 Mar 10:59] Giuseppe Maxia
Description:
When MySQL users are defined with caching_sha2_password, replication slaves fail to connect.

The procedure used is the same used for all previous MySQL versions, up to 8.0.2.

change master to master_host='127.0.0.1', master_port=24009, master_user='rsandbox', master_password='rsandbox';
start slave;
SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Connecting to master
                  Master_Host: 127.0.0.1
                  Master_User: rsandbox
                  Master_Port: 24009
                Connect_Retry: 60
              Master_Log_File:
          Read_Master_Log_Pos: 4
               Relay_Log_File: mysql-relay.000001
                Relay_Log_Pos: 4
        Relay_Master_Log_File:
             Slave_IO_Running: Connecting
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 0
              Relay_Log_Space: 155
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 2061
                Last_IO_Error: error connecting to master 'rsandbox@127.0.0.1:24009' - retry-time: 60  retries: 1
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 0
                  Master_UUID:
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp: 180308 11:51:55
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
         Replicate_Rewrite_DB:
                 Channel_Name:
           Master_TLS_Version:
       Master_public_key_path:
        Get_master_public_key: 0
1 row in set (0.00 sec)

However, if I run a manual connection to the master using the replication  user, the slave connection succeeds.

~/sandboxes/multi_msb_8_0_4/n1 -u rsandbox -prsandbox -h 127.0.0.1 -e 'set @a=1'

Then, using a different slave node:

node3 [localhost] {msandbox} ((none)) > change master to master_host='127.0.0.1', master_port=24009, master_user='rsandbox', master_password='rsandbox';
Query OK, 0 rows affected, 2 warnings (0.01 sec)

node3 [localhost] {msandbox} ((none)) > start slave;
Query OK, 0 rows affected (0.00 sec)

node3 [localhost] {msandbox} ((none)) > SHOW SLAVE STATUS\G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 127.0.0.1
                  Master_User: rsandbox
                  Master_Port: 24009
                Connect_Retry: 60
              Master_Log_File: mysql-bin.000001
          Read_Master_Log_Pos: 155
               Relay_Log_File: mysql-relay.000002
                Relay_Log_Pos: 369
        Relay_Master_Log_File: mysql-bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

How to repeat:
1. Using MySQL 8.0.4, prepare one master and two slave nodes;
2. Without any operation in the master, connect the first slave using CHANGE MASTER TO
change master to master_host='127.0.0.1', master_port=24009, master_user='rsandbox', master_password='rsandbox';
start slave;
3. Check the replication status
4. Connect to the master using the replication user:
mysql -u rsandbox -prsandbox -h 127.0.0.1 -e 'set @a=1'
5. Connect another slave, or run 'reset slave' on the previous one and repeat the CHANGE MASTER TO.
6. Check replication status.
[8 Mar 12:52] Umesh Shastry
Hello Giuseppe,

Thank you for the report and feedback.
Observed reported issue but also noted that slave connected when I restarted slave.

Thanks,
Umesh
[8 Mar 12:53] Umesh Shastry
test results

Attachment: 89959_8.0.4.results (application/octet-stream, text), 10.58 KiB.

[8 Mar 13:02] Giuseppe Maxia
Hi Umesh,
Thanks for verifying the issue.
In the meantime, I have seen that there is also another method to gain a correct connection, in addition to running a query using the regular client.
We can add GET_MASTER_PUBLIC_KEY to the "CHANGE MASTER TO" statement, and the connection succeeds. 
The manual explains this option (https://dev.mysql.com/doc/refman/8.0/en/change-master-to.html) although it is not clear why the regular connection with another client solves the issue.
I'd say that the bug I reported is probably a matter of RTFM (albeit the docs could tell more about the problem, and especially they could mention this issue when announcing the change of default authentication plugin.)
However, if the problem can be solved by a connection from a different client, I think there is a different problem, which is the master is trusting a client that should not be trusted.
[8 Mar 13:51] Giuseppe Maxia
One more thing. The addition of GET_MASTER_PUBLIC_KEY=1 does not solve the issue for group replication, while the previous connection with a  regular client does.
So, for the time being, I'd like to keep this bug report active, until more testing can be performed.
[9 Mar 6:05] Harin Vadodaria
Hi Giuseppe,

For group replication, in additional to options related to SSL following options are available for configuring RSA public key:
1. --group-replication-recovery-get-public-key
2. --group-replication-recovery-public-key-path

Please see: https://dev.mysql.com/doc/refman/8.0/en/group-replication-options.html

The reason why regular connection with other client solve the issue is explained here: https://dev.mysql.com/doc/refman/8.0/en/caching-sha2-pluggable-authentication.html#caching...

Caching_sha2_password creates an in-memory cache after first successful authentication through secure channel for a given user account. This cache allows fast authentication based on sha256 scramble for subsequent connection. This cache may be cleared(fully or partially) as a part of user management operations.
[9 Mar 8:30] Rene' Cannao'
If I understand this correctly, does this mean that replication can only perform a "Full Authentication via RSA Key Exchange" and not a "Full Authentication via SSL/TLS" ?
[9 Mar 9:20] Harin Vadodaria
No, replication can perform full authentication using TLS too.
[16 May 8:59] Umesh Shastry
Bug #90883 marked as duplicate of this one