Bug #87551 in ReplicationGroup, open connection timeout doesn't mark server as UnAvailable
Submitted: 26 Aug 2017 7:20 Modified: 6 Sep 2017 10:35
Reporter: mog liang
Status: Not a Bug
Category:Connector / NET Severity:S3 (Non-critical)
Version:6.9.9 OS:Microsoft Windows
Assigned to: Chiranjeevi Battula CPU Architecture:Any
Tags: fallback, replication

[26 Aug 2017 7:20] mog liang
Description:
We set up 2 MySQL servers in a MySQL replication group. Yesterday one of the servers ran into an error, and every connection attempt to it failed with a TimeoutException.

We thought the MySQL replication group would handle this scenario and mark that server as "Unavailable". In fact, the MySQL client library keeps sending new connections to the bad server.

The call stack looks like this:
Unhandled exception System.Data.Entity.Core.EntityException: The underlying provider failed on Open. ---> System.TimeoutException: Timeout in IO operation
   at MySql.Data.MySqlClient.TimedStream.StopTimer()
   at MySql.Data.MySqlClient.TimedStream.Read(Byte[] buffer, Int32 offset, Int32 count)
   at System.IO.BufferedStream.Read(Byte[] array, Int32 offset, Int32 count)
   at MySql.Data.MySqlClient.MySqlStream.ReadFully(Stream stream, Byte[] buffer, Int32 offset, Int32 count)
   at MySql.Data.MySqlClient.MySqlStream.LoadPacket()
   at MySql.Data.MySqlClient.MySqlStream.ReadPacket()
   at MySql.Data.MySqlClient.Authentication.MySqlAuthenticationPlugin.ReadPacket()
   at MySql.Data.MySqlClient.Authentication.MySqlAuthenticationPlugin.Authenticate(Boolean reset)
   at MySql.Data.MySqlClient.NativeDriver.Open()
   at MySql.Data.MySqlClient.Driver.Open()
   at MySql.Data.MySqlClient.Driver.Create(MySqlConnectionStringBuilder settings)
   at MySql.Data.MySqlClient.Replication.ReplicationManager.GetNewConnection(String groupName, Boolean master, MySqlConnection connection)
   at MySql.Data.MySqlClient.MySqlConnection.Open()
   at System.Data.Entity.Infrastructure.Interception.InternalDispatcher`1.Dispatch[TTarget,TInterceptionContext](TTarget target, Action`2 operation, TInterceptionContext interceptionContext, Action`3 executing, Action`3 executed)
   at System.Data.Entity.Infrastructure.Interception.DbConnectionDispatcher.Open(DbConnection connection, DbInterceptionContext interceptionContext)
   at System.Data.Entity.Core.EntityClient.EntityConnection.Open()

Code analysis:
===
We checked the code of ReplicationManager.GetNewConnection() and found that it does not handle the timeout exception and simply lets it propagate. I believe this is a bad idea: if opening a connection times out, the server is definitely broken.

How to repeat:
It is not easy to reproduce a server error. However, the problem is quite obvious from reading the code.

Please check the try...catch statement in ReplicationManager.GetNewConnection(string groupName, bool master, MySqlConnection connection).
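
One way to approximate the failure without breaking a real server is a stub that accepts TCP connections and then never answers. The sketch below is written in Python only to keep it short and self-contained. The names start_silent_server and probe are hypothetical helpers, not part of any MySQL library. In the stack trace above the timeout is raised while reading a packet during authentication, so the real server apparently got further than this stub does. Treat it as a rough stand-in for "server accepts the connection but never replies".

```python
import socket
import threading


def start_silent_server(host="127.0.0.1"):
    """Listen on a free port, accept connections, and never send a byte."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((host, 0))  # port 0 lets the OS pick a free port
    listener.listen(5)
    accepted = []  # keep accepted sockets referenced so they stay open

    def accept_loop():
        while True:
            try:
                conn, _ = listener.accept()
            except OSError:
                return  # listener was closed
            accepted.append(conn)

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener, listener.getsockname()[1]


def probe(host, port, timeout_s=0.5):
    """Connect, wait for a server greeting, and report what happened."""
    with socket.create_connection((host, port), timeout=timeout_s) as client:
        try:
            # A healthy MySQL server speaks first; the stub stays silent.
            data = client.recv(4)
        except socket.timeout:
            return "timeout"
    return "greeting received" if data else "closed"
```

Calling probe("127.0.0.1", port) against the stub returns "timeout" once timeout_s has elapsed. Pointing one server entry of a test replication group at the stub's port would be one possible way to exercise the client-side behaviour described in this report, although that has not been verified here.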

Suggested fix:
Please treat a timeout exception during connection open as "server unavailable".
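
To make the suggestion concrete, here is a minimal sketch of the requested behaviour. It is not Connector/NET code. It is a simplified Python model, and Server, NoAvailableServerError and get_new_connection are hypothetical names chosen for illustration. The only point is that a timeout while opening is handled the same way as any other connection failure: the server is flagged as unavailable and the next one is tried.

```python
class Server:
    def __init__(self, name, opener):
        self.name = name
        self.opener = opener        # callable: returns a connection or raises
        self.is_available = True


class NoAvailableServerError(Exception):
    pass


def get_new_connection(servers):
    """Try each available server in order; a failed open marks it unavailable."""
    for server in servers:
        if not server.is_available:
            continue
        try:
            return server.name, server.opener()
        except (ConnectionError, TimeoutError):
            # Assumption made by this sketch: a timeout while opening means
            # the server cannot be used, so it is skipped from now on.
            server.is_available = False
    raise NoAvailableServerError("no available server in the group")


# Example openers standing in for a broken and a healthy server.
def timing_out():
    raise TimeoutError("Timeout in IO operation")


def healthy():
    return "connection"
```

With servers = [Server("a", timing_out), Server("b", healthy)], the first call fails over to "b" and flags "a" as unavailable, so later calls skip "a" without waiting for another timeout. The sketch leaves out how an unavailable server would be checked and re-enabled later.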
[29 Aug 2017 7:58] Chiranjeevi Battula
Hello mog liang,

Thank you for the report.
Replication failover only works when connecting. If a connection is already established and breaks afterwards, the user application should handle the exceptions.

Thanks,
Chiranjeevi.
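
For readers following the advice above to handle exceptions in the application, a bounded retry around the open call is one common pattern. The sketch below is a generic Python illustration under stated assumptions. open_with_retry and its parameters are hypothetical, and open_connection stands for whatever the application uses to open a connection. It does not change which server the client library picks on the next attempt.

```python
import time


def open_with_retry(open_connection, attempts=3, delay_s=0.0):
    """Call open_connection until it succeeds or the attempts are used up."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return open_connection()
        except TimeoutError as exc:
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_s)  # optional pause before the next attempt
    raise last_error  # every attempt timed out; surface the last error
```

A retry like this only helps if a later attempt can reach a healthy server. If the library keeps choosing the broken one, as this report describes, the retries will time out as well.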
[6 Sep 2017 10:35] mog liang
Thanks for the reply.

The issue happens in the connecting phase: opening the connection timed out.
If you check the call stack, it shows that it is trying to open a MySqlConnection.

at MySql.Data.MySqlClient.NativeDriver.Open()
   at MySql.Data.MySqlClient.Driver.Open()