| Bug #114624 | Deadlock on primary replica when adding instance to replica cluster with writes | ||
|---|---|---|---|
| Submitted: | 11 Apr 2024 13:37 | Modified: | 14 Oct 2024 13:16 |
| Reporter: | Paulo Machado | Email Updates: | |
| Status: | Verified | Impact on me: | |
| Category: | MySQL Server: Group Replication | Severity: | S2 (Serious) |
| Version: | 8.0.39 | OS: | Ubuntu (22.04) |
| Assigned to: | CPU Architecture: | x86 | |
[11 Apr 2024 13:37]
Paulo Machado
[11 Apr 2024 14:46]
Paulo Machado
mysqlsh log where add instance is executed
Attachment: mysqlsh.log (text/x-log), 185.41 KiB.
[11 Apr 2024 14:47]
Paulo Machado
GDB stack trace from primary instance of replica cluster
Attachment: gdb_stack_trace.txt (text/plain), 90.56 KiB.
[11 Apr 2024 14:47]
Paulo Machado
Thread on primary instance of replica cluster
Attachment: performance_schema.threads.txt (text/plain), 52.10 KiB.
[11 Apr 2024 15:43]
MySQL Verification Team
Thanks for the report
[15 Apr 2024 14:12]
Paulo Machado
Extra context: The issue is being consistently replicated in the context of a operator for mysql (same behavior on vms, bare metal and kubernetes). That said, the sequence of events in the replication steps happens in quick succession, which may be playing a role in triggering the deadlock
[8 Aug 2024 21:01]
Paulo Machado
Reproducible using:
dba.deploy_sandbox_instance(3310, {'password': "", 'sandboxDir': "/tmp/sandbox"});
dba.deploy_sandbox_instance(3320, {'password': "", 'sandboxDir': "/tmp/sandbox"});
dba.deploy_sandbox_instance(3330, {'password': "", 'sandboxDir': "/tmp/sandbox"});
# 1. Deploy Group Replication cluster (A)
shell.connect("root@127.0.0.1:3310", "")
cluster = dba.create_cluster("my_cluster")
cluster.add_instance("127.0.0.1:3320", {'recoveryMethod':'incremental'});
cluster.add_instance("127.0.0.1:3330", {'recoveryMethod':'incremental'});
# 2. Create ClusterSet on A
clusterset = cluster.create_cluster_set("foobar")
# 3. Deploy standalone instance B1
dba.deploy_sandbox_instance(3340, {'password': "", 'sandboxDir': "/tmp/sandbox"});
# 4. Create ReplicaCluster on B1 from A (cluster_set.create_replica_cluster)
clusterset.create_replica_cluster("127.0.0.1:3340", "replica", {'recoveryMethod':'incremental'});
# 5. Add some write load to cluster A
# \sql
my_session = mysql.get_classic_session('root@localhost:3310', '');
my_session.run_sql("CREATE DATABASE test");
my_session.run_sql("USE test;");
my_session.run_sql("CREATE TABLE t1 (c1 INT NOT NULL AUTO_INCREMENT PRIMARY KEY, c2 INT) ENGINE=InnoDB;");
my_session.run_sql("INSERT INTO t1 (c2) values (2);");
my_session.run_sql("INSERT INTO t1 (c2) values (2);");
my_session.run_sql("INSERT INTO t1 (c2) values (2);");
my_session.run_sql("INSERT INTO t1 (c2) values (2);");
my_session.run_sql("INSERT INTO t1 (c2) values (2);");
my_session.run_sql("INSERT INTO t1 (c2) values (2);");
my_session.run_sql("INSERT INTO t1 (c2) values (2);");
my_session.run_sql("INSERT INTO t1 (c2) values (2);");
my_session.run_sql("INSERT INTO t1 (c2) values (2);");
my_session.run_sql("INSERT INTO t1 (c2) values (2);");
my_session.run_sql("INSERT INTO t1 (c2) values (2);");
my_session.run_sql("INSERT INTO t1 (c2) SELECT c2 FROM t1;");
my_session.run_sql("INSERT INTO t1 (c2) SELECT c2 FROM t1;");
my_session.run_sql("INSERT INTO t1 (c2) SELECT c2 FROM t1;");
my_session.run_sql("INSERT INTO t1 (c2) SELECT c2 FROM t1;");
my_session.run_sql("INSERT INTO t1 (c2) SELECT c2 FROM t1;");
On other terminal generate load with mysqlslap:
mysqlslap -S /tmp/sandbox/3310/sandboxdata/mysqld.sock --create-schema=test --delimiter=";" --iterations=300000 --query="INSERT INTO t1 (c2) SELECT c2 FROM t1 LIMIT 100" --concurrency=100 &
And on mysqlshell execute final steps:
shell.connect("root@127.0.0.1:3340", "")
dba.deploy_sandbox_instance(3350, {'password': "", 'sandboxDir': "/tmp/sandbox"});
# First difference - I'm adding instances to the replica cluster
cluster = dba.get_cluster("replica")
cluster.add_instance("127.0.0.1:3350", {'recoveryMethod':'incremental'})
# Second difference - sometimes It was needed to add a third instance to the replica cluster
dba.deploy_sandbox_instance(3360, {'password': "", 'sandboxDir': "/tmp/sandbox"});
cluster.add_instance("127.0.0.1:3360", {'recoveryMethod':'incremental'})
session.run_sql("SELECT max(c1) FROM test.t1")
# PK stop incrementing while mysqlslap is running
# connect to `my_cluster` primary
shell.connect("root@127.0.0.1:3310", "")
# PK still incrementing while mysqlslap is running
session.run_sql("SELECT max(c1) FROM test.t1")
[14 Oct 2024 13:16]
Paulo Machado
Hello there! DO you need further information? Any progress on the bug replication?
