Bug #114624 | Deadlock on primary replica when adding instance to replica cluster with writes | ||
---|---|---|---|
Submitted: | 11 Apr 13:37 | Modified: | 14 Oct 13:16 |
Reporter: | Paulo Machado | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Group Replication | Severity: | S2 (Serious) |
Version: | 8.0.39 | OS: | Ubuntu (22.04) |
Assigned to: | CPU Architecture: | x86 |
[11 Apr 13:37]
Paulo Machado
[11 Apr 14:46]
Paulo Machado
mysqlsh log where add instance is executed
Attachment: mysqlsh.log (text/x-log), 185.41 KiB.
[11 Apr 14:47]
Paulo Machado
GDB stack trace from primary instance of replica cluster
Attachment: gdb_stack_trace.txt (text/plain), 90.56 KiB.
[11 Apr 14:47]
Paulo Machado
Thread on primary instance of replica cluster
Attachment: performance_schema.threads.txt (text/plain), 52.10 KiB.
[11 Apr 15:43]
MySQL Verification Team
Thanks for the report
[15 Apr 14:12]
Paulo Machado
Extra context: The issue is being consistently replicated in the context of a operator for mysql (same behavior on vms, bare metal and kubernetes). That said, the sequence of events in the replication steps happens in quick succession, which may be playing a role in triggering the deadlock
[8 Aug 21:01]
Paulo Machado
Reproducible using: dba.deploy_sandbox_instance(3310, {'password': "", 'sandboxDir': "/tmp/sandbox"}); dba.deploy_sandbox_instance(3320, {'password': "", 'sandboxDir': "/tmp/sandbox"}); dba.deploy_sandbox_instance(3330, {'password': "", 'sandboxDir': "/tmp/sandbox"}); # 1. Deploy Group Replication cluster (A) shell.connect("root@127.0.0.1:3310", "") cluster = dba.create_cluster("my_cluster") cluster.add_instance("127.0.0.1:3320", {'recoveryMethod':'incremental'}); cluster.add_instance("127.0.0.1:3330", {'recoveryMethod':'incremental'}); # 2. Create ClusterSet on A clusterset = cluster.create_cluster_set("foobar") # 3. Deploy standalone instance B1 dba.deploy_sandbox_instance(3340, {'password': "", 'sandboxDir': "/tmp/sandbox"}); # 4. Create ReplicaCluster on B1 from A (cluster_set.create_replica_cluster) clusterset.create_replica_cluster("127.0.0.1:3340", "replica", {'recoveryMethod':'incremental'}); # 5. Add some write load to cluster A # \sql my_session = mysql.get_classic_session('root@localhost:3310', ''); my_session.run_sql("CREATE DATABASE test"); my_session.run_sql("USE test;"); my_session.run_sql("CREATE TABLE t1 (c1 INT NOT NULL AUTO_INCREMENT PRIMARY KEY, c2 INT) ENGINE=InnoDB;"); my_session.run_sql("INSERT INTO t1 (c2) values (2);"); my_session.run_sql("INSERT INTO t1 (c2) values (2);"); my_session.run_sql("INSERT INTO t1 (c2) values (2);"); my_session.run_sql("INSERT INTO t1 (c2) values (2);"); my_session.run_sql("INSERT INTO t1 (c2) values (2);"); my_session.run_sql("INSERT INTO t1 (c2) values (2);"); my_session.run_sql("INSERT INTO t1 (c2) values (2);"); my_session.run_sql("INSERT INTO t1 (c2) values (2);"); my_session.run_sql("INSERT INTO t1 (c2) values (2);"); my_session.run_sql("INSERT INTO t1 (c2) values (2);"); my_session.run_sql("INSERT INTO t1 (c2) values (2);"); my_session.run_sql("INSERT INTO t1 (c2) SELECT c2 FROM t1;"); my_session.run_sql("INSERT INTO t1 (c2) SELECT c2 FROM t1;"); my_session.run_sql("INSERT INTO t1 (c2) SELECT c2 FROM t1;"); my_session.run_sql("INSERT INTO t1 (c2) SELECT c2 FROM t1;"); my_session.run_sql("INSERT INTO t1 (c2) SELECT c2 FROM t1;"); On other terminal generate load with mysqlslap: mysqlslap -S /tmp/sandbox/3310/sandboxdata/mysqld.sock --create-schema=test --delimiter=";" --iterations=300000 --query="INSERT INTO t1 (c2) SELECT c2 FROM t1 LIMIT 100" --concurrency=100 & And on mysqlshell execute final steps: shell.connect("root@127.0.0.1:3340", "") dba.deploy_sandbox_instance(3350, {'password': "", 'sandboxDir': "/tmp/sandbox"}); # First difference - I'm adding instances to the replica cluster cluster = dba.get_cluster("replica") cluster.add_instance("127.0.0.1:3350", {'recoveryMethod':'incremental'}) # Second difference - sometimes It was needed to add a third instance to the replica cluster dba.deploy_sandbox_instance(3360, {'password': "", 'sandboxDir': "/tmp/sandbox"}); cluster.add_instance("127.0.0.1:3360", {'recoveryMethod':'incremental'}) session.run_sql("SELECT max(c1) FROM test.t1") # PK stop incrementing while mysqlslap is running # connect to `my_cluster` primary shell.connect("root@127.0.0.1:3310", "") # PK still incrementing while mysqlslap is running session.run_sql("SELECT max(c1) FROM test.t1")
[14 Oct 13:16]
Paulo Machado
Hello there! DO you need further information? Any progress on the bug replication?