Description:
The transaction write set extraction feature of MySQL Group Replication incorrectly considers non-unique keys and unique keys with NULL values, which causes some transactions to be rolled back when they should not be.
How to repeat:
The following test case should not cause transaction conflicts, but due to the bug it does:
--source include/have_debug_sync.inc
--source include/have_group_replication_plugin.inc
--source include/master-slave.inc
--echo
--echo ############################################################
--echo # 1. Create a table on server1.
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT, KEY `c2` (`c2`));
--source include/rpl_sync.inc
--echo
--echo ############################################################
--echo # 2. Set a debug sync before broadcast message to group on
--echo # connection server_1.
--echo # Commit a transaction that will be blocked before broadcast.
--let $rpl_connection_name= server_1
--source include/rpl_connection.inc
SET DEBUG_SYNC='group_replication_before_message_broadcast WAIT_FOR waiting';
BEGIN;
INSERT INTO t1 VALUES (1, 2);
--send COMMIT
--echo
--echo ############################################################
--echo # 3. Wait until server_1 connection reaches the
--echo # group_replication_before_message_broadcast debug sync point.
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
--let $wait_condition=SELECT COUNT(*)=1 FROM INFORMATION_SCHEMA.PROCESSLIST WHERE State = 'debug sync point: group_replication_before_message_broadcast'
--source include/wait_condition.inc
--echo
--echo ############################################################
--echo # 4. Execute a transaction on server2, which will reach
--echo # certification first, since server_1 is blocked before broadcast.
--let $rpl_connection_name= server2
--source include/rpl_connection.inc
INSERT INTO t1 VALUES (3, 2);
--let $sync_slave_connection= server1
--source include/sync_slave_sql_with_master.inc
--echo
--echo ############################################################
--echo # 5. Signal the waiting thread on server_1 to resume.
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
SET DEBUG_SYNC='now SIGNAL waiting';
--echo
--echo ############################################################
--echo # 6. Must not error out since c2 column allows duplicate
--echo # keys.
--let $rpl_connection_name= server_1
--source include/rpl_connection.inc
--reap
Suggested fix:
The bug can be fixed with the following patch:
--- a/sql/rpl_write_set_handler.cc
+++ b/sql/rpl_write_set_handler.cc
@@ -259,6 +259,10 @@ void add_pke(TABLE *table, THD *thd)
 {
   for (uint key_number=0; key_number < table->s->keys; key_number++)
   {
+    // Skip non unique or null key.
+    if (!((table->key_info[key_number].flags & (HA_NOSAME | HA_NULL_PART_KEY)) == HA_NOSAME))
+      continue;
+
     std::string unhashed_string;
     if (key_number == 0)
       unhashed_string.append("P");
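
To make the condition in the patch easier to review, here is a minimal standalone sketch of the same bitmask test. The flag values below are assumptions for illustration (the real definitions are in include/my_base.h and should be checked there), and key_belongs_in_write_set is a hypothetical helper name, not a function in the server source. The point is only that a key passes the check when HA_NOSAME is set and HA_NULL_PART_KEY is not.

```cpp
#include <cassert>
#include <cstdint>

// Assumed flag values, for illustration only; verify against
// include/my_base.h in the server tree.
constexpr std::uint32_t HA_NOSAME = 1;          // key does not allow duplicates
constexpr std::uint32_t HA_NULL_PART_KEY = 64;  // some key part may be NULL

// Hypothetical helper (not part of the server source): returns true only
// when the key is unique and none of its parts is nullable. Any other
// flag bits are ignored by the mask.
constexpr bool key_belongs_in_write_set(std::uint32_t key_flags)
{
  return (key_flags & (HA_NOSAME | HA_NULL_PART_KEY)) == HA_NOSAME;
}

// Compile-time checks of the four flag combinations.
static_assert(!key_belongs_in_write_set(0),
              "non-unique key is skipped");
static_assert(key_belongs_in_write_set(HA_NOSAME),
              "unique NOT NULL key is kept");
static_assert(!key_belongs_in_write_set(HA_NULL_PART_KEY),
              "non-unique nullable key is skipped");
static_assert(!key_belongs_in_write_set(HA_NOSAME | HA_NULL_PART_KEY),
              "unique key with a nullable part is skipped");
```

In the test case above, the key `c2` on t1 is a plain (non-unique) KEY, so under this check it would contribute nothing to the write set, and the two inserts (1, 2) and (3, 2) would no longer be certified as conflicting.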