Bug #76876 --transaction-write-set-extraction is considering non unique and NULL keys
Submitted: 28 Apr 2015 18:53 Modified: 13 Jul 2015 10:28
Reporter: Nuno Carvalho
Status: Closed
Category: MySQL Server: Group Replication Severity: S3 (Non-critical)
Version: 0.4.0 OS: Any
Assigned to: CPU Architecture: Any

[28 Apr 2015 18:53] Nuno Carvalho
Description:
The transaction write set extraction feature of MySQL Group Replication incorrectly considers non-unique keys and unique keys with NULL values, which causes some transactions to be rolled back when they should not be.

How to repeat:
The following test case should not cause transaction conflicts, but because of the bug it does:

--source include/have_debug_sync.inc
--source include/have_group_replication_plugin.inc
--source include/master-slave.inc

--echo
--echo ############################################################
--echo # 1. Create a table on server1.
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
CREATE TABLE t1 (c1 INT PRIMARY KEY, c2 INT, KEY `c2` (`c2`));
--source include/rpl_sync.inc

--echo
--echo ############################################################
--echo # 2. Set a debug sync before broadcast message to group on
--echo #    connection server_1.
--echo #    Commit a transaction that will be blocked before broadcast.
--let $rpl_connection_name= server_1
--source include/rpl_connection.inc
SET DEBUG_SYNC='group_replication_before_message_broadcast WAIT_FOR waiting';
BEGIN;
INSERT INTO t1 VALUES (1, 2);
--send COMMIT

--echo
--echo ############################################################
--echo # 3. Wait until server_1 connection reaches the
--echo # group_replication_before_message_broadcast debug sync point.
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
--let $wait_condition=SELECT COUNT(*)=1 FROM INFORMATION_SCHEMA.PROCESSLIST WHERE State = 'debug sync point: group_replication_before_message_broadcast'
--source include/wait_condition.inc

--echo
--echo ############################################################
--echo # 4. Execute a transaction on server2, that will reach first
--echo #    certification, since server_1 is blocked before broadcast.
--let $rpl_connection_name= server2
--source include/rpl_connection.inc
INSERT INTO t1 VALUES (3, 2);
--let $sync_slave_connection= server1
--source include/sync_slave_sql_with_master.inc

--echo
--echo ############################################################
--echo # 5. Signal the waiting thread on server_1 to resume.
--let $rpl_connection_name= server1
--source include/rpl_connection.inc
SET DEBUG_SYNC='now SIGNAL waiting';

--echo
--echo ############################################################
--echo # 6. Must not error out since c2 column allows duplicate
--echo #    keys.
--let $rpl_connection_name= server_1
--source include/rpl_connection.inc
--reap
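
To illustrate why the test case above trips certification, here is a minimal C++ sketch, not the server code. It models a write set as a set of "key=value" strings and treats two transactions as conflicting when their write sets intersect. The names `KeyEntry`, `t1_row`, `build_write_set` and `conflicts` are hypothetical and exist only for this illustration. The real write set stores hashes and the real certifier is more involved.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified description of one key of a row being written.
struct KeyEntry {
  std::string key_name;  // e.g. "PRIMARY" or "c2"
  bool is_unique;        // true for PRIMARY KEY / UNIQUE keys
  bool nullable;         // true if any key part may be NULL
  std::string value;     // key value in the written row
};

// Model a row of t1 (c1 INT PRIMARY KEY, c2 INT, KEY c2 (c2)).
std::vector<KeyEntry> t1_row(int c1, int c2) {
  return {
      {"PRIMARY", true, false, std::to_string(c1)},
      {"c2", false, true, std::to_string(c2)},
  };
}

// Build the write set for one row.
// With unique_non_null_only == false every key contributes (the reported
// behavior); with true, only unique keys without nullable parts contribute.
std::set<std::string> build_write_set(const std::vector<KeyEntry>& keys,
                                      bool unique_non_null_only) {
  std::set<std::string> write_set;
  for (const KeyEntry& k : keys) {
    if (unique_non_null_only && (!k.is_unique || k.nullable)) continue;
    write_set.insert(k.key_name + "=" + k.value);
  }
  return write_set;
}

// Two transactions conflict in this model if their write sets share an entry.
bool conflicts(const std::set<std::string>& a, const std::set<std::string>& b) {
  for (const std::string& entry : a) {
    if (b.count(entry) != 0) return true;
  }
  return false;
}
```

In this model, the rows (1, 2) and (3, 2) from the test share the entry "c2=2" when every key is included, so the second transaction to be certified is rolled back. When only unique, non-nullable keys are included, the write sets are {"PRIMARY=1"} and {"PRIMARY=3"} and no conflict is detected.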

Suggested fix:
The bug can be fixed with the following patch:

--- a/sql/rpl_write_set_handler.cc
+++ b/sql/rpl_write_set_handler.cc
@@ -259,6 +259,10 @@ void add_pke(TABLE *table, THD *thd)
   {
     for (uint key_number=0; key_number < table->s->keys; key_number++)
     {
+      // Skip non unique or null key.
+      if (!((table->key_info[key_number].flags & (HA_NOSAME | HA_NULL_PART_KEY)) == HA_NOSAME))
+        continue;
+
       std::string unhashed_string;
       if (key_number == 0)
         unhashed_string.append("P");
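
The condition in the patch can be read in isolation as a bitmask test: a key is kept only when the unique flag is set and the null-part flag is clear. The sketch below shows that predicate on its own. The constants `kNoSame`, `kNullPartKey` and `kOtherFlag` are stand-ins chosen for this illustration and are not the real values of `HA_NOSAME` and `HA_NULL_PART_KEY`, which come from the MySQL server headers. `key_goes_into_write_set` is a hypothetical helper name.

```cpp
#include <cstdint>

// Stand-in flag bits for this sketch only (assumed values, not MySQL's).
constexpr std::uint32_t kNoSame = 1u << 0;       // plays the role of HA_NOSAME
constexpr std::uint32_t kNullPartKey = 1u << 1;  // plays the role of HA_NULL_PART_KEY
constexpr std::uint32_t kOtherFlag = 1u << 2;    // any unrelated key flag

// Mirrors the patch: mask out the two flags of interest and require that
// exactly the "unique" bit remains. Unrelated flags do not affect the result.
constexpr bool key_goes_into_write_set(std::uint32_t flags) {
  return (flags & (kNoSame | kNullPartKey)) == kNoSame;
}
```

The patch negates this predicate and uses `continue` to skip the key, so the three rejected combinations are: neither flag set, only the null-part flag set, and both flags set.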

[13 Jul 2015 10:28] David Moss
Nuno Carvalho confirmed that this relates to internal code only and has no impact on the documentation. Closing the report.