Bug #106918 master and slave may get serious in-consistence after add auto-increment as pk
Submitted: 5 Apr 2022 17:46 Modified: 25 May 2022 10:06
Reporter: dennis GAO (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Row Based Replication ( RBR ) Severity:S3 (Non-critical)
Version:8.0.25 OS:Any
Assigned to: CPU Architecture:Any
Tags: Contribution

[5 Apr 2022 17:46] dennis GAO
Description:
Setup a master-slave replication, m1->s1 with RBR.

In m1 do the following init operations:
1. create table table without pk

create table t(c1 int,c2 int);

2. insert the following test data:

insert into t values (1,1),(2,2),(3,3),(4,4),(5,5),(1,1),(2,2);

3. check the data on m1 and s1, will see the result sequence is the same:

Then start the test:
1. do the following update on m1

update t set c2=rand()*100 where 1=1;

2. check the data on m1 and s1, will see the result sequence is different:

On m1
mysql> select * from t;
+------+------+
| c1   | c2   |
+------+------+
|    1 |   43 |
|    2 |   23 |
|    3 |   87 |
|    4 |   65 |
|    5 |   63 |
|    1 |   21 |
|    2 |   17 |
+------+------+
7 rows in set (0.01 sec)

On s1
mysql> select * from t;
+------+------+
| c1   | c2   |
+------+------+
|    1 |   21 |       <====== the result sequence is different with master
|    2 |   17 |
|    3 |   87 |
|    4 |   65 |
|    5 |   63 |
|    1 |   43 |
|    2 |   23 |
+------+------+
7 rows in set (0.01 sec)

3. add auto_increment as pk on m1

alter table t add column id int auto_increment primary key;

4. re-check the data in m1 and s1, we will see the data in-consistence 

On m1
mysql> select * from t;
+------+------+----+
| c1   | c2   | id |
+------+------+----+
|    1 |   43 |  1 |
|    2 |   23 |  2 |
|    3 |   87 |  3 |
|    4 |   65 |  4 |
|    5 |   63 |  5 |
|    1 |   21 |  6 |
|    2 |   17 |  7 |
+------+------+----+
7 rows in set (0.01 sec)

On s1
mysql> select * from t;
+------+------+----+
| c1   | c2   | id |
+------+------+----+
|    1 |   21 |  1 |   <===== the row with id=1 is different with the master
|    2 |   17 |  2 |
|    3 |   87 |  3 |
|    4 |   65 |  4 |
|    5 |   63 |  5 |
|    1 |   43 |  6 |
|    2 |   23 |  7 |
+------+------+----+
7 rows in set (0.01 sec)

4. Do the following update on m1

update t set c2=c2*10 where c1=1 and c2=43;

5. re-check the data in m1 and s1, we will see the data serious in-consistence

On m1
mysql> select * from t;
+------+------+----+
| c1   | c2   | id |
+------+------+----+
|    1 |  430 |  1 |       <==========c2 is update from 43 to 430 as expect
|    2 |   23 |  2 |
|    3 |   87 |  3 |
|    4 |   65 |  4 |
|    5 |   63 |  5 |
|    1 |   21 |  6 |
|    2 |   17 |  7 |
+------+------+----+
7 rows in set (0.01 sec)

On s1
mysql> select * from t;
+------+------+----+
| c1   | c2   | id |
+------+------+----+
|    1 |  430 |  1 |     <======the update in slave replay on the wrong row
|    2 |   17 |  2 |
|    3 |   87 |  3 |
|    4 |   65 |  4 |
|    5 |   63 |  5 |
|    1 |   43 |  6 |     <======it should update this row
|    2 |   23 |  7 |
+------+------+----+
7 rows in set (0.01 sec)

How to repeat:
see description
[6 Apr 2022 8:48] MySQL Verification Team
Hello dennis GAO,

Thank you for the report and feedback.

regards,
Umesh
[15 Apr 2022 3:52] huahua xu
Hi dennis GAO,

I think it is a bug, and guess that the inconsistent data may be caused by the search algorithm HASH_SCAN on slave(slave_rows_search_algorithms).

The table does not have any PK、UK oy Key, so the slave will search rows by HASH_SCAN(Hash over the entire table).

For you case, HASH_SCAN will build a muilti hash map(std::unordered_multimap) from Update_rows_log_event:
{key: (1,1), Value:{BI:(1,1), AI:(1,43 )}}
{key: (2,2), Value:{BI:(2,2), AI:(2,23 )}}
...
{key: (1,1), Value:{BI:(1,1), AI:(1,21 )}}
{key: (2,2), Value:{BI:(2,2), AI:(2,17 )}}

When scanning the whole table, the table's first record  will matche {key: (1,1), Value:{BI:(1,1), AI:(1,21 )}} in the muilti hash map.

set global slave_rows_search_algorithms='INDEX_SCAN,TABLE_SCAN', you will avoid the problem.
[15 Apr 2022 6:31] huahua xu
For unordered_multimap, I tested the order of its search results:

#include <iostream>
#include <string>
#include <unordered_map>
using namespace std;

int main()
{

    // declaration
    unordered_multimap<int, int> sample;

    // inserts key and elements
    sample.emplace(1, 1);
    sample.emplace(1, 2);
    sample.emplace(1, 3);
    sample.emplace(4, 9);
    sample.emplace(60, 89);

    const auto it = sample.find(1);
    if (it != sample.end())
    {
        cout << "{" << it->first << ":" << it->second << "}\n ";
    }

    return 0;
}

The test case output:

{1:3}
[16 Apr 2022 11:02] dennis GAO
Hi huahua xu,

Thanks for the suggestion and analysis.
But the inconsistent can be re-produced after I do set global slave_rows_search_algorithms='INDEX_SCAN,TABLE_SCAN'.

The steps are:
1. on m1

set global slave_rows_search_algorithms='INDEX_SCAN,TABLE_SCAN';

2. on s1

set global slave_rows_search_algorithms='INDEX_SCAN,TABLE_SCAN';
stop slave;
start slave;

3. on m1

create table t(c1 int,c2 int);
insert into t values (1,1),(2,2),(3,3),(4,4),(5,5),(1,1),(2,2);
update t set c2=rand()*100 where 1=1;
select * from t

+------+------+
| c1   | c2   |
+------+------+
|    1 |   37 |
|    2 |   97 |
|    3 |   72 |
|    4 |   70 |
|    5 |   35 |
|    1 |   62 |
|    2 |    8 |
+------+------+

4. on s1

select *from t;
+------+------+
| c1   | c2   |
+------+------+
|    1 |   37 |
|    2 |    8 |
|    3 |   72 |
|    4 |   70 |
|    5 |   35 |
|    1 |   62 |
|    2 |   97 |
+------+------+

5. on m1

alter table t add column id int auto_increment primary key;
select * from t
+------+------+----+
| c1   | c2   | id |
+------+------+----+
|    1 |   37 |  1 |
|    2 |   97 |  2 |
|    3 |   72 |  3 |
|    4 |   70 |  4 |
|    5 |   35 |  5 |
|    1 |   62 |  6 |
|    2 |    8 |  7 |
+------+------+----+

6. on s1

select * from t;
+------+------+----+
| c1   | c2   | id |
+------+------+----+
|    1 |   37 |  1 |
|    2 |    8 |  2 |
|    3 |   72 |  3 |
|    4 |   70 |  4 |
|    5 |   35 |  5 |
|    1 |   62 |  6 |
|    2 |   97 |  7 |
+------+------+----+

You can see the data is still in-consistence.
[17 Apr 2022 9:13] huahua xu
Hi dennis GAO,

I repeated your case about the inconsistent data, and it will be ok after slave_rows_search_algorithms is set to 'INDEX_SCAN,TABLE_SCAN' on slave.

It is really strange: the records (1,1) both are correct, but the other records (2,2) are incorrect.
[25 May 2022 10:06] dennis GAO
In mysql8.0 Hash_slave_rows use std::unordered_multimap to store the binlog_event_row hash values.

For some duplicate rows, such as (1,1),(1,1),(1,1), which have the same hash key, will be added into the some bucket in inverse order.
    
For example, there are three rows update events:
    row1: before_image:(1,1), after_image:(1,5)
    row2: before_image:(1,1), after_image:(1,6)
    row3: before_image:(1,1), after_image:(1,7)
    
The above three rows in the hash bucket are in inverse order:
    
    row3->row2->row1
    
When we use m_hash.find(key), we will get the iterator poin to row3 as the start point.
    
In Rows_log_event::do_scan_and_update, the innodb row (1,1) will firt match row3 by using m_hash.get; then row2 and finial row1.
    
So in the slave, the update row execute sequence is inverse comparing to the master.
    
After the replication, the slave will get the following select output:
    (1,7)
    (1,6)
    (1,5)
    
While the master's select output is:
    (1,5)
    (1,6)
    (1,7)
    
The row sequence is different between master and slave.
If the master exeucte "alter table t add column id int auto_increment primary key;" in this case, the data will be in-consistence between master and slave.

For fix:
    Use std::multimap to store the binlog_event_row hash values, rather than std::unordered_multimap.
[25 May 2022 10:08] dennis GAO
adding the patch as contribution

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: add_auto_inc_replication_broken.diff (text/x-patch), 3.80 KiB.