Bug #101828 Hash scan incorrectly applied the log resulting in HA_ERR_END_OF_FILE
Submitted: 2 Dec 2020 2:37 Modified: 25 Jun 2024 7:48
Reporter: Xiaodong Huang (OCA) Email Updates:
Status: Verified Impact on me:
None 
Category:MySQL Server: Replication Severity:S2 (Serious)
Version:8.0.37 OS:CentOS
Assigned to: CPU Architecture:x86
Tags: Contribution, regression

[2 Dec 2020 2:37] Xiaodong Huang
Description:
Assume that a table t1 without a primary key and unique index has {r[1],r[2], r[3],...r[x],...,r[y],... }, which satisfies When crc32(r[x])=crc32(r[y]), the update to r[y] will be applied to r[x] on the slave. This leads to inconsistent values of r[x] and r[y] on the master and slave. Then on the master, executing delete /update r[x] or r[y] will cause HA_ERR_END_OF_FILE(1032) error.

How to repeat:
Execute the following testcase:

--source include/master-slave.inc
--source include/have_binlog_format_row.inc

--echo # init ===================================================================
connection master;

--let $rows_alg_tmp=`select @@global.slave_rows_search_algorithms`
SET @@global.slave_rows_search_algorithms= 'HASH_SCAN,INDEX_SCAN,TABLE_SCAN';

CREATE TABLE t1 (
  a bigint unsigned not null,
  b bigint unsigned not null
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

# The two row have the same crc value
--echo # crc32(null_bytes,row[1].a,row[1].b)=crc32(null_bytes,row[2].a,row[2].b)
--echo # crc32(0xffa87ced4c74eee8a8ede45e459e115068)= crc32(0xff10b90d175cd25c1396c7927505c51669)
insert into t1 values(0xa8e8ee744ced7ca8, 0x6850119e455ee4ed),(0x135cd25c170db910, 0x6916c5057592c796);

--sync_slave_with_master
connection slave;
SET @@global.slave_rows_search_algorithms= 'HASH_SCAN,INDEX_SCAN,TABLE_SCAN';
select * from t1;

--echo # phrase 1 ===================================================================
--echo # The log of the second record is applied to the first record
connection master;
update t1 set a=1 where a=0x135cd25c170db910 and b=0x6916c5057592c796;
select * from t1;

--echo # slave========================================
--sync_slave_with_master
connection slave;
select * from t1;

--echo # phrase 2 ===================================================================
--echo # HA_ERR_KEY_NOT_FOUND error!
connection master;
update t1 set a=2 where a=1;
select * from t1;

--echo # slave========================================
--sync_slave_with_master
connection slave;
select * from t1;

--echo # cleanup ===================================================================
connection master;
--eval SET @@global.slave_rows_search_algorithms='$rows_alg_tmp'
drop table t1;

--sync_slave_with_master
connection slave;
--eval SET @@global.slave_rows_search_algorithms='$rows_alg_tmp'

--source include/rpl_end.inc

Produces the following result, But it should be able to execute successfully.

Slave_IO_State	Waiting for master to send event
Master_Host	127.0.0.1
Master_User	root
Master_Port	13000
Connect_Retry	1
Master_Log_File	master-bin.000001
Read_Master_Log_Pos	1361
Relay_Log_File	slave-relay-bin.000002
Relay_Log_Pos	1265
Relay_Master_Log_File	master-bin.000001
Slave_IO_Running	Yes
Slave_SQL_Running	No
Replicate_Do_DB
Replicate_Ignore_DB
Replicate_Do_Table
Replicate_Ignore_Table
Replicate_Wild_Do_Table
Replicate_Wild_Ignore_Table
Last_Errno	1032
Last_Error	Could not execute Update_rows event on table test.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 1330
Skip_Counter	0
Exec_Master_Log_Pos	1048
Relay_Log_Space	1787
Until_Condition	None
Until_Log_File
Until_Log_Pos	0
Master_SSL_Allowed	No
Master_SSL_CA_File
Master_SSL_CA_Path
Master_SSL_Cert
Master_SSL_Cipher
Master_SSL_Key
Seconds_Behind_Master
Master_SSL_Verify_Server_Cert	No
Last_IO_Errno	0
Last_IO_Error
Last_SQL_Errno	1032
Last_SQL_Error	Could not execute Update_rows event on table test.t1; Can't find record in 't1', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log master-bin.000001, end_log_pos 1330
Replicate_Ignore_Server_Ids
Master_Server_Id	1
Master_UUID	dc38226c-3400-11eb-a34e-fa163eca90dd
Master_Info_File	mysql.slave_master_info
SQL_Delay	0
SQL_Remaining_Delay
Slave_SQL_Running_State
Master_Retry_Count	10
Master_Bind
Last_IO_Error_Timestamp
Last_SQL_Error_Timestamp	201201 21:13:07
Master_SSL_Crl
Master_SSL_Crlpath
Retrieved_Gtid_Set
Executed_Gtid_Set
Auto_Position	0
Replicate_Rewrite_DB
Channel_Name
Master_TLS_Version
Master_public_key_path
Get_master_public_key	0

Suggested fix:
Fix from Tencent TXSQL Team.

patch:
[2 Dec 2020 2:40] Xiaodong Huang
Simple fix from TXSQL Team of Tencent

(*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

Contribution: fix.patch (application/octet-stream, text), 466 bytes.

[2 Dec 2020 2:57] Xiaodong Huang
The error reported in 8.0.22 is HA_ERR_END_OF_FILE
[2 Dec 2020 3:19] Xiaodong Huang
In testcase:

CREATE TABLE t1 (
  a bigint unsigned not null,
  b bigint unsigned not null
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

Since 8.0 no longer uses "utf8", it generates warnings and the fix is:

CREATE TABLE t1 (
  a bigint unsigned not null,
  b bigint unsigned not null
) ENGINE=InnoDB;
[2 Dec 2020 4:28] MySQL Verification Team
Hello Xiaodong,

Thank you for the report and contribution.
Verified as described.

regards,
Umesh
[25 Jun 2024 7:48] Xiaodong Huang
In recent years, I have received a lot of feedback about this problem. Below, I will explain the reasons for the problem by combining the testcase above. Additionally, this problem can still be reproduced in version 8.0.37.

Background: In tables without a primary key, the hash scan algorithm is widely used when applying row-format binlog. We provide an example in this bug report to illustrate this problem:

Example:
```sql
SET @@global.slave_rows_search_algorithms= 'HASH_SCAN,INDEX_SCAN,TABLE_SCAN';

CREATE TABLE t1 (
  a bigint unsigned not null,
  b bigint unsigned not null
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

-- insert into t1 values(r[1]),(r[2]);
insert into t1 values(0xa8e8ee744ced7ca8, 0x6850119e455ee4ed),(0x135cd25c170db910, 0x6916c5057592c796);
-- update t1 set r[2] = r'[2]
update t1 set a=1 where a=0x135cd25c170db910 and b=0x6916c5057592c796;
```

Problem: In this update statement, table t1 {r[1],r[2]}, In here crc32(r[1])=crc32(r[2]),  When an update is applied to r[2], it will be applied to r[1] instead of r[2] on the slave. This leads to inconsistent values of t1 on the master and slave. After executing update statement on the master, the table t1 becomes {r[1], r'[2]}, but on the slave, the table t1 becomes {r'[2], r[2]}.

Analysis:
Start analyzing the execution process of the above update statement on the slave.
1. When slave executes the update event related to the update satement, it will create a hash table for this event, the hash table is as following:

hash_value                                          |                                    hash_records[0]                 |   hash records[1]
                                                    |                before image            |  after image              | 
crc32((0x135cd25c170db910, 0x6916c5057592c796)) --> [(0x135cd25c170db910, 0x6916c5057592c796), (0x1, 0x6916c5057592c796)] --> nullptr

2. We then begin to scan table t1 {r[1], r[2]}. The process starts by reading r[1] , comparing the key, and applying it using following code segment. we using step1~step12 to illustrate the executing of update row event on this problem.
  2.1. In step1~step7. we don't find a record to apply current event related to r[2], but table->record[0] is r[2].
  2.2. In step8~step12, the while condition mistakenly believes that we have updated r[1] to r[2]. so directly update r[1] to r'[2].
  ```c++
  do {
    // step1. start, table->record[0] = r[1]
    error = next_record_scan(i == 0);
    i++;

    switch (error) {
      case 0: {
        // step2. get first record of crc32((0x135cd25c170db910, 0x6916c5057592c796))
        entry = m_hash.get(table, &this->m_local_cols);
        do {
          // step3. table->record[1] =  table->record[0] = r[1]
          // step8. table->record[1] =  table->record[0] = r[2]
          store_record(table, record[1]);

          // iterator the hash records
          while (entry) {
            m_curr_row = entry->positions->bi_start;
            m_curr_row_end = entry->positions->bi_ends;

            prepare_record(table, &this->m_local_cols, false);
            // step4, step9. table->record[0] = befor image of hash_records[0] = r[2]
            if ((error = unpack_current_row(rli, &m_cols, false /*is not AI*/)))
              goto close_table;

            // step5, step10. judge if table->record[0] == table->record[1]
            if (record_compare(table, &this->m_local_cols))
              // step6. entry = hash_records[2] = nullptr
              m_hash.next(&entry);
            else
              // step11. found a row operation log in event 
              break;  // we found a match
          }
           // step6  entry is nullptr and skip apply 
          if (entry) {
            // step12. apply this event [r[2], r'[2]] to r[1]
          }
          // step7. table->record[0] = r[2], So while condition is satisfied
        } while (this->get_general_type_code() ==
                     binary_log::UPDATE_ROWS_EVENT &&
                 !is_pk_present && (entry = m_hash.get(table, &m_local_cols)));
      } break;
    }
  } while (((idempotent_errors < m_hash.size()) && !m_hash.is_empty()) &&
         (!error || (error == HA_ERR_RECORD_DELETED)));
  ```

Fix: 
If the entry is nullptr, it mean that we can't find any more row operation logs (a update or delete in rows event) in current event on current record. So we add "entry != nullptr" to the while condition in step7 to move next record.