Bug #101828 | Hash scan incorrectly applied the log resulting in HA_ERR_END_OF_FILE | ||
---|---|---|---|
Submitted: | 2 Dec 2020 2:37 | Modified: | 25 Jun 2024 7:48 |
Reporter: | Xiaodong Huang (OCA) | Email Updates: | |
Status: | Verified | Impact on me: | |
Category: | MySQL Server: Replication | Severity: | S2 (Serious) |
Version: | 8.0.37 | OS: | CentOS |
Assigned to: | CPU Architecture: | x86 | |
Tags: | Contribution, regression |
[2 Dec 2020 2:37]
Xiaodong Huang
[2 Dec 2020 2:40]
Xiaodong Huang
Simple fix from TXSQL Team of Tencent (*) I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.
Contribution: fix.patch (application/octet-stream, text), 466 bytes.
[2 Dec 2020 2:57]
Xiaodong Huang
The error reported in 8.0.22 is HA_ERR_END_OF_FILE
[2 Dec 2020 3:19]
Xiaodong Huang
In testcase: CREATE TABLE t1 ( a bigint unsigned not null, b bigint unsigned not null ) ENGINE=InnoDB DEFAULT CHARSET=utf8; Since 8.0 no longer uses "utf8", it generates warnings and the fix is: CREATE TABLE t1 ( a bigint unsigned not null, b bigint unsigned not null ) ENGINE=InnoDB;
[2 Dec 2020 4:28]
MySQL Verification Team
Hello Xiaodong, Thank you for the report and contribution. Verified as described. regards, Umesh
[25 Jun 2024 7:48]
Xiaodong Huang
In recent years, I have received a lot of feedback about this problem. Below, I will explain the reasons for the problem by combining the testcase above. Additionally, this problem can still be reproduced in version 8.0.37. Background: In tables without a primary key, the hash scan algorithm is widely used when applying row-format binlog. We provide an example in this bug report to illustrate this problem: Example: ```sql SET @@global.slave_rows_search_algorithms= 'HASH_SCAN,INDEX_SCAN,TABLE_SCAN'; CREATE TABLE t1 ( a bigint unsigned not null, b bigint unsigned not null ) ENGINE=InnoDB DEFAULT CHARSET=utf8; -- insert into t1 values(r[1]),(r[2]); insert into t1 values(0xa8e8ee744ced7ca8, 0x6850119e455ee4ed),(0x135cd25c170db910, 0x6916c5057592c796); -- update t1 set r[2] = r'[2] update t1 set a=1 where a=0x135cd25c170db910 and b=0x6916c5057592c796; ``` Problem: In this update statement, table t1 {r[1],r[2]}, In here crc32(r[1])=crc32(r[2]), When an update is applied to r[2], it will be applied to r[1] instead of r[2] on the slave. This leads to inconsistent values of t1 on the master and slave. After executing update statement on the master, the table t1 becomes {r[1], r'[2]}, but on the slave, the table t1 becomes {r'[2], r[2]}. Analysis: Start analyzing the execution process of the above update statement on the slave. 1. When slave executes the update event related to the update satement, it will create a hash table for this event, the hash table is as following: hash_value | hash_records[0] | hash records[1] | before image | after image | crc32((0x135cd25c170db910, 0x6916c5057592c796)) --> [(0x135cd25c170db910, 0x6916c5057592c796), (0x1, 0x6916c5057592c796)] --> nullptr 2. We then begin to scan table t1 {r[1], r[2]}. The process starts by reading r[1] , comparing the key, and applying it using following code segment. we using step1~step12 to illustrate the executing of update row event on this problem. 2.1. In step1~step7. we don't find a record to apply current event related to r[2], but table->record[0] is r[2]. 2.2. In step8~step12, the while condition mistakenly believes that we have updated r[1] to r[2]. so directly update r[1] to r'[2]. ```c++ do { // step1. start, table->record[0] = r[1] error = next_record_scan(i == 0); i++; switch (error) { case 0: { // step2. get first record of crc32((0x135cd25c170db910, 0x6916c5057592c796)) entry = m_hash.get(table, &this->m_local_cols); do { // step3. table->record[1] = table->record[0] = r[1] // step8. table->record[1] = table->record[0] = r[2] store_record(table, record[1]); // iterator the hash records while (entry) { m_curr_row = entry->positions->bi_start; m_curr_row_end = entry->positions->bi_ends; prepare_record(table, &this->m_local_cols, false); // step4, step9. table->record[0] = befor image of hash_records[0] = r[2] if ((error = unpack_current_row(rli, &m_cols, false /*is not AI*/))) goto close_table; // step5, step10. judge if table->record[0] == table->record[1] if (record_compare(table, &this->m_local_cols)) // step6. entry = hash_records[2] = nullptr m_hash.next(&entry); else // step11. found a row operation log in event break; // we found a match } // step6 entry is nullptr and skip apply if (entry) { // step12. apply this event [r[2], r'[2]] to r[1] } // step7. table->record[0] = r[2], So while condition is satisfied } while (this->get_general_type_code() == binary_log::UPDATE_ROWS_EVENT && !is_pk_present && (entry = m_hash.get(table, &m_local_cols))); } break; } } while (((idempotent_errors < m_hash.size()) && !m_hash.is_empty()) && (!error || (error == HA_ERR_RECORD_DELETED))); ``` Fix: If the entry is nullptr, it mean that we can't find any more row operation logs (a update or delete in rows event) in current event on current record. So we add "entry != nullptr" to the while condition in step7 to move next record.